
[2005.14165] Language Models are Few-Shot Learners #898

Open · 1 task
ShellLM opened this issue Aug 20, 2024 · 1 comment

Labels
human-verified <INST>NEVER PICK THIS LABEL</INST>
in-context-learning: Examples of few-shot prompts for in-context learning.
llm: Large Language Models
llm-benchmarks: testing and benchmarking large language models
llm-experiments: experiments with large language models
MachineLearning: ML Models, Training and Inference
Papers: Research papers
prompt-engineering: Developing and optimizing prompts to efficiently use language models for various applications and re

Comments

ShellLM (Collaborator) commented Aug 20, 2024

[2005.14165] Language Models are Few-Shot Learners

Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.

Comments: 40+32 pages
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2005.14165 [cs.CL]
(or arXiv:2005.14165v4 [cs.CL] for this version)

https://doi.org/10.48550/arXiv.2005.14165
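The few-shot setting the abstract describes (task and demonstrations specified purely as text, with no gradient updates) amounts to assembling a single prompt and letting the model continue it. A minimal sketch is below; the translation demos mirror the paper's English-to-French example, and the commented-out `complete()` call is a hypothetical stand-in for any autoregressive LM API:

```python
# Minimal sketch of few-shot in-context learning as described in the paper:
# K demonstrations plus a query are concatenated into one prompt, and the
# model is conditioned on it at inference time -- no gradient updates occur.

def build_few_shot_prompt(task_description, demonstrations, query):
    """Assemble a GPT-3-style few-shot prompt from plain text."""
    lines = [task_description]
    for source, target in demonstrations:  # the K in-context examples
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")            # the model continues from here
    return "\n".join(lines)

demos = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
]
prompt = build_few_shot_prompt("Translate English to French:", demos, "peppermint")
print(prompt)

# Hypothetical completion call; substitute any autoregressive LM client:
# completion = complete(prompt, max_tokens=5)
```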

Suggested labels

None

ShellLM added the in-context-learning, llm, llm-benchmarks, llm-experiments, and Papers labels on Aug 20, 2024
ShellLM (Collaborator, Author) commented Aug 20, 2024

Related content

#897 similarity score: 0.9
#823 similarity score: 0.89
#686 similarity score: 0.88
#221 similarity score: 0.88
#769 similarity score: 0.87
#856 similarity score: 0.86
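The scores above presumably come from comparing issue-text embeddings; a minimal sketch of cosine similarity, assuming the bot vectorizes each issue with some embedding model (the vectors below are toy placeholders, not real embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for issue-text embeddings; a real pipeline would
# embed each issue's title and body with an embedding model first.
issue_898 = [0.12, 0.87, 0.45]
issue_897 = [0.10, 0.91, 0.40]
print(round(cosine_similarity(issue_898, issue_897), 2))  # a score in [-1, 1]
```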

irthomasthomas added the MachineLearning, human-verified <INST>NEVER PICK THIS LABEL</INST>, and prompt-engineering labels on Aug 20, 2024

2 participants