
[2204.02311] PaLM: Scaling Language Modeling with Pathways #897

Open

ShellLM opened this issue Aug 20, 2024 · 1 comment
Labels

- base-model: llm base models not finetuned for chat
- code-generation: code generation models and tools like copilot and aider
- human-verified: <INST>NEVER PICK THIS LABEL</INST>
- in-context-learning: Examples of few-shot prompts for in-context learning.
- llm: Large Language Models
- llm-evaluation: Evaluating Large Language Models performance and behavior through human-written evaluation sets
- MachineLearning: ML Models, Training and Inference
- ml-inference: Running and serving ML models.
- Models: LLM and ML model repos and links
- Papers: Research papers
- Research: personal research notes for a topic

Comments

ShellLM (Collaborator) commented Aug 20, 2024

PaLM: Scaling Language Modeling with Pathways

Snippet

"Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model PaLM. We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the finetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis on bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies."

Subjects

Computation and Language (cs.CL)

Suggested labels

None

ShellLM added the code-generation, in-context-learning, llm, ml-inference, Models, Papers, and Research labels on Aug 20, 2024
ShellLM (Collaborator, Author) commented Aug 20, 2024

Related content

#769 similarity score: 0.88
#686 similarity score: 0.87
#735 similarity score: 0.85
#681 similarity score: 0.85
#823 similarity score: 0.85
#317 similarity score: 0.85
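The similarity scores above are presumably derived from comparing embeddings of the issue texts; the sketch below shows cosine similarity under that assumption (the bot's actual embedding model and scoring method are not documented here, and the stand-in vectors are random).

```python
# Cosine similarity between two issue embeddings, as one plausible way the
# "similarity score" above could be computed. Embedding source is an assumption.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Example with random stand-in embeddings of dimension 768.
rng = np.random.default_rng(0)
issue_a, issue_b = rng.normal(size=768), rng.normal(size=768)
print(round(cosine_similarity(issue_a, issue_b), 2))
```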

irthomasthomas added the base-model, MachineLearning, llm-evaluation, and human-verified labels on Aug 20, 2024