Mastering LLM Techniques: Inference Optimization | NVIDIA Technical Blog #153

irthomasthomas opened this issue Dec 14, 2023 · 0 comments
Labels
- Algorithms: Sorting, Learning or Classifying. All algorithms go here.
- finetuning: Tools for finetuning of LLMs e.g. SFT or RLHF
- llm: Large Language Models
- llm-experiments: experiments with large language models
- MachineLearning: ML Models, Training and Inference
- unclassified: Choose this if none of the other labels (bar New Label) fit the content.

  • Mastering LLM Techniques: Inference Optimization | NVIDIA Technical Blog

    Stacking transformer layers to create large models results in better accuracies, few-shot learning capabilities, and even near-human emergent abilities on a wide range of language tasks. These foundation models are expensive to train, and they can be memory- and compute-intensive during inference (a recurring cost). The most popular large language models (LLMs) today can reach tens to hundreds of billions of parameters in size and, depending on the use case, may require ingesting long inputs (or contexts), which can also add expense. 

This post discusses the most pressing challenges in LLM inference, along with some practical solutions. Readers should have a basic understanding of the transformer architecture and the attention mechanism. A grasp of the intricacies of LLM inference, which the next section addresses, is also essential.
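The excerpt's point about inference being memory-intensive, and long contexts adding expense, can be made concrete with a back-of-envelope estimate: serving memory is roughly the model weights plus a KV cache that grows linearly with sequence length and batch size. A minimal sketch, assuming fp16 storage and a hypothetical 70B-parameter model with Llama-2-70B-like shapes (80 layers, 8 grouped KV heads of dimension 128); none of these numbers come from the post itself:

```python
# Rough estimate of LLM inference memory: weights + KV cache.
# All model shapes below are illustrative assumptions, not from the post.

def inference_memory_gb(n_params_b, n_layers, n_kv_heads, head_dim,
                        seq_len, batch_size, bytes_per_elem=2):
    """Return (weights_gib, kv_cache_gib), assuming fp16 (2 bytes/element)."""
    weights = n_params_b * 1e9 * bytes_per_elem
    # KV cache: 2 tensors (K and V) per layer, per token, per KV head.
    kv_cache = (2 * n_layers * n_kv_heads * head_dim
                * seq_len * batch_size * bytes_per_elem)
    return weights / 2**30, kv_cache / 2**30

# Hypothetical 70B model, 4K-token contexts, batch of 16 concurrent requests.
w, kv = inference_memory_gb(70, 80, 8, 128, seq_len=4096, batch_size=16)
print(f"weights: {w:.1f} GiB, KV cache: {kv:.1f} GiB")
```

Doubling either the context length or the batch size doubles the KV-cache term while the weights term stays fixed, which is why long inputs are called out as a recurring cost.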
