Add torchtune checkpoint link, modify product position statement location (#241)

This PR:
1 - adds a feature note and a link to the checkpoint doc covering saving torchtitan weights and loading them into torchtune for fine-tuning.
2 - moves the product position info from the top of the page to the bottom.
lessw2020 authored Apr 16, 2024
1 parent 04f5b82 commit 4f6ed9a
Showing 1 changed file with 5 additions and 4 deletions.
README.md: 9 changes (5 additions & 4 deletions)
@@ -15,11 +15,9 @@
* Modular components instead of monolithic codebase.
* Get started in minutes, not hours!

Please note: `torchtitan` is a proof-of-concept for Large-scale LLM training using native PyTorch. It is (and will continue to be) a repo to showcase PyTorch's latest distributed training features in a clean, minimal codebase. torchtitan is complementary to and not a replacement for any of the great large-scale LLM training codebases such as Megatron, Megablocks, LLM Foundry, Deepspeed, etc. Instead, we hope that the features showcased in torchtitan will be adopted by these codebases quickly. torchtitan is unlikely to ever grow a large community around it.


## Pre-Release Updates:
#### (4/16/2024): `torchtitan` is now public but in a pre-release state and under development. Currently we showcase pre-training Llama2 models (LLMs) of various sizes from scratch. `torchtitan` is tested and verified with the PyTorch nightly version `torch-2.4.0.dev20240412`.
#### (4/16/2024): `torchtitan` is now public but in a pre-release state and under development.
Currently we showcase pre-training Llama2 models (LLMs) of various sizes from scratch. `torchtitan` is tested and verified with the PyTorch nightly version `torch-2.4.0.dev20240412`. (We recommend latest PyTorch nightly).

Key features available:</br>
1 - [FSDP2 (per param sharding)](docs/fsdp.md) </br>
@@ -31,6 +29,7 @@ Key features available:</br>
7 - Fused RMSNorm (optional), learning rate scheduler, meta init, and more.</br>
8 - All options easily configured via [toml files](train_configs/).</br>
9 - [Performance](docs/performance.md) verified on 64 A100 GPUs.</br>
10 - [Save pre-trained torchtitan model weights](docs/checkpoint.md) and load directly into [`torchtune`](https://github.com/pytorch/torchtune) for fine tuning. </br>
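As a rough illustration of the workflow behind item 10, a sharded torchtitan checkpoint saved with PyTorch's distributed checkpointing (DCP) can be consolidated into a single `torch.save` file that downstream fine-tuning tools such as `torchtune` can consume. This is a minimal sketch, not the procedure documented in docs/checkpoint.md; it assumes a recent PyTorch build that ships `torch.distributed.checkpoint.format_utils.dcp_to_torch_save`, and the paths are hypothetical placeholders.

```python
# Minimal sketch: consolidate a sharded DCP checkpoint into one torch.save file.
# Assumes a recent PyTorch nightly that provides dcp_to_torch_save.
# Paths below are hypothetical placeholders, not torchtitan defaults.
import torch
from torch.distributed.checkpoint.format_utils import dcp_to_torch_save

dcp_dir = "outputs/checkpoint/step-1000"          # sharded DCP checkpoint directory (hypothetical)
consolidated = "outputs/checkpoint/step-1000.pt"  # single-file checkpoint for fine-tuning tools

# Convert the sharded checkpoint directory into a single torch.save file.
dcp_to_torch_save(dcp_dir, consolidated)

# Sanity check: the consolidated state dict should load on CPU.
state_dict = torch.load(consolidated, map_location="cpu")
print(f"loaded {len(state_dict)} top-level entries")
```

Refer to [docs/checkpoint.md](docs/checkpoint.md) and the `torchtune` documentation for the steps the projects actually document.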


## Coming soon features:
@@ -108,6 +107,8 @@ If your gpu count per node is not 8, adjust:

in the SBATCH command section.

## Product Position:
Please note: `torchtitan` is a proof-of-concept for Large-scale LLM training using native PyTorch. It is (and will continue to be) a repo to showcase PyTorch's latest distributed training features in a clean, minimal codebase. torchtitan is complementary to and not a replacement for any of the great large-scale LLM training codebases such as Megatron, Megablocks, LLM Foundry, Deepspeed, etc. Instead, we hope that the features showcased in torchtitan will be adopted by these codebases quickly. torchtitan is unlikely to ever grow a large community around it.

## License

