MyLlama2

This is a ground-up reimplementation of the Llama 2 family of large language models. It follows the same architecture: a decoder-only transformer equipped with Grouped-Query Attention (GQA), key-value caching, SwiGLU feedforward layers, and SentencePiece tokenization. It is compatible with the original Llama 2 weights. Unlike Meta's model, this implementation does not incorporate parallel layers or any distributed-processing APIs. Consequently, it runs on at most a single GPU, and it can also run on a CPU without special tooling (e.g., torchrun, MPI).
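
For readers new to these components, the sketch below shows a SwiGLU feedforward block as it is typically written for Llama-style models. The class and parameter names are illustrative and are not necessarily the ones used in this repository.

	import torch
	import torch.nn as nn
	import torch.nn.functional as F

	class SwiGLUFeedForward(nn.Module):
	    """Llama-style feedforward block: w2(silu(w1(x)) * w3(x)).
	    Names and dimensions here are illustrative, not this repo's API."""

	    def __init__(self, dim: int, hidden_dim: int):
	        super().__init__()
	        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
	        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # up projection
	        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # down projection

	    def forward(self, x: torch.Tensor) -> torch.Tensor:
	        # SiLU-gated linear unit, followed by the down projection
	        return self.w2(F.silu(self.w1(x)) * self.w3(x))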

This model was created for demonstration purposes, with the intent of sharing it with the community. During its development, I identified a few minor issues in FAIR's implementation, which I plan to contribute back through pull requests. I believe this implementation is more accessible to those new to AI, and I've included references to the papers where these concepts were first introduced. However, this code has not been extensively reviewed; for production projects, I recommend starting with Meta's implementation. For high-performance CPU-only inference, consider compiling llama.cpp targeted at the native architecture.

Example usage:

python inference_example.py \
	llama/llama-2-7b \
	./tokenizer.model \
	--top_p 0.8 \
	--max_generation_length 100 \
	--context "Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal."
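
The --top_p flag controls nucleus (top-p) sampling. As a rough illustration of what that parameter does, the sketch below shows a typical top-p sampling step; the function name and signature are assumptions for illustration and may differ from the code in inference_example.py.

	import torch

	def sample_top_p(logits: torch.Tensor, top_p: float) -> torch.Tensor:
	    """Sample one token id from the smallest set of tokens whose
	    cumulative probability mass exceeds top_p (nucleus sampling).
	    Illustrative only; not necessarily this repo's implementation."""
	    probs = torch.softmax(logits, dim=-1)
	    sorted_probs, sorted_idx = torch.sort(probs, dim=-1, descending=True)
	    cumulative = torch.cumsum(sorted_probs, dim=-1)
	    # Zero out tokens beyond the nucleus; the top token is always kept.
	    sorted_probs[cumulative - sorted_probs > top_p] = 0.0
	    sorted_probs /= sorted_probs.sum(dim=-1, keepdim=True)
	    next_sorted = torch.multinomial(sorted_probs, num_samples=1)
	    return torch.gather(sorted_idx, -1, next_sorted)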

About

This repository is an automatically maintained mirror of https://git.flu0r1ne.net/myllama2/.
