Skip to content

Load nanoGPT-style transformers in Julia. Code ported from @karpathy's llama2.c

License

Notifications You must be signed in to change notification settings

rai-llc/LanguageModels.jl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LanguageModels.jl

A port of @karpathy's llama2.c to pure Julia.

Build Status

Video of LanguageModels.jl running llama2-13b

Special licensing exception

This repo contains one external file, sentencepiece_model.proto, which has its own copyright and license. This file is used to generate a Julia interface in the sentencepiece subdirectory, which is then used to load the tokenizer model for llama2.

How to install

This package is currently unregistered. To install it, run

julia> using Pkg; Pkg.add(url="https://github.com/rai-llc/LanguageModels.jl")

Basic usage

julia> using LanguageModels

julia> LanguageModels.talkto(temperature=0.0)
REPL mode language_model_mode initialized. Press ~ to enter and backspace to exit.
"Prompt(\"language_model> \",...)"

language_model> Once upon a time, there was a llama named
[ Info: dim = 288
[ Info: hidden_dim = 768
[ Info: n_layers = 6
[ Info: n_heads = 6
[ Info: n_kv_heads = 6
[ Info: vocab_size = 32000
[ Info: seq_len = 256
[ Info: shared_weights = true
[ Info: vocab_size = 32000
Once upon a time, there was a llama named Joe. Joe was a very happy little boy who loved to play. One day, Joe was playing in the park when he saw a big, red balloon. Joe was so excited and he wanted to play with it.
Joe ran over to the balloon and started to play with it. He was having so much fun, he didn't notice the balloon was getting bigger and bigger. Suddenly, the balloon popped and Joe was so sad.
Joe's mom saw what happened and she said, "Joe, why did you poke the balloon? You should be more careful." Joe felt very sorry and said, "I'm sorry, Mommy. I didn't mean to poke the balloon."
Joe's mom hugged him and said, "It's okay, Joe. We all make mistakes. But remember, it's important to be careful and think before you do something. That way, you can avoid accidents and stay safe."
Joe nodded and said, "I will remember, Mommy. I will be more careful next time."

Supported keyword arguments:

  • checkpoint_filename: defaults to stories15M artifact which is downloaded when package is built
  • tokenizer_filename: ditto. The tokenizer.model file should be in the sentencepiece ModelProto Protocol Buffers format.
  • temperature::Float32: how much randomness to use in sampling. Default: 0.9f0
  • steps: How many tokens to generate. Default: 256. Must not exceed seq_len in the model definition.
  • mmap: Whether to use memory mapped I/O to load models that are too large to fit in memory. Default: false
  • stop_on_special_tokens. Stop sampling tokens once the special tokens such as UNK (unknown), BOS (beginning-of-sentence) or EOS (end of sentence) are encountered. Default: true. For behavior similar to llama2.c, set to false.

Models

This implementation by default uses the stories15M model from Andrej Karpathy's tinyllamas project on HuggingFace, and its corresponding tokenizer.model from the llama2.c repo. This model is automatically downloaded when the package is built.

llama-2

If you have access to the llama-2 model weights, you can follow the llama2.c instructions for converting them into a usable format. The tokenizer.model can be read as is. (The code here automatically generates the reader format using the sentencepiece Protobuf specification.)

julia> LanguageModels.main(checkpoint_filename="/Users/jiahao/models/llama/llama-2-7b.bin", tokenizer_filename="/Users/jiahao/models/llama/tokenizer.model", prompt="Once upon a time, there was a llama called",)
Once upon a time, there was a llama called Pinky and a sheep called Blue.
Pinky was a llama who ate vermicelli. He was very fussy about how it was cooked. He wanted it long, and hot and slightly damp. It was the only way he would eat it.
He lived with two humans, Flora and Henry. They didn’t like him very much. They found him funny-looking and they thought he was smelly.
Flora thought he smelled of durian.
One day, Blue and Pinky went to their favourite restaurant together. It was called Not Very Delicious at All.
I’ll leave it you to make up your own mind. You can hear the poem performed here.
The background I used was the free pixel artist at the top of the sidebar.
Great story. I need to fin a pic of just flora, or just Henry. Do you have one or do I need to make one?
I don’t have one with Flora alone. I’ll have a look through my stash and see if I can find one though.

About

Load nanoGPT-style transformers in Julia. Code ported from @karpathy's llama2.c

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages