Building a Generative Pre-Trained Transformer (GPT) to generate Shakespeare-like text from scratch!
This repo is inspired by and follows along with Andrej Karpathy's Neural Networks: Zero to Hero YouTube playlist.
In this repo, I create a GPT trained on the tiny Shakespeare dataset, which contains over 1 million characters.
The GPT implemented in shakespeare-gpt.py is a character-level Bigram GPT model. With the final settings, the model has more than 10 million trainable parameters and achieves the results found in output.txt.
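Because the model is character-level, every unique character in the corpus is its own token. A minimal sketch of the usual encode/decode step (the sample string below is only illustrative; the real vocabulary is built from the full input.txt):

```python
# Minimal sketch of character-level tokenization.
# The real vocabulary comes from input.txt; a sample string stands in here.
text = "First Citizen: Before we proceed any further, hear me speak."

chars = sorted(set(text))                      # unique characters form the vocabulary
stoi = {ch: i for i, ch in enumerate(chars)}   # character -> integer id
itos = {i: ch for ch, i in stoi.items()}       # integer id -> character

def encode(s):
    return [stoi[c] for c in s]

def decode(ids):
    return "".join(itos[i] for i in ids)

ids = encode("hear me speak")
print(decode(ids))  # round-trips back to the original string
```

The encoded integer sequence is what the model actually trains on; decoding maps sampled ids back to text.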
The Bigram GPT model was trained on a Google Colab T4 GPU with CUDA; even so, training took ~45 minutes. If you want to clone or replicate this repo, I suggest training on a Google Colab T4 GPU or your own CUDA-capable NVIDIA GPU.
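In PyTorch, picking the GPU (with a CPU fallback) is typically a one-liner near the top of the script. A sketch of that pattern, using a tiny stand-in module rather than the actual GPT, which also shows how the trainable-parameter count is usually reported:

```python
import torch
import torch.nn as nn

# Use a CUDA-capable GPU when available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Tiny stand-in module (NOT the repo's GPT) to illustrate the .to(device)
# pattern and the usual way of counting trainable parameters.
model = nn.Linear(16, 16).to(device)
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"training on {device} with {n_params} trainable parameters")
```

On a CUDA machine this prints `cuda`; on CPU-only machines the same script still runs, just far slower, which is why a GPU is recommended above.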
To test this repo yourself, follow these steps:

- Clone the repository:

  ```shell
  git clone https://github.com/Ryan-W31/Shakespeare-GPT.git
  ```

- If running locally:
  - It is recommended to create a virtual environment. After doing so, install the packages listed in requirements.txt:

    ```shell
    pip install -r requirements.txt
    ```

  - Run the Python script:

    ```shell
    python shakespeare-gpt.py
    ```

- If using Google Colab:
  - Create a new notebook.
  - Upload input.txt and shakespeare-gpt.py to your Google Drive.
  - Within the notebook, mount your Google Drive.
  - Navigate to the directory in your Google Drive containing input.txt and shakespeare-gpt.py using `%cd`.
  - Ensure you are using a GPU runtime, then run:

    ```shell
    !python shakespeare-gpt.py
    ```