Implementation of the CyclicFormer architecture with hf_integration.
This architecture was created by Talha Rüzgar Akkuş.
One day, I found myself contemplating how to advance the transformer architecture. I pondered, "What if I introduce a cyclic loop into the RNN, enabling the first node to be aware of the last node?" Excited by the idea, I quickly implemented the code. To my amazement, CyclicRNN outperformed standard RNNs on several tasks. This led me to think, "Why not apply this concept to the transformer architecture?" By doing so, I could enhance inter-layer connectivity within the model, much like the human brain.
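As a rough illustration of the cyclic-loop concept (a simplified sketch, not the repository's actual implementation), a stack of transformer layers can be traversed more than once so that, on the second pass, the first layer sees a hidden state that has already been shaped by the last layer:

```python
# Illustrative toy sketch of the cyclic-loop idea (not the repo's real code):
# the layer stack is traversed several times, so layer 0 can "see" state
# produced by the final layer on earlier passes.
import torch
import torch.nn as nn

class ToyCyclicBlockStack(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_layers=2, n_cycles=2):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers)
        )
        self.n_cycles = n_cycles  # how many times the hidden state loops through the stack

    def forward(self, x):
        # Run the hidden states through the same layer stack n_cycles times.
        for _ in range(self.n_cycles):
            for layer in self.layers:
                x = layer(x)
        return x

hidden = torch.randn(1, 8, 64)    # (batch, sequence, d_model)
out = ToyCyclicBlockStack()(hidden)
print(out.shape)                  # torch.Size([1, 8, 64])
```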
After looking over existing transformer implementations, I intentionally avoided searching the literature for similar architectures, aiming to create something unprecedented. I initially developed it from scratch in plain PyTorch and later integrated it with Hugging Face. And thus, I present to you: "CyclicFormer with hf_integration."
To use the CyclicFormer, follow these steps:
- Clone the repository to your local machine:

```bash
git clone https://github.com/LegallyCoder/CyclicFormer
```
- Open a terminal or command prompt and navigate to the script's directory:

```bash
cd CyclicFormer/src
```
- Install the required packages with:

```bash
pip3 install -r requirements.txt
```
- Create a new Python file in the script's directory and run the following:
```python
from modeling_cyclicformer import CyclicFormerForCausalLM
from transformers import AutoTokenizer

# Load the pretrained CyclicFormer checkpoint and a GPT-2 tokenizer.
model = CyclicFormerForCausalLM.from_pretrained('Q-bert/CyclicFormer-tiny-shakespeare')
tokenizer = AutoTokenizer.from_pretrained('gpt2')

# Encode a prompt and generate a continuation with beam search.
text = "Hi"
input_ids = tokenizer.encode(text, return_tensors="pt")
output = model.generate(input_ids, max_length=20, num_beams=5, no_repeat_ngram_size=2)

# Decode the generated token ids back into text.
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
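Continuing from the example above, other generation strategies also work; for instance, sampling instead of beam search (illustrative settings, not prescribed by the repository) tends to give more varied output:

```python
# Sampling-based generation as an alternative to beam search (illustrative parameters).
output = model.generate(
    input_ids,
    max_length=50,
    do_sample=True,   # sample from the model's next-token distribution
    top_k=50,         # restrict sampling to the 50 most likely tokens
    temperature=0.8,  # soften the distribution slightly
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```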
You can reach me on: