CyclicFormer is a new architecture designed to enhance the performance of the transformer architecture. It introduces a new perspective for decoder layers, forming a cyclic loop between all the layers.

CyclicFormer


Implementation of the CyclicFormer architecture with hf_integration.

This architecture was created by Talha Rüzgar Akkuş.

Development History of CyclicFormer:

One day, I found myself contemplating how to advance the transformer architecture. I pondered, "What if I introduce a cyclic loop into the RNN, enabling the first node to be aware of the last node?" Excited by the idea, I quickly implemented the code. To my amazement, CyclicRNN outperformed standard RNNs on several tasks. This led me to think, "Why not apply this concept to the transformer architecture?" By doing so, I could enhance inter-layer connectivity within the model, much like the human brain.

After examining existing transformer implementations, I intentionally avoided searching for similar architectures in the literature, aiming to create something unprecedented. I initially developed this from scratch using plain Torch, and later decided to integrate it with Hugging Face. And thus, I present to you: "CyclicFormer with hf_integration."

Working Mechanism:
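
As described above, the decoder layers form a cyclic loop: the hidden state produced by the last layer is fed back into the first layer, so the first layer is aware of the last. The snippet below is a minimal conceptual sketch of that loop in plain PyTorch, not the repository's actual implementation; the class name, the use of standard self-attention blocks as stand-ins for the decoder layers, and the n_cycles parameter are illustrative assumptions.

import torch
import torch.nn as nn

class CyclicDecoderStack(nn.Module):
    # Conceptual sketch: traverse the layer stack n_cycles times so the
    # first layer also processes the last layer's output (the cyclic loop).
    def __init__(self, d_model=64, n_layers=4, n_cycles=2):
        super().__init__()
        # Plain self-attention blocks stand in for CyclicFormer's decoder layers.
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(n_layers)
        )
        self.n_cycles = n_cycles

    def forward(self, hidden_states):
        for _ in range(self.n_cycles):  # each extra pass closes the loop
            for layer in self.layers:
                hidden_states = layer(hidden_states)
        return hidden_states

x = torch.randn(1, 16, 64)  # (batch, seq_len, d_model)
print(CyclicDecoderStack()(x).shape)  # torch.Size([1, 16, 64])

With a single cycle this reduces to an ordinary stack of layers; additional cycles are what give the first layer access to the last layer's representation.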

Usage:

To use the CyclicFormer, follow these steps:

  1. Clone the repository to your local machine.
git clone https://github.com/LegallyCoder/CyclicFormer
  2. Open a terminal or command prompt and navigate to the script's directory.
cd src
  3. Install the required packages using this command:
pip3 install -r requirements.txt
  4. Create a new Python file in the script's directory and load the model:
from modeling_cyclicformer import CyclicFormerForCausalLM
from transformers import AutoTokenizer

# Load the pretrained CyclicFormer checkpoint and a GPT-2 tokenizer.
model = CyclicFormerForCausalLM.from_pretrained('Q-bert/CyclicFormer-tiny-shakespeare')
tokenizer = AutoTokenizer.from_pretrained('gpt2')

text = "Hi"
input_ids = tokenizer.encode(text, return_tensors="pt")

# Generate up to 20 tokens with beam search, disallowing repeated bigrams.
output = model.generate(input_ids, max_length=20, num_beams=5, no_repeat_ngram_size=2)

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)

Training Code:

Open In Colab

For more:

You can reach me on:

LinkedIn

Twitter

Hugging Face
