electra-ka is an open-source model for the Georgian language.
The model is available on the Hugging Face Hub.
The model was trained on 33 GB of Georgian text collected from 4,854,621 pages of the Common Crawl archive.
A fine-tuned sequence-classification model (jnz/electra-ka-discrediting) is also available on the Hub and can be used as follows:
from transformers import ElectraForSequenceClassification, ElectraTokenizerFast
# load the fine-tuned classifier and the base tokenizer
model = ElectraForSequenceClassification.from_pretrained("jnz/electra-ka-discrediting")
tokenizer = ElectraTokenizerFast.from_pretrained("jnz/electra-ka")

# tokenize the input and run a forward pass
inputs = tokenizer("your text goes here...", return_tensors="pt")
outputs = model(**inputs)  # SequenceClassifierOutput with per-class logits
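To turn the raw output into a prediction, take a softmax over the logits. The minimal sketch below shows the idea; the human-readable label depends on whatever id2label mapping the fine-tuned checkpoint ships in its config.

import torch

# convert logits into class probabilities and pick the most likely class
probs = torch.softmax(outputs.logits, dim=-1)
predicted_id = probs.argmax(dim=-1).item()
print(model.config.id2label[predicted_id], probs[0, predicted_id].item())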
Under the hood, electra-ka uses the same Transformer encoder architecture as BERT, but it is pre-trained as a discriminator (predicting whether each token is original or replaced) rather than as a generator, which makes it much harder to misuse for text generation.
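For illustration, the sketch below loads the base checkpoint with ElectraForPreTraining to expose that discriminator behaviour; this assumes the jnz/electra-ka checkpoint includes the replaced-token-detection head (if it does not, transformers will initialize that head randomly and the scores will be meaningless).

import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

# assumption: the base checkpoint ships ELECTRA's replaced-token-detection head
discriminator = ElectraForPreTraining.from_pretrained("jnz/electra-ka")
tokenizer = ElectraTokenizerFast.from_pretrained("jnz/electra-ka")

inputs = tokenizer("your text goes here...", return_tensors="pt")
with torch.no_grad():
    token_logits = discriminator(**inputs).logits  # one score per input token

# tokens with probability > 0.5 are the ones the discriminator flags as replaced
flags = (torch.sigmoid(token_logits) > 0.5).squeeze(0).tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
print(list(zip(tokens, flags)))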
To read more about ELECTRA, please refer to the paper "ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators".
In case of any questions or comments, please feel free to reach out at djanezashvili[at]gmail.com.