Mixtral 8x7B is claimed to be strong in French, German, Spanish, and Italian. What does that mean in terms of training, and does it follow that other languages should be included in training so that speakers of those languages can benefit from AI?
We fine-tuned Mixtral on a question set in Hindi and Spanish to compare the results. We also ran the same comparison with RAG, using an embeddings model from OpenAI and a RAG evaluation model with a custom eval set in the native language, to see whether the results differed and whether they carried the biases of that eval model.
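As a rough illustration of the RAG half of this comparison, the sketch below embeds a small corpus with an OpenAI embeddings model, retrieves the closest chunks for a native-language question, and asks Mixtral via the Hugging Face Inference API. The model names, the toy corpus, and the prompt shape are placeholders, not our exact setup.

```python
# Minimal RAG sketch: OpenAI embeddings for retrieval, Mixtral for generation.
import numpy as np
from openai import OpenAI
from huggingface_hub import InferenceClient

openai_client = OpenAI()  # expects OPENAI_API_KEY in the environment
mixtral = InferenceClient("mistralai/Mixtral-8x7B-Instruct-v0.1")  # expects an HF token

docs = [
    "Chunk of source material in the target language ...",
    "Another chunk ...",
]

def embed(texts):
    # text-embedding-3-small is an assumption; any OpenAI embeddings model fits here
    resp = openai_client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vecs = embed(docs)

def rag_answer(question, k=2):
    # Cosine-similarity retrieval over the toy corpus
    q = embed([question])[0]
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    context = "\n".join(docs[i] for i in np.argsort(-sims)[:k])
    prompt = (
        "[INST] Answer in the same language as the question, "
        f"using only this context:\n{context}\n\nQuestion: {question} [/INST]"
    )
    return mixtral.text_generation(prompt, max_new_tokens=256)

print(rag_answer("¿Cuál es la idea principal del documento?"))
```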
For RAG, we measured 0% accuracy for Hindi and 50% accuracy for Spanish, with English as a baseline that we presume would score higher. Fine-tuning results will be part of the demo.
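Since the fine-tuning results are left for the demo, here is only a hedged sketch of the kind of setup involved: a LoRA-style fine-tune of Mixtral with the Hugging Face peft/trl stack, assuming a recent trl version and a hypothetical hindi_spanish_qa.jsonl file with one {"text": ...} example per line. The actual recipe behind the demo may differ.

```python
# Hedged sketch only: LoRA fine-tune of Mixtral on a bilingual question set.
# hindi_spanish_qa.jsonl is a hypothetical file of {"text": "..."} rows; the real
# training run may use different quantization, hyperparameters, and data format.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("json", data_files="hindi_spanish_qa.jsonl", split="train")

trainer = SFTTrainer(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # trl loads the model from this id
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="mixtral-hi-es",
        max_seq_length=1024,
        per_device_train_batch_size=1,
        num_train_epochs=1,
    ),
    peft_config=LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"]),
)
trainer.train()
trainer.save_model("mixtral-hi-es")
```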
Internet issues, recent OpenAI model deprecations, the popularity of Mixtral on Hugging Face (it kept timing out for RAG), and the time needed for fine-tuning. We also got some great hallucinations in Hindi.