This repository supports the growing community interest in developing large language models (LLMs) that serve not only English speakers but also speakers of the world's other 6,500+ languages. It aims to help researchers find relevant literature in this field, and will collect core training and evaluation datasets, multilingual-capable LLMs, and related scholarly articles.
- In-Context Learning and Prompting Strategies
- Performance and Capabilities in Specific Languages
- Challenges and Limitations in Multilingual LLMs
- Multilingual LLMs in Programming and Code
- Comparative Studies and Benchmarks
- Datasets and Benchmarks
- Translation and Language Understanding
- Instruction Tuning
- Safety
- Miscellaneous Studies and Surveys
- [2024] Boosting Many-to-Many Multilingual Translation Performance with Large Language Models via Prompt Strategies and Cross-Lingual Consistency Regularization (XConST) by Pengzhi Gao et al.
- [2023] All Languages Matter: On the Multilingual Safety of Large Language Models: Develops a multilingual safety benchmark for LLMs, demonstrating the need for safety alignment in non-English languages.
- [2023] Multilingual Code Co-evolution using Large Language Models: Discusses the co-evolution of code in multiple languages using LLMs.
- [2023] Prompting Large Language Models with Speech Recognition Abilities: Presents a method to enhance LLMs with multilingual speech recognition capabilities using a small audio encoder.
- [2023] M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Large Language Models: Introduces a comprehensive benchmark for evaluating the performance of LLMs across diverse languages.
- [2023] BigTranslate: Augmenting LLMs with Multilingual Translation Capability over 100 Languages: Presents BigTranslate, a model augmenting LLMs with extensive multilingual translation capabilities.
- [2023] Eliciting Translation Ability of LLMs via Multilingual Finetuning with Translation Instructions: Analyzes how multilingual LLMs carry out translation instructions for different languages through multilingual finetuning.
- [2023] BUFFET: Benchmarking LLMs for Few-shot Cross-lingual Transfer: Introduces BUFFET, a benchmark for evaluating multilingual LLMs in few-shot cross-lingual transfer across various tasks and languages.
- [2023] ChatGPT Beyond English: Comprehensive Evaluation in Multilingual Learning: Evaluates the performance of ChatGPT in multilingual learning across 7 different tasks and 37 languages.
- [2023] Multilingual Machine Translation with LLMs: Empirical Results and Analysis: Investigates the use of LLMs for multilingual machine translation, evaluating popular models including ChatGPT and GPT-4.
- [2023] Implications of LLMs for Dental Medicine: Discusses the potential benefits and risks of using LLMs like ChatGPT for dental medicine, with a focus on multilingual communication.
- [2023] Massively Multilingual Shallow Fusion with LLMs: Explores using a single multilingual language model to improve automatic speech recognition in multiple languages.
- [2022] Bootstrapping Multilingual Semantic Parsers using Large Language Models: Examines the effectiveness of LLMs in translating English datasets into multiple languages for multilingual semantic parsing.
- Holmström et al.: Explores the performance of English and multilingual LLMs in Swedish.
- Zhu et al.: Investigates the advantages and challenges in multilingual machine translation using LLMs.
- Joshi et al.: Introduces RING, a multilingual repair engine powered by a language model trained on code.
- Scao et al. (BigScience Workshop): Discusses the development and evaluation of BLOOM, a 176B-parameter open-access multilingual language model.
- Lai et al.: Evaluates ChatGPT and other LLMs on multilingual NLP tasks.
- [2023] CulturaX: A Large, Multilingual Dataset for Training Large Language Models: Introduces CulturaX, a large, multilingual dataset for training LLMs in 167 languages, emphasizing quality with careful data cleaning and deduplication.
- Ladhak et al. - Introduces WikiLingua, a benchmark dataset for cross-lingual abstractive summarization in 18 languages from WikiHow (2020).
- Gupta and Srikumar - Presents X-Fact, a multilingual dataset for factual verification in 25 languages labeled for veracity by expert fact-checkers (2021).
- Nguyen et al. - Proposes CulturaX, with 6.3 trillion tokens in 167 languages, for training multilingual LLMs (2023).
- Barrière et al. - Introduces a dataset of online debates in English for multilingual stance classification related to the European Green Deal (2022).
- Wang et al. - Proposes a dataset for evaluating safeguards in LLMs and trains classifiers achieving results similar to GPT-4 (2023).
- Laperriere et al. - Updates the French MEDIA SLU dataset for spoken language understanding, integrated into the SpeechBrain toolkit (2022).
- Hu et al. - Introduces the Multi3WOZ dataset for training and evaluating multilingual and cross-lingual task-oriented dialog systems (2023).
- Presents MEGA, benchmarking generative LLMs across 70 languages and comparing them to non-autoregressive models (2023).
- Investigates LLM-based evaluators for multilingual evaluation, highlighting bias and calibration needs (2023).
- Proposes the SEAHORSE dataset for evaluating multilingual, multifaceted summarization systems (2023).
- Proposes MINT, a multilingual textual intimacy dataset with tweets in 10 languages (2023).
- Evaluates ChatGPT on 37 languages across 7 tasks, revealing a performance gap compared to previous models (2023).
- Proposes Eva-KELLM, a benchmark for evaluating knowledge editing in LLMs with a focus on cross-lingual knowledge transfer (2023).
- Discusses ComMA, a multilingual dataset annotated with tags for different types of aggression and bias in four languages (2023).
- Introduces the GINCO training dataset for automatic genre identification of web documents (2023).
- Introduces RING, a multilingual repair engine powered by a large language model trained on code for program repair in multiple languages (2022).
- Introduces MEE, a Multilingual Event Extraction dataset with over 50K event mentions in 8 languages (2022).
- Bang et al. - Presents a framework for evaluating interactive LLMs using a newly designed multimodal dataset.
- [2022] BigScience: Social Construction of a Multilingual Large Language Model: Discusses BigScience, a collaborative project that created a multilingual dataset and trained BLOOM, a multilingual LLM.
- Zhu et al. - Proposes the CoST dataset with parallel data from 7 programming languages for code snippet translation (2022).
- Guerreiro et al.: Provides insights into the presence of hallucinations in multilingual translation models.
- Li et al.: Discusses the translation abilities of large language models in multilingual contexts.
Papers on instruction tuning of large language models (LLMs) for multilingual use, each listed with its year of publication, title, and a brief description:
- [2023] CrossAlpaca: Cross-Lingual Alignment for Instruction-tuned Large Language Models: Proposes CrossAlpaca to improve the cross-lingual abilities of instruction-tuned LLMs (It-LLMs), arguing that instruction tuning on non-English data alone is not enough and that semantic alignment is also needed.
- [2023] FIAT: Fusing Instruction and Parameter Adaptation for Multilingual Large Language Models: Introduces FIAT, blending in-context learning and full fine-tuning, outperforming other methods across multilingual tasks.
- [2023] M^3IT: Multimodal, Multilingual Instruction Tuning Dataset: Introduces the M^3IT dataset for optimizing vision-language model alignment, featuring 2.4 million instances in 80 languages.
- [2023] A Survey on Instruction Tuning for Large Language Models: A comprehensive survey covering the methodology, dataset construction, training, applications, and future avenues of instruction tuning.
- [2023] Comparative Study on LoRA-Based Fine-Tuning for Instruction-Tuning of Large Language Models: Investigates the benefits of LoRA-based fine-tuning over full-parameter tuning for instruction-tuning, especially for Chinese LLMs.
- [2023] Instruction Fine-Tuning for Enhanced Multilingual Adaptability of Large Language Models: Discusses enhancing the adaptability and performance of LLMs for multilingual cases through instruction fine-tuning.
- [2023] Analyzing Translation Instruction Abilities of Large Language Models: Analyzes how LLMs achieve translation capabilities across languages, highlighting the impact of language similarity and pretraining data.
- [2023] Jais and Jais-chat: Arabic-centric Large Language Models: Introduces Jais and Jais-chat, Arabic-centric LLMs showing superior knowledge in Arabic and competitive performance in English.
- [2023] TaCo: Translation-Assisted Cross-Linguality for Instruction-Tuning LLMs: Discusses TaCo, a method using translation to instruction-tune LLMs on new languages.
- [2023] M3IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning: Introduces the M^3IT dataset for optimizing alignment between vision-language models and human instructions.
- [2023] Instruction Tuning for Multilingual Large Language Models: Explores the importance of multilingual instruction tuning for LLMs' robustness.
- [2023] Instruction Fine-tuning for LLMs: Investigates the impacts of instruction fine-tuning on LLMs.
- [2023] PVIT: Position-enhanced Visual Instruction Tuning for Multimodal LLMs: Proposes PVIT, a method for fine-grained cross-modal alignment in multimodal LLMs (MLLMs).
- [2022] Multilingual Adaptive Fine-Tuning for Pre-trained Language Models on African Languages: Explores adaptive fine-tuning for improving pre-trained model performance on African languages with reduced model size.
- [2023] Stereotypes in Multilingual Large Language Models and Cross-Linguistic Leakage: Investigates the presence of stereotypes in multilingual LLMs and their cross-linguistic leakage, with an analysis of different languages' susceptibility.
- [2023] Multilingual Jailbreak Challenges in Large Language Models: Explores multilingual jailbreak challenges in LLMs, addressing both unintentional and intentional risks, and proposes a safety fine-tuning framework.
- Pahune et al.: Emphasizes recent developments and efforts made for various kinds of LLMs, including multilingual language models.