Projects by students of the current run of the Huawei NLP Course. Previous projects can be accessed here.
Name | Description | Team | Repository |
---|---|---|---|
Building a language model from user reviews of goods and using it for contextual advertising | Creating a word corpus from a website with product reviews. Training a classification model on these reviews. Detecting the context of a dialog and the product title mentioned in it in order to show ads that match the required context. | @akalend @rehcoeg | https://github.com/akalend/mobile_nlp_analisys |
Movie Poster Caption Generation | A particular case of the task of generating text from image embeddings | @kazzand | https://github.com/kazzand/huaweiproject |
A Russian Question Answering System for Inclusive Education | A closed-domain question-answering model for Russian built with transfer learning techniques. The model is fine-tuned on a custom dataset collected with the methodology described in the original SQuAD paper | @vifirsanova | https://github.com/vifirsanova/nlp-huawei-project |
Indonesian-Russian Machine Translation | Translation quality for the Indonesian-Russian pair is currently fairly weak, even in the Yandex and Google translation systems. I try to extend the known ind-ru corpora by mapping ind-en and en-rus corpora. The main goal of this project is to practice with Transformer-based models and build a machine translation system that achieves a decent BLEU score. | @minakyan | https://github.com/minakovaa/indonesian-russian-translation |
Sentiment Analysis and AutoML | The goal is to perform sentiment analysis on the amazon-fine-food-reviews dataset and compare different hyperparameter search engines: hyperopt, BOHB, and Optuna | @aazarov | https://github.com/aazarov/NLP_SentimentAnalysis_HyperparametersSearch |
English sentiment analysis | I will try to solve a common NLP problem related to sentiment analysis. The data is taken from Twitter and needs to be pre-processed, because the texts are very raw. Also, since the classes are unbalanced, I will try to apply data augmentation. | @slavkostrov | https://github.com/slavkostrov/project-sentiment-eng |
Topic modeling and news classification in Hebrew | The goal is to gather data and build a classification model for Hebrew, which is a low-resource language, and also to perform topic modeling | @imvladikon | https://github.com/imvladikon/huawei-nlpcourse-project |
Russian Text Sentiment Transfer | Transforming a sentence to alter its sentiment while preserving the content | @orzhan | https://github.com/orzhan/russian-text-sentiment-transfer |
Domain adaptation of a transformer model for improving semantic search | I am exploring domain adaptation of a transformer model to improve search relevance. | @algis | https://github.com/Dumbris/semantic-search-domain-adaptation |
Predicting user ratings from a movie's description | Predicting a number from text is a classical task, but the solutions found for a similar dataset of TMDB website data rely not only on the description but also on a large number of other fields. In this task we will try to get a similar or better result using only the description | @aakzn | https://gitlab.com/iKzN/predict-film-rating-votes-by-overview |
Topical extractive summarization | Topical extractive summarization aims to extract the sentences most relevant to a given topic. | @pacifikus | https://github.com/pacifikus/nlp-course |