Projects from students of the NLP Course
Name | Description | Team | Repository |
---|---|---|---|
Movie Poster Caption Generation | | @kazzand | https://github.com/kazzand/huaweiproject |
Chinese-Russian Machine Translation | | @RomanenkovN | https://github.com/RomanenkovN/HuaweiNLP |
Aspect-Based Sentiment Analysis in German | Identify aspect-level and document-level polarity of messages in German. This is important for German service providers such as railways. | @DrFirestream | https://github.com/DrFirestream/NLP |
Aspect Extraction with Capsule Networks | Topic modelling with CapsNet. Knowing what people are talking about and understanding their problems and opinions is highly valuable to businesses, administrators, and political campaigns. It is hard to manually read through large volumes of text and compile the topics, so an automated algorithm is required that can read through text documents and output the topics discussed. | @KirillKrasikov | https://github.com/KirillKrasikov/TopicModelingWithCapsNet |
Text Summarization in Russian | The project's goal is text summarization for the Russian language. I think that one of the most valuable and expensive things in a person's life is their time. Extracting the main points from a text lets you skip reading news articles in their entirety and save a lot of time. I planned to build a model that summarizes news about stock trading in Russian. To create my own dataset of texts and their summaries, I use short news posts on Telegram (as summaries) and full news articles about exchange trading (as texts) from the site https://quote.rbc.ru. | @medphisiker | https://github.com/medphisiker/Huawei-s-nlp-course-project |
BERT-based Aspect Extraction | The goal of my project is to solve the problem of aspect extraction from text data. To solve it, one should discover not only the author's opinion of an entity mentioned in the text but also their opinions on specific properties of that entity, called aspects. Aspects are represented in texts via aspect terms. The practical importance of the problem lies in the possibility of using the developed models to analyze social media in order to assess users' perception of products, manage brand reputation, conduct political and social research, and so on. | @ulaelfray | https://bitbucket.org/ulaelfray/huawei-nlp-course/ |
Sentiment Analysis in Russian | | @alekxd | https://github.com/alekxd/project-NLP-sentiment-rus |
Text Summarization Task in Russian | The problem I am going to solve is text summarization in Russian. Nowadays we have a lot of information, and it is important to extract the main idea from a text. In my case, the model will help people generate headlines for news articles. | @alexvishnevskiy | https://github.com/alexvishnevskiy/Huawei-project |
Generation of news headlines | Summarization task in Russian for a news dataset | @germanjke, @kotyukov | https://github.com/germanjke/huaweiNLP |
Russian aspect-based sentiment analysis | BERT-based techniques to identify the sentiment of a selected entity in the text. For example, in "In general I like the car but I hate its color", the sentiment of "color" is negative. The most relevant dataset is https://github.com/songyouwei/ABSA-PyTorch/tree/master/datasets/semeval14 | @preduct0r | https://github.com/preduct0r/huawei |
Jigsaw Multilingual Toxic Comment Classification | "Jigsaw Multilingual Toxic Comment Classification" is a Kaggle competition: use TPUs to identify toxic comments across multiple languages. We have to predict the probability that a comment is toxic. https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification | LeonidMorozov, Mteterin | https://github.com/LeonidMorozov/jigsaw_toxic_classification |
Headlines generation from news articles in Russian | Reading full texts is time-consuming. If the headline reflects the main idea of the original text, reading it saves a lot of time. I will be working on the Rossiya Segodnya (RIA) corpus, which consists of long text-headline pairs. I am going to preprocess the data and then use pre-trained embeddings to build an attentive RNN model in PyTorch. | @vadimvvlasov | https://github.com/vadimvvlasov/nlp-project |