The problem of correlation between manual and automatized assessment of machine translation

Date
2022
Publisher
Berdyansk State Pedagogical University
Abstract
The article outlines the problems of developing and assessing machine translation, which can greatly facilitate global communication despite the imperfect quality of its output. Most often, the results of online tools require post-editing and can only be used effectively by those who already speak the target language to some extent. The need for competent translation is growing every year. Today, the search for an algorithm that delivers this quality of translation is one of the most important questions in computer science and linguistics, which determines the scientific relevance of this work. The article analyzes different approaches to machine translation systems, their characteristics, efficacy, and the quality of their output. The main problems with such translations stem from the fact that the systems depend on large, high-quality data sets (i.e., corpora of texts for specific language pairs). The quality of these sets directly influences the quality of the output, which in our case is the quality of the target-language text. This can be seen by comparing the average translation quality of Google’s and Microsoft’s systems: the former makes fewer mistakes on average and has fewer problems identifying the contextual meaning of a polysemantic lexeme. The article underlines that this issue can be addressed to a certain extent in one of two ways: by hiring professional translators and linguists to compile the parallel corpora, or by allowing every person to contribute to this process, even on a small scale. The first approach would be very time- and labor-intensive but would ultimately provide a higher-quality data set, which may lead to further improvements in MT.
The second is already being deployed by all three major NMT systems but may lead to slower progress due to the lack of quality control and oversight. Further research could widen the subject area of the texts chosen so that they reflect the variety of writing styles currently in use on the Internet. Including texts in confessional, business, and other styles may allow us to highlight more lacunae in the neural network models and to suggest further means of improvement.
Bibliographic citation
Suima I. The problem of correlation between manual and automatized assessment of machine translation / Irina Suima // Наукові записки БДПУ. Сер.: Педагогічні науки. – 2022. – Issue 3. – pp. 379–388.