Abstract. Corpora play an important role in computational linguistics; text corpora in particular are used to build the databases of electronic dictionaries and grammatical rules. A corpus serves as a research object for machine translation, speech synthesis, text analysis, sentiment analysis, and other areas of computational linguistics.
Corpus-based linguistic research shows that corpora allow for fast search, coverage of diverse literary and other genres, and statistical analysis. The electronic corpus of the Uzbek language serves as an important resource in the development of learner's dictionaries.