FORMATION OF A DATABASE FOR SENTIMENT ANALYSIS OF TEXTS IN THE UZBEK LANGUAGE

Niyazmetova Kumushoy; Raximov Komron; Anvarova Dilrabo; Bekjanov Ro’zimboy

doi:10.5281/zenodo.10143621

16.11.2023 International Scientific Journal "Science and Innovation". Series C. Volume 2 Issue 11

Abstract. Sentiment analysis of user comments must begin with pre-processing, because the comments are written by different people, in different languages, and with a variety of spelling mistakes. Pre-processing the input texts before they are passed to a data-mining classification algorithm increases the accuracy of sentiment analysis and helps achieve the expected result. Solving such problems is an important task in natural language processing. In this article, we prepare a dataset from reviews of restaurants in the city of Tashkent posted on Google Maps and analyze their sentiment using logistic regression models. The overall evaluation results show that the system performs well when pre-processing steps suited to agglutinative languages, such as stemming, are applied.

Keywords: sentiment analysis, dataset, bag of words model, NLP, TF-IDF algorithm
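
The pipeline named in the abstract and keywords (pre-processing with stemming, a bag-of-words representation weighted by TF-IDF, and a logistic regression classifier) can be illustrated with a minimal sketch. It is not the authors' implementation. The four reviews, the suffix list, the smoothed IDF formula, and the training settings below are all assumptions chosen for illustration, and a real Uzbek stemmer needs far richer rules.

```python
import math
import re
from collections import Counter

# Hypothetical toy reviews, invented for illustration (not the paper's dataset).
# Label 1 = positive, 0 = negative.
REVIEWS = [
    ("Ovqatlar juda mazali, xizmat a'lo", 1),
    ("Juda yaxshi restoran, taomlar mazali", 1),
    ("Xizmat yomon, ovqat sovuq edi", 0),
    ("Narxlar qimmat, xizmat juda yomon", 0),
]

# Hypothetical suffix list for a naive stemmer (assumption, far from complete).
SUFFIXES = ("lar", "ning", "dagi", "ni", "da", "ga")


def stem(token):
    """Strip the longest matching suffix, keeping a stem of at least 3 letters."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize (keeping word-internal apostrophes), and stem."""
    tokens = re.findall(r"[^\W\d_]+(?:['’ʻ][^\W\d_]+)*", text.lower())
    return [stem(t) for t in tokens]


def fit_tfidf(docs):
    """Build the vocabulary and a smoothed IDF weight for each term."""
    vocab = sorted({t for doc in docs for t in doc})
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return vocab, idf


def transform(doc, vocab, idf):
    """Map a token list to an L2-normalized TF-IDF vector over the vocabulary."""
    counts = Counter(doc)
    vec = [counts[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_proba(text, vocab, idf, w, b):
    """Return the estimated probability that a review is positive."""
    x = transform(preprocess(text), vocab, idf)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


docs = [preprocess(text) for text, _ in REVIEWS]
labels = [label for _, label in REVIEWS]
vocab, idf = fit_tfidf(docs)
X = [transform(doc, vocab, idf) for doc in docs]
w, b = train_logreg(X, labels)

for text, label in REVIEWS:
    print(label, round(predict_proba(text, vocab, idf, w, b), 3), text)
```

In this sketch, stemming maps "ovqatlar" and "ovqat" to the same vocabulary entry, which shows in miniature why suffix stripping matters for an agglutinative language: without it, each inflected form would become a separate sparse feature.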
