CODEBOT – A VIETNAMESE CHATBOT SYSTEM FOR ANSWERING C++ AND PYTHON-RELATED QUESTIONS

Công Tâm Lương 1, Lê Minh Nguyên Vương 1, Viết Hưng Nguyễn 1, Đỗ Thái Nguyên Nguyễn 1, Trần Hy Hiến Lương 1, Trần Ngọc Khiết Lương 1, Thị Trinh Phan 2
1 Ho Chi Minh City University of Education
2 Thu Duc College of Technology

Abstract

In the fourth industrial revolution, programming is one of the most essential skills for young people seeking an edge over competitors in their fields. Programming techniques matter not only in software development but also in statistical analysis and mathematical modelling in other fields of study. However, introductory programming materials on the Internet are mostly written in English rather than Vietnamese, which creates a barrier between these materials and Vietnamese youth. This motivated the idea of a simple yet effective Vietnamese question-answering chatbot that engages Vietnamese students and encourages them to climb the steep learning curve of programming. This paper combines natural language processing with knowledge representation and reasoning to implement such a chatbot, which works entirely in Vietnamese and helps students with their programming-related questions. A simple knowledge representation method is introduced to integrate external knowledge into the system. A question-answering method based on knowledge reasoning and retrieval is also proposed to produce appropriate responses to users' queries. The topics the chatbot supports are limited to C++ and Python, two of the most widely taught programming languages in Vietnamese colleges and universities. At the heart of the chatbot, two machine learning models were designed to classify users' intents. They were trained and evaluated on our annotated dataset, which was contributed by students from the Faculty of Information Technology, Ho Chi Minh City University of Education. The proposed models achieved high F1-scores of 0.96 and 0.99 on our evaluation dataset.
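The abstract does not say which models or knowledge structures the system uses, so the following is only a minimal sketch of the general idea: classify a query's intent, then retrieve a stored response for that intent. Everything in it is an assumption made for illustration: the multinomial Naive Bayes classifier, the intent labels, the toy training queries (written in English for readability), the `KNOWLEDGE` dictionary, and the helper names `tokenize`, `NaiveBayesIntent`, and `answer`. A real Vietnamese system would also need proper word segmentation rather than the whitespace split used here.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data (query, intent); not the paper's dataset.
TRAIN = [
    ("what is a for loop in python", "python_definition"),
    ("define a list in python", "python_definition"),
    ("what is a pointer in c++", "cpp_definition"),
    ("define a class in c++", "cpp_definition"),
    ("show an example of a python loop", "python_example"),
    ("give me example code for a python list", "python_example"),
]

# Hypothetical knowledge base: one canned response per intent.
KNOWLEDGE = {
    "python_definition": "Stored definition of a Python concept.",
    "cpp_definition": "Stored definition of a C++ concept.",
    "python_example": "Stored Python code example.",
}


def tokenize(text):
    # Simplification: lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayesIntent:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples):
        self.label_counts = Counter()
        self.word_counts = defaultdict(Counter)
        for text, label in samples:
            self.label_counts[label] += 1
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        # Words never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n / total)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def answer(clf, query):
    # Retrieval step: map the predicted intent to a stored response.
    return KNOWLEDGE[clf.predict(query)]


if __name__ == "__main__":
    clf = NaiveBayesIntent().fit(TRAIN)
    print(answer(clf, "what is a pointer in c++"))
```

Separating classification from retrieval keeps the knowledge base editable without retraining the classifier, which is one plausible reason to structure a retrieval-based chatbot this way.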

 
