Current issue: Vol. 6 2025


Nuttawut Thuayhanruksa and Pree Thiengburanathun

Published in Data Science and Engineering (DSE) Record 2025 Vol. 6 No. 1 pp. 1-30

This paper explores the application of various natural language processing (NLP) models for sentiment analysis on financial news articles sourced from Thai financial news websites, focusing on Thai-language data. The study evaluates machine learning and deep learning models, including Logistic Regression, Bidirectional Long Short-Term Memory (Bi-LSTM), Convolutional Neural Networks (CNN), WangChanBERTa, OpenAI’s GPT-3.5, and OpenThaiGPT. The models' performance is assessed using accuracy, precision, recall, and F1-score. The findings reveal that the fine-tuned WangChanBERTa model achieved the highest accuracy of 0.84 on the testing set, demonstrating its superior ability in classifying sentiment in Thai financial news. Bi-LSTM and CNN models also performed well, with testing accuracies of 0.781 and 0.791. In contrast, OpenAI’s GPT-3.5 and OpenThaiGPT, which lacked fine-tuning and optimized prompts due to computational constraints, exhibited practical limitations in resource-constrained settings.
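The four metrics named in this abstract can be illustrated with a minimal sketch. The label names and the choice of macro-averaging are assumptions made for illustration only; the abstract does not state the sentiment classes or the averaging scheme the authors used.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(a, b):
        return a / b if b else 0.0

    precisions = [safe_div(tp[l], tp[l] + fp[l]) for l in labels]
    recalls = [safe_div(tp[l], tp[l] + fn[l]) for l in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]
    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n),
        "recall": safe_div(sum(recalls), n),
        "f1": safe_div(sum(f1s), n),
    }


# Hypothetical labels, not data from the paper.
example = classification_metrics(
    ["pos", "neg", "neu", "pos"],
    ["pos", "neg", "pos", "neg"],
)
```

Macro-averaging weights every class equally, which matters when one sentiment class dominates the corpus; a weighted or micro average would give different numbers on the same predictions.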

Kitichart Nukaew and Arinya Pongwat

Published in Data Science and Engineering (DSE) Record 2025 Vol. 6 No. 1 pp. 31-55

Massive Open Online Courses (MOOCs) have seen continuous growth in popularity and rapid expansion. In the instructional design process, receiving feedback from learners is crucial, as it helps tailor the content to better meet learners' needs. The application of NLP models in analyzing learners' feedback is an effective approach for extracting insights from a large volume of comments related to the courses. These models can categorize feedback into three distinct categories: course, instructors, and assessments. Additionally, the models can predict the sentiment of the feedback, determining whether it is positive or negative. In developing these models, semi-supervised learning techniques have been employed to address the challenge of limited data availability. Experimental results indicate that, for feedback categorization, a GRU model combined with tri-training with disagreement yields the highest prediction accuracy. Conversely, for sentiment analysis, a GRU model combined with tri-training produces the best outcomes.

Kamonwit Makkaphan, Prompong Sungunnasil, Waranya Mahanan, and Sumalee Sangamuang

Published in Data Science and Engineering (DSE) Record 2025 Vol. 6 No. 1 pp. 56-90

Online customer reviews represent a valuable source of information for businesses seeking to understand consumer perceptions and preferences. This paper introduces a framework for competitive positioning analysis by leveraging these online reviews and sentiment analysis. The framework employs Natural Language Processing (NLP) techniques in three phases: 1) identifying key themes and topics from reviews using Latent Dirichlet Allocation (LDA); 2) extracting product features through zero-shot text classification; and 3) visualizing competitive positioning via Net Promoter Score (NPS) and sentiment analysis plots. A case study on Amazon’s laptop market revealed a moderate correlation (58.8%) between NPS and sentiment analysis, suggesting potential limitations in feature classification accuracy. While the study demonstrates the value of NLP for analyzing online reviews, it also emphasizes the need for improved feature recognition methods and more robust datasets to enhance the precision of competitive positioning analysis.
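The comparison between NPS and sentiment scores described in this abstract can be sketched as follows. Two points are assumptions for illustration, not details from the paper: the mapping from star ratings to promoters and detractors (5 stars as promoters, 1 to 3 stars as detractors), and the use of the Pearson coefficient as the correlation measure.

```python
from math import sqrt


def nps_from_stars(stars, promoter_min=5, detractor_max=3):
    """Net Promoter Score: percentage of promoters minus percentage of detractors.

    The star thresholds are an assumed adaptation of the usual 0-10 NPS scale.
    """
    if not stars:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for s in stars if s >= promoter_min)
    detractors = sum(1 for s in stars if s <= detractor_max)
    return 100.0 * (promoters - detractors) / len(stars)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-product values, not data from the paper.
nps_scores = [nps_from_stars(r) for r in ([5, 5, 4, 3], [5, 2, 1, 4], [5, 5, 5, 4])]
sentiment_scores = [0.4, -0.1, 0.7]
r = pearson(nps_scores, sentiment_scores)
```

Computing both scores per product (or per product feature) and then correlating them is one way to check whether review text sentiment tracks the rating-based score.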

Manaschai Aonon and Phasit Charoenkwan

Published in Data Science and Engineering (DSE) Record 2025 Vol. 6 No. 1 pp. 91-130

This research presents a comprehensive framework for analyzing customer behavior in walking street markets using advanced person re-identification techniques. We deployed dual CCTV cameras at strategic points along a 200-meter section of a walking street market in Chiang Mai, Thailand, to track customer movements and analyze behavioral patterns. Our methodology comprises three main components: (1) a novel segmentation-enhanced multi-region feature extraction framework combining YOLOv11 segmentation with Swin Transformer, (2) a robust person re-identification approach with PCA-enhanced feature matching, and (3) detailed customer behavior analysis based on movement patterns, speeds, and interactions. Our feature extraction method achieves 92.31% Rank-1 accuracy and 59.62% mAP, significantly outperforming traditional approaches. Using the re-identification results, we identify five distinct customer behavior types (Goal-Oriented, Browsing, Lingering, Focused, and Brief Visitors) with actionable insights for market management. This research contributes both methodological advances in person re-identification and practical applications for retail analytics in dynamic public spaces.
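For readers unfamiliar with the two re-identification metrics reported here, the following is a minimal sketch of how Rank-1 accuracy and mAP are commonly computed from ranked gallery lists. The input layout is hypothetical, and the sketch omits the same-camera filtering that standard re-identification protocols usually apply; it is not the authors' evaluation code.

```python
def average_precision(query_id, ranked_ids):
    """Average precision for one query over a gallery ranked by similarity."""
    hits = 0
    precision_sum = 0.0
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precision_sum += hits / rank  # precision at each correct match
    return precision_sum / hits if hits else 0.0


def rank1_and_map(queries):
    """Return (Rank-1 accuracy, mAP) for a list of (query_id, ranked_gallery_ids)."""
    if not queries:
        raise ValueError("at least one query is required")
    rank1_hits = sum(1 for qid, ranked in queries if ranked and ranked[0] == qid)
    ap_total = sum(average_precision(qid, ranked) for qid, ranked in queries)
    return rank1_hits / len(queries), ap_total / len(queries)


# Hypothetical identities: query "a" is matched first, query "b" only second.
rank1, mean_ap = rank1_and_map([
    ("a", ["a", "b", "a"]),
    ("b", ["a", "b", "c"]),
])
```

Rank-1 only asks whether the top match is correct, while mAP rewards placing every correct match near the top of the list, which is why the two figures can differ widely for the same system.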

Noratap Muangudom and Karn Patanukhom

Published in Data Science and Engineering (DSE) Record 2025 Vol. 6 No. 1 pp. 131-167

In recent years, Large Language Models (LLMs) have demonstrated significant potential in various applications, including healthcare, education, and customer support. This study investigates the integration of LLMs into group chat environments to facilitate medical counseling between doctors and heart disease patients. Traditional chatbot systems primarily operate in one-on-one interactions, which can lead to redundant queries and inefficiencies in medical consultations. This research introduces a novel chatbot system designed for group chat settings, allowing multiple users and medical professionals to interact seamlessly within the same conversation. The chatbot system retrieves medical knowledge from a predefined document database using an information retrieval model to ensure responses are relevant and accurate. A verification mechanism is integrated, enabling doctors to review and validate chatbot-generated responses before they are presented to patients. The study employs hypothesis testing and real-world evaluations to measure chatbot performance across three key dimensions: response accuracy, response speed, and user satisfaction. Experimental results indicate that group chat environments improve communication efficiency, reduce repetitive queries, and enhance patient engagement compared to traditional one-on-one chatbot interactions. Furthermore, user feedback highlights the strengths and limitations of the proposed system. While the chatbot successfully provides relevant medical information, challenges remain in ensuring response accuracy, reducing response time, and improving contextual understanding in group conversations. Future work will focus on refining chatbot algorithms, enhancing natural language processing capabilities, and expanding the medical knowledge base to support a wider range of healthcare scenarios. This research underscores the potential of LLMs in transforming digital healthcare support, making medical consultations more efficient, accessible, and collaborative.