Kanyarat Srisuttha and Arinya Pongwat
Published in Data Science and Engineering (DSE) Record 2026 Vol. 7 No. 1 pp. 25-36
Abstract
This independent study analyzes the factors that influence online jewelry purchasing decisions under the 7Ps marketing mix framework, using Natural Language Processing (NLP) and machine learning techniques. The dataset comprises 5,766 user reviews scraped from Shopee and Lazada. The research followed the CRISP-DM process, employing TF-IDF for vectorization and Proportional SMOTE for data balancing so that the original relative significance of the factors was preserved. In the comparative evaluation, XGBoost achieved the highest accuracy (75.39%), with an F1-score of 75.30%. The WangchanBERTa model, fine-tuned for 20 epochs, reached an accuracy of 74.09%; its performance was hindered by the limited data volume and Out-of-Vocabulary (OOV) issues. The Random Forest classifier, however, yielded the highest ROC AUC (93.73%), indicating the strongest ability to discriminate between classes. The most frequently discussed factors are Product (33.37%) and Process (24.19%), with "aesthetic design" and "shipping speed" identified as the key drivers of customer satisfaction. These insights can help entrepreneurs with strategic marketing planning, inventory management, and packaging development, and so sustain their competitiveness in the online marketplace.
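
The abstract does not spell out how Proportional SMOTE chooses its per-class targets, so the following is only an illustrative sketch of one possible reading: each minority class is grown part of the way toward the largest class, so that the original ordering of class sizes is never reversed. The function name `proportional_targets`, the parameter `alpha`, and the toy counts are assumptions introduced here and are not taken from the study.

```python
def proportional_targets(class_counts, alpha):
    """Return per-class target sizes for oversampling (hypothetical helper).

    Each class is moved a fraction `alpha` of the way from its current
    size toward the largest class. Minority classes grow, but the
    original ordering of class sizes is never reversed.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    largest = max(class_counts.values())
    return {
        label: round(count + alpha * (largest - count))
        for label, count in class_counts.items()
    }


# Toy counts invented for illustration only (not the study's data).
toy_counts = {"Product": 100, "Process": 70, "Price": 30, "Promotion": 10}
targets = proportional_targets(toy_counts, alpha=0.5)
print(targets)
```

A dictionary of this shape, mapping each class to a desired sample count, could be passed as the `sampling_strategy` argument of imbalanced-learn's `SMOTE`. With `alpha=1.0` the sketch reduces to full balancing, and with `alpha=0.0` it leaves the data unchanged.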