Jie Yang, Sakgasit Ramingwong

Published in Data Science and Engineering (DSE) Record 2022 Vol. 3 No. 1 pp. 66-77


Machine learning offers a relatively well-established process for identifying influencing factors and predicting sales. Small and medium-sized enterprises, however, often have low data volumes and unrepresentative data types, so the large data requirements become a barrier to using machine learning methods in support of business activities. The data for this study come from a publicly available dataset on Alibaba's Tianchi platform and contain sales data from three branches of a small shop. This paper studies the influencing factors through the correlation of the data and uses random forest regression to rank the importance of features. To predict sales, this paper uses a pre-training model and compares and analyzes multiple machine learning models. The results show that the pre-training method improves some models and degrades others, to varying degrees.
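As an illustration of ranking feature importance with random forest regression, the following is a minimal sketch and not the authors' implementation. It assumes scikit-learn and NumPy are available. The data are synthetic, and the feature names (`promotion`, `price`, `weekday`), the helper `rank_features`, and all coefficients are hypothetical, chosen only so that the example runs on its own.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def rank_features(X, y, names, seed=0):
    """Fit a random forest regressor and return (name, importance) pairs,
    sorted from most to least important."""
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    model.fit(X, y)
    pairs = zip(names, model.feature_importances_)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Synthetic stand-in for a small shop's sales records (NOT the Tianchi data).
rng = np.random.default_rng(0)
n_rows = 300
promotion = rng.integers(0, 2, size=n_rows)   # 1 if a promotion was running
price = rng.uniform(5.0, 15.0, size=n_rows)   # unit price
weekday = rng.integers(0, 7, size=n_rows)     # has no effect on sales here

# Hypothetical relationship: promotions dominate, price has a small effect.
sales = 50.0 * promotion - 1.0 * price + rng.normal(0.0, 2.0, size=n_rows)

X = np.column_stack([promotion, price, weekday])
names = ["promotion", "price", "weekday"]

ranking = rank_features(X, sales, names)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

The impurity-based importances returned by `feature_importances_` are normalized to sum to one, so they are best read as a relative ranking rather than as effect sizes. On a small dataset the ranking can shift with the random seed, which is worth checking before drawing conclusions.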