Patcharapol Yasamut, Pree Thiengburanathum

Published in Data Science and Engineering (DSE) Record 2022 Vol. 3 No. 1 pp. 78-92


The demand for electricity in buildings is currently rising rapidly at both national and international scales. A forecasting model can help reduce building electricity usage, lowering utility costs not just for a single building but across a whole region. According to the literature, machine-learning and deep-learning techniques have been used in previous studies on forecasting electricity consumption. However, there is a dearth of research into the use of clustering to predict electricity consumption in tropical regions such as Thailand and other countries in Southeast Asia. In this project, we present new research on hourly forecasting of building energy usage. Electricity consumption data at 1-hour intervals were collected by smart meters from nineteen buildings over a period of one year and five months. Weather data at 1-hour intervals, including PM10, PM2.5, temperature, and humidity, were also collected from one building. Cross-correlation analysis indicated that there was only a weak correlation between the weather data and the electricity consumption data. Vector Auto-Regression (VAR), Vector Auto-Regressive Moving Average (VARMA), Support Vector Regression (SVR), and Multi-Layer Perceptron (MLP) models were developed as baseline forecasting models. The SVR model outperformed the other models, with the lowest validation RMSE on the training dataset. The hyperparameters of the SVR models were then optimized to maximize forecasting accuracy on the training dataset. To reduce the time needed for training and optimizing the models, the k-Shape clustering approach was used to group electricity consumption into patterns. We used the centroid of each cluster as a representation of the cluster's electricity consumption data in order to forecast the electricity consumption of buildings within the cluster.
A t-test comparing the forecasting performance of SVR with and without clustering showed no statistically significant difference between the two (p-value = 0.7258).
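
As an illustration of the clustering step, k-Shape groups time series using a shape-based distance (SBD): one minus the maximum normalized cross-correlation over all time shifts, computed on z-normalized series. The following is a minimal sketch of that distance only, not the authors' implementation; the function names and the short load profiles are hypothetical and chosen purely for illustration.

```python
import math


def z_normalize(series):
    """Rescale a series to zero mean and unit variance (all zeros if constant)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in series]


def shape_based_distance(x, y):
    """SBD = 1 - max over shifts of the normalized cross-correlation.

    Values lie in [0, 2]; 0 means the two series have the same shape.
    """
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    x, y = z_normalize(x), z_normalize(y)
    n = len(x)
    norm = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if norm == 0:
        # At least one series is constant, so no shape information is available.
        return 1.0
    best = -math.inf
    for shift in range(-(n - 1), n):
        # Cross-correlation of x with y shifted by `shift` positions.
        lo, hi = max(0, shift), min(n, n + shift)
        cc = sum(x[i] * y[i - shift] for i in range(lo, hi))
        best = max(best, cc)
    return 1.0 - best / norm


# Hypothetical hourly load profiles (arbitrary units), for illustration only.
office = [1.0, 1.0, 2.0, 5.0, 6.0, 5.0, 2.0, 1.0]
office_scaled = [10.0 * v + 5.0 for v in office]    # same shape, different scale
night_load = [6.0, 5.0, 2.0, 1.0, 1.0, 1.0, 2.0, 5.0]

d_same = shape_based_distance(office, office_scaled)
d_diff = shape_based_distance(office, night_load)
print(d_same, d_diff)
```

Because the series are z-normalized first, two buildings whose consumption differs only in magnitude are treated as having the same pattern, which is what allows a single cluster centroid to stand in for several buildings.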