Issue: October 2020 Vol. 1 No. 1


Nat Weerawan and Pruet Boonma

Published in Data Science and Engineering (DSE) Record 2020 Vol. 1 No. 1 pp. 9-15

This independent study aims to develop a data pipeline system that transforms an image of a standard printed business cheque into digital numeric data using optical character recognition (OCR). The system was developed specifically to improve the efficiency of data entry in the insurance claim payment process. The evaluation is twofold. The first part compares the algorithms used to build the OCR system, namely k-Nearest Neighbors (kNN), Support Vector Machine (SVM), and Gradient Boosting Machine (GBM), on accuracy and runtime. GBM was found to be the most accurate of the three and required the least runtime. The second part is an evaluation survey of 10 experts who are either developers or persons in charge of the claims process in the insurance industry. The survey results indicate that the experts rated both the accuracy and the speed of the system as outstanding and satisfactory. It can therefore be concluded that the proposed system can increase the capacity of data entry in the insurance claim payment process.

Sutipong Sutinaraphan and Juggapong Nartwichai

Published in Data Science and Engineering (DSE) Record 2020 Vol. 1 No. 1 pp. 16-20

This independent study aims to create a data transformation process, carry out essential analysis, and produce visualizations from passenger journey data on a mass transit system. Urban and population expansion has raised serious concerns about urbanization, especially with regard to city infrastructure. Mass public transportation is considered the core of urban logistics development and one of the most important tools for reducing social disparity. Unfortunately, due to some constraints, some of the most vital data, such as passenger travel data, cannot be obtained in ready-to-use form. This study focuses on developing an automated process that transforms the origin-destination matrix of mass rail transportation into transactional data, which enables further analysis and visualization. Furthermore, the study develops analytical visualizations in accordance with the requirements gathered from the executive of the Mass Rapid Transit Authority of Thailand (MRTA). The study reveals significant insights about passengers of the mass rail transit system, which can serve as supporting data for the yearly and quarterly meetings of MRTA’s board of directors to improve the ticketing system and, most importantly, as primary supporting data for future public transportation development in Thailand.
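
As an illustration of the matrix-to-transactions step described in the abstract above, the following is a minimal sketch only, not the authors' implementation. It assumes that each cell of the origin-destination matrix holds a trip count for one reporting period and that "transactional data" means one record per trip; the function name, station names, counts, and period label are all hypothetical.

```python
def od_matrix_to_transactions(stations, matrix, period):
    """Expand an origin-destination count matrix into one record per trip.

    stations: list of station names (hypothetical labels)
    matrix:   matrix[i][j] = number of trips from stations[i] to stations[j]
    period:   label of the reporting period the counts belong to
    """
    transactions = []
    for i, origin in enumerate(stations):
        for j, destination in enumerate(stations):
            # Emit one transactional row for every counted trip in this cell.
            for _ in range(matrix[i][j]):
                transactions.append(
                    {"period": period, "origin": origin, "destination": destination}
                )
    return transactions


# Hypothetical three-station example.
stations = ["A", "B", "C"]
matrix = [
    [0, 2, 1],
    [3, 0, 0],
    [1, 1, 0],
]
transactions = od_matrix_to_transactions(stations, matrix, "2020-01")
print(len(transactions), "trip records")
```

Once the data is in this long, row-per-trip form, it can be grouped or filtered by origin, destination, or period, which is the kind of downstream analysis and visualization the abstract refers to.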

Thiraphat Tanphiriyakun and Phasit Charoenkwan

Published in Data Science and Engineering (DSE) Record 2020 Vol. 1 No. 1 pp. 21-25

This study proposes methods for extracting collaboration information from author affiliations in the MEDLINE database. The protocol enables large-scale dataset extraction. The results show increasing trends in collaboration parameters year by year and with higher Journal Quartile scores.

Chinnawat Ngamsom and Trasapong Thaiupathump

Published in Data Science and Engineering (DSE) Record 2020 Vol. 1 No. 1 pp. 1-8

This independent study aims to develop a data pipeline system that transforms an image of a standard printed business cheque into digital numeric data using optical character recognition (OCR). The system was developed specifically to improve the efficiency of data entry in the insurance claim payment process. The evaluation is twofold. The first part compares the algorithms used to build the OCR system, namely k-Nearest Neighbors (kNN), Support Vector Machine (SVM), and Gradient Boosting Machine (GBM), on accuracy and runtime. GBM was found to be the most accurate of the three and required the least runtime. The second part is an evaluation survey of 10 experts who are either developers or persons in charge of the claims process in the insurance industry. The survey results indicate that the experts rated both the accuracy and the speed of the system as outstanding and satisfactory. It can therefore be concluded that the proposed system can increase the capacity of data entry in the insurance claim payment process.