Published in Data Science and Engineering (DSE) Record 2025 Vol. 6 No. 1 pp. 246-272
Abstract
Customer segmentation is a vital component of data-driven marketing, enabling businesses to understand customer behavior and improve strategic decision-making. This study explores an efficient segmentation approach that combines Recency, Frequency, and Monetary (RFM) analysis with multiple clustering techniques to identify optimal customer groups. Four clustering approaches were implemented and compared: centroid-based (K-Means), density-based (DBSCAN), distribution-based (GMM), and hierarchical (Agglomerative) clustering. Each algorithm was evaluated on its ability to form well-separated and meaningful clusters, with the silhouette score as the primary performance metric. The dataset was standardized before applying the clustering models to ensure comparability. The results reveal that the algorithms exhibit different strengths depending on the underlying data structure. K-Means partitioned customers into distinct groups efficiently but struggled with non-spherical clusters. DBSCAN identified outliers effectively but was sensitive to parameter tuning. GMM provided flexibility by modeling clusters as probability distributions, making it suitable for overlapping customer behaviors. Hierarchical clustering offered an interpretable structure but required significant computational resources for large datasets. Overall, the findings highlight the importance of selecting a clustering technique that matches the characteristics of the data. This study provides valuable insights for businesses aiming to develop marketing strategies through data-driven segmentation.
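The evaluation pipeline summarized above (RFM features, standardization, silhouette score) can be illustrated with a minimal standard-library sketch. This is not the study's implementation: the transaction records, the snapshot date, the segment labels, and the helper names `rfm_table`, `standardize`, and `silhouette_score` are all hypothetical, and Euclidean distance is assumed for the silhouette computation.

```python
from collections import defaultdict
from datetime import date
from math import sqrt
from statistics import mean, pstdev


def rfm_table(transactions, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, day, amount in transactions:
        last_purchase[customer_id] = max(last_purchase.get(customer_id, day), day)
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        c: ((snapshot - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }


def standardize(rows):
    """Z-score each column so R, F, and M share a common scale."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, stds))
        for row in rows
    ]


def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean distance assumed)."""
    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    clusters = defaultdict(list)
    for i, label in enumerate(labels):
        clusters[label].append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:  # singleton cluster: coefficient is defined as 0
            scores.append(0.0)
            continue
        # a: mean distance to the point's own cluster
        a = mean(dist(points[i], points[j]) for j in own)
        # b: mean distance to the nearest other cluster
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


# Hypothetical transactions: (customer_id, purchase_date, amount)
transactions = [
    ("A", date(2024, 12, 28), 120.0),
    ("A", date(2024, 12, 30), 80.0),
    ("B", date(2024, 12, 29), 150.0),
    ("B", date(2024, 12, 31), 60.0),
    ("C", date(2024, 3, 1), 15.0),
    ("D", date(2024, 2, 10), 20.0),
]
rfm = rfm_table(transactions, snapshot=date(2025, 1, 1))
customers = sorted(rfm)
features = standardize([rfm[c] for c in customers])

# Hypothetical segment labels, standing in for the output of any clusterer.
labels = [0, 0, 1, 1]
score = silhouette_score(features, labels)
```

In this sketch the labels are supplied by hand; in practice they would come from whichever clustering model is being evaluated, and the same score can then be compared across models because every model sees the same standardized features.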