SIMULATION ANALYSIS OF E-COMMERCE CUSTOMER CHURN PREDICTION USING THE RANDOM FOREST ALGORITHM ON SYNTHETIC DATA

Authors

  • Rayhan Bagoes Santoso, Universitas Amikom Yogyakarta
  • Bety Wulan Sari, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.35794/jmbi.v13i1.67545

Abstract

High customer churn is a critical challenge for the e-commerce industry in Indonesia, with potential losses reaching billions of rupiah per year. This study implements a customer churn prediction system as a proof of concept using machine learning algorithms. Because access to private e-commerce data is limited, the study uses a synthetic dataset of 1,000 customer records with 9 key features, including tenure, monthly spending, total transactions, support tickets, and days since last purchase. Three machine learning algorithms, Logistic Regression, Decision Tree, and Random Forest, are implemented to classify customers as churn or non-churn. The results show that Random Forest delivers the most stable performance, with an accuracy of 87.5%, precision of 86%, recall of 87%, and F1-score of 86%. The drop in performance relative to the deterministic model indicates that the model was evaluated under more realistic data conditions and did not overfit to the generative rules.
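The workflow described in the abstract can be illustrated with a minimal sketch, which is not the authors' code. It assumes scikit-learn and NumPy are available and uses only the five features the abstract names. The feature distributions, the noisy generative rule for the churn label, the 80/20 split, and all hyperparameters are hypothetical choices made for illustration. The scores it prints will therefore not reproduce the figures reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n = 1000  # dataset size stated in the abstract

# Five of the nine features named in the abstract; distributions are assumed.
tenure = rng.integers(1, 61, n)              # months as a customer
monthly_spending = rng.gamma(2.0, 150.0, n)  # arbitrary currency units
total_transactions = rng.poisson(20, n)
support_tickets = rng.poisson(1.5, n)
last_purchase_days = rng.integers(0, 181, n)

X = np.column_stack([
    tenure, monthly_spending, total_transactions,
    support_tickets, last_purchase_days,
]).astype(float)

# Hypothetical generative rule: a linear risk score plus Gaussian noise, so
# the label is not a deterministic function of the features.
score = (
    0.03 * last_purchase_days
    + 0.5 * support_tickets
    - 0.05 * tenure
    - 0.03 * total_transactions
    - 1.0
    + rng.normal(0.0, 1.0, n)
)
y = (score > 0).astype(int)  # 1 = churn, 0 = retained

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The three algorithms compared in the study; hyperparameters are assumed.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

In this sketch the noise term keeps any model from reaching a perfect score, which is one plausible reading of the abstract's contrast with a deterministic model. Setting the noise scale to zero would make the label a pure function of the features.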

Published

2026-03-30

How to Cite

Rayhan Bagoes Santoso, & Bety Wulan Sari. (2026). ANALISIS SIMULASI PREDIKSI CUSTOMER CHURN E-COMMERCE MENGGUNAKAN ALGORITMA RANDOM FOREST BERBASIS DATA SINTETIS. JMBI UNSRAT (Jurnal Ilmiah Manajemen Bisnis Dan Inovasi Universitas Sam Ratulangi), 13(1), 317–331. https://doi.org/10.35794/jmbi.v13i1.67545