Sentiment Analysis of Healthcare Services at RSUD Soe Using Machine Learning and Latent Dirichlet Allocation

Authors

  • Agatha Marilin Saekoko, Faculty of Information Technology, Satya Wacana Christian University, Salatiga, Indonesia
  • Hindriyanto Dwi Purnomo, Faculty of Information Technology, Satya Wacana Christian University, Salatiga, Indonesia
  • Yessica Nataliani, Faculty of Information Technology, Satya Wacana Christian University, Salatiga, Indonesia

DOI:

https://doi.org/10.35799/jis.v26i1.67193

Keywords:

Healthcare Services, Latent Dirichlet Allocation, Machine Learning, RSUD Soe, Sentiment Analysis

Abstract

Healthcare services are crucial to improving public well-being, and every individual has the right to healthcare that is high-quality, safe, efficient, and affordable. This study aims to identify and analyze public perceptions and sentiments toward healthcare services at RSUD Soe, and to evaluate the performance of several machine learning methods in classifying those sentiments. Data were collected from 278 respondents through a Likert-scale questionnaire covering perceptions of, and satisfaction with, various service aspects. Sentiment analysis was conducted using four machine learning algorithms: Naïve Bayes, C4.5, Random Forest, and Support Vector Machine (SVM). Naïve Bayes achieved the highest accuracy at 82.14 percent, followed by SVM at 80 percent, Random Forest at 79 percent, and C4.5 at 73.21 percent. The study also applied Latent Dirichlet Allocation (LDA) to identify the main themes in public feedback. LDA generated twelve topics reflecting key issues such as waiting time, availability of medical personnel, facility cleanliness, and the attitudes of healthcare staff. The majority of comments expressed positive sentiment, particularly concerning staff friendliness and service quality. These findings were used to formulate improvement recommendations, including enhancing service quality, increasing the number of medical personnel, and optimizing facilities. The research demonstrates that a data-driven quantitative approach is effective for evaluating healthcare service quality and supporting more targeted decision-making. The results are expected to help RSUD Soe improve service quality continuously and effectively.
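
As an illustration of the kind of classifier the abstract reports as most accurate, the following is a minimal sketch of multinomial Naïve Bayes with Laplace smoothing, written in plain Python. It is not the authors' implementation: the abstract does not describe the preprocessing, features, or data split, so the English toy comments, the whitespace tokenizer, and the helper names (train_naive_bayes, predict, accuracy) are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # label -> token -> count
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # assumed: simple whitespace tokenizer
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "priors": {c: n / len(labels) for c, n in class_counts.items()},
        "word_counts": word_counts,
        "totals": {c: sum(word_counts[c].values()) for c in class_counts},
        "vocab": vocab,
        "alpha": alpha,
    }


def predict(model, text):
    """Return the label with the highest log posterior for one comment."""
    vocab_size = len(model["vocab"])
    scores = {}
    for label, prior in model["priors"].items():
        score = math.log(prior)
        denom = model["totals"][label] + model["alpha"] * vocab_size
        for token in text.lower().split():
            if token in model["vocab"]:  # skip tokens never seen in training
                count = model["word_counts"][label][token]
                score += math.log((count + model["alpha"]) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy comments, NOT data from the study.
train_docs = [
    "staff friendly and helpful",
    "nurses friendly service fast",
    "clean room good service",
    "waiting time too long",
    "long queue slow service",
    "dirty toilet and slow staff",
]
train_labels = ["positive"] * 3 + ["negative"] * 3

test_docs = ["friendly staff and clean room", "slow queue and long waiting time"]
test_labels = ["positive", "negative"]

model = train_naive_bayes(train_docs, train_labels)
predictions = [predict(model, doc) for doc in test_docs]
print(predictions, accuracy(test_labels, predictions))
```

The accuracy helper computes the same metric the abstract uses to rank the four algorithms, namely the share of test items whose predicted sentiment matches the true label.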

Published

2026-04-22

How to Cite

Saekoko, A. M., Purnomo, H. D., & Nataliani, Y. (2026). Sentiment Analysis of Healthcare Services at RSUD Soe Using Machine Learning and Latent Dirichlet Allocation. Jurnal Ilmiah Sains, 26(1), 86–104. https://doi.org/10.35799/jis.v26i1.67193

Section

Articles