A Statistical Approach to Crime Rate Prediction Using Multiple Linear Regression

Authors

  • Setiawan Ardi Wijaya, Faculty of Business Management and Information Technology, Universiti Muhammadiyah Malaysia, Malaysia
  • Edo Arribe, Faculty of Data Science and Computing, Universiti Malaysia Kelantan, Malaysia
  • Muhitualdi, Department of Information Systems, Universitas Muhammadiyah Riau, Indonesia

DOI:

https://doi.org/10.64539/sjcs.v1i2.2025.47

Keywords:

Prediction, Crime, Multiple Linear Regression, Riau, MAPE

Abstract

The high crime rate in Riau Province threatens social stability and public safety, and accurate prediction is needed to support crime prevention. According to data from the Central Statistics Agency (BPS), Riau ranked seventh among the provinces with the highest crime rates in Indonesia in 2022, which suggests that conventional prevention efforts remain insufficient. Studies that apply statistical prediction models to crime in Riau are still limited, leaving a gap in data-driven decision making. This study develops a crime rate prediction model for Riau Province using Multiple Linear Regression (MLR) on BPS crime data from 2019–2023. The independent variables are six types of crime: corruption, drug dealing, drug use, terrorism, illegal logging, and human trafficking. The dependent variable is the total number of crimes per district or city. The research process comprised data collection, data understanding, preprocessing, application of the linear regression algorithm, model training and testing, and evaluation using Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE). The results show that Pekanbaru City recorded the highest number of cases, most of them drug-related. The model predicts that cases in Pekanbaru will rise from 3,331 in 2024 to 5,852 in 2027, while cases in Dumai City are projected to fall from 543 to 397. The model is highly accurate in most areas, particularly Kampar (MAPE 0.28%), Siak (0.52%), and Rokan Hilir (0.94%), but much less accurate for the Meranti Islands (MAPE 565.99%) because of data instability. These findings indicate that Multiple Linear Regression can predict crime trends effectively in most districts and can serve as a quantitative decision-making tool for law enforcement and local governments.
Further research should incorporate socioeconomic factors such as poverty and unemployment, and compare the results with alternative forecasting methods such as ARIMA and Exponential Smoothing, to improve prediction accuracy.
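The workflow summarized in the abstract (fit a multiple linear regression, then score it with MAPE and RMSE) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the two-predictor toy data, and the use of the normal equations are all assumptions made for the example, and none of the numbers come from the BPS dataset.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(features, targets):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    x = [[1.0] + list(row) for row in features]  # leading 1.0 is the intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    """Return intercept plus the weighted sum of the predictors."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; undefined if an actual is 0."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same units as the target."""
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


# Hypothetical toy data (NOT from the paper): y = 2 + 3*x1 + 0.5*x2 exactly.
toy_features = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
toy_targets = [6.0, 8.5, 13.0, 15.5, 20.0]
beta = fit_mlr(toy_features, toy_targets)
print("coefficients:", [round(b, 4) for b in beta])

# The same absolute error of 10 cases gives very different MAPE values
# depending on the size of the actual counts (hypothetical numbers).
print(f"large counts: MAPE {mape([100, 200], [110, 190]):.2f}%, "
      f"RMSE {rmse([100, 200], [110, 190]):.2f}")
print(f"small counts: MAPE {mape([2, 4], [12, 14]):.2f}%, "
      f"RMSE {rmse([2, 4], [12, 14]):.2f}")
```

In the second comparison both cases have an RMSE of 10, yet MAPE is about 7.5% for the large counts and 375% for the small ones. This shows one way a district with very low case counts could produce an extreme MAPE; whether that is the cause of the Meranti Islands result is an assumption here, since the abstract attributes it only to data instability.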

Published

2025-10-30

How to Cite

Wijaya, S. A., Arribe, E., & Muhitualdi. (2025). A Statistical Approach to Crime Rate Prediction Using Multiple Linear Regression. Scientific Journal of Computer Science, 1(2), 63–70. https://doi.org/10.64539/sjcs.v1i2.2025.47
