Abstract
This thesis examines the use of advanced machine learning to predict crime danger in Los Angeles, where the 2023 violent crime rate (503 per 100,000) exceeds the national average (363.8 per 100,000). Rooted in theories such as social disorganization and victim vulnerability, it addresses the lack of combined victim-centric modeling by focusing on three questions: (1) Do blended ensemble models outperform individual models in predicting crime danger? (2) Can unsupervised learning techniques enhance supervised models’ accuracy through label generation or augmentation? (3) How can the decision-making processes of accurate machine learning models be interpreted? Using a mixed-methods approach, the study draws on historical crime data from Los Angeles (2020–2025), comprising approximately 1,000,000 crime incidents, together with various environmental and demographic variables. Blended models combining Random Forest, Gradient Boosting, XGBoost, and Multilayer Perceptron learners were tested against the individual models. Clustering methods (K-means and DBSCAN) identified crime patterns and improved label quality. SHAP analyses explained model decisions. Findings show that blended models achieved slightly higher predictive accuracy (68%) than individual models (67%). However, the XGBoost algorithm on its own outperformed the blended model, with an accuracy of 69%. Clustering improved model performance by 3–5%. Socioeconomic factors and temporal patterns were significant influencers, offering insights for both individuals and law enforcement. In summary, blended ensemble algorithms may fail to outperform standard models, but unsupervised learning enhances crime prediction under a victim-centric approach, though ethical concerns about bias remain. Recommendations include using these models for targeted resource allocation in Los Angeles. Future research should extend to other cities, use real-time data, and incorporate fairness-aware algorithms to address ethical issues.
Library of Congress Subject Headings
Crime forecasting--California--Los Angeles; Predictive analytics; Machine learning; Cluster analysis
Publication Date
5-2025
Document Type
Thesis
Student Type
Graduate
Degree Name
Professional Studies (MS)
Department, Program, or Center
Graduate Programs & Research
Advisor
Sanjay Modak
Advisor/Committee Member
Khalil Al Hussaeni
Recommended Citation
Okpako, Joshua Oghenetega, "Predicting the Probability of Crime Related Danger in Los Angeles" (2025). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/12161
Campus
RIT Dubai
Plan Codes
PROFST-MS