Abstract

Road traffic accidents remain one of the most significant global public safety issues, and the severity of crashes and the fatalities they cause impose a heavy human and economic burden. This research develops a machine learning-based framework to analyse and predict traffic accident severity from environmental, temporal, road, and human variables. A dataset of 798 valid observations and 14 variables was examined using exploratory data analysis, supervised classification models (Decision Tree and Random Forest), and unsupervised K-Means clustering. Feature engineering was used to build composite indicators such as the Experience-Age Ratio and the Traffic Risk Index. The Random Forest model achieved an overall accuracy of 58.75%, outperforming the Decision Tree, and was therefore the best model for severity prediction; it identified driver age, driving experience, and other exposure-related factors as the most influential predictors. Although predictive performance was moderate and the ability to identify rare high-severity cases was limited by class imbalance, the framework captured meaningful behavioural and situational regularities. The cluster analysis also revealed distinct accident patterns that justify targeted intervention measures. The results demonstrate the value of combining supervised and unsupervised machine learning methods to improve interpretability and support evidence-based traffic safety planning.

Publication Date

5-2026

Document Type

Thesis

Student Type

Graduate

Degree Name

Professional Studies (MS)

Department, Program, or Center

Graduate Programs & Research

Advisor

Ehsan Warriach

Advisor/Committee Member

Sanjay Modak

Comments

This thesis has been embargoed. The full text will be available on or around 11/12/2026.

Campus

RIT Dubai
