Abstract

This thesis investigates how machine learning can enhance the Network Readiness Index (NRI) by refining its indicator weighting, country clustering, and dimensionality. It builds on research that promotes data-driven methods to address a limitation of equal-weighted indices: they do not capture differences in indicator importance or the interdependence among indicators. The research evaluates how unsupervised and supervised learning methods can improve the interpretation of NRI data, and it addresses three research questions: (1) How can principal component analysis (PCA) be used to reduce the dimensionality of the NRI without significant loss of information? (2) Can k-means clustering reveal meaningful groups of countries with similar digital readiness profiles? (3) Which indicators most strongly predict overall NRI scores, and how can this inform a more accurate weighting scheme? The dataset consists of 134 countries and over 50 indicators from the 2023 NRI. PCA was used to reduce redundancy among variables, retaining components that explain over 80% of the variance. K-means clustering was then applied to the PCA-transformed data, with the number of clusters k selected using both the Elbow Method and the Silhouette Method; this identified six distinct clusters. Random Forest regression was applied to estimate the importance of individual indicators in predicting the NRI score, and the top predictors were then used to compute a new weighted NRI score. Random Forest regression achieved high predictive performance (RMSE = 3.45; R² = 0.91). The revised NRI score was calculated using normalized feature importances, producing a version of the index that better reflects each indicator's empirical impact. The Spearman correlation between the original and enhanced rankings was high (r = 0.990), but several countries shifted in rank, which suggests greater sensitivity to critical indicators. Statistical validation of the clustering results with Fisher's Exact Test confirmed a significant relationship (p = 0.004) between the data-driven clusters and the traditional NRI tiers, indicating that the unsupervised groupings align with score-based categorizations. The study concludes that machine learning methods offer a robust alternative to traditional equal-weight approaches in composite index construction.
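
The reweighting and rank-comparison steps summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the country labels, indicator names, importance values, and helper functions below are all hypothetical, and the importances stand in for what a fitted Random Forest model might report. The sketch normalizes the importances so they sum to 1, computes an importance-weighted score per country, and compares the resulting ranking with an equal-weighted one using Spearman rank correlation.

```python
import math

# Hypothetical importances (NOT values from the thesis); they stand in for
# the feature importances a fitted Random Forest regressor might report.
raw_importances = {"ind_a": 5.0, "ind_b": 3.0, "ind_c": 2.0}

# Hypothetical indicator scores on a 0-100 scale for four made-up countries.
countries = {
    "A": {"ind_a": 80.0, "ind_b": 70.0, "ind_c": 60.0},
    "B": {"ind_a": 60.0, "ind_b": 80.0, "ind_c": 85.0},
    "C": {"ind_a": 50.0, "ind_b": 50.0, "ind_c": 50.0},
    "D": {"ind_a": 30.0, "ind_b": 40.0, "ind_c": 20.0},
}


def normalize_weights(importances):
    """Scale importances so that they sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must have a positive sum")
    return {name: value / total for name, value in importances.items()}


def weighted_score(indicators, weights):
    """Weighted sum of one country's indicator values."""
    return sum(weights[name] * indicators[name] for name in weights)


def average_ranks(values):
    """Rank values with 1 = highest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


weights = normalize_weights(raw_importances)
equal_weights = {name: 1 / len(raw_importances) for name in raw_importances}

names = sorted(countries)
equal_scores = [weighted_score(countries[c], equal_weights) for c in names]
revised_scores = [weighted_score(countries[c], weights) for c in names]
rho = spearman(equal_scores, revised_scores)

for c, e, r in zip(names, equal_scores, revised_scores):
    print(f"{c}: equal-weighted={e:.1f}, importance-weighted={r:.1f}")
print(f"Spearman correlation between rankings: {rho:.3f}")
```

In this toy data the two top-ranked countries swap places under the importance-weighted score while the overall rank correlation stays high, which mirrors the kind of result the abstract reports: closely aligned rankings with a few rank shifts.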

Library of Congress Subject Headings

Telecommunication policy--Data processing; Computer networks--Developing countries--Data processing; Machine learning; Principal components analysis

Publication Date

8-27-2025

Document Type

Thesis

Student Type

Graduate

Degree Name

Professional Studies (MS)

Department, Program, or Center

Graduate Programs & Research

Advisor

Sanjay Modak

Advisor/Committee Member

Ehsan Warriach

Campus

RIT Dubai

Plan Codes

PROFST-MS
