Abstract
Networks serve as a fundamental framework for modeling complex systems across diverse domains, from biological interactions and social structures to economic markets and information networks. Machine learning has significantly enhanced the ability to extract meaningful patterns from graph-based systems, improving predictive accuracy in applications such as market design, recommendation systems, and biomedical discovery. Despite these successes, several fundamental challenges remain. A major issue is the reliance on simplistic models for theoretical analysis, simulation, and benchmarking, which often fail to capture the complexity of real-world networked systems. Many theoretical studies focus on generative models that assume uniform randomness in edge formation, preference rankings, or network evolution. While these models provide analytical tractability, they often overlook the structured dependencies and correlations that shape actual interactions, which limits their applicability in practice. Similarly, the simulation and benchmarking of machine learning models for networks frequently rely on synthetic datasets that do not reflect the structure of empirical networks. Many standard benchmarks use random graph generation or random edge masking for evaluation, disregarding the fact that real networks evolve dynamically and exhibit structured missingness. For example, in link prediction tasks on biomedical graphs, edges are often removed at random for training and evaluation, even though in practice new edges are discovered sequentially over time, so models must generalize to future observations rather than to randomly missing links. Likewise, in stable matching problems, synthetic preferences are often generated from uniform distributions, ignoring the empirical correlation patterns observed in matching markets such as labor markets or school admissions.
Addressing these challenges requires a shift toward realistic domain-specific data modeling, evaluation methodologies that reflect temporal and structured missingness, and models designed for deployment in dynamic, real-world networks. This dissertation contributes to that shift by: (1) investigating fairness in stable matching with correlated preferences, revealing how preference asymmetries shape fairness outcomes and proposing efficient, stability-preserving, fairness-aware solutions; (2) developing a principled model selection framework for signed networks, introducing GRASMOS, a maximum-likelihood-based approach to infer realistic sign assignment patterns in gene regulatory networks; and (3) improving evaluation methodologies for link prediction by incorporating temporal graph evolution, reducing generalization gaps in biomedical link prediction. By advancing fairness-aware algorithms, realistic generative models, and improved evaluation methodologies, this dissertation contributes to more reliable, interpretable, and generalizable machine learning methods for structured decision-making in graph-based systems across social and biological domains.
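The contrast the abstract draws between random edge masking and temporal evaluation can be illustrated with a minimal sketch. This is not the dissertation's evaluation pipeline; the function names (`random_split`, `temporal_split`), the edge format `(u, v, year)`, and the toy edges are all hypothetical and chosen only for illustration.

```python
import random

def random_split(edges, test_fraction, seed=0):
    """Hold out a uniformly random subset of edges, ignoring timestamps."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # Returns (train, test)
    return shuffled[n_test:], shuffled[:n_test]

def temporal_split(edges, test_fraction):
    """Hold out the most recently discovered edges as the test set."""
    ordered = sorted(edges, key=lambda e: e[2])  # sort by discovery time
    n_test = int(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    # Returns (train, test); every test edge is at least as recent as any train edge
    return ordered[:cut], ordered[cut:]

# Hypothetical toy graph: (node_u, node_v, year_discovered)
edges = [
    ("g1", "g2", 2001), ("g1", "g3", 2004), ("g2", "g4", 2007),
    ("g3", "g5", 2010), ("g4", "g5", 2013), ("g2", "g6", 2016),
    ("g5", "g6", 2019), ("g1", "g6", 2022),
]

train_r, test_r = random_split(edges, test_fraction=0.25)
train_t, test_t = temporal_split(edges, test_fraction=0.25)

print("random test edges:  ", test_r)
print("temporal test edges:", test_t)
```

Under the random split, held-out edges may predate many training edges, so a model can in effect see the future. Under the temporal split, the model is scored only on edges discovered after everything it was trained on, which is closer to the deployment setting the abstract describes.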
Library of Congress Subject Headings
Machine learning; Neural networks (Neurobiology); Generative programming (Computer science); Graph theory
Publication Date
3-2025
Document Type
Dissertation
Student Type
Graduate
Degree Name
Computing and Information Sciences (Ph.D.)
College
Golisano College of Computing and Information Sciences
Advisor
Ivona Bezakova
Advisor/Committee Member
Stanislaw Radziszowski
Advisor/Committee Member
Varsha Dani
Recommended Citation
Brilliantova, Angelina, "Machine Learning for Complex Networks: Generalization, Fairness, and Model Selection in Graph-Based Systems Across Social and Biological Domains" (2025). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/12103
Campus
RIT – Main Campus
Plan Codes
COMPIS-PHD