MathAMR+: A Unified Graph Neural Network Framework for Multimodal Mathematical Information Retrieval
Abstract
Mathematical Information Retrieval (MIR) focuses on developing systems that enable users to search for and retrieve documents containing mathematical content. A key challenge in building effective math-aware retrieval systems lies in jointly modeling the symbolic and operational structure of mathematical expressions together with their surrounding linguistic context. Existing approaches often use linear token sequences, losing structural information, or rely on separate models for text and math, limiting their ability to capture cross-modal patterns and learn contextualized representations. This thesis proposes MathAMR+, a graph neural network-based retrieval framework that jointly models Abstract Meaning Representation (AMR) graphs, Operator Trees (OPTs), and Symbol Layout Trees (SLTs). Unlike prior work that linearizes similar graph representations for Transformer encoders, the proposed model preserves graph structure and learns contextualized multi-vector representations for fine-grained alignment between textual and mathematical components. It employs a Relational Graph Convolutional Network (RGCN) with hierarchical virtual nodes at the formula, sentence, and document levels to enable structured aggregation and long-range message passing. The model is trained using contrastive learning for dense multi-vector retrieval and evaluated on the ARQMath-3 benchmark. The central research question is whether explicitly modeling contextualized text-formula interactions through a unified graph representation improves math-aware retrieval compared to prior approaches that either linearize such representations or rely on separate specialized techniques for the two modalities. 
Initial results suggest that MathAMR+ effectively leverages its joint graph representation to achieve strong math-aware retrieval performance. Experiments indicate that its gains stem from the complementary structure of operator and symbol layout trees, the robustness of relational message passing over hierarchical virtual nodes, and the benefits of fine-grained multi-vector interactions for aligning mathematical expressions with surrounding textual context.
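As a rough illustration of what contrastive training for dense multi-vector retrieval can look like, the sketch below scores a query against a document by late interaction over sets of node vectors and computes an in-batch contrastive loss. The abstract does not state the scoring function or the loss form, so the MaxSim-style score (as popularized by ColBERT), the InfoNCE loss, the temperature value, and all function names here are assumptions made for illustration, not the thesis's implementation.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def maxsim_score(query_vecs, doc_vecs):
    """Assumed late-interaction score: for each query vector, take its best
    cosine match among the document vectors, then sum over query vectors."""
    q = l2_normalize(np.asarray(query_vecs, dtype=float))
    d = l2_normalize(np.asarray(doc_vecs, dtype=float))
    sim = q @ d.T  # shape: (num_query_vectors, num_doc_vectors)
    return float(sim.max(axis=1).sum())


def info_nce_loss(score_matrix, temperature=0.1):
    """Assumed in-batch contrastive loss: row i holds the scores of query i
    against every document in the batch; document i is the positive."""
    logits = np.asarray(score_matrix, dtype=float) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def batch_scores(queries, docs):
    """Score every query against every document in the batch."""
    return np.array([[maxsim_score(q, d) for d in docs] for q in queries])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-ins for node embeddings produced by a graph encoder:
    # each item is a variable-size set of 8-dimensional vectors.
    queries = [rng.normal(size=(n, 8)) for n in (3, 5)]
    docs = [rng.normal(size=(n, 8)) for n in (6, 4)]
    scores = batch_scores(queries, docs)
    print("score matrix shape:", scores.shape)
    print("contrastive loss:", info_nce_loss(scores))
```

Keeping one vector per graph node, rather than pooling to a single vector, is what allows a formula node in the query to match a specific formula or text node in the candidate document.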
Library of Congress Subject Headings
Neural networks (Computer science); Natural language processing (Computer science); Information storage and retrieval systems--Mathematics; Graph theory
Publication Date
5-2026
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Science (MS)
Department, Program, or Center
Computer Science, Department of
College
Golisano College of Computing and Information Sciences
Advisor
Richard Zanibbi
Advisor/Committee Member
Carlos Rivero
Advisor/Committee Member
Matthew Fluet
Recommended Citation
Yoon, Jacob, "MathAMR+: A Unified Graph Neural Network Framework for Multimodal Mathematical Information Retrieval" (2026). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/12543
Campus
RIT – Main Campus
Plan Codes
COMPSCI-MS
