## Abstract

We describe a symbol classification technique for identifying the expected locations of neighboring symbols in mathematical expressions. We use the seven symbol layout classes of the DRACULAE math notation parser (Zanibbi et al., 2002) to represent these expected locations: Ascender, Descender, Centered, Open Bracket, Non-Scripted, Variable Range (e.g., integrals), and Root. A new feature based on the shape context (Belongie et al., 2002), named layout context, describes the arrangement of neighboring symbols relative to a reference symbol, and the nearest neighbor rule is used for classification. Our experiments use 1917 mathematical symbols from the University of Washington III document database. Using a leave-one-out estimate, our best classification rate reaches nearly 80%. We find that three factors play important roles in the classification process: the size of the neighborhood area around the reference symbol, the number of points in the key point model that represents a symbol's location, and the sampling positions of those points.
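The evaluation protocol named in the abstract (nearest neighbor classification scored with a leave-one-out estimate) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the toy four-bin histograms stand in for layout context features, the chi-square distance is borrowed from common shape context practice, and the function names `chi2_distance` and `leave_one_out_accuracy` are hypothetical.

```python
def chi2_distance(h1, h2):
    """Chi-square distance between two equal-length histograms.

    Assumption: the thesis may use a different distance measure.
    """
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def leave_one_out_accuracy(features, labels, distance=chi2_distance):
    """Classify each sample by its nearest neighbor among all other samples."""
    correct = 0
    for i, query in enumerate(features):
        # Hold out sample i and search the remaining samples.
        nearest = min(
            (j for j in range(len(features)) if j != i),
            key=lambda j: distance(query, features[j]),
        )
        if labels[nearest] == labels[i]:
            correct += 1
    return correct / len(features)


# Toy data (hypothetical): each row is a small histogram describing where
# neighboring symbols fall around a reference symbol.
features = [
    [4, 0, 1, 0],
    [5, 0, 1, 0],
    [0, 3, 0, 2],
    [0, 4, 0, 2],
    [1, 1, 1, 1],
]
labels = ["Ascender", "Ascender", "Descender", "Descender", "Centered"]

accuracy = leave_one_out_accuracy(features, labels)
print(accuracy)  # 0.8: the lone "Centered" sample has no same-class neighbor
```

In this toy set the single Centered sample is always misclassified because holding it out leaves no other sample of its class, which shows how sparsely represented classes can lower a leave-one-out estimate.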

## Library of Congress Subject Headings

Mathematical symbols (Typefaces)--Classification; Layout (Printing)

## Publication Date

11-17-2009

## Document Type

Thesis

## Department, Program, or Center

Chester F. Carlson Center for Imaging Science (COS)

## Advisor

Zanibbi, Richard

## Recommended Citation

Ouyang, Ling, "A Symbol layout classification for mathematical formula using layout context" (2009). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/3031

## Campus

RIT – Main Campus

## Comments

Note: Imported from RIT's Digital Media Library (running on DSpace) to RIT Scholar Works. A physical copy is available through RIT's Wallace Library at: Z250.6.M3 O89 2009