Abstract
The Multimedia Content Description Interface (ISO/IEC 15938), commonly known as MPEG-7, became a standard in September 2001. Unlike its predecessors, MPEG-7 standardizes the description of multimedia metadata. By providing robust descriptors and an effective system for storing them, MPEG-7 is designed to provide a means of navigating audio-visual content. In particular, MPEG-7 provides two two-dimensional shape descriptors, the Angular Radial Transform (ART) and Curvature Scale Space (CSS), for use in image and video annotation and retrieval. Field Programmable Gate Arrays (FPGAs) have a very general structure built from programmable switches that the end user, rather than the manufacturer, configures for whatever design the application requires. This flexibility has led to the use of FPGAs for prototyping and implementing circuit designs, and they have also been suggested as a component of reconfigurable computing. For this work, an FPGA-based ART extractor was designed and simulated for a Xilinx Virtex-E XCV300E in order to provide a speedup over software-based extraction. The design is capable of processing over 69,4400 pixels a minute, utilizes 99% of the FPGA's logic resources, and operates at a clock rate of 25 MHz. Along with the proposed design, the MPEG-7 shape descriptors were evaluated for how well they retrieved similar objects and how closely those retrievals matched what a human would expect. Results showed that the majority of retrievals made using the MPEG-7 shape descriptors were visually acceptable, although the human judgments themselves showed high variance. Finally, this thesis briefly explored the potential of the ART descriptor for optical character recognition (OCR) in the context of image retrieval from databases. The ART was shown to have potential for use in OCR; however, further research in this area is still needed.
Library of Congress Subject Headings
MPEG (Video coding standard); Image processing--Digital techniques--Standards; VHDL (Computer hardware description language); Field programmable gate arrays
Publication Date
2003
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Andreas Savakis
Advisor/Committee Member
Ricardo de Queiroz
Recommended Citation
Woz, Bret, "An Exploration of MPEG-7 Shape Descriptors" (2003). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/7562
Campus
RIT – Main Campus
Comments
Physical copy available from RIT's Wallace Library at TA1637 .W69 2003