Abstract
Current methods in computer vision and object detection rely heavily on neural networks and deep learning. This active area of research has applications in autonomous driving, aerial imaging, defense, and surveillance. State-of-the-art object detection methods localize an object by drawing an axis-aligned rectangular bounding box around it. Such axis-aligned boxes ignore object pose, which reduces localization accuracy and limits downstream tasks such as object understanding and tracking. To overcome these limitations, this research presents object detection improvements that yield tighter and more precise detections. In particular, we modify the anchor box definition in two ways: first, to include rotation along with height and width, and second, to allow arbitrary shapes defined by four corner points. Further, the introduction of new anchor boxes gives the model additional freedom to model objects that are oriented about a 45-degree axis of rotation. The resulting network makes minimal compromises in speed and reliability while providing more accurate localization. We present results on the DOTA dataset, showing the value of flexible object boundaries, especially for rotated and non-rectangular objects.
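As an illustration of the two box parameterizations mentioned in the abstract, the following minimal Python sketch converts a rotated box (center, width, height, angle) into its four corner points. It is not taken from the thesis: the function names `rotated_box_corners` and `quad_area`, the angle convention (degrees, counter-clockwise in a y-up coordinate frame), and the corner ordering are all assumptions made for this example.

```python
import math


def rotated_box_corners(cx, cy, w, h, theta_deg):
    """Return the four corners of a w-by-h box centered at (cx, cy).

    Hypothetical helper: the box is rotated by theta_deg counter-clockwise
    in a y-up frame (it appears clockwise in image coordinates, where y
    points down). Corners are returned in a consistent winding order.
    """
    t = math.radians(theta_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    # Corner offsets from the center before rotation.
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Apply a 2D rotation to each offset, then translate to the center.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in half
    ]


def quad_area(corners):
    """Area of a simple four-point polygon via the shoelace formula."""
    total = 0.0
    n = len(corners)
    for i in range(n):
        x0, y0 = corners[i]
        x1, y1 = corners[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# A 2-by-1 box rotated by 45 degrees keeps its area of w * h.
corners = rotated_box_corners(10.0, 20.0, 2.0, 1.0, 45.0)
area = quad_area(corners)
```

A rotated box is a special case of the four-corner representation: any `(cx, cy, w, h, theta)` tuple maps to four points, whereas an arbitrary quadrilateral generally cannot be mapped back to a single rotated rectangle.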
Library of Congress Subject Headings
Computer vision; Pattern recognition systems
Publication Date
2-2019
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Raymond Ptucha
Advisor/Committee Member
Clark Hochgraf
Advisor/Committee Member
Alexander Loui
Recommended Citation
Bhat, Aneesh, "Aerial Object Detection using Learnable Bounding Boxes" (2019). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/10203
Campus
RIT – Main Campus
Plan Codes
CMPE-MS