Abstract
Deep learning and computer vision are extensively used to solve problems in a wide range of domains, from automotive and manufacturing to healthcare and surveillance. Research in deep learning for food images is mainly limited to food identification and detection. Food segmentation is an important problem because it is the first step in nutrition monitoring and in food volume and calorie estimation. This research aims to extend deep learning and semantic segmentation to this problem by proposing a novel single-pass, end-to-end trainable network for food segmentation. Our architecture incorporates both channel attention and spatial attention information in an expanded multi-scale feature representation using the WASPv2 module. The refined features are processed with the advanced multi-scale waterfall module, which combines the benefits of cascade filtering and pyramid representations without requiring a separate decoder or postprocessing.
Library of Congress Subject Headings
Image processing--Digital techniques; Pattern recognition systems; Deep learning (Machine learning); Computer vision; Food--Data processing
Publication Date
11-2021
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Andreas Savakis
Advisor/Committee Member
Andres Kwasinski
Advisor/Committee Member
Alexander Loui
Recommended Citation
Sharma, Udit, "GourmetNet: Food Segmentation Using Multi-Scale Waterfall Features With Spatial and Channel Attention" (2021). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/11034
Campus
RIT – Main Campus
Plan Codes
CMPE-MS