Abstract
The emergence of large, general-purpose foundation models has sparked significant interest in the broader machine learning community. Among the many models being released, the Segment Anything Model (SAM) has demonstrated exceptional capabilities for object segmentation in various settings. Given its expansive training data, SAM has been applied to image segmentation across downstream tasks ranging from tumour segmentation to aerial object detection. However, the majority of SAM's pretraining data consists of naturally-occurring images, whose characteristics differ significantly from images of tumours or images taken by drones. To align SAM with these previously unseen domains, the model must be fine-tuned so that it leverages its prior knowledge while learning generalizable features from new images, thereby improving performance on a given target dataset. However, given their extremely large size and number of parameters, foundation models such as SAM are too costly to adapt with traditional fine-tuning methods. To overcome this limitation, a new family of methods known as Parameter-Efficient Fine-Tuning (PEFT) techniques has emerged to effectively and efficiently tailor these large models to application domains outside their training data. While there has been considerable research on developing new PEFT techniques, different methods modify a model's representation in different ways, making the selection of the most appropriate method for a particular domain of interest a non-trivial task. To this end, we propose a new framework, Mixture-of-PEFTs (MoPEFT), inspired by traditional Mixture-of-Experts (MoE) methodologies, and use it to fine-tune SAM. Our MoPEFT framework incorporates three different PEFT techniques as submodules and learns to dynamically activate the ones that are best suited for a given data-task setup. We evaluate our method on the Segment Anything Model across 22 datasets spread over 5 domains and show that MoPEFT consistently outperforms other fine-tuning methods on the MESS benchmark.
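
To make the MoE-over-PEFT idea concrete, the following is a minimal, hypothetical PyTorch sketch, not the thesis implementation: a frozen base projection (standing in for a SAM encoder layer) is combined with a learned gate that mixes the outputs of several PEFT branches. The specific branches shown (LoRA, a bottleneck adapter, and a learned bias shift) and all names are assumptions for illustration; the thesis may use a different set of submodules and routing scheme.

import torch
import torch.nn as nn

class LoRABranch(nn.Module):
    # Low-rank update: down-project then up-project, initialized to a zero update.
    def __init__(self, dim, rank=4):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)
        nn.init.zeros_(self.up.weight)

    def forward(self, x):
        return self.up(self.down(x))

class AdapterBranch(nn.Module):
    # Bottleneck adapter: down-project, nonlinearity, up-project.
    def __init__(self, dim, bottleneck=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, bottleneck), nn.GELU(),
                                 nn.Linear(bottleneck, dim))

    def forward(self, x):
        return self.net(x)

class BiasBranch(nn.Module):
    # Learned additive shift applied to every token.
    def __init__(self, dim):
        super().__init__()
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return self.shift.expand_as(x)

class MoPEFTLayer(nn.Module):
    # Frozen base projection plus a gated mixture of PEFT updates (MoE-style routing).
    def __init__(self, dim):
        super().__init__()
        self.base = nn.Linear(dim, dim)
        self.base.weight.requires_grad_(False)  # base (SAM) weights stay frozen
        self.base.bias.requires_grad_(False)
        self.branches = nn.ModuleList([LoRABranch(dim), AdapterBranch(dim), BiasBranch(dim)])
        self.gate = nn.Linear(dim, len(self.branches))  # router over PEFT "experts"

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)  # per-token mixture weights
        update = sum(w.unsqueeze(-1) * branch(x)
                     for w, branch in zip(weights.unbind(-1), self.branches))
        return self.base(x) + update

# Usage: tokens from a frozen encoder block, shape (batch, num_tokens, dim)
layer = MoPEFTLayer(dim=256)
tokens = torch.randn(2, 64, 256)
out = layer(tokens)  # (2, 64, 256)

In this sketch only the PEFT branches and the gate are trainable, so the parameter count added on top of the frozen backbone stays small while the gate can weight each branch differently per token and per dataset.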
Publication Date
8-2024
Document Type
Thesis
Student Type
Graduate
Degree Name
Data Science (MS)
Department, Program, or Center
Software Engineering, Department of
College
Golisano College of Computing and Information Sciences
Advisor
Andreas Savakis
Advisor/Committee Member
Travis Desell
Advisor/Committee Member
Qi Yu
Recommended Citation
Sahay, Rajat, "A Mixture-of-Experts Approach to Fine-Tuning the Segment Anything Model" (2024). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/11880
Campus
RIT – Main Campus