Abstract
Generating motifs from known active sites and matching those motifs to an uncharacterized protein is a classic way of determining protein function. Until now, the generation of motifs has been based purely on enzymatic function. This approach does not account for situations where highly different active sites arrive at the same function through processes such as convergent evolution. As such, a secondary metric on which to base the generation of motifs is necessary. Such a metric exists in the form of UniProt designations for proteins that are homologous on a global scale, or Pfam designations for proteins that are homologous at the active-site level.
Here, we describe a tool to generate highly selective motifs using the aforementioned metrics. We were able to collapse a large number of proteins into their representative motifs with little loss in sensitivity, creating an “average” representation of each motif. These motifs will aid in characterizing proteins of known structure but unknown function.
Library of Congress Subject Headings
Proteins--Analysis--Data processing; Proteomics
Publication Date
9-24-2015
Document Type
Thesis
Student Type
Graduate
Degree Name
Bioinformatics (MS)
Department, Program, or Center
Thomas H. Gosnell School of Life Sciences (COS)
Advisor
Paul A. Craig
Advisor/Committee Member
Gary Skuse
Advisor/Committee Member
Feng Cui
Recommended Citation
Baker, Cameron, "Homology Based Motif Generation" (2015). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/9163
Campus
RIT – Main Campus
Plan Codes
BIOINFO-MS