|Title||Improving Remote Homology Detection Using Sequence Properties and Position Specific Scoring Matrices|
|Publication Type||Conference Paper|
|Year of Publication||2009|
|Authors||Gina Cooper, Michael Raymer|
|Conference Name||The 2009 International Conference on Bioinformatics and Computational Biology (BIOCOMP 09)|
|Conference Location||Las Vegas, Nevada|
Understanding the structure and function of proteins is a key part of understanding biological systems. Although proteins are complex biological macromolecules, they are built from only 20 basic building blocks known as amino acids, so the makeup of a protein can be described as a sequence of amino acids. One of the most important tools in modern bioinformatics is the ability to search for biological sequences (such as protein sequences) that are similar to a given query sequence. Many tools exist for this purpose (Altschul et al., 1990; Hobohm and Sander, 1995; Thomson et al., 1994; Karplus and Barrett, 1998). Most of these tools, however, focus on closely related (homologous) sequences. Distantly related protein sequences (remote homologs) are of interest to biologists but remain notoriously difficult to find. This dissertation presents a novel method for finding remote homologs in databases of protein sequences. In this method, proteins are characterized according to physicochemical and sequence-based features. Features are then weighted according to their utility in identifying distantly related protein sequences, and the feature weights are optimized by a custom genetic algorithm. Position-specific scoring matrices are used to further improve the ability of the tuned algorithm to generalize its search capability to new sequences. The resulting search method outperforms the best-known techniques for finding distant homologs in terms of both accuracy and computation time.
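The general idea of weighting sequence features and tuning the weights with a genetic algorithm can be illustrated with a minimal sketch. This is not the authors' implementation: the three toy features, the fitness function, the selection and mutation scheme, and every name below (`features`, `weighted_distance`, `fitness`, `evolve`, the example sequences) are assumptions chosen for illustration, and position-specific scoring matrices are left out entirely.

```python
import random

# Assumed residue groupings for two toy physicochemical features.
HYDROPHOBIC = set("AVILMFWC")
CHARGED = set("DEKR")

def features(seq):
    """Toy feature vector: hydrophobic fraction, charged fraction, scaled length."""
    n = len(seq)
    return [
        sum(c in HYDROPHOBIC for c in seq) / n,
        sum(c in CHARGED for c in seq) / n,
        min(n / 1000.0, 1.0),
    ]

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5

def fitness(weights, pairs):
    """Mean distance of unrelated pairs minus mean distance of related pairs."""
    related = [weighted_distance(a, b, weights) for a, b, same in pairs if same]
    unrelated = [weighted_distance(a, b, weights) for a, b, same in pairs if not same]
    return sum(unrelated) / len(unrelated) - sum(related) / len(related)

def normalize(weights):
    """Rescale weights to sum to 1 so fitness cannot grow by scaling alone."""
    total = sum(weights)
    return [w / total for w in weights]

def evolve(pairs, n_features, generations=50, pop_size=20, seed=0):
    """Simple elitist genetic algorithm over feature-weight vectors."""
    rng = random.Random(seed)
    population = [
        normalize([rng.random() + 1e-9 for _ in range(n_features)])
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism), refill with offspring.
        population.sort(key=lambda w: fitness(w, pairs), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = [rng.choice(genes) for genes in zip(p1, p2)]  # uniform crossover
            i = rng.randrange(n_features)
            child[i] = max(child[i] + rng.gauss(0, 0.1), 1e-9)  # Gaussian mutation
            children.append(normalize(child))
        population = parents + children
    return max(population, key=lambda w: fitness(w, pairs))

# Made-up sequences: two hydrophobic-rich and two charge-rich "families".
A1, A2 = "AVILMAVILKAV", "AVLLMAVIVRAV"
B1, B2 = "DEKRDEKRSTDE", "DEKKDERRSTNE"
TRAINING_PAIRS = [
    (features(A1), features(A2), True),
    (features(B1), features(B2), True),
    (features(A1), features(B1), False),
    (features(A2), features(B2), False),
]

best_weights = evolve(TRAINING_PAIRS, n_features=3)
```

In this sketch, `best_weights` holds one weight per feature; a search would then rank database sequences by `weighted_distance` to the query's feature vector. Normalizing the weights is a design choice of the sketch, made so that the genetic algorithm has to trade features off against one another.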
|Full Text|| |
Michael Raymer and Gina Cooper, 'Improving Remote Homology Detection Using Sequence Properties and Position Specific Scoring Matrices,' The 2009 International Conference on Bioinformatics and Computational Biology (BIOCOMP 09), Las Vegas, Nevada, July 13-16, 2009.