Abstract title
Authors
Institution
Abstract
Genome projects throughout the world have made huge amounts of DNA sequence data available. Analyzing these data is a major challenge for scientists, and the volume involved means that the analysis has to be automated as much as possible. We intend to contribute to this effort, focusing on transmembrane transport proteins. Our goal is to build software tools that perform tasks such as the following: given a protein sequence, determine whether it is a transport protein and classify it; given a genome, or a large portion of a genome, find the transport systems present in it; given a fragment of a gene or protein, determine whether the sequence shares the similarities that characterize a transport family.
Classification of transporters is based on the TC commission families, available at Dr. M. Saier's web site www.biology.ucsd.edu/~msaier/transport, where detailed comments and example sequences are given for each family. It is the existence of these examples, carefully reviewed by experts, that motivated our focus on transport proteins. This corpus of classified and annotated examples makes automated bioinformatic treatment possible and also provides a valuable benchmark. To further evaluate our programs, we intend to use the annotations of Xylella fastidiosa and Xanthomonas axonopodis pv. citri, in which we took an active part. In this talk we will present the current status of our project.