Hoffmann, R. A wiki for the life sciences where authorship matters. Nature Genetics (2008)

Levitt, M., Gerstein, M.

A unified statistical framework for sequence comparison and structure comparison.

We present an approach for assessing the significance of sequence and structure comparisons by using nearly identical statistical formalisms for both sequence and structure. Doing so involves an all-vs.-all comparison of protein domains [taken here from the Structural Classification of Proteins (SCOP) database] and then fitting a simple distribution function to the observed scores. By using this distribution, we can attach a statistical significance to each comparison score in the form of a P value, the probability that a better score would occur by chance. As expected, we find that the scores for sequence matching follow an extreme-value distribution. The agreement, moreover, between the P values that we derive from this distribution and those reported by standard programs (e.g., BLAST and FASTA) validates our approach. Structure comparison scores also follow an extreme-value distribution when the statistics are expressed in terms of a structural alignment score (essentially the sum of reciprocated distances between aligned atoms minus gap penalties). We find that the traditional metric of structural similarity, the rms deviation in atom positions after fitting aligned atoms, follows a different distribution of scores and does not perform as well as the structural alignment score. Comparison of the sequence and structure statistics for pairs of proteins known to be related distantly shows that structural comparison is able to detect approximately twice as many distant relationships as sequence comparison at the same error rate. The comparison also indicates that there are very few pairs with significant similarity in terms of sequence but not structure, whereas many pairs have significant similarity in terms of structure but not sequence.[1]
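The idea of fitting an extreme-value distribution to comparison scores and reading off a P value can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scores are synthetic, the Gumbel form of the extreme-value distribution is fitted by the method of moments (the abstract does not say how the fit was done), and the names `fit_gumbel_moments`, `p_value`, and `sample_gumbel` are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Estimate Gumbel location (mu) and scale (beta) by the method of moments.

    Assumption: this simple estimator stands in for whatever fitting
    procedure the original study used.
    """
    mean = statistics.fmean(scores)
    std = statistics.stdev(scores)
    beta = std * math.sqrt(6) / math.pi   # Gumbel variance is (pi * beta)^2 / 6
    mu = mean - EULER_GAMMA * beta        # Gumbel mean is mu + gamma * beta
    return mu, beta


def p_value(score, mu, beta):
    """Probability that a chance score exceeds `score` under Gumbel(mu, beta)."""
    z = (score - mu) / beta
    if z < -700:                          # exp(-z) would overflow; P is 1 here
        return 1.0
    # 1 - exp(-exp(-z)), written with expm1 to stay accurate for tiny P values
    return -math.expm1(-math.exp(-z))


def sample_gumbel(rng, mu, beta):
    """Draw one synthetic score by inverting the Gumbel CDF."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return mu - beta * math.log(-math.log(u))


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for the scores of an all-vs.-all comparison.
    background = [sample_gumbel(rng, 10.0, 2.0) for _ in range(20000)]
    mu_hat, beta_hat = fit_gumbel_moments(background)
    print(f"fitted mu={mu_hat:.3f}, beta={beta_hat:.3f}")
    for s in (12.0, 20.0, 30.0):
        print(f"score {s:5.1f} -> P = {p_value(s, mu_hat, beta_hat):.3e}")
```

A smaller P value means the score is less likely to arise by chance. In this sketch, a pair counts as significant when its P value falls below a chosen threshold, and the same recipe applies whether the score comes from sequence or structure comparison.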

References

[1] A unified statistical framework for sequence comparison and structure comparison. Levitt, M., Gerstein, M. Proc. Natl. Acad. Sci. U.S.A. (1998)

Annotations and hyperlinks in this abstract are from individual authors of WikiGenes or automatically generated by the WikiGenes Data Mining Engine. The abstract is from MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine.
