Publications
Citation
Li W, Jaroszewski L, Godzik A
Clustering of Highly Homologous Sequences to Reduce the Size of Large Protein Databases.
Bioinformatics (Oxford, England). 2001 Mar 01; 17: 282-283.
Abstract
We present a fast and flexible program for clustering large protein databases at different sequence identity levels. It takes less than 2 h for the all-against-all sequence comparison and clustering of the non-redundant protein database of over 560,000 sequences on a high-end PC. The output database, including only the representative sequences, can be used for more efficient and sensitive database searches.