http://bioinformatics.burnham-inst.org/cd-hi
The program removes redundant sequences and generates a database containing only the representatives, so the output database is much smaller. Using a clustered database not only saves time in database searching and result parsing, but can also increase search sensitivity.
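To make the idea of keeping only representatives concrete, here is a minimal sketch in Python. It is not the program's actual algorithm: the function names, the 0.9 threshold, the longest-first ordering, and the ungapped position-by-position identity measure are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def identity(a: str, b: str) -> float:
    """Toy identity: fraction of matching positions over the shorter length.

    Assumption for illustration only: positions are compared without gaps,
    whereas a real tool would score an alignment.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


def pick_representatives(
    seqs: Dict[str, str], threshold: float = 0.9
) -> Tuple[List[str], Dict[str, str]]:
    """Greedy redundancy removal (hypothetical helper).

    Returns the representative names and a map from each sequence name
    to the representative it was grouped with.
    """
    # Assumption: visit longer sequences first, so each representative
    # is at least as long as the sequences grouped under it.
    order = sorted(seqs, key=lambda name: len(seqs[name]), reverse=True)
    reps: List[str] = []
    assigned: Dict[str, str] = {}
    for name in order:
        for rep in reps:
            if identity(seqs[name], seqs[rep]) >= threshold:
                assigned[name] = rep  # redundant: covered by an existing representative
                break
        else:
            reps.append(name)  # nothing similar enough: becomes a new representative
            assigned[name] = name
    return reps, assigned


if __name__ == "__main__":
    example = {
        "seq1": "MKTAYIAKQR",
        "seq2": "MKTAYIAKQ",   # near-duplicate of seq1
        "seq3": "GGGGGGGGGG",  # unrelated
    }
    reps, assigned = pick_representatives(example)
    print(reps)
    print(assigned)
```

In this toy input, only the representatives would be written to the output database, which is why the clustered database ends up smaller than the original.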
The program was written by
Weizhong Li
UCSD, San Diego Supercomputer Center
La Jolla, CA, 92093
Email liwz@sdsc.edu
at
Adam Godzik's lab
The Burnham Institute
La Jolla, CA, 92037
Email adam@burnham-inst.org
This program is free to download.