cd-hit 4.3-2010-10-25 (tgz)
Release notes:
An improved multi-level counting algorithm significantly speeds up clustering of large datasets. For example, a dataset of 10 million short reads can be clustered in an hour. See http://cd-hit.org for updates.
Changelog:
CD-HIT-V4.3: Fix: a few bugs related to multi-level counting; Change: implementation of the -M option.
CD-HIT-V4.2.x: Some bug fixes.
CD-HIT-V4.2: Add: multi-level counting array to improve the speed.
CD-HIT-V4.1.1: Change: improved estimation of the alignment band for low-complexity sequences.
CD-HIT-V4.1: Fix: a bug in searching for the best alignment band; Fix: a bug in handling 'N' for EST.
CD-HIT-V4.0: New implementation with parallelization using OpenMP.