[BiO BB] Clustering small DNA sequences into groups
Iddo Friedberg
idoerg at burnham.org
Tue Aug 9 18:07:34 EDT 2005
CD-HIT does not work on DNA, or on short sequences for that matter.
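
For a set this small (about 40 sequences of 6-10 bp), an all-against-all comparison is cheap, so one option is to compute pairwise edit distances and group sequences that lie within a chosen threshold. The following is only a minimal sketch under stated assumptions, not a tested pipeline: the function names (edit_distance, cluster_sequences), the max_dist parameter and the choice of single-linkage grouping are all illustrative and do not come from any existing tool.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cluster_sequences(seqs, max_dist=1):
    """Single-linkage grouping: sequences within max_dist edits share a cluster.

    max_dist is a hypothetical threshold; pick it to suit the data.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair once; merge the two clusters when the pair is close.
    for i, j in combinations(range(len(seqs)), 2):
        if edit_distance(seqs[i], seqs[j]) <= max_dist:
            parent[find(i)] = find(j)

    clusters = {}
    for i, s in enumerate(seqs):
        clusters.setdefault(find(i), []).append(s)
    return list(clusters.values())


if __name__ == "__main__":
    # Made-up example sequences, for illustration only.
    example = ["ACGTAC", "ACGTAA", "TTTTGGGG", "TTTTGGGC", "ACGTTC"]
    for group in cluster_sequences(example, max_dist=1):
        print(group)
```

Single linkage can chain loosely related sequences into one group, so a stricter threshold or a different linkage rule may be a better fit depending on what the clusters are meant to represent.
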
Dan Bolser wrote:
>On Tue, 9 Aug 2005, Samantha Fox wrote:
>
>
>
>>Hi,
>>
>>I have a set of small DNA sequences (about 40) 6-10 bp, and wish to
>>group them into clusters based on sequence.
>>
>>Any suggestions for doing that ?
>>
>>
>
>I never tried using CD-HIT to cluster DNA, but it should work (you will
>have to alter the 'throwaway' length to something like 4 to stop all your
>sequences being filtered as too short).
>
>I found that blastclust (which can be explicitly set to cluster
>DNA) automatically ignores any protein sequence shorter than 30
>residues. Although it could cluster those together (when they are 100%
>identical, for example), it always seems to put any protein fragment
>shorter than 30 residues into a new cluster.
>
>Not sure if the behaviour is the same in DNA mode.
>
>
>
>
>>Thanks,
>>
>>Samantha
>>_______________________________________________
>>Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org
>>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>>
>>
>>
>
--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
Tel: (858) 646 3100 x3516
Fax: (858) 713 9930
http://ffas.ljcrf.edu/~iddo