[BiO BB] Clustering small DNA sequences into groups

Iddo Friedberg idoerg at burnham.org
Tue Aug 9 18:12:29 EDT 2005


How about building a distance matrix of your own (based on percent identity
between fragments) and then using WEKA for the clustering?
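
To make the idea concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration, not a tested recipe: percent identity is taken as the best ungapped match when the shorter fragment slides along the longer one, distance is 100 minus that value, and a plain single-linkage grouping with an arbitrary cutoff stands in for whatever clusterer you would pick in WEKA. The sequences "one" and "two" are from the thread; "three" is made up.

```python
from itertools import combinations

def percent_identity(a, b):
    """Best ungapped %ID when the shorter sequence slides along the longer."""
    short, long_ = sorted((a.lower(), b.lower()), key=len)
    best = 0
    for offset in range(len(long_) - len(short) + 1):
        matches = sum(s == l for s, l in zip(short, long_[offset:]))
        best = max(best, matches)
    return 100.0 * best / len(short)

def distance_matrix(seqs):
    """Return names and a nested dict of distances (100 - %ID)."""
    names = list(seqs)
    dist = {n: {m: 0.0 for m in names} for n in names}
    for x, y in combinations(names, 2):
        d = 100.0 - percent_identity(seqs[x], seqs[y])
        dist[x][y] = dist[y][x] = d
    return names, dist

def single_linkage(names, dist, cutoff):
    """Merge clusters while any cross-cluster pair is within the cutoff."""
    clusters = [{n} for n in names]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if any(dist[a][b] <= cutoff
                   for a in clusters[i] for b in clusters[j]):
                clusters[i] |= clusters[j]
                del clusters[j]
                merged = True
                break
    return clusters

# "one" and "two" come from the thread; "three" is a hypothetical extra.
seqs = {"one": "tagcgc", "two": "atcgtt", "three": "tagcgca"}
names, dist = distance_matrix(seqs)
clusters = single_linkage(names, dist, cutoff=20.0)  # cutoff is arbitrary
print(clusters)
```

With fragments this short, the identity measure and the cutoff matter a lot, so it is worth trying a few values and looking at the matrix itself before trusting any grouping.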



./I

Samantha Fox wrote:

>Thanks so much for your replies. However, nothing has worked yet: cd-hit
>gave the errors below, and blastclust is not usable for such short
>sequences!
>
>Any suggestions?
>
>>cat fasta
>>one
>tagcgc
>>two
>atcgtt
>
>>./cd-hit -i fasta -o www
>total seq: 0
>longest and shortest : 0 and 99999
>Total letters: 0
>terminate called after throwing an instance of 'std::bad_alloc'
>  what():  St9bad_alloc
>Abort (core dumped)
>
>>./cd-hit -i fasta -o www -l 5
>
>Fatal Error
>Too short -l, redefine it
>
>Program halted !!
>
>On 8/9/05, Martin Gollery <marty.gollery at gmail.com> wrote:
>>I believe those sequences are too short for Blastclust. The default
>>word size is 32.
>>
>>Marty
>>
>>On 8/9/05, Marcos Oliveira de Carvalho <operon at cbiot.ufrgs.br> wrote:
>>>Hi Samantha,
>>>
>>>BLASTCLUST can group DNA sequences. Maybe you will need to tweak the
>>>parameters (almost the same for BLAST). You can get it at the NCBI ftp:
>>>ftp://ftp.ncbi.nih.gov/blast/
>>>
>>>cheers
>>>Marcos
>>>
>>>
>>>
>>>On Tue, 09 Aug 2005 14:24:41 -0300, Samantha Fox <bioinfosm at gmail.com>
>>>wrote:
>>>
>>>>Hi,
>>>>
>>>>I have a set of about 40 small DNA sequences, 6-10 bp each, and wish
>>>>to group them into clusters based on sequence.
>>>>
>>>>Any suggestions for doing that?
>>>>
>>>>Thanks,
>>>>
>>>>Samantha


-- 

Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
Tel: (858) 646 3100 x3516
Fax: (858) 713 9930
http://ffas.ljcrf.edu/~iddo
