Hi all,
I am trying to cluster 300K ESTs that I got from NCBI using stackPack. The clustering gives me results totally different from what is present in UniGene. I get 2 large clusters (64k and 34k sequences). If I try to break them up by increasing the identity or the length of the region compared, I get a huge number of singletons.
Has anyone dealt with this problem before?
Any suggestions will be greatly appreciated.
Thank you
bingo11{AT}hotmail.com