[BiO BB] Non-redundant swissprot protein sequences

Dan Bolser dmb at mrc-dunn.cam.ac.uk
Fri Jul 23 08:55:57 EDT 2004


UniProt already does this, which is nice. The service is called
'UniRef'.

ftp://ftp.ebi.ac.uk/pub/databases/uniprot/uniref/

You can get non-redundant (nr) sets at 100%, 90% and 50% identity, where
the 90 set is derived from the 100 set (via a two-step process), and the
50 set is derived from the 90 set via a three-step process.
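
For anyone unsure what an "N% nr" set means in practice, here is a rough
sketch of the general idea: pick representatives greedily, and fold any
sequence that matches an existing representative at or above the threshold
into that representative's cluster. This is NOT how UniRef or CD-HIT
actually compute their clusters. The function names are made up for
illustration, and difflib's ratio is only a crude stand-in for real
alignment-based sequence identity.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Crude similarity in [0, 1]; an assumption standing in for
    alignment-based sequence identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_nr(seqs: Dict[str, str], threshold: float) -> Dict[str, List[str]]:
    """Greedy clustering sketch (hypothetical helper).

    Sequences are visited longest first. Each one joins the first
    representative it matches at >= threshold, otherwise it becomes
    a new representative. Returns {representative: [members]}.
    """
    clusters: Dict[str, List[str]] = {}
    ordered = sorted(seqs.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, seq in ordered:
        for rep in clusters:
            if identity(seqs[rep], seq) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative is similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters


if __name__ == "__main__":
    # Toy sequences, invented for this example.
    toy = {
        "a": "MKTAYIAKQRQISFVKSHFSRQ",
        "b": "MKTAYIAKQRQISFVKSHFSRQ",
        "c": "GGGGGGGGGGGGGGGGGGGGGG",
    }
    nr100 = greedy_nr(toy, 1.0)
    # Derive the looser set from the representatives of the stricter one,
    # mirroring the "90 is derived from 100" idea.
    nr90 = greedy_nr({rep: toy[rep] for rep in nr100}, 0.9)
    print(nr100)
    print(nr90)
```

The all-against-representatives comparison is quadratic in the worst case,
so this is only useful for seeing the idea on a handful of sequences. For
anything the size of Swiss-Prot, use the precomputed UniRef sets or a
dedicated clustering tool.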

I still haven't quite got my head round how they handle splice variants
(varsplic) in UniRef relative to UniProt.

It is probably documented, though.

Cheers,


On Thu, 22 Jul 2004, Iddo wrote:

>
>How about downloading the entire SWISSPROT and then clustering it 
>yourself using CD-HIT?
>
>http://bioinformatics.org/cd-hit/
>
>Iddo
>
>
>Ya Zhang wrote:
>
>> Does anyone know where I can download swissprot protein sequences with
>> less than 95% similarity to each other? I know ASTRAL has this type of
>> dataset for PDB sequences. Does anyone know how to make such a
>> dataset?
>>
>> Thanks!
>>
>> Ya



