Make non-redundant protein set
A subset of the selected proteins is generated
in which no two proteins have a sequence identity ratio higher than the given threshold.
A protein is excluded from the set if another, very similar protein is already in it.
This procedure is rather primitive, and more sophisticated programs exist for this task (e.g. nrdb90).
Because each sequence is aligned against all others with ClustalW, the running time rises with the square of the number of sequences.
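To give a feeling for the quadratic growth, the short sketch below counts the pairwise alignments needed for n sequences. It assumes that each unordered pair is aligned exactly once, which gives n(n-1)/2 alignments; the tool itself may organize the comparisons differently. The function name is hypothetical and not part of the program.

```python
def pairwise_alignment_count(n: int) -> int:
    """Number of unordered sequence pairs among n sequences (assumed: one alignment per pair)."""
    return n * (n - 1) // 2


# Ten times more sequences means roughly a hundred times more alignments.
for n in (10, 100, 1000):
    print(n, pairwise_alignment_count(n))
```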
Enter the threshold (0 ... 1).
The larger the threshold, the more similar two proteins may be while both are kept, so more sequences will be returned.
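The filtering described above can be sketched as a simple greedy loop. This is only an illustration under stated assumptions, not the program's actual code: the function names are hypothetical, the identity ratio is computed on already aligned, equal-length strings as a stand-in for the ClustalW alignment, and the choice of which protein of a similar pair is dropped (here: the later one in the input order) is an assumption.

```python
from typing import Callable, List


def identity_ratio(a: str, b: str) -> float:
    """Fraction of identical residues in two pre-aligned, equal-length sequences.

    Stand-in for an identity value taken from a ClustalW pairwise alignment.
    Columns containing a gap ('-') in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return sum(x == y for x, y in columns) / len(columns)


def non_redundant(
    sequences: List[str],
    threshold: float,
    identity: Callable[[str, str], float] = identity_ratio,
) -> List[str]:
    """Keep a sequence only if its identity to every kept sequence is <= threshold."""
    kept: List[str] = []
    for seq in sequences:
        # Comparing against all kept sequences is what makes the cost quadratic.
        if all(identity(seq, other) <= threshold for other in kept):
            kept.append(seq)
    return kept


# Toy data, invented for illustration only.
proteins = ["ACDEFGHIKL", "ACDEFGHIKV", "ACDEFGAAAA", "MNPQRSTVWY"]
for threshold in (0.5, 0.7, 0.95):
    print(threshold, len(non_redundant(proteins, threshold)))
```

With this toy input, raising the threshold lets more sequences through, which matches the note above. Because the loop is greedy, the result can depend on the order of the input.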