Dan,

You mentioned the "product of p values" method for combining hits with one query to different sequences in the same family:

@inproceedings{product-of-p-values,
    title="Classifying proteins by family using the product of correlated p-values",
    author="Bailey, Timothy L. and Grundy, William N.",
    booktitle=recomb99,
    month="April 11-14",
    year="1999",
    pages="10-14",
    publisher="ACM Press"
}

That is a useful technique, but different from what I was proposing, which is to combine search results from independent queries (the peptides) so that different queries bringing up the same sequence will strongly reinforce the signal for that sequence.

Perhaps the best bet is to do as Joseph Bedell suggests, and concatenate the peptides with XXXXXXXXXX spacers, and use the already written multi-hit functions in BLAST. Since the order of the peptides is unknown, 6 searches should be done, one for each order of the peptides.

I may be misunderstanding the problem, but I was assuming that the problem was to identify a protein from an organism that did NOT have a genomic sequencing project near completion. Thus the need to look for homologs in other organisms (which may not be very similar).

If there is some genomic data, the full-length putative homologs may be used to search the genome of the organism for a match. Once a putative homolog is found, an HMM based on its full-length sequence (created using SAM-T2K or PSI-BLAST and HMMer) could be used for the search, and to identify any regions likely to be highly conserved in the protein. The highly conserved regions may allow designing a primer to fish out the gene itself.

Kevin Karplus karplus at soe.ucsc.edu http://www.soe.ucsc.edu/~karplus
Professor of Computer Engineering, University of California, Santa Cruz
Undergraduate and Graduate Director, Bioinformatics
Affiliations for identification only.
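
For reference, the spacer-concatenation step suggested above could be sketched roughly as follows. This is only an illustration, assuming three peptides (which is what 6 orderings implies); the peptide strings and the function name concatenated_queries are invented here, and each resulting query would still have to be submitted to BLAST separately.

```python
from itertools import permutations

SPACER = "X" * 10  # the XXXXXXXXXX spacer mentioned in the message


def concatenated_queries(peptides, spacer=SPACER):
    """Return one spacer-joined query string per ordering of the peptides."""
    return [spacer.join(order) for order in permutations(peptides)]


# Hypothetical peptide fragments, invented purely for illustration.
peptides = ["MKVLAT", "GDSWPE", "QRNHYF"]

queries = concatenated_queries(peptides)
for i, query in enumerate(queries, start=1):
    # FASTA-style output: one record per ordering of the peptides
    print(f">order_{i}")
    print(query)
```

With three peptides this yields 3! = 6 query sequences, one per search. With more peptides the number of orderings grows factorially, so this brute-force enumeration is only practical for a handful of fragments.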