At 10:51 AM 10/6/2004, you wrote:
>If you care about the alignments, an HMM model will often produce
>better multiple alignments even if it finds the same sequences as
>BLAST.

Very true. But let's add a caveat here: the HMM will produce a better
alignment provided the initial alignment used for HMM training was good
to begin with. For closely related sequences, this usually is not a
problem.

>The original question asked about models for "protein domain families
>(as defined in SCOP)," which may mean family-level models, or
>superfamily, or even fold, depending on how precisely Manisha Goel was
>using the term "families". If one wants to build a model that
>recognizes only one family and not other families in the same
>superfamily, the usual HMM methods will generally generalize too far.
>So far as I know, the best technique for family-level classification
>is to build an SVM classifier that uses an HMM to produce the input
>vectors for the SVM. (See, for example, Rachel Karchin's Master's
>thesis, or her paper

For those interested in this general subject, I suggest this paper:

Nucleic Acids Res. 2002 Apr 1;30(7):1575-84
An efficient algorithm for large-scale detection of protein families
Enright AJ, Van Dongen S, Ouzounis CA
http://nar.oupjournals.org/cgi/content/full/30/7/1575

and this software:

http://micans.org/mcl/

In general, once you have detected a group of proteins (and these may
be at the superfamily level if one is not inclined to tune the
parameters), this algorithm takes the results of an all-against-all
BLAST search of that group (even for hundreds of proteins this doesn't
take very long) and clusters them into groups based on various tunable
parameters. The most useful parameter is probably the inflation value
(see the description on the page above); high values of this parameter
are likely to generate a clustering that is pretty close to family
classifications.
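To make the role of the inflation value a bit more concrete, here is a
minimal pure-Python sketch of the expansion/inflation loop behind Markov
clustering. It is only an illustration and not the actual mcl program:
the function names, the toy similarity matrix, the self-loop weight of
1.0 and the cutoff used to read off clusters are all my own assumptions.
In practice the edge weights would come from the BLAST results (for
instance from the E-values), which is not shown here.

```python
def normalize_columns(matrix):
    """Scale every column so that it sums to 1 (column-stochastic)."""
    n = len(matrix)
    for j in range(n):
        total = sum(matrix[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                matrix[i][j] /= total
    return matrix


def mcl_sketch(similarity, inflation=2.0, max_iter=100, tol=1e-9):
    """Toy Markov clustering on a symmetric similarity matrix.

    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(similarity)
    m = [[float(similarity[i][j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        if m[i][i] == 0.0:
            m[i][i] = 1.0  # assumed self-loop weight
    normalize_columns(m)

    for _ in range(max_iter):
        # Expansion: square the matrix (two-step random walks).
        expanded = [
            [sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        # Inflation: raise entries to a power, then renormalize columns.
        # Larger powers favor strong edges and give tighter clusters.
        inflated = normalize_columns(
            [[value ** inflation for value in row] for row in expanded]
        )
        delta = max(
            abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n)
        )
        m = inflated
        if delta < tol:
            break

    # Rows that keep non-negligible mass act as attractors; the columns
    # they cover form one cluster. The 1e-6 cutoff is an assumption.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(cluster) for cluster in clusters)


if __name__ == "__main__":
    # Two tight groups of three proteins joined by one weak hit (0.1).
    toy = [
        [0, 1, 1, 0,   0, 0],
        [1, 0, 1, 0,   0, 0],
        [1, 1, 0, 0.1, 0, 0],
        [0, 0, 0.1, 0, 1, 1],
        [0, 0, 0, 1,   0, 1],
        [0, 0, 0, 1,   1, 0],
    ]
    print(mcl_sketch(toy, inflation=2.0))
```

With the real program the corresponding knob is, as far as I know, the
-I option, so trying a few values on the same BLAST output is cheap.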
Probably not as sophisticated as the method Kevin suggested, but works
very well in most cases and is extremely fast.

Cheers,

Mensur

==========================================================================
| Mensur Dlakic, PhD                 | Tel: (406) 994-6576               |
| Department of Microbiology         | Fax: (406) 994-4926               |
| Montana State University           |                                   |
| 109 Lewis Hall, P.O. Box 173520    | http://myprofile.cos.com/mensur   |
| Bozeman, MT 59717-3520             | E-mail: mdlakic at montana.edu    |
==========================================================================