[BiO BB] Find common regions in 3 organisms
dan.bolser at gmail.com
Thu Sep 17 06:25:58 EDT 2009
2009/9/17 Nevan King <nevan.ml at gmail.com>:
> This question has probably been asked, but I'm not sure what search
> terms to use to find answers. This is a question from one of the
> researchers in my lab.
> I want to find common regions of sequences in 3 organisms. The first
> organism (P. gingivalis) has been fully sequenced and described. It
> has around 2000 genes. The other two are similar to P. gingivalis.
> I've set up all three organisms in Blast, but comparing the genes one
> by one would be a big task. What's the best way to automate this? I
> understand that you can enter a list of fastas into blast and it will
> compare each one to all the genes in its database. Is there a way to
> do this with 3 organisms? Is Blast the best tool to use for this job?
> Sorry if this is short on details, I don't fully understand the topic.
Often the answer to this sort of question is 'there is more than one
way to do it', and the way that you use usually depends on what you
want to see...
I would suggest something like this:
1) BLAST all genes of organism A against organism B and vice versa
(as described above).
2) Pick 'orthologues' using the 'reciprocal best hits' method (i.e. if
gene Ax' and gene Bx'' both find each other as the 'top blast hit' in
the respective organism's gene list, call them an orthologous pair).
3) Repeat steps 1 and 2, but use organisms A and C instead of A and B.
4) Pick 'orthologues' when Ax' and Bx'' are an orthologous pair AND Ax'
and Cx''' are an orthologous pair.
5) er... do you need to do B vs. C?
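Steps 1 to 4 could be sketched in Python roughly as follows. This is only
an illustration, not a definitive implementation. It assumes each BLAST
run was saved in the standard 12-column tabular format (`-outfmt 6` in
BLAST+), where column 1 is the query ID, column 2 the subject ID and
column 12 the bit score, and it takes 'top blast hit' to mean the highest
bit score. The function names are made up for this example.

```python
import csv
import io


def best_hits(tabular_text):
    """Map each query ID to its top-scoring subject ID (highest bit score).

    Assumes 12-column BLAST tabular output; ties keep the first hit seen.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


def three_way(rbh_ab, rbh_ac):
    """Step 4: keep genes of A that have an orthologue in both B and C."""
    ab = dict(rbh_ab)
    ac = dict(rbh_ac)
    return {a: (ab[a], ac[a]) for a in ab if a in ac}
```

To use it, read each of the four result files (A vs. B, B vs. A, A vs. C,
C vs. A) into a string, pass each through best_hits(), then combine them
with reciprocal_best_hits() and three_way().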
Once you get the above blast results (A vs. B, A vs. C, B vs. C and
vice versa) into a database, you will have more than enough data to
play with. You can then define orthologues however you like.
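As one possible way of doing the database part, here is a minimal sketch
using Python's built-in sqlite3 module. The table layout, the view and the
function names are all assumptions made for this example, and 'best hit'
is again taken to mean the highest bit score. The last function shows one
stricter definition you might choose: a triple counts only if A-B, A-C and
B-C are all reciprocal best hits, which is where the B vs. C results come
in.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS hits (
    query_org TEXT, subject_org TEXT,
    query_id TEXT, subject_id TEXT,
    evalue REAL, bitscore REAL
);
-- Best hit per query gene within each organism pair (ties give several rows).
CREATE VIEW IF NOT EXISTS best_hits AS
SELECT h.query_org, h.subject_org, h.query_id, h.subject_id
FROM hits h
WHERE h.bitscore = (
    SELECT MAX(x.bitscore) FROM hits x
    WHERE x.query_org = h.query_org
      AND x.subject_org = h.subject_org
      AND x.query_id = h.query_id
);
"""

RBH_QUERY = """
SELECT f.query_id, f.subject_id
FROM best_hits f
JOIN best_hits r
  ON r.query_org = f.subject_org AND r.subject_org = f.query_org
 AND r.query_id = f.subject_id AND r.subject_id = f.query_id
WHERE f.query_org = ? AND f.subject_org = ?
ORDER BY f.query_id
"""


def init_db(conn):
    """Create the hypothetical hits table and best_hits view."""
    conn.executescript(SCHEMA)


def load_hits(conn, query_org, subject_org, rows):
    """Insert (query_id, subject_id, evalue, bitscore) rows for one BLAST run."""
    conn.executemany(
        "INSERT INTO hits VALUES (?, ?, ?, ?, ?, ?)",
        [(query_org, subject_org, q, s, e, b) for q, s, e, b in rows],
    )


def rbh(conn, org1, org2):
    """Reciprocal best hits between two organisms, as (org1 gene, org2 gene)."""
    return conn.execute(RBH_QUERY, (org1, org2)).fetchall()


def consistent_triples(conn, a="A", b="B", c="C"):
    """Triples where A-B, A-C and B-C are all reciprocal best hits."""
    ab = dict(rbh(conn, a, b))
    ac = dict(rbh(conn, a, c))
    bc = set(rbh(conn, b, c))
    return sorted((g, ab[g], ac[g]) for g in ab if g in ac and (ab[g], ac[g]) in bc)
```

Something like conn = sqlite3.connect("blast.db") followed by init_db(conn)
and one load_hits() call per BLAST run would fill the table; after that you
can try different definitions just by changing the queries.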
That is just one idea to get you going.