[BiO BB] About clustering genes to gene family
zfu at cs.ucr.edu
Fri Aug 8 13:10:27 EDT 2003
How can I differentiate the first case (complex cluster) from the
second (distantly related homologs)?
And where can I find information about GENEFAMMER?
On Thu, 7 Aug 2003, Dan Bolser wrote:
> What you describe can occur for 2 good reasons...
> 1) You are forming a 'complex cluster', created by *multiple domain* proteins:
> A has domains in common with B,
> B has domains in common with C.
> A and C have no domains in common, and hence no homology.
> A: |------W------/-----X-----|
> B: |------x-----/-----Y-------|
> C: |------y-------/--------hello
> 2) A and C are too distantly related for sequence searches to uncover their
> true homology. However, sequence B is *intermediate* to A and C,
> having homology to both...
>       B
>      / \
>     /   \
>    /     \
>   A       C
> NB: Sequence similarity is not a metric, as it does not obey the triangle
> inequality.
> (I think it is metric at high levels of similarity though?)
> In this case you have used the transitive nature of sequence similarity
> to uncover distant homology via an intermediate sequence.
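
One rough way to tell the two cases apart is to look at where on B the two alignments fall. If the A-B and B-C alignments cover mostly the same stretch of B, an intermediate-sequence link is plausible; if they cover different stretches, it looks more like a complex cluster built from different domains. The sketch below is only an illustration of that idea: the function names, the half-open (start, end) coordinates on B, and the 0.5 overlap cutoff are all assumptions, not part of any package mentioned in this thread.

```python
def overlap(a, b):
    """Length of the overlap between two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def classify_link(b_region_vs_a, b_region_vs_c, min_fraction=0.5):
    """Classify an A-B-C chain from the two aligned regions on B.

    b_region_vs_a: (start, end) on B covered by the A-B alignment
    b_region_vs_c: (start, end) on B covered by the B-C alignment
    min_fraction:  hypothetical cutoff, as a fraction of the shorter region
    """
    shorter = min(b_region_vs_a[1] - b_region_vs_a[0],
                  b_region_vs_c[1] - b_region_vs_c[0])
    if shorter <= 0:
        raise ValueError("alignment regions must have positive length")
    shared = overlap(b_region_vs_a, b_region_vs_c)
    if shared / shorter >= min_fraction:
        # Both alignments hit the same part of B.
        return "possible distant homology via intermediate"
    # The alignments hit different parts (domains) of B.
    return "complex cluster (different domains)"


# A matches the first half of B, C matches the second half.
print(classify_link((0, 100), (120, 250)))
# A and C both match nearly the whole of B.
print(classify_link((0, 200), (20, 210)))
```

The cutoff would need tuning on real data, and a shared region on B is only a hint of homology between A and C, not proof.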
> Jong Park and Sarah Teichmann worked on both these ideas, and have
> created a family clustering package called GENEFAMMER. Specifically, DIVCLUS
> breaks up complex clusters into domain families. Transitivity is implemented
> (kinda) in psiblast / hmm models, all three of which are used in PFAM, so
> you might want to look there for your families.
> Or you could insist your alignments cover 90% of the shortest sequence,
> and then cluster using single linkage.
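
The coverage-plus-single-linkage suggestion can be sketched as below. This is a minimal sketch under stated assumptions: the hits are assumed to be pairs that already passed a score threshold, `aligned_length` is assumed to be the number of residues of the shorter sequence covered by the alignment, and the function name and input layout are made up for illustration. Single linkage is done with a small union-find, so any pair that passes the filter joins two clusters.

```python
def single_linkage(hits, lengths, min_coverage=0.9):
    """Cluster sequences by single linkage over coverage-filtered hits.

    hits:    iterable of (id1, id2, aligned_length) tuples (assumed layout)
    lengths: dict mapping sequence id -> sequence length
    """
    parent = {name: name for name in lengths}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, aligned in hits:
        shortest = min(lengths[a], lengths[b])
        # Keep the link only if it covers enough of the shorter sequence.
        if aligned / shortest >= min_coverage:
            parent[find(a)] = find(b)

    clusters = {}
    for name in lengths:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


lengths = {"A": 300, "B": 310, "C": 150, "D": 150}
hits = [("A", "B", 290), ("B", "C", 100), ("C", "D", 140)]
print(single_linkage(hits, lengths))  # [['A', 'B'], ['C', 'D']]
```

In the example the B-C hit covers only about 67% of C, so it is dropped and the two pairs stay in separate families instead of being chained together through a partial match.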
> Zheng Fu wrote:
> >Hi everyone,
> >Does anyone know how to cluster genes into a gene family based on
> >sequence alignments?
> >For two genes, we can define a threshold to separate homologs from
> >non-homologs. But for three or more genes, how do we define the homologs?
> >(For example, if Gene A and Gene B have a high alignment score, and A and C
> >also have a high score, but B and C don't have a high score, can we say
> >A, B and C are homologs?)
> >Thank you.
Love & Peace