MAST - Motif Alignment and Search Tool
MAST version 3.0 (Release date: 2001/03/01 11:41:04)
For further information on how to interpret these results or to get
a copy of the MAST software please access http://meme.sdsc.edu.
REFERENCE
If you use this program in your research, please cite:
Timothy L. Bailey and Michael Gribskov,
"Combining evidence using p-values: application to sequence homology
searches", Bioinformatics, 14(1):48-54, 1998.
DATABASE AND MOTIFS
DATABASE farntrans5.s (peptide)
Last updated on Thu Mar 1 08:28:38 2001
Database contains 5 sequences, 1900 residues
MOTIFS meme.farntrans5.zoops.4.sunsparcsolaris.html (peptide)
MOTIF WIDTH BEST POSSIBLE MATCH
----- ----- -------------------
1 32 QQPEGGFNGRPNKLPDVCYSWWVLGSLPIIGR
2 50 GEVDTRFVYCALSVASLLNILTPELVEGAIEFVLRCQNFDGGFGCCPGAE
3 34 FILMCQDEEQGGLADKPGNMVDFYHTVYCIAGLS
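The BEST POSSIBLE MATCH column can be read as the residue string that takes the highest-scoring letter in every column of a motif's position-dependent scoring matrix. The sketch below illustrates that reading and the column-by-column scoring of a candidate window. The three-column matrix and its scores are invented for illustration (they are not the farntrans5 motifs), and the function names are hypothetical.

```python
# Toy position-dependent scoring matrix: one dict per motif column.
# Scores are made up for illustration; they are NOT taken from this report.
TOY_PSSM = [
    {"A": 2, "C": -1, "G": 0, "Q": 5},
    {"A": -2, "C": 4, "G": 1, "Q": 3},
    {"A": 1, "C": 0, "G": 6, "Q": -3},
]


def best_possible_match(pssm):
    """Return the string built from the highest-scoring letter of each column."""
    return "".join(max(col, key=col.get) for col in pssm)


def match_score(pssm, window):
    """Sum the matrix entry selected by each letter of the window."""
    if len(window) != len(pssm):
        raise ValueError("window length must equal motif width")
    return sum(col[letter] for col, letter in zip(pssm, window))


best = best_possible_match(TOY_PSSM)
best_score = match_score(TOY_PSSM, best)
other_score = match_score(TOY_PSSM, "ACG")
```

No window of the same width can score higher than `best`, which is why the string is a useful visual reference in the annotated alignments.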
PAIRWISE MOTIF CORRELATIONS:
MOTIF 1 2
----- ----- -----
2 0.30
3 0.41 0.22
No overly similar pairs (correlation > 0.60) found.
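As a small illustration of the check reported above, the sketch below flags motif pairs whose correlation exceeds the 0.60 cutoff. The correlation values are copied from the table; the dictionary layout and function name are assumptions made for this example.

```python
# Pairwise correlations copied from the table above: (motif, motif) -> value.
CORRELATIONS = {(1, 2): 0.30, (1, 3): 0.41, (2, 3): 0.22}
THRESHOLD = 0.60  # cutoff quoted in the report


def overly_similar(correlations, threshold=THRESHOLD):
    """Return the motif pairs whose correlation exceeds the threshold."""
    return sorted(pair for pair, r in correlations.items() if r > threshold)


similar_pairs = overly_similar(CORRELATIONS)
```

For this report the list is empty, which matches the "No overly similar pairs" message.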
Random model letter frequencies (from non-redundant database):
A 0.073 C 0.018 D 0.052 E 0.062 F 0.040 G 0.069 H 0.022 I 0.056 K 0.058
L 0.092 M 0.023 N 0.046 P 0.051 Q 0.041 R 0.052 S 0.074 T 0.059 V 0.064
W 0.013 Y 0.033
SECTION I: HIGH-SCORING SEQUENCES
- Each of the following 5 sequences has E-value less than 10.
- The E-value of a sequence is the expected number of sequences
in a random database of the same size that would match the motifs as
well as the sequence does and is equal to the combined p-value of the
sequence times the number of sequences in the database.
- The combined p-value of a sequence measures the strength of the
match of the sequence to all the motifs and is calculated by
- finding the score of the single best match of each motif
to the sequence (best matches may overlap),
- calculating the sequence p-value of each score,
- forming the product of the p-values,
- taking the p-value of the product.
- The sequence p-value of a score is defined as the
probability of a random sequence of the same length containing
some match with as good or better a score.
- The score for the match of a position in a sequence to a motif
is computed by summing the appropriate entry from each column of
the position-dependent scoring matrix that represents the motif.
- Sequences shorter than one or more of the motifs are skipped.
- The table is sorted by increasing E-value.
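The last two steps of the combined p-value calculation (form the product, then take the p-value of the product) can be sketched as follows. The closed form used here, x * sum_{i=0}^{n-1} (-ln x)^i / i!, is the standard distribution of a product of n independent uniform p-values; whether MAST evaluates it in exactly this way is an assumption, and the function name is hypothetical. All inputs are assumed to be greater than zero.

```python
import math


def pvalue_of_product(pvalues):
    """P-value of the product of independent, uniformly distributed p-values.

    Evaluates F(x) = x * sum_{i=0}^{n-1} (-ln x)^i / i! term by term,
    where x is the product of the n input p-values (each must be > 0).
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    log_x = sum(math.log(p) for p in pvalues)  # log of the product
    x = math.exp(log_x)
    term, total = 1.0, 1.0
    for i in range(1, n):
        term *= -log_x / i  # builds (-ln x)^i / i! incrementally
        total += term
    return min(1.0, x * total)


single = pvalue_of_product([0.03])      # one motif: p-value is unchanged
pair = pvalue_of_product([0.5, 0.5])    # two weak matches stay weak
```

With a single motif the result equals the input p-value, and the combined value is never smaller than the raw product, which is what keeps several mediocre matches from looking significant.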
| Sequence Name | Description | E-value | Length |
|---|---|---|---|
| RATRABGERB | Rat rab geranylgeranyl tr... | 1.9e-98 | 331 |
| BET2_YEAST | YPT1/SEC4 PROTEINS GERANY... | 6.4e-98 | 325 |
| PFTB_RAT | PROTEIN FARNESYLTRANSFERA... | 1.6e-89 | 437 |
| RAM1_YEAST | PROTEIN FARNESYLTRANSFERA... | 7.2e-87 | 431 |
| CAL1_YEAST | RAS PROTEINS GERANYLGERAN... | 1.1e-28 | 376 |
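The E-value definition given above (combined p-value times the number of sequences in the database) can be checked against this report. The sketch below uses the combined p-values printed in Section III and the E-values from the table. Because both are rounded in the output, the comparison uses a relative tolerance, which is an assumption of this example.

```python
N_SEQUENCES = 5  # "Database contains 5 sequences"

# name -> (combined p-value from Section III, E-value as printed in Section I)
REPORTED = {
    "RATRABGERB": (3.84e-99, 1.9e-98),
    "BET2_YEAST": (1.29e-98, 6.4e-98),
    "PFTB_RAT": (3.12e-90, 1.6e-89),
    "RAM1_YEAST": (1.43e-87, 7.2e-87),
    "CAL1_YEAST": (2.25e-29, 1.1e-28),
}


def e_value(combined_p, n_sequences=N_SEQUENCES):
    """Expected number of equally good matches in a random database."""
    return combined_p * n_sequences


def agrees(computed, printed, rel_tol=0.05):
    """True if the two values match within the rounding of the report."""
    return abs(computed - printed) <= rel_tol * printed


all_consistent = all(agrees(e_value(p), e) for p, e in REPORTED.values())
```

All five rows agree within the tolerance.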
SECTION II: MOTIF DIAGRAMS
- The ordering and spacing of all non-overlapping motif occurrences
are shown for each high-scoring sequence listed in Section I.
- A motif occurrence is defined as a position in the sequence whose
match to the motif has POSITION p-value less than 0.0001.
- The POSITION p-value of a match is the probability of
a single random subsequence of the length of the motif
scoring at least as well as the observed match.
- For each sequence, all motif occurrences are shown unless there
are overlaps. In that case, a motif occurrence is shown only if its
p-value is less than the product of the p-values of the other
(lower-numbered) motif occurrences that it overlaps.
- The table also shows the E-value of each sequence.
- Spacers and motif occurrences are indicated by
- -d- spacer of `d' residues,
- [n] occurrence of motif `n' with p-value less than 0.0001.
- Sequences longer than 1000 are not shown to scale and are indicated by thicker lines.
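The occurrence cutoff and the overlap rule described above can be sketched as a small filter. This is one possible reading, not MAST's actual code: it assumes motifs are processed in increasing motif number, that a new occurrence is compared only against occurrences already kept, and that the occurrences it beats are dropped. The `Occurrence` class and the example p-values are hypothetical.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Occurrence:
    motif: int      # motif number
    start: int      # 1-based start position in the sequence
    width: int      # motif width
    pvalue: float   # position p-value of the match

    @property
    def end(self):
        return self.start + self.width - 1


def overlaps(a, b):
    """True if the two occurrences share at least one sequence position."""
    return a.start <= b.end and b.start <= a.end


def select_occurrences(candidates, threshold=1e-4):
    """Apply the p-value cutoff, then resolve overlaps motif by motif."""
    shown = []
    for occ in sorted(candidates, key=lambda o: (o.motif, o.start)):
        if occ.pvalue >= threshold:
            continue  # not a motif occurrence at all
        clashing = [s for s in shown if overlaps(s, occ)]
        if not clashing:
            shown.append(occ)
        elif occ.pvalue < prod(s.pvalue for s in clashing):
            # Assumption: the occurrences that lose the comparison are dropped.
            shown = [s for s in shown if s not in clashing]
            shown.append(occ)
    return sorted(shown, key=lambda o: o.start)


# Hypothetical candidates, chosen to exercise each branch of the rule.
a = Occurrence(motif=1, start=10, width=32, pvalue=1e-20)
b = Occurrence(motif=2, start=30, width=50, pvalue=1e-30)  # overlaps a, wins
c = Occurrence(motif=3, start=100, width=34, pvalue=1e-3)  # fails the cutoff
d = Occurrence(motif=3, start=70, width=34, pvalue=1e-10)  # overlaps b, loses
shown = select_occurrences([a, b, c, d])
```

In this example only `b` survives: it displaces `a`, while `c` is above the cutoff and `d` is weaker than the occurrence it overlaps.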
| Name | Expect | Motifs |
|---|---|---|
| RATRABGERB | 1.9e-98 | 42-[2]-46-[2]-34-[1]-11-[3]-32 |
| BET2_YEAST | 6.4e-98 | 62-[3]-35-[2]-37-[1]-11-[3]-30 |
| PFTB_RAT | 1.6e-89 | 133-[1]-31-[2]-35-[1]-24-[3]-66 |
| RAM1_YEAST | 7.2e-87 | 205-[2]-36-[1]-15-[3]-59 |
| CAL1_YEAST | 1.1e-28 | 212-[2]-8-[1]-14-[3]-26 |

SCALE: 1 25 50 75 100 125 150 175 200 225 250 275 300 325 350 375 400 425
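The text form of a motif diagram, as printed on the DIAGRAM lines in Section III, can be turned back into sequence coordinates with a short parser. The sketch below assumes peptide diagrams of the form shown in this report (spacer lengths and `[n]` tokens joined by hyphens) and takes the motif widths from the MOTIFS table. It does not handle strand-signed tokens that nucleotide searches may produce, and the function name is hypothetical.

```python
import re

MOTIF_WIDTHS = {1: 32, 2: 50, 3: 34}  # from the MOTIFS table of this report


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, start, end), ...], total_length) for a diagram string.

    Tokens are either spacer lengths ("42") or motif occurrences ("[2]").
    Positions are 1-based and inclusive.
    """
    occurrences, pos = [], 0
    for token in diagram.split("-"):
        m = re.fullmatch(r"\[(\d+)\]", token)
        if m:
            motif = int(m.group(1))
            occurrences.append((motif, pos + 1, pos + widths[motif]))
            pos += widths[motif]
        else:
            pos += int(token)  # spacer of that many residues
    return occurrences, pos


occurrences, length = parse_diagram("42-[2]-46-[2]-34-[1]-11-[3]-32")
```

For RATRABGERB the spacers and motif widths add up to 331, the sequence length reported in Section I, which is a convenient consistency check on a parsed diagram.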
SECTION III: ANNOTATED SEQUENCES
- The positions and p-values of the non-overlapping motif occurrences
are shown above the actual sequence for each of the high-scoring
sequences from Section I.
- A motif occurrence is defined as a position in the sequence whose
match to the motif has POSITION p-value less than 0.0001 as
defined in Section II.
- For each sequence, the first line specifies the name of the sequence.
- The second (and possibly more) lines give a description of the
sequence.
- Following the description line(s) is a line giving the length,
combined p-value, and E-value of the sequence as defined in Section I.
- The next line reproduces the motif diagram from Section II.
- The entire sequence is printed on the following lines.
- Motif occurrences are indicated directly above their positions in the
sequence on lines showing
- the motif number of the occurrence,
- the position p-value of the occurrence,
- the best possible match to the motif, and
- columns whose match to the motif has a positive score (indicated
by a plus sign).
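For readers who post-process this report, the summary line of each annotated sequence (length, combined p-value, E-value) can be extracted with a regular expression. This is a minimal sketch that assumes the line layout shown in this report; the pattern and function name are not part of MAST.

```python
import re

SUMMARY_RE = re.compile(
    r"LENGTH\s*=\s*(\d+)\s+"
    r"COMBINED P-VALUE\s*=\s*(\S+)\s+"
    r"E-VALUE\s*=\s*(\S+)"
)


def parse_summary(line):
    """Parse a 'LENGTH = ... COMBINED P-VALUE = ... E-VALUE = ...' line."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("not a sequence summary line: %r" % line)
    return {
        "length": int(m.group(1)),
        "combined_p": float(m.group(2)),
        "e_value": float(m.group(3)),
    }


summary = parse_summary(
    "LENGTH = 331 COMBINED P-VALUE = 3.84e-99 E-VALUE = 1.9e-98"
)
```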
RATRABGERB
Rat rab geranylgeranyl transferase beta-subunit
LENGTH = 331 COMBINED P-VALUE = 3.84e-99 E-VALUE = 1.9e-98
DIAGRAM: 42-[2]-46-[2]-34-[1]-11-[3]-32
[2]
4.7e-09
GEVDTRFVYCALSVASLLNILTPELVEGAIEFV
+++ +++ +++ + + + + ++
1 MGTQQKDVTIKSDAPDTLLLEKHADYIASYGSKKDDYEYCMSEYLRMSGVYWGLTVMDLMGQLHRMNKEEILVFI
[2]
3.3e-43
LRCQNFDGGFGCCPGAE GEVDTRFVYCAL
++++ ++ + + ++++++++++++
76 KSCQHECGGVSASIGHDPHLLYTLSAVQILTLYDSIHVINVDKVVAYVQSLQKEDGSFAGDIWGEIDTRFSFCAV
[1]
7.6e-33
SVASLLNILTPELVEGAIEFVLRCQNFDGGFGCCPGAE QQP
++++++++++++++++++++++++++++++++++++++ +++
151 ATLALLGKLDAINVEKAIEFVLSCMNFDGGFGCRPGSESHAGQIYCCTGFLAITSQLHQVNSDLLGWWLCERQLP
[3]
2.2e-36
EGGFNGRPNKLPDVCYSWWVLGSLPIIGR FILMCQDEEQGGLADKPGNMVDFYHTVYCIAGLS
+++++++++++++++++++++++++++++ ++++++++++++++++++++++++++++++++++
226 SGGLNGRPEKLPDVCYSWWVLASLKIIGRLHWIDREKLRSFILACQDEETGGFADRPGDMVDPFHTLFGIAGLSL
BET2_YEAST
YPT1/SEC4 PROTEINS GERANYLGERANYLTRANSFERASE BETA SUBUNIT (EC 2.
LENGTH = 325 COMBINED P-VALUE = 1.29e-98 E-VALUE = 6.4e-98
DIAGRAM: 62-[3]-35-[2]-37-[1]-11-[3]-30
[3]
1.4e-07
FILMCQDEEQGGL
+++ + + + + +
1 MSGSLTLLKEKHIRYIESLDTNKHNFEYWLTEHLRLNGIYWGLTALCVLDSPETFVKEEVISFVLSCWDDKYGAF
[2]
8.6e-45
ADKPGNMVDFYHTVYCIAGLS GEVDTRFVYCALSVASLLN
+ + ++ + ++ +++++++++ +++++++++
76 APFPRHDAHLLTTLSAVQILATYDALDVLGKDRKVRLISFIRGNQLEDGSFQGDRFGEVDTRFVYTALSALSILG
[1]
3.5e-34
ILTPELVEGAIEFVLRCQNFDGGFGCCPGAE QQPEGGF
+++++++ +++++++++++++++++++++++ +++++++
151 ELTSEVVDPAVDFVLKCYNFDGGFGLCPNAESHAAQAFTCLGALAIANKLDMLSDDQLEEIGWWLCERQLPEGGL
[3]
6.4e-33
NGRPNKLPDVCYSWWVLGSLPIIGR FILMCQDEEQGGLADKPGNMVDFYHTVYCIAGLS
+++++++++++++++++++++++++ +++++++++++++++++ ++++++++++++++++
226 NGRPSKLPDVCYSWWVLSSLAIIGRLDWINYEKLTEFILKCQDEKKGGISDRPENEVDVFHTVFGVAGLSLMGYD
PFTB_RAT
PROTEIN FARNESYLTRANSFERASE BETA SUBUNIT (EC 2.5.1.-) (CAAX FARNES
LENGTH = 437 COMBINED P-VALUE = 3.12e-90 E-VALUE = 1.6e-89
DIAGRAM: 133-[1]-31-[2]-35-[1]-24-[3]-66
[1]
5.4e-09
QQPEGGFNGRPNKLPDV
+++++++ + + +
76 EKHFHYLKRGLRQLTDAYECLDASRPWLCYWILHSLELLDEPIPQIVATDVCQFLELCQSPDGGFGGGPGQYPHL
[2]
4.2e-44
CYSWWVLGSLPIIGR GEVDTRFVYCALSVASLLNILTPELVEGA
++ + + +++ +++++++++++++++++++++++++++++
151 APTYAAVNALCIIGTEEAYNVINREKLLQYLYSLKQPDGSFLMHVGGEVDVRSAYCAASVASLTNIITPDLFEGT
[1]
1.9e-29
IEFVLRCQNFDGGFGCCPGAE QQPEGGFNGRPNKLPDVCY
+++++++++++++++++++++ +++++++++++++++++++
226 AEWIARCQNWEGGIGGVPGMEAHGGYTFCGLAALVILKKERSLNLKSLLQWVTSRQMRFEGGFQGRCNKLVDGCY
[3]
2.6e-30
SWWVLGSLPIIGR FILMCQDEEQGGLADKPGNMVDFYHTVYCIAGLS
++++++ ++++++ ++++++++++++++++++++++++++++++++++
301 SFWQAGLLPLLHRALHAQGDPALSMSHWMFHQQALQEYILMCCQCPAGGLLDKPGKSRDFYHTCYCLSGLSIAQH
RAM1_YEAST
PROTEIN FARNESYLTRANSFERASE BETA SUBUNIT (EC 2.5.1.-) (CAAX FARN
LENGTH = 431 COMBINED P-VALUE = 1.43e-87 E-VALUE = 7.2e-87
DIAGRAM: 205-[2]-36-[1]-15-[3]-59
[2]
3.3e-45
GEVDTRFVYCALSVASLLNI
++++++++++++++++++++
151 PGQLSHLASTYAAINALSLCDNIDGCWDRIDRKGIYQWLISLKEPNGGFKTCLEVGEVDTRGIYCALSIATLLNI
[1]
2.8e-27
LTPELVEGAIEFVLRCQNFDGGFGCCPGAE QQPEGGFNG
+++++++++++++++++++++++++++++ ++++++++
226 LTEELTEGVLNYLKNCQNYEGGFGSCPHVDEAHGGYTFCATASLAILRSMDQINVEKLLEWSSARQLQEERGFCG
[3]
1.2e-28
RPNKLPDVCYSWWVLGSLPIIGR FILMCQDEEQGGLADKPGNMVDFYHTVYCIAGLS
++++++++++++++++++++++ ++++++++++++++++++++++++++++++ +++
301 RSNKLVDGCYSFWVGGSAAILEAFGYGQCFNKHALRDYILYCCQEKEQPGLRDKPGAHSDFYHTNYCLLGLAVAE
CAL1_YEAST
RAS PROTEINS GERANYLGERANYLTRANSFERASE (EC 2.5.1.-) (PROTEIN GER
LENGTH = 376 COMBINED P-VALUE = 2.25e-29 E-VALUE = 1.1e-28
DIAGRAM: 212-[2]-8-[1]-14-[3]-26
[2]
7.4e-06
GEVDTRFVYCALS
+ ++++
151 DYKTNCGSSVDSDDLRFCYIAVAILYICGCRSKEDFDEYIDTEKLLGYIMSQQCYNGAFGAHNEPHSGYTSCALS
[1]
2.8e-27
VASLLNILTPELVEGAIEFVLRCQNFDGGFGCCPGAE QQPEGGFNGRPNKLPDVCYSWWVLGSLPII
+++++ + + + ++ + ++++++++++++++++++++++++++++++
226 TLALLSSLEKLSDKFKEDTITWLLHRQVSSHGCMKFESELNASYDQSDDGGFQGRENKFADTCYAFWCLNSLHLL
[3]
1.0e-08
GR FILMCQDEEQGGLADKPGNMVDFYHTVYCIAGLS
++ + + +++++ + + ++ + + + ++
301 TKDWKMLCQTELVTNYLLDRTQKTLTGGFSKNDEEDADLYHSCLGSAALALIEGKFNGELCIPQEIFNDFSKRCC
Debugging Information
CPU: nbcr2
Time 0.860000 secs.
mast meme.farntrans5.zoops.4.sunsparcsolaris.html -stdout