MAST - Motif Alignment and Search Tool
MAST version 3.0 (Release date: 2001/03/01 11:41:04)
For further information on how to interpret these results or to get
a copy of the MAST software please access http://meme.sdsc.edu.
REFERENCE
If you use this program in your research, please cite:
Timothy L. Bailey and Michael Gribskov,
"Combining evidence using p-values: application to sequence homology
searches", Bioinformatics, 14(1):48-54, 1998.
DATABASE AND MOTIFS
DATABASE -stdin (nucleotide)
Last updated on Wed Dec 31 16:00:00 1969
Database contains 4 sequences, 499297 residues
Scores for positive and reverse complement strands are combined.
MOTIFS meme.adh.oops.4.sunsparcsolaris.html (peptide)
MOTIF WIDTH BEST POSSIBLE MATCH
----- ----- -------------------
1 29 YSASKFAVRMLTRSMAHEYAPHGIRVNCI
2 29 KVVLITGCSSGIGKATAKHLHKEGAKVVL
3 11 GPVDVLVNNAG
PAIRWISE MOTIF CORRELATIONS:
MOTIF 1 2
----- ----- -----
2 0.30
3 0.31 0.35
No overly similar pairs (correlation > 0.60) found.
Random model letter frequencies (from non-redundant database):
A 0.070 C 0.024 D 0.040 E 0.052 F 0.040 G 0.074 H 0.029 I 0.041 K 0.052
L 0.096 M 0.017 N 0.032 P 0.065 Q 0.042 R 0.067 S 0.084 T 0.052 V 0.059
W 0.016 Y 0.022
SECTION I: HIGH-SCORING SEQUENCES
- Each of the following 4 sequences has E-value less than 10.
- The E-value of a sequence is the expected number of sequences
in a random database of the same size that would match the motifs as
well as the sequence does and is equal to the combined p-value of the
sequence times the number of sequences in the database.
- The combined p-value of a sequence measures the strength of the
match of the sequence to all the motifs and is calculated by
- finding the score of the single best match of each motif
to the sequence (best matches may overlap),
- calculating the sequence p-value of each score,
- forming the product of the p-values,
- taking the p-value of the product.
- The sequence p-value of a score is defined as the
probability of a random sequence of the same length containing
some match with as good or better a score.
- The score for the match of a position in a sequence to a motif
is computed by summing the appropriate entry from each column of
the position-dependent scoring matrix that represents the motif.
- Sequences shorter than one or more of the motifs are skipped.
- The frame of the (best) motif match(es) is shown.
Frames 1, 2, and 3 are labeled a, b, and c, respectively.
- The table is sorted by increasing E-value.
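The last two steps of the combined p-value calculation (forming the product of the p-values and taking the p-value of the product) can be sketched as follows. This is a minimal illustration, not MAST's source code. It assumes the per-motif sequence p-values are independent and uniformly distributed under the random model, in which case the p-value of a product p of n such values has the closed form p * sum_{i=0}^{n-1} (-ln p)^i / i!. The function name `pvalue_of_product` and the example inputs are hypothetical.

```python
import math

def pvalue_of_product(pvalues):
    """P-value of the product of n independent, uniform p-values.

    Assumed closed form (not printed in this report):
        p * sum_{i=0}^{n-1} (-ln p)^i / i!,  where p is the product.
    All inputs must be in the interval (0, 1].
    """
    n = len(pvalues)
    # Sum logs instead of multiplying, so tiny p-values do not underflow early.
    log_p = sum(math.log(p) for p in pvalues)
    x = -log_p
    term = 1.0   # current term x^i / i!, starting at i = 0
    total = 1.0  # running sum of the terms
    for i in range(1, n):
        term *= x / i
        total += term
    return math.exp(log_p) * total

# Hypothetical sequence p-values for three motifs (not taken from this report).
example = [1e-9, 1e-7, 1e-10]
print(pvalue_of_product(example))
```

With a single motif the function returns the input p-value unchanged, and with several motifs the result is never smaller than the raw product.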
SEQUENCE NAME            DESCRIPTION                  FRAME  E-VALUE   LENGTH
-------------            -----------                  -----  -------   ------
gb|AE003602.1|AE003602   Drosophila melanogaster...   b      2.9e-28   297266
gb|AE003567.1|AE003567   Drosophila melanogaster...   c      2.3e-13   104808
gb|AE002569.1|AE002569   Drosophila melanogaster...   a      0.8       12850
gb|AE002567.1|AE002567   Drosophila melanogaster...   a      2         84373
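The E-value definition given in the notes of this section (combined p-value times the number of sequences in the database) can be checked against the figures in this report. The sketch below pairs each E-value from the table with the combined p-value printed for the same sequence in Section III. The helper name `e_value` is hypothetical, and agreement is expected only up to the rounding of the printed values.

```python
NUM_SEQUENCES = 4  # "Database contains 4 sequences"

# (combined p-value from Section III, E-value from the Section I table)
reported = {
    "gb|AE003602.1|AE003602": (7.17e-29, 2.9e-28),
    "gb|AE003567.1|AE003567": (5.79e-14, 2.3e-13),
    "gb|AE002569.1|AE002569": (2.01e-01, 0.8),
    "gb|AE002567.1|AE002567": (4.91e-01, 2.0),
}

def e_value(combined_p, num_sequences):
    """Expected number of random sequences matching at least this well."""
    return combined_p * num_sequences

for name, (combined_p, e_reported) in reported.items():
    e_computed = e_value(combined_p, NUM_SEQUENCES)
    print(f"{name}: computed {e_computed:.3g}, reported {e_reported:g}")
```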
SECTION II: MOTIF DIAGRAMS
- The ordering and spacing of all non-overlapping motif occurrences
are shown for each high-scoring sequence listed in Section I.
- A motif occurrence is defined as a position in the sequence whose
match to the motif has SEQUENCE p-value less than 0.0001.
- The SEQUENCE p-value of a match is the probability of
some random subsequence in a set of n,
where n is the sequence length minus the motif width plus 1,
scoring at least as well as the observed match.
- For each sequence, all motif occurrences are shown unless there
are overlaps. In that case, a motif occurrence is shown only if its
p-value is less than the product of the p-values of the other
(lower-numbered) motif occurrences that it overlaps.
- The table also shows the E-value of each sequence.
- Spacers and motif occurrences are indicated by
- occurrence of motif `n' with p-value less than 0.0001
in frame f. Frames 1, 2, and 3 are labeled a, b, and c.
A minus sign indicates that the occurrence is on the
reverse complement strand.
- Sequences longer than 3000 are not shown to scale and are indicated by thicker lines.
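The SEQUENCE p-value defined in the notes above can be sketched from a position p-value. This is a minimal illustration that assumes the n candidate positions score independently, so that the probability of at least one match as good as the observed one is 1 - (1 - p)^n. The function name `sequence_pvalue` and the example numbers are hypothetical, and how sequence length and motif width are counted for translated DNA searches is not addressed here.

```python
import math

def sequence_pvalue(position_pvalue, seq_length, motif_width):
    """Probability that at least one of n positions scores as well.

    n = seq_length - motif_width + 1, as stated in the notes above.
    Assumes the n positions are independent (a simplification).
    """
    n = seq_length - motif_width + 1
    if n <= 0:
        # Sequences shorter than the motif are skipped by the report.
        raise ValueError("sequence is shorter than the motif")
    if position_pvalue >= 1.0:
        return 1.0
    # 1 - (1 - p)^n, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(n * math.log1p(-position_pvalue))

# Hypothetical example: a 1000-residue sequence and a motif of width 29.
print(sequence_pvalue(1e-12, 1000, 29))
```

For very small position p-values the result is close to n times the position p-value, which is why long sequences need much stronger matches to pass the 0.0001 threshold.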
SEQUENCE NAME            E-VALUE   MOTIFS
-------------            -------   ------
gb|AE003602.1|AE003602   2.9e-28   +2b +3b +1b +2c +1c -1b -3b -2b -1b -3b -2b
gb|AE003567.1|AE003567   2.3e-13   -1c
gb|AE002569.1|AE002569   0.8
gb|AE002567.1|AE002567   2
SCALE: tick marks at 1, 75, 150, ..., 3450 (every 75 positions).
SECTION III: ANNOTATED SEQUENCES
- The positions and p-values of the non-overlapping motif occurrences
are shown above the actual sequence for each of the high-scoring
sequences from Section I.
- A motif occurrence is defined as a position in the sequence whose
match to the motif has SEQUENCE p-value less than 0.0001 as
defined in Section II.
- For each sequence, the first line specifies the name of the sequence.
- The second (and possibly more) lines give a description of the
sequence.
- Following the description line(s) is a line giving the length,
combined p-value, and E-value of the sequence as defined in Section I.
- The next line reproduces the motif diagram from Section II.
- The entire sequence is printed on the following lines.
- Motif occurrences are indicated directly above their positions in the
sequence on lines showing
- the motif number/frame of the occurrence (a minus sign indicates that
the occurrence is on the reverse complement strand),
- the position p-value of the occurrence,
- the best possible match to the motif (or its reverse),
- columns whose match to the motif has a positive score (indicated
by a plus sign), and
- the protein translation of the match (or its reverse).
gi|7296683|gb|AE003602.1|AE003602
Drosophila melanogaster genomic scaffold 142000013386043 section 3 of 8, complete sequence
LENGTH = 297266 COMBINED P-VALUE = 7.17e-29 E-VALUE = 2.9e-28
DIAGRAM: 101665-[+2b]-141-[+3b]-183-[+1b]-736-[+2c]-357-[+1c]-752-[-1b]-183-[-3b]-141-[-2b]-765-[-1b]-183-[-3b]-141-[-2b]-191224
[+2b]
7.5e-13
K..V..V..L..I..T..G..C..
+ + + + + + + +
K..V..V..L..I..T..G..A..
101617 CTTGAGATCCCCTTACATAATTCTGGATCGGATCATGAATTTCGCGGGCAAAGTGGTCCTTATTACGGGAGCA
S..S..G..I..G..K..A..T..A..K..H..L..H..K..E..G..A..K..V..V..L..
+ + + + + + + + + + + + + + + +
S..S..G..I..G..A..A..T..A..I..K..F..A..K..Y..G..A..C..L..A..L..
101690 AGCTCCGGAATCGGAGCTGCAACCGCCATTAAGTTTGCCAAGTACGGCGCCTGTCTGGCTCTCAATGGACGCA
[+3b]
1.6e-07
G..P..V..D..V..
+ + + + +
G..K..L..D..V..
101836 CGACATTGCCAAGGAGGCGGACACCCAGAGGATTTGGTCGGAAACCCTGCAGCAGTACGGCAAATTGGATGTG
L..V..N..N..A..G..
+ + + + + +
L..V..N..N..A..G..
101909 CTGGTCAACAATGCCGGAATCATCGAGACGGGCACCATCGAGACGACTAGCCTGGAGCAGTACGACCGCGTCA
[+1b]
1.2e-09
Y..S..A..S..K..F..
+ + +
Y..N..I..S..K..M..
102055 CATCGTGAACGTGTCCAGTGTCAATGGGATTCGCTCCTTCCCTGGCGTTCTGGCCTACAACATATCCAAAATG
A..V..R..M..L..T..R..S..M..A..H..E..Y..A..P..H..G..I..R..V..N..C..I..
+ + + + + + + + + + + + + + + + + + + +
G..V..D..Q..F..T..R..C..V..A..L..E..L..A..A..K..G..V..R..V..N..C..V..
102128 GGAGTGGATCAGTTCACCCGCTGTGTGGCGTTGGAGCTGGCTGCCAAGGGTGTGCGCGTGAACTGCGTGAATC
[+2c]
3.2e-11
K..V..V..L..I..T..G..C..S..S..G..I..G..K..A..T..A..K..H..L..H..K..E..G.
+ + + + + + + + + + + + + + + + + + + +
K..V..V..L..I..T..G..A..A..S..G..I..G..A..A..A..A..E..M..F..S..K..L..G.
102931 GCAAAGTGGTGCTTATCACGGGCGCAGCCTCCGGGATCGGGGCCGCCGCGGCGGAGATGTTCTCGAAGCTGGG
.A..K..V..V..L..
+ +
.A..C..L..A..L..
103004 TGCCTGCCTGGCCCTGGTGGATCGGGAGGAGGAGGGCCTCATATGTGTGATGAAACGCTGCATGAAGATGGGC
[+1c]
2.6e-13
Y..S..A..S..K..F..A..V..R..M..L..T..R..S..M..A..H..E..Y..A..P..H.
+ + + + + + + + + + + + + + + + + + + +
Y..N..M..S..K..A..A..V..D..Q..F..T..R..S..L..A..L..D..L..G..P..Q.
103369 TGGTGGCCTACAACATGTCCAAGGCGGCGGTGGACCAGTTTACCCGCTCCCTTGCCCTGGATCTGGGTCCCCA
.G..I..R..V..N..C..I..
+ + + + + + +
.G..V..R..V..N..A..V..
103442 GGGTGTTCGGGTGAATGCGGTCAATCCGGGTGTGATTCGCACCAATCTGCAAAAGGCGGGGGGCATGGACGAG
[-1b]
1.3e-12
..I..C..N..V..R..I..G..H..P..
+ + + + + + + + +
..V..S..N..V..R..V..G..K..P..
104172 AGTCCACCACGACGCTGCAGCTCGGTGATGATTACGCCGGGATTCACGGAGTTCACACGCACACCCTTGGGAG
A..Y..E..H..A..M..S..R..T..L..M..R..V..A..F..K..S..A..S..Y
+ + + + + + + + + + + + + + + +
A..L..E..L..A..V..C..R..T..F..Q..D..V..A..A..K..S..V..N..Y
104245 CTAGCTCCAGAGCCACGCACCTGGTGAACTGATCCACGGCAGCCTTGGAAACATTGTATGCTAAGACTCCGGG
[-3b]
1.3e-07
..G..A..N..N..V..L..V..D..V..P..G
+ + + + + + + + + + +
..G..A..N..N..V..L..V..D..I..R..G
104464 CGATGCTGCCTAGCTCCAAGATTCCGGCGTTGTTCACCAGCACGTCGATGCGACCGTGCTTGGCCAATGTGGC
[-2b]
9.6e-09
..L..V..V..K..A..G..E..
+ +
..I..T..L..L..G..G..L..
104610 GCCACTATCTGCTCCGCGGTCTCGTTGAGCTTATCCAAATTCCTGCCCACGATGGTGAGCAGGCCTCCCAGTT
K..H..L..H..K..A..T..A..K..G..I..G..S..S..C..G..T..I..L..V..V..K
+ + + + + + + + + + + + + + + + +
K..A..L..L..V..S..T..G..A..G..I..G..S..S..A..G..T..V..I..I..V..K
104683 TAGCCAAGAGCACCGAAGTACCCGCTCCAATTCCCGAACTGGCTCCGGTCACGATTATAACTTTATCCTTGAA
[-1b]
1.5e-10
..I..C..N..V..R..I..G..H..P..A..Y..E..H..A..M..
+ + + + + + + + + + + + + +
..V..A..N..V..R..V..G..K..P..A..L..E..L..A..I..
105486 ATGTCAGTCACAATCACGCCGGGATTCACGGCGTTTACGCGGACTCCTTTGGGGGCCAGTTCCAGGGCTATGC
S..R..T..L..M..R..V..A..F..K..S..A..S..Y
+ + + + + + + + + +
C..A..T..F..Q..D..V..A..A..K..S..V..N..Y
105559 AGGCCGTGAACTGGTCCACCGCTGCCTTAGACACATTGTAGGCCAGGACACCAGGAAAGGCACGCAGTCCACA
[-3b]
1.3e-07
..G..A..N..N..V..L..V..D..V..P..G
+ + + + + + + + + + +
..G..A..N..N..V..L..V..D..I..R..G
105778 GGATGCCCGCATTGTTAACGAGCACATCGATGCGACCGTGCTTGGCCAGAGTGGCGCCCACAATCTGCTGCAC
[-2b]
4.6e-11
..L..V..V..K..A..G..E..K..H..L..H..K..A..
+ + + + + + + +
..I..V..L..L..G..G..L..K..A..L..H..V..A..
105924 GTCTCCTTCAGCTTCTCCTCGTTGCGACCCACGATGACCAGGAGCCCTCCGAGTTTCGCCAAATGGACGGCGG
T..A..K..G..I..G..S..S..C..G..T..I..L..V..V..K
+ + + + + + + + + + + + + +
A..S..A..G..I..G..S..S..A..G..T..V..I..I..V..K
105997 CACTTGCTCCGATGCCGGAGCTGGCGCCGGTAACAATAATCACTTTGTCCTTGAAGGATGACATCGTGGGGTG
gi|7295475|gb|AE003567.1|AE003567
Drosophila melanogaster genomic scaffold 142000013386050 section 54 of 54, complete sequence
LENGTH = 104808 COMBINED P-VALUE = 5.79e-14 E-VALUE = 2.3e-13
DIAGRAM: 87158-[-1c]-17563
[-1c
4.6e
..I.
+
..L.
87090 AGCCGCTTCATTCGCCGACTCATTCTCGTACAGTGCTTTCGAGAACTTTGTCCTGATGACTCCTGGAGCCAGG
]
-14
.C..N..V..R..I..G..H..P..A..Y..E..H..A..M..S..R..T..L..M..R..V..A..F..K..
+ + + + + + + + + + + + + + + + + + + + + + +
.C..N..V..R..I..G..E..P..A..L..D..K..A..A..A..K..T..L..G..I..L..A..T..K..
87163 CAGTTGACGCGAATGCCCTCGGGCGCCAGATCCTTGGCGGCTGCCTTGGTCAAGCCAATCAGCGCGGTCTTGC
S..A..S..Y
+ + +
S..V..S..Y
87236 TGACGGAATAGGCTCCCAGTAGCTATGCGATTAAGATAACGGAGATAAGCATTGAACAACTGAACCGCAGATA
gi|7289065|gb|AE002569.1|AE002569
Drosophila melanogaster genomic scaffold 142000013385354, complete sequence
LENGTH = 12850 COMBINED P-VALUE = 2.01e-01 E-VALUE = 0.8
DIAGRAM: 12850
gi|7289301|gb|AE002567.1|AE002567
Drosophila melanogaster genomic scaffold 142000013385554, complete sequence
LENGTH = 84373 COMBINED P-VALUE = 4.91e-01 E-VALUE = 2
DIAGRAM: 84373
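The DIAGRAM lines above can be read mechanically: bare numbers are spacer lengths and bracketed items are motif occurrences. The sketch below is a hypothetical parser, not part of MAST. It assumes that each peptide motif residue covers 3 nucleotides of the DNA sequence, takes the motif widths from the DATABASE AND MOTIFS table, and reports a 1-based start position for each occurrence. For the diagrams in this report, the spacers and motif widths add up to the stated sequence LENGTH, and the frame letter agrees with (start - 1) mod 3; the latter is an observation about this output, not a documented rule.

```python
import re

MOTIF_WIDTHS = {1: 29, 2: 29, 3: 11}  # residues, from the motif table
NT_PER_RESIDUE = 3                    # assumption: peptide motifs on DNA

# Either a bracketed occurrence such as [+2b], or a bare spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)([abc])\]|(\d+)")

def parse_diagram(diagram):
    """Return (occurrences, total_length) for a DIAGRAM string.

    Each occurrence is (strand, motif, frame, start), start being 1-based.
    """
    occurrences = []
    consumed = 0  # nucleotides accounted for so far
    for strand, motif, frame, spacer in TOKEN.findall(diagram):
        if spacer:
            consumed += int(spacer)
        else:
            occurrences.append((strand, int(motif), frame, consumed + 1))
            consumed += MOTIF_WIDTHS[int(motif)] * NT_PER_RESIDUE
    return occurrences, consumed

DIAGRAM_AE003602 = (
    "101665-[+2b]-141-[+3b]-183-[+1b]-736-[+2c]-357-[+1c]-752-[-1b]-183-"
    "[-3b]-141-[-2b]-765-[-1b]-183-[-3b]-141-[-2b]-191224"
)
DIAGRAM_AE003567 = "87158-[-1c]-17563"

for diagram in (DIAGRAM_AE003602, DIAGRAM_AE003567):
    occurrences, total = parse_diagram(diagram)
    print(total, occurrences)
```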
Debugging Information
CPU: nbcr2
Time 7.470000 secs.
mast meme.adh.oops.4.sunsparcsolaris.html -stdin -dna -seqp -stdout