MAST - Motif Alignment and Search Tool
MAST version 3.0 (Release date: 2001/03/01 11:41:04)
For further information on how to interpret these results or to get
a copy of the MAST software please access http://meme.sdsc.edu.
REFERENCE
If you use this program in your research, please cite:
Timothy L. Bailey and Michael Gribskov,
"Combining evidence using p-values: application to sequence homology
searches", Bioinformatics, 14(48-54), 1998.
DATABASE AND MOTIFS
DATABASE INO_up800.s (nucleotide)
Last updated on Thu Mar 1 08:28:30 2001
Database contains 7 sequences, 5600 residues
Scores for positive and reverse complement strands are combined.
MOTIFS meme.INO_up800.tcm.4.sunsparcsolaris.html (nucleotide)
MOTIF WIDTH BEST POSSIBLE MATCH
----- ----- -------------------
1 15 TTCACATGCCGCCCC
2 21 GGCCACATCGCCACTGGCGGC
3 29 TTGTCTACGTTTCTGCCGTTCTTCAGGCC
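Occurrences on the reverse strand are annotated in Section III with the reverse complement of these best possible matches. The following is a minimal Python sketch of that transformation; the names `reverse_complement` and `BEST_MATCH` are illustrative and are not part of MAST.

```python
# Complement table for the four DNA letters.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

# Best possible matches copied from the motif table above.
BEST_MATCH = {
    1: "TTCACATGCCGCCCC",
    2: "GGCCACATCGCCACTGGCGGC",
    3: "TTGTCTACGTTTCTGCCGTTCTTCAGGCC",
}

for motif, consensus in BEST_MATCH.items():
    print(motif, reverse_complement(consensus))
```

The strings printed for motifs 1, 2 and 3 are the ones shown above the [-1], [-2] and [-3] occurrences in Section III.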
PAIRWISE MOTIF CORRELATIONS:
MOTIF 1 2
----- ----- -----
2 0.27
3 0.23 0.15
No overly similar pairs (correlation > 0.60) found.
Random model letter frequencies (from non-redundant database):
A 0.281 C 0.222 G 0.229 T 0.267
SECTION I: HIGH-SCORING SEQUENCES
- Each of the following 7 sequences has E-value less than 10.
- The E-value of a sequence is the expected number of sequences
in a random database of the same size that would match the motifs as
well as the sequence does and is equal to the combined p-value of the
sequence times the number of sequences in the database.
- The combined p-value of a sequence measures the strength of the
match of the sequence to all the motifs and is calculated by
    - finding the score of the single best match of each motif
      to the sequence (best matches may overlap),
    - calculating the sequence p-value of each score,
    - forming the product of the p-values,
    - taking the p-value of the product.
- The sequence p-value of a score is defined as the
probability of a random sequence of the same length containing
some match with as good or better a score.
- The score for the match of a position in a sequence to a motif
is computed by summing the appropriate entry from each column of
the position-dependent scoring matrix that represents the motif.
- Sequences shorter than one or more of the motifs are skipped.
- The table is sorted by increasing E-value.
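The last two steps of the combined p-value calculation (forming the product of the p-values and taking the p-value of that product) can be sketched in Python as follows. This sketch assumes the individual p-values are independent and uniformly distributed under the random model, in which case the probability that a product of n such values is at most p is p * sum((-ln p)^i / i!) for i = 0..n-1. Whether MAST 3.0 evaluates exactly this closed form is an assumption, the function name `pvalue_of_product` is illustrative, and the input values are hypothetical because this report does not print the per-motif sequence p-values.

```python
import math

def pvalue_of_product(pvalues):
    """P(product of n independent uniform p-values <= observed product)."""
    n = len(pvalues)
    product = math.prod(pvalues)
    if product <= 0.0:
        return 0.0
    x = -math.log(product)
    # Accumulate sum_{i=0}^{n-1} x**i / i! term by term.
    term, total = 1.0, 1.0
    for i in range(1, n):
        term *= x / i
        total += term
    return product * total

# Hypothetical sequence p-values for three motifs (not taken from this report).
example = [1e-4, 2e-3, 5e-2]
combined = pvalue_of_product(example)
e_value = combined * 7  # 7 sequences in the database
print(f"combined p-value = {combined:.3g}, E-value = {e_value:.3g}")
```

With a single motif the function returns the p-value unchanged; with several motifs the result is larger than the raw product, which corrects for the number of p-values being multiplied.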
| Sequence Name | Description | E-value | Length |
|---|---|---|---|
| INO1 | sequence of the region up... | 1.9e-16 | 800 |
| FAS1 | sequence of the region up... | 1.5e-11 | 800 |
| FAS2 | sequence of the region up... | 1.8e-10 | 800 |
| ACC1 | sequence of the region up... | 3.8e-08 | 800 |
| OPI3 | sequence of the region up... | 1.3e-05 | 800 |
| CHO2 | sequence of the region up... | 0.0024 | 800 |
| CHO1 | sequence of the region up... | 0.0087 | 800 |
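The E-values in this table can be checked against the combined p-values printed for each sequence in Section III, using the definition given above (E-value = combined p-value times the number of sequences in the database, here 7). The short Python check below is a sketch; the names `REPORTED` and `e_value` are illustrative, and the comparison is deliberately loose because the report rounds both quantities.

```python
NUM_SEQUENCES = 7  # "Database contains 7 sequences"

# name: (combined p-value from Section III, E-value from Section I)
REPORTED = {
    "INO1": (2.66e-17, 1.9e-16),
    "FAS1": (2.19e-12, 1.5e-11),
    "FAS2": (2.53e-11, 1.8e-10),
    "ACC1": (5.50e-09, 3.8e-08),
    "OPI3": (1.84e-06, 1.3e-05),
    "CHO2": (3.48e-04, 0.0024),
    "CHO1": (1.25e-03, 0.0087),
}

def e_value(combined_p, num_sequences=NUM_SEQUENCES):
    """E-value = combined p-value times the number of database sequences."""
    return combined_p * num_sequences

for name, (p, e_reported) in REPORTED.items():
    e = e_value(p)
    # Reported values are rounded to two significant digits, so compare loosely.
    assert abs(e - e_reported) / e_reported < 0.05, name
    print(f"{name}: computed {e:.3g}, reported {e_reported:g}")
```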
SECTION II: MOTIF DIAGRAMS
- The ordering and spacing of all non-overlapping motif occurrences
are shown for each high-scoring sequence listed in Section I.
- A motif occurrence is defined as a position in the sequence whose
match to the motif has POSITION p-value less than 0.0001.
- The POSITION p-value of a match is the probability of
a single random subsequence of the length of the motif
scoring at least as well as the observed match.
- For each sequence, all motif occurrences are shown unless there
are overlaps. In that case, a motif occurrence is shown only if its
p-value is less than the product of the p-values of the other
(lower-numbered) motif occurrences that it overlaps.
- The table also shows the E-value of each sequence.
- Spacers and motif occurrences are indicated by
    - occurrence of motif `n' with p-value less than 0.0001.
      A minus sign indicates that the occurrence is on the
      reverse complement strand.
- Sequences longer than 1000 are not shown to scale and are indicated by thicker lines.
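The display rule for overlapping occurrences can be sketched in Python as below. This is a literal reading of the rule as worded above, not a reproduction of MAST's implementation: an occurrence is compared only against overlapping occurrences of lower-numbered motifs, and overlaps between occurrences of the same motif are not treated specially. The `Occurrence` class, the function names, and the example data are all hypothetical.

```python
from dataclasses import dataclass
from math import prod

P_THRESHOLD = 1e-4  # POSITION p-value cutoff quoted above

@dataclass(frozen=True)
class Occurrence:
    motif: int     # motif number
    start: int     # 1-based start position in the sequence
    width: int     # motif width
    pvalue: float  # position p-value of the match

    @property
    def end(self) -> int:
        return self.start + self.width - 1

def overlaps(a: Occurrence, b: Occurrence) -> bool:
    return a.start <= b.end and b.start <= a.end

def displayed(occurrences):
    """Apply the p-value cutoff, then the overlap rule, and sort by position."""
    hits = [o for o in occurrences if o.pvalue < P_THRESHOLD]
    shown = []
    for occ in hits:
        rivals = [o for o in hits
                  if o is not occ and o.motif < occ.motif and overlaps(o, occ)]
        # Show if nothing lower-numbered overlaps, or if this hit is more
        # significant than the product of the overlapping p-values.
        if not rivals or occ.pvalue < prod(r.pvalue for r in rivals):
            shown.append(occ)
    return sorted(shown, key=lambda o: o.start)

# Hypothetical occurrences (not taken from this report).
a = Occurrence(motif=1, start=10, width=15, pvalue=1e-6)
b = Occurrence(motif=3, start=20, width=29, pvalue=1e-5)   # overlaps a, weaker
c = Occurrence(motif=3, start=100, width=29, pvalue=5e-5)  # no overlap
d = Occurrence(motif=2, start=200, width=21, pvalue=1e-3)  # above the cutoff
print([(o.motif, o.start) for o in displayed([a, b, c, d])])
```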
| Name | Expect | Motifs |
|---|---|---|
| INO1 | 1.9e-16 | 92-[+3]-123-[+3]-92-[-2]-103-[+3]-28-[-1]-6-[-1]-34-[-1]-5-[+3]-21-[+1]-4-[+2]-74 |
| FAS1 | 1.5e-11 | 52-[+2]-21-[+1]-243-[+3]-419 |
| FAS2 | 1.8e-10 | 9-[-3]-138-[-3]-38-[+3]-112-[+3]-153-[+1]-131-[+2]-67 |
| ACC1 | 3.8e-08 | 52-[-2]-9-[+1]-171-[+3]-51-[-3]-379-[-3]-15 |
| OPI3 | 1.3e-05 | 291-[+2]-41-[-2]-24-[+3]-73-[-2]-15-[+3]-16-[-1]-40-[+1]-149 |
| CHO2 | 0.0024 | 353-[+1]-109-[-1]-161-[+3]-118 |
| CHO1 | 0.0087 | 162-[+1]-433-[+1]-14-[+1]-83-[+3]-34 |

SCALE: positions 1 to 800, with tick labels at 1, 25, 50, ..., 800.
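Section III writes each motif diagram as a text line such as `52-[+2]-21-[+1]-243-[+3]-419`, where a bare number is a spacer length and `[sn]` is an occurrence of motif `n` on strand `s`. The Python sketch below converts such a line into 1-based start positions using the motif widths from the motif table (15, 21 and 29). The names `parse_diagram`, `TOKEN` and `MOTIF_WIDTH` are illustrative and are not part of MAST.

```python
import re

MOTIF_WIDTH = {1: 15, 2: 21, 3: 29}  # widths from the motif table

# A token is either a strand-signed motif in brackets or a spacer length.
TOKEN = re.compile(r"\[([+-])(\d+)\]|(\d+)")

def parse_diagram(diagram):
    """Return ([(start, strand, motif), ...], total sequence length)."""
    position = 1  # 1-based coordinate of the next residue
    occurrences = []
    for strand, motif, spacer in TOKEN.findall(diagram):
        if spacer:
            position += int(spacer)
        else:
            motif = int(motif)
            occurrences.append((position, strand, motif))
            position += MOTIF_WIDTH[motif]
    return occurrences, position - 1

INO1 = ("92-[+3]-123-[+3]-92-[-2]-103-[+3]-28-[-1]-6-[-1]-34-[-1]-5-[+3]-"
        "21-[+1]-4-[+2]-74")
FAS1 = "52-[+2]-21-[+1]-243-[+3]-419"

for name, diagram in (("INO1", INO1), ("FAS1", FAS1)):
    occurrences, length = parse_diagram(diagram)
    print(name, length, occurrences)
```

For both diagrams the spacers and motif widths add up to 800, the sequence length reported in Section I, which is a convenient consistency check when post-processing this output.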
SECTION III: ANNOTATED SEQUENCES
- The positions and p-values of the non-overlapping motif occurrences
are shown above the actual sequence for each of the high-scoring
sequences from Section I.
- A motif occurrence is defined as a position in the sequence whose
match to the motif has POSITION p-value less than 0.0001 as
defined in Section II.
- For each sequence, the first line specifies the name of the sequence.
- The second (and possibly more) lines give a description of the
sequence.
- Following the description line(s) is a line giving the length,
combined p-value, and E-value of the sequence as defined in Section I.
- The next line reproduces the motif diagram from Section II.
- The entire sequence is printed on the following lines.
- Motif occurrences are indicated directly above their positions in the
sequence on lines showing
    - the motif number of the occurrence (a minus sign indicates that
      the occurrence is on the reverse complement strand),
    - the position p-value of the occurrence,
    - the best possible match to the motif (or its reverse complement), and
    - columns whose match to the motif has a positive score (indicated
      by a plus sign).
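The plus-sign line and the match score defined in Section I both come from the per-column entries of the position-dependent scoring matrix. The Python sketch below shows the idea on a made-up 4-column matrix; the matrix values, the test sequence, and the function names are hypothetical, since the actual motif matrices are not included in this report.

```python
# Hypothetical scoring matrix: one dict of letter scores per motif column.
PSSM = [
    {"A": -2.0, "C": -1.0, "G": -3.0, "T": 1.5},
    {"A": -1.0, "C": -2.0, "G": -2.5, "T": 1.25},
    {"A": -3.0, "C": 2.0, "G": -1.0, "T": -2.0},
    {"A": 1.75, "C": -2.0, "G": -1.5, "T": -1.0},
]

def column_scores(pssm, window):
    """Score contributed by each motif column for one aligned window."""
    if len(window) != len(pssm):
        raise ValueError("window length must equal motif width")
    return [column[letter] for column, letter in zip(pssm, window)]

def match_score(pssm, window):
    """Match score: the sum of the selected entry from each column."""
    return sum(column_scores(pssm, window))

def plus_line(pssm, window):
    """'+' under columns with a positive score, blank otherwise."""
    return "".join("+" if s > 0 else " " for s in column_scores(pssm, window))

def best_match(pssm, sequence):
    """Return (score, 1-based start) of the best-scoring window."""
    width = len(pssm)
    return max((match_score(pssm, sequence[i:i + width]), i + 1)
               for i in range(len(sequence) - width + 1))

sequence = "GGTTCAGG"  # hypothetical sequence
score, start = best_match(PSSM, sequence)
print(score, start)
print(plus_line(PSSM, sequence[start - 1:start - 1 + len(PSSM)]))
```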
INO1
sequence of the region upstream from YJL153C
LENGTH = 800 COMBINED P-VALUE = 2.66e-17 E-VALUE = 1.9e-16
DIAGRAM: 92-[+3]-123-[+3]-92-[-2]-103-[+3]-28-[-1]-6-[-1]-34-[-1]-5-[+3]-21-[+1]-4-[+2]-74
[+3]
3.1e-08
TTGTCTACGTTTCTGCCGTTCTTCAGGCC
++++ + + +++++ ++ + ++++ ++
76 TCAACAAGGACGACTTGTTGTTAATGGTTTTGGCGGTTTTCATTCCCCCAGTGGCCGTCTGGAAGCGTAAGGGTA
[+3]
6.7e-08
TTGTCTACGTTTCTGCCGTTCTTCAGGCC
+ ++ ++++++ + + +++ +++++++
226 ACGTTGTATATGAAACGAGTAGTGAACGTTCGTACGATCTTTCACGCAGACATGCGACTGCGCCCGCCGTAGACC
[-2]
1.5e-10
GCCGCCAGTG
++++++++++
301 GTGACCTGGAAGCTCACCCTGCAGAGGAATCTCAAGCACAGCCTCCAGCATATGATGAAGACGATGAGGCCGGTG
GCGATGTGGCC
+++++++++++
376 CCGATGTGCCCTTGATGGACAACAAACAACAGCTCTCTTCCGGCCGTACTTAGTGATCGGAACGAGCTCTTTATC
[+3]
6.5e-13
TTGTCTACGTTTCTGCCGTTCTTCAGGCC
+++++++++++++++++++++++++++ +
451 ACCGTAGTTCTAAATAACACATAGAGTAAATTATTGCCTTTTTCTTCGTTCCTTTTGTTCTTCACGTCCTTTTTA
[-1] [-1]
5.3e-06 1.7e-06
GGGGCGGCATGTGAA GGGGCGGCATGTGAA
+++++++ +++ + + +++++++ +++++++
526 TGAAATACGTGCCGGTGTTCCGGGGTTGGATGCGGAATCGAAAGTGTTGAATGTGAAATATGCGGAGGCCAAGTA
[-1] [+3]
6.6e-08 1.7e-08
GGGGCGGCATGTGAA TTGTCTACGTTTCTGCCGTTCTTCAGGCC
+++++++++++++++ +++++++ ++ + ++++++++++ ++
601 TGCGCTTCGGCGGCTAAATGCGGCATGTGAAAAGTATTGTCTATTTTATCTTCATCCTTCTTTCCCAGAATATTG
[+1] [+2]
2.1e-06 2.1e-07
TTCACATGCCGCCCC GGCCACATCGCCACTGGCGGC
+++++++++++++ + ++++ ++ ++ +++++++
676 AACTTATTTAATTCACATGGAGCAGAGAAAGCGCACCTCTGCGTTGGCGGCAATGTTAATTTGAGACGTATATAA
FAS1
sequence of the region upstream from YKL182W
LENGTH = 800 COMBINED P-VALUE = 2.19e-12 E-VALUE = 1.5e-11
DIAGRAM: 52-[+2]-21-[+1]-243-[+3]-419
[+2]
1.8e-07
GGCCACATCGCCACTGGCGGC
+ ++++ + ++ ++++++++
1 CCGGGTTATAGCAGCGTCTGCTCCGCATCACGATACACGAGGTGCAGGCACGGTTCACTACTCCCCTGGCCTCCA
[+1]
1.7e-08
TTCACATGCCGCCCC
++++++++++++++
76 ACAAACGACGGCCAAAAACTTCACATGCCGCCCAGCCAAGCATAATTACGCAACAGCGATCTTTCCGTCGCACAA
[+3]
3.1e-10
TTGTCTACGTTTCTGCCGTTCTT
+++++ + +++++++++++++
301 TTTTGGCATTTTTGGCATACTTTTTATCGATTGAACCATCTTCTCCAAACACTTTTCCTTTTTCCTTCTATTCTG
CAGGCC
++++ +
376 CAGGACCAACTAAAACTGGGTATATATATCATTATCTATATATATAAACGGCTTTCAACAAAGTTATAGGGGAAA
FAS2
sequence of the region upstream from YPL231W
LENGTH = 800 COMBINED P-VALUE = 2.53e-11 E-VALUE = 1.8e-10
DIAGRAM: 9-[-3]-138-[-3]-38-[+3]-112-[+3]-153-[+1]-131-[+2]-67
[-3]
5.8e-08
GGCCTGAAGAACGGCAGAAACGTAGACAA
+++ +++++++ + +++++ + ++++
1 TCCAGGCAAGGCACCAAGAGTTATTGAAACTAGAAAAATCCATGGCAGAACTTACTCAATTGTTTAATGACATGG
[-3]
1.8e-08
GGCCTGAAGAACGGCAGAAACGTAGACAA
+ +++++++++++ ++++++ + + +++
151 AACAGGGTGTCGGTCATACCGATAAAGCCGTCAAGAGTGCCAGAAAAGCAAGAAAGAACAAGATTAGATGTTGGT
[+3]
3.9e-08
TTGTCTACGTTTCTGCCGTTCTTCAGGCC
++++ +++++ ++++++++ +++ + +
226 TGATTGTATTCGCCATCATTGTAGTCGTTGTTGTTGTCGTTGTTGTCCCAGCCGTTGTCAAAACGCGTTAATTCC
[+3]
8.9e-09
TTGTCTACGTTTCTGCCGTTCTTCAGGCC
+++ +++ ++++ ++++ +++ ++++++
376 TTTACTATATTTCCTAAATTTTCTCTGGTCTGCAGGCCAAAAACAACAACTTACTACTGAATCATGGACGTGTAT
[+1]
4.0e-07
TTCACATGCCGCCCC
+++++++++ +++++
526 GCTTAGCAAAATCCAACCATTTTTTTTTTATCTCCCGCGTTTTCACATGCTACCTCATTCGCCTCGTAACGTTAC
[+2]
3.7e-09
GGCCACATCGCCACTGGCGGC
+ +++++++++++++++++++
676 TGCCTCATATATAACTTGTTAACTGAAGGTTACACAAGACCACATCACCACTGTCGTGCTTTTCTAATAACCGCT
ACC1
sequence of the region upstream from YNR016C
LENGTH = 800 COMBINED P-VALUE = 5.50e-09 E-VALUE = 3.8e-08
DIAGRAM: 52-[-2]-9-[+1]-171-[+3]-51-[-3]-379-[-3]-15
[-2]
2.5e-07
GCCGCCAGTGGCGATGTGGCC
+++ ++++ +++ ++ ++++
1 TATCCAAAGGGGAATGCTTCATCTTGTTGAACAACGCCCAACAATTTCCACTGCCCACCGAATCGTTGCGCCCGT
[+1]
8.8e-07
TTCACATGCCGCCCC
++++++++++++ +
76 TAAAATCTTCACATGGCCCGGCCGCGCGCGCGTTGTGCCAACAAGTCGCAGTCGAAATTCAACCGCTCATTGCCA
[+3]
2.0e-08
TTGTCTACGTTTCTGCCGTTCTTCAGGCC
++++ +++ ++++ + +++ +++ + +
226 TCGTATTCTGGCACAGTATAGCCTAGCACAATCACTGTCACAATTGTTATCGGTTCTACAATTGTTCTGCTCTCT
[-3]
1.1e-07
GGCCTGAAGAACGGCAGAAACGTAGAC
++++++++ +++++ +++ +++ +
301 TCAATTTTCCTTTCCTTATTCTACTCTTTTTATCCCTTTCGTACAGTTTACCTGAAGATAAAAAACAACAAAGCC
AA
++
376 AATTCCCTAATTTGCAATCGCCATTTGCATCTATATATATATATTTGTTGTGCCATTTTTTTATCCTCTGTGAGT
[-3]
1.0e-07
GGCCTGAAGAACGGCAGAAACGTAGACAA
++ +++ + +++++++ +++++++
751 TTTCCGTTCCCGAAACAGCGCAGAAAATTAGAAAAAATCAAGTTTCTACC
OPI3
sequence of the region upstream from YJR073C
LENGTH = 800 COMBINED P-VALUE = 1.84e-06 E-VALUE = 1.3e-05
DIAGRAM: 291-[+2]-41-[-2]-24-[+3]-73-[-2]-15-[+3]-16-[-1]-40-[+1]-149
[+2]
9.1e-07
GGCCACATC
+ +++ +++
226 ATAACGAGTCCGGTGAACTTGGTTCCTTGCTGAACAGTGTCTTCTTGTAAAGCTTCCCATTTGGTGGTCCCGTTC
[-2]
1.7e-06
GCCACTGGCGGC GCCGCCAGTGGCGATGTGGCC
+++ +++++ ++ + +++++++ ++ +++ + +
301 AACTCCGTCAGGTCTTCCACGTGGAACTGCCAAGCCTCCTTCAGATCGCTCTTGTCGACCGTCTCCAAGAGATCC
[+3]
5.3e-05
TTGTCTACGTTTCTGCCGTTCTTCAGGCC
+ +++ + +++ ++ +++++++ +
376 ACGATAATGCTTTCATTGGTGGCTAGTCCATCTTCGAATTCTTCTTCATCGCGACGGGAATTGACGTACACCTCC
[-2]
2.5e-08
GCCGCCAGTGGCGATGTGGCC
+++++++++ +++++++ +++
451 TGTGTATCGGGGACTTCTCTTAGAGTAGAAGCGTCTATAAACCCAGGTGGGACGACAGTAGTGATGGCGCCGCCG
[+3] [-1]
5.7e-05 3.4e-06
TTGTCTACGTTTCTGCCGTTCTTCAGGCC GGGGCGGCATGTGAA
++ + +++ + + +++++++ + + +++ +++++++++ +
526 TATAATTCGACTTCCTTGTTGTTCATGCTTCCTTGATGACCAGGGTAGGTGTCAATGAGAGTGCATGTGGAAAGT
[+1]
2.0e-06
TTCACATGCCGCCCC
++++ ++++++ +++
601 TGCACCGGTTGTGAAATATGAGAAGCCTTTTCAATCTTCATATGCAAACCCACACATGCATCGTTGGTTTCTGTC
CHO2
sequence of the region upstream from YGR157W
LENGTH = 800 COMBINED P-VALUE = 3.48e-04 E-VALUE = 0.0024
DIAGRAM: 353-[+1]-109-[-1]-161-[+3]-118
[+1]
1.2e-06
TTCACATGCCGCCCC
+++ +++++++++++
301 ATATATATTTTTGCCTTGGTTTAAATTGGTCAAGACAGTCAATTGCCACACTTTTCTCATGCCGCATTCATTATT
[-1]
1.2e-05
GGGGCGGCATGTGAA
+++ +++++ ++++
451 GTGGATCTCCGACACATGTGAATTTATAAGTAGGCATATGAAAATACAGATTCTTTCCACTGTGTTCCCTTTTAT
[+3]
1.3e-06
TTGTCTACGTTTCTGCCGTTCT
++++ ++++++ +++ +++
601 TTTTTTTTGCCTCTTTGATTAGTTTATCTTCTTTTCTTCATTTTATCCCCTAATTTTATACGTTAGTTCAACCTA
TCAGGCC
++ ++
676 ACAATCCAGGATTTCATTAACAAGAAAGGTAAAAGTAACCTATCAAGGCTATTTTGAAAAAAAAAATTCCGCCCT
CHO1
sequence of the region upstream from YER026C
LENGTH = 800 COMBINED P-VALUE = 1.25e-03 E-VALUE = 0.0087
DIAGRAM: 162-[+1]-433-[+1]-14-[+1]-83-[+3]-34
[+1]
6.4e-06
TTCACATGCCGCCCC
+++ +++ +++ +++
151 CGTCTGGCGCCCTTCCCATTCCGAACCATGTTATATTGAACCATCTGGCGACAAGCAGTATTAAGCATAATACAT
[+1] [+1]
4.5e-07 1.5e-06
TTCACATGCCGCCCC TTCACATGCCGCCCC
++++++ ++++++++ +++++++++++++ +
601 ACTTTGAACGTTCACACGGCACCCTCACGCCTTTGAGCTTTCACATGGACCCATCTAAAGATGAAGATCCGTATT
[+3]
3.2e-05
TTGTCTACGTTTC
+++++ + ++
676 TTATAGGAAACATTATAAATAAGGAAAGAGAGATACACCTATTTTTTTCATTTTGTGGGTGATTGTCATTTTTAG
TGCCGTTCTTCAGGCC
++ + + ++++ +
751 TTGTCTATTTGATTCAATCAAAAAACAAAAATAAAACTATATATTAAAAA
Debugging Information
CPU: nbcr2
Time 0.220000 secs.
mast meme.INO_up800.tcm.4.sunsparcsolaris.html -stdout