root/PrimerMatch/compress_seq.htm
Revision: 1.1.1.1 (vendor branch)
Committed: Wed Dec 22 21:37:17 2004 UTC by nje01
Branch: MAIN
CVS Tags: HEAD, RELEASE-20041222
Changes since 1.1: +0 -0 lines
Log Message:
Initial primer_match import

<html>
<p align="center"><font size=+2 face="Arial">compress_seq</font></p>

<h3>Name</h3>
<blockquote>
<p align=left>
<b>
compress_seq</b> - Normalize and compress a multi-FASTA
sequence database</p>
</blockquote>
<h3>Synopsis</h3>
<blockquote>
<p><b>compress_seq</b> -i <i>fasta_sequence_database</i> [ <i>options</i> ]</p>
</blockquote>
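<p>When <b>compress_seq</b> is driven from a script, the synopsis above can be assembled programmatically. The following is a minimal Python sketch, not part of the distribution: the helper name <code>build_compress_seq_command</code> and its keyword arguments are hypothetical, and it emits only the -i, -n and -F options documented on this page.</p>

```python
from typing import List, Optional


def build_compress_seq_command(
    database: str,
    normalize: Optional[bool] = None,
    force: Optional[bool] = None,
) -> List[str]:
    """Assemble an argument list for compress_seq (hypothetical helper).

    Only options documented on this page are used: -i, -n and -F.
    An option left as None is omitted so the tool's default applies.
    """
    argv = ["compress_seq", "-i", database]
    for flag, value in (("-n", normalize), ("-F", force)):
        if value is not None:
            # Boolean options take the literal words "true" or "false".
            argv += [flag, "true" if value else "false"]
    return argv


if __name__ == "__main__":
    # "db.fasta" is a placeholder file name.
    print(" ".join(build_compress_seq_command("db.fasta", normalize=True)))
```

<p>The returned list can be handed to <code>subprocess.run</code> on a system where the binary is installed; the sketch itself does not invoke the tool.</p>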
<h3>Description</h3>
<blockquote>
<p><b>compress_seq</b> takes a multi-FASTA sequence database and splits it into
separate sequence, header and index files. Depending on the command line options given, it can output the sequence data in a
variety of forms, including a DNA-optimized normalized form and a bit-compressed form. It can also add
an arbitrary end-of-sequence character to each sequence entry and force each sequence character to uppercase.</p>
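<p>The split described above can be illustrated in memory. The Python sketch below is an assumption-laden illustration only: the function name <code>split_fasta</code> is hypothetical, the index is modeled as a plain list of start offsets, and nothing here reflects the actual on-disk formats written by <b>compress_seq</b>, which this page does not specify.</p>

```python
from typing import List, Tuple


def split_fasta(
    text: str, eos: str = "\n", uppercase: bool = False
) -> Tuple[str, List[str], List[int]]:
    """Split multi-FASTA text into sequence data, headers and start offsets.

    Illustrative only: the real file formats are not described in the
    documentation, so this models them as simple Python values.
    """
    headers: List[str] = []
    chunks: List[List[str]] = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            headers.append(line[1:])   # header text without the ">" marker
            chunks.append([])          # start collecting a new entry
        elif line and chunks:
            chunks[-1].append(line.upper() if uppercase else line)

    sequence = ""
    index: List[int] = []
    for parts in chunks:
        index.append(len(sequence))        # offset where this entry starts
        sequence += "".join(parts) + eos   # terminate the entry with eos
    return sequence, headers, index


if __name__ == "__main__":
    fasta = ">a\nACGT\nAC\n>b\nGTTT\n"
    seq, hdr, idx = split_fasta(fasta)
    print(repr(seq), hdr, idx)
    # With a separator, a pattern cannot match across the entry boundary.
    print("TACG" in seq, "TACG" in split_fasta(fasta, eos="")[0])
```

<p>In this toy input the pattern TACG occurs only across the boundary between the two entries, so it is found when no separator is inserted and not found when the end-of-sequence character is present.</p>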
<p>If <b>compress_seq</b> is used to pre-process a sequence database for <b>primer_match</b>,
use the <u>-n true</u> option to index and normalize the sequence database
for the best performance.</p>
<p>By default, <b>compress_seq</b> splits the multi-FASTA sequence database file
<i>db</i> into three files: <i>db</i>.seq, <i>db</i>.hdr and <i>db</i>.idb. To make
<b>compress_seq</b> more suitable for use in computational analysis pipelines, it reconstructs the sequence database component
files only if the file system timestamps indicate that it is necessary. The -F option can be used to force each component to be re-made.</p>
</blockquote>
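<p>The timestamp rule resembles the check performed by build tools such as make. The Python sketch below shows one plausible form of that check; the function name <code>needs_rebuild</code> is hypothetical, and the exact comparison used by <b>compress_seq</b> is not documented here, so treat the logic as an assumption.</p>

```python
import os


def needs_rebuild(source: str, target: str, force: bool = False) -> bool:
    """Decide whether a component file should be regenerated.

    Assumed rule (not confirmed by the documentation): rebuild when forced,
    when the target is missing, or when the target is older than the source.
    """
    if force or not os.path.exists(target):
        return True
    return os.path.getmtime(target) < os.path.getmtime(source)


if __name__ == "__main__":
    # "db" and "db.seq" follow the naming used on this page; the files
    # are placeholders and must exist for a meaningful answer.
    if os.path.exists("db"):
        print(needs_rebuild("db", "db.seq"))
```

<p>Passing <code>force=True</code> plays the role of the -F option: the component is treated as stale regardless of its timestamp.</p>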
<h3>Command Line Options</h3>
<blockquote>
<p>-i <i>fasta_sequence_database</i></p>
<blockquote>
<p>Name of the multi-FASTA sequence database file to process. <b>Required</b>.</p>
</blockquote>
<p>-I ( true | false )</p>
<blockquote>
<p>Write the FASTA index file in binary format. <b>Default:</b> true.</p>
</blockquote>
<p>-n ( true | false )</p>
<blockquote>
<p>Create a normalized version of the sequence data. This creates additional
files with suffixes .sqn and .tbl. <b>Default:</b> false.</p>
</blockquote>
<p>-D ( true | false )</p>
<blockquote>
<p>Optimize the normalized sequence data for DNA sequence. <b>Default:</b> true.</p>
</blockquote>
<p>-z ( true | false )</p>
<blockquote>
<p>Create a bit-compressed normalized version of the sequence data. This creates additional files with suffixes .sqz and .tbz.
<b>Default:</b> false.</p>
</blockquote>
<p>-u ( true | false )</p>
<blockquote>
<p>Force the sequence data to uppercase characters. <b>Default:</b> false.</p>
</blockquote>
<p>-e ( true | false )</p>
<blockquote>
<p>Add an end-of-sequence character after each entry from the multi-FASTA sequence database file. This helps ensure that text matching algorithms cannot find a match that straddles two
FASTA entries. <b>Default:</b> true.</p>
</blockquote>
<p>-E <i>eos</i></p>
<blockquote>
<p>Use <i>eos</i> as the end-of-sequence character. <i>eos</i> is the ASCII code of the desired character; it may be specified as a decimal, octal or hexadecimal number.
<b>Default:</b> newline (ASCII octal 012, decimal 10).</p>
</blockquote>
<p>-S ( true | false )</p>
<blockquote>
<p>Insert an end-of-sequence character before the initial sequence entry. <b>Default:</b> false.</p>
</blockquote>
<p>-F ( true | false )</p>
<blockquote>
<p>Force each component of the compressed sequence database to be regenerated, even if the file timestamps indicate that this isn't necessary.
<b>Default:</b> false.</p>
</blockquote>
<p>-C ( true | false )</p>
<blockquote>
<p>Clean up unnecessary temporary files. <b>Default:</b> true.</p>
</blockquote>
<p>-B</p>
<blockquote>
<p>Use buffered standard I/O rather than mmap to stream through the sequence
database. On platforms where mmap behaves unpredictably,
this option may make it possible to run <b>compress_seq</b> reliably.</p>
</blockquote>
<p>-v</p>
<blockquote>
<p>Output the release tag of the binary.</p>
</blockquote>
<p>-h</p>
<blockquote>
<p>Command-line help.</p>
</blockquote>
</blockquote>
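<p>The -E option accepts a character code in decimal, octal or hexadecimal. This page does not say how the base is detected, so the Python sketch below assumes the common C convention (a leading <code>0x</code> means hexadecimal, a leading <code>0</code> means octal, anything else is decimal). The function name <code>parse_eos</code> is hypothetical.</p>

```python
def parse_eos(value: str) -> str:
    """Convert a numeric character code into the character it denotes.

    Assumption: C-style prefixes select the base ("0x" hexadecimal,
    leading "0" octal, otherwise decimal). The tool's real parsing
    rules are not documented on this page.
    """
    text = value.strip().lower()
    if text.startswith("0x"):
        code = int(text, 16)
    elif len(text) > 1 and text.startswith("0"):
        code = int(text, 8)
    else:
        code = int(text, 10)
    if not 0 <= code <= 255:
        raise ValueError("character code out of range: " + value)
    return chr(code)


if __name__ == "__main__":
    # Three spellings of the newline character under the assumed convention.
    for spelling in ("10", "012", "0x0A"):
        print(spelling, repr(parse_eos(spelling)))
```

<p>Under this convention, 10, 012 and 0x0A all denote newline, whereas a bare 12 would be read as decimal 12 (form feed), so the prefix matters when choosing a separator.</p>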
<h3>See Also</h3>
<blockquote>
<p><a href="primer_match.htm">primer_match</a>, <a href="pcr_match.htm">pcr_match</a></p>
</blockquote>
<h3>Author</h3>
<blockquote>
<p>Nathan Edwards</p>
</blockquote>
</html>