root/bioseg/trunk/README.bioseg
# Line 20 | Line 20
20   INSTALLATION
21   ============
22  
23 < Change into the contrib directory in PostgreSQL and unpack the bioseg tar
24 < file:
23 > Change into the contrib directory in the PostgreSQL source and unpack the
24 > bioseg tar file:
25      gzip -d < bioseg-x.y.tar.gz | tar xf -
26  
27 + (Or check-out from subversion with:
28 +   svn checkout svn://bioinformatics.org/svnroot/bioseg/trunk bioseg
29 + in the contrib directory)
30 +
31   To install the type, change to the bioseg directory and run
32  
33      make
34      make install
35  
36   The user running "make install" may need root access; depending on the
37 < configuration of PostgreSQL.
37 > configuration of PostgreSQL.  If so this may work:
38 >
39 >    sudo make install
40  
41   This only installs the type implementation and documentation.  To make the
42   type available in any particular database, do
# Line 51 | Line 57
57   If you have a full installation of PostgreSQL, including the pg_config
58   program, bioseg can be unpacked anywhere and built like:
59  
60 <     make USE_PGXS=t
61 <     make install USE_PGXS=t
60 >    make USE_PGXS=t clean
61 >    make USE_PGXS=t
62 >    make install USE_PGXS=t
63 >    (or: sudo make install USE_PGXS=t)
64  
65   and the type can then be installed in a particular database by any user with:
66  
# Line 62 | Line 70
70   SYNTAX
71   ======
72  
73 < The external representation of an interval is formed using one or two
73 > The user visible representation of an interval is formed using one or two
74   integers greater than 0 joined by the range operator ('..' or '...').
75   The first integer must be less than or equal to the second.
76  
77 < 11..22        An interval from 10 to 20 inclusive - length 11 (= 22-11+1)
77 >  11..22        An interval from 11 to 22 inclusive - length 12 (= 22-11+1)
78 >
79 >  1...2         The same as 1..2
80  
81 < 1...2         The same as 1..2
81 >  50            The same as 50..50
82  
83 < 50            The same as 50..50
83 > In a statement, bioseg values have the form:
84 >  '<start>..<end>'::bioseg
85 > or can be created with:
86 >  bioseg_create(start, end)
87 >
88 > For example:
89 >  CREATE TABLE test_bioseg (id integer, seg bioseg);
90 >  insert into test_bioseg values (1, '1000..2000'::bioseg);
91 > or, equivalently
92 >  insert into test_bioseg values (1, bioseg_create(1000, 2000));
93  
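The syntax rules in this hunk can be modelled outside the database. The following is a minimal Python sketch, not part of bioseg: the names parse_bioseg and length are hypothetical, and the real input parser may treat some inputs (surrounding whitespace, for example) differently.

```python
import re

# Hypothetical model of the 1-based syntax described in the README;
# this is not the bioseg parser itself.
_SEG_RE = re.compile(r"(\d+)(?:\.{2,3}(\d+))?")


def parse_bioseg(text):
    """Return (start, end) for text such as '11..22', '1...2' or '50'."""
    match = _SEG_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid interval: {text!r}")
    start = int(match.group(1))
    # A single integer N is shorthand for N..N.
    end = int(match.group(2)) if match.group(2) is not None else start
    if start < 1:
        raise ValueError("coordinates must be greater than 0")
    if start > end:
        raise ValueError("start must be less than or equal to end")
    return start, end


def length(seg):
    """Length of a 1-based inclusive interval: end - start + 1."""
    start, end = seg
    return end - start + 1
```

With this model, length(parse_bioseg('11..22')) gives 12, matching the corrected example in the hunk above.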
94  
95   USAGE
96   =====
97  
98 < Available operators include:
98 > See http://www.bioinformatics.org/bioseg/wiki/Main/BiosegUsage for usage
99 > examples.
100 >
101 > The following is a list of the available operators.  The [a, b] should be
102 > replaced in a statement with 'a..b'::bioseg or bioseg_create(a, b).
103  
104   [a, b] && [c, d]        Overlaps
105  
# Line 118 | Line 141
141          The segment [a, b] is contained in [c, d], that is,
142          a >= c and b <= d
143  
121 - Although the mnemonics of the following operators is questionable, I
122 - preserved them to maintain visual consistency with other geometric
123 - data types defined in PostgreSQL.
124 -
144   Other operators:
145  
146   [a, b] < [c, d]         Less than
# Line 134 | Line 153
153          you want to use ORDER BY with this type
154  
155  
156 < NOTE: The performance of an R-tree index can largely depend on the
157 < order of input values. It may be very helpful to sort the input table
139 < on the BIOSEG column (see the script sort-segments.pl for an example)
156 > NOTE: The performance of an R-tree index can largely depend on the order of
157 > input values.  It may be helpful to sort the input table on the BIOSEG column.
158  
159  
160   INDEXES
# Line 148 | Line 166
166    CREATE TABLE tt (range bioseg, id integer);
167    CREATE INDEX tt_range_idx ON tt USING gist (range);
168  
169 + Or for an existing table a function index can be used.  For example on a
170 + feature table with fmin and fmax:
171 +
172 +  CREATE INDEX bioseg_index ON feature USING gist (bioseg_create(fmin, fmax));
173 +
174 + This query will then find features that overlap 2000..3000, using the index:
175 +
176 +  SELECT * FROM feature
177 +           WHERE '2000..3000'::bioseg && bioseg_create(fmin, fmax);
178 +
179  
180   INTERBASE COORDINATES
181   =====================
# Line 157 | Line 185
185   based" or "half-open intervals") run the build with INTERBASE_COORDS defined
186   in make, ie.:
187  
188 +    make clean
189      make INTERBASE_COORDS=t
190      make install INTERBASE_COORDS=t
191 +    (or: sudo make install INTERBASE_COORDS=t)
192  
193   This will compile and install the implementation for the "bioseg0" type.
194 < The "0" in the name being a mnemonic for "0-based".
194 > The "0" in the name is a mnemonic for "0-based".
195  
196 < Then restart PostgreSQL and read "bioseg0.sql":
196 > Then read "bioseg0.sql" into your database:
197      psql -d databasename < bioseg0.sql
198 < as to install the type in a database.
198 > to install the type.
199  
200 < Note
201 < ----
200 > The bioseg and bioseg0 types can be mixed in the same database.
201 >
202 > Notes
203 > -----
204   In the interbase system '1..10'::bioseg0 and '10..20'::bioseg0 don't overlap,
205   whereas in the 1-based system '1..10'::bioseg and '10..20'::bioseg have a one
206   base overlap.  Also note that the length of '1..10'::bioseg0 is 9, whereas the
207   length of '1..10'::bioseg is 10.
208  
209 + Unlike the bioseg type the start and/or end of a bioseg0 can be negative, with
210 + the expected results.
211 +  eg.   bioseg0_size('-10..10'::bioseg0) == 20
212 +
213   See:
214   http://www.gmod.org/wiki/index.php/Introduction_to_Chado#Interbase_Coordinates
215   for a longer discussion of the differences between the coordinate systems.
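The difference between the two coordinate systems described in the notes above can be checked with a small Python model. This is a sketch of the stated semantics only, not the bioseg implementation; all function names are hypothetical, and intervals are plain (start, end) tuples.

```python
def overlaps_1based(a, b):
    """1-based inclusive intervals: sharing an endpoint counts as overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlaps_interbase(a, b):
    """Interbase (half-open) intervals: touching at a boundary is no overlap."""
    return a[0] < b[1] and b[0] < a[1]


def length_1based(seg):
    """Both endpoints are counted, hence the +1."""
    return seg[1] - seg[0] + 1


def length_interbase(seg):
    """Coordinates sit between bases, so the length is simply end - start."""
    return seg[1] - seg[0]
```

Under this model (1, 10) and (10, 20) overlap in the 1-based system but not in the interbase system, and the interbase length of (-10, 10) is 20, which agrees with the examples in the README text.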
# Line 192 | Line 228
228  
229   Note from the author: Most of the code and all of the hard work needed to
230   implement BIOSEG was by Gene Selkov, Jr, author of the SEG type (contrib/seg
231 < in the PostgreSQL source).  All bugs are due to me.
231 > in the PostgreSQL source).  All bugs are due to me (kmr).
232 >
233 >
234 > THANKS
235 > ======
236 >
237 > Thanks to bioinformatics.org for hosting the project.
238  
239  
240   AUTHOR
241   ======
242  
243   Kim Rutherford <kmr@flymine.org>
244 +
245 + SEG code by Gene Selkov, Jr.
246 +
