root/bioseg/trunk/README.bioseg
# Line 24 | Line 24
24   file:
25      gzip -d < bioseg-x.y.tar.gz | tar xf -
26  
27 + (Or check-out from subversion with:
28 +   svn checkout svn://bioinformatics.org/svnroot/bioseg/trunk bioseg
29 + in the contrib directory)
30 +
31   To install the type, change to the bioseg directory and run
32  
33      make
34      make install
35  
36   The user running "make install" may need root access, depending on the
37 < configuration of PostgreSQL.
37 > configuration of PostgreSQL.  If so, this may work:
38 >
39 >    sudo make install
40  
41   This only installs the type implementation and documentation.  To make the
42   type available in any particular database, do
# Line 51 | Line 57
57   If you have a full installation of PostgreSQL, including the pg_config
58   program, bioseg can be unpacked anywhere and built like:
59  
60 <     make USE_PGXS=t
61 <     make install USE_PGXS=t
60 >    make USE_PGXS=t clean
61 >    make USE_PGXS=t
62 >    make install USE_PGXS=t
63 >    (or: sudo make install USE_PGXS=t)
64  
65   and the type can then be installed in a particular database by any user with:
66  
# Line 62 | Line 70
70   SYNTAX
71   ======
72  
73 < The external representation of an interval is formed using one or two
73 > The user-visible representation of an interval is formed using one or two
74   integers greater than 0 joined by the range operator ('..' or '...').
75   The first integer must be less than or equal to the second.
76  
77 < 11..22        An interval from 10 to 20 inclusive - length 11 (= 22-11+1)
77 >  11..22        An interval from 11 to 22 inclusive - length 12 (= 22-11+1)
78 >
79 >  1...2         The same as 1..2
80  
81 < 1...2         The same as 1..2
81 >  50            The same as 50..50
82  
83 < 50            The same as 50..50
83 > In a statement, bioseg values have the form:
84 >  '<start>..<end>'::bioseg
85 > or can be created with:
86 >  bioseg_create(start, end)
87 >
88 > For example:
89 >  CREATE TABLE test_bioseg (id integer, seg bioseg);
90 >  insert into test_bioseg values (1, '1000..2000'::bioseg);
91 > or, equivalently
92 >  insert into test_bioseg values (1, bioseg_create(1000, 2000));
93  
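The syntax rules above can be modelled outside the database. The following is a minimal Python sketch of those rules only; it is not bioseg's own parser, and the names parse_bioseg and bioseg_length are hypothetical helpers introduced for illustration. It assumes no whitespace is allowed inside a literal, which the README does not state either way.

```python
import re

# Hypothetical model of the literal syntax: one integer, or two integers
# joined by '..' or '...'. Whitespace handling is an assumption (none allowed).
_LITERAL = re.compile(r"^(\d+)(?:\.\.\.?(\d+))?$")


def parse_bioseg(text):
    """Return (start, end) for literals such as '11..22', '1...2' or '50'."""
    match = _LITERAL.match(text)
    if match is None:
        raise ValueError(f"not a bioseg literal: {text!r}")
    start = int(match.group(1))
    # A single integer N means the same as N..N.
    end = int(match.group(2)) if match.group(2) is not None else start
    if start < 1:
        raise ValueError("coordinates must be greater than 0")
    if start > end:
        raise ValueError("start must be less than or equal to end")
    return start, end


def bioseg_length(start, end):
    """Length of a 1-based interval with both ends included."""
    return end - start + 1


if __name__ == "__main__":
    start, end = parse_bioseg("11..22")
    print(start, end, bioseg_length(start, end))
```

Because both ends are included, the length is end - start + 1, which is why 11..22 has length 12.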
94  
95   USAGE
96   =====
97  
98 < Available operators include:
98 > See http://www.bioinformatics.org/bioseg/wiki/Main/BiosegUsage for usage
99 > examples.
100 >
101 > The following is a list of the available operators.  The [a, b] should be
102 > replaced in a statement with 'a..b'::bioseg or bioseg_create(a, b).
103  
104   [a, b] && [c, d]        Overlaps
105  
# Line 134 | Line 157
157          you want to use ORDER BY with this type
158  
159  
160 < NOTE: The performance of an R-tree index can largely depend on the
161 < order of input values. It may be very helpful to sort the input table
139 < on the BIOSEG column (see the script sort-segments.pl for an example)
160 > NOTE: The performance of an R-tree index can largely depend on the order of
161 > input values.  It may be helpful to sort the input table on the BIOSEG column.
162  
163  
164   INDEXES
# Line 148 | Line 170
170    CREATE TABLE tt (range bioseg, id integer);
171    CREATE INDEX tt_range_idx ON tt USING gist (range);
172  
173 + Or for an existing table a function index can be used.  For example on a
174 + feature table with fmin and fmax:
175 +
176 +  CREATE INDEX bioseg_index ON feature USING gist (bioseg_create(fmin, fmax));
177 +
178 + This query will then find features that overlap 2000..3000, using the index:
179 +
180 +  SELECT * FROM feature
181 +           WHERE '2000..3000'::bioseg && bioseg_create(fmin, fmax);
182 +
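To make the result of the overlap query concrete, here is a small Python sketch of which rows it would select. It models only the meaning of && for 1-based inclusive intervals, not the GiST index or the extension's code; the feature rows and the function names are hypothetical.

```python
def overlaps(a, b, c, d):
    """True if the inclusive intervals [a, b] and [c, d] share a position."""
    return a <= d and c <= b


def features_overlapping(rows, start, end):
    """Names of the (name, fmin, fmax) rows that overlap start..end."""
    return [name for name, fmin, fmax in rows if overlaps(start, end, fmin, fmax)]


# Hypothetical feature rows: (name, fmin, fmax)
features = [
    ("f1", 100, 1999),   # ends just before 2000: no overlap
    ("f2", 1500, 2000),  # shares position 2000
    ("f3", 2500, 2600),  # entirely inside 2000..3000
    ("f4", 3000, 4000),  # shares position 3000
    ("f5", 3001, 3500),  # starts just after 3000: no overlap
]

hits = features_overlapping(features, 2000, 3000)
print(hits)
```

Features that only touch the query at a single end position (f2 and f4 here) still count as overlapping, because both ends of a bioseg are inclusive.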
183  
184   INTERBASE COORDINATES
185   =====================
# Line 157 | Line 189
189   based" or "half-open intervals") run the build with INTERBASE_COORDS defined
190   in make, i.e.:
191  
192 +    make clean
193      make INTERBASE_COORDS=t
194      make install INTERBASE_COORDS=t
195 +    (or: sudo make install INTERBASE_COORDS=t)
196  
197   This will compile and install the implementation for the "bioseg0" type.
198 < The "0" in the name being a mnemonic for "0-based".
198 > The "0" in the name is a mnemonic for "0-based".
199  
200 < Then restart PostgreSQL and run:
201 <    psql -d databasename < bioseg.sql
202 < as usual to install the type in the database.
200 > Then read "bioseg0.sql" into your database:
201 >    psql -d databasename < bioseg0.sql
202 > to install the type.
203  
204 < Note
205 < ----
204 > The bioseg and bioseg0 types can be mixed in the same database.
205 >
206 > Notes
207 > -----
208   In the interbase system '1..10'::bioseg0 and '10..20'::bioseg0 don't overlap,
209   whereas in the 1-based system '1..10'::bioseg and '10..20'::bioseg have a one
210   base overlap.  Also note that the length of '1..10'::bioseg0 is 9, whereas the
211   length of '1..10'::bioseg is 10.
212  
213 + Unlike the bioseg type, the start and/or end of a bioseg0 can be negative, with
214 + the expected results.
215 +  e.g.  bioseg0_size('-10..10'::bioseg0) == 20
216 +
217   See:
218   http://www.gmod.org/wiki/index.php/Introduction_to_Chado#Interbase_Coordinates
219   for a longer discussion of the differences between the coordinate systems.
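The differences described in the notes above come down to whether the end coordinate is included. A minimal Python sketch, assuming intervals are plain (start, end) tuples and using hypothetical function names, reproduces the examples given; it is a model of the arithmetic, not the extension's implementation, and it does not try to define behaviour for zero-length interbase intervals.

```python
def length_1based(seg):
    """bioseg: both ends are included."""
    start, end = seg
    return end - start + 1


def length_interbase(seg):
    """bioseg0: coordinates fall between bases, so the length is end - start."""
    start, end = seg
    return end - start


def overlaps_1based(x, y):
    """Inclusive intervals overlap if they share at least one position."""
    return x[0] <= y[1] and y[0] <= x[1]


def overlaps_interbase(x, y):
    """Half-open intervals that merely touch at a boundary do not overlap."""
    return x[0] < y[1] and y[0] < x[1]


if __name__ == "__main__":
    a, b = (1, 10), (10, 20)
    print(overlaps_1based(a, b), overlaps_interbase(a, b))
    print(length_1based(a), length_interbase(a))
    print(length_interbase((-10, 10)))
```

The only change between the two systems is `<=` versus `<` in the overlap test and the `+ 1` in the length.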
# Line 192 | Line 232
232  
233   Note from the author: Most of the code and all of the hard work needed to
234   implement BIOSEG was by Gene Selkov, Jr, author of the SEG type (contrib/seg
235 < in the PostgreSQL source).  All bugs are due to me.
235 > in the PostgreSQL source).  All bugs are due to me (kmr).
236 >
237 >
238 > THANKS
239 > ======
240 >
241 > Thanks to bioinformatics.org for hosting the project.
242  
243  
244   AUTHOR
245   ======
246  
247   Kim Rutherford <kmr@flymine.org>
248 +
249 + SEG code by Gene Selkov, Jr.
250 +

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines