[Pipet Devel] databases

Thu Dec 2 14:45:44 EST 1999

Harry Mangalam wrote:

<snip>

> 
> How are you currently representing this problem at the Field Museum?
> Especially the dynamic nature of the problem?
> 
Keep in mind that I'm not here working on databases... that I've only
become peripherally interested and involved because these are problems
that are not being addressed well by computing services.  Being at a
Museum has huge drawbacks - we don't have easy access to experts in the
fields who could help put together the theory and applications to solve
some of these informational issues.

That said, our databases are separate - a big problem in and of itself.

Our databases are relational databases which clearly do not easily
address the problem of changing identifiers (or any other
characteristic), especially when temp workers are hired to enter data
and make on-the-fly corrections without consulting the curator.  For
example, a pot that was recorded in field notes as being collected in
Rhodesia is entered into the database as being collected from the modern
political equivalent.  That is a bad move, because the listing of the
location as Rhodesia is an important time stamp as to when the pot was
collected.  Time stamps on
databases are clearly of utmost importance.  I don't know how these
relational databases are currently keeping track of species name (or
political country) changes; I would guess that some kind of "memo" field
might be in use.
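
To make the Rhodesia example concrete, here is a minimal sketch of one
way a relational table could keep every recorded value with a time
stamp instead of overwriting it.  This is not how our databases work;
the table name, column names, specimen id, dates, and the choice of
"Zimbabwe" as the modern equivalent are all hypothetical, picked only
for illustration.

```python
import sqlite3

# Hypothetical schema: one row per recorded value, never updated in place.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE locality_history (
        specimen_id TEXT NOT NULL,
        value       TEXT NOT NULL,  -- locality exactly as recorded
        source      TEXT NOT NULL,  -- who or what supplied this value
        recorded_on TEXT NOT NULL   -- ISO date, so text order is date order
    )
    """
)


def record(conn, specimen_id, value, source, recorded_on):
    """Append a new value; earlier rows are left untouched."""
    conn.execute(
        "INSERT INTO locality_history VALUES (?, ?, ?, ?)",
        (specimen_id, value, source, recorded_on),
    )


def history(conn, specimen_id):
    """Return all recorded values for a specimen, oldest first."""
    cur = conn.execute(
        "SELECT value, source, recorded_on FROM locality_history "
        "WHERE specimen_id = ? ORDER BY recorded_on, rowid",
        (specimen_id,),
    )
    return cur.fetchall()


def verbatim(conn, specimen_id):
    """The earliest value, i.e. what the field notes said."""
    return history(conn, specimen_id)[0][0]


def current(conn, specimen_id):
    """The most recently recorded value."""
    return history(conn, specimen_id)[-1][0]


# Illustrative data only; the id and dates are made up.
record(conn, "pot-001", "Rhodesia", "field notes", "1970-06-01")
record(conn, "pot-001", "Zimbabwe", "data entry correction", "1999-11-15")

print(verbatim(conn, "pot-001"))
print(current(conn, "pot-001"))
```

With this layout a "correction" adds a row rather than destroying the
original, so the curator can still see what the field notes said and
when each change was made.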

Currently, anybody conducting a historical biodiversity survey of our
collections ("What organisms are in your collection from the Pacific
Northwest?") has to consult over half a dozen different databases, all
relational, but using different products.  Most have limited
web-accessibility.

On the molecular end, we've got individuals working on particular genes
for different groups of species.  They do their alignment (by eye only -
*cringe*), then plunk the data into NEXUS format for use in PAUP*.  So
they have bunches of different text files floating around their hard
drives.  A really useful thing would be a database of aligned genes for
the different groups (e.g., the ribosomal database project)... but how
would one keep the alignment up-to-date?  What would be the best
underlying structure for such data?
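
I don't have an answer, but one possibility, sketched here only as an
assumption, is to store each raw (unaligned) sequence once and keep the
gap positions of each alignment separately, so that re-aligning creates
a new set of gap positions without touching the sequence data.  The
function names and the example sequence are hypothetical.

```python
def split_gaps(aligned, gap="-"):
    """Split an aligned row into its raw sequence and its gap columns."""
    raw = aligned.replace(gap, "")
    gaps = [i for i, ch in enumerate(aligned) if ch == gap]
    return raw, gaps


def apply_gaps(raw, gaps, gap="-"):
    """Rebuild the aligned row from a raw sequence and gap columns."""
    gap_set = set(gaps)
    residues = iter(raw)
    total = len(raw) + len(gaps)
    # Walk every alignment column: emit a gap or the next residue.
    return "".join(gap if i in gap_set else next(residues) for i in range(total))


# Made-up example row; not real data.
raw, gaps = split_gaps("AC--GT-A")
print(raw, gaps)
print(apply_gaps(raw, gaps))
```

Whether this is actually a good structure for a shared alignment
database is exactly the open question; it only shows that the sequence
and the alignment can be versioned independently.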

Lots of problems, no clear solutions...

-jennifer  

--------------------------
J. Steinbachs, PhD
Computational Biologist
Dept of Botany
The Field Museum
Chicago, IL 60605-2496

office: 312-665-7810
fax: 312-665-7158
--------------------------