[BiO BB] Final Call: Data Integration in the Life Sciences, DILS04

Dan Bolser dmb at mrc-dunn.cam.ac.uk
Mon Dec 1 06:30:41 EST 2003

Dear Sir/Madam,

I am having trouble meeting the deadline, but I believe my paper is
very relevant to the conference. Would it still be possible to send my
paper in the next couple of days?

Below I have reproduced the abstract.

I would be very grateful if you could still consider the paper when I
submit it in a couple of days.

Yours sincerely,
Dan Bolser.

A major challenge in the post-genomic era is the integration, classification and
dissemination of diverse sources of biological data. Data integration is
attractive for several reasons: providing context for biological analysis, allowing
different types of data to be cross-correlated, and providing a validation framework
for different sources of experimentally and computationally derived data.
Centralised data repositories have proven very effective at organising specific
kinds of data, both helping to coordinate the research community and providing
standards for the categorisation and distribution of the accumulated knowledge.
However, the underlying conceptual complexity of data in the biological domain, as
well as the continual development of new concepts, makes the task of producing a
'universal' data repository much more challenging. As a consequence many integrated
databases have become arbitrarily complex, with no intrinsic classification value.

Here, we emphasise a point which is not widely acknowledged by the database
community, that data modelling in the scientific domain is equivalent to the
scientific process itself. To this end we propose a 'distributed data model'
framework for data integration in the scientific domain. Different sources of data
to be integrated will naturally have conceptual overlap (if they are to be
integrated at all), but may have very complex conceptual and/or algorithmic
associations. In this framework the burden of integration is put back on the domain
expert (to implement or devise specific integration strategies), but the underlying
issues of data access and subsequent distribution are made transparent via the data
model framework. The advantage of integrating and distributing data within this
framework is that new data and new concepts (produced as the result of integrative
analysis for example) are naturally accommodated by specific extensions to the data
models of the underlying data.

Here we develop components of a high-level data model relating to the principal axes
of an integrated protein classification database (called the protein periodic
table). Additionally, we have developed conceptually clear ORM-style models for
integrated protein interaction data and metabolic pathway data (suitable for
metabolic reconstruction). The subsequent analysis of these models highlights
several key areas for model development, and thus highlights areas for scientific
research into specific integration and classification techniques.

<quote who="DILS04">
> Int. Workshop on Data Integration in the Life Sciences (DILS 2004)
> Deadline: Nov. 30, 2003
> Industrial exhibits welcome
> Proceedings will be published in Springer LNCS
> http://izbi.uni-leipzig.de/dils04
> Workshop date: March 25-26, 2004, Univ. of Leipzig, Germany
> -------------------------------------------------------
> New advances in life sciences, e.g. molecular biology,
> biodiversity, drug discovery and medical research,
> increasingly depend on bioinformatics methods to manage
> and analyze vast amounts of highly diverse data.
> The volume of data is increasing at an unprecedented pace,
> fueled by world-wide research activities producing publicly
> available data, and new technologies, e.g. high-throughput
> devices such as microarrays. Thus, data mining and analysis
> require comprehensive integration of heterogeneous data,
> that is typically distributed across many data sources
> on the web and often structured only to a limited extent.
> Despite new interoperability technologies such as XML and
> web services, data integration is a highly difficult and
> still largely manual task, especially due to the high
> degree of semantic heterogeneity and varying data quality
> as well as specific application requirements.
> DILS 2004 aims at providing a new forum for
> presenting novel research results and assessing
> the state of the art in the field of data integration
> for life sciences. It addresses researchers,
> professionals, and industrial practitioners to
> share their knowledge on this highly important
> bioinformatics subject.
> In an effort to bring together academics and industrial
> practitioners, we solicit both research papers and
> application / experience papers. It is planned to
> publish accepted papers by Springer-Verlag in the
> Lecture Notes in Computer Science (LNCS) series.
> Topics of interest include, but are not limited to:
> * Challenges for data integration in life sciences
> * Architectures for data integration in life sciences
> * Ontology-based data integration and analysis
> * Metadata and annotation management
> * Data quality and data cleaning
> * Tool integration and experimental workflows
> * Evaluation of data integration approaches in life sciences
> * Data integration for specific applications
> * Prototypes and commercial solutions
> Authors are invited to submit original, previously
> unpublished papers. All submitted papers will be
> peer-refereed for quality, correctness, originality
> and relevance. Accepted papers will be published in
> the workshop proceedings which will be available
> at the workshop.
> Submissions must not exceed 15 pages and should be
> formatted according to the LNCS guidelines under:
>  http://www.springer.de/comp/lncs/authors.html
> All submissions will be handled electronically.
> Please send your submissions in PDF format to
>    dils04 at izbi.uni-leipzig.de
> Please also visit the workshop website for further
> information http://izbi.uni-leipzig.de/dils04
>    Paper submissions:     November 30, 2003
>    Author notification:   January   8, 2004
>    Camera-ready due:      January  28, 2004
>    Workshop date:         March 25-26, 2004
> Erhard Rahm, Univ. of Leipzig
> Howard Bilofsky, GlaxoSmithKline, USA
> Terence Critchlow, Lawrence Livermore National Laboratory, USA
> Peter Gray, Univ. of Aberdeen, UK
> Barbara Heller, Univ. of Leipzig, Germany
> Ralf Hofestaedt, Univ. of Bielefeld, Germany
> Jessie Kennedy, Napier Univ. Edinburgh, UK
> Ulf Leser, HU Berlin, Germany
> Bertram Ludäscher, San Diego Supercomputer Center, USA
> Sergey Melnik, Microsoft Research, USA
> Peter Mork, Univ. of Washington, Seattle, USA
> Felix Naumann, HU Berlin, Germany
> Frank Olken, Lawrence Berkeley National Laboratory, USA
> Norman Paton, Univ. of Manchester, UK
> Erhard Rahm, Univ. of Leipzig, Germany
> Louiqa Raschid, Univ. of Maryland, USA
> Kai-Uwe Sattler, TU Ilmenau, Germany
> Steffen Schulze-Kremer, FU Berlin, Germany
> Robert Stevens, Univ. of Manchester, UK
> Sharon Wang, IBM Life Sciences, USA
> Limsoon Wong, Institute for Infocomm Research, Singapore
> Sponsors of the event can exhibit their software
> and other products.
> The workshop is organized by the Bioinformatics centre
> of the University of Leipzig (www.izbi.de).
> -------------------------------
> Prof. Dr. Erhard Rahm
> http://dbs.uni-leipzig.de
