
Home Page

Introduction

This is the main project page for NCBI2RDF. The NCBI2RDF tool is a Java-based API for enabling RDF-compliant access to the NCBI databases. It offers a programmatic interface for posing queries in SPARQL and receiving the results in SPARQL Results format. The API is straightforward to use, and its functionality can be easily understood by looking at the provided examples.

The tool is free to download from http://www.bioinformatics.org/ftp/pub/ncbi2rdf/

For any questions, contact Alberto Anguita at aanguita@infomed.dia.fi.upm.es

Downloads

The tool is freely available on the following FTP server:

http://www.bioinformatics.org/ftp/pub/ncbi2rdf/

The server hosts a README.TXT file that explains how to use the library, along with several examples, precompiled JARs, documentation, and the configuration files needed for installation.

Please read the README.TXT file on the server to learn the purpose of each available file, or read the next subsection.

Library instructions

Introduction

The NCBI2RDF tool is a Java-based API for enabling RDF-compliant access to the NCBI databases. It offers a programmatic interface for posing queries in SPARQL and receiving the results in SPARQL Results format. The API is straightforward to use, and its functionality can be easily understood by looking at the provided examples.

Tool installation

The API can be used in a standalone Java application. All its functionality is bundled in the JAR that can be downloaded at the following web page: http://www.bioinformatics.org/ftp/pub/ncbi2rdf/.

The tool installation includes the following files:

 - README.txt: the file containing these instructions
 - NCBI2RDF.jar: the Java library containing all the tool code (including third-party libraries)
 - JavaDoc.rar: the Javadoc documentation of the API
 - examples.rar: a set of three examples in Java
 - RDFSchema.rdf: the RDF schema that NCBI2RDF generates and that represents the available data in NCBI
 - ConfigFiles.rar: an archive containing the set of XML configuration files that NCBI2RDF needs in order to work correctly

To use the API in a Java project:

i) Download and decompress ConfigFiles.rar in the root directory of your Java project. This will create a directory called EutilsWrapper, containing three subdirectories with the XML configuration files. These files must be present in that location whenever the NCBI2RDF API is invoked.

ii) Download and import the NCBI2RDF jar library and use the public class es.upm.gib.eutilsrdfwrapper.Controller. This class offers a series of static methods for performing RDF-compliant queries over the NCBI databases, described below.

public static String launchQueryGetPath(String query);
- Performs a query and retrieves the results as a SPARQL Results file
- query: a SPARQL query
- returns the path to the generated SPARQL Results file. This file will contain as many results as indicated in the LIMIT element of the SPARQL query, or 100 if no limit was indicated in the query

public static Results launchQueryGetResults(String query);
- Performs a query and retrieves the results as a Results object which allows retrieving the results as an iterator
- query: a SPARQL query
- returns a Results object for reading the query results

public static String launchQueryGetPath(ConceptsQuery query);
- Performs a query and retrieves the results as a SPARQL Results file
- query: a ConceptsQuery object containing the query to perform
- returns the path to the generated SPARQL Results file. This file will contain 100 results

public static Results launchQueryGetResults(ConceptsQuery query);
- Performs a query and retrieves the results as a Results object which allows retrieving the results as an iterator
- query: a ConceptsQuery object containing the query to perform
- returns a Results object for reading the query results

As can be seen, the first method in the list takes a String parameter, which must be a SPARQL-compliant query. This query should conform to the provided RDF schema in order to generate results. This method generates a file in SPARQL Results format and returns its path.
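Once launchQueryGetPath has returned a path, the generated file can be read with any XML parser. The following is a minimal sketch, not part of NCBI2RDF, that uses the standard Java DOM API. It assumes the file follows the W3C SPARQL Query Results XML Format and its namespace; the class name SparqlResultsReader and the sample rows embedded in main are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of NCBI2RDF) for reading a SPARQL Results XML file.
class SparqlResultsReader {

    // Namespace of the W3C SPARQL Query Results XML Format (assumed to be used by the generated file).
    static final String NS = "http://www.w3.org/2005/sparql-results#";

    // Returns one map (variable name -> value as text) per <result> element.
    static List<Map<String, String>> readRows(InputStream in) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(in);

            List<Map<String, String>> rows = new ArrayList<>();
            NodeList results = doc.getElementsByTagNameNS(NS, "result");
            for (int i = 0; i < results.getLength(); i++) {
                Element result = (Element) results.item(i);
                Map<String, String> row = new LinkedHashMap<>();
                NodeList bindings = result.getElementsByTagNameNS(NS, "binding");
                for (int j = 0; j < bindings.getLength(); j++) {
                    Element binding = (Element) bindings.item(j);
                    // The text of the nested <literal> or <uri> element is the bound value.
                    row.put(binding.getAttribute("name"), binding.getTextContent().trim());
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable SPARQL Results XML document", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document, used when no file path is given on the command line.
        String sample =
            "<sparql xmlns=\"" + NS + "\">"
            + "<head><variable name=\"p1_uid\"/><variable name=\"p1_titl\"/></head>"
            + "<results>"
            + "<result><binding name=\"p1_uid\"><literal>1</literal></binding>"
            + "<binding name=\"p1_titl\"><literal>Sample title A</literal></binding></result>"
            + "<result><binding name=\"p1_uid\"><literal>2</literal></binding>"
            + "<binding name=\"p1_titl\"><literal>Sample title B</literal></binding></result>"
            + "</results></sparql>";

        try (InputStream in = args.length > 0
                ? Files.newInputStream(Paths.get(args[0]))
                : new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8))) {
            for (Map<String, String> row : readRows(in)) {
                System.out.println(row);
            }
        }
    }
}
```

To try it on real output, pass the path returned by launchQueryGetPath as the first command-line argument.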

The other methods offer different formats for specifying the query or obtaining the results. The ConceptsQuery class offers a programmatic way of defining queries to the system. The Results class offers a programmatic way to retrieve the results of a posed query (it offers the methods hasNext and nextRow to iterate through the query results).
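The hasNext/nextRow pair suggests the usual cursor-style loop. The sketch below shows only that loop shape. Since the return type of nextRow is not documented here, it uses a hypothetical generic stand-in interface (RowCursor) and an in-memory implementation instead of the real Results class; every name in it is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ResultsLoopSketch {

    // Hypothetical stand-in for the library's Results class. Only the two
    // method names come from the documentation; the row type R is left open.
    interface RowCursor<R> {
        boolean hasNext();
        R nextRow();
    }

    // Reads every remaining row from the cursor into a list.
    static <R> List<R> readAll(RowCursor<R> cursor) {
        List<R> rows = new ArrayList<>();
        while (cursor.hasNext()) {
            rows.add(cursor.nextRow());
        }
        return rows;
    }

    // In-memory cursor, present only so that the sketch runs without the library.
    static <R> RowCursor<R> fromList(List<R> source) {
        Iterator<R> it = source.iterator();
        return new RowCursor<R>() {
            public boolean hasNext() { return it.hasNext(); }
            public R nextRow() { return it.next(); }
        };
    }

    public static void main(String[] args) {
        RowCursor<String> cursor = fromList(Arrays.asList("row 1", "row 2", "row 3"));
        for (String row : readAll(cursor)) {
            System.out.println(row);
        }
    }
}
```

With the real library, the same while loop would presumably be written directly against the object returned by Controller.launchQueryGetResults; check the Javadoc for the actual row type.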

It is recommended to check the attached examples to see how the API is invoked with some sample queries.

EXAMPLE 1:
// the query to launch
// this query asks for publications in PubMed with the general search term "dietary probiotics"
// (note that SPARQL, like Java, escapes characters with a '\' character, so each '"' character
// inside the search term is escaped twice: once for SPARQL and once for Java).
// It extracts the UID, title and journal from the retrieved publications.
// No limit is specified, so a maximum of 100 results are retrieved
String query = "PREFIX eurdf: <http://RDFEutilsWrapper#>\n" +
"SELECT ?p1_uid ?p1_titl ?p1_jour\n" +
"WHERE {\n" +
" ?p1 a eurdf:pubmed.\n" +
" ?p1 eurdf:pubmed_ALL ?p1_all.\n" +
" ?p1 eurdf:pubmed_UID ?p1_uid.\n" +
" ?p1 eurdf:pubmed_TITL ?p1_titl.\n" +
" ?p1 eurdf:pubmed_JOUR ?p1_jour.\n" +
"\n" +
" FILTER (?p1_all = \"\\\"dietary probiotics\\\"\").\n" +
"}";

// NCBI2RDF is invoked
String resultPath = Controller.launchQueryGetPath(query);

// The results are generated in a file located in .\EutilsWrapper\Results\results_"currentdate".xml
System.out.println("Results are in " + resultPath);
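// The double escaping in the FILTER line above is easy to get wrong by hand.
// The helper below is a minimal sketch, not part of NCBI2RDF (the class name
// SparqlLiterals is made up), that builds the SPARQL string literal at run time.
// It only handles backslashes and double quotes, not newlines or other escapes.

```java
// Hypothetical helper (not part of NCBI2RDF) for quoting values in SPARQL queries.
class SparqlLiterals {

    // Turns a raw value into a double-quoted SPARQL string literal.
    // Backslashes are escaped first so that the backslashes added for
    // the double quotes are not escaped a second time.
    static String quote(String raw) {
        String escaped = raw.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // The search term itself contains the double quotes.
        String phrase = "\"dietary probiotics\"";
        String filter = " FILTER (?p1_all = " + quote(phrase) + ").\n";
        // Prints the same SPARQL text that the hand-escaped line in Example 1 produces.
        System.out.print(filter);
    }
}
```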

---------------------------------------------------

EXAMPLE 2:
// the query to launch
// this query retrieves publications in which "russ altman" is one of the authors, and which have related entries in the gene database. In each case, the publication uid and title, and the gene uid are retrieved
// the limit of retrieved results is set to 20
String query = "PREFIX eurdf: <http://RDFEutilsWrapper#>\n" +
"SELECT ?pubmed_uid ?pubmed_title ?gene_uid\n" +
"WHERE {\n" +
" ?pubmed a eurdf:pubmed.\n" +
" ?gene a eurdf:gene.\n" +
" ?pubmed eurdf:pubmed_UID ?pubmed_uid.\n" +
" ?pubmed eurdf:pubmed_TITL ?pubmed_title.\n" +
" ?pubmed eurdf:pubmed_AUTH ?pubmed_auth.\n" +
" ?pubmed eurdf:pubmed_gene ?gene.\n" +
" ?gene eurdf:gene_UID ?gene_uid.\n" +
"\n" +
" FILTER (?pubmed_auth = \"russ altman\").\n" +
"}\n" +
"LIMIT 20";

// NCBI2RDF is invoked
String resultPath = Controller.launchQueryGetPath(query);

// The results are generated in a file located in .\EutilsWrapper\Results\results_"currentdate".xml
System.out.println("Results are in " + resultPath);
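// As documented for launchQueryGetPath, the result file holds as many results as the
// LIMIT element indicates, or 100 if the query has none. The sketch below only mirrors
// that documented rule on the caller's side, for example to size a buffer. It is not the
// library's implementation, the class name QueryLimit is made up, and the simple pattern
// would be fooled by the text "LIMIT n" appearing inside a string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of NCBI2RDF).
class QueryLimit {

    // Default documented for queries without a LIMIT element.
    static final int DEFAULT_LIMIT = 100;

    // Matches the keyword LIMIT followed by an integer, ignoring case.
    private static final Pattern LIMIT =
        Pattern.compile("\\bLIMIT\\s+(\\d+)", Pattern.CASE_INSENSITIVE);

    // Returns the maximum number of rows the documentation says to expect.
    static int expectedMaxRows(String query) {
        Matcher m = LIMIT.matcher(query);
        return m.find() ? Integer.parseInt(m.group(1)) : DEFAULT_LIMIT;
    }

    public static void main(String[] args) {
        String withLimit = "SELECT ?x WHERE { ?x a ?y }\nLIMIT 20";
        String withoutLimit = "SELECT ?x WHERE { ?x a ?y }";
        System.out.println(expectedMaxRows(withLimit));
        System.out.println(expectedMaxRows(withoutLimit));
    }
}
```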

---------------------------------------------------

EXAMPLE 3:
// the query to launch
// this query retrieves the gene with uid 3992 (FADS1), and for that gene, related publications that refer to that gene in PubMed.
String query = "PREFIX eurdf: <http://RDFEutilsWrapper#>\n" +
"SELECT ?gene_uid ?pubmed2_title\n" +
"WHERE {\n" +
" ?gene a eurdf:gene.\n" +
" ?pubmed2 a eurdf:pubmed.\n" +
" ?gene eurdf:gene_UID ?gene_uid.\n" +
" ?gene eurdf:gene_pubmed ?pubmed2.\n" +
" ?pubmed2 eurdf:pubmed_TITL ?pubmed2_title.\n" +
" FILTER (?gene_uid = \"3992\").\n" +
"}";

// NCBI2RDF is invoked
String resultPath = Controller.launchQueryGetPath(query);

// The results are generated in a file located in .\EutilsWrapper\Results\results_"currentdate".xml
System.out.println("Results are in " + resultPath);

Contact

For any comments, questions or suggestions, please write an email to aanguita@infomed.dia.fi.upm.es