The LECB 2-D PAGE Gel Images Data Sets

The LECB 2-D PAGE gel images database is available for public use. It contains data sets from four types of experiments, with over 300 GIF images and annotation and landmark data in HTML, tab-delimited, and XML formats. The gels can serve as sample data for several types of biological material, as test data for developing 2D gel analysis software, and as a basis for comparison with other similar samples. PAGE stands for polyacrylamide gel electrophoresis. The LECB was the U.S. National Cancer Institute's Laboratory of Experimental and Computational Biology; since this work was done, it has been reorganized as the CCR Nanobiology Program. The database is available at two Web sites (bioinformatics.org is the mirror site):

The database consists of four 2D gel image data sets previously analyzed with the GELLAB-II system (see also GELLAB-II history). These data comprise over 300 gel images (including some replicate samples and some replicate scans) and are summarized below. The experimental conditions for each data set are documented in the associated literature references. The four data sets are:

  1. Human leukemias (Eric Lester, Peter Lemkin)
  2. HL-60 cell lines (Eric Lester, Peter Lemkin)
  3. MOLT-4 cells (Eric Lester, Peter Lemkin)
  4. Fetal alcohol syndrome (FAS) - serum (James Myrick, Mary Robinson, Peter Lemkin)
The data sets are described in the papers associated with each data set. The case or control samples could be used for comparison with your 2D PAGE gel samples or as test data for 2D gel analysis software.

These gel data can be used with the Open2Dprot project software (open2dprot.sourceforge.net) or the Flicker gel comparison program (open2dprot.sourceforge.net/Flicker), as well as with other software that can display or analyze 2D gel images, such as ImageJ, Photoshop, GIMP, etc.

These data are released for public use under the conditions listed below (Section 3).

You may view the data directly from the Web site by clicking on hyperlinks in the following (accession and landmark) HTML files to pop up static images in your Web browser (see below). Alternatively, you could compare the images dynamically using the Flicker program on downloaded data sets.

1. Distribution documentation

The diagnosis, the experimental conditions, and the running of the gels are described in the referenced papers, when available. For each project, a list of gels summarizes these conditions and diagnoses in tab-delimited spreadsheets that can be read into various software packages. The accession data also include a computing window defining a valid region of interest in the gel [cwx1:cwx2, cwy1:cwy2]. The grayscale of the gel scans was calibrated with a neutral density step wedge. The list of optical density (OD) values and corresponding gray values is given in the accession data table for each gel. This lets you compute integrated density rather than integrated grayscale (the former being more accurate). All coordinates are given in a raster coordinate system with the upper left-hand corner at (0,0). For browsing the database, we also include tab-delimited data, XML data, and HTML web pages with links to the specific gel images. These are the accession.tbl, accession.xml and accession.html files.
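As a rough illustration of how the step wedge calibration and computing window might be used together, the following Python sketch converts gray values to OD by piecewise-linear interpolation and sums OD over a window. The function names, the (gray, OD) pair layout, the inclusive window bounds, and the choice of linear interpolation are all assumptions made for this example; they are not taken from the accession files or from GELLAB-II.

```python
from bisect import bisect_right


def gray_to_od(gray, wedge):
    """Map a gray value to optical density by piecewise-linear interpolation.

    `wedge` is a list of (gray_value, od_value) pairs (assumed layout).
    Values outside the calibrated range are clamped to the nearest step.
    """
    pts = sorted(wedge)
    grays = [g for g, _ in pts]
    if gray <= grays[0]:
        return pts[0][1]
    if gray >= grays[-1]:
        return pts[-1][1]
    i = bisect_right(grays, gray)          # first calibration step above `gray`
    (g0, d0), (g1, d1) = pts[i - 1], pts[i]
    return d0 + (d1 - d0) * (gray - g0) / (g1 - g0)


def integrated_density(image, window, wedge):
    """Sum OD over the window (cwx1, cwx2, cwy1, cwy2), bounds assumed inclusive.

    `image` is a list of rows of gray values; (0, 0) is the upper-left pixel.
    """
    cwx1, cwx2, cwy1, cwy2 = window
    return sum(
        gray_to_od(image[y][x], wedge)
        for y in range(cwy1, cwy2 + 1)
        for x in range(cwx1, cwx2 + 1)
    )
```

Sorting the wedge by gray value means the sketch does not depend on whether darker pixels have higher or lower gray values in a given scan.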

Manually landmarked gel data are also made available. Landmarks are useful in aligning gels in spot pairing software. These are the landmark.tbl, landmark.xml, and landmark.html files. These are vertically stacked spreadsheets. A landmark is a spot position in a reference gel that corresponds to the putatively same spot in another gel, typically matched by flickering the gels to find corresponding local regions. By pairing N-1 gels to a reference gel in an N-gel database, it is possible to pair spots back to the reference gel and to build corresponding spot expression lists for analysis. In this data, there are instances where a particular landmark spot is missing from one of the non-landmark gels. This is to be expected in real-world data. In those cases, the position was estimated by visually aligning neighboring spots in a local region around the landmark in the reference gel with the local region in the other gel. The landmark coordinates should be on the spot, since it is assumed that spot matching software will latch onto the actual spot centroid from the manually specified coordinates.
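To make the idea of latching onto a spot centroid concrete, here is a minimal Python sketch that refines a manually specified landmark to the density-weighted centroid of a small window around it. This is only an illustrative stand-in: the function name, the window radius, and the weighting scheme are assumptions, and the spot matching software mentioned above may well use a different method.

```python
def refine_landmark(density, x, y, radius=3):
    """Return the density-weighted centroid of the window centred on (x, y).

    `density` is a list of rows (row index = y, column index = x), with
    (0, 0) at the upper-left corner. The window is clipped at the image
    edges. Falls back to (x, y) if the window holds no density at all.
    """
    height, width = len(density), len(density[0])
    total = sum_x = sum_y = 0.0
    for yy in range(max(0, y - radius), min(height, y + radius + 1)):
        for xx in range(max(0, x - radius), min(width, x + radius + 1)):
            w = density[yy][xx]
            total += w
            sum_x += w * xx
            sum_y += w * yy
    if total == 0:
        return float(x), float(y)
    return sum_x / total, sum_y / total
```

A landmark placed one pixel off a compact spot would be pulled onto it, which is why the manual coordinates only need to be on or near the spot.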

To simplify reading all of the data contained in accession.tbl and landmark.tbl, we also generated publish.tbl and publish.xml files, each of which is the relational join of the accession and landmark data. They are intended primarily for programs (such as R) that read all of the data about a data set for further analysis.
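For readers who prefer to build the join themselves, the following Python sketch shows what a relational join of two tab-delimited tables looks like. The column names (gel, diagnosis, landmark, x, y) and the sample values are made up for the demonstration; the real key column must be taken from the headers of accession.tbl and landmark.tbl.

```python
import csv
import io


def read_tbl(handle):
    """Read a tab-delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def join_tables(accession_rows, landmark_rows, key):
    """Inner-join landmark rows to accession rows on the shared `key` column."""
    by_key = {row[key]: row for row in accession_rows}
    joined = []
    for lm in landmark_rows:
        acc = by_key.get(lm[key])
        if acc is not None:            # skip landmarks with no accession entry
            joined.append({**acc, **lm})
    return joined


# Tiny in-memory demo with hypothetical column names and values.
demo_acc = read_tbl(io.StringIO("gel\tdiagnosis\nA1\tAML\nA2\tCLL\n"))
demo_lm = read_tbl(
    io.StringIO("gel\tlandmark\tx\ty\nA1\tL1\t10\t20\nA3\tL1\t5\t5\n")
)
demo_rows = join_tables(demo_acc, demo_lm, key="gel")
```

With real files, the same functions can be called on handles opened with `open(path, newline="")`.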

Finally, the complete set of files, including the GIF images, is packaged in a project.tar.gz file for each project and may be unpacked with WinZip on Windows PCs, gunzip and tar on Unix, etc.
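The archive can also be unpacked from a script. The sketch below uses Python's standard tarfile module; the function name and the destination directory are placeholders, and the internal layout of the archive is not assumed beyond it containing .gif files somewhere.

```python
import tarfile
from pathlib import Path


def unpack_project(archive, dest):
    """Extract a .tar.gz archive into `dest` and return the GIF paths found."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    # Search recursively, since images may sit in a subdirectory.
    return sorted(dest.rglob("*.gif"))
```

For example, `unpack_project("project.tar.gz", "gels")` would extract everything under a gels directory and list the images it found.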

2. List of 2D gel data sets

Only a few references are listed here; for additional ones, see the list of GELLAB References.

    Human leukemias (AML, ALL, CLL, HCL and other) (Lester, Lipkin, Lemkin). 170 gels [512x512 pixels, 8-bit, 250 microns/pixel, GIF]

    References:

    1. Lester EP, Lemkin PF, Lipkin LE. Protein indexing in leukemias and lymphomas. Ann N Y Acad Sci. 1984;428:158-72.
    2. Lester EP, Lemkin P, Lipkin L. A two-dimensional gel analysis of autologous T and B lymphoblastoid cell lines. Clinical Chemistry 1982 Apr;28(4 Pt 2):828-39.


    HL-60 cell line (Lester, Lipkin, Lemkin). 111 gels [512x512 pixels, 8-bit, 250 microns/pixel, GIF].

    References:

    1. Lester EP, Lemkin P, Lipkin L, Cooper HL. A two-dimensional electrophoretic analysis of protein synthesis in resting and growing lymphocytes in vitro. J Immunol. 1981 Apr;126(4):1428-34.
    2. Lester EP, Lemkin P, Lipkin L, Cooper HL. Computer-assisted analysis of two-dimensional electrophoreses of human lymphoid cells. Clinical Chemistry 1980 Sep;26(10):1392-402.


    MOLT-4 cell line (Lester, Lipkin, Lemkin). 4 gels [512x512 pixels, 8-bit, 250 microns/pixel, GIF].


    Fetal Alcohol Syndrome serum biomarkers case-control study (Robinson, Myrick and Lemkin). 53 gels [512x512 pixels, 8-bit, 340 microns/pixel, GIF]

    References:

    1. Robinson MK, Myrick JE, Henderson LO, Coles CD, Powell MK, Orr GA, Lemkin PF. Two-dimensional protein electrophoresis and multiple hypothesis testing to detect potential serum protein biomarkers in children with fetal alcohol syndrome. Electrophoresis 1995 Jul;16(7):1176-83.


3. Notice and Disclaimer for 2D Gel Data Sets

The data sets included herein are provided as a service to the research community under the following conditions by the National Cancer Institute (NCI), a member institute of the National Institutes of Health (NIH) and part of the United States Department of Health and Human Services.

  1. THE DATA SETS ARE BEING PROVIDED TO THE RESEARCH COMMUNITY 'AS IS' WITH NO WARRANTIES, EXPRESS OR IMPLIED, INCLUDING ANY WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

  2. THE DATA SETS SHALL NOT BE USED IN THE TREATMENT OR DIAGNOSIS OF HUMAN SUBJECTS.

  3. No indemnification for any loss, claim, damage or liability is intended or provided by the NCI. The NCI, as an agency of the United States Government, assumes liability only to the extent provided under the federal Tort Claims Act, 28 U.S.C. 2671 et seq.

  4. Users of the data sets agree not to claim, infer, or imply endorsement by the Government of the United States of America, the NIH, the NCI or any of its employees.

  5. Users shall not request or attempt to obtain in any manner or form any private patient information that may be associated with the data sets.

4. Comparing the images dynamically

It is possible to flicker-compare any two images using the Flicker program on downloaded data sets. You should:
  1. Download the Flicker program (and read the documentation).
  2. Download one of the above 2D gel datasets you are interested in. Then unzip the dataset file. There will be a directory ppx which contains the images.
  3. Copy that ppx directory into the Images directory where you have installed Flicker. You may want to rename the ppx directory if you plan on having several image subdirectories in the Images directory.
  4. If you have your own images to compare, you can add them to the ppx directory or you can copy your directory to the Images directory.
  5. When you start Flicker, your data will appear in the (File | Open user images | Pairs of images | ppx | ...) menu. This will list all combinations of images in your ppx directory. Then select a pair of images and it will load them into Flicker, at which point you can do the comparison.
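
To get a feel for how many image pairs such a menu could contain, the following Python sketch enumerates every unordered pair of GIF files in a directory. It does not use or reproduce Flicker itself; the function name and the assumption that pairs are unordered are illustrative only.

```python
from itertools import combinations
from pathlib import Path


def image_pairs(directory, pattern="*.gif"):
    """Return every unordered pair of image files in `directory`."""
    images = sorted(Path(directory).glob(pattern))
    return list(combinations(images, 2))
```

A directory of n images yields n(n-1)/2 pairs, so a directory holding 170 gels would give 14365 pairs. Splitting a large data set into several subdirectories keeps the list manageable.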

5. Links to Web sites that have used the data

The following are web sites that use or reference the data. If you have used the data, send us your links to add to this list.



$Date: 2006/06/30$ / P. Lemkin.
lemkin@ncifcrf.gov or lemkin@bioinformatics.org