Overview


RECONSTRUCT is a command line program that reconstructs protein structures from contact maps using the distgeom program of the TINKER package.

Installation:
You need Java 1.6 or newer (available from http://java.sun.com).

  • Download the zip file and unpack it.

  • Get TINKER and PRM files from the TINKER website

    Linux TINKER executables are available at:
    http://dasher.wustl.edu/tinker/downloads/linux.tar.gz
    Force Field Parameter files are available at:
    http://dasher.wustl.edu/tinker/distribution/params


    Unfortunately, TINKER is written in FORTRAN with statically sized arrays, so all memory is reserved at program startup. If not enough memory is available, the program fails. The array sizes are controlled by constants in the sizes.i file (in the TINKER source distribution).

    The default binaries provided on the TINKER website are compiled with values that are too restrictive. In practice this means that the "distgeom" program will only run for very small proteins.

    Thus, to run the reconstruct program for reasonably sized proteins you need to download the TINKER source code from:
    http://dasher.wustl.edu/tinker/downloads/tinker-5.1.02.tar.gz
    then modify the sizes.i constants file and recompile. The parameters to modify are MAXATM, MAXGEO and MAXKEY. Values of MAXATM=100000, MAXGEO=10000 and MAXKEY=20000 should suffice. These values already require a memory allocation of >1GB.

    Compilation seems to work with the GNU Fortran compiler (e.g. the gfortran package in Ubuntu). The TINKER package provides basic compilation, library make and linking scripts in the linux/gfortran directory of the source code distribution. First run compile.make, then library.make and finally link.make. We had better success using static linking (use the -static flag in the gfortran commands in link.make). On 64-bit systems there seem to be more issues at linking time. On a 64-bit Ubuntu system we managed to compile and link successfully with MAXATM=50000, MAXGEO=5000 and MAXKEY=10000.

    If the programs cannot run because they cannot allocate enough memory, one workaround is to increase the size of your swap file. TINKER reserves a lot of memory at startup, but most of the time it does not actually use most of it.

  • Edit the file reconstruct.cfg and set the parameters TINKER_BIN_DIR and PRM_FILE. Only AMBER force field parameter files are supported as PRM_FILE.
    A per-user reconstruct.cfg file can be placed in the user's home directory.

  • Run it:
    ./reconstruct
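
For the sizes.i step above, the edit can be scripted rather than done by hand. The following is a minimal sketch, not part of RECONSTRUCT or TINKER. It assumes that the constants appear in sizes.i as Fortran statements of the form "parameter (maxatm=25000)"; check your copy of the file before relying on it. The function name patch_sizes and the sample text are made up for illustration.

```python
import re

# Values suggested in the installation notes; lower them if linking fails
# or if your machine cannot reserve that much memory.
NEW_SIZES = {"maxatm": 100000, "maxgeo": 10000, "maxkey": 20000}


def patch_sizes(text, new_sizes=NEW_SIZES):
    """Return the text of sizes.i with the given constants replaced.

    ASSUMPTION: each constant is set by a statement that looks like
    'parameter (maxatm=25000)' (case-insensitive).
    """
    missing = []
    for name, value in new_sizes.items():
        pattern = re.compile(
            r"(parameter\s*\(\s*%s\s*=\s*)\d+(\s*\))" % re.escape(name),
            re.IGNORECASE,
        )
        # \g<1> and \g<2> keep the text around the number unchanged.
        text, count = pattern.subn(r"\g<1>%d\g<2>" % value, text)
        if count == 0:
            missing.append(name)
    if missing:
        raise ValueError("constants not found: " + ", ".join(missing))
    return text


# Hypothetical excerpt, only to show the effect of the function.
sample = (
    "      parameter (maxatm=25000)\n"
    "      parameter (maxgeo=1000)\n"
    "      parameter (maxkey=5000)\n"
)
print(patch_sizes(sample))
```

To apply it to a real source tree, read sizes.i, pass its contents through patch_sizes, and write the result back after keeping a backup copy. The function raises an error instead of silently doing nothing if a constant is not found, which is the likely symptom of a sizes.i layout that differs from the assumption.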

Parallel version, Sun Grid Engine:

  • For the parallel version to work (option -A), the path to the SGE root directory must be set in the reconstruct shell script (sgeroot variable).

Contact Map Files:

  • The contact map files that the reconstruct program reads are simple text files with a few headers and 3 columns: 1st for i residue numbers, 2nd for j residue numbers and 3rd for weights (currently ignored).
    An example file is provided: sample.cm (Cbeta 8 Å cutoff contact map for PDB 1bxy, chain A). The format is the same as that used by our CMView contact map visualization program (see the CMView site).
    The headers are essential for the reconstruct program to work. The parameters of the contact map: SEQUENCE, CONTACT TYPE (CT) and CUTOFF are read from the headers.
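
To make the file layout concrete, here is a minimal parsing sketch. It is not the parser used by reconstruct or CMView. It assumes that header lines start with "#" and have the form "#KEY: value", that the three parameters are stored under the keys SEQUENCE, CT and CUTOFF, and that data columns are separated by whitespace. Compare with sample.cm before using it. The example content below is invented and does not come from sample.cm.

```python
def parse_contact_map(text):
    """Parse contact map text into (headers, contacts).

    ASSUMPTIONS: headers look like '#KEY: value'; data lines hold
    i, j and an optional weight separated by whitespace.
    """
    headers = {}
    contacts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            key, sep, value = line[1:].partition(":")
            if sep:
                headers[key.strip().upper()] = value.strip()
            continue
        fields = line.split()
        i, j = int(fields[0]), int(fields[1])
        # The weight column is read but, as noted above, currently ignored
        # by reconstruct itself.
        weight = float(fields[2]) if len(fields) > 2 else 1.0
        contacts.append((i, j, weight))
    # The three header parameters are required for a reconstruction.
    for required in ("SEQUENCE", "CT", "CUTOFF"):
        if required not in headers:
            raise ValueError("missing header: " + required)
    return headers, contacts


# Invented example content, for illustration only.
example_lines = [
    "#SEQUENCE: ACDEFGHIK",
    "#CT: Cb",
    "#CUTOFF: 8.0",
    "1\t4\t1.0",
    "2\t7\t1.0",
]
headers, contacts = parse_contact_map("\n".join(example_lines))
print(headers["CT"], headers["CUTOFF"], len(contacts))
```

A check like this can be useful before submitting many files to a long reconstruction run, since a file without the required headers cannot be reconstructed.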

Command line options:

This is the full list of available command line options. Run ./reconstruct -h to print it.

Reconstructs a protein structure from a contact map using TINKER's distgeom.

This program requires a local installation of the TINKER package. The bin directory and PRM file used can be specified in the reconstruct.cfg file in the current directory or in the user's home directory.

Two modes of operation:
a) normal : specify one or more contact map files. The sequence, contact type and cutoff will be taken from the file
b) benchmarking: specify a pdb code + pdb chain code or a pdb file (-p) and optionally contact type (-t) and cutoff (-d)
Usage:

reconstruct [options] [contact_map_file_1 [contact_map_file_2] [...]]

-p <string> : pdb code + pdb chain code, e.g. 1abcA or a pdb file (benchmarking). The PDB data will be downloaded from the PDB's ftp server. In mode a), i.e. when reconstructing from contact map files, the given pdb id/pdb file will be used for rmsd reporting
[-t <string>] : one or more contact types comma separated (benchmarking). Default: Cb
[-d <floats>] : one or more distance cutoffs comma separated (benchmarking), matching given contact types. If only one specified then it will be used for all contact types. Default: 8.0
[-i <intervals>] : use phi/psi restraints from given structure (needs -p). Specify a set of intervals from the given structure from which the phi/psi values will be taken, e.g.: 3-23,30-35,40-50
[-e] : restrain omega torsion angles to trans conformation
[-b <string>] : base name of output files. Default: rec or pdbId given in -p
[-o <dir>] : output dir. If option -A (parallel) is used then this directory MUST be a globally accessible one (all nodes in cluster must be able to read/write to it). Default: current
[-n <int>] : number of models to generate. Default: 1
[-m <int>] : filter contacts to min range. Default: no filtering
[-M <int>] : filter contacts to max range. Default: no filtering
[-f <float>] : force constant. Default: 100.0
[-F] : fast mode: refinement will be done via minimization (faster but worse quality model). Default: slow (refinement via simulated annealing)
[-A] : if specified reconstruction will be run in parallel using the Sun Grid Engine job scheduler (EXPERIMENTAL)
[-g] : debug mode, prints some debug info
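
The two modes of operation can be illustrated with a small helper that assembles the argument list from the options listed above. This is a sketch for scripting purposes, not part of RECONSTRUCT: the function name and its parameters are made up, only a subset of the options is covered, and nothing is executed.

```python
def build_reconstruct_args(cm_files=(), pdb=None, contact_types=None,
                           cutoffs=None, n_models=1, base_name=None,
                           out_dir=None, fast=False):
    """Return the argument list for ./reconstruct without running it."""
    if not cm_files and pdb is None:
        raise ValueError("give contact map files (normal mode) "
                         "or a pdb code/file (benchmarking mode)")
    # -d accepts either one cutoff for all contact types or one per type.
    if contact_types and cutoffs and len(cutoffs) not in (1, len(contact_types)):
        raise ValueError("number of cutoffs must be 1 or match contact types")
    args = ["./reconstruct"]
    if pdb is not None:
        args += ["-p", pdb]
    if contact_types:
        args += ["-t", ",".join(contact_types)]
    if cutoffs:
        args += ["-d", ",".join(str(c) for c in cutoffs)]
    if n_models != 1:
        args += ["-n", str(n_models)]
    if base_name:
        args += ["-b", base_name]
    if out_dir:
        args += ["-o", out_dir]
    if fast:
        args.append("-F")
    args += list(cm_files)
    return args


# a) normal mode: ten models from the provided example contact map
normal = build_reconstruct_args(cm_files=["sample.cm"], n_models=10)
# b) benchmarking mode: Cb contacts at an 8.0 cutoff, fast refinement
bench = build_reconstruct_args(pdb="1bxyA", contact_types=["Cb"],
                               cutoffs=[8.0], fast=True)
print(" ".join(normal))
print(" ".join(bench))
```

Once reconstruct is installed and configured, either list can be passed to subprocess.run(args, check=True) from the directory that contains the reconstruct script.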