Frequently asked questions about NeedleHaystack

by Dr. Andreas Hoppe

1. I would like to assign atoms of a certain type only to atoms of the same type, e.g. carbons should only be assigned to carbons, or hydrogens should not be assigned to heavy atoms.

By default, an atom can be assigned to any other atom. However, NeedleHaystack has an option that penalizes assignments of unlike atoms according to a given pattern. The concept is extremely flexible: any atom specific to an amino acid can be given an individual score for any other atom/amino acid combination. If such a resolution is not required, wildcards are also allowed. The option is called -A <filename>; the file holds the penalty matrix. A specific help text can be found with haystack -hh. Here are some exemplary penalty matrices:

2. Could you give a computation protocol like the one used in your paper?

The computations in the paper were performed with hierarchical makefiles, which would be too complicated to present here. A basic, yet flexible and complete, computation protocol is as follows:

  1. Put all Models (small atom sets) in the directory models/ and all Targets in the directory targets/.
  2. Try options on a small set of runs.
  3. Save the parameters into a file with haystack ... <options> ... -hm > Params.ini
The code for running every model against every target is as follows:

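mkdir -p log
for m in models/*; do
  mb=$(basename "$m")       # model name without the models/ prefix
  mkdir -p "results/$mb"    # one output directory per model (the name results/ is an assumption)
  for t in targets/*; do
    tb=$(basename "$t")     # target name without the targets/ prefix
    haystack -I Params.ini "$m" "$t" -su "results/$mb/$tb" \
      > "log/$mb-$tb.out" 2> "log/$mb-$tb.err"
  done
done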
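
After the loop has finished, it can help to see which runs wrote anything to their error log. The following is a minimal sketch, not part of NeedleHaystack: the helper name failed_runs is hypothetical, and it assumes that a non-empty .err file points to a problem (harmless warnings on standard error would be listed as well).

```shell
#!/bin/sh
# failed_runs DIR
# Print the path of every non-empty *.err file below DIR, sorted by name.
# Empty error logs (runs that wrote nothing to standard error) are skipped.
failed_runs() {
  find "$1" -type f -name '*.err' -size +0 | sort
}

# Example call on the log directory of the protocol; skipped if it does not exist.
if [ -d log ]; then
  failed_runs log
fi
```

The listed files can then be inspected one by one, and the corresponding model/target pairs rerun.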

3. In different runs on the same example files the results are slightly different. Why are the results not reproducible?

NeedleHaystack deliberately uses stochastic elements. They are set at compile time and controlled by compiler options.

The options are called NOT_TARGET_RANDOMIZE and NOT_MODEL_RANDOMIZE. Randomized consideration of the atoms in both sets increases the stability of the algorithm. For reference, I supply versions in which randomization is switched off:
haystack_nocona_norandom
haystack_pentiumpro_norandom
haystack_athlonxp_norandom
haystack_pentium4_norandom

By default, NeedleHaystack uses true random numbers, generated from the number of milliseconds in the current time. This is controlled by another compiler option, REALRANDOM. When it is switched off, pseudo-random numbers are used instead, and the results of NeedleHaystack are very likely to be reproducible on the same machine. Executables with this option switched off can be found here:
haystack_nocona_norealrandom
haystack_pentiumpro_norealrandom
haystack_athlonxp_norealrandom
haystack_pentium4_norealrandom
Keep in mind that the time-out control makes the results non-deterministic. For complete reproducibility, Timeout (or -to) must be switched off.
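
Whether a given executable and parameter set behaves reproducibly can be checked by running the same job twice and comparing the output. The following is a minimal sketch, not part of NeedleHaystack: the helper name runs_match and the input files models/m1 and targets/t1 are hypothetical, and it assumes that the results appear on standard output and that this output contains no time stamps or timings.

```shell
#!/bin/sh
# runs_match CMD [ARGS...]
# Run the same command twice and compare what it writes to standard output.
# Returns 0 if both runs printed identical output, non-zero otherwise.
runs_match() {
  first=$(mktemp) || return 2
  second=$(mktemp) || { rm -f "$first"; return 2; }
  "$@" > "$first" 2> /dev/null
  "$@" > "$second" 2> /dev/null
  cmp -s "$first" "$second"
  rc=$?
  rm -f "$first" "$second"
  return "$rc"
}

# Example call (hypothetical input files); skipped if the executable is not installed.
if command -v haystack_athlonxp_norealrandom > /dev/null 2>&1; then
  if runs_match haystack_athlonxp_norealrandom -I Params.ini models/m1 targets/t1; then
    echo "identical output in both runs"
  else
    echo "output differs between runs"
  fi
fi
```

Two identical runs do not prove reproducibility in general, but a difference shows immediately that some source of randomness or the time-out is still active.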

4. How can I check the compiler options of a given executable?

Enter:
haystack -v
which results in:
haystack version 3.3.21
  (unrestricted length/4 word floats/4 word precision floats)
  no checks done, no warnings done
  Compiler: Reading specs from /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.6/specs Configured with: /var/tmp/portage/gcc-3.3.6/work/gcc-3.3.6/configure --prefix=/usr --bindir=/usr/i686-pc-linux-gnu/gcc-bin/3.3.6 --includedir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.6/include --datadir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3.6 --mandir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3.6/man --infodir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3.6/info --with-gxx-include-dir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.6/include/g++-v3 --host=i686-pc-linux-gnu --build=i686-pc-linux-gnu --disable-altivec --enable-nls --without-included-gettext --with-system-zlib --disable-checking --disable-werror --disable-libunwind-exceptions --disable-multilib --enable-java-awt=gtk --enable-languages=c,c++,java,objc,f77 --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu Thread model: posix gcc version 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)
  Compiler flags:   -DDOUBLEFLOAT -DLARGEINDECES -DNOMALLINFO   -DMAININFO -DLINUX -DREALRANDOM -DPSF_IMPORTANCE -DPSF_ORIENT -DPSF_ATOMMATCH -DPSF_MEANS -Wall -Wno-uninitialized -D_GNU_SOURCE -pipe  -mcpu=athlon-xp -march=athlon-xp    -static  -O3 -ffast-math 
  Inlining: yes
  Compiled: Thu Oct 26 17:06:41 CEST 2006

In the "Compiler flags:" line, the compiler options are prefixed with "-D"; e.g. "-DREALRANDOM" means that the compiler option "REALRANDOM" is switched on.
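
To check for a single option without reading the whole line, the -D entries can be filtered out of the version output. The following is a minimal sketch: the helper name list_defines is hypothetical, and the redirection 2>&1 is only a precaution, since it is an assumption which output stream haystack -v writes to.

```shell
#!/bin/sh
# list_defines
# Read the output of "haystack -v" on standard input and print every option
# from the "Compiler flags:" line that starts with -D, one per line,
# with the -D prefix removed.
list_defines() {
  grep 'Compiler flags:' | tr -s ' ' '\n' | sed -n 's/^-D//p'
}

# Example: report whether REALRANDOM is switched on; skipped if haystack is not installed.
if command -v haystack > /dev/null 2>&1; then
  if haystack -v 2>&1 | list_defines | grep -qx 'REALRANDOM'; then
    echo "REALRANDOM is switched on"
  else
    echo "REALRANDOM is switched off"
  fi
fi
```

Note that the list also contains defines that are not NeedleHaystack options, such as _GNU_SOURCE in the output shown above.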

last update: 13.5.2008