README for prot4EST version 2.1.
File last updated 20/12/04

This file is meant as a very brief reference for the program.
For more detail please consult the user guide.

prot4EST version 2.0

author: James Wasmuth
address: Institute of Evolutionary Biology,
	 School of Biological Sciences,
	 University of Edinburgh,
	 UK.

james.wasmuth at ed.ac.uk

CONTENTS
1. Introduction
2. Getting Started
3. Output
4. References

For information regarding installation of prot4EST, its external dependencies or 
setting of environmental variables please consult the INSTALL file.


1. Introduction

prot4EST is a computer program that takes expressed sequence tags (ESTs) and translates them 
to produce putative peptides.  In essence a pipeline, scripted in Perl, prot4EST integrates a 
number of programs that are exploited to overcome problems inherent in translating ESTs.  
All of these external components are freely available to academic researchers.


A number of publicly available programs exist for translating EST sequences into putative peptides.  
These range from similarity-based methods such as FASTx [Pearson & Lipman 1998] and BLASTx [Altschul et al. 1997] 
to approaches based upon building models of prior sequence data.  This second set includes ESTScan [Lottaz et 
al. 2003], which uses a hidden Markov model (HMM), and DIANA-EST [Hatzigeorgiou et al. 2001], which exploits an 
artificial neural network (ANN).  Finally, the only method that considers the quality file associated with a newly 
determined sequence is DECODER [Fukunishi & Hayashizaki 2001]; although primarily designed to translate cDNA 
sequences, it can also be used for EST sequences.  

A review of these techniques will soon be available on my homepage [http://www.nematodes.org/~jamesw].  
It is accepted that the final three methods (ESTScan, DIANA-EST and DECODER) hold significant advantages over 
similarity-based approaches and show better performance in their detection and correction of frame shifts.  
However, all three depend upon data submitted to the major sequence repositories, such as EMBL 
[http://www.ebi.ac.uk].  For model organisms there is more than enough data; however, the majority of EST 
projects are from species for which there is little or no submitted sequence.  Training the HMM or ANN with so 
little data produces a model whose predictions may not stand up to robust scrutiny.
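
To illustrate why frame shifts matter, here is a minimal Python sketch.  It is not part of prot4EST (which is 
scripted in Perl) and does not reproduce any of the methods above; it only shows how a single inserted base, 
a common EST sequencing error, corrupts a naive translation downstream of the error.  The function name and the 
example sequence are made up for illustration, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Naively translate a nucleotide string in one forward frame (0, 1 or 2).

    Codons containing anything other than A, C, G or T become 'X'.
    """
    seq = seq.upper()
    peptide = []
    for i in range(frame, len(seq) - 2, 3):
        peptide.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(peptide)


# Hypothetical error-free fragment and the same fragment with one extra 'C'.
clean = "ATGGCCAAAGGTTTCGAA"
shifted = clean[:6] + "C" + clean[6:]

print(translate(clean))        # MAKGFE
print(translate(shifted))      # MAQRFR  (wrong after the insertion)
print(translate(shifted, 1))   # WPKGFE  (correct tail reappears in frame 1)
```

The correct peptide is split across two frames of the erroneous read, which is the kind of problem that the 
frame shift detection discussed above is meant to address.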

The driving force behind the creation of prot4EST was to integrate some of these programs and error check their 
output to provide the user with robust putative peptides.  The program connects to all the online databases 
required while also allowing the user to exploit their own training sets.


2. Getting Started

i. Installation
Please refer to either the user guide or INSTALL file.  Please take some time to ensure that this is done 
correctly.

ii. Running prot4EST.
Once prot4EST is installed, simply typing prot4EST should start the program and display the welcome screen.  Any 
warning messages most likely point to missing external programs.

If you have not created a configuration file then you can do so from this point.
The configuration file is the primary point of contact between the user and the program.
For a detailed description please refer to the user guide.

iii. The Program in action
After the configuration file has been loaded, all the options are checked.  If they are satisfactory then the 
program runs.  The user is led through any remaining choices by the program.


3. Output
The output directory (its name is user specified) contains a number of files.  The four most important are:

translations_xtn.fsa
translations_noxtn.fsa
prot_main.psql
prot_HSP.psql

Details of each of these files can be found in the user guide.
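
As a starting point for working with the translations, the following is a minimal Python sketch for reading 
a FASTA-formatted file.  It assumes that the two .fsa files are plain FASTA (a '>' header line followed by 
sequence lines); please check the user guide for the actual layout.  The function name and the example records 
are hypothetical and are not part of prot4EST.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up records standing in for the contents of an output file.
example = io.StringIO(">pep_1\nMAKGFE\nLLV\n>pep_2\nWPKGFE\n")
for name, peptide in read_fasta(example):
    print(name, len(peptide))
```

To read a real file, pass an open file handle instead of the StringIO object, for example 
open("translations_xtn.fsa") from within the output directory.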

4. References
Finally...
My apologies for the brevity of this file, but everything is in the published material and in the User Guide.  
The User Guide is soon to be updated, although the one included here is still relevant.

Please cite prot4EST as follows:
Wasmuth JD, Blaxter ML.  
prot4EST: Translating Expressed Sequence Tags from neglected genomes.
BMC Bioinformatics. 2004 Nov 30;5(1):187

The PartiGene paper is also available:
Parkinson J, Anthony A, Wasmuth J, Schmid R, Hedley A, Blaxter M.  
PartiGene--constructing partial genomes.
Bioinformatics. 2004 Jun 12;20(9):1398-404.

If you have any suggestions then please contact me at james.wasmuth at ed.ac.uk

You can subscribe to our low-traffic mailing list at www.nematodes.org/~PartiGene
