The BIOJAVA interface in STRAP

BioJava is a set of modules and packages for biology, including sequence analysis, database access, and parsers for sequence files. Mark Schreiber maintains an excellent introduction, with many examples: BioJava Cookbook. An interface for BioJava is provided to allow authors of STRAP-plugins or -scripts to use the BioJava API. Conversely, BioJava projects can use the STRAP API.
Two classes are provided to convert objects between the two toolkits: one creates a GappedSequence object from a StrapProtein instance, and the other creates a StrapProtein object from a GappedSequence object. The position-specific sequence features contained in the objects are converted as well.

Testing the STRAP-BioJava-interface

Plugins for STRAP can be created, started, and modified at runtime. A few demo plugins are bundled with STRAP to illustrate how plugins are used. Once STRAP has started, there are several ways to load some protein files for testing. The Plugins menu in the toolbar contains the item Start standard plugin ....., where you can select the BioJava example.

Comparing BioJava and the STRAP-API

Similarities:

Both provide comprehensive collections of methods for protein sequences.
Both are used by Java programmers for coding bioinformatics algorithms.
Both separate implementations from definitions by using Java interfaces.
Both are open source projects.
Both can read and write many sequence file formats.

Differences between BioJava and STRAP:

BioJava is applicable to nucleotide and peptide sequences and can be applied to entire genomes. STRAP cannot cope with single sequences as long as an entire chromosome. Instead, STRAP manipulates peptide sequences and 3D structures of the size of single proteins. Nevertheless, it can hold a large number of sequences and structures in memory. STRAP is designed for protein sequences but can read coding nucleotide files, which are then translated to peptide sequences.
STRAP is designed to be very fast because its graphical user interface must be highly responsive. BioJava is used where speed is less critical.
BioJava is well designed in terms of type safety, ontology and object design. BioJava uses objects for sequences, annotations and sequence positions. Even single amino acids or nucleotides are object references. To enhance speed, STRAP avoids frequent object instantiations and the invocation of non-final object methods.
- In BioJava, peptide sequences and nucleotide sequences are lists of symbols. The symbols can be retrieved one after the other with an iterator, or sub-sequences can be obtained. The advantages are that the entire sequence does not necessarily reside in memory and that programs are less susceptible to programming errors. Symbol objects are immutable elements of an alphabet. In STRAP, however, simple byte arrays are used for sequences and float arrays for coordinates. Besides speed, the low memory consumption is an important advantage of primitive data types. Classes in STRAP expose internal data. Programmers might therefore make mistakes such as manipulating byte arrays directly instead of using the setter methods. Another disadvantage is that STRAP does not check whether the characters in a sequence are valid with respect to an underlying alphabet.
- In BioJava, sequence positions are represented by the class Location. Discontiguous Location objects are composed of several contiguous RangeLocation objects or PointLocation objects. For the class StrapProtein, however, single residue positions are indicated by integer numbers between 0 and countResidues()-1. Multiple positions are given by boolean arrays: true at a given index means selected, whereas false means not selected.
BioJava throws exceptions when methods are invoked with invalid parameters. STRAP avoids the time-consuming creation of Throwable objects. Instead, errors in methods are indicated by the return values NaN, -1 or null. From the point of view of program design, however, Throwable objects are the cleaner solution.
In BioJava, a Sequence object is either a peptide sequence or a nucleotide sequence. A StrapProtein can hold both at the same time if a coding nucleotide sequence was read and translated into a peptide sequence. Both the nucleotide sequence and the peptide sequence are then contained in the same StrapProtein object. The coding or non-coding regions can be changed, and the peptide sequence changes accordingly.
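
The primitive-type conventions described above (byte arrays for sequences, boolean arrays for multiple positions, and -1 instead of an exception) can be illustrated with a small sketch. It uses plain Java only and does not call the STRAP or BioJava API. The class ResidueSelectionSketch and its methods toRanges and indexOfResidue are hypothetical names invented for this illustration. The first method shows how a boolean selection array could be turned into contiguous index ranges, roughly the information that a compound location made of range and point locations carries. The second shows the sentinel-return style of error signalling.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of STRAP or BioJava. */
final class ResidueSelectionSketch {

    /**
     * Converts a boolean selection array (true = residue selected) into
     * contiguous, inclusive [start, end] ranges of 0-based residue indices.
     */
    static List<int[]> toRanges(boolean[] selected) {
        List<int[]> ranges = new ArrayList<>();
        int start = -1; // -1 means: not currently inside a selected run
        for (int i = 0; i < selected.length; i++) {
            if (selected[i] && start < 0) {
                start = i;                              // a run begins
            } else if (!selected[i] && start >= 0) {
                ranges.add(new int[] {start, i - 1});   // a run ends
                start = -1;
            }
        }
        if (start >= 0) {
            ranges.add(new int[] {start, selected.length - 1}); // run reaches the end
        }
        return ranges;
    }

    /**
     * Returns the index of the first occurrence of a residue letter in a
     * byte-array sequence, or -1 if it is absent or the sequence is null.
     * No exception is thrown; the caller must check the return value.
     */
    static int indexOfResidue(byte[] sequence, char residue) {
        if (sequence == null) {
            return -1;
        }
        for (int i = 0; i < sequence.length; i++) {
            if (sequence[i] == (byte) residue) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        boolean[] selection = {false, true, true, false, false, true};
        for (int[] range : toRanges(selection)) {
            System.out.println(range[0] + ".." + range[1]); // prints 1..2 then 5..5
        }
        byte[] sequence = "MKVLA".getBytes(StandardCharsets.US_ASCII);
        System.out.println(indexOfResidue(sequence, 'V')); // prints 2
        System.out.println(indexOfResidue(sequence, 'W')); // prints -1
    }
}
```

The trade-off is the one discussed above: nothing is allocated per residue and no Throwable is created, but nothing prevents a caller from ignoring the -1 or from writing an invalid letter into the byte array.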