Fwd: [Biococoa-dev] singletons

Alexander Griekspoor mek at mekentosj.com
Sun Aug 22 04:39:40 EDT 2004

I got an error from the mail server at biococoa, so I'll try to send it again.

Got it, Koen.
Looking at it (again, without understanding it completely, as I haven't had 
the time to really dive into BioJava yet), it seems much of that code is 
indeed in the AlphabetManager (a singleton object).

Quoted from tutorial:

The set of Symbol objects which may be found in a particular type of 
sequence data are defined in an Alphabet. It is always possible to 
define custom Symbols and Alphabets, but BioJava supplies a set of 
predefined alphabets for representing biological molecules. These are 
accessible through a central registry called the AlphabetManager, and 
through convenience methods.

FiniteAlphabet dna = DNATools.getDNA();
Iterator dnaSymbols = dna.iterator();
while (dnaSymbols.hasNext()) {
     Symbol s = (Symbol) dnaSymbols.next();
     System.out.println(s.getName());
}

Quoted from source:

  * Utility methods for working with Alphabets.  Also acts as a registry for
  * well-known alphabets.
  * <p>
  * The alphabet interfaces themselves don't give you a lot of help in
  * getting an alphabet instance. This is where the AlphabetManager comes in
  * handy. It helps out in serialization, generating derived alphabets and
  * building CrossProductAlphabet instances. It also contains limited support for
  * parsing complex alphabet names back into the alphabets.
  * </p>
  * @author Matthew Pocock
  * @author Thomas Down

It seems to get the details from the XML file AlphabetManager.xml.
It also has the methods to create the symbols, like this one:
   /**
    * <p>
    * Generate a new AtomicSymbol instance with a token, name and
    * annotation.
    * </p>
    * <p>
    * Use this method if you wish to create an AtomicSymbol instance. Initially it
    * will not be a member of any alphabet.
    * </p>
    * @param token  the Char token returned by getToken() (ignored as of BioJava 1.2)
    * @param name  the String returned by getName()
    * @param annotation the Annotation returned by getAnnotation()
    * @return a new AtomicSymbol instance
    * @deprecated Use the two-arg version of this method instead.
    */
   static public AtomicSymbol createSymbol(
     char token, String name, Annotation annotation
   ) {
     AtomicSymbol as = new FundamentalAtomicSymbol(name, annotation);
     return as;
   }

It also seems to contain the code that converts the items in the XML file to 
symbols, though my Java isn't good enough to be sure.
Anyway, I mentioned before that I very much like the idea of an 
intermediate Alphabet layer in BioCocoa as well.
In that design, symbols make up alphabets. This would, for instance, "solve" 
the problem John is having, where he has to instantiate all nucleotides 
at once: you create an alphabet only when it is needed (and thereby 
automatically fill it with all the singletons that belong in it). This 
might also answer the species-specific protein problem, although I have 
to admit that I don't know exactly how yet. Alphabets can be used for 
proteins as well as DNA/RNA, since an alphabet is simply a bag of symbols. 
That also solves the problem that we need acgtn for DNA and acgun for RNA 
(uracil is still missing now): we could have a DNAAlphabet and an 
RNAAlphabet. A lot of questions and answers can be found in the cookbook 
on the BioJava website, by the way.
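
To make the alphabet idea a bit more concrete, here is a minimal sketch 
in plain Java. It is neither BioJava's actual API nor BioCocoa code: the 
names SketchSymbol and SketchAlphabet, the "token=name" strings and the 
symbol names are all made up for illustration. It only shows the two 
points above: an alphabet is filled with its symbols the first time it 
is asked for, and the DNA and RNA alphabets share the same singleton 
symbols where they overlap.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical symbol class: one shared instance per name. */
final class SketchSymbol {
    private static final Map<String, SketchSymbol> INSTANCES = new HashMap<>();
    private final String name;

    private SketchSymbol(String name) { this.name = name; }

    /** Returns the single shared instance for this name, creating it on first use. */
    static synchronized SketchSymbol named(String name) {
        return INSTANCES.computeIfAbsent(name, SketchSymbol::new);
    }

    String name() { return name; }
}

/** Hypothetical alphabet class: a bag of symbols keyed by token, built lazily. */
final class SketchAlphabet {
    private static final Map<String, SketchAlphabet> REGISTRY = new HashMap<>();
    private final Map<Character, SketchSymbol> symbolsByToken = new LinkedHashMap<>();

    // Each entry has the made-up form "token=name", e.g. "a=adenine".
    private SketchAlphabet(String... tokenEqualsName) {
        for (String entry : tokenEqualsName) {
            String[] parts = entry.split("=", 2);
            symbolsByToken.put(parts[0].charAt(0), SketchSymbol.named(parts[1]));
        }
    }

    /** Looks up an alphabet by name; nothing is instantiated before the first request. */
    static synchronized SketchAlphabet forName(String name) {
        SketchAlphabet alphabet = REGISTRY.get(name);
        if (alphabet == null) {
            alphabet = build(name);
            REGISTRY.put(name, alphabet);
        }
        return alphabet;
    }

    private static SketchAlphabet build(String name) {
        switch (name) {
            case "DNA":
                return new SketchAlphabet(
                    "a=adenine", "c=cytosine", "g=guanine", "t=thymine", "n=any");
            case "RNA":
                return new SketchAlphabet(
                    "a=adenine", "c=cytosine", "g=guanine", "u=uracil", "n=any");
            default:
                throw new IllegalArgumentException("Unknown alphabet: " + name);
        }
    }

    boolean contains(char token) { return symbolsByToken.containsKey(token); }

    SketchSymbol symbol(char token) { return symbolsByToken.get(token); }

    int size() { return symbolsByToken.size(); }
}
```

With this, SketchAlphabet.forName("RNA").contains('u') is true while the 
DNA alphabet has no 'u', and both alphabets hand back the very same 
adenine object.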

Quote from cookbook:

In BioJava Alphabets are collections of Symbols. Common biological 
alphabets (DNA, RNA, protein etc) are registered with the BioJava 
AlphabetManager at startup and can be accessed by name. The DNA, RNA 
and protein alphabets can also be accessed using convenient static 
methods from DNATools, RNATools and ProteinTools respectively.

Both of these approaches are shown in the example below

import org.biojava.bio.symbol.*;
import java.util.*;
import org.biojava.bio.seq.*;
public class AlphabetExample {
   public static void main(String[] args) {
     Alphabet dna, rna, prot;
     //get the DNA alphabet by name
     dna = AlphabetManager.alphabetForName("DNA");
     //get the RNA alphabet by name
     rna = AlphabetManager.alphabetForName("RNA");
     //get the Protein alphabet by name
     prot = AlphabetManager.alphabetForName("PROTEIN");
     //get the protein alphabet that includes the * termination Symbol
     prot = AlphabetManager.alphabetForName("PROTEIN-TERM");
     //get those same Alphabets from the Tools classes
     dna = DNATools.getDNA();
     rna = RNATools.getRNA();
     prot = ProteinTools.getAlphabet();
     //or the one with the * symbol
     prot = ProteinTools.getTAlphabet();
   }
}

Well, perhaps you will get more (and better) ideas when checking some 
of the BioJava code. Tell us what you think of how they solved the problem. 
Again, their tutorial and cookbook also seem to give quite a bit of 
info. I love the way they handle things like cross alphabets, where you 
can for instance get all the symbols from two alphabets, and also the 
way they create codons (a codon consists of three symbols but is itself 
again one symbol, so you can create a sequence of codon symbols).
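
As a rough illustration of the codon idea, here is a small sketch in 
plain Java. It does not use BioJava's CrossProductAlphabet; the class 
CodonSketch and its methods are hypothetical, and atomic symbols are 
simplified to single characters. A Compound is one symbol built from 
several atomic tokens, and crossing the DNA tokens with themselves three 
times gives the codon alphabet.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of compound symbols built by a cross product. */
final class CodonSketch {

    /** One symbol made of an ordered tuple of atomic tokens, e.g. "atg". */
    static final class Compound {
        final String tokens;

        Compound(String tokens) { this.tokens = tokens; }

        @Override public boolean equals(Object other) {
            return other instanceof Compound && ((Compound) other).tokens.equals(tokens);
        }

        @Override public int hashCode() { return tokens.hashCode(); }

        @Override public String toString() { return tokens; }
    }

    /** Extends every symbol in 'left' by every atomic token in 'right'. */
    static List<Compound> cross(List<Compound> left, String right) {
        List<Compound> result = new ArrayList<>();
        for (Compound l : left) {
            for (char r : right.toCharArray()) {
                result.add(new Compound(l.tokens + r));
            }
        }
        return result;
    }

    /** Builds all three-token compounds over the given atomic tokens. */
    static List<Compound> codonAlphabet(String atomicTokens) {
        List<Compound> result = new ArrayList<>();
        result.add(new Compound("")); // seed: the empty tuple
        for (int i = 0; i < 3; i++) {
            result = cross(result, atomicTokens);
        }
        return result;
    }

    /** Reads a sequence of atomic tokens as a sequence of codon symbols. */
    static List<Compound> toCodons(String sequence) {
        if (sequence.length() % 3 != 0) {
            throw new IllegalArgumentException("Length must be a multiple of 3");
        }
        List<Compound> codons = new ArrayList<>();
        for (int i = 0; i < sequence.length(); i += 3) {
            codons.add(new Compound(sequence.substring(i, i + 3)));
        }
        return codons;
    }
}
```

Here CodonSketch.codonAlphabet("acgt") holds 4 x 4 x 4 = 64 compound 
symbols, and CodonSketch.toCodons("atggcc") is a sequence of two codon 
symbols rather than six nucleotides.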

In general I agree that it feels clunky to instantiate so many things 
manually in hard-coded form. I would love to see one of us come up with a 
method to instantiate a singleton symbol (or an alphabet of symbols) 
completely from a plist, much like you would instantiate a dictionary 
from a plist. The problem I see preventing this is that you have to 
declare your statics beforehand, but maybe I am completely wrong about that.
Can anyone come up with the -(id)initAlphabetFromFile: method we're 
looking for? ;-)
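
Just to show the shape of what I mean, here is a minimal sketch in Java 
rather than Objective-C. Java has no plists, so it uses the standard 
java.util.Properties "key=value" format as a stand-in; the class 
LoadedAlphabet and the "token=name" layout are assumptions made up for 
this example. The point is only that the symbols are created from data 
at run time, with no static declared per symbol. Whether the same works 
as neatly in Cocoa is still the open question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical loader: builds an alphabet from "token=name" lines. */
final class LoadedAlphabet {
    private final Map<Character, String> namesByToken = new TreeMap<>();

    private LoadedAlphabet() {}

    /** Parses text in java.util.Properties format, one "token=name" entry per line. */
    static LoadedAlphabet fromText(String text) {
        Properties entries = new Properties();
        try {
            entries.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        LoadedAlphabet alphabet = new LoadedAlphabet();
        for (String key : entries.stringPropertyNames()) {
            if (key.length() != 1) {
                throw new IllegalArgumentException("Token must be one character: " + key);
            }
            alphabet.namesByToken.put(key.charAt(0), entries.getProperty(key));
        }
        return alphabet;
    }

    /** Returns the symbol name for a token, or null if the token is not in the alphabet. */
    String nameOf(char token) { return namesByToken.get(token); }

    int size() { return namesByToken.size(); }
}
```

For example, LoadedAlphabet.fromText("a=adenine\nc=cytosine\ng=guanine\nu=uracil\nn=any\n") 
gives an RNA alphabet with five symbols, and adding a symbol means 
editing the data, not the code.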


On 21 Aug 2004, at 21:39, Koen van der Drift wrote:

> Hi,
> I looked around the BioJava code to see how they implement the use of 
> singletons for amino acids. Does anyone know where this is coded, I 
> couldn't find it.
> thanks,
> - Koen.

                     ** Alexander Griekspoor **
               The Netherlands Cancer Institute
               Department of Tumorbiology (H4)
          Plesmanlaan 121, 1066 CX, Amsterdam
                   Tel:  + 31 20 - 512 2023
                   Fax:  + 31 20 - 512 2029
                   AIM: mekentosj at mac.com
                   E-mail: a.griekspoor at nki.nl
               Web: http://www.mekentosj.com

                             iRNAi, do you?



