BIRCH/Release To Do list

From Bioinformatics.Org Wiki



Future Releases

Installation and Updating


These changes can be put into operation without waiting for a new release, since they don't affect the stable release files.
The question is, how?

Test whether the JDK is headless

dpkg -l | grep openjdk
ii  openjdk-8-jre-headless:amd64          8u171-b11-0ubuntu0.18.04.1            amd64        OpenJDK Java runtime, using Hotspot JIT (headless)

Solution: Document on the GetBirch site that the JDK MUST be a full JDK, not a headless one. If getbirch.jar won't run, install the full version:


sudo apt-get remove openjdk-8-jre-headless
sudo apt-get install openjdk-8-jre
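
One way to automate the check above is to parse the output of dpkg -l. The following is a minimal Python sketch, not part of getbirch. The function names are hypothetical, and it assumes a Debian/Ubuntu system where installed packages are flagged "ii" in the first column. Because the full package may pull in the headless one as a dependency, it only reports a problem when a headless package is present without a full one.

```python
def find_openjdk_packages(dpkg_output):
    """Return (headless, full) lists of installed OpenJDK package names."""
    headless, full = [], []
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Installed packages are listed with status "ii" in the first column.
        if len(fields) < 2 or fields[0] != "ii":
            continue
        # Drop an architecture suffix such as ":amd64".
        name = fields[1].split(":")[0]
        if not name.startswith("openjdk"):
            continue
        if name.endswith("-headless"):
            headless.append(name)
        else:
            full.append(name)
    return headless, full


def java_is_headless_only(dpkg_output):
    """True when a headless OpenJDK package is installed but no full one."""
    headless, full = find_openjdk_packages(dpkg_output)
    return bool(headless) and not full
```

To use it, pass in the text captured from running dpkg -l, for example via subprocess.run(["dpkg", "-l"], stdout=subprocess.PIPE, universal_newlines=True).stdout.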


Desktop setup

local files

no longer supported, comparable to NewLocalFiles.list

Dependencies for 3rd party programs

First, we need to begin a list of programs that have special dependencies:

program    dependency
Weblogo    numpy
Weblogo    ghostscript (for formats other than eps)

It's probably a good idea to put this into birchdb, once the necessary fields for the table become stable.

Next, we need a way to detect each dependency during an install or update, and to somehow inform the user of the dependency.

Ideally, there would be a way to install the dependency during install or update, if the user is the BIRCH administrator. However, that would be a lot of work, and may not be stable. There could also be version dependencies, such as needing Python 3, or Java 8 or greater.
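
As a starting point for detection, here is a small Python sketch that checks a dependency table and reports what is missing, without trying to install anything. It is only an illustration: the table layout and function names are hypothetical, the two rows are taken from the Weblogo list above, and "gs" is assumed to be the name of the ghostscript executable.

```python
import importlib.util
import shutil

# Hypothetical dependency table: (program, kind, name).
# kind "module" is a Python module; kind "executable" is looked up on PATH.
DEPENDENCIES = [
    ("Weblogo", "module", "numpy"),
    ("Weblogo", "executable", "gs"),  # assumed name of the ghostscript binary
]


def is_available(kind, name):
    """Check one dependency without importing or running it."""
    if kind == "module":
        return importlib.util.find_spec(name) is not None
    if kind == "executable":
        return shutil.which(name) is not None
    raise ValueError("unknown dependency kind: %s" % kind)


def missing_dependencies(table=DEPENDENCIES):
    """Return the rows of the table whose dependency was not found."""
    return [row for row in table if not is_available(row[1], row[2])]
```

An install or update script could print the result of missing_dependencies() as a warning to the user. Version requirements (Python 3, Java 8 or greater) would need extra columns and checks.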


Settings for each User

We need to have a directory within the $HOME directory where settings for each user can be stored.


Since we set these things anyway, maybe it's best to implement this as a bash source file. It would be both executable and human-readable, since bash variables are in the form VAR=VALUE.


Variables in $HOME/.config/BIRCH/BIRCHsettings.source
variable name     description
BL_EMAIL          email address for notifications
BIRCH_PROMPT      Y or N - tells whether to use the BIRCH command line prompt. Y overrides prompt set in .bashrc file.
BL_TextEditor     text editor for BioLegato
BL_PDFViewer      PDF viewer for BioLegato
BL_PSViewer       PostScript viewer for BioLegato
BL_ImageViewer    bitmap image viewer for BioLegato
BL_Document       Word Processor for BioLegato
BL_Spreadsheet    Spreadsheet for BioLegato
BL_Browser        Web browser for BioLegato
BL_Terminal       Terminal program for BioLegato

Where do we put it?


OS/desktop        directory
RHEL7/Xfce        $HOME/.config
fedora31/GNOME    $HOME/.config
Ubuntu 18/MATE    $HOME/.config
Mac OSX           $HOME/.config

defines environment variables that specify where files for applications are stored. Config files are usually in a directory within $HOME/.config.

OSX - tells us that the place for preferences on the OSX desktop is ~/Library/Preferences. This directory contains a .plist file for each Mac application. The plist file is a binary. My feeling is that since BIRCH will be using a text file for settings, we should avoid this directory, especially because BIRCH is not an app, but in many respects more like an operating system.



1. Create a script that creates this directory if necessary, and populates it with settings
2. Get newuser to run it.
3. Get BioLegato to run it
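
Step 1 could look something like the Python sketch below. It creates $HOME/.config/BIRCH if needed and writes BIRCHsettings.source only when the file does not already exist, so a user's edits are never overwritten. The variable names come from the table above; the function name and every default value are placeholder assumptions, and values are assumed to contain no double quotes.

```python
import os

# Variable names from the settings table; the default values are placeholders.
DEFAULT_SETTINGS = {
    "BL_EMAIL": "",
    "BIRCH_PROMPT": "N",
    "BL_TextEditor": "",
    "BL_PDFViewer": "",
    "BL_PSViewer": "",
    "BL_ImageViewer": "",
    "BL_Document": "",
    "BL_Spreadsheet": "",
    "BL_Browser": "",
    "BL_Terminal": "",
}


def ensure_settings(home=None, defaults=DEFAULT_SETTINGS):
    """Create the settings directory and file if absent; return the file path."""
    home = home or os.path.expanduser("~")
    config_dir = os.path.join(home, ".config", "BIRCH")
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, "BIRCHsettings.source")
    if not os.path.exists(path):
        with open(path, "w") as handle:
            for name, value in defaults.items():
                # VAR="VALUE" lines can be sourced directly by bash.
                handle.write('%s="%s"\n' % (name, value))
    return path
```

Both newuser and BioLegato could call the same script, since running it a second time changes nothing.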


The original plan was a common Python directory for installed Python packages: we already have $BIRCH/python, would add $BIRCH/local/python, and had added $BIRCH/local-generic/python. That idea is now dropped. Because many Python modules, such as numpy, require compilation of C code, libraries cannot be assumed to be portable across platforms. We will phase out use of $BIRCH/python and $BIRCH/local/python in favor of lib-linux-x86_64/python and lib-osx-x86_64/python.

Python libraries


Is there a way, on a script by script basis, to force use of Python3? That way, as we progress, we can focus on developing for Python3, and do 2to3 conversions that are not backward compatible with Python2.

By now, we can count on Python3 being available, but not necessarily being the system default. We need to explicitly call Python3 in those cases where Python3 is required.
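
The simplest per-script mechanism is a first line of #!/usr/bin/env python3, which finds python3 on the PATH regardless of the system default. For scripts that may be launched some other way (e.g. python script.py), a guard at the top of the script gives a clear error instead of an obscure failure. The sketch below is one possible guard; the function name is hypothetical, and the default minimum of 3.5 is simply the lowest version in the table of machines on this page.

```python
import sys


def require_python(minimum=(3, 5), version_info=None):
    """Exit with a clear message unless the interpreter is at least `minimum`.

    version_info defaults to the running interpreter; it is a parameter
    only so that the check can be exercised with other versions.
    """
    current = tuple((version_info or sys.version_info)[:2])
    if current < minimum:
        sys.exit(
            "This script needs Python %d.%d or newer; found %d.%d. "
            "Try running it with python3." % (minimum + current)
        )
    return current
```

Calling require_python() before any Python3-only imports keeps the guard itself valid under Python2, so the user sees the message rather than a syntax or import error.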

Machine     Python3 version
flamingo    3.6.9
brassica    3.5
triticum    3.6.9
peacock     3.7.2
CCL         3.6.8
maui        3.6.9
fedora31    3.7
wotan       3.6.8

Compatibility issues

BIRCH Python compatibility




Python2&3 compliant:


Not yet compliant:

  • - needs urllib.request (try fixing with six)
  • - needs urllib.request (try fixing with six)
  • - imports urllib but doesn't directly call it. Do we need this declaration?


Python2&3 compliant:

3rd Party Python3 compatibility


The pip command installs Python packages from repositories. By default, they are installed system-wide, but we want to install them within the BIRCH directory tree. For example, to install the package gffutils, we type:

pip3 install --install-option="--prefix=$birch/lib-$BIRCH_PLATFORM/python" gffutils

All packages installed in this manner will be in $BIRCH/lib-$BIRCH_PLATFORM/python.

We would have to add the PYTHONPATH environment variable to profile.source, cshrc.source etc.
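
As an illustration of what the PYTHONPATH entry might be, the Python sketch below builds the directory and prepends it to an existing PYTHONPATH value. It rests on an assumption that should be checked on each platform: that a --prefix install places modules under lib/pythonX.Y/site-packages inside the prefix. The function names and the example BIRCH location are hypothetical.

```python
import os
import sys


def birch_site_packages(birch, platform, version_info=None):
    """Directory where a --prefix install is assumed to place modules."""
    major, minor = (version_info or sys.version_info)[:2]
    return os.path.join(
        birch, "lib-" + platform, "python",
        "lib", "python%d.%d" % (major, minor), "site-packages",
    )


def extend_pythonpath(new_dir, current=""):
    """Prepend new_dir to a PYTHONPATH string unless it is already listed."""
    parts = [p for p in current.split(os.pathsep) if p]
    if new_dir not in parts:
        parts.insert(0, new_dir)
    return os.pathsep.join(parts)
```

For example, birch_site_packages("/opt/birch", "linux-x86_64", (3, 6)) gives a path ending in python/lib/python3.6/site-packages. Note that the directory depends on the Python minor version, which differs between the machines listed above.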


Platform-dependent Python
In some Python packages (e.g. cutadapt), platform-specific compiled libraries (e.g. from C or C++) are part of the package, usually as .so files. These can be buried several directories down in the package, but they are there.

For such cases, we install in platform-specific python directories:


pip3 install --install-option="--prefix=$birch/lib-linux-x86_64/python"


pip3 install --install-option="--prefix=$birch/lib-osx-x86_64/python"

Setting PYTHONPATH then becomes




Delete old libraries

especially those associated with GDE. The best way is to rename a library using the .old extension. The libraries to try are:

Testing: The main programs of concern are acedb and treetool.

OSX Dependencies


Need mechanism for BioLegato to run commands in the background

At present, there is no way for PCD shell commands to run jobs in the background. That is, the Java Virtual Machine cannot terminate until every shell command has terminated. Even if the command ends with an ampersand, it must terminate before the JVM will terminate. That is an annoyance when we want displayed output to persist even after a BioLegato job has terminated, and a potentially major problem if we want to launch long-running or resource-intensive jobs from BioLegato.

It's probably best to write a short demo program to experiment with different approaches.



Solution: BioLegato 1.0.3 has been modified so that if a command line ends in '&', it will be run in the background.
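
For experimenting outside BioLegato, the trailing-ampersand rule can be demonstrated in a few lines of Python. This is a demo of the general technique only, not the BioLegato (Java) implementation; the function names are hypothetical. The background case detaches the child by giving it its own session and no inherited standard streams, so the parent does not have to wait for it.

```python
import subprocess


def split_background(command_line):
    """Return (command, background) where background is True for a trailing '&'."""
    stripped = command_line.rstrip()
    # A trailing "&&" is an incomplete AND-list, not a background request.
    if stripped.endswith("&") and not stripped.endswith("&&"):
        return stripped[:-1].rstrip(), True
    return stripped, False


def run_command(command_line):
    """Run in the foreground, or detached if the line ends in '&'."""
    command, background = split_background(command_line)
    if background:
        return subprocess.Popen(
            command, shell=True,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            start_new_session=True,  # child gets its own session (POSIX)
        )
    return subprocess.run(command, shell=True)
```

Because the detached child's standard output is discarded here, this suits programs that open their own window; a job whose text output matters would need to write to a file instead.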

Remaining issues:


GetInfo - Colourmask: new colours don't display

Bugzilla #1201


The Update action is contained in the SequenceWindow. My guess is that we need to pass the SequenceTextArea to the SequenceWindow so that it can call the repaint function for SequenceTextArea. It is worthy of note that there are numerous calls to repaint in SequenceTextArea that specify the area to repaint. This may be for efficiency during actions like select and scroll, and may not be necessary here.

system command appears to have no effect

Bugzilla #1204


get rid of wrappers for text editors

The BioLegato scripts call, which in turn calls either for nedit or for gedit.

Output to console

We need to decide on a standard way to run programs so that we see the progress as the program runs. Currently this is done using the command stored in $GDE_TERM, but that is not necessarily platform independent. Some possibilities include:

Table Canvas


birchadmin is a birch system administration tool.


The problem is that failure of birchdb to launch Xace or tace has been inconsistent. It works on some days, and not on others. It is as if something keeps getting set or unset.

Although error messages aren't consistent, here's one (on jupiter):

Gtk-WARNING **: Failed to load module "": cannot open shared object file: No such file or directory
Gtk-WARNING **: Failed to load module "": cannot open shared object file: No such file or directory

Other times, this script gives a Segmentation Fault error. Once again, the only place I've had this trouble is on CCL.

As well, there is a GUI front end called RazorSQL which may be all that we need to manage birchdb.

Quick and dirty patch/addon mechanism

We need a way to apply patches to an existing BIRCH install. This should be a very simple mechanism to start with, which will also teach us some things about exactly what it is that we want it to do. Initially, it should probably be nothing more than running a script that downloads a file and untars it, so that the files just go where they are supposed to go with permissions already set.

We need a mechanism to record in $BIRCH/local which addons are installed. This way, when a BIRCH update is installed, we can make sure to re-install any addons.

Definition of an add-on

An add-on includes:


An add-on can either be something new that is installed, or a patch that overwrites existing files, or even a script that runs and changes something. For example, a patch might be as simple as a script that changes important permissions, or changes the name of a file, or does a string substitution to correct an error.


get list of available addons/patches
user selects one or more
foreach addon selected
    cd $BIRCH
    download addon
    gunzip addon.tar.gz
    tar xvfp addon.tar
    cd xxxx.addon.d
    mv payload.tar $BIRCH
    cd $BIRCH
    tar xvfp payload.tar
    cd xxxx.addon.d
    cat addon_spec.csv >> $BIRCH/local/admin/addons.csv
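
The unpacking and bookkeeping steps of the outline above can be sketched in Python using only the standard library. This is a sketch under assumptions, not a finished installer: the function name is hypothetical, the download and selection steps are left out, and the tarball is assumed to contain a single top-level xxxx.addon.d directory holding payload.tar and addon_spec.csv. Tarballs should come from a trusted source, since extraction writes wherever the archive says.

```python
import os
import tarfile


def install_addon(addon_tarball, birch):
    """Unpack an add-on into birch and record it in local/admin/addons.csv."""
    # tarfile.open detects gzip compression, so no separate gunzip step.
    with tarfile.open(addon_tarball) as archive:
        top_dirs = {m.name.split("/")[0] for m in archive.getmembers()}
        archive.extractall(birch)
    addon_dirs = [d for d in top_dirs if d.endswith(".addon.d")]
    if len(addon_dirs) != 1:
        raise ValueError("expected exactly one *.addon.d directory")
    addon_dir = os.path.join(birch, addon_dirs[0])

    # The payload is unpacked relative to the BIRCH directory itself.
    with tarfile.open(os.path.join(addon_dir, "payload.tar")) as payload:
        payload.extractall(birch)

    # Append the add-on's spec so that updates can re-install it later.
    admin_dir = os.path.join(birch, "local", "admin")
    os.makedirs(admin_dir, exist_ok=True)
    with open(os.path.join(addon_dir, "addon_spec.csv")) as spec:
        with open(os.path.join(admin_dir, "addons.csv"), "a") as log:
            log.write(spec.read())
    return addon_dir
```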


Convert FSAP and XYLEM to GNU Pascal?

GNU Pascal has a great many features beyond the Jensen & Wirth standard, including support for most Borland features, and even abstract object types and methods. The main improvement would be that we could leave behind p2c. This should be done with great care and a lot of testing, because there could be surprises hiding in the implementation.


Fasta3.6 has extensive improvements over 3.5, and we need to reflect that in our online documentation and in BioLegato menus.

Command line options

There are many new output options, such as output to key/value pairs, that we should be able to take advantage of.


Each program can print its own help text, e.g. fasta36 -help. This output should be saved as files and made available through the BioLegato Help button.
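
Capturing the help output could be scripted along these lines. This is a Python sketch only: the function name and the ".help.txt" file naming are hypothetical, and it captures both standard output and standard error because command line programs differ in where they print help.

```python
import os
import subprocess


def save_help(program, out_dir, help_flag="-help"):
    """Run `program help_flag` and save whatever it prints to a text file."""
    result = subprocess.run(
        [program, help_flag],
        stdin=subprocess.DEVNULL,      # never wait for keyboard input
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,      # merge stderr into the captured text
        universal_newlines=True,
    )
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, os.path.basename(program) + ".help.txt")
    with open(path, "w") as handle:
        handle.write(result.stdout)
    return path
```

For example, save_help("fasta36", some_directory) would write fasta36.help.txt into that directory, assuming fasta36 is on the PATH.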

pairwise alignment programs - output to BioLegato

Replace fastaout.csh with a script that will open the output of a pairwise alignment program in blnalign or blpalign.


In Python:

import multiprocessing
if BLASTDB not set
   prompt for directory (default $BIRCH/GenBank)
read list of database divisions currently installed
read list of database divisions to be installed
uninstall those not in the list from previous step
install all divisions in the install list
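
The core of the outline above is a comparison of two lists, which Python sets handle directly. The sketch below is an assumption-laden illustration: the function names are hypothetical, it falls back to $BIRCH/GenBank without prompting, and it installs only divisions that are not yet present, whereas the outline installs everything in the install list (which would also refresh existing divisions).

```python
import os


def blastdb_dir(environ, birch):
    """Use BLASTDB if set, otherwise default to the GenBank directory in BIRCH."""
    return environ.get("BLASTDB") or os.path.join(birch, "GenBank")


def plan_changes(installed, wanted):
    """Return (to_uninstall, to_install) as sorted lists of division names."""
    installed, wanted = set(installed), set(wanted)
    to_uninstall = sorted(installed - wanted)
    to_install = sorted(wanted - installed)
    return to_uninstall, to_install
```

For example, with nt and nr installed and nr and refseq_rna requested, plan_changes(["nt", "nr"], ["nr", "refseq_rna"]) returns (["nt"], ["refseq_rna"]).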

Could do this as:

    • shell script with BioLegato front end
    • Python script with BioLegato front end. This could be implemented by adapting The menu layout would look something like:
      Nucleotide (nt)            Installed        Install O    Delete O
      Protein (nr)               Installed        Install O    Delete O
      RefSeq RNA (refseq_rna)    Not installed    Install O    Delete O
    • Java application

Blast output viewers




How about a BioLegato that displays a GenBank features table? It would be an output option from the Features program. blfeatures would use the table canvas to display feature information:

Accession    FeatureKey    Location    Qualifiers....

You could do the usual scan/sort/extract operations to get a narrowed-down list of features. Then retrieve the features you want from the GenBank files.
This might be far more useful than one might originally think.
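
To make the scan/sort/extract idea concrete, here is a tiny Python sketch that treats each feature as a row with the four columns named above. It says nothing about how blfeatures or the table canvas would actually do this; the function names are hypothetical and rows are plain dictionaries.

```python
COLUMNS = ("Accession", "FeatureKey", "Location", "Qualifiers")


def scan(rows, column, text):
    """Keep rows whose column contains text (case-insensitive)."""
    return [r for r in rows if text.lower() in r[column].lower()]


def sort_rows(rows, column):
    """Return the rows ordered by one column."""
    return sorted(rows, key=lambda r: r[column])


def extract(rows, column="Accession"):
    """Return the distinct values of a column, in first-seen order."""
    seen = []
    for r in rows:
        if r[column] not in seen:
            seen.append(r[column])
    return seen
```

The list of accessions returned by extract() is what would be handed to a retrieval step to fetch the matching GenBank entries.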






The Pandas API seems ideally suited for a BioLegato front end. The data paradigm seems to be the data frame (df). Pandas does an operation on data in a data frame, and the output is another data frame. Sound familiar?

Here's how to do this:

  1. Break out BioLegato as a standalone project, perhaps in a Git repo.
  2. Create a demo blpandas
  3. Advertise blpandas on the Pandas Stack Overflow forum. Solicit collaborators from the Pandas community.

Multiple Alignment


Grishin Lab Software

The Grishin Lab at HHMI has a lot of publications and tools related to protein evolution, structure and multiple alignment. The Grishin scoring matrix is one of the ones used in NCBI BLAST.


Calculates statistics for multiple sequence alignments. Output includes various scores for multiple alignment. This should be a good way for comparing the quality of alignments based on different methods or parameters.


Replace TCOFFEE!!!

On MacOSX, t_coffee is v8.14. It has not been possible so far to get later versions to run on albacore. It was possible to compile the generic version, but that also generates errors. It is not certain whether this is a problem with albacore specifically, or MacOSX in general. I ONCE installed TCOFFEE in an account on OSX, and the binary didn't work. Afterwards, I was unable to run any previous version of TCOFFEE. Even after removing all of the TCOFFEE environment variables from all .rc files, and from the .MacOSX directory, every time I tried to run 8.14, it would create a new ~/tcoffee directory with the new version in it! This thing is like a virus. You just can't get rid of it. Somewhere in this account, there is a tcoffee script or setting lurking in a file.

Fortunately, the problem is limited to a single account.

There is now a Clustal Omega, which the authors claim is "The last alignment program you'll ever need". Maybe.

blnalign, blpalign

Multiple Alignment Tutorial

blnfetch, blpfetch




BioLegato for continuous data. This would be an implementation of bltable, targeted at data expressed in real numbers, such as phenotypic data. We would start out with the appropriate programs from Phylip:



It may be time to replace mrtrans with something better

Possible replacements:

Looks like the best option is to use tranalign from EMBOSS. On the downside, that will still require writing a wrapper to check sequence names and reorder the DNA into the same order as the protein sequences. As well, it will require setting up a skeleton install of EMBOSS. But that puts us in a position to add other EMBOSS programs as we see fit.

On that subject, it probably would not be worth the effort and space to do a complete install of EMBOSS. In most cases, there are better programs to do each job.
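
The name-checking and reordering part of the tranalign wrapper mentioned above is small. A possible Python sketch follows; the function name is hypothetical, sequences are assumed to have been read already into a name-to-sequence dictionary, and DNA sequences with no matching protein are silently left out.

```python
def reorder_dna(protein_names, dna_records):
    """Return (name, sequence) pairs for the DNA in protein order.

    protein_names: sequence names in the order of the protein alignment.
    dna_records:   dict mapping sequence name to DNA sequence.
    Raises KeyError if any protein has no DNA sequence of the same name.
    """
    missing = [name for name in protein_names if name not in dna_records]
    if missing:
        raise KeyError("no DNA sequence for: " + ", ".join(missing))
    return [(name, dna_records[name]) for name in protein_names]
```

Failing early on a missing name is deliberate: a mismatch between the two files should stop the run rather than produce a misaligned result.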

Basic Genomics Tools

It should be possible to identify a set of basic genomics tools that are used by common 3rd party packages.


BIRCHv3.90 (Future Development Version)

BIRCHv3.80 (Current Development Version, UNSTABLE)

BIRCHv3.71 (Current Production Version, STABLE)











