From Bioinformatics.Org Wiki
customdoc.py - Should always ignore changes in lines beginning with http, https, ftp etc. Otherwise, it breaks links to things like ftp:///ftp.cc.umanitoba.ca/psgendb..... Does htmldoc.py also need to be fixed? No.
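A minimal sketch of the kind of guard this would need, assuming customdoc.py rewrites files line by line with a plain string substitution. The function name customize_line and its arguments are hypothetical, not taken from customdoc.py.

```python
import re

# Assumption: a line counts as "beginning with" a URL when its first
# non-blank text is an http, https or ftp scheme followed by ://
URL_PREFIX = re.compile(r"^\s*(https?|ftp)://", re.IGNORECASE)


def customize_line(line, old, new):
    """Replace old with new, unless the line begins with a URL."""
    if URL_PREFIX.match(line):
        return line  # leave links untouched
    return line.replace(old, new)
```

Other schemes can be added to the pattern if they turn up in the documentation.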
- FAQ - Some of the questions and answers in the main BIRCH FAQ are a bit dated. This document needs to be reviewed and updated.
- better documentation on how to get email notification working
- add mechanism for desktop notification eg. GNOME, MAC?
- more email notifications for long-running jobs (eg. phylogeny, multiple alignment)
- Using Unix pages
- link to Linux man pages - definitive site for Linux command manual pages
- For the Unix command summary, make links to specific commands. The downside of the man7.org page is that it has ALL commands, so it's hard to find man pages for the common commands.
Upgrade to Python 2.7
- Upgrade needed in anticipation of BioPython 1.69, which no longer supports Python 2.6
Upgrade to Java 1.8
BioLegato 1.0.4 - recompiled using Java 1.8
- Can flamingo and albacore be upgraded to Java 1.7?
other Java programs
- needs to be done in NetBeans
- need to re-compile all .jar files
- test all non-BIRCH Java programs
- URGENT! - Need to fix the birchdb and tbirchdb scripts to work properly on CCL, and still work elsewhere.
- Could it be that it works from a csh script, and not a bash script?
The problem is that failure of birchdb to launch Xace or tace has been inconsistent. It works on some days, and not on others. It is as if something keeps getting set or unset.
Although error messages aren't consistent, here's one (on jupiter):
Gtk-WARNING **: Failed to load module "libgail.so": libgail.so: cannot open shared object file: No such file or directory
Gtk-WARNING **: Failed to load module "libatk-bridge.so": libatk-bridge.so: cannot open shared object file: No such file or directory
Other times, this script gives a Segmentation Fault error. Once again, the only place I've had this trouble is on CCL.
Need mechanism for BioLegato to run commands in the background
At present, there is no way for PCD shell commands to run jobs in the background. That is, the Java Virtual Machine cannot terminate until every shell command has terminated. Even if the command ends with an ampersand, it must terminate before the JVM will terminate. That is an annoyance when we want displayed output to persist even after a BioLegato job has terminated, and a potentially major problem if we want to launch long-running or resource-intensive jobs from BioLegato.
It's probably best to write a short demo program to experiment with different approaches.
- src/BioPCD/parser/src/org/biopcd/parser/CommandThread.java is the object that calls Runtime.getRuntime to execute commands as new threads. See shellCommand in this file.
- As a temporary workaround, we can call scripts through a bash wrapper that uses nohup and & to run the script in the background.
- How do I launch a completely independent process from a Java program?
- Runtime.getRunTime().exec not behaving like C language “system()” command
- Is there a Null OutputStream in Java?
- Create threads to run in background
Solution: In BioLegato 1.0.3, CommandThread.java has been modified so that if a command line ends in '&', it will be run in the background.
- On Linux, we can run anything we want in the background using &. This doesn't always work on Mac OS X, for reasons that are not clear: if you launch output in the background, the text editor doesn't pop up until the parent BioLegato process is terminated. It looks like we should only use & when output is being sent to files, and not use & the rest of the time. This is probably okay, because if you log out, you certainly don't expect or want viewers to persist, and if you don't log out, there's no need to kill BioLegato. One important first step is to remove & from the birch launcher and birchadmin. We need to do more thorough testing on the remaining BioLegato programs. In some cases, there may be a need to run programs in the background through wrappers.
- Most shell commands run the command in the background. However, if temp files aren't explicitly saved, they get deleted before the command has a chance to process them. Therefore, we need to go through the .blmenu files and make sure that "save true" is included for all input file declarations eg in1.
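For the demo program suggested above, here is a rough Python analogue of the nohup and & wrapper. It is not the change made in CommandThread.java. It only illustrates the idea of starting a child in its own session with its output sent to a file, so that the parent does not have to wait for it. The function name launch_detached is hypothetical, and start_new_session is POSIX only.

```python
import subprocess
import sys


def launch_detached(argv, log_path):
    """Start argv in its own session, appending its output to log_path."""
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,   # no terminal input for the child
            stdout=log,                 # output goes to a file, not a pipe
            stderr=subprocess.STDOUT,
            start_new_session=True,     # POSIX only: new session for the child
        )
    finally:
        log.close()  # the child keeps its own copy of the file descriptor


if __name__ == "__main__":
    # Example: run a short Python command; the caller returns immediately.
    child = launch_detached([sys.executable, "-c", "print('done')"], "demo.log")
    print("started pid", child.pid)
```

Because output goes to a file rather than a pipe, the parent has nothing to keep open. This matches the observation above that & is safest when output is being sent to files.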
birch birchadmin bldna
- sort dates. We already know how to do dates from blastdbkit.py, so we borrow that code here.
- re-annotate parameters using pydoc @param operator
- revisit how BinaryExists looks for apps. The command string could be any of
- open -aWXYZ appname (where WXYZ might be any number of options)
- There should be a refresh button to update birchadmin to show any newly-installed applications
- The command for all Save buttons should include a notify popup to tell the user to restart birchadmin for changes to be seen.
- CRITICAL! Check for Python2 during install.
- CRITICAL! Check for Java (non-headless) during install.
- CRITICAL! On flamingo, updating BIRCH changes the permissions of local/admin/platform.profile.source so that it is no longer world-readable. What causes this?
These changes can be put into operation without waiting for a new release, since they don't affect the stable release files.
The question is, how?
- Wrap getbirch in a script, and check for what we need there?
- Just make it very prominent on the download page that you need to have these installed? For example, clicking on GETBIRCH could pop up a new web page that tells the user what must be installed, and how to test for it?
- In any case, we need to revise the Install with GetBirch page to show the current look of the GetBirch wizard.
- record of downloads - add failover to a second host, if sending data to flamingo fails
- option to get onto mailing list
- Java 1.8 on Ubuntu - Default doesn't work with getbirch.jar. Need to comment out a line in /etc/java-8-openjdk/accessibility.properties as described in . This really sucks, because there doesn't seem to be a non-root way to fix this. The -Djavax switch recommended by another responder doesn't appear to work.
Test whether the JDK is headless
dpkg -l | grep openjdk
ii openjdk-8-jre-headless:amd64 8u171-b11-0ubuntu0.18.04.1 amd64 OpenJDK Java runtime, using Hotspot JIT (headless)
Solution: Document on GetBirch site that JDK MUST be full JDK, and not headless. If getbirch.jar won't run, install full JDK
sudo apt-get remove openjdk-8-jre-headless
sudo apt-get install openjdk-8-jre
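If this check were automated, for instance in a wrapper around getbirch, it might look like the sketch below. It assumes a Debian/Ubuntu system and the openjdk-N-jre / openjdk-N-jre-headless package naming seen in the dpkg output above. The function name headless_only is hypothetical, and the text it parses is whatever dpkg -l prints.

```python
import re

# Assumption: OpenJDK packages are named openjdk-<N>-jre or openjdk-<N>-jdk,
# with an optional -headless suffix, as in the dpkg listing above.
PKG = re.compile(r"^openjdk-\d+-(?:jre|jdk)(-headless)?$")


def headless_only(dpkg_listing):
    """Return True if only headless OpenJDK packages are installed."""
    found_headless = False
    found_full = False
    for line in dpkg_listing.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "ii":
            continue  # not an installed-package line
        name = fields[1].split(":")[0]  # drop the :amd64 architecture suffix
        match = PKG.match(name)
        if not match:
            continue
        if match.group(1):
            found_headless = True
        else:
            found_full = True
    return found_headless and not found_full
```

The full JRE normally pulls in the headless package as well, which is why the function reports a problem only when no non-headless package is present.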
Mechanism for setting max. CPU
General table functions
- File/Directory functions
- Delete
- Unzip/bzunzip
- cd - change directory
- Sequence trimming and quality
- trim_galore - rewrite scripts and PCD so that paired read files can be used as input.
- fastq_pair - add binary and PCD menu
- For big files, seqkit stats can take a long time.
Add email notification to this function. Also, add to the hints a note that as the # of CPUs increases, the load on RAM also increases, because SeqKit uses pigz to do decompression through an I/O stream for each file. It could be that things will go faster if we use a smaller number of CPUs. Some experimentation is in order.
- Genome assembly and evaluation
- Pollux - automatically running FastQC on corrected reads doesn't work
- Transcriptome assembly and evaluation
Hints - short help items in menus that tell you what to select, what is required
Can we get BioLegato overwrite function to work?
- update MAFFT. A lot of bug fixes since the current version in BIRCH.
- blprotein - PXHOM can't work with lowercase amino acids. Where do we do the fix? PXHOM, hom.py, or BioLegato?
Short term Fix: set readseq to convert aa sequences to uppercase before running PXHOM.
- A better long-term fix is to modify P1HOM and P2HOM to accept lowercase. Actually, this will require modifying sunmods.p, p1hom.rp, p2hom.rp and prostat.rp. While we're at it, we should add support for the amino acids pyrrolysine (Pyl,O) and Leu/Ile (Xle,J).
- we should add a -uniq option to blsort.py
- All BioLegato interfaces using the Table canvas should have a function to export to a spreadsheet, as determined by the $BL_Spreadsheet variable. ie. blmarker, blnfetch, blpfetch, blncbi, bltable
- Nucleotide database query - In the Output tab, the Output Format combo box includes an obsolete choice for GI number. Can we replace this with an option to retrieve Accession numbers?
birchdb - some of the command line arguments for blastdbkit.py listed in the database are wrong eg. --addfiles should be --add. Fix these.
- Special cases of est and pdbnt databases
- Where to document, aside from Adding BLAST Databases page
- How should blastdbkit.py deal with these?
- If est or pdbnt are installed, generate message to also install dependencies?
- message in blastdbkit.log?
- don't include in BLAST/FASTA menus unless dependencies are installed
- In updates, automatically update dependencies if est or pdbnt are installed?
- upgrade to FASTA 36.3.8 (April 2016)
- have a look at FASTA scripts for annotation.
- work out in depth the ways to get the most performance out of local BLAST and FASTA.
- add option to set number of CPUs
- This paper demonstrates the counterintuitive finding that multiple CPUs actually degrade performance when memory is smaller than the size of the database being searched. http://www.nersc.gov/users/computational-systems/genepool/performance-and-optimization. It may be good to add environment variables that are set at install time and default to reasonable values for one's system.
- BLAST display programs and/or HTML output
- BLASTGrabber - BMC Bioinformatics 2014 15:128.
- output should always show real, user and sys times for search. Maybe these could be pseudocomments in table-report?
- blastdbcmd integration into BioLegato (akin to blncbi)
- launch BLAST/FASTA from Artemis
- better support for the more specialized blasts such as rpsblast and psiblast
- In BioLegato (and in reports?) should display both the descriptions of databases, and the codes for the databases.
- FASTA output in multiple formats. Run fasta with more than one -m option so that a single search writes several output formats,
eg. -m "F8 $NAME.fasta.tsv" -m "F2 $NAME.fasta2"
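For the CPU option discussed earlier in this list, a minimal sketch of how a default thread count might be chosen. It takes the finding cited above at face value (use one CPU when the database is larger than memory), which is a simplification. The environment variable BIRCH_MAX_CPUS and the function name search_threads are hypothetical.

```python
import os


def search_threads(db_bytes, mem_bytes, cpus, env=os.environ):
    """Pick a thread count for a local BLAST/FASTA search."""
    # Hypothetical install-time setting that caps the number of CPUs.
    limit = env.get("BIRCH_MAX_CPUS")
    if limit and limit.isdigit() and int(limit) > 0:
        cpus = min(cpus, int(limit))
    # Simplified reading of the cited finding: extra CPUs do not help
    # when the database does not fit in memory.
    if db_bytes > mem_bytes:
        return 1
    return max(1, cpus)
```

The returned value would then be passed to whatever option sets the thread count for the search program being run.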