[Bio-linux-devel] Best method to build a regularly updated bioinformatics platform

Squires, Richard (NIH/NIAID) [C] richard.squires at nih.gov
Mon Oct 31 10:59:59 EDT 2016


Hi Tony, all,

Thank you for the reply, Tony. That is very helpful. I agree that a collaboration makes the most sense as I see it right now. I work at NIAID (as a contractor) and have mentioned the idea to my boss; there is interest, but I would not say I have full buy-in just yet. So this would be a great time to talk about a collaboration and figure out how to create a win-win scenario for all involved.

As I think about updating or re-envisioning how to create an up-to-date biolinux that meets our needs and the needs of the community (for analysis and training), while taking advantage of recent technological developments, I have a few thoughts:

1. I appreciate the use of Ubuntu but am not a fan of the Unity desktop. I find Elementary OS (https://elementary.io, based on Ubuntu from what I can read) to be more intuitive for new users and users in general. Perhaps we can have a discussion around the possibility of making Elementary OS the default OS.
a. Alternatively, we could pick a different flavor of Ubuntu, such as Kubuntu or Lubuntu, as the reference.
b. Yet another alternative is to enable a user to pick a flavor of Ubuntu and build biolinux on top of it.
c. (As I do not know the history of biolinux I hope I am not treading on difficult ground here.)
2. The idea of creating USB flash drives so that people can use their own laptops is attractive, and it would also help when we use another group's computers for training.
3. I would be glad to draft some use cases that we need to solve and that, it sounds like, many others may need to solve as well. I think that having well-defined use cases would be very helpful in determining our path forward.
4. My initial (hopefully not too naïve) thought is to replace the single biolinux distribution with a pipeline that can build custom packages of biolinux for people's needs (see next point). The pipeline would still produce a regularly updated biolinux distribution that users can download just as they have been, but it would be no more than 6 months out of date when downloaded. Other endpoints could include containers, or just scripts to install the packages below on a Mac.
5. As I think about the areas that we train in I see particular groupings of analysis tools that could be offered as a single full biolinux install or as separate packages for those offering a specific training. As I see the packages, they could include:
a. Data Science, Biostatistics, Scientific Development
b. NGS, Microbiome, Sequence Analysis
c. Clinical Genomics
d. Phylogenetics
e. Structural, 3D
f. Biological Networks, Systems Biology, Downstream Analysis
g. Workflow (Galaxy, Jupyter Notebook, etc.)
6. All of this will only be possible (I think) if we automate individual steps as much as possible BUT I think the technology exists to enable us to do this.
7. Finally, I see the following platforms as possible endpoints of this effort:
a. Laptops or desktops (Intel, non-Apple)
b. Laptops or desktops (Apple)
c. Containers
d. Virtual Machines
e. Cloud (Amazon, Open cloud, other)
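
To make points 4, 5 and 7 a bit more concrete, here is a rough sketch (in Python, purely illustrative) of how the package groupings could be crossed with the target platforms to give one build job per combination. The short group and target labels and the build_matrix function are hypothetical placeholders of mine, not part of any existing Bio-Linux tooling, and the sketch says nothing about how each job would actually be built.

```python
from itertools import product

# Package groupings from point 5 (hypothetical short labels).
PACKAGE_GROUPS = [
    "data-science",
    "ngs",
    "clinical-genomics",
    "phylogenetics",
    "structural",
    "networks",
    "workflow",
]

# Target platforms from point 7 (hypothetical short labels).
TARGETS = ["pc", "mac", "container", "vm", "cloud"]


def build_matrix(groups, targets):
    """Return one build job per (package group, target) pair."""
    return [{"group": g, "target": t} for g, t in product(groups, targets)]


if __name__ == "__main__":
    jobs = build_matrix(PACKAGE_GROUPS, TARGETS)
    print(f"{len(jobs)} build jobs")
    for job in jobs[:3]:
        print(job)
```

An automated pipeline could walk a list like this on a schedule, which is one way to keep every endpoint within the 6-month window mentioned in point 4.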

I apologize if any of this is not totally coherent; if so, it merely reflects the current state of it in my head! ☺

I would be glad to discuss more here or answer any questions people might have as well as learn any history that would prevent anything above from being a possibility.

Cheers,

Burke Squires


--
R. Burke Squires
Computational Genomics Specialist
Contractor: Medical Sciences & Computing, Inc.
Bioinformatics and Computational Biosciences Branch (BCBB)
National Institute of Allergy and Infectious Diseases (NIAID)
OCICB / OSMO / OD / NIAID / NIH
 
31 Center Drive, Room 3B62E.2
Bethesda, MD 20892
Office: 301-402-9408
Mobile: 240-454-4515
http://bioinformatics.niaid.nih.gov (Within NIH)
http://exon.niaid.nih.gov (Public)
https://twitter.com/niaidbioit (Twitter)
 
Disclaimer: The information in this e-mail and any of its attachments is confidential and may contain sensitive information. It should not be used by anyone who is not the original intended recipient. If you have received this e-mail in error please inform the sender and delete it from your mailbox or any other storage devices. National Institute of Allergy and Infectious Diseases shall not accept liability for any statements made that are sender's own and not expressly made on behalf of the NIAID by one of its representatives.

On 10/30/16, 2:03 PM, "Tony Travis" <tony.travis at minke-informatics.co.uk> wrote:

    On 30/10/16 00:59, Squires, Richard (NIH/NIAID) [C] wrote:
    > Hello all,
    > 
    >  
    > 
    > I am interested in hearing people's thoughts about the best way to build
    > a regularly updated platform for bioinformatics analysis and training.
    > We are currently teaching classes on the NIH campus using Macs as well
    > as at international locations, such as Mali and India, using Wintel
    > laptops. We regularly have students who bring their own laptop and who
    > would like to follow along with the lessons.
    
    Hi, Richard.
    
    I ran a Bio-Linux 8 training course at Assam Agricultural University in
    Jorhat, India using the USB version of Bio-Linux 8. Everyone brought
    their own laptop and although this was great for teaching Bio-Linux 8,
    we ran more computationally demanding GWAS exercises on a Bio-Linux 8
    terminal server using the "x2go" client on the Bio-Linux 8 USB sticks as
    well as showing people how to use the Windows or Mac "x2go" clients.
    
    > To date, we have used Salt scripts to update biolinux 8.0 with the
    > software that we need. This has been used on the Wintel laptops.
    > Ideally, I would like to have a single workflow that would create
    > virtual machines or containers that can be used in the cloud or on a
    > laptop, as well as scripts that can be used to install software on a Mac
    > or Windows.
    
    I'm not sure what you mean by "wintel" laptops: Do you mean dual-booted
    with Windows and Bio-Linux, or using a VM, etc.? I've just been trying out
    the Ubuntu-in-Windows under the 'Anniversary' edition of Windows 10 :-)
    
    My colleague Luca Beltrame uses "Salt" to manage our Bio-Linux 8 configs
    for a project we are collaborating on in the Life Science Informatics
    department at the Mario Negri Institute in Milan, Italy.
    
    > I ask here because I think that this potential workflow could also be
    > used to create a new version of biolinux that can be easily updated.
    
    Bio-Linux can already be updated automatically from the Debian APT
    repositories that it comes pre-configured with:
    
      apt update
      apt full-upgrade
    
    This works well, but when Ubuntu 16.04 LTS was released quite a lot of
    problems occurred for people trying to upgrade Bio-Linux 8 from Ubuntu
    14.04 to 16.04 because some of the NERC packages (e.g. QIIME) have
    dependencies for packages in the older release of Ubuntu.
    
    I don't see how you get around problems like this unless you repackage
    everything that breaks for the new Ubuntu release, which is what I've
    started doing. I've now got Bio-Linux 8 running under Ubuntu 16.04 LTS,
    but I've started work on a Bio-Linux 9 release, rather than explaining
    the complicated work-arounds that are necessary to do the upgrade.
    
    Bio-Linux 9 will be based on Ubuntu 16.04 LTS.
    
    > Please let me know if you have any suggestions or questions.
    
    I'm using Tim Booth's upgrade scripts and his instructions on GitHub
    about how to create an iso of the new release:
    
      https://github.com/environmentalomics
    
    I met Harry Mangalam at Basel Life Science Week recently and we
    discussed at length how we could achieve what you are also trying to do,
    which is create a supportable version of Bio-Linux. Harry teaches a
    "Bio-Linux" course at UCI, but does not currently use NERC/EOS
    Bio-Linux. Brad Chapman also used NERC/EOS Bio-Linux as the basis of
    Cloud-Bio-Linux, and he is also a 'Docker' enthusiast.
    
    I think it would be good if we start a collaboration to create a new
    version of Bio-Linux and this list is a good place to discuss ideas.
    Please forward this email to anyone else who might be interested and
    let's try to put together a plan of action. I've CC'ed this off-list to
    the people I've mentioned and also to Tracey Timms-Wilson, who is the
    NERC/EOS project manager for Bio-Linux.
    
    Thanks for raising this important topic,
    
      Tony.
    
    -- 
    Minke Informatics Limited, Registered in Scotland - Company No. SC419028
    Registered Office: 3 Donview, Bridge of Alford, AB33 8QJ, Scotland (UK)
    tel. +44(0)19755 63548                    http://minke-informatics.co.uk
    mob. +44(0)7985 078324        mailto:tony.travis at minke-informatics.co.uk
    


