[BiO BB] Bioperl FASTA to NEXUS with concatenation of genes

basm101 basm101 at york.ac.uk
Tue May 20 05:24:55 EDT 2003

Hi there,

I'm new to this board, so I'm not sure whether people use it to ask for
help. If not, I apologise for posting in the wrong place.

I want to concatenate gene datasets and wondered if anyone knows how to
use bioperl to do this. My input files are in aligned FASTA format, and
I would like the output to be NEXUS in non-interleaved format, along
these lines:


begin taxa;
dimensions ntax=number of taxa;
labels here

begin characters;
dimensions nchar=number of chars;
format symbols = "" missing=?;

species1 AAA
species2 BBB
species3 AAA

species1 CCC
species2 DDD
species3 CCC

paup block here.

I have a bioperl script that prints out in interleaved format, but I
don't want this as I want to clearly see where one gene ends and the
next begins.

#!/usr/bin/perl -w

# Bioperl for format conversions: aligned FASTA in, NEXUS out
use Bio::AlignIO;

print "Which input file ?\n";
chomp(my $infile = <STDIN>);

print "output filename:\n";
chomp(my $output = <STDIN>);

# Bio::AlignIO opens both files itself, so no separate open() calls are needed.
# note: we quote -format to keep older perls from complaining.
my $in  = Bio::AlignIO->new(-file => $infile,    '-format' => 'fasta');
my $out = Bio::AlignIO->new(-file => ">$output", '-format' => 'nexus');

# Write each alignment found in the input file to the output file.
while ( my $aln = $in->next_aln() ) {
    $out->write_aln($aln);
}

Also I need to make the AlignIO object take multiple input files.
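
For illustration only, here is a minimal sketch of one way the
multi-file concatenation described above might look. It is not an
established bioperl recipe. The helper names read_alignment and
concat_to_nexus are hypothetical, as are "datatype=dna" and "gap=-" in
the format line. Each file is read with Bio::AlignIO, the sequences are
joined per taxon into a single non-interleaved line, and the gene
boundaries are recorded as charsets in a sets block rather than as
separate visual blocks. Taxa missing from a gene are padded with "?".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read one aligned FASTA file and return
# (alignment length, { taxon id => aligned sequence string }).
sub read_alignment {
    my ($file) = @_;
    require Bio::AlignIO;    # loaded lazily; only this helper needs bioperl
    my $in  = Bio::AlignIO->new(-file => $file, -format => 'fasta');
    my $aln = $in->next_aln or die "No alignment found in $file\n";
    my %seqs = map { ( $_->display_id, $_->seq ) } $aln->each_seq;
    return ( $aln->length, \%seqs );
}

# Build a non-interleaved NEXUS string from a list of genes.
# Each gene is [ name, length, { taxon => sequence } ] (hypothetical layout).
sub concat_to_nexus {
    my @genes = @_;

    # Collect every taxon seen in any gene, once each, in sorted order.
    my %seen;
    my @taxa = grep { !$seen{$_}++ } map { keys %{ $_->[2] } } @genes;
    @taxa = sort @taxa;

    my ( %concat, @charsets );
    my $pos = 0;
    for my $gene (@genes) {
        my ( $name, $len, $seqs ) = @$gene;
        for my $taxon (@taxa) {
            # Pad taxa that lack this gene with the missing-data symbol.
            $concat{$taxon} .= exists $seqs->{$taxon} ? $seqs->{$taxon} : '?' x $len;
        }
        # Record where this gene starts and ends in the concatenated matrix.
        push @charsets, sprintf( "charset %s = %d-%d;", $name, $pos + 1, $pos + $len );
        $pos += $len;
    }

    my $out = "#NEXUS\n\n";
    $out .= "begin taxa;\ndimensions ntax=" . scalar(@taxa) . ";\n";
    $out .= "taxlabels @taxa;\nend;\n\n";
    $out .= "begin characters;\ndimensions nchar=$pos;\n";
    $out .= "format datatype=dna missing=? gap=-;\nmatrix\n";    # assumed settings
    $out .= "$_ $concat{$_}\n" for @taxa;
    $out .= ";\nend;\n\n";
    $out .= "begin sets;\n" . join( "\n", @charsets ) . "\nend;\n";
    return $out;
}

# Every file named on the command line is treated as one gene.
if (@ARGV) {
    my @genes;
    for my $file (@ARGV) {
        my ( $len, $seqs ) = read_alignment($file);
        ( my $name = $file ) =~ s/\W+/_/g;    # charset name derived from the file name
        push @genes, [ $name, $len, $seqs ];
    }
    print concat_to_nexus(@genes);
}
```

Saved under a hypothetical name such as concat.pl, it would be run as
"perl concat.pl gene1.fasta gene2.fasta > combined.nex". The paup block
would still need to be appended by hand.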

Any ideas ?

University of York
