[BiO BB] gff to sequence

J Greenbaum jgbaum at gmail.com
Sat Oct 3 13:02:48 EDT 2009


I would suggest using the BioPerl modules for parsing GFF and FASTA files:
Bio::Tools::GFF (for the GFF file)

and

Bio::SeqIO (for the FASTA file)

This should save you a lot of pain.
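
As a rough illustration of how the two modules fit together, here is a minimal sketch. The sub names load_fasta and extract_features are made up for this example, GFF version 3 is assumed as the default, and the whole FASTA file is assumed to fit in memory:

```perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::GFF;

# Load every FASTA record into a hash keyed by sequence ID.
# (load_fasta is a hypothetical helper name, not part of BioPerl.)
sub load_fasta {
    my ($fasta_file) = @_;
    my %seq_by_id;
    my $seqio = Bio::SeqIO->new(-file => $fasta_file, -format => 'fasta');
    while (my $seq = $seqio->next_seq) {
        $seq_by_id{ $seq->display_id } = $seq;
    }
    return \%seq_by_id;
}

# Return one [seq_id, start, end, strand, sequence] row per GFF feature.
# (extract_features is a hypothetical name; GFF version 3 is an assumed default.)
sub extract_features {
    my ($gff_file, $fasta_file, $gff_version) = @_;
    my $seq_by_id = load_fasta($fasta_file);
    my $gffio = Bio::Tools::GFF->new(
        -file        => $gff_file,
        -gff_version => $gff_version || 3,
    );
    my @rows;
    while (my $feat = $gffio->next_feature) {
        # Skip features whose sequence ID has no matching FASTA record.
        my $seq = $seq_by_id->{ $feat->seq_id } or next;
        # trunc() takes 1-based, inclusive coordinates, the same as GFF.
        my $sub = $seq->trunc($feat->start, $feat->end);
        # Reverse-complement features annotated on the minus strand.
        $sub = $sub->revcom if ($feat->strand || 0) < 0;
        push @rows,
            [ $feat->seq_id, $feat->start, $feat->end, $feat->strand, $sub->seq ];
    }
    $gffio->close;
    return @rows;
}
```

Calling extract_features('annotations.gff', 'genome.fa') (both file names are placeholders) should give you one row per feature that you can print or filter as needed. Letting the modules handle coordinates and strand avoids the off-by-one mistakes that are easy to make with substr.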

-Jason

On Sat, Oct 3, 2009 at 5:59 AM, Mike Marchywka <marchywka at hotmail.com> wrote:

>
> ----------------------------------------
> > Date: Sat, 3 Oct 2009 12:54:51 +0100
> > From: dan.bolser at gmail.com
> > To: bbb at bioinformatics.org
> > Subject: Re: [BiO BB] gff to sequence
> >
> > You can do this easily in Perl... Here is some 'pseudo code' to
> > (roughly) do it...
> >
> >
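> > ## Get a hash of sequences, keys = IDs, values = sequence strings;
> > my %sequences;
> > ...
> >
> > # open the GFF file ...
> > # (GFF below is an assumed name for that filehandle)
> >
> > while (my $gff = <GFF>) {
> >     next if $gff =~ /^#/;    # skip GFF comment and directive lines
> >     chomp $gff;
> >     my @gffcols = split(/\t/, $gff);
> >
> >     # GFF coordinates are 1-based and inclusive; substr offsets are 0-based
> >     print substr($sequences{$gffcols[0]}, $gffcols[3] - 1,
> >                  $gffcols[4] - $gffcols[3] + 1), "\n";
> >     ...
> > }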
> >
> >
> > Or something roughly similar to the above ;-)
> >
> > Dan.
> >
> >
> > 2009/10/3 Kie Kyon Huang :
> >> Hi,
> >>
> >> Is there a way to quickly extract out the coordinates from a gff file
> >> and the corresponding sequence from a fasta file?
> >>
>
> I guess it depends on what you mean by quick. If you mean quick to write,
> you could use awk, but then it depends on what else you want to do with
> the results. I ended up writing a C++ FASTA utility program since Perl can
> slow down sometimes, but I ended up grabbing a couple of regex libraries
> to let me grep names, etc.
>
> _______________________________________________
> BBB mailing list
> BBB at bioinformatics.org
> http://www.bioinformatics.org/mailman/listinfo/bbb
>
