[Biococoa-dev] first non-whitespace character
Koen van der Drift
kvddrift at earthlink.net
Sun Nov 28 19:04:08 EST 2004
Hi,
Does anyone know how to get the location in a string of the first
non-whitespace character? I am trying to parse a clustal file
(turn on monospaced font):
ACT1_FUGRU      -----------------------MEDEIAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGR
ACT2_FUGRU      -----------------------MDDEIAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGR
ACT3_FUGRU      -----------------------MEDEVASLVVDNGSGMCKAGFAGDDAPRAVFPSIVGR
5H1A_FUGRU      MDLRATSSNDSNATSGYSDTAAVDWDEGENATGSGSLPDPELSYQIITSLFLGALILCSI
5H1B_FUGRU      -------MEGTNNTTGWT-----HFDSTSNRTSKSFDEEVKLSYQVVTSFLLGALILCSI
5H1D_FUGRU      -------MELDNNSLDYFSSN--FTDIPSNTTVAHWTEATLLGLQISVSVVLAIVTLATM
* . . :
:
The first 6 lines are easy to parse. However, the last line, which
contains the alignment, starts at the same column as the sequences in
the other lines, not at the asterisk. So I need to figure out where the
first character after the name starts in the first lines. Once I have
that number, all subsequent lines start at the same column.
Unfortunately, that number can vary between clustal files. I already
committed a readClustal method earlier today, so you can see what I have
so far.
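One possible way to find that column, as a minimal sketch in plain C
(which also compiles inside an Objective-C file). It assumes each
sequence line has the form name, whitespace padding, sequence. The
function names are hypothetical and are not part of BioCocoa or of the
readClustal method:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Index of the first non-whitespace character at or after `from`,
   or -1 if there is none. */
static long first_non_whitespace(const char *s, size_t from)
{
    assert(s != NULL);
    size_t len = strlen(s);
    for (size_t i = from; i < len; i++) {
        if (!isspace((unsigned char)s[i]))
            return (long)i;
    }
    return -1;
}

/* Column where the sequence starts in a "NAME   SEQUENCE" line:
   skip leading whitespace, skip the name, then skip the padding.
   Returns -1 if the line holds no sequence after the name. */
static long sequence_start_column(const char *line)
{
    long start = first_non_whitespace(line, 0);
    if (start < 0)
        return -1;
    size_t i = (size_t)start;
    while (line[i] != '\0' && !isspace((unsigned char)line[i]))
        i++;                      /* step over the sequence name */
    return first_non_whitespace(line, i);
}

/* Text of `line` starting at `column`, or "" if the line is shorter.
   Intended for the last line, which has no name in front of it. */
static const char *text_from_column(const char *line, long column)
{
    assert(line != NULL);
    if (column < 0 || (size_t)column >= strlen(line))
        return "";
    return line + column;
}
```

The column would be computed once from the first sequence line with
sequence_start_column() and then passed to text_from_column() for the
last line of the block. With NSString, the same search could probably be
done with rangeOfCharacterFromSet: and the inverted
whitespaceCharacterSet, although that is not shown here.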
cheers,
- Koen.