[Biococoa-dev] first non-whitespace character

Koen van der Drift kvddrift at earthlink.net
Sun Nov 28 19:04:08 EST 2004


Hi,

Does anyone know how to get the position of the first non-whitespace 
character in a string? I am trying to parse a Clustal file 
(turn on monospaced font):


ACT1_FUGRU      -----------------------MEDEIAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGR
ACT2_FUGRU      -----------------------MDDEIAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGR
ACT3_FUGRU      -----------------------MEDEVASLVVDNGSGMCKAGFAGDDAPRAVFPSIVGR
5H1A_FUGRU      MDLRATSSNDSNATSGYSDTAAVDWDEGENATGSGSLPDPELSYQIITSLFLGALILCSI
5H1B_FUGRU      -------MEGTNNTTGWT-----HFDSTSNRTSKSFDEEVKLSYQVVTSFLLGALILCSI
5H1D_FUGRU      -------MELDNNSLDYFSSN--FTDIPSNTTVAHWTEATLLGLQISVSVVLAIVTLATM
                                         *     .          .     :       :

The first 6 lines are easy to parse. However, the last line, which 
contains the alignment symbols, starts at the same column as the 
sequences in the other lines, not at the asterisk. So I need to figure 
out where the first character after the name starts in the first lines. 
Once I have that number, all subsequent lines start at the same number. 
Unfortunately, that number can vary between Clustal files. I already 
committed a readClustal method earlier today, so you can see what I 
have so far.
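
To make the idea concrete, here is a minimal sketch in plain C (which can 
be called from Objective-C). It is not the readClustal code; the function 
names sequence_start_column and field_at_column are hypothetical, and it 
assumes that each sequence line has the form name, whitespace padding, 
sequence, all on one line:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the 0-based column at which the sequence
   starts in a "NAME   SEQUENCE" line, or -1 if the line does not have
   both a name and a sequence field. */
static long sequence_start_column(const char *line)
{
    size_t i = 0;

    /* Skip the sequence name: the first run of non-whitespace. */
    while (line[i] != '\0' && !isspace((unsigned char)line[i]))
        i++;
    if (i == 0)
        return -1;              /* empty line or no name at column 0 */

    /* Skip the padding between the name and the sequence. */
    while (line[i] != '\0' && isspace((unsigned char)line[i]))
        i++;
    if (line[i] == '\0')
        return -1;              /* name only, nothing after it */

    return (long)i;
}

/* Hypothetical helper: returns the part of the line that starts at the
   given column, or an empty string if the line is shorter than that.
   Leading spaces are kept, so the symbols of the last line stay aligned
   with the sequence columns. */
static const char *field_at_column(const char *line, long col)
{
    if (col < 0 || strlen(line) <= (size_t)col)
        return "";
    return line + col;
}
```

The column would be computed once from the first sequence line and then 
passed to field_at_column for every other line, including the last one. 
In Cocoa, NSString's rangeOfCharacterFromSet: with the inverted 
whitespace character set could probably do the same job.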


cheers,

- Koen.



