Andrew wrote:
> Hi Amir,
Hi. I'm sorry I didn't get back to you sooner. I could have sworn I'm "monitoring" this forum, but I didn't get any notification that this posting was made. Hm.
> [snip...] a one-liner to count A, T, G, C from a fasta sequence.
> perl -ne "BEGIN {%cnt}" -e "@nt= /^>/ || split //, $_; foreach (@nt) { $cnt{a}++ if /[Aa]/; $cnt{t}++ if /[Tt]/; $cnt{g}++ if /[Gg]/; $cnt{c}++ if /[Cc]/; } END { print "a$cnt{a}\tt$cnt{t}\tg$cnt{g}\tc$cnt{c}"; }"
1. Thanks! I'll put it on the list of possible things to include.
2. I think when you type backslash-t, the forum doesn't print the backslash. What a bummer!
3. I think I would do it a bit differently. Something like (untested!):
perl -ne 'BEGIN {%cnt=()} if (! /^>/) {while (/([actg])/ig) {$cnt{lc $1}++}} END {print ...}'
> Also, are you looking for a developer to help your project?
I'm definitely interested in getting help. (That goes for anyone else reading this, too!) Please email me (my email is at the bottom of http://cgr.harvard.edu/cbg/scriptome/UNIX/) so we can discuss this further.
-Amir Karger