[BiO BB] remove CTL-M and Buying a bioinformatics workstation

Iddo Friedberg idoerg at burnham.org
Wed Sep 3 17:00:18 EDT 2003


Tristan Fiedler wrote:
> Dear Bio Gurus!
> 
> Two quick questions :
> 
> 1.  Could someone please assist me in writing a shell script (awk, sed,
> etc.) which would use a loop to run through about 1000 files (filenames all
> end in '.seq') and remove all occurrences of control-M, resulting in a file
> containing the sequence on a single line?
> 
> Currently each file looks similar to :
> 
> % cat -v seq_018_G05.seq
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA^M
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGGGGGG^M
> TTTTTTTTTTTTTTTTCCCAAAAAAAAAAAAA^M
> 

Sounds like you need the dos2unix utility. It comes bundled with Linux; 
if you are working on another OS, you can download it for free. Use 
Google to find it.
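
For the loop that was asked about, here is a minimal sketch in plain sh. 
It assumes the files sit in one directory and uses tr rather than 
dos2unix, in case dos2unix is not installed. Note that dos2unix by itself 
only strips the control-M characters; joining the lines into one is a 
separate step, which tr handles here by deleting the newlines as well. 
The function name clean_seq_dir is made up for illustration.

```shell
#!/bin/sh
# Strip carriage returns (shown as ^M by `cat -v`) from every .seq file
# in a directory and join the lines, so each file holds a single line.
# clean_seq_dir is a hypothetical helper name, not part of any toolkit.
clean_seq_dir() {
    dir=${1:-.}                      # default to the current directory
    for f in "$dir"/*.seq; do
        [ -f "$f" ] || continue      # nothing to do if no .seq files match
        tmp="$f.tmp.$$"
        # tr -d deletes every CR (\r) and LF (\n); echo restores one final newline
        { tr -d '\r\n' < "$f"; echo; } > "$tmp" && mv "$tmp" "$f"
    done
}

# Usage (rewrites the files in place):
#   clean_seq_dir /path/to/seq/files
```

Try it on a copy of the data first, since the files are rewritten in place.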


> 
> 2.  We are planning to buy a workstation for our local (~3 labs producing
> sequences from an ABI sequencer) genomics needs (lots of blast runs,
> database management, standard bioinformatics software), and were planning
> on getting something like :
> 
> 4 GB RAM  (is this enough for doing local blast searches against genbank?)

Definitely. That's what I have, and I haven't had any issues. BLAST/PSI-BLAST 
is actually not that memory-intensive.

> 2 x 3 GHz Xeon processors (how about Mac OSX?)

The more processors, the merrier. BLAST parallelizes nicely. Regarding 
OS: I'm partial to Linux, but that's me.
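
As a rough illustration of using both processors, the sketch below prints 
a legacy NCBI blastall command line, where -a sets the number of 
processors. The helper name blast_cmd, the query file name, and the nt 
database name are assumptions for the example, not details from this 
thread.

```shell
#!/bin/sh
# Print (not run) a legacy NCBI blastall command line that uses several CPUs.
# blast_cmd, query.fa, and the "nt" database name are illustrative assumptions.
blast_cmd() {
    query=$1
    ncpu=${2:-2}    # default of 2 matches the dual-processor machine proposed
    # -p program, -d database, -i query file, -a number of processors, -o output
    echo "blastall -p blastn -d nt -i $query -a $ncpu -o $query.out"
}

# Show the command for review before running it by hand
blast_cmd query.fa 2
```

Printing the command first makes it easy to check the database and 
processor count before committing to a long search.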

> 400 GB storage
> 

You can always add more, and 400 GB is ample for starters.

> 
> Thank you - and feel free to reply directly to me (not waste bb resources).
> 
> Cheers!
> 
> 
> 

-- 
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://ffas.ljcrf.edu/~iddo



