From pankaj at nii.res.in Wed Feb 1 01:02:08 2006
From: pankaj at nii.res.in (Pankaj)
Date: Wed, 1 Feb 2006 11:32:08 +0530
Subject: [BiO BB] how to find gene neighbours
Message-ID: <20060201060208.M35291@nii.res.in>

Hi Everybody,

I have a few gi numbers. I want to find their neighbouring genes. How can
I do that?

Thanking all in advance,

Pankaj Khurana

--
Open WebMail Project (http://openwebmail.org)

From adarshramakumar at yahoo.co.uk Wed Feb 1 03:20:31 2006
From: adarshramakumar at yahoo.co.uk (Adarsh Ramakumar)
Date: Wed, 1 Feb 2006 08:20:31 +0000 (GMT)
Subject: [BiO BB] how to find gene neighbours
In-Reply-To: <20060201060208.M35291@nii.res.in>
Message-ID: <20060201082031.15704.qmail@web25508.mail.ukl.yahoo.com>

Try exploring with http://string.embl.de/
It does some amazing exploration.

--- Pankaj wrote:

> Hi Everybody,
> I have a few gi numbers. I want to find their neighbouring genes.
> How can I do that?
>
> Thanking all in advance
>
> Pankaj Khurana

Mr. Adarsh Ramakumar,
BioInformatics,
University of Bremen.
http://www.biox.uni-bremen.de
Mail: adarshbiox at uni-bremen.de
Tel (W) +49-421-2182911.
Mobile +49-1747273680.

From idoerg at gmail.com Wed Feb 1 18:40:58 2006
From: idoerg at gmail.com (Iddo Friedberg)
Date: Wed, 01 Feb 2006 15:40:58 -0800
Subject: [BiO BB] how to find gene neighbours
In-Reply-To: <20060201060208.M35291@nii.res.in>
References: <20060201060208.M35291@nii.res.in>
Message-ID: <43E1470A.2000203@burnham.org>

Generally, you can do that by going to the genomic database of choice for
your particular organism. As you have not mentioned which organism those
genes are from, it is rather hard to make a recommendation. Ensembl,
www.fruitfly.org, and yeastgenome.org are examples of such databases for
eukaryotic model organisms.

A list for bacterial projects is available from:

http://www.pasteur.fr/recherche/unites/tcruzi/minoprio/genomics/bacteria.htm

HTH,

Iddo

Pankaj wrote:

>Hi Everybody,
>I have a few gi numbers. I want to find their neighbouring genes. How can
>I do that?
>
>Thanking all in advance
>
>Pankaj Khurana

--
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9949
http://iddo-friedberg.org
http://BioFunctionPrediction.org

From mkgovindis at yahoo.com Wed Feb 1 21:46:31 2006
From: mkgovindis at yahoo.com (govind mk)
Date: Wed, 1 Feb 2006 18:46:31 -0800 (PST)
Subject: [BiO BB] how to find gene neighbours
In-Reply-To: <43E1470A.2000203@burnham.org>
Message-ID: <20060202024631.50131.qmail@web34701.mail.mud.yahoo.com>

I think NCBI's Map Viewer can help you find neighbouring genes. Are you
looking for an algorithm, or do you just want to map your genes onto
genomes and identify their neighbours?

-Govind

Iddo Friedberg wrote:

> Generally, you can do that by going to the genomic database of choice
> for your particular organism. As you have not mentioned which organism
> those genes are from, it is rather hard to make a recommendation.

From pankaj at nii.res.in Thu Feb 2 01:17:44 2006
From: pankaj at nii.res.in (Pankaj)
Date: Thu, 2 Feb 2006 11:47:44 +0530
Subject: [BiO BB] how to find gene neighbours
In-Reply-To: <20060202024631.50131.qmail@web34701.mail.mud.yahoo.com>
References: <43E1470A.2000203@burnham.org> <20060202024631.50131.qmail@web34701.mail.mud.yahoo.com>
Message-ID: <20060202061744.M53505@nii.res.in>

Thanks Govind!

I have gi numbers of homologous proteins from different organisms (some of
which may be sequenced and some may not be; I don't know). I want to know
how to find gene neighbours, and then write a script that automates the
search for all the gi numbers that I have.
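The automation asked about here could be sketched roughly as follows. This is only an illustration under stated assumptions, not something proposed in the thread: it assumes the gene coordinates have already been exported from whichever genome database fits the organism into a simple table, and the record fields (`gi`, `replicon`, `start`, `end`), the function names, and all the numbers are hypothetical. It also treats each replicon as linear, so genes at the ends of a circular chromosome would not wrap around.

```python
from collections import defaultdict


def build_index(genes):
    """Group gene records by replicon and sort each group by start position."""
    index = defaultdict(list)
    for gene in genes:
        index[gene["replicon"]].append(gene)
    for records in index.values():
        records.sort(key=lambda g: g["start"])
    return index


def find_neighbours(index, gi, flank=2):
    """Return (left, right) lists of gi numbers flanking `gi` by coordinate.

    Returns None when the gi number is not in the table, for example
    because the organism has not been sequenced.
    """
    for records in index.values():
        for pos, gene in enumerate(records):
            if gene["gi"] == gi:
                left = records[max(0, pos - flank):pos]
                right = records[pos + 1:pos + 1 + flank]
                return ([g["gi"] for g in left], [g["gi"] for g in right])
    return None


# Hypothetical feature table: made-up gi numbers and coordinates.
GENES = [
    {"gi": 1001, "replicon": "chr", "start": 100, "end": 900},
    {"gi": 1002, "replicon": "chr", "start": 1000, "end": 1800},
    {"gi": 1003, "replicon": "chr", "start": 2000, "end": 2600},
    {"gi": 1004, "replicon": "chr", "start": 3000, "end": 3900},
    {"gi": 2001, "replicon": "plasmid", "start": 50, "end": 700},
]

if __name__ == "__main__":
    index = build_index(GENES)
    for gi in (1003, 2001, 9999):
        print(gi, find_neighbours(index, gi, flank=2))
```

Looping `find_neighbours` over the full list of gi numbers gives the batch behaviour; entries that come back as `None` are the ones with no genomic context in the table.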
Pankaj Khurana

--
Open WebMail Project (http://openwebmail.org)

---------- Original Message -----------
From: govind mk
To: idoerg at burnham.org, "The general forum at Bioinformatics.Org"
Sent: Wed, 1 Feb 2006 18:46:31 -0800 (PST)
Subject: Re: [BiO BB] how to find gene neighbours

> I think NCBI's Map Viewer can help you find neighbouring genes.
> Are you looking for an algorithm, or do you just want to map your
> genes onto genomes and identify their neighbours?
>
> -Govind
------- End of Original Message -------

From msr at asu.edu Thu Feb 2 17:12:55 2006
From: msr at asu.edu (Michael S. Rosenberg)
Date: Thu, 02 Feb 2006 15:12:55 -0700
Subject: [BiO BB] Genomes, Evolution & Bioinformatics Conference
Message-ID: <43E283E7.2000208@asu.edu>

SMBE 2006 Conference (May 24 - May 28, 2006)
Genomes, Evolution & Bioinformatics
Arizona State University, Tempe, Arizona, USA

The conference will open on the evening of Wednesday, May 24 with a Welcome
Social and Registration from 7:00 p.m. - 11:00 p.m. The opening symposia and
contributed sessions will begin at 8:00 a.m. on May 25. The closing symposia
and contributed sessions will take place from 8:00 a.m. - 12:00 noon on
Sunday, May 28. A schedule of events is at http://www.smbe.org/geb/events.htm

To register, visit http://www.smbe.org/geb/registration.php
(Early registration from Feb 1 - April 1, 2006).

Abstract submission: http://www.smbe.org/geb/abstracts.php
For poster presentations and invited talks, we only require you to provide a
title and authors. For contributed talks and for consideration for travel
awards, you need to submit a short abstract as well. (Submissions accepted
from February 1 - March 15, 2006).

Over 50 leading experts in Genomics, Evolutionary Biology, and
Bioinformatics have been invited and confirmed (see a list at
http://www.smbe.org/geb/speakers).

Highlights of the scientific program at GEB2006 include:

(1) A Keynote Address every morning and over 20 invited symposia
(2) "Fitch Legacy" and "Nei Legacy" symposia celebrating the achievements of
    world-renowned students and academic associates of Walter M. Fitch and
    Masatoshi Nei, co-founders of the journal MBE.
(3) SMBE Awards Banquet for Council Awards for "Life time achievement"
    (Dr. Tomoko Ohta) and "Service to Evolutionary Biology Community" (Dr.
    Brian Golding)
(4) A NASA Astrobiology Institute symposium on "Discovering the Timetree of
    Life"
(5) Many Graduate and Undergraduate Student Travel awards sponsored by SMBE
    (http://www.smbe.org/geb/awards.htm).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Michael S. Rosenberg, Ph.D.
Assistant Professor
School of Life Sciences / Arizona State University
msr at asu.edu
http://www.public.asu.edu/~mrosenb/Lab

From wrs06 at redstar.cs.pdx.edu Wed Feb 1 14:06:36 2006
From: wrs06 at redstar.cs.pdx.edu (wrs06 at redstar.cs.pdx.edu)
Date: Wed, 1 Feb 2006 11:06:36 -0800
Subject: [BiO BB] WRS06 1st call for paper
Message-ID: <200602011906.k11J6aKC010743@redstar.cs.pdx.edu>

[Our apologies if you receive multiple copies]

-----------------------------------------------------------------------
WRS06
The Sixth International Workshop on Reduction Strategies
in Rewriting and Programming
http://www.cs.pdx.edu/~antoy/wrs06/
The Seattle Sheraton Hotel and Towers, Seattle, Washington,
August 11, 2006

Scope

The workshop intends to promote and stimulate international research and
collaboration in the area of evaluation strategies. It encourages the
presentation of new directions, developments and results as well as surveys
and tutorials on existing knowledge in this area.

Reduction strategies study which subexpression(s) of an expression should
be selected for evaluation and which rule(s) should be applied. These
choices affect fundamental properties of a computation such as laziness,
strictness, completeness and need, to name a few. For this reason some
programming languages, e.g., Elan, Maude, *OBJ* and Stratego, allow the
explicit definition of the evaluation strategy, whereas other languages,
e.g., Clean, Curry, and Haskell, allow its modification. Strategies pose
challenging theoretical problems and play an important role in practical
tools such as theorem provers, model checkers and programming languages.
In implementations of languages, strategies bridge the gap between
operational principles, e.g., graph and term rewriting, narrowing and
lambda-calculus, and semantics, e.g., normalization, computation of values
and head-normalization.

The previous editions of the workshop were: WRS 2001 (Utrecht, The
Netherlands), WRS 2002 (Copenhagen, Denmark), WRS 2003 (Valencia, Spain),
WRS 2004 (Aachen, Germany), and WRS 2005 (Nara, Japan). See also the WRS
permanent page at http://www.dsic.upv.es/~wrs/

Important Dates

Abstract Submission:  May 8, 2006
Paper Submission:     May 15, 2006
Author Notification:  June 12, 2006
Camera-Ready:         July 10, 2006
Conference:           Aug 11, 2006

Program Committee

Sergio Antoy (chair), Portland State University
Santiago Escobar, Universidad Politecnica de Valencia
Juergen Giesl, RWTH Aachen
Bernhard Gramlich, Technische Universitat Wien
Ralf Laemmel, Microsoft Corp.
Salvador Lucas, Universidad Politecnica de Valencia
Narciso Marti-Oliet, Universidad Complutense de Madrid
Mizuhito Ogawa, Japan Advanced Institute of Science and Technology
Jaco van de Pol, Centrum voor Wiskunde en Informatica
Manfred Schmidt-Schauss, Johann Wolfgang Goethe-Universitat

Topics

Topics of interest include, but are not restricted to:

o theoretical foundations for the definition and semantic description of
  reduction strategies
o strategies in different frameworks such as term rewriting, graph
  rewriting, infinitary rewriting, lambda calculi, higher order rewriting,
  conditional rewriting, rewriting with built-ins, narrowing, constraint
  solving, etc.
o application of strategies to equational, functional, functional-logic
  programming languages
o properties of reduction strategies and corresponding computations, e.g.,
  completeness, computability, decidability, complexity, optimality,
  normalization, cofinality, fairness, perpetuality, context-freedom, need,
  laziness, eagerness, strictness
o interrelations, combinations and applications of reduction under
  different strategies, e.g., evaluation mechanisms in programming
  languages, equivalence conditions for fundamental properties like
  termination and confluence, applications in modularity analysis,
  connections between strategies of different frameworks, etc.
o program analysis and other semantics-based optimization techniques
  dealing with reduction strategies
o rewrite systems, tools, implementations with flexible or programmable
  strategies as an essential concept or ingredient
o specification of reduction strategies in real languages
o strategies suitable to software engineering problems and applications
o tutorials and systems related to evaluation strategies

Submissions

Submissions must be original and not submitted for publication elsewhere.
The page limit for regular papers is 13 pages in Springer Verlag LNCS
style. Surveys and tutorials may be longer. Use the WRS06 submission page,
handled by the EasyChair conference system, to submit abstracts and papers
and to update a previous submission.

Publication

Informal proceedings of accepted contributions will be available on-line.
A hard copy will be distributed at the workshop to registered participants.
Authors of selected contributions will be invited to submit a revised
version, after the workshop, for inclusion in a collection. We anticipate
the publication of formal proceedings in the Elsevier ENTCS series.

Contact

Sergio Antoy, antoy at cs.pdx.edu.
From fcjooty at yahoo.com Thu Feb 2 18:31:16 2006
From: fcjooty at yahoo.com (Franklin Jose)
Date: Thu, 2 Feb 2006 15:31:16 -0800 (PST)
Subject: [BiO BB] (no subject)
Message-ID: <20060202233116.44335.qmail@web54213.mail.yahoo.com>

Hi everybody,

Is there any difference between tools, methods and algorithms in sequence
alignment?

Franklin
Lecturer
Govt. Arts College
Udhagai-643002

From journalshoyaib at gmail.com Fri Feb 3 06:26:03 2006
From: journalshoyaib at gmail.com (shohag md)
Date: Fri, 3 Feb 2006 17:26:03 +0600
Subject: [BiO BB] How to calculate the value of K and Lambda for two sequence alignment
Message-ID:

Hi Everybody,

I want to align two sequences using the Smith-Waterman algorithm. After
that I want to calculate the expectation value. For calculating the
expectation value we know that

    E = K m n e^{-\lambda x}

But how can I calculate the values of K and \lambda? Is there any formula
that can help me calculate K and \lambda, and then the expectation value?

Thanking all in advance,

Shoyaib

From yqzhou at buffalo.edu Fri Feb 3 08:32:30 2006
From: yqzhou at buffalo.edu (Yaoqi Zhou)
Date: Fri, 3 Feb 2006 08:32:30 -0500
Subject: [BiO BB] (no subject)
In-Reply-To: <20060202233116.44335.qmail@web54213.mail.yahoo.com>
References: <20060202233116.44335.qmail@web54213.mail.yahoo.com>
Message-ID:

The most recent comparison can be found in the paper comparing SPEM with
other sequence alignment methods:

H. Zhou and Y. Zhou, "SPEM: Improving multiple-sequence alignment with
sequence profiles and predicted secondary structures", Bioinformatics 21,
3615-3621 (2005).
Download and/or run the SPEM server at http://theory.med.buffalo.edu

Yaoqi Zhou, Associate Professor
**CHECK out Newly UPDATED webpage** on http://theory.med.buffalo.edu
Department of Physiology and Biophysics
University at Buffalo, State University of New York
124 Sherman Hall, Buffalo, NY 14214
(716) 829-2985 Fax (716) 829-2344
Email: yqzhou at buffalo.edu
NEW Office/Lab Address: 306/308 Cary Hall

On Feb 2, 2006, at 6:31 PM, Franklin Jose wrote:

> Hi everybody,
> Is there any difference between tools, methods and algorithms in
> sequence alignment?
>
> Franklin
> Lecturer
> Govt. Arts College
> Udhagai-643002

From maximilianh at gmail.com Fri Feb 3 09:51:19 2006
From: maximilianh at gmail.com (Maximilian Haeussler)
Date: Fri, 3 Feb 2006 15:51:19 +0100
Subject: [BiO BB] Sequence Editor for Unix?
Message-ID: <76f031ae0602030651n7522df0dq@mail.gmail.com>

Hi,

in 2000 and 2002 someone asked on this list if there was a good
replacement for the Bioedit sequence editor for Unix. Some time has passed
since then, so I'm asking the question again:

What are you using as a sequence editor on Linux if you have a biologist
standing next to you who'd like to see a graphical view of the alignment of
a couple of sequences in Blast, Clustal, etc.? Bioedit could do all of this
easily and quickly without any hassles.

I've tried a couple of programs. Personally, I don't like Java Swing:
the interfaces feel rather sluggish. Apollo/Artemis take ages to load if
you just want to have a look at an alignment or run a Blast. Strap has a
strange interface. Seqpup hasn't been updated for years. Clustalx is OK for
alignment viewing but can only run ClustalW.
Maybe just a small executable, C-style, that can plot blast/gff-files and
zoom, might be sufficient...

What are you using?

Thanks a lot,
Max

From pandeswati at gmail.com Fri Feb 3 16:23:18 2006
From: pandeswati at gmail.com (Swati Pande)
Date: Fri, 3 Feb 2006 13:23:18 -0800
Subject: [BiO BB] naccess
Message-ID: <3ec517850602031323y40f213e4v569d2832c93b1e19@mail.gmail.com>

Hi,

I was wondering if anyone has naccess installed on Linux? The problem is
that there seems to be no way to decrypt the file on Linux. I tried
installing mcrypt, which didn't work out, and I was wondering if anyone
has had any luck with this? I really need to get naccess going.

Thanks,
Swati

From rwang at bccrc.ca Fri Feb 3 17:51:31 2006
From: rwang at bccrc.ca (Renxue Wang)
Date: Fri, 3 Feb 2006 14:51:31 -0800
Subject: [BiO BB] aligment of large sequences
Message-ID: <0BE438149FF2254DB4199E2682C8DFEB702971@crcmail1.BCCRC.CA>

Hi there,

Does anybody know of a good program for alignment/comparison of large
sequences, e.g., two genomes of closely related species, for identifying
inversions, deletions and translocations?

Thanks,

Renxue Wang

From oepresearch at bellsouth.net Fri Feb 3 19:35:05 2006
From: oepresearch at bellsouth.net (Octavio E. Pajaro)
Date: Fri, 3 Feb 2006 19:35:05 -0500
Subject: [BiO BB] naccess
In-Reply-To: <3ec517850602031323y40f213e4v569d2832c93b1e19@mail.gmail.com>
Message-ID: <20060204002638.NBHH2815.ibm62aec.bellsouth.net@masf>

I have had the same problem. I could never decrypt the program.

Octavio

-----Original Message-----
From: bio_bulletin_board-bounces+oepresearch=bellsouth.net at bioinformatics.org
[mailto:bio_bulletin_board-bounces+oepresearch=bellsouth.net at bioinformatics.org]
On Behalf Of Swati Pande
Sent: Friday, February 03, 2006 4:23 PM
To: bio_bulletin_board at bioinformatics.org
Subject: [BiO BB] naccess

Hi,
I was wondering if anyone has naccess installed on Linux? The problem is
that there seems to be no way to decrypt the file on Linux. I tried
installing mcrypt, which didn't work out, and I was wondering if anyone
has had any luck with this? I really need to get naccess going.
Thanks,
Swati

From hamid at ibb.ut.ac.ir Sat Feb 4 02:15:07 2006
From: hamid at ibb.ut.ac.ir (hamid)
Date: Sat, 04 Feb 2006 10:45:07 +0330
Subject: [BiO BB] aligment of large sequences
In-Reply-To: <0BE438149FF2254DB4199E2682C8DFEB702971@crcmail1.BCCRC.CA>
References: <0BE438149FF2254DB4199E2682C8DFEB702971@crcmail1.BCCRC.CA>
Message-ID:

Dear Wang,

MUMmer is a suitable program for this purpose. Note that it only runs on
Unix-based operating systems.

Yours,
Hamid

/*
Hamid Nikbakht,
M.Sc of Cell and Molecular Biology,
Laboratory of Biophysics and Molecular Biology,
Bioinformatics Department,
Institute of Biochemistry and Biophysics (IBB),
University of Tehran, Tehran, Iran.
Tel: +98-21-6111-3322
Fax: +98-21-640-4680
E-Mail: hamid at ibb.ut.ac.ir
Alt. E-mail: nikbakht at ibb.ut.ac.ir
*/

From william.hsiao at gmail.com Sat Feb 4 03:15:51 2006
From: william.hsiao at gmail.com (William Hsiao)
Date: Sat, 4 Feb 2006 00:15:51 -0800
Subject: [BiO BB] aligment of large sequences
In-Reply-To: <0BE438149FF2254DB4199E2682C8DFEB702971@crcmail1.BCCRC.CA>
References: <0BE438149FF2254DB4199E2682C8DFEB702971@crcmail1.BCCRC.CA>
Message-ID: <679a35b20602040015y8595febif95a80e740474b88@mail.gmail.com>

Hi Renxue,

You might want to try Mauve: http://gel.ahabs.wisc.edu/mauve/

Cheers,

Will

On 2/3/06, Renxue Wang wrote:
> Hi there,
>
> Does anybody know of a good program for alignment/comparison of large
> sequences, e.g., two genomes of closely related species, for identifying
> inversions, deletions and translocations?
>
> Thanks,
>
> Renxue Wang

--
William Hsiao
PhD Student, Brinkman Laboratory
Department of Molecular Biology and Biochemistry
Simon Fraser University, 8888 University Dr. Burnaby, BC, Canada V5A 1S6
Phone: 604-291-4206 Fax: 604-291-5583

From boris.steipe at utoronto.ca Sat Feb 4 09:04:11 2006
From: boris.steipe at utoronto.ca (Boris Steipe)
Date: Sat, 4 Feb 2006 09:04:11 -0500
Subject: [BiO BB] aligment of large sequences
In-Reply-To: <679a35b20602040015y8595febif95a80e740474b88@mail.gmail.com>
References: <0BE438149FF2254DB4199E2682C8DFEB702971@crcmail1.BCCRC.CA> <679a35b20602040015y8595febif95a80e740474b88@mail.gmail.com>
Message-ID: <90E9EEE7-5A13-43CB-9F8E-3FAD12EEBD20@utoronto.ca>

You could also try LAGAN:

http://lagan.stanford.edu/lagan_web/index.shtml

In particular, Shuffle_Lagan handles inversions, transpositions(!) and
even some degree of duplications.

HTH,
B.

On 4 Feb 2006, at 03:15, William Hsiao wrote:

> Hi Renxue,
> You might want to try http://gel.ahabs.wisc.edu/mauve/
>
> Cheers,
>
> Will

From sravan_111 at rediffmail.com Mon Feb 6 05:46:51 2006
From: sravan_111 at rediffmail.com (sravan sravan)
Date: 6 Feb 2006 10:46:51 -0000
Subject: [BiO BB] (no subject)
Message-ID: <20060206104651.28130.qmail@webmail10.rediffmail.com>

Dear friends,

I am working on a study of microsatellites. For this, I have to generate
the equivalence classes of repeat motifs. I would appreciate your help in
the form of source code, a methodology, or a detailed algorithm / pseudo
code.

Ex: AA,AC,AG,AT,CA,CC,CG,CT,GA,GC,GG,GT,TA,TC,TG,TT :

AC : AC,CA,GT,TG
AG : AG,GA,CT,TC
------------------------------------------------------------------
In addition, for tri-, tetra-, penta- and hexanucleotide motifs:

Ex: ACG,CGA,GAC
    AGC,GCA,CAG
should be in two separate classes. But I am getting them in a single class
because they have the same nucleotide composition.

Please suggest a solution for this, and mention any additional issues you
have come across.
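One way to keep ACG and AGC apart is to compare motifs by the order of their bases rather than by composition. The sketch below is only an illustration, not an established tool: it assumes that two motifs are equivalent when one is a cyclic rotation of the other or of its reverse complement (which is what the AC and AG examples above suggest), and the function names are made up for this example. Each motif is mapped to the alphabetically smallest member of its class, and motifs sharing that representative are grouped together.

```python
from itertools import product

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def rotations(motif):
    """Return all cyclic rotations of a motif, including the motif itself."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical(motif):
    """Smallest string among the rotations of the motif and of its
    reverse complement; used as the class representative."""
    return min(rotations(motif) + rotations(reverse_complement(motif)))


def motif_classes(k):
    """Group all 4**k motifs of length k by their canonical representative."""
    classes = {}
    for letters in product("ACGT", repeat=k):
        motif = "".join(letters)
        classes.setdefault(canonical(motif), set()).add(motif)
    return classes


if __name__ == "__main__":
    for representative, members in sorted(motif_classes(2).items()):
        print(representative, ":", ",".join(sorted(members)))
    # ACG and AGC have the same composition but different representatives.
    print(canonical("CGA"), canonical("CAG"))
```

Two caveats on this sketch: it does not filter out motifs that are really a shorter repeat written twice (for example ACAC as a tetranucleotide), and if the two strands should be kept apart, the reverse-complement rotations can simply be left out of `canonical`.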
I appreciate your early response, as it is an urgent requirement. Thank you very much. Regads. Sravan. ? Dear Bio_Bulletin_Board, This is an urgent requirement. Hence, I request for your early response. Thanks. Sravan. -------------- next part -------------- An HTML attachment was scrubbed... URL: From narcis at fiserlab.org Mon Feb 6 11:29:41 2006 From: narcis at fiserlab.org (Narcis Fernandez-Fuentes) Date: Mon, 06 Feb 2006 11:29:41 -0500 Subject: [BiO BB] naccess In-Reply-To: <20060204002638.NBHH2815.ibm62aec.bellsouth.net@masf> References: <20060204002638.NBHH2815.ibm62aec.bellsouth.net@masf> Message-ID: <43E77975.9050606@fiserlab.org> I think you should contact naccess authors' Octavio E. Pajaro wrote: > I have had the same problem. I could never decrypt the program. > Octavio > > -----Original Message----- > From: > bio_bulletin_board-bounces+oepresearch=bellsouth.net at bioinformatics.org > [mailto:bio_bulletin_board-bounces+oepresearch=bellsouth.net at bioinformatics. > org] On Behalf Of Swati Pande > Sent: Friday, February 03, 2006 4:23 PM > To: bio_bulletin_board at bioinformatics.org > Subject: [BiO BB] naccess > > hi > I was wondering if anyone has naccess installed on linux? the problem > is that there is no way to de crypt the file on linux I tried > installing mcrypt which didnt work out and I was wondering if anyone > has had any luck with this? I really need to get naccess going. 
> thanks > Swati > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > -- From lato4864 at uidaho.edu Mon Feb 6 12:27:03 2006 From: lato4864 at uidaho.edu (Andrew Latos) Date: Mon, 06 Feb 2006 09:27:03 -0800 Subject: [BiO BB] naccess In-Reply-To: <43E77975.9050606@fiserlab.org> References: <20060204002638.NBHH2815.ibm62aec.bellsouth.net@masf> <43E77975.9050606@fiserlab.org> Message-ID: Hi there, you get the decryption key from your registration. Then you use your decryption software in linux to read it. It took me a couple of tries to find the right combination but eventually it will work. cheers, Andrew L :-) ----- Original Message ----- From: Narcis Fernandez-Fuentes Date: Monday, February 6, 2006 8:30 am Subject: Re: [BiO BB] naccess To: "The general forum at Bioinformatics.Org" > > I think you should contact naccess authors' > > > Octavio E. Pajaro wrote: > > I have had the same problem. I could never decrypt the program. > > Octavio > > > > -----Original Message----- > > From: > > bio_bulletin_board- > bounces+oepresearch=bellsouth.net at bioinformatics.org> > [mailto:bio_bulletin_board-bounces+oepresearch=bellsouth.net at bioinformatics. > > org] On Behalf Of Swati Pande > > Sent: Friday, February 03, 2006 4:23 PM > > To: bio_bulletin_board at bioinformatics.org > > Subject: [BiO BB] naccess > > > > hi > > I was wondering if anyone has naccess installed on linux? the > problem> is that there is no way to de crypt the file on linux I tried > > installing mcrypt which didnt work out and I was wondering if anyone > > has had any luck with this? I really need to get naccess going. 
> > thanks
> > Swati

From rwang at bccrc.ca Mon Feb 6 13:24:25 2006
From: rwang at bccrc.ca (Renxue Wang)
Date: Mon, 6 Feb 2006 10:24:25 -0800
Subject: [BiO BB] aligment of large sequences
Message-ID: <0BE438149FF2254DB4199E2682C8DFEB702975@crcmail1.BCCRC.CA>

Thanks so much. :)

Renxue

-----Original Message-----
From: bio_bulletin_board-bounces+rwang=bccrc.ca at bioinformatics.org
[mailto:bio_bulletin_board-bounces+rwang=bccrc.ca at bioinformatics.org] On Behalf Of Boris Steipe
Sent: Saturday, February 04, 2006 6:04 AM
To: The general forum at Bioinformatics.Org
Subject: Re: [BiO BB] aligment of large sequences

You could also try LAGAN
http://lagan.stanford.edu/lagan_web/index.shtml
in particular Shuffle_Lagan handles inversions, transpositions(!) and even some degree of duplications.

HTH,
B.

On 4 Feb 2006, at 03:15, William Hsiao wrote:

> Hi Renxue,
> You might want to try http://gel.ahabs.wisc.edu/mauve/
>
> Cheers,
>
> Will
>
> On 2/3/06, Renxue Wang wrote:
>> Hi, there,
>>
>> Anybody knows any good program for alignment/comparison of large
>> sequences, e.g., two genomes of closely related species, for
>> identifying the inversion deletion and translocation.
>>
>> Thanks,
>>
>> Renxue Wang
>>
>> -----Original Message-----
>> From: bio_bulletin_board-bounces+rwang=bccrc.ca at bioinformatics.org
>> [mailto:bio_bulletin_board-bounces+rwang=bccrc.ca at bioinformatics.org] On Behalf Of Yaoqi Zhou
>> Sent: Friday, February 03, 2006 5:33 AM
>> To: The general forum at Bioinformatics.Org
>> Subject: Re: [BiO BB] (no subject)
>>
>> The most recent comparison can be found in comparing SPEM with other
>> sequence alignment methods.
>>
>> H. Zhou and Y. Zhou, ``SPEM: Improving multiple-sequence alignment with
>> sequence profiles and predicted secondary structures'', Bioinformatics
>> 21, 3615--3621 (2005).
>>
>> Download and/or run SPEM server at http://theory.med.buffalo.edu
>>
>> Yaoqi Zhou, Associate Professor
>> **CHECK out Newly UPDATED webpage** on http://theory.med.buffalo.edu
>> Department of Physiology and Biophysics
>> University at Buffalo, State University of New York
>> 124 Sherman Hall, Buffalo, NY 14214
>> (716) 829-2985 Fax (716) 829-2344
>> Email: yqzhou at buffalo.edu
>> NEW Office/Lab Address: 306/308 Cary Hall
>>
>> On Feb 2, 2006, at 6:31 PM, Franklin Jose wrote:
>>
>>> Hai everybody
>>> Is there any difference between tools, methods and algorithms in
>>> sequence alignment?
>>>
>>> Franklin
>>> Lecturer
>>> Govt. Arts College
>>> Udhagai-643002
>>>
>>> Relax. Yahoo!
Mail virus scanning helps detect nasty viruses!
>
> --
> William Hsiao
> PhD Student, Brinkman Laboratory
> Department of Molecular Biology and Biochemistry
> Simon Fraser University, 8888 University Dr. Burnaby, BC, Canada V5A 1S6
> Phone: 604-291-4206 Fax: 604-291-5583

_______________________________________________
Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bio_bulletin_board

From pandeswati at gmail.com Mon Feb 6 13:54:27 2006
From: pandeswati at gmail.com (Swati Pande)
Date: Mon, 6 Feb 2006 10:54:27 -0800
Subject: [BiO BB] naccess:problem solved
Message-ID: <3ec517850602061054g2ec17f20kc0e7cbf64ebe125e@mail.gmail.com>

Hi

I managed to install it. You need mcrypt for this; I had a friend who had mcrypt installed on Ubuntu decrypt it for me.
thanks so much
Swati

From maximilianh at gmail.com Tue Feb 7 06:50:31 2006
From: maximilianh at gmail.com (Maximilian Haeussler)
Date: Tue, 7 Feb 2006 12:50:31 +0100
Subject: [BiO BB] aligment of large sequences
In-Reply-To: <0BE438149FF2254DB4199E2682C8DFEB702971@crcmail1.BCCRC.CA>
References: <0BE438149FF2254DB4199E2682C8DFEB702971@crcmail1.BCCRC.CA>
Message-ID: <76f031ae0602070350y2766ef3as@mail.gmail.com>

UCSC is using Blastz for the actual alignment and TBA for the postprocessing.
http://www.bx.psu.edu/miller_lab/

Max

On 03/02/06, Renxue Wang wrote:
> Hi, there,
>
> Anybody knows any good program for alignment/comparison of large sequences, e.g., two genomes of closely related species, for identifying the inversion deletion and translocation.
>
> Thanks,
>
> Renxue Wang
>
> -----Original Message-----
> From: bio_bulletin_board-bounces+rwang=bccrc.ca at bioinformatics.org
> [mailto:bio_bulletin_board-bounces+rwang=bccrc.ca at bioinformatics.org] On Behalf Of Yaoqi Zhou
> Sent: Friday, February 03, 2006 5:33 AM
> To: The general forum at Bioinformatics.Org
> Subject: Re: [BiO BB] (no subject)
>
> The most recent comparison can be found in comparing SPEM with other
> sequence alignment methods.
>
> H. Zhou and Y. Zhou, ``SPEM: Improving multiple-sequence alignment with
> sequence profiles and predicted secondary structures'', Bioinformatics
> 21, 3615--3621 (2005).
>
> Download and/or run SPEM server at http://theory.med.buffalo.edu
>
> Yaoqi Zhou, Associate Professor
> **CHECK out Newly UPDATED webpage** on http://theory.med.buffalo.edu
> Department of Physiology and Biophysics
> University at Buffalo, State University of New York
> 124 Sherman Hall, Buffalo, NY 14214
> (716) 829-2985 Fax (716) 829-2344
> Email: yqzhou at buffalo.edu
> NEW Office/Lab Address: 306/308 Cary Hall
>
> On Feb 2, 2006, at 6:31 PM, Franklin Jose wrote:
>
> > Hai everybody
> > Is there any difference between tools, methods and algorithms in
> > sequence alignment?
> >
> > Franklin
> > Lecturer
> > Govt. Arts College
> > Udhagai-643002

--
Maximilian Haeussler, CNRS Gif-sur-Yvette, Paris
tel: +33 6 12 82 76 16
icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de
skype: maximilianhaeussler

From bioinfosm at gmail.com Tue Feb 7 12:16:10 2006
From: bioinfosm at gmail.com (Samantha Fox)
Date: Tue, 7 Feb 2006 12:16:10 -0500
Subject: [BiO BB] Updating NCBI databases
Message-ID: <726450810602070916g2f525b20rae9b09f7183e4209@mail.gmail.com>

Hi,

I have a question regarding regular updates of BLAST databases. Is there a standard way to move the updated databases to the user section while making sure that the current copy is not already in use?

Suppose I update the database monthly, but a user has a big BLAST job running when my cron script starts the update. This can lead to errors. How can I prevent that?

Thanks,
~S
From areed at imdc.org Tue Feb 7 12:45:16 2006
From: areed at imdc.org (Ann Reed)
Date: Tue, 7 Feb 2006 11:45:16 -0600
Subject: [BiO BB] Education: Tuition-free on-line bioinformatics training program seeks students
Message-ID: <7BDF6464945AB045B845C2AB588B9B401B88AB@204.21.125.209.transedge.com>

BiTmaP: Midwest Bioinformatics Training Program, is seeking students for the semester that begins in May 2006. All tuition and course materials are paid for by a generous grant from the U.S. Dept. of Labor.

BiTmaP courses are accredited and provided by the University of Illinois at Chicago (UIC). Our program consists of on-line lectures, industry seminars and work-site and internship experiences that focus on standard bioinformatics principles and protocols used widely throughout industry and academia. Students complete 3 of the following 4 courses and earn 12 graduate-level credits and a certificate in bioinformatics from UIC.

BiTmaP courses are: Introduction to Bioinformatics, Biostatistics, Functional Computational Genomics and Microarray, Molecular Modeling in Bioinformatics.

To pre-qualify for training, send a resume or CV to apply at bitmapchicago.com

For more information, please visit http://www.bitmapchicago.com or email Program Director, Ann Reed at areed at bitmapchicago.com

From b.d.vanschaik at amc.uva.nl Wed Feb 8 06:40:16 2006
From: b.d.vanschaik at amc.uva.nl (Barbera van Schaik)
Date: Wed, 08 Feb 2006 12:40:16 +0100
Subject: [BiO BB] 3rd International Symposium on Networks in Bioinformatics
Message-ID: <43E9D8A0.1080002@amc.uva.nl>

3rd International Symposium on Networks in Bioinformatics

29, 30 and 31 May 2006
Amsterdam, the Netherlands
http://isnb.amc.uva.nl/

Deadline for abstract submission: 1 April 2006
Early-bird registration: before 1 April 2006

Symposium focus: Bioinformatics of networks

Bioinformatics of biological networks involves a range of interconnected multidisciplinary research topics.
Research areas include the quantitative understanding of the dynamics of regulatory and metabolic networks by using modeling and simulation techniques, the reconstruction of biological pathways from experimental data, identification of pathway modules, the analysis and interpretation of experimental data in the context of biological networks, the construction and use of (public) pathway databases, network visualization and the development and use of pathway markup languages such as SBML and BioPax. Biological questions and new experimental techniques as well as ongoing (bio)informatics and statistics efforts will guide the development of the next generation of bioinformatics software packages. The combination of computational and genomics research will accelerate the detailed understanding of biological networks, which will find many applications in all application domains of life sciences.

Bridging the gap between disciplines

ISNB is specifically aimed at researchers working in life sciences, which includes disciplines like molecular biology, genomics, bioinformatics, biostatistics, informatics, computational life sciences and mathematical biology. Since more and more researchers are focusing on biological networks from different perspectives, there is large interest in a dedicated symposium like the ISNB. ISNB provides a platform to bring together the different researchers in this field to exchange ideas and facilitate new national and international collaborations. Experience from ISNB2004 and ISNB2005 shows that there is significant interest in this symposium. This symposium will also help to merge ideas and research from scientific programs in the field of bioinformatics, computational life sciences (e.g. simulation and modeling) and the experimental genomics work.

Scientific program and tutorial lectures

The Third International Symposium on Networks in Bioinformatics (ISNB 2006) is likely to continue its success from previous years by bringing together different disciplines to discuss ongoing research in this exciting field of biological networks and bioinformatics. In continuation of ISNB 2005 we aim to organize a three day meeting during which we will again schedule a mix of tutorial lectures and scientific presentations. The scientific program includes research and poster presentations from Dutch and internationally acknowledged researchers but also from young researchers starting in the field. The tutorial lectures provide an excellent opportunity to have well known researchers in the field give an introduction to junior and senior researchers and students who recently started working in related projects. The setup of the symposium provides ample opportunity to meet and talk to fellow researchers, which will facilitate many new and exciting collaborations and research projects.

From joel.dudley at asu.edu Wed Feb 8 15:24:27 2006
From: joel.dudley at asu.edu (Joel Dudley)
Date: Wed, 8 Feb 2006 13:24:27 -0700
Subject: [BiO BB] MacResearch.org announces iPod giveaway contest
Message-ID: <23FF1B53-F91F-434B-8DD8-05D936A1E55E@asu.edu>

Help MacResearch.org expand its Script Repository and you could win a black 2GB iPod Nano. Eligible contestants must submit a research-oriented script that can run natively (no emulators) on Mac OS X 10.3 or higher without modification before the contest end date. Scripts for all scientific domains are welcome including scripts written for High Performance Computing (grid, cluster, etc) setup and management. If your script does not meet the aforementioned criteria then you will not be eligible to win the iPod Nano. Winners will be chosen by random drawing.
The contest begins 2/8/2006 and ends 2/28/2006.

The ultimate goal of this contest, and the script repository in general, is to create a valuable community resource that can be used to benefit endeavors in research and education. Please don't be shy about your coding style or lack of documentation. Your script will make someone's life easier.

To learn more about MacResearch.org and the MacResearch.org Script Repository visit http://www.macresearch.org and http://www.macresearch.org/script_repository.

About MacResearch.org:

MacResearch.org is the premier community for scientists using Mac OS X and related hardware in their research. It is the mission of MacResearch.org to cultivate a knowledgeable and vibrant community of researchers to exchange ideas and information, build a community knowledge-base, and collectively escalate the prominence of Apple technologies in the scientific research community.

Official Rules:

Eligible entrants must submit a script to the MacResearch.org Script Repository using the script submission form available through MacResearch.org (see http://www.macresearch.org/script_repository). The submitter becomes eligible for the drawing when their script is approved by the MacResearch.org executive committee and published in the public Script Repository. This sweepstakes is open to persons over 18 years of age. Limit one entry per person. No purchase necessary. All entries must be received before 5:00 pm PST on February 28th, 2006. The prize is one (1) Apple iPod Nano Black 2GB. One winner will be selected within forty-eight (48) hours of Contest End Date in a random drawing. Drawing will be conducted by the MacResearch.org executive committee, whose decision is final on all matters relating to this sweepstakes. The winner need not be present at the drawing to win. Odds of winning are dependent upon the total number of entries received. Limit one prize per person. Winner will be notified by e-mail within seventy-two (72) hours of the drawing date.
Prizes must be claimed within two weeks of the drawing date. Winners are responsible for all applicable taxes. If the Sponsor is unable to locate a given winner, an alternate winner will be selected by a random drawing. All prizes will be awarded and are non-transferable. No cash or other substitutions are allowed except by Sponsor's sole election due to prize unavailability. By submitting an entry for this Sweepstakes, participants agree to abide by these official rules and any decision Sponsor makes regarding this promotion. Sponsor reserves the right to disqualify from the Sweepstakes, and to prosecute to the fullest extent permitted by law, any participant or winner who, in Sponsor's reasonable suspicion, tampers with the MacResearch.org website, the entry process, intentionally submits more than a single entry or mechanical entries, violates these official rules, or acts in an unsportsmanlike or disruptive manner. By entering the sweepstakes, the entrant (a) agrees to the Official Rules and that the decisions of the Sponsor shall be final in all respects; (b) consents to the use of winners' names and likenesses and any statements, quotes or testimonials provided by the winners for advertising and publicity purposes without further compensation, except where prohibited by law; and (c) releases Sponsor, its subsidiaries, and affiliates, and their directors, officers, employees and agents from any and all liability for any injuries, losses or damages of any kind caused by any prize or resulting from acceptance, possession or use of any prize. The promotion and the rights and obligations of Sponsor and participants will be governed and controlled by the laws of the State of Arizona, applicable to contracts made and performed therein without reference to the applicable choice of law provisions. All actions, proceedings, and litigation relating hereto will be instituted and prosecuted solely within the State of Arizona, Maricopa County.
The parties consent to the jurisdiction of the state courts of Arizona and federal court located within the state and county with respect to any action, dispute, or other matter pertaining to or arising out of this promotion. This promotion is not affiliated in any way with Apple Computer, Inc. Apple, the Apple logo, and iPod are trademarks of Apple Computer, Inc. registered in the U.S. and other countries.

From jeff at bioinformatics.org Wed Feb 8 18:46:40 2006
From: jeff at bioinformatics.org (J.W. Bizzaro)
Date: Wed, 08 Feb 2006 18:46:40 -0500
Subject: [BiO BB] Fwd: Call for Papers2006 Summer Computer Simulation Conference (SCSC'06)
Message-ID: <43EA82E0.2070209@bioinformatics.org>

Call for Papers
2006 Summer Computer Simulation Conference (SCSC'06)
July 30 - August 3, 2006
Coast Plaza Hotel
Calgary, Alberta, Canada
http://www.scs.org/confernc/summersim/summersim06/cfp/scsc06.htm

Track: Bioinformatics
Track chair: Isaac Barjis
Affiliation: New York City College of Technology, City University of New York
Contact: Email: ibarjis at cityech.cuny.edu; Tel.: (718) 260-5285; Fax: (718) 254-8680

Introduction

The Bioinformatics track is organized within the SCSC and aims at understanding living systems through studying and retrieving biological information using computational methods, approaches, computer tools and graphics. Although Bioinformatics is an established subject, its study and investigation are exposing more and more challenges, perspectives and opportunities. Results of the studies in this field have profound impacts on all fields of biology as a subject, development of science in general, and wealth of human society and health. This fascinating new field of research requires valuable research efforts and therefore you are invited to consider this track for discussion and publication of your outstanding research works.
Although the track welcomes all related works, authors are especially encouraged to submit their original research findings on the topics suggested below.

Suggested Topics:

1) Systems Biology Modeling & Simulation
2) Protein Structure Prediction and Modeling
3) Medical Informatics
4) Chemical/Molecular Bioinformatics
5) Metabolic Pathways
6) Gene Identification, Annotation and Expression
7) Biological Data Mining, Modeling & Integration
8) Pharmacogenetics and Pharmacogenomics
9) Analysis of Evolution and Phylogeny
10) Biological Data Visualization

Important Dates:

Full Draft Paper/Extended Abstract: March 15, 2006
Preliminary Notification of Acceptance: May 2, 2006
Final Camera Ready Submission Due: May 23, 2006

Please submit the extended abstract (4-6 pages) as a PDF file via http://mc.manuscriptcentral.com/scsc (New users, please create an account first: to create an account, on the mentioned site, click on "Create Account" in the upper left-hand corner)

Dr. I. Barjis
Assistant Professor
Department of Physical and Biological Sciences
Room P313
300 Jay Street
Brooklyn, NY 11201
Phone: (718)2605285
Fax: (718)2548680
Fax: (718) 254-8595 Department Office
http://websupport1.citytech.cuny.edu/Faculty/ibarjis

From golharam at umdnj.edu Wed Feb 8 23:41:05 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Wed, 08 Feb 2006 23:41:05 -0500
Subject: [BiO BB] Updating NCBI databases
In-Reply-To: <726450810602070916g2f525b20rae9b09f7183e4209@mail.gmail.com>
Message-ID: <003101c62d33$0ac480b0$e6028a0a@GOLHARMOBILE1>

I think the easiest way is to do a 'ps -ef | grep blast'. If it returns a process in the list (other than the grep itself), then you know someone is running blast.

The other (simpler) option is to maintain a separate BLAST database that you can copy into place during a standard maintenance window. We normally reboot our server once a month when system patches are applied. During this outage period, the new database could be copied in place.
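
A minimal sketch of how the two ideas above could be combined, offered as an illustration rather than a tested recipe. It assumes the databases live in versioned directories and that users reach them through a symlink (the names `current`, `nr_2006_01` and `nr_2006_02` are hypothetical), and that the BLAST executables have the names listed in the code; none of this comes from the thread. The new database is formatted in its own directory, the symlink is repointed in one atomic rename, and the old directory is only deleted once no BLAST process shows up in the process list. One caveat: a job that opens further database files through the symlink after the switch would see the new copy, so this reduces the risk but does not remove it.

```python
import os
import tempfile

# Assumed executable names; adjust to whatever BLAST programs are installed.
BLAST_PROGRAMS = ("blastall", "blastpgp", "megablast",
                  "blastn", "blastp", "blastx", "tblastn", "tblastx")


def blast_is_running(ps_output):
    """Return True if any line of `ps -eo comm` output names a BLAST program."""
    for line in ps_output.splitlines():
        fields = line.split()
        # Some systems print the full path of the executable, so compare basenames.
        if fields and os.path.basename(fields[0]) in BLAST_PROGRAMS:
            return True
    return False


def switch_database(link_path, new_dir):
    """Atomically repoint the symlink `link_path` at the directory `new_dir`."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_dir, tmp_link)
    # rename(2) replaces the old symlink in a single step on POSIX systems,
    # so users never see a missing or half-copied database path.
    os.replace(tmp_link, link_path)


if __name__ == "__main__":
    # Demonstration in a scratch directory with hypothetical database names.
    root = tempfile.mkdtemp()
    old_dir = os.path.join(root, "nr_2006_01")
    new_dir = os.path.join(root, "nr_2006_02")
    os.mkdir(old_dir)
    os.mkdir(new_dir)
    link = os.path.join(root, "current")
    os.symlink(old_dir, link)
    switch_database(link, new_dir)
    print(os.path.realpath(link))
```

In a cron job, the text passed to `blast_is_running` could come from `subprocess.run(["ps", "-eo", "comm"], capture_output=True, text=True).stdout`, and the old directory would be removed only when that check returns False.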

Let me know what you end up doing. I'm curious about the solution you decide on. I might want to use that here.

Ryan

-----Original Message-----
From: bio_bulletin_board-bounces+golharam=umdnj.edu at bioinformatics.org
[mailto:bio_bulletin_board-bounces+golharam=umdnj.edu at bioinformatics.org] On Behalf Of Samantha Fox
Sent: Tuesday, February 07, 2006 12:16 PM
To: The general forum at Bioinformatics.Org
Subject: [BiO BB] Updating NCBI databases

Hi,
I had a question regarding regular update of BLAST databases. Is there a standard way to move the updated databases to the user section, making sure that the current copy is not already in use ?
Suppose I update the database monthly, but a user might have a big BLAST job running when my cron script starts the update. This can lead to errors. How can I prevent that ?

Thanks,
~S

From golharam at umdnj.edu Wed Feb 8 23:46:43 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Wed, 08 Feb 2006 23:46:43 -0500
Subject: [BiO BB] Tool to mutate DNA sequence
Message-ID: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>

Does anyone know of a tool to mutate a DNA sequence by a specified amount? For instance, say I have a DNA sequence 1000 bases long, and I want to simulate mutations to make it 75% (or 80%, etc) similar to the original.

Ryan

From pmr at ebi.ac.uk Thu Feb 9 03:25:24 2006
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Thu, 9 Feb 2006 08:25:24 -0000 (GMT)
Subject: [BiO BB] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID: <2714.86.132.216.50.1139473524.squirrel@webmail.ebi.ac.uk>

Ryan Golhar writes:
> Does anyone know of tool to mutate a DNA sequence by a specified amount?
> For instance, say I have a DNA sequence 1000 bases long, and I want to
> simulate mutations to make it 75% (or 80%, etc) similar to the original.

EMBOSS has the msbar program ("mutate sequence beyond all recognition") which allows you to select the number and type of changes.

With some tuning of options to match the sequence length you should be able to get results that match whatever your definition of 75% similar might be (amazing how much more similarity you can get by adding gaps in an alignment :-)

If you can specify a clear and generally useful way to define what you need, we could of course add a "percent change" option to the msbar program for a future release.

Hope that helps,

Peter

From torsten.seemann at infotech.monash.edu.au Thu Feb 9 06:15:28 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Thu, 09 Feb 2006 22:15:28 +1100
Subject: [BiO BB] Re: [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID: <43EB2450.6000606@infotech.monash.edu.au>

Ryan,

> Does anyone know of tool to mutate a DNA sequence by a specified amount?
> For instance, say I have a DNA sequence 1000 bases long, and I want to
> simulate mutations to make it 75% (or 80%, etc) similar to the original.

The EMBOSS suite comes with a tool called "msbar" which can controllably mutate sequences:
http://emboss.sourceforge.net/apps/msbar.html

--
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia
http://www.vicbioinformatics.com/

From heikki at sanbi.ac.za Thu Feb 9 06:31:20 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Thu, 9 Feb 2006 13:31:20 +0200
Subject: [BiO BB] Re: [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID: <200602091331.21690.heikki@sanbi.ac.za>

Ryan,

Instructions in pseudo code:

take the sequence string out of the object
use a hash to store changed locations
repeat
    pick a location in the string randomly
    if the location is not in the hash, i.e. not changed already,
        change it into something else
        add the changed location into the hash
    if enough locations have been changed (scalar keys hash), exit loop
put the sequence string back into the seq object

-Heikki

On Thursday 09 February 2006 06:46, Ryan Golhar wrote:
> Does anyone know of tool to mutate a DNA sequence by a specified amount?
> For instance, say I have a DNA sequence 1000 bases long, and I want to
> simulate mutations to make it 75% (or 80%, etc) similar to the original.
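
Heikki's pseudocode above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "75% similar" is taken to mean ungapped per-site identity, only substitutions are made, each site is changed at most once, and the function name `mutate_to_identity` is made up for this example. It works on a plain string rather than a BioPerl sequence object.

```python
import random

BASES = "ACGT"


def mutate_to_identity(seq, identity, rng=None):
    """Substitute distinct sites so that `identity` of the sites stay unchanged.

    Assumption: identity is the ungapped fraction of matching positions.
    """
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be between 0 and 1")
    rng = rng or random.Random()
    n_changes = round(len(seq) * (1.0 - identity))
    bases = list(seq)
    changed = set()  # plays the role of the hash of changed locations
    while len(changed) < n_changes:
        pos = rng.randrange(len(bases))
        if pos in changed:
            continue  # already mutated, pick another location
        original = bases[pos].upper()
        # "change it into something else": any base except the current one
        bases[pos] = rng.choice([b for b in BASES if b != original])
        changed.add(pos)
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(2006)
    original = "".join(rng.choice(BASES) for _ in range(1000))
    mutant = mutate_to_identity(original, 0.75, rng)
    mismatches = sum(a != b for a, b in zip(original, mutant))
    print(mismatches)
```

Because every site is hit at most once and always changes to a different base, the number of mismatches equals the number of changes requested. That is convenient for testing scoring schemes but is not an evolutionary model: repeated hits at one site, transition/transversion bias and indels are all left out.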

--
Heikki Lehvaslaiho    heikki at_sanbi _ac _za
Associate Professor   skype: heikki_lehvaslaiho
SANBI, South African National Bioinformatics Institute
University of Western Cape, South Africa
Phone: +27 21 959 2096   FAX: +27 21 959 2512

From osborne1 at optonline.net Thu Feb 9 08:55:57 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Thu, 09 Feb 2006 08:55:57 -0500
Subject: [BiO BB] Re: [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID:

Ryan,

The script scripts/utilities/mutate.PLS does this.

Brian O.

On 2/8/06 11:46 PM, "Ryan Golhar" wrote:

> Does anyone know of tool to mutate a DNA sequence by a specified amount?
> For instance, say I have a DNA sequence 1000 bases long, and I want to
> simulate mutations to make it 75% (or 80%, etc) similar to the original.
>
> Ryan

From heikki at sanbi.ac.za Thu Feb 9 09:54:30 2006
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Thu, 9 Feb 2006 16:54:30 +0200
Subject: [BiO BB] Re: [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1>
Message-ID: <200602091654.30890.heikki@sanbi.ac.za>

Ryan,

I should have made this very clear in my first reply:

You have to plan very carefully what rules you use when you mutate your sequence because it will affect directly the resulting sequences.
Of course, all that depends on what you will be using the sequences for. If you are going to draw evolutionary conclusions from those sequences, you must mutate them in a way that simulates evolutionary principles.

My earlier pseudocode example, for instance, should allow mutations in every location. Mutations do occur multiple times in the same places as sequences get saturated by mutations. Also, you should decide the relative occurrence of transversions versus transitions. Then there are indels; do you want to take those into account?

Also, check the EMBOSS program 'msbar'.

You did not ask this, but... I remember that during the early days of Celera, one of the tools that enabled them to estimate the feasibility of the whole genome shotgun sequence assembly was a very complete program to 'synthesize' in silico the whole complexity of the human genome. I have no idea if that program is generally available now.

Yours,
-Heikki

On Thursday 09 February 2006 06:46, Ryan Golhar wrote:
> Does anyone know of tool to mutate a DNA sequence by a specified amount?
> For instance, say I have a DNA sequence 1000 bases long, and I want to
> simulate mutations to make it 75% (or 80%, etc) similar to the original.
>
> Ryan

--
Heikki Lehvaslaiho    heikki at_sanbi _ac _za
Associate Professor   skype: heikki_lehvaslaiho
SANBI, South African National Bioinformatics Institute
University of Western Cape, South Africa
Phone: +27 21 959 2096   FAX: +27 21 959 2512

From jeff at bioinformatics.org Thu Feb 9 12:17:44 2006
From: jeff at bioinformatics.org (J.W.
Bizzaro)
Date: Thu, 09 Feb 2006 12:17:44 -0500
Subject: [BiO BB] New features for hosted projects
Message-ID: <43EB7938.2010801@bioinformatics.org>

Greetings,

Bioinformatics.Org is pleased to announce a couple recent additions to the services offered to projects hosted at Bioinformatics.Org.

1. Subversion (SVN) version control system

Software developers may now use the Subversion version control system on our servers. Subversion was developed "to take over the CVS user base," according to the Subversion website. "Specifically, we're writing a new version control system that is very similar to CVS, but fixes many things that are broken."

Here are a few of the advantages of using Subversion over CVS (from the website):

* Directories, renames, and file meta-data are versioned.
* Commits are truly atomic.
* Versioning of symbolic links
* Efficient handling of binary files

Subversion at Bioinformatics.Org makes use of svnserve over SSH (svn+ssh) for developer access (anonymous access uses ordinary svnserve). Instructions on using Subversion at BiO can be found here:

http://bioinformatics.org/docs/svn/

We also have the WebSVN interface set up:

http://bioinformatics.org/websvn/

2. Wiki system for project websites

Bioinformatics.Org will now install a wiki system in project web directories by default. If you're not familiar with wikis, you may want to look through this introduction by Wikipedia:

http://en.wikipedia.org/wiki/Wiki

Instructions for using wikis and the wiki system are also included with (are part of) the website.

Some projects are just getting started using their new wikis, so there's not much content to show at this time, but an example of the system can be seen here:

http://bioinformatics.org/dnalinux/

More information on Bioinformatics.Org's free group-hosting service can be found here:

http://bioinformatics.org/about/hosting.php

If you have any questions or comments, please don't hesitate to ask:

sysadmins at bioinformatics.org

Cheers,
Jeff
--
J.W.
Bizzaro Bioinformatics Organization, Inc. (Bioinformatics.Org) E-mail: jeff at bioinformatics.org Phone: +1 508 890 8600 -- From jason.stajich at duke.edu Thu Feb 9 14:10:54 2006 From: jason.stajich at duke.edu (Jason Stajich) Date: Thu, 9 Feb 2006 14:10:54 -0500 Subject: [BiO BB] Re: [Bioperl-l] Tool to mutate DNA sequence In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> Message-ID: <0B84EE38-0BA5-4E56-B35F-C8CBAA342AC4@duke.edu> Depending on whether or not you want to use evolutionary realistic models... * evolver which comes with PAML lets you evolve sequences on a tree * SeqGen from Andrew Rambaut http://evolve.zoo.ox.ac.uk/software.html? id=seqgen also lets you do this I believe there are PISE interfaces to both of these at the pasteur bioweb site - http://bioweb.pasteur.fr/ -jason On Feb 8, 2006, at 11:46 PM, Ryan Golhar wrote: > Does anyone know of tool to mutate a DNA sequence by a specified > amount? > For instance, say I have a DNA sequence 1000 bases long, and I want to > simulate mutations to make it 75% (or 80%, etc) similar to the > original. > > > Ryan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich Duke University http://www.duke.edu/~jes12 From marksty at MIT.EDU Thu Feb 9 15:00:24 2006 From: marksty at MIT.EDU (Mark Styczynski) Date: Thu, 09 Feb 2006 15:00:24 -0500 Subject: [BiO BB] Old versions of Prosite? Message-ID: <1139515224.9315.21.camel@localhost> Hi, I'm looking to replicate the results from an earlier work that used PROSITE version 8.0. I wrote to the email contact on ExPASy, and I was told that though things like Swiss-Prot have old versions archived, older versions of PROSITE are considered as obsolete and no longer available to users. I've tried scraping Google but have yet to get anything as old as what I'm looking for. 
Does anyone have any old versions of PROSITE, or have any idea how I could get access to them? Thanks for any help you can give, Mark Styczynski From golharam at umdnj.edu Thu Feb 9 16:19:46 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Thu, 09 Feb 2006 16:19:46 -0500 Subject: [BiO BB] RE: [Bioperl-l] Tool to mutate DNA sequence In-Reply-To: <200602091654.30890.heikki@sanbi.ac.za> Message-ID: <002801c62dbe$8d4d7e20$e6028a0a@GOLHARMOBILE1> Thanks all. The responses I got were definitely more than helpful. FYI - I did initially look at msbar. I glanced over the "Number of times to perform mutation operations", which is what I was looking for. I'm looking to statistically test some simply scoring matrices. I think msbar will do. Ryan -----Original Message----- From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Heikki Lehvaslaiho Sent: Thursday, February 09, 2006 9:55 AM To: bioperl-l at lists.open-bio.org; golharam at umdnj.edu Cc: 'The general forum at Bioinformatics.Org'; 'bioperl-l'; emboss at emboss.open-bio.org Subject: Re: [Bioperl-l] Tool to mutate DNA sequence Ryan, I should have made this very clear in my first reply: You have to plan very carefully what rules you use when you mutate your sequence because it will affect directly the resulting sequences. Of course, all that depends on what you will be using the sequences for. If you are going to draw evolutionary conclusions from those sequences, you must mutate them in a way that simulates evolutionary principles. My earlier pseudocode example, for example, should allow mutations in every location. Mutations do occur multiple times in same places as sequences get saturated by mutations. Also, you should decide the relative occurrence of transversions versus transitions. Then there are indels; do you want to take those into account? Also, check the EMBOSS program 'msbar'. You did not ask this, but... 
I remember that during the early days of Celera, one of the tools that enabled them to estimate the feasibility of the whole genome shotgun sequence assembly, was a very complete program to 'synthesize' in-silico the whole complexity of the human genome. I have no idea of that program is generally available now. Yours, -Heikki On Thursday 09 February 2006 06:46, Ryan Golhar wrote: > Does anyone know of tool to mutate a DNA sequence by a specified > amount? For instance, say I have a DNA sequence 1000 bases long, and I > want to simulate mutations to make it 75% (or 80%, etc) similar to the > original. > > > Ryan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- ______ _/ _/_____________________________________________________ _/ _/ _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za _/_/_/_/_/ Associate Professor skype: heikki_lehvaslaiho _/ _/ _/ SANBI, South African National Bioinformatics Institute _/ _/ _/ University of Western Cape, South Africa _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512 ___ _/_/_/_/_/________________________________________________________ _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From pankaj at nii.res.in Fri Feb 10 01:43:13 2006 From: pankaj at nii.res.in (Pankaj) Date: Fri, 10 Feb 2006 12:13:13 +0530 Subject: [BiO BB] how to find gene neighbours In-Reply-To: <1139334677.43e8de159cb95@webservices.di.fc.ul.pt> References: <43E1470A.2000203@burnham.org> <20060202024631.50131.qmail@web34701.mail.mud.yahoo.com> <20060202061744.M53505@nii.res.in> <1139334677.43e8de159cb95@webservices.di.fc.ul.pt> Message-ID: <20060210064313.M41802@nii.res.in> Hi Pooja, Nops! By gene neighbours I mean genes which are positionally close together on the genome, ie adjoining genes. 
Although I found the NCBI's map viewer friendly, but I couldnt figure out how to automate a script which can find gene neighbours for all the GI's that I have. Presently, having a GI number I can find gene neighbours manually by looking at the chromosome map, but I dont know how data is organised in the database, so that I can automate the process. Any help on this will be highly appreciated! Cheers, Pankaj Khurana -- Open WebMail Project (http://openwebmail.org) ---------- Original Message ----------- From: Pooja Jain To: pankaj at nii.res.in Sent: Tue, 7 Feb 2006 17:51:17 +0000 Subject: Re: [BiO BB] how to find gene neighbours > Hi, > Probably u will be interested in idetifying the set of GI numbers > from the GI number you have which share evolutionaly relationship. > In other words you are trying to find genes from different organisms > which are neighbours during their parallel evolution. Am I > understood you correctly ? > > cheers! > > -Pooja > Citando Pankaj : > > > > > Thankx Govind! > > > > I have gi numbers of homologous proteins from different organisms (some of > > which may be sequenced and some may not be....i dont know!). I want to know > > how to find gene neighbours and then automate the script to find gene > > neighbours for all the gi numbers that I have. > > > > Pankaj Khurana > > > > -- > > Open WebMail Project (http://openwebmail.org) > > > > > > ---------- Original Message ----------- > > From: govind mk > > To: idoerg at burnham.org, "The general forum at Bioinformatics.Org" > > > > Sent: Wed, 1 Feb 2006 18:46:31 -0800 (PST) > > Subject: Re: [BiO BB] how to find gene neighbours > > > > > I think NCBI's map viewer can elp you find neighbouring genes > > > ..... Are u looking for an algorithm or just want to map you genes > > > on to genomes and identify their neighbours > > > > > > -Govind > > > > > > > > > Iddo Friedberg wrote: > > > Generally, you can do that by going to the genomic database of > > > choice for your particular organism. 
As you have not mentioned which > > > organism those genes are form, it is rather hard to make a > > > recommendation. Ensembl, www.fruitfly.org, yeastgenome.org are > > > examples of such databases for Eukaryotic models. > > > > > > A list for bacterial projects is available from : > > > > > > > > http://www.pasteur.fr/recherche/unites/tcruzi/minoprio/genomics/bacteria.htm > > > > > > HTH, > > > > > > Iddo > > > > > > Pankaj wrote: > > > > > > >Hi Everybody, > > > >I have a few gi numbers. I want to find the their neighbouring genes. How > > can > > > >I do that? > > > > > > > >Thanking all in advance > > > > > > > >Pankaj Khurana > > > > > > > > > > > >-- > > > >Open WebMail Project (http://openwebmail.org) > > > > > > > >_______________________________________________ > > > >Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > > > >https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > > > > > > > > > > > > > > > > > > > > -- > > > Iddo Friedberg, Ph.D. > > > Burnham Institute for Medical Reseach > > > 10901 N. Torrey Pines Rd. > > > La Jolla, CA 92037 USA > > > Tel: +1 (858) 646 3100 x3516 > > > Fax: +1 (858) 713 9949 > > > http://iddo-friedberg.org > > > http://BioFunctionPrediction.org > > > > > > _______________________________________________ > > > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > > > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > > > > > > > __________________________________________________ > > > Do You Yahoo!? > > > Tired of spam? Yahoo! 
Mail has the best spam protection around > > > http://mail.yahoo.com > > ------- End of Original Message ------- > > > > _______________________________________________ > > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > ------- End of Original Message ------- From kiekyon.huang at gmail.com Fri Feb 10 05:31:37 2006 From: kiekyon.huang at gmail.com (Kie Kyon Huang) Date: Fri, 10 Feb 2006 18:31:37 +0800 Subject: [BiO BB] (no subject) Message-ID: -------------- next part -------------- An HTML attachment was scrubbed... URL: From delete at elfdata.com Fri Feb 10 09:13:29 2006 From: delete at elfdata.com (Theodore H. Smith) Date: Fri, 10 Feb 2006 14:13:29 +0000 Subject: [BiO BB] Understanding Smith-Waterman scoring Message-ID: Hi people, I'm trying to learn about Smith-Waterman. There is one thing I haven't seen answered in explanations of the Smith-Waterman algorithm. How does it score alignments that come in sections? Does it give a penalty if a sequence must be split up? For example, let's say I had the protein AAAABBBB, and I wanted to see how this scored against the protein BBBBAAAA. Let's ignore the fact that it can be reversed, for the moment, just so I can understand how should Smith-Waterman work. Now, what would the match score be? Let's assume that A to A has a score of 1 and B to B also has a score of 1. Its a really simple example. So matching AAAABBBB to itself, would give a SW score of 8. What would matching BBBBAAAA to AAAABBBB give? I'd expect it to generate two "sections", like this: AAAA :::: AAAA BBBB :::: BBBB But what should the overall score be? Is it still 8? Or should we give a penalty because we've had to split this up? Is it normal for alignment tools to give penalties to segmented sequences. Also is there some kind of "minimum length" that a Smith-Waterman based aligner would allow? 
Would it say that you can't have sections below a certain length? Are there any tools which let you specify such a minimum section length? If you don't like that example above of AAAABBBB (as it can be reversed), then try this example. Assume all the proteins get a score of 1 against themselves. The protein: ABCDEFGH, if I did a Smith- Waterman score comparison against DCHABGEF, would the score still be 8. After all, all the proteins are there, just in a different order. I would expect this to get a score of zero or below. It's a really basic question, sorry about that! From pmr at ebi.ac.uk Fri Feb 10 09:22:39 2006 From: pmr at ebi.ac.uk (Peter Rice) Date: Fri, 10 Feb 2006 14:22:39 +0000 Subject: [BiO BB] Understanding Smith-Waterman scoring In-Reply-To: References: Message-ID: <43ECA1AF.2020206@ebi.ac.uk> Theodore H. Smith wrote: > How does it score alignments that come in sections? Does it give a > penalty if a sequence must be split up? You get one alignment. If more than one "section" aligns ... with the parts in the same order in both proteins ... you can have a misaligned region and/or gaps in the sequences. There are penalty scores for the misalignments and the gaps. There is also a Smith-Waterman-Eggert variation of the algorithm that finds a scond, third, fourth ... alignment that excludes all those already reported. Smith-Waterman is a local alignment method, so any unaligned parts of either sequence do not count in the score. > What would matching BBBBAAAA to AAAABBBB give? AAAA matching AAAA or BBBB matching BBBB (unless A has a positive score to match B, then other results are possible) > I'd expect it to generate two "sections", like this: No, but you will get the second section from the Smith-Waterman-Eggert algorithm. Each will have its own local alignment score. > But what should the overall score be? Is it still 8? Or should we give > a penalty because we've had to split this up? 
Is it normal for > alignment tools to give penalties to segmented sequences. Also is there > some kind of "minimum length" that a Smith-Waterman based aligner would > allow? Would it say that you can't have sections below a certain > length? Are there any tools which let you specify such a minimum > section length? > If you don't like that example above of AAAABBBB (as it can be > reversed), then try this example. Assume all the proteins get a score > of 1 against themselves. The protein: ABCDEFGH, if I did a Smith- > Waterman score comparison against DCHABGEF, would the score still be 8. > After all, all the proteins are there, just in a different order. > > I would expect this to get a score of zero or below. Be careful not to confuse protein (the whole sequence) with amino acid or residue (one character). You will get at least 1 residue matching. Maybe more as some of the mismatches will have a positive score. Hope that helps. It is cmoplicated :-) Peter From delete at elfdata.com Fri Feb 10 09:53:18 2006 From: delete at elfdata.com (Theodore H. Smith) Date: Fri, 10 Feb 2006 14:53:18 +0000 Subject: [BiO BB] Understanding Smith-Waterman scoring In-Reply-To: <43ECA1AF.2020206@ebi.ac.uk> References: <43ECA1AF.2020206@ebi.ac.uk> Message-ID: On 10 Feb 2006, at 14:22, Peter Rice wrote: > Theodore H. Smith wrote: > >> How does it score alignments that come in sections? Does it give >> a penalty if a sequence must be split up? > > You get one alignment. > > If more than one "section" aligns ... with the parts in the same > order in both proteins ... you can have a misaligned region and/or > gaps in the sequences. There are penalty scores for the > misalignments and the gaps. OK. I understand. The most popular tools in use today, only find the best (or at least one) locally aligned section, but not all of them. Is this a problem in general? Or is it that multiple sections to be aligned, are quite rare in the kind of queries that biologists do today? 
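
The point about getting a single best local alignment can be checked with a few lines of code. What follows is only a minimal sketch, not the implementation used by any particular tool: it takes +1 for a match from the example above and, as assumptions that do not come from this thread, -1 for a mismatch and -1 per gap position. The function name smith_waterman_score is made up for illustration.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return (best local alignment score, cell where that alignment ends).

    The mismatch and gap values are illustrative assumptions, and the gap
    cost is linear (one penalty per gap position).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    h = [[0] * cols for _ in range(rows)]
    best, best_cell = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never go below 0.
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            if h[i][j] > best:  # strict '>' keeps the first best cell seen
                best, best_cell = h[i][j], (i, j)
    return best, best_cell


print(smith_waterman_score("AAAABBBB", "AAAABBBB")[0])  # 8
print(smith_waterman_score("AAAABBBB", "BBBBAAAA")[0])  # 4
print(smith_waterman_score("ABCDEFGH", "DCHABGEF")[0])  # 2
```

Under these assumed scores the rearranged sequence scores 4 rather than 8, because only one of the two blocks can belong to a single local alignment, and the shuffled ABCDEFGH example scores 2.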
> There is also a Smith-Waterman-Eggert variation of the algorithm > that finds a scond, third, fourth ... alignment that excludes all > those already reported. Am I right in seeing that this isn't talked about as much as Smith- Waterman though? It sounds promising for the line of work I am doing however, thanks very much for telling me of Smith-Waterman-Eggert, it looks like a good lead. >> What would matching BBBBAAAA to AAAABBBB give? > > AAAA matching AAAA or BBBB matching BBBB (unless A has a positive > score to match B, then other results are possible) Which would I get? Does it depend on the tool? Do I get the first alignment, the last, or the best? >> I'd expect it to generate two "sections", like this: > > No, but you will get the second section from the Smith-Waterman- > Eggert algorithm. Each will have its own local alignment score. Thanks. Sounds very interesting. >> But what should the overall score be? Is it still 8? Or should we >> give a penalty because we've had to split this up? Is it normal >> for alignment tools to give penalties to segmented sequences. >> Also is there some kind of "minimum length" that a Smith-Waterman >> based aligner would allow? Would it say that you can't have >> sections below a certain length? Are there any tools which let >> you specify such a minimum section length? > >> If you don't like that example above of AAAABBBB (as it can be >> reversed), then try this example. Assume all the proteins get a >> score of 1 against themselves. The protein: ABCDEFGH, if I did a >> Smith- Waterman score comparison against DCHABGEF, would the score >> still be 8. After all, all the proteins are there, just in a >> different order. >> I would expect this to get a score of zero or below. > > Be careful not to confuse protein (the whole sequence) with amino > acid or residue (one character). You might not be surprised to find out that I come from a software developer background. I won't make that mistake again. 
> You will get at least 1 residue matching. Maybe more as some of the > mismatches will have a positive score. > > Hope that helps. It is cmoplicated :-) Yes it's been of great help. And yes it is complicated :) From pmr at ebi.ac.uk Fri Feb 10 12:32:32 2006 From: pmr at ebi.ac.uk (pmr at ebi.ac.uk) Date: Fri, 10 Feb 2006 17:32:32 -0000 (GMT) Subject: [BiO BB] Understanding Smith-Waterman scoring In-Reply-To: References: <43ECA1AF.2020206@ebi.ac.uk> Message-ID: <3475.86.137.129.90.1139592752.squirrel@webmail.ebi.ac.uk> >> Theodore H. Smith wrote: > OK. I understand. The most popular tools in use today, only find the > best (or at least one) locally aligned section, but not all of them. > > Is this a problem in general? Or is it that multiple sections to be > aligned, are quite rare in the kind of queries that biologists do today? The algorithm guarantees one alignment, and it is always the "best" (highest scoring) ... although in your AAAABBBB case there arer two possible answers with the same score. Changing the comparison matrix (scores for A:A, B:B and A:B) and changing the penalties for adding gaps will of course change the scoring and may give another "best" alignment. There is also the closely related Needleman-Wunsch global alignment algorithm. This guarantees one best alignment over the whole of both sequences. In global alignment there are usually options to penalise gaps at the end of the sequence (usually not penalised as both sequences arer assumed to be incomplete). In local alignments (Smith-Waterman) the alignment is what you get ... there are no penalties for anything outside the aligned regions (except that edxtending the alignment will always give a worse score). >> There is also a Smith-Waterman-Eggert variation of the algorithm >> that finds a scond, third, fourth ... alignment that excludes all >> those already reported. > > Am I right in seeing that this isn't talked about as much as Smith- > Waterman though? 
It sounds promising for the line of work I am doing
> however, thanks very much for telling me of Smith-Waterman-Eggert, it
> looks like a good lead.

Smith-Waterman is standard. There are utilities that give the alternative, but users often fail to spot the possibility. The first alignment is always the same for both.

> You might not be surprised to find out that I come from a software
> developer background. I won't make that mistake again.

Ah, in that case be careful what you describe as a "sequence" ... mathematicians can have different ideas of what the word means :-)

>> You will get at least 1 residue matching. Maybe more as some of the
>> mismatches will have a positive score.

regards,

Peter Rice

From golharam at umdnj.edu Sat Feb 11 01:25:50 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Sat, 11 Feb 2006 01:25:50 -0500
Subject: [BiO BB] Understanding Smith-Waterman scoring
In-Reply-To:
Message-ID: <001b01c62ed4$01ca08c0$2f01a8c0@GOLHARMOBILE1>

Theodore,

Smith-Waterman will find all the alignments. Remember, a mismatch must have a negative score. Once the score of the aligned region drops to 0, the end of the alignment is reached.

A second aligned area is found by looking at the matrix of scores it generated and locating the next highest score. You can then trace back along the diagonal until you reach zero, at which point you have reached the end of the next alignment.

Ryan

From golharam at umdnj.edu Sat Feb 11 01:21:26 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Sat, 11 Feb 2006 01:21:26 -0500
Subject: [BiO BB] How to calculate the value of K and Lambda for two sequence alignment
In-Reply-To:
Message-ID: <001601c62ed3$637d6fe0$2f01a8c0@GOLHARMOBILE1>

Did anyone ever respond to you on this? K and lambda. I forget where K comes from. Lambda is dependent on the scoring matrix you are using. I believe it is given with the matrix. BLOSUM uses 0.347.

Ryan

-----Original Message-----
From: bio_bulletin_board-bounces+golharam=umdnj.edu at bioinformatics.org [mailto:bio_bulletin_board-bounces+golharam=umdnj.edu at bioinformatics.org ] On Behalf Of shohag md
Sent: Friday, February 03, 2006 6:26 AM
To: bio_bulletin_board at bioinformatics.org
Subject: [BiO BB] How to calculate the value of K and Lambda for two sequence alignment

Hi Everybody

Using Smith Waterman algorithm I want to align two sequences. After that I want to calculate the expectation value. For calculating the expectation value we know that

E = K m n e^(-lambda x)

But how can I calculate the value of K and lambda?
Is there any formula that can help me to calculate the value of K and lambda, and then the expectation value?

Thanking all in advance

Shoyaib

From journalshoyaib at gmail.com Sun Feb 12 06:36:15 2006
From: journalshoyaib at gmail.com (shohag md)
Date: Sun, 12 Feb 2006 17:36:15 +0600
Subject: [BiO BB] How to calculate the value of k and lambda for calculating the expectation value?
Message-ID:

Hi everybody

After aligning two or more sequences I want to find the expectation value. But for calculating the expectation value we know that there are two constants, k and lambda (E = k m n e^(-lambda s), where m and n are the lengths of the sequences and s is the alignment score).

Now I want to know how to calculate the value of k and lambda. I want to know the mathematics, not any tool. Is there anyone who can help me?

Thanks in advance

shoyaib

From maximilianh at gmail.com Sun Feb 12 07:50:53 2006
From: maximilianh at gmail.com (Maximilian Haeussler)
Date: Sun, 12 Feb 2006 13:50:53 +0100
Subject: [BiO BB] How to calculate the value of K and Lambda for two sequence alignment
In-Reply-To: <001601c62ed3$637d6fe0$2f01a8c0@GOLHARMOBILE1>
References: <001601c62ed3$637d6fe0$2f01a8c0@GOLHARMOBILE1>
Message-ID: <76f031ae0602120450n6206f0d6t@mail.gmail.com>

I have no clue :-) but I typed "smith waterman k lamda" into google and clicked on "I'm feeling lucky": got the NCBI page where blast is explained
http://www.people.virginia.edu/~wrp/cshl02/Altschul/Altschul-3.html:

---------------
K and lambda are statistical parameters dependent upon the scoring system and the background amino acid frequencies of the sequences being compared.
While FASTA estimates these parameters from the scores generated by actual database searches, BLAST estimates them beforehand for specific scoring schemes by comparing many random sequences generated using a standard protein amino acid composition [12]. For example, using BLOSUM-62 amino acid substitution scores [13], and affine gap costs [14-16] in which a gap of length k is assigned a score of -(10 + k), we generated 10,000 pairs of length-1000 random protein sequences, and used the Smith-Waterman algorithm to calculate 10,000 optimal local alignment scores. From these scores, lambda was estimated at 0.252 and K at 0.035 by the method of maximum-likelihood [17]. In general, given M samples from an extreme value distribution, the ratio of the maximum-likelihood estimate of lambda to its actual value is approximately normally distributed, with mean 1.0 and standard deviation 0.78/sqrt(M) [17]. Thus the standard error for our estimate of lambda is about 0.002, or less than 1%.
---------------

From what I understood, you generate many alignments, plot the generated scores for the current matrix, assume that they follow your function E and then approximate lambda.

The O'Reilly BLAST book goes into details, gives an example Perl program to calculate lambda, and says that the value of k doesn't really matter (this sample chapter is free):
http://www.oreilly.com/catalog/blast/chapter/ch04.pdf

Don't know if it helped or if it is completely wrong, it took 10 minutes and I found it interesting... :-)

Max

--
Maximilian Haeussler,
CNRS Gif-sur-Yvette, Paris
tel: +33 6 12 82 76 16
icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de
skype: maximilianhaeussler

From delete at elfdata.com Mon Feb 13 17:45:10 2006
From: delete at elfdata.com (Theodore H. Smith)
Date: Mon, 13 Feb 2006 22:45:10 +0000
Subject: [BiO BB] Constructing a multiple aligner, similar to Smith-Waterman
Message-ID: <70024952-5394-48F6-8402-902186876C6B@elfdata.com>

Hi people,

I am trying to figure out how to construct my own multiple aligner. Not because I expect it to be better than other aligners, but because I need to learn about alignment for a larger project. I need to get this all figured out.

I know that when aligning with a Smith-Waterman based aligner, we start from the highest cell in the matrix, and trace backwards. But what if the trace actually has multiple highest cells?
Or what if it goes like this: 13, 11, 13, 10, 9, 6, 3, 1, 0 That is, the trace has multiple highest cells within it, two 13's. Do we use the shorter back trace? OK, so let's assume we got a single alignment done. How do we then find the other sections to align? Would a process of elimination do it? That is, search the remaining matrix (Excluding the portion already aligned) for the highest cell? All this Smith-Waterman understanding I'm gaining is going into go into an experimental project I am working on, that I have no idea if it will be practical or not. I know it will work, I just don't know that it will give anyone any benefits over what they already got :) Needless to say, it's exciting stuff for people like me who like giving something new a go, I suppose the danger of it not being useful is half the excitement. From pmr at ebi.ac.uk Tue Feb 14 03:20:25 2006 From: pmr at ebi.ac.uk (pmr at ebi.ac.uk) Date: Tue, 14 Feb 2006 08:20:25 -0000 (GMT) Subject: [BiO BB] Constructing a multiple aligner, similar to Smith-Waterman In-Reply-To: <70024952-5394-48F6-8402-902186876C6B@elfdata.com> References: <70024952-5394-48F6-8402-902186876C6B@elfdata.com> Message-ID: <4425.86.137.129.90.1139905225.squirrel@webmail.ebi.ac.uk> > That is, the trace has multiple highest cells within it, two 13's. Do > we use the shorter back trace? There is nothing special about either score. You should choose the first or last one you see (whichever is easiest). Usually there is very little difference between them, although I have seen perfect repeats in proteins which would each align with a single repeat in another sequence :-) > OK, so let's assume we got a single alignment done. How do we then > find the other sections to align? Would a process of elimination do > it? That is, search the remaining matrix (Excluding the portion > already aligned) for the highest cell? Yes, that's how Smith-Waterman-Eggert works. 
Check the original paper for the rules (if memory serves, zero all cells that contributed to the previous alignment and then look for the highest remaining score).

One thing concerns me a little ... you mentioned "multiple alignment". Local (Smith-Waterman) alignments will find the best matches, but you need a strategy for the remainder of each sequence. Depending on your project this could be anything from a global alignment to throwing the rest away (alignments looking for protein domains do this, for example).

Hope this helps,

Peter

From maximilianh at gmail.com Tue Feb 14 05:11:42 2006
From: maximilianh at gmail.com (Maximilian Haeussler)
Date: Tue, 14 Feb 2006 11:11:42 +0100
Subject: [BiO BB] Re: [Bioperl-l] Tool to mutate DNA sequence
In-Reply-To: <0B84EE38-0BA5-4E56-B35F-C8CBAA342AC4@duke.edu>
References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <0B84EE38-0BA5-4E56-B35F-C8CBAA342AC4@duke.edu>
Message-ID: <76f031ae0602140211n2a0bbf4fl@mail.gmail.com>

The tool ROSE also evolves sequences on a tree. There is a web interface and downloadable source at http://bibiserv.techfak.uni-bielefeld.de/rose/

Max

On 09/02/06, Jason Stajich wrote:
> Depending on whether or not you want to use evolutionary realistic
> models...
> * evolver which comes with PAML lets you evolve sequences on a tree
> * SeqGen from Andrew Rambaut http://evolve.zoo.ox.ac.uk/software.html?id=seqgen
> also lets you do this
> I believe there are PISE interfaces to both of these at the pasteur
> bioweb site - http://bioweb.pasteur.fr/
>
> -jason
> On Feb 8, 2006, at 11:46 PM, Ryan Golhar wrote:
>
> > Does anyone know of a tool to mutate a DNA sequence by a specified
> > amount?
> > For instance, say I have a DNA sequence 1000 bases long, and I want to
> > simulate mutations to make it 75% (or 80%, etc) similar to the
> > original.
> >
> > Ryan
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
>

--
Maximilian Haeussler,
CNRS Gif-sur-Yvette, Paris
tel: +33 6 12 82 76 16
icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de
skype: maximilianhaeussler

From jeff at bioinformatics.org Wed Feb 15 18:49:59 2006
From: jeff at bioinformatics.org (J.W. Bizzaro)
Date: Wed, 15 Feb 2006 18:49:59 -0500
Subject: [BiO BB] Bioinformatics.Org announces the laureate of the 2006 Benjamin Franklin Award
Message-ID: <43F3BE27.9040001@bioinformatics.org>

Bioinformatics.Org is proud to present the 2006 Benjamin Franklin Award in the Life Sciences to Michael Ashburner of Cambridge University. As expressed by his nominators, Prof. Ashburner has made fundamental contributions to many open access bioinformatics projects including FlyBase [1], the GASP project [2], the Gene Ontology project [3], and the Open Biological Ontologies project [4], and he was instrumental in the establishment of the European Bioinformatics Institute [5]. He is also known for advocating open access to biological information [6].

The Benjamin Franklin Award in the Life Sciences is a humanitarian award presented annually by Bioinformatics.Org to an individual who has, in his or her practice, promoted free and open access to the materials and methods used in the life sciences.

The Award is named for Benjamin Franklin (1706-1790), one of the most remarkable men of his time.
Scientist, inventor, statesman, Franklin freely and openly shared his ideas and refused to patent his inventions, and it is the opinion of the founders of Bioinformatics.Org that he embodied the best traits of a scientist.

At the end of 2005, requests for nominations for the 2006 Award were sent out to more than 17,000 members of Bioinformatics.Org. Any individual who received more than one nomination was considered a nominee and had their name placed on the ballot for final selection by the membership.

The ceremony for the presentation of the Award will be held at the 2006 Bioinformatics.Org Annual Meeting (BiOAM), held in conjunction with the Life Sciences Conference and Expo, Boston, Massachusetts, April 3 to 5, 2006. The presentation will be made April 5 at 10:00 AM, and it is open to all attendees. It involves a short introduction, the presentation of the certificate, and the laureate seminar. Please see http://bio-itworldexpo.com/ for more information on the event.

Past laureates of the Benjamin Franklin Award in the Life Sciences include Ewan Birney (2005), Lincoln Stein (2004), James Kent (2003) and Michael Eisen (2002).

More information on the Award can be found at http://bioinformatics.org/franklin/.

References:

1. http://www.flybase.org/
2. http://www.fruitfly.org/GASP1/
3. http://www.geneontology.org/
4. http://obo.sourceforge.net/
5. http://www.ebi.ac.uk/
6. http://www.newscientist.com/article.ns?id=dn2061

From maithreyi_roses at yahoo.co.in Wed Feb 15 11:02:16 2006
From: maithreyi_roses at yahoo.co.in (maithreyi thaticherla)
Date: Wed, 15 Feb 2006 16:02:16 +0000 (GMT)
Subject: [BiO BB] What exactly the clustering of DNA means?
Message-ID: <20060215160216.6863.qmail@web8414.mail.in.yahoo.com>

Hi,

This is Maithreyi, studying final B.Tech. I am doing a project on clustering a DNA sequence. I am a computer student, so I do not have much knowledge of bioinformatics, but I am interested to know how DNA is present in our body.
My question is how we can cluster a DNA sequence. I am using the k-means algorithm and I am unable to link this algorithm with my topic. I want to know the exact steps of the k-means algorithm for clustering DNA sequences.

Can anyone help me immediately, please?

Bye

From ykalidas at gmail.com Thu Feb 16 11:07:15 2006
From: ykalidas at gmail.com (Kalidas Yeturu)
Date: Thu, 16 Feb 2006 21:37:15 +0530
Subject: [BiO BB] What exactly the clustering of DNA means?
In-Reply-To: <20060215160216.6863.qmail@web8414.mail.in.yahoo.com>
References: <20060215160216.6863.qmail@web8414.mail.in.yahoo.com>
Message-ID: <5632703b0602160807h7b781074ha17f808bcc2bf69f@mail.gmail.com>

You may want to check out the BLAST program suite to see how DNA sequences can be approximately compared. It gives a score (a bit score, with an associated E-value) which you can use as a clustering threshold. The mean of a cluster can be defined as the DNA sequence that is similar to most of the sequences in it.

If you do not want an approximate approach to clustering, you can use pairwise sequence alignment algorithms (Smith-Waterman / hidden Markov model approach) when you do not have many sequences to cluster.

--
Kalidas Y
http://ssl.serc.iisc.ernet.in/~kalidas

From delete at elfdata.com Thu Feb 16 16:43:10 2006
From: delete at elfdata.com (Theodore H. Smith)
Date: Thu, 16 Feb 2006 21:43:10 +0000
Subject: [BiO BB] Testing blast, getting "Unable to open BLOSUM62" error?
Message-ID: <1F3D43D3-4C57-4145-B255-DF12A579A4A7@elfdata.com>

Hi people,

I've installed blast on my computer. I'm getting some errors using a custom DB.

ShatterProof% cd /Users/theodore/Desktop/customgene.faa/
ShatterProof% blastall -p blastp -d customgene.txt -i query.txt -o result.out

[NULL_Caption] WARNING: [000.000] query: Unable to open BLOSUM62
[NULL_Caption] WARNING: [000.000] query: BlastScoreBlkMatFill returned non-zero status
[NULL_Caption] WARNING: [000.000] query: SetUpBlastSearch failed.

However, this example, copied from NCBI's website, works just fine:

ShatterProof% cd /usr/blast/dbs/
ShatterProof% blastall -p blastn -d ecoli.nt -i query.txt -o test.out

I get a text file containing the result.

Is that because I'm using blastp in the first attempt, and blastn in the second?

Where does BLAST expect the BLOSUM62 file to be?

--
http://elfdata.com/plugin/

From pculpep at hotmail.com Thu Feb 16 16:53:36 2006
From: pculpep at hotmail.com (Pamela Culpepper)
Date: Thu, 16 Feb 2006 21:53:36 +0000
Subject: [BiO BB] Testing blast, getting "Unable to open BLOSUM62" error?
In-Reply-To: <1F3D43D3-4C57-4145-B255-DF12A579A4A7@elfdata.com>
Message-ID:

There is an environment variable -- BLASTMAT -- that you can set to the location of your matrix file.
Otherwise, the data/ directory should contain the filters.

Pam

From christoph.gille at charite.de Thu Feb 16 17:03:16 2006
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Thu, 16 Feb 2006 23:03:16 +0100 (CET)
Subject: [BiO BB] Testing blast, getting 'Unable to open BLOSUM62' error?
In-Reply-To: <1F3D43D3-4C57-4145-B255-DF12A579A4A7@elfdata.com>
References: <1F3D43D3-4C57-4145-B255-DF12A579A4A7@elfdata.com>
Message-ID: <63839.84.190.18.69.1140127396.squirrel@webmail.charite.de>

I had similar problems some time ago and I could only run BLAST when I was in the directory where BLAST was installed, even though I had set all shell variables correctly.

If you do not solve this problem, think about installing WU-BLAST instead of NCBI BLAST. It might also be faster.
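A minimal sketch of the BLASTMAT suggestion made earlier in this thread, written in Python so that the variable is set only for the one blastall call rather than in the login shell. The directory /usr/blast/data and the helper names blast_env, blastall_command and run_blast are assumptions for illustration; only the BLASTMAT variable and the blastall flags -p, -d, -i and -o come from the messages above. Whether a given precompiled blastall honours BLASTMAT is something to check locally.

```python
import os
import subprocess


def blast_env(matrix_dir, base_env=None):
    """Return a copy of the environment with BLASTMAT set to matrix_dir.

    matrix_dir is assumed to be the folder holding BLOSUM62 and friends.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["BLASTMAT"] = matrix_dir
    return env


def blastall_command(program, database, query, output):
    """Build the blastall argument list used in the thread (-p, -d, -i, -o)."""
    return ["blastall", "-p", program, "-d", database, "-i", query, "-o", output]


def run_blast(matrix_dir, program, database, query, output):
    """Run blastall from any working directory with BLASTMAT set explicitly.

    Raises CalledProcessError if blastall exits with a non-zero status.
    """
    subprocess.run(
        blastall_command(program, database, query, output),
        env=blast_env(matrix_dir),
        check=True,
    )
```

Something like run_blast("/usr/blast/data", "blastp", "customgene.txt", "query.txt", "result.out") would then be called from the directory holding the custom database, so the database can live anywhere while the matrices stay in one place.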
From bioinfosm at gmail.com Thu Feb 16 17:17:19 2006
From: bioinfosm at gmail.com (Samantha Fox)
Date: Thu, 16 Feb 2006 17:17:19 -0500
Subject: [BiO BB] BLAST problem - .index files
Message-ID: <726450810602161417l4bc339eblbd62d8e4902b61c5@mail.gmail.com>

Hi!

I have a weird observation. It is not really a problem or error, as BLAST runs fine and gives me the output, but it is something to know about.

I format the nucleotide database as normal with 'formatdb -i data -p F -o T', and this makes all the relevant files. However, I noticed that after some BLAST runs, a .index file and .index.___ also appear, and the next BLAST runs give this sort of exception:

------------- EXCEPTION -------------
MSG: Can't open cache file data.index: Invalid argument
STACK Bio::DB::Fasta::_open_index /usr/local/lib/perl5/site_perl/5.8.5/Bio/DB/Fasta.pm:502
STACK Bio::DB::Fasta::index_file /usr/local/lib/perl5/site_perl/5.8.5/Bio/DB/Fasta.pm:607
STACK Bio::DB::Fasta::new /usr/local/lib/perl5/site_perl/5.8.5/Bio/DB/Fasta.pm:467
STACK toplevel get_ortho.pl.all:43

Are there any BLAST experts here who can help?

Thanks,
~S

From pculpep at hotmail.com Thu Feb 16 17:30:19 2006
From: pculpep at hotmail.com (Pamela Culpepper)
Date: Thu, 16 Feb 2006 22:30:19 +0000
Subject: [BiO BB] Testing blast, getting 'Unable to open BLOSUM62' error?
In-Reply-To: <63839.84.190.18.69.1140127396.squirrel@webmail.charite.de>
Message-ID:

Create a file -- .ncbirc in your home directory and add the following lines --

[NCBI]
data=/home/user/blast/data

[BLAST]
BLASTDB=/home/user/blast/db
BLASTMAT=/home/user/blast/data

Pam

From delete at elfdata.com Thu Feb 16 20:34:51 2006
From: delete at elfdata.com (Theodore H. Smith)
Date: Fri, 17 Feb 2006 01:34:51 +0000
Subject: [BiO BB] Testing blast, getting "Unable to open BLOSUM62" error?
In-Reply-To:
References:
Message-ID:

Hi Pamela,

Copying the files from the data/ directory to the directory that my custom database is in did work.

However, your suggestion to use the .ncbirc file made no difference; I was unable to get that working. This is what I put in my file:

[NCBI]
Data="/usr/blast/data"

[BLAST]
BLASTDB="/usr/blast/dbs"
BLASTMAT="/usr/blast/data"

Removing the quotes made no change.

My blast executables and folders exist in /usr/blast/, so I have /usr/blast/blastall as an executable, and /usr/blast/data as a folder containing BLOSUM files and the rest of the usual blast data.

What now? I'd rather be able to blast against a DB in any directory, if possible. Especially with custom DBs, because I'm likely to delete them as they are computer-generated experimental DBs.

Or should I always put my custom DBs in my "/usr/blast/dbs" folder and be done with it?

From delete at elfdata.com Thu Feb 16 20:41:12 2006
From: delete at elfdata.com (Theodore H. Smith)
Date: Fri, 17 Feb 2006 01:41:12 +0000
Subject: [BiO BB] Testing blast, getting "Unable to open BLOSUM62" error?
In-Reply-To:
References:
Message-ID: <08688232-3FB2-4842-AB0D-573E5315BF24@elfdata.com>

Oh wait, BLASTMAT is a proper environment variable, right?

So I just need to set those lines in my unix thingy that contains the environment variables? I'll do that. I can figure it out myself :)

I thought blast itself actually read .ncbirc.

--
http://elfdata.com/plugin/

From cmsb06 at msr-unitn.unitn.it Fri Feb 17 04:22:14 2006
From: cmsb06 at msr-unitn.unitn.it (COMPUTATIONAL METHODS IN SYSTEMS BIOLOGY)
Date: Fri, 17 Feb 2006 10:22:14 +0100
Subject: [BiO BB] Call for Papers CMSB06
Message-ID: <005201c633a3$a4836c40$66e8a8c0@cosbi.local>

***** Apologies for multiple copies *****

INTERNATIONAL CONFERENCE ON COMPUTATIONAL METHODS IN SYSTEMS BIOLOGY
18 and 19 October 2006
The Microsoft Research - University of Trento Centre for Computational and Systems Biology
TRENTO - ITALY
http://www.msr-unitn.unitn.it/events/cmsb06.php

The CMSB (Computational Methods in Systems Biology) conference series was established in 2003 to help catalyze the convergence of modellers, physicists, mathematicians, and theoretical computer scientists from fields such as language design, concurrency theory, and program verification, with molecular biologists, physicians, and neuroscientists interested in a systems-level understanding of cellular physiology and pathology.

CMSB'06 solicits original research articles (including significant works-in-progress), surveys of current research and posters. These may cover theoretical or applied contributions that are motivated by a biological question and can demonstrate either actual or potential usefulness towards answering that question. They may also cover models of computation inspired by biological processes; the motivation may be as much computational as biological.
Particularly relevant case studies and open issues from the biological side that demand modeling of systems are of interest as well. The introduction of formal models should be supported by theoretical arguments about the model and/or on the analyses that they enable, by comparisons with other network models, and/or by examples of representation and analysis of a biological system.

Topics of interest include:

1. Biological systems and networks: inference, properties, modeling, dynamics, simulation and reverse engineering
2. Formal methods for drug discovery and design
3. Methods to predict biological network behavior from incomplete information
4. Models including symbolic evolution and learning
5. Models of Self-assembly
6. Detailed case-studies on how a biological question was successfully addressed using formal models
7. Emergence of properties in complex biological systems
8. Theoretical comparisons between different formal models of cellular processes
9. Differential, discrete and/or stochastic modeling-language frameworks
10. Quantitative formal languages
11. Biologically-inspired extensions to concurrency theory, constraint programming, logical methods or language equivalences
12. Computer models in nano-sciences applied to biological domains
13. Definition and study of theoretical properties of biologically-inspired formal languages
14. Biological data bases and exchange formats for biological data and standards

History

2003 held in Trento, chaired by Corrado Priami
2004 held in Paris, co-chaired by Vincent Danos and Vincent Schachter
2005 held in Edinburgh, chaired by Gordon Plotkin

Paper and poster submission guidelines

Authors are invited to submit original research papers or survey papers of no more than 15 pages in .pdf format using the LNCS templates, available at the url below:
http://www.springer.com/sgw/cda/frontpage/0,11855,5-164-2-72376-0,00.html

We also accept poster proposals in the form of a text-only abstract describing the poster contents.
Paper and poster descriptions should be sent by e-mail to cmsb06 at msr-unitn.unitn.it. The subject line should be CMSB Paper: (Title of Paper). The body of the e-mail should contain the title, authors and affiliations, an abstract, and the themes to which the paper/poster refers according to the topics of interest list. If no theme is listed, please insert some keywords.

All submissions will be reviewed by the program committee. Accepted papers will be included in the proceedings available at the conference. Publication as an LNBI volume by Springer is under negotiation.

Important Dates (deadlines are strict):

Submission of papers: May 10
Notification of paper acceptance: June 10
Revised version of papers due: June 30
Submission of posters: July 10
Notification of poster acceptance: July 30

Venue

The conference will be held in Trento (Italy) at the premises of the newly established Microsoft Research - University of Trento Centre for Computational and Systems Biology. The dates are 18 - 19 October 2006.
Steering Committee

Finn Drablos, Norwegian University of Science and Technology, Trondheim (NO)
Monika Heiner, TU Cottbus (D)
Patrick Lincoln, Stanford Research International (US)
Satoru Miyano, University of Tokyo (JP)
Corrado Priami, University of Trento (IT)
Magali Roux-Rouquié, CNRS-UPMC (FR)
Vincent Schachter, Genoscope, Evry (FR)
Adelinde Uhrmacher, University of Rostock (D)

Program Committee

Program Committee Chair
Corrado Priami - The Microsoft Research - University of Trento Centre for Computational and Systems Biology - (I)

Charles Auffray, CNRS (F)
Muffy Calder, University of Glasgow (UK)
Luca Cardelli, Microsoft Research Cambridge (UK)
Diego Di Bernardo, Telethon Institute of Genetics and Medicine (IT)
David Harel, Weizmann Institute (Israel)
Monika Heiner, University of Cottbus (D)
Ela Hunt, University of Zürich (CH)
François Kepes, CNRS / Epigenomics Program, Evry (F)
Marta Kwiatkowska, University of Birmingham (UK)
Cosimo Laneve, University of Bologna (IT)
Eduardo Mendoza, LMU (D) and University of the Philippines-Diliman (PH)
Bud Mishra, New York University (US)
Satoru Miyano, University of Tokyo (JP)
Christos Ouzounis, European Bioinformatics Institute (UK)
Gordon Plotkin, University of Edinburgh (UK)
Alessandro Quattrone, University of Florence (IT)
Magali Roux-Rouquié, CNRS-UPMC (F)
David Searls, Senior Vice-President, Worldwide Bioinformatics - GlaxoSmithKline (US)
Adelinde Uhrmacher, University of Rostock (D)
Alfonso Valencia, Centro Nacional de Biotecnologia-CSIC (ES)

Organizing Committee

Matteo Cavaliere and Elisabetta Nones - The Microsoft Research - University of Trento Centre for Computational and Systems Biology (IT)
Events and Meetings Office of the University of Trento (IT) http://www.unitn.it/ln/umc

The Microsoft Research - University of Trento Centre for Computational and Systems Biology
P.zza Manci, 17 - 38050 Povo, Italy
tel: +39 0461/882811
fax: +39 0461/882814
email: cmsb06 at msr-unitn.unitn.it
web: http://www.msr-unitn.unitn.it/
From delete at elfdata.com Fri Feb 17 09:45:11 2006
From: delete at elfdata.com (Theodore H. Smith)
Date: Fri, 17 Feb 2006 14:45:11 +0000
Subject: [BiO BB] Want to test a custom scoring matrix
Message-ID:

Hi people,

I'm just playing around with BLAST to see what it outputs. I figured I'd better understand BLAST now as I'm going to have to understand it by the end of this project anyhow. (I'm at the beginning of this project.)

I'm trying to test a made-up scoring matrix. It's a very simple one.

# Here is a matrix I made up.
   A  R  N  D ...
A  1 -1 -1 -1
R -1  1 -1 -1
N -1 -1  1 -1
D -1 -1 -1  1
...

This is just to get a better idea of how BLAST works. However, I can't figure out how to tell blast to use this matrix file! I don't see it in the documentation, although I might have missed something.

Any ideas anyone?

What this experiment is trying to figure out is whether BLAST returns multiple sections when doing local alignments.
Basically, it is the same question I had before about Smith-Waterman; I was trying to figure out how BLAST works in the same respect. In other words, does BLAST match BBBBAAAA to AAAABBBB, to give the alignments:

AAAA
::::
AAAA

and

BBBB
::::
BBBB

I thought I'd figure out the answer myself instead of asking, but I couldn't figure out how to input custom scoring matrices :)

--
http://elfdata.com/plugin/

From pculpep at hotmail.com Fri Feb 17 12:29:56 2006
From: pculpep at hotmail.com (Pamela Culpepper)
Date: Fri, 17 Feb 2006 17:29:56 +0000
Subject: [BiO BB] Testing blast, getting "Unable to open BLOSUM62" error?
In-Reply-To:
Message-ID:

If you are using pre-compiled BLAST (downloaded from NCBI), it may be stopping you from reading the .ncbirc file. They have a define -- NCBI_DONT_USE_LOCAL_CONFIG -- that could be set to TRUE, and that may be causing the problem.

My .ncbirc file is --

[pac at pam-nb ~]$ more .ncbirc
[NCBI]
data=/home/pac/blast/data

[BLAST]
BLASTDB=/home/pac/blast/db
BLASTMAT=/home/pac/blast/data

and it works fine.

Go to -- http://www.lifeformulae.com/local_blast_graphics/index.html to see how to compile BLAST for a local machine running Linux.

Pam

However, I have compiled blastall (the blast programs wrapper) from scratch.

From marty.gollery at gmail.com Fri Feb 17 13:18:27 2006
From: marty.gollery at gmail.com (Martin Gollery)
Date: Fri, 17 Feb 2006 10:18:27 -0800
Subject: [BiO BB] Want to test a custom scoring matrix
In-Reply-To:
References:
Message-ID:

Hi Theodore,

For serious work, I don't recommend this, because you would really have to recalculate lambda and the gap penalties. For playing around, you can specify the matrix with -M. The easiest thing to do is to back up BLOSUM62, then save your test matrix with that name. Sneaky, but I think it will work.

Marty

--
Martin Gollery
Associate Director
Center For Bioinformatics
University of Nevada at Reno
Dept. of Biochemistry / MS330
775-784-7042

From pculpep at hotmail.com Sat Feb 18 12:26:46 2006
From: pculpep at hotmail.com (Pamela Culpepper)
Date: Sat, 18 Feb 2006 17:26:46 +0000
Subject: [BiO BB] Testing blast, getting "Unable to open BLOSUM62" error?
In-Reply-To: <08688232-3FB2-4842-AB0D-573E5315BF24@elfdata.com>
Message-ID:

right.

From ahmed at pobox.com Fri Feb 17 19:59:52 2006
From: ahmed at pobox.com (Ahmed Moustafa)
Date: Fri, 17 Feb 2006 18:59:52 -0600
Subject: [BiO BB] phylogeny search engine?
Message-ID: <43F67188.6010700@pobox.com>

Hi All!
We are working on a project where we need to search a large number of phylogeny trees for an approximate (or exact) containment of a query tree, what tool/algorithm would help achieving that? Your help will be appreciated so much! Ahmed From dmb at mrc-dunn.cam.ac.uk Sun Feb 19 08:36:42 2006 From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser) Date: Sun, 19 Feb 2006 13:36:42 +0000 Subject: [BiO BB] phylogeny search engine? In-Reply-To: <43F67188.6010700@pobox.com> References: <43F67188.6010700@pobox.com> Message-ID: <43F8746A.5080203@mrc-dunn.cam.ac.uk> Ahmed Moustafa wrote: > Hi All! > > > We are working on a project where we need to search a large number of > phylogeny trees for an approximate (or exact) containment of a query > tree, what tool/algorithm would help achieving that? > > Your help will be appreciated so much! > You should be able to do 'sub tree matching' quite easily, but 'ontology alignment' is a big research field! So far as I know there are no 'exact algorithms' for 'approximate sub tree matching'. > Ahmed > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board From maximilianh at gmail.com Sun Feb 19 08:52:37 2006 From: maximilianh at gmail.com (Maximilian Haeussler) Date: Sun, 19 Feb 2006 14:52:37 +0100 Subject: [BiO BB] Tool to mutate DNA sequence In-Reply-To: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> Message-ID: <76f031ae0602190552v5f2542dbv@mail.gmail.com> Hi bio-mailinglists, does anyone here know of a tool or a library to display two (or more) sequences at the same time with coloured features? Possibly with lines, connecting some features from one sequence to the other (synteny-plot) ? Or to display two multiple alignments, one on top of each other, with colored features added? 
It's not that it would be difficult to write, but programming visualisation usually takes a lot of time. Bio::Graphics seems mainly concerned with one main sequence and features on it. Well, I could copy together two of these gif-images, but then there would be no connecting lines. Same applies for the graphics in Biojava or the gff2ps tool or all the multiple alignment viewers that I know (Bioedit, ClustalX). There is something called Toucan in Java, which displays at least several lines of gff-style-features, but no visible sequences and more importantly, no connecting lines. A recent software, Djinn lite, is using a similar kind of visualization to compare different spliced genes from various species, but it's mainly aimed at splicing and written in Visual Basic. I guess a good compromise might be the 3D viewer Sockeye, but I haven't seen any synteny-lines in sockeye yet. I guess I must have missed something here. I cannot be the first one that would like to compare, say, two gff files, or two multiple alignments? Thanks a lot for any idea, Max -------------- next part -------------- An HTML attachment was scrubbed... URL: From bikram80 at gmail.com Sun Feb 19 09:06:02 2006 From: bikram80 at gmail.com (Bikram Nayak) Date: Sun, 19 Feb 2006 19:36:02 +0530 Subject: [BiO BB] Tool to mutate DNA sequence In-Reply-To: <76f031ae0602190552v5f2542dbv@mail.gmail.com> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <76f031ae0602190552v5f2542dbv@mail.gmail.com> Message-ID: <16f934830602190606x1e8c5b42k42c1b74da903ce8d@mail.gmail.com> yes i know but why i would tell u ? import tcl tk package from CPAN ,naughty -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From moyc at mail.med.upenn.edu Sun Feb 19 13:51:15 2006 From: moyc at mail.med.upenn.edu (Christopher Moy) Date: Sun, 19 Feb 2006 13:51:15 -0500 Subject: [BiO BB] Gene Search in Worm Message-ID: <20fdce30fd904ecbe4dad8f3196c463c@mail.med.upenn.edu> Hello, I have been trying to find a gene in worm that has orthologs in higher order taxa (human, chimp, dog, etc.) and in fungi (yeast, Candida albicans, etc.). Using the protein database from WormBase I have run HMM, PSI-BLAST, FuzzPro, MEME and BLOCKS to find a similar domain structure shared among all homologs. Although I find some loose e-values (.001-.1), the proteins that have this score fail to contain distinct domains that are shared among all putative orthologs. From what I can tell, it is possible that there are no such proteins in the current wormpep database. I have also performed tblastn runs using various orthologs as queries to search for genome-wide hits to the worm, but any prospective ortholog fails to contain the domains in other orthologs. Does anyone have any other suggestions for further searching? Should I consider a more comprehensive sequence alignment strategy (Smith-Waterman, etc.)? Note: There is no putative structure for this gene available right now. Chris From hararid at bgumail.bgu.ac.il Sun Feb 19 14:11:11 2006 From: hararid at bgumail.bgu.ac.il (hararid at bgumail.bgu.ac.il) Date: Sun, 19 Feb 2006 21:11:11 +0200 Subject: [BiO BB] Gene Search in Worm Message-ID: <20060219190053.27B1A33E78@smtp2.bgu.ac.il> Chris, It is possible that your worm gene does not have homologues in higher organisms, although it is likely that you will find homologs amongst different worm species. Many such examples exist. The same goes for flies, etc. The most sensitive search you can perform to hunt for homologs is the following: Extract the homologs from different worm species. Create a multiple sequence alignment (e.g. ClustalW).
Then use the multiple sequence alignment to generate a PROFILE. You can then run this profile againt various databases (IE: profilesearches). This can be done using the HMMER algorithm or using a Smith-Waterman based profilesearch algorithm. For the latter I refer you to this site: http://eta.embl-heidelberg.de:8000/misc/ Daniel hararid at bgu.ac.il > > From: Christopher Moy > Date: 2006/02/19 ? PM 08:51:15 GMT+02:00 > To: bio_bulletin_board at bioinformatics.org > Subject: [BiO BB] Gene Search in Worm > > Hello, > > I have been trying to find a gene in worm that has orthologs in higher > order taxa (human, chimp, dog, etc) and in fungi (yeast, candid > albicans, etc). Using the protein database from wormbase I have ran > HMM, PSI-BLAST, FuzzPro, MEME and BLOCKS to find a similar domain > structure shared among all homologs. Although I find some loose > e-values (.001-.1) but the proteins that have this score fail to > contain distinct domains that is shared among all putative orthologs. > From what I can tell, it is possible that there are no proteins in the > current wormpep database. I have also performed tblastn runs using > various orthologs as queries to search for genome wide hits to the worm > but any prospective ortholog fails to contain the domains in other > orthologs. Does anyone have any other suggestions for further > searching? Should I consider a more comprehensive sequence alignment > strategy (Smith Waterman,etc). > > Note: There is no putative structure for this gene available right now. > > Chris > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > From delete at elfdata.com Sun Feb 19 16:54:53 2006 From: delete at elfdata.com (Theodore H. Smith) Date: Sun, 19 Feb 2006 21:54:53 +0000 Subject: [BiO BB] More Smith-Waterman internals. Gaps extensions this time. 
Message-ID: <16071B28-7A8E-4E30-A3D9-1A4DC1A6B46D@elfdata.com> Hi people, Is there anyone out there with enough knowledge of the arcane field of Smith-Waterman matrixes to answer this? OK, so Smith-Waterman is best done using a gap penalty that has a start cost and an extension cost. I want to implement a Smith-Waterman algorithm. Must I maintain a "gap or not" matrix alongside the cell score matrix? I'd rather not, if that is possible without wasting CPU time, because maintaining and processing two matrixes effectively makes my algorithm run 2x as slow as it would otherwise! If there is a nice simple trick to avoid needing two matrixes, that would be great. If not, well at least I know. The answer could really be something obvious like "well I can't see any way to avoid needing two matrixes, it looks pretty much like you'll have to use two" :) Maybe someone better than me with this stuff has a good answer. -- http://elfdata.com/plugin/ From idoerg at burnham.org Sun Feb 19 22:34:12 2006 From: idoerg at burnham.org (Iddo Friedberg) Date: Sun, 19 Feb 2006 19:34:12 -0800 Subject: [BiO BB] More Smith-Waterman internals. Gaps extensions this time. In-Reply-To: <16071B28-7A8E-4E30-A3D9-1A4DC1A6B46D@elfdata.com> Message-ID: On Sun, 19 Feb 2006, Theodore H. Smith wrote: > > Hi people, > > Is there anyone out there with enough knowledge of the arcane field > of Smith-Waterman matrixes to answer this? > > OK, so Smith-Waterman is best done using a gap penalty that has a > start cost and an extension cost. > > I want to implement a Smith-Waterman algorithm. > > Must I maintain a "gap or not" matrix alongside the cell score matrix? > If I understand your question correctly, it seems like you should pick up a good Bioinformatics textbook and / or read the original Smith-Waterman paper. I'm not sure how exactly you want to implement SW given your question. The short answer is "no".
When building the distance matrix, you use the gap insertion and extension penalties as initially given to generate the values of that matrix. Here is a free resource explaining the basics of dynamic programming for sequence alignment, of which SW is a variant. http://www.techfak.uni-bielefeld.de/bcd/Curric/PrwAli/prwali.html HTH, Iddo -- Iddo Friedberg, Ph.D. Burnham Institute for Medical Research 10901 N. Torrey Pines Rd. La Jolla, CA 92037, USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 646 3171 http://iddo-friedberg.org http://BioFunctionPrediction.org From shameer at ncbs.res.in Mon Feb 20 01:21:01 2006 From: shameer at ncbs.res.in (Shameer Khadar) Date: Mon, 20 Feb 2006 11:51:01 +0530 (IST) Subject: [BiO BB] Matrix Average Code / Module ? In-Reply-To: <76f031ae0602190552v5f2542dbv@mail.gmail.com> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <76f031ae0602190552v5f2542dbv@mail.gmail.com> Message-ID: <59825.192.168.1.176.1140416461.squirrel@192.168.1.176> Hi all, Is there any program/module to calculate the average of a blosum/pam any matrix ? I have a matrix and I need to see the average for example 11 22 43 54 50 27 87 74 32 10 66 58 98 78 20 22 23 44 16 34 I have gone through Bio::Matrix::MatrixI and Bio::Matrix::GenericMatrix and other perl modules like Math::Matrix http://search.cpan.org/~ulpfr/Math-Matrix-0.4/Matrix.pm and Math::Cephes::Matrix - but none of them have a provison to do matrix average calculation. Any help ??? thanks in advance, Happy biocomputing !!! -- Shameer Khadar National Centre for Biological Sciences (TIFR) UAS - GKVK Campus - Bellary Road Bangalore - 65 - Karnataka - India T - 91-080-23636420-32 EXT 4241 F - 91-080-23636662/23636675 W - http://www.ncbs.res.in -------------------------------------------------- "Refrain from illusions, insist on work and not words, patiently seek divine and scientific truth." 
MM From gbottu at ben.vub.ac.be Mon Feb 20 03:04:11 2006 From: gbottu at ben.vub.ac.be (Guy Bottu) Date: Mon, 20 Feb 2006 09:04:11 +0100 Subject: [EMBOSS] [BiO BB] Tool to mutate DNA sequence - Checked by AntiVir In-Reply-To: <76f031ae0602190552v5f2542dbv@mail.gmail.com> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <76f031ae0602190552v5f2542dbv@mail.gmail.com> Message-ID: <20060220080411.GB26540@bigben.ulb.ac.be> On Sun, Feb 19, 2006 at 02:52:37PM +0100, Maximilian Haeussler wrote: > does anyone here know of a tool or a library to display two (or more) > sequences at the same time with coloured features? Possibly with lines, > connecting some features from one sequence to the other (synteny-plot) ? > Or to display two multiple alignments, one on top of each other, with > colored features added? Well, there is Alfresco http://www.sanger.ac.uk/Software/Alfresco/ Guy Bottu, Belgian EMBnet Node From khoueiry at ibdm.univ-mrs.fr Mon Feb 20 04:27:07 2006 From: khoueiry at ibdm.univ-mrs.fr (khoueiry) Date: Mon, 20 Feb 2006 10:27:07 +0100 Subject: [Bioperl-l] [BiO BB] Tool to mutate DNA sequence In-Reply-To: <76f031ae0602190552v5f2542dbv@mail.gmail.com> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <76f031ae0602190552v5f2542dbv@mail.gmail.com> Message-ID: <1140427628.10569.10.camel@localhost> Hi Maximilian, I hope that I understand your question. An easy way to do so is to put your sequences in separate arrays and loop over these arrays while comparing positions: if the nuc/aa are equal, push a pipe ( "|" ) into a third separate array, else push a space ( " " ). Once it's done, print your arrays by looping over them...

array1 qw (A A G C T)
array2 qw (C A C C G)
array3 qw (  |   |  )

Then, print your arrays and that will give you

A A G C T
  |   |
C A C C G

Surely, you can color whatever you want in that case too (i.e. the aligned nuc).
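For anyone who would rather not type it out, here is a minimal sketch of the approach described above, written in Python rather than Perl. The function names `match_line` and `format_pair` are made up for this illustration, and it assumes the two sequences are already aligned to the same length; it only marks identical positions and does not draw synteny lines between features.

```python
def match_line(seq1, seq2, match="|", mismatch=" "):
    """Return a string marking identical positions in two equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    # One marker character per position: a pipe where residues agree, a space otherwise.
    return "".join(match if a == b else mismatch for a, b in zip(seq1, seq2))


def format_pair(seq1, seq2):
    """Return the three display lines: first sequence, match line, second sequence."""
    return "\n".join([seq1, match_line(seq1, seq2), seq2])


if __name__ == "__main__":
    # Same toy sequences as in the message above.
    print(format_pair("AAGCT", "CACCG"))
```

Colouring could then be added by wrapping the matched characters in whatever markup the output device understands, which is left out here.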
Hope this was clear Pierre On Sun, 2006-02-19 at 14:52 +0100, Maximilian Haeussler wrote: > Hi bio-mailinglists, > > does anyone here know of a tool or a library to display two (or more) > sequences at the same time with coloured features? Possibly with lines, > connecting some features from one sequence to the other (synteny-plot) ? > Or to display two multiple alignments, one on top of each other, with > colored features added? > > It's not that it would be difficult to write, but programming visualisation > usually takes a lot of time. > Bio::Graphics seems mainly concerned with one main sequence and features on > it. Well, I could copy together two of these gif-images, but then there > would be no connecting lines. Same applies for the graphics in Biojava or > the gff2ps tool or all the multiple alignment viewers that I know (Bioedit, > ClustalX). There is something called Toucan in Java, which displays at least > several lines of gff-style-features, but no visible sequences and more > importantly, no connecting lines. A recent software, Djinn lite, is using a > similar kind of visualization to compare different spliced genes from > various species, but it's mainly aimed at splicing and written in Visual > Basic. > I guess a good compromise might be the 3D viewer Sockeye, but I haven't seen > any synteny-lines in sockeye yet. > > I guess I must have missed something here. I cannot be the first one that > would like to compare, say, two gff files, or two multiple alignments? > > Thanks a lot for any idea, > Max > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -------------- next part -------------- An HTML attachment was scrubbed... URL: From ahmed at users.sourceforge.net Mon Feb 20 12:28:28 2006 From: ahmed at users.sourceforge.net (Ahmed Moustafa) Date: Mon, 20 Feb 2006 11:28:28 -0600 Subject: [BiO BB] phylogeny search engine? 
In-Reply-To: <43F8746A.5080203@mrc-dunn.cam.ac.uk> References: <43F67188.6010700@pobox.com> <43F8746A.5080203@mrc-dunn.cam.ac.uk> Message-ID: <43F9FC3C.10404@users.sourceforge.net> Hi Dan, Could you please give more details on how to do "sub tree matching" (e.g. which algorithm)? Thanks, Ahmed On 2/19/2006 7:36 AM, Dan Bolser wrote: > Ahmed Moustafa wrote: > >> Hi All! >> >> >> We are working on a project where we need to search a large number of >> phylogeny trees for an approximate (or exact) containment of a query >> tree, what tool/algorithm would help achieving that? >> >> Your help will be appreciated so much! >> > > You should be able to do 'sub tree matching' quite easily, but > 'ontology alignment' is a big research field! So far as I know there > are no 'exact algorithms' for 'approximate sub tree matching'. From boris.steipe at utoronto.ca Mon Feb 20 13:40:19 2006 From: boris.steipe at utoronto.ca (Boris Steipe) Date: Mon, 20 Feb 2006 13:40:19 -0500 Subject: [BiO BB] Re: [Bioperl-l] Matrix Average Code / Module ? In-Reply-To: <59825.192.168.1.176.1140416461.squirrel@192.168.1.176> References: <003b01c62d33$d37d15d0$e6028a0a@GOLHARMOBILE1> <76f031ae0602190552v5f2542dbv@mail.gmail.com> <59825.192.168.1.176.1140416461.squirrel@192.168.1.176> Message-ID: <92CF0104-0524-4BA3-B039-3CEECF68E20B@utoronto.ca> Assuming you mean the arithmetic average of all elements in a matrix, you could do the following (using your numbers):

#!/usr/bin/perl -w
use strict;

my @matrix;
push(@matrix, [(11,22,43,54,50)]);  # [(...)] :a list passed as an anonymous array
push(@matrix, [(27,87,74,32,10)]);
push(@matrix, [(66,58,98,78,20)]);
push(@matrix, [(22,23,44,16,34)]);

my $sum = 0;
my $number = 0;

foreach my $row (@matrix) {
    foreach my $element (@{$row}) {
        $sum += $element;
        $number++;
    }
}

print "Average of $number elements = ", $sum/$number, "\n";
exit;

HTH, B.
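For readers not working in Perl, the same calculation as the snippet above can be sketched in Python. This is only an illustration under the same assumption (a plain arithmetic mean over every element); the function name `matrix_mean` is made up here and is not part of BioPerl or any other library.

```python
def matrix_mean(matrix):
    """Arithmetic mean of all elements in a list-of-rows matrix."""
    total = 0
    count = 0
    for row in matrix:
        for element in row:
            total += element
            count += 1
    if count == 0:
        raise ValueError("matrix has no elements")
    return total / count


if __name__ == "__main__":
    # The example numbers from the original question.
    matrix = [
        [11, 22, 43, 54, 50],
        [27, 87, 74, 32, 10],
        [66, 58, 98, 78, 20],
        [22, 23, 44, 16, 34],
    ]
    print("Average of all elements =", matrix_mean(matrix))
```

Counting elements as they are visited, as the Perl version does, means rows of unequal length are handled without special cases.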
On 20 Feb 2006, at 01:21, Shameer Khadar wrote: > Hi all, > Is there any program/module to calculate the average of a blosum/ > pam any > matrix ? > > I have a matrix and I need to see the average > > for example > > 11 22 43 54 50 > 27 87 74 32 10 > 66 58 98 78 20 > 22 23 44 16 34 > > I have gone through Bio::Matrix::MatrixI and > Bio::Matrix::GenericMatrix > and other perl modules like Math::Matrix > http://search.cpan.org/~ulpfr/Math-Matrix-0.4/Matrix.pm > and Math::Cephes::Matrix - but none of them have a provison to do > matrix > average calculation. > > Any help ??? > thanks in advance, > Happy biocomputing !!! > > > -- > Shameer Khadar > National Centre for Biological Sciences (TIFR) > UAS - GKVK Campus - Bellary Road Bangalore - 65 - Karnataka - India > T - 91-080-23636420-32 EXT 4241 > F - 91-080-23636662/23636675 > W - http://www.ncbs.res.in > -------------------------------------------------- > "Refrain from illusions, insist on work and not words, > patiently seek divine and scientific truth." > MM > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From smagarwal at yahoo.com Tue Feb 21 00:03:24 2006 From: smagarwal at yahoo.com (Subhash Agarwal) Date: Tue, 21 Feb 2006 05:03:24 +0000 (GMT) Subject: [BiO BB] Mean and Standard deviation Message-ID: <20060221050325.25235.qmail@web31501.mail.mud.yahoo.com> Dear Forum Memebers I have a dataset with 394 data points. The frequecy distribution of which is as follows: 0-0.02 85 0.02-0.04 63 0.04-0.06 51 0.06-0.08 39 0.08-0.1 36 0.10-0.12 24 -0.12-0.14 21 0.14-0.16 19 0.16-0.18 13 0.18-0.2 10 >0.20 33 The mean and SD of the data is 0.0842 and 0.0853. But when the data points from 0-0.02 and > 0.2 are removed and the mean and SD is calculated it is found that mean doesnt change much (0.0819) but SD does (0.0472). My query is what can I interpret from this regarding the data. 
If any of the forum members feels that this question should be addressed to some other place do let me know. Thanks Subhash --------------------------------- Jiyo cricket on Yahoo! India cricket Yahoo! Messenger Mobile Stay in touch with your buddies all the time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From siowcheng82 at yahoo.com Tue Feb 21 04:34:42 2006 From: siowcheng82 at yahoo.com (chan cheng) Date: Tue, 21 Feb 2006 01:34:42 -0800 (PST) Subject: [BiO BB] how to get PSSM profile from PSI-BLAST Message-ID: <20060221093442.93682.qmail@web34311.mail.mud.yahoo.com> I m doing a research about prediction of beta turns in proteins from multiple alignment using neural network. I had paste the beta turn sequence in PSI-BLAST and summit the form. however i cant get the PSSM profile that I need it as my neural network input. how can i get the profile from PSI-BLAST? __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com From karen.morris at bbsrc.ac.uk Tue Feb 21 06:45:41 2006 From: karen.morris at bbsrc.ac.uk (karen morris (RRes-Roth)) Date: Tue, 21 Feb 2006 11:45:41 -0000 Subject: [BiO BB] 3rd Integrative Bioinformatics Workshop Message-ID: *** CALL FOR PAPERS *** 3rd Integrative Bioinformatics Workshop September 4-6, 2006 Rothamsted Research, Harpenden, Hertfordshire, United Kingdom http://www.rothamsted.bbsrc.ac.uk/bab/conf/ibiof/ Accepted papers will also appear in the Journal of Integrative Bioinformatics http://journal.imbio.de/ DESCRIPTION Biological data are scattered across hundreds of biological databases and thousands of scientific journals. Current high throughput genomics technologies generate large quantities of high dimensional data. 
Microarray, NMR, mass spectrometry, protein chips, gel electrophoresis data, Yeast-Two-Hybrid, QTL mapping, gene silencing and knockout experiments are all examples of technologies that capture thousands of data points, often in single experiments. The challenge for Integrative Bioinformatics is to capture, model, integrate and analyse these data in a consistent way to provide new and deeper insights into complex biological systems. This, third workshop on Integrative Bioinformatics will be of interest to Bioinformaticians, Computer Scientists and others working in, or interested in finding out more about, the developing area of integrative bioinformatics. There will be opportunities to present and discuss methods, theoretical approaches or their practical applications. TOPICS Database Integration Combined dry and wetlab studies Molecular Databases / Data Warehouses Errors and inconsistencies in biological databases Prediction and Integration of Metabolic and Regulatory Networks Genotype - phenotype linkage Protein-Protein-Interactions Microarray Modeling and Analyses Integrative Approaches for Drug Design Computational Infrastructure for Biotechnology Virtual Cell Modeling Gene Identification, Regulation and Expression Identification of Gene Regulatory Networks Computational Systems Biology Computational Proteomics Optimization of Workflow Management in Bioinformatics Bio Ontologies Quality and consistency of ontologies Integrative modeling and simulation frameworks Integrative data and text mining approaches IMPORTANT DATES May 8: Paper submission deadline June 23: Notification of acceptance for papers July 17: Camera ready paper submission deadline August 1: Registration deadline August 15: Poster submission deadline Karen Morris Rothamsted Research Harpenden Herts AL5 2JQ Telephone Number: 01582 763133 Extension 2813 From CHRISTOPHER_FRENZ at NYMC.EDU Tue Feb 21 11:57:39 2006 From: CHRISTOPHER_FRENZ at NYMC.EDU (Frenz, Christopher) Date: Tue, 21 Feb 2006 
11:57:39 -0500 Subject: [BiO BB] Mean and Standard deviation References: <20060221050325.25235.qmail@web31501.mail.mud.yahoo.com> Message-ID: <70C50B8807B54A429AC206E83A3BA6BCAD8C5C@mail.nymc.edu> SD calculations involve the sample size as one of the variables. This is likely what is causing the change in SD, since you are reducing the the sample size when you remove the two sets of endpoints. Chris -----Original Message----- From: bio_bulletin_board-bounces+christopher_frenz=nymc.edu at bioinformatics.org on behalf of Subhash Agarwal Sent: Tue 2/21/2006 12:03 AM To: bioinformatics.org Subject: [BiO BB] Mean and Standard deviation Dear Forum Memebers I have a dataset with 394 data points. The frequecy distribution of which is as follows: 0-0.02 85 0.02-0.04 63 0.04-0.06 51 0.06-0.08 39 0.08-0.1 36 0.10-0.12 24 -0.12-0.14 21 0.14-0.16 19 0.16-0.18 13 0.18-0.2 10 >0.20 33 The mean and SD of the data is 0.0842 and 0.0853. But when the data points from 0-0.02 and > 0.2 are removed and the mean and SD is calculated it is found that mean doesnt change much (0.0819) but SD does (0.0472). My query is what can I interpret from this regarding the data. If any of the forum members feels that this question should be addressed to some other place do let me know. Thanks Subhash --------------------------------- Jiyo cricket on Yahoo! India cricket Yahoo! Messenger Mobile Stay in touch with your buddies all the time. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: winmail.dat Type: application/ms-tnef Size: 3509 bytes Desc: not available URL: From marty.gollery at gmail.com Tue Feb 21 14:12:10 2006 From: marty.gollery at gmail.com (Martin Gollery) Date: Tue, 21 Feb 2006 11:12:10 -0800 Subject: [BiO BB] how to get PSSM profile from PSI-BLAST In-Reply-To: <20060221093442.93682.qmail@web34311.mail.mud.yahoo.com> References: <20060221093442.93682.qmail@web34311.mail.mud.yahoo.com> Message-ID: Hi Chan, To output the profile, use the -C option, as in blastpgp -d nr -i myfile.faa -C myprofile.ckp -j 3 This will save the profile as myprofile.ckp. Marty On 2/21/06, chan cheng wrote: > > I m doing a research about prediction of beta turns in > proteins from multiple alignment using neural network. > I had paste the beta turn sequence in PSI-BLAST and > summit the form. however i cant get the PSSM profile > that I need it as my neural network input. > how can i get the profile from PSI-BLAST? > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > -- -- Martin Gollery Associate Director Center For Bioinformatics University of Nevada at Reno Dept. of Biochemistry / MS330 775-784-7042 ----------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From aloraine at gmail.com Tue Feb 21 15:22:27 2006 From: aloraine at gmail.com (Ann Loraine) Date: Tue, 21 Feb 2006 14:22:27 -0600 Subject: [BiO BB] collecting all genes with a given GO id or its child terms Message-ID: <83722dde0602211222l30f0c88cye4e5aeae29d81d45@mail.gmail.com> Hi, Maybe somebody on the list could point me in the right direction? 
I'm looking for something that takes a list of GO ids and then returns a list of all Arabidopsis 'AGI' codes (gene ids, e.g., AT4G30490) that have been annotated with a GO id on the list and/or the child term of any GO term on the list. All the best, Ann Loraine -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org From CHRISTOPHER_FRENZ at NYMC.EDU Tue Feb 21 12:40:37 2006 From: CHRISTOPHER_FRENZ at NYMC.EDU (Frenz, Christopher) Date: Tue, 21 Feb 2006 12:40:37 -0500 Subject: [BiO BB] Mean and Standard deviation References: <20060221050325.25235.qmail@web31501.mail.mud.yahoo.com> Message-ID: <70C50B8807B54A429AC206E83A3BA6BCAD8C5E@mail.nymc.edu> Oops! Missed the fact that he posted the SD values, so I didn't note the direction of the change. Let's give a full explanation this time. SD = sqrt(Summation((Xi-Mean)^2)/(n-1)), where Xi is used to iterate through each value in the data set. Basically you begin by summing the squared differences between the mean and each value in the data set. Getting rid of the endpoints will likely lessen the value of this number, resulting in a smaller SD, since the remaining numbers will result in smaller differences with the mean. This number is then divided by (n-1), where n is the sample size. Thus, for a given sum of squared differences, a smaller n will result in a larger SD and a larger n will result in a smaller SD. The final value of SD is the square root of the number that results from the division by (n-1). Chris -----Original Message----- From: bio_bulletin_board-bounces+christopher_frenz=nymc.edu at bioinformatics.org on behalf of Subhash Agarwal Sent: Tue 2/21/2006 12:03 AM To: bioinformatics.org Subject: [BiO BB] Mean and Standard deviation Dear Forum Memebers I have a dataset with 394 data points.
The frequecy distribution of which is as follows: 0-0.02 85 0.02-0.04 63 0.04-0.06 51 0.06-0.08 39 0.08-0.1 36 0.10-0.12 24 -0.12-0.14 21 0.14-0.16 19 0.16-0.18 13 0.18-0.2 10 >0.20 33 The mean and SD of the data is 0.0842 and 0.0853. But when the data points from 0-0.02 and > 0.2 are removed and the mean and SD is calculated it is found that mean doesnt change much (0.0819) but SD does (0.0472). My query is what can I interpret from this regarding the data. If any of the forum members feels that this question should be addressed to some other place do let me know. Thanks Subhash --------------------------------- Jiyo cricket on Yahoo! India cricket Yahoo! Messenger Mobile Stay in touch with your buddies all the time. -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3833 bytes Desc: not available URL: From marty.gollery at gmail.com Tue Feb 21 12:17:56 2006 From: marty.gollery at gmail.com (Martin Gollery) Date: Tue, 21 Feb 2006 09:17:56 -0800 Subject: [BiO BB] Mean and Standard deviation In-Reply-To: <70C50B8807B54A429AC206E83A3BA6BCAD8C5C@mail.nymc.edu> References: <20060221050325.25235.qmail@web31501.mail.mud.yahoo.com> <70C50B8807B54A429AC206E83A3BA6BCAD8C5C@mail.nymc.edu> Message-ID: Chris is correct. Another way to look at it is to consider that you are removing the values that have the greatest deviation from the mean, and so it makes sense that you then have a lower standard deviation. Marty On 2/21/06, Frenz, Christopher wrote: > > SD calculations involve the sample size as one of the variables. This is > likely what is causing the change in SD, since you are reducing the the > sample size when you remove the two sets of endpoints. 
> > Chris > > > -----Original Message----- > From: bio_bulletin_board-bounces+christopher_frenz= > nymc.edu at bioinformatics.org on behalf of Subhash Agarwal > Sent: Tue 2/21/2006 12:03 AM > To: bioinformatics.org > Subject: [BiO BB] Mean and Standard deviation > > Dear Forum Memebers > > I have a dataset with 394 data points. The frequecy distribution of > which is as follows: > > 0-0.02 85 0.02-0.04 63 0.04-0.06 51 0.06-0.08 > 39 0.08-0.1 36 0.10-0.12 24 -0.12-0.14 21 0.14-0.16 > 19 0.16-0.18 13 0.18-0.2 10 >0.20 33 > > The mean and SD of the data is 0.0842 and 0.0853. But when the data > points from 0-0.02 and > 0.2 are removed and the mean and SD is calculated > it is found that mean doesnt change much (0.0819) but SD does (0.0472). My > query is what can I interpret from this regarding the data. > > If any of the forum members feels that this question should be addressed > to some other place do let me know. > > Thanks > Subhash > > > --------------------------------- > Jiyo cricket on Yahoo! India cricket > Yahoo! Messenger Mobile Stay in touch with your buddies all the time. > > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > -- -- Martin Gollery Associate Director Center For Bioinformatics University of Nevada at Reno Dept. of Biochemistry / MS330 775-784-7042 ----------- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ryan.raaum at gmail.com Tue Feb 21 12:41:22 2006 From: ryan.raaum at gmail.com (Ryan Raaum) Date: Tue, 21 Feb 2006 12:41:22 -0500 Subject: [BiO BB] Mean and Standard deviation In-Reply-To: <20060221050325.25235.qmail@web31501.mail.mud.yahoo.com> References: <20060221050325.25235.qmail@web31501.mail.mud.yahoo.com> Message-ID: Hi, One thing you might consider is that your data are nowhere near normally distributed, so it's unclear how mean and standard deviation are useful summary statistics for these data. Best, Ryan On 2/21/06, Subhash Agarwal wrote: > > Dear Forum Memebers > > I have a dataset with 394 data points. The frequecy distribution of which > is as follows: > > 0-0.02 85 0.02-0.04 63 0.04-0.06 51 0.06-0.08 39 0.08-0.1 36 0.10-0.12 > 24 -0.12-0.14 21 0.14-0.16 19 0.16-0.18 13 0.18-0.2 10 >0.20 > 33 > > The mean and SD of the data is 0.0842 and 0.0853. But when the data points > from 0-0.02 and > 0.2 are removed and the mean and SD is calculated it is > found that mean doesnt change much (0.0819) but SD does (0.0472). My query > is what can I interpret from this regarding the data. > > If any of the forum members feels that this question should be addressed > to some other place do let me know. > > Thanks > Subhash > > ------------------------------ > Jiyo cricket on Yahoo! India cricket > Yahoo! Messenger MobileStay in touch with your buddies all the time. > > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > -- Ryan Raaum http://www.rockefeller.edu -- Bacterial Pathogenesis and Immunology http://www.worldmartial.com -- Black Belt Instructor http://locomotive.raaum.org -- Self contained one-click Rails for Mac OS X -------------- next part -------------- An HTML attachment was scrubbed... 
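The SD formula discussed in this thread can be tried on the posted frequency table. The following is only a rough sketch: it assumes every value sits at the midpoint of its class (the raw values were never posted), and it covers only the bounded classes from 0.02 to 0.20, since the open-ended >0.20 class has no midpoint. The name grouped_mean_sd is made up for this illustration.

```python
import math

# Bounded classes from the posted table: (lower, upper, count).
# The 0-0.02 and >0.20 classes are left out, as in the trimmed calculation.
TRIMMED_BINS = [
    (0.02, 0.04, 63), (0.04, 0.06, 51), (0.06, 0.08, 39),
    (0.08, 0.10, 36), (0.10, 0.12, 24), (0.12, 0.14, 21),
    (0.14, 0.16, 19), (0.16, 0.18, 13), (0.18, 0.20, 10),
]


def grouped_mean_sd(bins):
    """Approximate mean and sample SD, treating each value as its class midpoint."""
    n = sum(count for _, _, count in bins)
    mids = [((lo + hi) / 2, count) for lo, hi, count in bins]
    mean = sum(mid * count for mid, count in mids) / n
    # Sum of squared differences from the mean, then divide by (n - 1).
    ss = sum(count * (mid - mean) ** 2 for mid, count in mids)
    return n, mean, math.sqrt(ss / (n - 1))


n, mean, sd = grouped_mean_sd(TRIMMED_BINS)
print(f"n={n} mean={mean:.4f} sd={sd:.4f}")
```

Under the midpoint assumption the result lands close to the trimmed mean (0.0819) and SD (0.0472) reported in the thread, which suggests the drop in SD comes from the smaller sum of squared differences rather than from the change in n.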
URL: From chea at mail.nih.gov Tue Feb 21 15:34:25 2006 From: chea at mail.nih.gov (Che, Anney (NIHNIAID)) Date: Tue, 21 Feb 2006 15:34:25 -0500 Subject: [BiO BB] Tiling microarray Message-ID: Hi, Does anyone know how to normalize Tiling microarray data? And any good software that analysis tiling microarray data. Thanks, Anney -- Anney Che, M.S. Biocomputing Specialist Laboratory of Molecular Microbiology (LMM) National Institute of Allergy and Infectious Diseases (NIAID) 9000 Rockville Pike, Bldg 4, Room 301 Bethesda, MD 20892 Phone: 301-451-2851 Fax: 301-280-2716 From aloraine at gmail.com Tue Feb 21 16:00:05 2006 From: aloraine at gmail.com (Ann Loraine) Date: Tue, 21 Feb 2006 15:00:05 -0600 Subject: [BiO BB] Tiling microarray In-Reply-To: References: Message-ID: <83722dde0602211300yf7c8d9bi3d7e917d641e01ed@mail.gmail.com> This is a just a guess, I'm afraid, but quantile-quantile normalization is probably a good way to go. For visualizing tiling array data, I recommend using the Integrated Genome Browser. It lets you do things like load tiling array data as simple text files with position/value pairs and then perform simple manipulations, such as subtracting one graph from another. There are some links from my Web site http://www.transvar.org including a now slightly out-of-date tutorial (http://www.transvar.org/at_annots) that might be useful. Also, you should check out http://genoviz.sourceforge.net. That will take you to a Java Web Start page at Affymetrix where you can download and launch the program. All the best, Ann Loraine On 2/21/06, Che, Anney (NIHNIAID) wrote: > Hi, > > Does anyone know how to normalize Tiling microarray data? > And any good software that analysis tiling microarray data. > > Thanks, > > Anney > -- > Anney Che, M.S. 
> Biocomputing Specialist > Laboratory of Molecular Microbiology (LMM) > National Institute of Allergy and Infectious Diseases (NIAID) > 9000 Rockville Pike, Bldg 4, Room 301 > Bethesda, MD 20892 > Phone: 301-451-2851 > Fax: 301-280-2716 > > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org From maximilianh at gmail.com Tue Feb 21 17:18:41 2006 From: maximilianh at gmail.com (Maximilian Haeussler) Date: Tue, 21 Feb 2006 23:18:41 +0100 Subject: [BiO BB] collecting all genes with a given GO id or its child terms In-Reply-To: <83722dde0602211222l30f0c88cye4e5aeae29d81d45@mail.gmail.com> References: <83722dde0602211222l30f0c88cye4e5aeae29d81d45@mail.gmail.com> Message-ID: <76f031ae0602211418o388b84abq@mail.gmail.com> I wonder if you couldn't simply download the arabidopsis GO from ftp://ftp.arabidopsis.org/home/tair/Ontologies/Gene_Ontology/ATH_GO_GOSLIM.20060218.txt, import it into (wouuw...) Excel and filter by the 4th field (=GO id)...? The 1st fields would then contain the corresponding gene id. Oh. You're not a biologist. So replace Excel by BioPerl or Python or AWK. :-) Take care, Max On 21/02/06, Ann Loraine wrote: > > Hi, > > Maybe somebody on the list could point me in the right direction? > > I'm looking for something that takes a list of GO ids and then returns > a list of all Arabidopsis 'AGI' codes (gene ids, e.g., AT4G30490 ) > that have been annotated with a GO id on the list and/or the child > term of any GO term on the list. 
> > All the best, > > Ann Loraine > > -- > Ann Loraine > Assistant Professor > Section on Statistical Genetics > University of Alabama at Birmingham > http://www.ssg.uab.edu > http://www.transvar.org > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > -- Maximilian Haeussler, CNRS Gif-sur-Yvette, Paris tel: +33 6 12 82 76 16 icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de skype: maximilianhaeussler -------------- next part -------------- An HTML attachment was scrubbed... URL: From wusu at ele.uri.edu Tue Feb 21 18:49:17 2006 From: wusu at ele.uri.edu (wusu at ele.uri.edu) Date: Tue, 21 Feb 2006 18:49:17 -0500 Subject: [BiO BB] Mean and Standard deviation In-Reply-To: References: <20060221050325.25235.qmail@web31501.mail.mud.yahoo.com> <70C50B8807B54A429AC206E83A3BA6BCAD8C5C@mail.nymc.edu> Message-ID: <1140565757.43fba6fda4f5c@webmail.ele.uri.edu> The mathematical interpretations for the reduction of SD due to a trimmed sample size at both ends are all very good. However, I don't know why you would get rid of the end points of your data set, because the frequencies for both end classes or points are rather high (that is 85 + 33 = 118) compared with the other classes. You are ignoring 118 / 394 = 29.95% of your data set. If it is the boundary effect that you are concerned with, you may want to redesign the experiment or simulation in my opinion. Regards, Su Quoting Martin Gollery : > Chris is correct. Another way to look at it is to consider that you are > removing the values that have the greatest deviation from the mean, and so > it makes sense that you then have a lower standard deviation. > > Marty > > On 2/21/06, Frenz, Christopher wrote: > > > > SD calculations involve the sample size as one of the variables. 
This is > > likely what is causing the change in SD, since you are reducing the the > > sample size when you remove the two sets of endpoints. > > > > Chris > > > > > > -----Original Message----- > > From: bio_bulletin_board-bounces+christopher_frenz= > > nymc.edu at bioinformatics.org on behalf of Subhash Agarwal > > Sent: Tue 2/21/2006 12:03 AM > > To: bioinformatics.org > > Subject: [BiO BB] Mean and Standard deviation > > > > Dear Forum Memebers > > > > I have a dataset with 394 data points. The frequecy distribution of > > which is as follows: > > > > 0-0.02 85 0.02-0.04 63 0.04-0.06 51 0.06-0.08 > > 39 0.08-0.1 36 0.10-0.12 24 -0.12-0.14 21 0.14-0.16 > > 19 0.16-0.18 13 0.18-0.2 10 >0.20 33 > > > > The mean and SD of the data is 0.0842 and 0.0853. But when the data > > points from 0-0.02 and > 0.2 are removed and the mean and SD is calculated > > it is found that mean doesnt change much (0.0819) but SD does (0.0472). My > > query is what can I interpret from this regarding the data. > > > > If any of the forum members feels that this question should be addressed > > to some other place do let me know. > > > > Thanks > > Subhash > > > > > > --------------------------------- > > Jiyo cricket on Yahoo! India cricket > > Yahoo! Messenger Mobile Stay in touch with your buddies all the time. > > > > > > _______________________________________________ > > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > > > > > > > -- > -- > Martin Gollery > Associate Director > Center For Bioinformatics > University of Nevada at Reno > Dept. 
of Biochemistry / MS330 > 775-784-7042 > ----------- > From nagesh.chakka at anu.edu.au Tue Feb 21 21:06:04 2006 From: nagesh.chakka at anu.edu.au (Nagesh Chakka) Date: Wed, 22 Feb 2006 13:06:04 +1100 Subject: [BiO BB] Designing degenerate primers Message-ID: <43FBC70C.7000602@anu.edu.au> Hi all, I need to design primers to "fish out" the region of interest from reptilian genomic sequence. I donot have the reptilian gDNA sequence information to design the primer with exact sequence. Hence I thought of designing degenerate primers based on the available sequence information (eutherian mammal, marsupial, amphibian). Are there any available programs which does this kind of primer design or that I need to perform a multiple sequence alignment and look for the most conserved region and design primer with that. Any advice is greatly appreciated. Thanks Nagesh From smagarwal at yahoo.com Tue Feb 21 23:34:44 2006 From: smagarwal at yahoo.com (Subhash Agarwal) Date: Wed, 22 Feb 2006 04:34:44 +0000 (GMT) Subject: [BiO BB] Mean and SD Message-ID: <20060222043444.83585.qmail@web31503.mail.mud.yahoo.com> Dear Forum Members I agree with the points suggested by all you. From Ryan I would like to ask that keeping in mind that the data doesnt follow normal distribution what other statistical parameters can be calculated to understand whether the data is meaningful or not. Secondly can Su explain a bit more regarding his suggestion on "redesign the experiment or simulation in my opinion". thanks Subhash --------------------------------- Jiyo cricket on Yahoo! India cricket Yahoo! Messenger Mobile Stay in touch with your buddies all the time. -------------- next part -------------- An HTML attachment was scrubbed... 
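For skewed data like the table in this thread, one commonly used alternative to mean and SD is a set of quantiles (median and quartiles). A minimal sketch, assuming values are spread evenly within each class and using only the posted counts; grouped_quantile is a hypothetical helper name, and quantiles that fall into the open-ended >0.20 class are refused because they cannot be interpolated.

```python
# Posted table as (lower bound, count); the last class (>0.20) is open-ended.
BINS = [
    (0.00, 85), (0.02, 63), (0.04, 51), (0.06, 39), (0.08, 36), (0.10, 24),
    (0.12, 21), (0.14, 19), (0.16, 13), (0.18, 10), (0.20, 33),
]
WIDTH = 0.02


def grouped_quantile(bins, width, q):
    """Interpolated q-quantile from class counts (uniform spread within a class)."""
    n = sum(count for _, count in bins)
    target = q * n
    cumulative = 0
    for index, (lower, count) in enumerate(bins):
        if cumulative + count >= target:
            if index == len(bins) - 1:
                raise ValueError("quantile falls in the open-ended last class")
            return lower + (target - cumulative) / count * width
        cumulative += count
    raise ValueError("q must be between 0 and 1")


q1 = grouped_quantile(BINS, WIDTH, 0.25)
median = grouped_quantile(BINS, WIDTH, 0.50)
q3 = grouped_quantile(BINS, WIDTH, 0.75)
print(f"Q1={q1:.4f} median={median:.4f} Q3={q3:.4f}")
```

The estimated median comes out well below the reported mean of 0.0842, which is what one would expect for a distribution with a long right tail.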
URL: From ryan.raaum at gmail.com Wed Feb 22 05:25:01 2006 From: ryan.raaum at gmail.com (Ryan Raaum) Date: Wed, 22 Feb 2006 05:25:01 -0500 Subject: [BiO BB] Mean and SD In-Reply-To: <20060222043444.83585.qmail@web31503.mail.mud.yahoo.com> References: <20060222043444.83585.qmail@web31503.mail.mud.yahoo.com> Message-ID: Hi Subhash > > I agree with the points suggested by all you. From Ryan I would like to > ask that keeping in mind that the data doesnt follow normal distribution > what other statistical parameters can be calculated to understand whether > the data is meaningful or not. > I think you'll have to provide some more information on what the data represent and what in particular you are trying to understand before I could offer any other suggestions. (And even then, we may have already reached the limits of my statistical knowledge, but someone else may be able to provide some helpful advice). Best, -Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From hershel.safer at weizmann.ac.il Wed Feb 22 10:22:20 2006 From: hershel.safer at weizmann.ac.il (Hershel Safer) Date: Wed, 22 Feb 2006 17:22:20 +0200 Subject: [BiO BB] March 15th deadline for European Conf on Computational Biology Message-ID: <7.0.1.0.2.20060222172203.02aafa80@alum.mit.edu> An HTML attachment was scrubbed... URL: From maximilianh at gmail.com Wed Feb 22 20:56:28 2006 From: maximilianh at gmail.com (Maximilian Haeussler) Date: Thu, 23 Feb 2006 02:56:28 +0100 Subject: [BiO BB] Designing degenerate primers In-Reply-To: <43FBC70C.7000602@anu.edu.au> References: <43FBC70C.7000602@anu.edu.au> Message-ID: <76f031ae0602221756q37789c80y@mail.gmail.com> forgot to forward to the list: you don't have to do the sequences alignment yourself. on genome-test.soe.ucsc.edu you'll find multiple alignments of 12 mammalian species (switch on track "MultiZ 10 way" or "MultiZ 12 way"). 
You can extract alignments with "Table Browser" - "MultiZ 10 Way" - Download as maf (or fasta?). Take care, Max On 22/02/06, Nagesh Chakka wrote: > > Hi all, > I need to design primers to "fish out" the region of interest from > reptilian genomic sequence. I donot have the reptilian gDNA sequence > information to design the primer with exact sequence. Hence I thought of > designing degenerate primers based on the available sequence information > (eutherian mammal, marsupial, amphibian). Are there any available > programs which does this kind of primer design or that I need to perform > a multiple sequence alignment and look for the most conserved region and > design primer with that. > Any advice is greatly appreciated. > Thanks > Nagesh > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > -- Maximilian Haeussler, CNRS Gif-sur-Yvette, Paris tel: +33 6 12 82 76 16 icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de skype: maximilianhaeussler -------------- next part -------------- An HTML attachment was scrubbed... URL: From logan at cacs.louisiana.edu Fri Feb 24 12:56:35 2006 From: logan at cacs.louisiana.edu (Raja Loganantharaj) Date: Fri, 24 Feb 2006 11:56:35 -0600 Subject: [BiO BB] Need help on batch query on GO ontology Message-ID: <43FF48D3.60308@cacs.louisiana.edu> I have a set of GO IDs for a set of genes and I would like to obtain their taxonomical GO hierarchy. The GO portal allow single ID at a time. Is there any API or software from which I enter or read in the set of GO ID or genes so that I get the clustered output based on their function(say). I have tried DAVID. Thanks and I appreciate you sharing your experience. 
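A batch lookup of GO hierarchy like the one asked about here can be sketched as a graph walk. This is only an illustration under stated assumptions: the child-to-parent map below is a toy with placeholder IDs (not real GO terms), and in practice it would have to be built from an ontology file, which is not shown. The function names are hypothetical.

```python
from collections import deque

# Toy child -> parents map with placeholder IDs (not real GO terms).
PARENTS = {
    "GO:leaf_a": ["GO:mid_1"],
    "GO:leaf_b": ["GO:mid_1", "GO:mid_2"],
    "GO:mid_1": ["GO:root"],
    "GO:mid_2": ["GO:root"],
    "GO:root": [],
}


def ancestors(term, parents):
    """Return every term reachable by following parent links upward."""
    seen = set()
    queue = deque(parents.get(term, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents.get(current, []))
    return seen


def batch_ancestors(terms, parents):
    """Map each query term to its sorted list of ancestors."""
    return {term: sorted(ancestors(term, parents)) for term in terms}


result = batch_ancestors(["GO:leaf_a", "GO:leaf_b"], PARENTS)
for term, lineage in result.items():
    print(term, "->", ", ".join(lineage))
```

Reversing the map (parent to children) and walking downward in the same way would collect child terms instead.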
Raja Loganantharaj From idh at poulet.org Sat Feb 25 11:39:37 2006 From: idh at poulet.org (idh at poulet.org) Date: Sat, 25 Feb 2006 17:39:37 +0100 Subject: [BiO BB] Need help on batch query on GO ontology In-Reply-To: <43FF48D3.60308@cacs.louisiana.edu> References: <43FF48D3.60308@cacs.louisiana.edu> Message-ID: <1140885577.44008849af80e@webmail.poulet.org> Hi Raja, I do not know how much detail you need, but you might want to have a look at a tool I used recently: wego http://wego.genomics.org.cn/ Upload your gene ontology annotation file, and it will tell you how many genes are in each go category. You can also fold the hierarchy down to a given level (say level 2 if you want the tool to cumulatively count how many genes are in each basal category). Cheers, Yannick --- http://www.unil.ch/dee/page28685_en.html Quoting Raja Loganantharaj : > I have a set of GO IDs for a set of genes and I would like to obtain > their taxonomical GO hierarchy. The GO portal allow single ID at a time. > Is there any API or software from which I enter or read in the set of GO > ID or genes so that I get the clustered output based on their > function(say). I have tried DAVID. > > Thanks and I appreciate you sharing your experience. > > Raja Loganantharaj > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > From delete at elfdata.com Sat Feb 25 12:25:27 2006 From: delete at elfdata.com (Theodore H. Smith) Date: Sat, 25 Feb 2006 17:25:27 +0000 Subject: [BiO BB] BLAST and multiple section alignment Message-ID: I'm not sure if I got this correct, so let me know if this isn't true. I'm just asking for confirmation here. My question: It seems that BLAST does not generate multiple section alignments? Is this true or not? By multiple section alignments, I mean for example when the head of a gene is moved to the middle of a gene. 
A very sophisticated gene searcher could generate a stronger hit for such a case, than simply returning the best alignment, because there will now be multiple alignments within the two genes that we are comparing. -- http://elfdata.com/plugin/ From bioinfosm at gmail.com Sun Feb 26 12:57:45 2006 From: bioinfosm at gmail.com (Samantha Fox) Date: Sun, 26 Feb 2006 12:57:45 -0500 Subject: [BiO BB] mapping genbank accession to Organism name Message-ID: <726450810602260957h12229647q88a33dab6d5a07ed@mail.gmail.com> Hi all, This should be really easy, but somehow I am cannot figure it out. I BLASTed my sequences with the non-redundant nt sequence from NCBI. The hits are something like gb|AC167666.4, emb|BX640434.1, emb|BX640418.1 ... Can someone suggest me a way to get the organism name from these genbank accessions ? hope someone has done this already :) .. ~S -------------- next part -------------- An HTML attachment was scrubbed... URL: From sourangshu at csa.iisc.ernet.in Mon Feb 27 06:54:37 2006 From: sourangshu at csa.iisc.ernet.in (Sourangshu Bhattacharya) Date: Mon, 27 Feb 2006 17:24:37 +0530 (IST) Subject: [BiO BB] Extra entries in blast blosum62 matrix In-Reply-To: <7.0.1.0.2.20060222172203.02aafa80@alum.mit.edu> References: <7.0.1.0.2.20060222172203.02aafa80@alum.mit.edu> Message-ID: Hi, Just wondering what the B, Z, and X entries in blast blosum 62 matrix mean.. Any idea ?? Thanks, Sourangshu Sourangshu Bhattacharya PhD Student, Dept. of Computer Science & Automation, IISc, Bangalore. http://people.csa.iisc.ernet.in/sourangshu From golharam at umdnj.edu Sun Feb 26 13:20:22 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Sun, 26 Feb 2006 13:20:22 -0500 Subject: [BiO BB] mapping genbank accession to Organism name In-Reply-To: <726450810602260957h12229647q88a33dab6d5a07ed@mail.gmail.com> Message-ID: <012301c63b01$4efc9520$e6028a0a@GOLHARMOBILE1> Use the NCBI eutils. Submit the GenBank accession # to get the GenBank record in XML. 
The XML record will contain an entry for the organism.

Here's a script I have to do this. First input is the database, nucleotide in your case, search term is the accession #. No options causes the results to be sent back in ASN.1 format. I forget the option to make it xml. I think its rettype=XML or something like that.

You can even put your input in a text file and redirect the file as input to the script...

---BEGIN: efetch.pl---
#!/usr/bin/perl -w

use LWP::Simple;

my $utils = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils";

# Read the database, search term and options from standard input
print "Database: ";
$db = <STDIN>;
chomp $db;

print "Search Term: ";
$term = <STDIN>;
chomp $term;

# Note: $options is read here but is not appended to either request below
print "Options: ";
$options = <STDIN>;
chomp $options;

# Run the search and keep the result set on the Entrez history server
my $esearch = "$utils/esearch.fcgi?db=$db&term=$term&usehistory=y&tool=efetch";
print "$esearch\n";

my $esearch_result = get($esearch);

# Pull the first record ID, query key and WebEnv out of the esearch XML
if ($esearch_result =~ m/<Id>(\d+)<\/Id>/) {
    $id = $1;
}
if ($esearch_result =~ m/<QueryKey>(\d+)<\/QueryKey>/) {
    $key = $1;
}
if ($esearch_result =~ m/<WebEnv>(.*)<\/WebEnv>/) {
    $webenv = $1;
}

if (defined($id)) {
    print "ID: $id\nKey: $key\nWebEnv: $webenv\n\n";

    # Fetch the record for the ID found above
    $esearch = "$utils/efetch.fcgi?db=$db&id=$id&tool=efetch";
    print "$esearch\n";
    my $esummary_result = get($esearch);
    print "$esummary_result\n";
} else {
    print "$esearch_result\n";
}
---END: efetch.pl---

-----Original Message-----
From: bio_bulletin_board-bounces+golharam=umdnj.edu at bioinformatics.org [mailto:bio_bulletin_board-bounces+golharam=umdnj.edu at bioinformatics.org] On Behalf Of Samantha Fox
Sent: Sunday, February 26, 2006 12:58 PM
To: The general forum at Bioinformatics.Org
Subject: [BiO BB] mapping genbank accession to Organism name

Hi all,

This should be really easy, but somehow I cannot figure it out. I BLASTed my sequences with the non-redundant nt sequence from NCBI. The hits are something like gb|AC167666.4, emb|BX640434.1, emb|BX640418.1 ...
Can someone suggest me a way to get the organism name from these genbank accessions ?

hope someone has done this already :) ..
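The same eutils idea can be sketched in Python. This is a hedged illustration, not the script from the post: it only builds an efetch URL that requests a GenBank flat-file record as text (rettype=gb, retmode=text) and parses the ORGANISM line from such a record. The network fetch itself is left out, the function names are invented for this example, and the sample record is made up purely to exercise the parser.

```python
import re
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession):
    """Build an efetch URL asking for a GenBank flat-file record as plain text."""
    query = urlencode({"db": "nucleotide", "id": accession,
                       "rettype": "gb", "retmode": "text"})
    return f"{EFETCH}?{query}"


def organism_from_genbank(record_text):
    """Return the name on the ORGANISM line of a GenBank flat file, or None."""
    match = re.search(r"^[ \t]+ORGANISM[ \t]+(.+)$", record_text, re.MULTILINE)
    return match.group(1).strip() if match else None


# Made-up record fragment (placeholder accession and organism), parser demo only.
SAMPLE = """LOCUS       XX000000     100 bp    DNA     linear   UNA 01-JAN-2006
SOURCE      Example organism
  ORGANISM  Example organism
            Placeholder; lineage.
"""

print(efetch_url("AC167666.4"))
print(organism_from_genbank(SAMPLE))
```

Looping efetch_url over the accessions taken from the BLAST hits, fetching each URL, and passing the text to organism_from_genbank would give one organism name per hit.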
~S

From MEC at Stowers-Institute.org Thu Feb 23 11:41:39 2006
From: MEC at Stowers-Institute.org (Cook, Malcolm)
Date: Thu, 23 Feb 2006 10:41:39 -0600
Subject: [BiO BB] Designing degenerate primers
Message-ID:

Nagesh,

Vector NTI's (windows only) AlignX module supports your needs quite nicely (though not in a high throughput fashion). Version 10 is now free to academics (!), though getting registered is a bit of a bear.

Some researchers here at the Stowers Institute used it for amping up a lizard gene based on chicken aligned with (was it?) mouse. For further specificity, they designed nested degenerate primer pairs (if I remember correctly).

AlignX can import precomputed alignments too (but only in MAF format).

Let us know what you do.

The following is from their Help file, and might help you decide if the weighty install is worth your effort:

The Alignment PCR feature in Vector NTI allows you to design PCR primers for amplifying a region of aligned DNA/RNA molecules. Using this feature, you can design primer pair sets that will amplify any of the DNA/RNA molecules in a specified alignment. In this type of PCR analysis, the molecules are first aligned in AlignX and the alignment is subsequently analyzed using the Alignment PCR feature in Vector NTI.

The basic procedure for performing an Alignment PCR analysis consists of the following steps:

1. Define the set of DNA/RNA molecules you want to use for Alignment PCR analysis.
2. Align the set of molecules in AlignX.
3. Select a region of the AlignX alignment for which Alignment PCR will be run.
4. Define the Alignment PCR parameters.
5. Perform the Alignment PCR analysis.
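The general align-then-design idea can be illustrated without any particular package. The sketch below is not how Vector NTI works internally (that is not described here); it simply collapses each column of a gap-free alignment into the matching IUPAC ambiguity code and counts how many concrete sequences the resulting degenerate primer stands for. The aligned sequences are made up, and the function names are hypothetical.

```python
# IUPAC nucleotide ambiguity codes, keyed by the set of bases seen in a column.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_consensus(aligned):
    """Collapse equal-length, gap-free aligned sequences into one IUPAC string."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must all have the same length")
    columns = zip(*(seq.upper() for seq in aligned))
    return "".join(IUPAC[frozenset(column)] for column in columns)


def degeneracy(primer):
    """Number of distinct concrete sequences a degenerate primer represents."""
    sizes = {code: len(bases) for bases, code in IUPAC.items()}
    total = 1
    for code in primer:
        total *= sizes[code]
    return total


# Made-up aligned region from three hypothetical species.
ALIGNED = ["ATGGCAAAC", "ATGGCTAAT", "ATGGCGAAC"]
primer = degenerate_consensus(ALIGNED)
print(primer, degeneracy(primer))
```

Scanning a real alignment for windows where degeneracy stays low is one simple way to shortlist conserved regions before checking the usual primer criteria by other means.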
From another page in the help file (Alignment PCR parameters), I quote additionally:

Primer/Every Molecule
Set the minimum overall similarity and 3' end similarity for the primer with every molecule in the alignment

Cheers,

Malcolm Cook - mec at stowers-institute.org - 816-926-4449
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, MO USA

________________________________

From: bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org [mailto:bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org] On Behalf Of Maximilian Haeussler
Sent: Wednesday, February 22, 2006 7:56 PM
To: The general forum at Bioinformatics.Org
Subject: Re: [BiO BB] Designing degenerate primers

forgot to forward to the list:
you don't have to do the sequences alignment yourself. on genome-test.soe.ucsc.edu you'll find multiple alignments of 12 mammalian species (switch on track "MultiZ 10 way" or "MultiZ 12 way"). You can extract alignments with "Table Browser" - "MultiZ 10 Way" - Download as maf (or fasta?).

Take care,
Max

On 22/02/06, Nagesh Chakka < nagesh.chakka at anu.edu.au > wrote:

Hi all,
I need to design primers to "fish out" the region of interest from reptilian genomic sequence. I donot have the reptilian gDNA sequence information to design the primer with exact sequence. Hence I thought of designing degenerate primers based on the available sequence information (eutherian mammal, marsupial, amphibian). Are there any available programs which does this kind of primer design or that I need to perform a multiple sequence alignment and look for the most conserved region and design primer with that.
Any advice is greatly appreciated.
Thanks Nagesh _______________________________________________ Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org https://bioinformatics.org/mailman/listinfo/bio_bulletin_board -- Maximilian Haeussler, CNRS Gif-sur-Yvette, Paris tel: +33 6 12 82 76 16 icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de skype: maximilianhaeussler -------------- next part -------------- An HTML attachment was scrubbed... URL: From maximilianh at gmail.com Mon Feb 27 08:47:51 2006 From: maximilianh at gmail.com (Maximilian Haeussler) Date: Mon, 27 Feb 2006 14:47:51 +0100 Subject: [BiO BB] Designing degenerate primers In-Reply-To: <76f031ae0602221756q37789c80y@mail.gmail.com> References: <43FBC70C.7000602@anu.edu.au> <76f031ae0602221756q37789c80y@mail.gmail.com> Message-ID: <76f031ae0602270547x4c0a1744v@mail.gmail.com> To import sequences into vector NTI: Yes, you can download the alignments from ucsc directly, but only in maf-format. If you need them converted to fasta, you can do it manually if it's only a few sequences (search-replace) or use a script maf2fasta which does to conversion for you. (if you cannot find it, I also have written python script that does it) I quote from the genome-mailinglist at ucsc: Hi Robert, You may try the download the multiz program from: http://www.bx.psu.edu/miller_lab/dist/multiz-tba-2005-04-28.tar.gz. In there you can find a program called maf2fasta, which can convert MAF to FASTA format. regards, Hong ----- Original Message ----- From: "Donna Karolchik" > To: "Querfurth" >; > Sent: Friday, July 22, 2005 2:59 PM Subject: Re: [Genome] MAF format On 23/02/06, Maximilian Haeussler wrote: > > forgot to forward to the list: > you don't have to do the sequences alignment yourself. on genome-test.soe.ucsc.edu > you'll find multiple alignments of 12 mammalian species (switch on track > "MultiZ 10 way" or "MultiZ 12 way"). You can extract alignments with "Table > Browser" - "MultiZ 10 Way" - Download as maf (or fasta?). 
> > Take care, > Max > > On 22/02/06, Nagesh Chakka < nagesh.chakka at anu.edu.au> wrote: > > > > Hi all, > > I need to design primers to "fish out" the region of interest from > > reptilian genomic sequence. I donot have the reptilian gDNA sequence > > information to design the primer with exact sequence. Hence I thought of > > designing degenerate primers based on the available sequence information > > > > (eutherian mammal, marsupial, amphibian). Are there any available > > programs which does this kind of primer design or that I need to perform > > a multiple sequence alignment and look for the most conserved region and > > > > design primer with that. > > Any advice is greatly appreciated. > > Thanks > > Nagesh > > _______________________________________________ > > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > > > > -- > Maximilian Haeussler, > CNRS Gif-sur-Yvette, Paris > tel: +33 6 12 82 76 16 > icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de > skype: maximilianhaeussler -- Maximilian Haeussler, CNRS Gif-sur-Yvette, Paris tel: +33 6 12 82 76 16 icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de skype: maximilianhaeussler -------------- next part -------------- An HTML attachment was scrubbed... URL: From wusu at ele.uri.edu Mon Feb 27 12:55:51 2006 From: wusu at ele.uri.edu (wusu at ele.uri.edu) Date: Mon, 27 Feb 2006 12:55:51 -0500 Subject: [BiO BB] Mean and SD In-Reply-To: <20060222043444.83585.qmail@web31503.mail.mud.yahoo.com> References: <20060222043444.83585.qmail@web31503.mail.mud.yahoo.com> Message-ID: <1141062951.44033d27d2882@webmail.ele.uri.edu> Dear Subhash and other interested members: I am sorry for a late reply. I did type up my reply last Thursday, but the network went dead as I sent it. I just came back from a conference so let me try to answer the question again. Thank you for your patience! 
I am a biomedical engineer who also teaches undergraduate statistics, so I have come to know some statistics.

In order to trim down the discarded portion of the data set, you may decrease the class width from .02 to .01 and then abandon the end classes. Since we don't have your original data set, it's hard to tell from a frequency histogram.

Thanks to the beautiful Central Limit Theorem, we know that the sampling (or probability) distribution of the sample mean becomes approximately normal as the sample size grows large. Your sample size was 394, so it's plenty large. Your 394 data values produce one sample mean, which is only one of many 'possible' sample means you would get if you repeated the 394 measurements over and over again. For example, see
http://www.statsoft.com/textbook/stnonpar.html#when

That is to say, if you repeat the same experiment many times, the probability distribution of all the sample means will look normal. Since we don't do such stupid things, we can really appreciate the word 'possible' in random sampling.

Redesigning experiments can mean treating physical boundary problems carefully so you won't have to throw so many values away. It also applies to simulations, in which boundary conditions have to be met. I had to do something similar for a bioelectromagnetics project. Since I don't know what this data set is about, I can't say much about it.

In statistics, experimental design or DOE can mean different things. You can take a look at a statistics book for reference or see
http://www.itl.nist.gov/div898/handbook/pri/pri.htm

Perhaps some statistician may respond to us free of charge.

Best regards,

Su Wu - her 2 cents. :)

Quoting Subhash Agarwal :

> Dear Forum Members
>
> I agree with the points suggested by all you. From Ryan I would like to ask
> that keeping in mind that the data doesnt follow normal distribution what
> other statistical parameters can be calculated to understand whether the data
> is meaningful or not.
>
> Secondly can Su explain a bit more regarding his suggestion on "redesign
> the experiment or simulation in my opinion".
>
> thanks
> Subhash

From zhong.huang at jefferson.edu Tue Feb 28 00:41:40 2006
From: zhong.huang at jefferson.edu (Zhong Huang)
Date: Tue, 28 Feb 2006 00:41:40 -0500
Subject: [BiO BB] fasta to harsh table bioperl
Message-ID: <20060228054128.1CC2C368D42@primary.bioinformatics.org>

hi,

Can anyone suggest me a simple way to convert a multiple-sequence fasta (in a Bio::SeqIO object) into a hash table (sequence annotation as key, sequence as value)?

The fasta file looks like this:

>gi|9049352|dbj|BAA99407.1| 3-methylcrotonyl-CoA carboxylase biotin-containing subunit [Homo sapiens]
MAAASAVSVLLVAAERNRWHRLPSLLLPPRTWVWRQRTMKYTTATGRNITKVLIANRGEIACRVMRTAKKLGVQT-
VAVYSEADRNSMHVDMADEAYSIGPAPSQQSYLSMEKIIQVAKTSAAQAIHPGCGFLSENMEFAE
>gi|4504067|ref|NP_002070.1| aspartate aminotransferase 1 [Homo sapiens]
MAPPSVFAEVPQAQPVLVFKLTADFREDPDPRKVNLGVGAYRTDDCHPWVLPVVKKVEQKIANDNSLNHEYLPIL-
GLAEFRSCASRLALGD

I want to have the hash table %seqharsh to hold sequences like this:

# my %seqharsh = ('seq1', 'MAAASAVSVL......',
#                 'seq2', 'MAPPSVFAEVPQ......');

My code is like this:

my $seqio = new Bio::SeqIO(-format => $format,
                           -file   => $file);
while ( my $seq = $seqio->next_seq ) {
    if( $seq->alphabet ne 'protein' ) {
        confess("Skipping non protein sequence...");
        next;
    }
    #write code here to assign each entry into harsh %seqharsh
    my $seqharsh{$seq->primary_id} = $seq->seq();
    bla bla bla

Thank you very much for your help!

zhong

From biopctgi at yahoo.es Tue Feb 28 05:52:02 2006
From: biopctgi at yahoo.es (Jose Maria Gonzalez Izarzugaza)
Date: Tue, 28 Feb 2006 11:52:02 +0100
Subject: [BiO BB] MrBayes - Memory overflow
Message-ID: <44042B52.1090608@yahoo.es>

Hello everyone.
Does anybody know how I could limit the memory usage while creating a consensus tree with MrBayes? (even though I am using a computer with 2 GB of RAM)

The problem (a segmentation fault) only appears when I try to build a tree with more than 40 taxa, so the problem is neither in the compilation nor in the commands given.

My MrBayes command is:
sumt filename=filewithALLtrees contype=allcompat burnin=1500

Many thanks in advance.

Regards,
Txema

From omoya at uib.es Tue Feb 28 06:40:41 2006
From: omoya at uib.es (Oscar Moya)
Date: Tue, 28 Feb 2006 12:40:41 +0100
Subject: [BiO BB] MrBayes - Memory overflow
In-Reply-To: <44042B52.1090608@yahoo.es>
Message-ID: <01LZI35E3X6C8ZET1I@uib.es>

Hi Txema,

I am not sure that your problem is due to memory needs. I am running analyses of 70 sequences, 1200 bases long, on computers with 512 MB of RAM.

Have you tried running it on another computer? It sounds obvious, but sometimes it works.

If the problem persists, you could send the command block for checking (mcmc settings, Lset, prset, etc.).

Good luck!

Óscar

From biopctgi at yahoo.es Tue Feb 28 06:53:05 2006
From: biopctgi at yahoo.es (Jose Maria Gonzalez Izarzugaza)
Date: Tue, 28 Feb 2006 12:53:05 +0100
Subject: [BiO BB] MrBayes - Memory overflow
In-Reply-To: <01LZI35E3X6C8ZET1I@uib.es>
References: <01LZI35E3X6C8ZET1I@uib.es>
Message-ID: <440439A1.8050006@yahoo.es>

Thanks a lot, Oscar.

I've tried to build the consensus tree on different computers, with different architectures, and the result is the same on all of them: a "nice" segmentation fault after reading the 2 tree files. The problem occurs with just one specific dataset, the one with the most taxa. That's why I say it might be due to memory limitations.

Since I use a previously generated set of trees, the command block is as simple as

sumt filename=filewithALLtrees contype=allcompat burnin=1500

(where contype can also be omitted, which leaves it at the default, halfcompat)

From omoya at uib.es Tue Feb 28 09:29:38 2006
From: omoya at uib.es (Oscar Moya)
Date: Tue, 28 Feb 2006 15:29:38 +0100
Subject: [BiO BB] MrBayes - Memory overflow
In-Reply-To: <440439A1.8050006@yahoo.es>
Message-ID: <01LZI91W1IO28ZET1I@uib.es>

OK, commands and computer ruled out.

I have just seen that MrBayes has problems with 64-bit architectures; I don't know whether you have tried a 32-bit one. The following links discuss similar cases:

http://sourceforge.net/mailarchive/forum.php?thread_id=9725904&forum_id=45117

http://www.rannala.org/phpBB2/search.php?search_author=GertW&

A solution for some of them was to download the source code from CVS at SourceForge and compile it themselves.

Hope that helps,

Óscar

From biopctgi at yahoo.es Tue Feb 28 09:52:45 2006
From: biopctgi at yahoo.es (Jose Maria Gonzalez Izarzugaza)
Date: Tue, 28 Feb 2006 15:52:45 +0100
Subject: [BiO BB] MrBayes - Memory overflow
In-Reply-To: <01LZI91W1IO28ZET1I@uib.es>
References: <01LZI91W1IO28ZET1I@uib.es>
Message-ID: <440463BD.9010504@yahoo.es>

Many thanks.

Txema

From MAG at Stowers-Institute.org Tue Feb 28 15:51:36 2006
From: MAG at Stowers-Institute.org (Goel, Manisha)
Date: Tue, 28 Feb 2006 14:51:36 -0600
Subject: [BiO BB] Energy minimazation program
Message-ID:

Hi All,

I have a simple problem that I hope someone can advise on.

I need to make a covalent bond between a sugar molecule and an amino acid (aa-COOH + sugar-OH --> H2O + aa-sugar). I had them in a single PDB file and managed to generate a PDB file with the covalent bond, but the geometry of the resulting molecule is very distorted. I now need to optimize this geometry, and I think that energy minimization should be able to take care of that.

I have previously used InsightII for energy minimization but do not have access to it anymore. Is there any software available (free for academic use) that can do this? I did a Google search and found quite a few programs, but I cannot decide which one to start with, because each would have a learning curve. Please suggest whichever you think is the most popular or best suited.

Thanks for any help.

-Manisha Goel
Post-doc associate
Stowers Institute for Medical Research
1000 E 50th St.
Kansas city, 64110