From hershel.safer at weizmann.ac.il Mon May 1 08:08:14 2006
From: hershel.safer at weizmann.ac.il (Hershel Safer)
Date: Mon, 01 May 2006 15:08:14 +0300
Subject: [BiO BB] CFP: Distributed, High-Performance and Grid Computing in Computational Biology
Message-ID: <7.0.1.0.2.20060501150132.02b80d68@weizmann.ac.il>

An HTML attachment was scrubbed...
URL:

From leser at informatik.hu-berlin.de Mon May 1 15:14:17 2006
From: leser at informatik.hu-berlin.de (Ulf Leser)
Date: Mon, 1 May 2006 21:14:17 +0200 (MEST)
Subject: [BiO BB] DILS06 - List of accepted papers and Call for posters
Message-ID:

DILS 2006
3rd International Workshop on Data Integration for the Life Sciences
European Bioinformatics Institute, UK
20. - 22.7.2006

Call for Participation
======================

Registration is now open for DILS06. Please go to:
http://www.informatik.hu-berlin.de/dils2006/index.html

Note: The early registration deadline is June 7th, 2006.

Call for Posters
================

DILS06 will feature a poster session. Poster abstracts will be presented on the DILS website, and selected posters will be given the opportunity for an oral flash presentation. Please send poster abstracts directly to: leser at informatik.hu-berlin.de

The deadline for poster abstract submission is May 10th, 2006.

Keynotes
========

Victor Markowitz
  "An Application Driven Perspective on Biological Data Integration"

James H. Kaufman
  "Towards a National Healthcare Information Infrastructure"

Accepted Papers
===============

Towards an Automated Analysis of Biomedical Abstracts
  Barbara Gawronska, Erlendsson Björn, Björn Olsson

A method for similarity-based grouping of biological data
  Vaida Jakoniene, David Rundqvist, Patrick Lambrix

Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions
  Tobias Kuhn, Loic Royer, Norbert E. Fuchs, Michael Schroeder

An Information Management System for Collaboration within Distributed Working Environment
  Maria Samsonova, Andrei Pisarev, Konstantin Kozlov, Ekaterina Poustelnikova, Arthur Tkachenko

SIBIOS Ontology System: A Robust Package for Biological Data Integration
  Malika Mahoui, Zina Ben Miled, Mindi Dippold, Bing Yang, Srinivasan Sriram

On Querying OBO Ontologies using a DAG Pattern Query Language
  Amarnath Gupta, Simone Santini

Using Term Lists and Inverted Files to Improve Search Speed for Metabolic Pathway Databases
  Greeshma Neglur, Robert Grossman, Natalia Maltsev, Clement Yu

Data Object Models for Genome Annotation, Validation and Alternative Splicing
  Sven Mielordt, Jürgen Kleffe

Link Discovery in Graphs Derived from Biological Databases
  Petteri Sevon, Petteri Hintsanen, Lauri Eronen, Kimmo Kulovesi, Hannu Toivonen

BioFuice: Mapping-based data integration in bioinformatics
  Toralf Kirsten, Erhard Rahm

SNP-Converter: an ontology-based mechanism to reconcile heterogeneous SNP descriptions for pharmacogenomic studies
  Adrien Coulet, Marie-Dominique Devignes, Malika Smaïl-Tabbone, Pascale Benlian, Amedeo Napoli

Data Access and Integration in the ISPIDER Proteomics Grid
  Lucas Zamboulis, Hao Fan, Khalid Belhajjame, Jennifer Siepen, Andrew Jones, Nigel Martin, Alexandra Poulovassilis, Simon Hubbard, Suzanne Embury, Norman Paton

A cell-cycle knowledge integration framework
  Erick Antezana, Elena Tsiporkova, Vladimir Mironov, Martin Kuiper

An extensible light-weight XML-based monitoring system for sequence databases
  Dieter Van de Craen, Frank Neven, Kerstin Koch

Arevir: a secure platform for designing personalized antiretroviral therapies against HIV
  Kirsten Roomp, Niko Beerenwinkel, Tobias Sing, Eugen Schülter, Joachim Büch, Saleta Sierra-Aragon, Martin Däumer, Daniel Hoffmann, Rolf Kaiser, Thomas Lengauer, Joachim Selbig

Collection-Oriented Scientific Workflows for Integrating and Analyzing Biological Data
  Shawn Bowers, Timothy McPhillips, Bertram Ludäscher

Towards a Model of Provenance in Scientific Workflows
  Shirley Cohen, Sarah Cohen Boulakia, Susan Davidson

The Distributed Annotation System for Integration of Biological Data
  Andreas Prlic, Ewan Birney, Tony Cox, Thomas Down, Rob Finn, Stefan Graef, David Jackson, Andreas Kahari, Eugene Kulesha, Roger Pettett, James Smith, Jim Stalker, Tim Hubbard

SABIO-RK: Integration and Curation of Reaction Kinetics Data
  Ulrike Wittig

Accepted Short Papers
=====================

Ontology Analysis on Complexity and Evolution Based on Conceptual Model
  Zhe Yang, Dalu Zhang, Chuan Ye

Distributed Execution of Workflows in the INB
  Ismael Navas-Delgado, Antonio J. Perez, Jose F. Aldana-Montes, Oswaldo Trelles

Knowledge networks of biological and medical data; an exhaustive and flexible solution to model life sciences domains
  Sascha Losko, Karsten Wenger, Wenzel Kalus, Andrea Ramge, Jens Wiehler, Klaus Heumann

On characterizing and addressing mismatches in scientific workflows
  Khalid Belhajjame, Suzanne M. Embury, Norman W. Paton

Sponsors
========

We are grateful to our sponsors
- Microsoft
- IBM
- EBI Industry Program
- Metanomics in Plant Biotech
- Metanomics Health in Pharma & Nutrition
- Schering AG

From mahef111 at link.net Mon May 1 17:38:01 2006
From: mahef111 at link.net (Mhmoud Elhefnawi)
Date: Mon, 1 May 2006 23:38:01 +0200
Subject: [BiO BB] A tool for gene identification..
Message-ID: <000d01c66d6e$03ad5880$2f98c952@pc>

Dear colleagues,

Our lab isolated a 196 bp fragment from humans, from more than one cell type (part of a gene that is an oncogene for cancer). BLAST searches of all types found no homology to any published human gene. Is there a tool that can search the human genome to:

1) locate which chromosome this sequence is on?
2) enable us to get the rest of the sequence of that gene?
3) predict this gene based on the sequence submitted?

Your help is very highly appreciated.
Thanks,
Mahmoud

From idoerg at burnham.org Mon May 1 18:39:47 2006
From: idoerg at burnham.org (Iddo Friedberg)
Date: Mon, 01 May 2006 15:39:47 -0700
Subject: [BiO BB] A tool for gene identification..
In-Reply-To: <000d01c66d6e$03ad5880$2f98c952@pc>
References: <000d01c66d6e$03ad5880$2f98c952@pc>
Message-ID: <44568E33.1090802@burnham.org>

Which databases have you tried blasting against? Have you tried a full genomic database like Ensembl or the human genome at Santa Cruz?

./I

Mhmoud Elhefnawi wrote:

> Dear colleagues,
> Our lab isolated a 196 bp fragment from humans from more than one
> cell type ( part of a gene that is an oncogene for cancer). Conducting
> BLAST searches of all types resulted in no homology to any published
> human gene. Is there some tool that can identify the human genome to:
> 1) locate on which chromosome is this sequence?
> 2) enable us to get the rest of the sequence of that gene?
> 3) predict this gene based on the sequence submitted?
>
> Your help is very highhly appreciated.
> Thanks,
> Mahmoud
>
>------------------------------------------------------------------------
>
>_______________________________________________
>Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board

--
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9949
http://iddo-friedberg.org
http://BioFunctionPrediction.org

From golharam at umdnj.edu Mon May 1 18:43:08 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 01 May 2006 18:43:08 -0400
Subject: [BiO BB] A tool for gene identification..
In-Reply-To: <000d01c66d6e$03ad5880$2f98c952@pc>
Message-ID: <00da01c66d70$9fc60f80$2f01a8c0@GOLHARMOBILE1>

Did you BLAST your sequence against the human genome, or just (human) genes?

BLAST should do this for you if you are using the correct database. The output (and MapViewer) will then tell you which chromosome it's on and everything in the surrounding area.

From maximilianh at gmail.com Tue May 2 09:57:55 2006
From: maximilianh at gmail.com (Maximilian Haeussler)
Date: Tue, 2 May 2006 15:57:55 +0200
Subject: [BiO BB] A tool for gene identification..
In-Reply-To: <000d01c66d6e$03ad5880$2f98c952@pc>
References: <000d01c66d6e$03ad5880$2f98c952@pc>
Message-ID: <76f031ae0605020657q74e4c5b2va30b7cad2af787ea@mail.gmail.com>

You could also try to SSH (fast blast) against all of genbank, everything in any genome or other sequences.

-max

From maximilianh at gmail.com Tue May 2 10:03:21 2006
From: maximilianh at gmail.com (Maximilian Haeussler)
Date: Tue, 2 May 2006 16:03:21 +0200
Subject: [BiO BB] A tool for gene identification..
In-Reply-To: <76f031ae0605020657q74e4c5b2va30b7cad2af787ea@mail.gmail.com>
References: <000d01c66d6e$03ad5880$2f98c952@pc> <76f031ae0605020657q74e4c5b2va30b7cad2af787ea@mail.gmail.com>
Message-ID: <76f031ae0605020703v344b839el3d810406c81aa52b@mail.gmail.com>

Sorry, I should get my acronyms right... :-)
Of course, I meant ssaha instead of ssh:

http://trace.ensembl.org/cgi-bin/tracesearch
http://www.ncbi.nlm.nih.gov/BLAST/

On 02/05/06, Maximilian Haeussler wrote:
> You could also try to SSH (fast blast) against all _traces_,
> everything in any genome or any other sequences (nr).
>
> -max

--
Maximilian Haeussler, CNRS Gif-sur-Yvette, Paris
tel: +33 6 12 82 76 16
icq: 3825815 -- msn: maximilian.haeussler at hpi.uni-potsdam.de
skype: maximilianhaeussler

From natarajanganesan at gmail.com Mon May 1 18:30:40 2006
From: natarajanganesan at gmail.com (Natarajan Ganesan)
Date: Mon, 1 May 2006 18:30:40 -0400
Subject: [BiO BB] A tool for gene identification..
In-Reply-To: <000d01c66d6e$03ad5880$2f98c952@pc>
Message-ID: <000c01c66d6e$e0b6e2f0$bff0a18d@georgetown.mei.georgetown.edu>

Did you try "InstaSeq" - http://bioinformatics.georgetown.edu/InstaSeq.htm ?

Natarajan Ganesan, PhD
Email: ng6 at georgetown.edu

From MEC at Stowers-Institute.org Tue May 2 10:55:36 2006
From: MEC at Stowers-Institute.org (Cook, Malcolm)
Date: Tue, 2 May 2006 09:55:36 -0500
Subject: [BiO BB] software for sequence mutational analysis
Message-ID:

There are many. Three recommendations:

ABI has a very nice commercial offering for Windows called SeqScape (https://products.appliedbiosystems.com/ab/en/US/adirect/ab?cmd=catNavigate2&catID=600582).

Free/open-source is Staden (http://staden.sourceforge.net/), which has a variety of approaches to gene resequencing for mutation detection, with fairly recent development and improvements. Though it is cross-platform and open source, it optionally depends on components which are/were not available on Windows (phred for base calling, phrap for assembly), so I suggest Linux for this.

Also often used is a combo of Phred/Phrap/consed. See PolyPhred: http://droog.mbt.washington.edu/PolyPhred.html

Good luck,

Malcolm Cook
Stowers Institute for Medical Research

________________________________

From: bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org [mailto:bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org] On Behalf Of Mhmoud Elhefnawi
Sent: Wednesday, April 26, 2006 8:40 AM
To: bio_bulletin_board at bioinformatics.org
Subject: [BiO BB] software for sequence mutational analysis

Dear colleagues,

I have some closely related sequences on which I want to perform mutational analysis in an efficient way: locating mutation positions, frequency of mutations, etc. Is there some package that can help me with that?

Thanks in advance for helping,
Mahmoud

From hzheng_hotml at hotmail.com Tue May 2 11:41:01 2006
From: hzheng_hotml at hotmail.com (=?gb2312?B?y64gtb4=?=)
Date: Tue, 02 May 2006 23:41:01 +0800
Subject: [BiO BB] How can I retrieve the biggest transposable element from NCBI?
Message-ID:

I want to retrieve the sequence of the biggest transposable element from NCBI. Can anybody tell me the proper steps? I tried searching the NCBI nucleotide databases with the keyword "transposable element", but most of the records I retrieved were irrelevant. Another problem is how to distinguish the biggest transposon from the others before I download all the sequences. After all, nearly 8000 records is rather a lot. I would be glad if anyone could tell me a better way to achieve this. Thanks.

From mmarchywka at eyewonder.com Tue May 2 12:54:24 2006
From: mmarchywka at eyewonder.com (Mike Marchywka)
Date: Tue, 2 May 2006 12:54:24 -0400
Subject: [BiO BB] How can I retrieve the biggest transposable element from NCBI?
Message-ID: <73CA026E5E77C74398C69F3338C5967C0750E176@atlexc01.atlanta.eyewonder.com>

I wouldn't dismiss that approach more generally. Their support for eutils is amazing: for abstracts and blast results, downloading coarsely defined searches and then doing ad hoc sorting/browsing can be pretty powerful. 8000 is no big deal and, again, their support for automated interaction is wonderful.

This is the first link that came up on google and I think I've posted others in the past:

http://www.ncbi.nlm.nih.gov/Class/PowerTools/QuickScripts/course.html

For abstracts, I can download 1000's and data mine with little effort. Blast for peptides has been similarly useful. If you anticipate implementing a lot of ad hoc search/sort criteria, you should probably get some kind of scripting capability.

*************************************************************************
Mike Marchywka
EyeWonder
Instant Streaming, Infinite Results
1447 Peachtree Street 9th Floor
Atlanta, GA 30309
w.678-891-2033 c. h.770-565-8101
mmarchywka at eyewonder.com
alt: marchywka at hotmail.com
Instant Streaming, Intelligent results.
*************************************************************************

From mheusel at gmail.com Tue May 2 14:11:53 2006
From: mheusel at gmail.com (Martin Heusel)
Date: Tue, 2 May 2006 20:11:53 +0200
Subject: [BiO BB] Parameter -e blastpgp/PSI-BLAST
Message-ID: <6127fc200605021111s311c858dt1eceaceffd086012@mail.gmail.com>

Hi everyone,

I didn't find an explanation of what the blastpgp parameter -e is for. -h is the threshold for multipass searching, but what can one do with -e? Can anyone tell me?
Thanks a lot

bye

Martin

From marty.gollery at gmail.com Tue May 2 14:21:58 2006
From: marty.gollery at gmail.com (Martin Gollery)
Date: Tue, 2 May 2006 11:21:58 -0700
Subject: [BiO BB] Parameter -e blastpgp/PSI-BLAST
In-Reply-To: <6127fc200605021111s311c858dt1eceaceffd086012@mail.gmail.com>
References: <6127fc200605021111s311c858dt1eceaceffd086012@mail.gmail.com>
Message-ID:

-e sets the threshold expectation value for alignments in the final output. -h is the threshold for inclusion into the PSSM. I believe the default setting for -e is 10, which is extremely lenient.

Marty

--
Martin Gollery
Associate Director
Center For Bioinformatics
University of Nevada at Reno
Dept. of Biochemistry / MS330
775-784-7042

From mheusel at gmail.com Tue May 2 14:44:00 2006
From: mheusel at gmail.com (Martin Heusel)
Date: Tue, 2 May 2006 20:44:00 +0200
Subject: [BiO BB] Parameter -e blastpgp/PSI-BLAST
In-Reply-To:
References: <6127fc200605021111s311c858dt1eceaceffd086012@mail.gmail.com>
Message-ID: <6127fc200605021144v6e9d4229x251cbe6c4163ad1e@mail.gmail.com>

On 5/2/06, Martin Gollery wrote:
> -e sets the threshold expectation value for alignments in the final output.
> -h is the threshold for inclusion into the PSSM. I believe the default
> setting for -e is 10, which is extremely lenient.

Hi Marty,

is it correct then that -e doesn't affect a PSSM at the end of some iterations?

regards

Martin

From marty.gollery at gmail.com Tue May 2 15:12:08 2006
From: marty.gollery at gmail.com (Martin Gollery)
Date: Tue, 2 May 2006 12:12:08 -0700
Subject: [BiO BB] Parameter -e blastpgp/PSI-BLAST
In-Reply-To: <6127fc200605021144v6e9d4229x251cbe6c4163ad1e@mail.gmail.com>
References: <6127fc200605021111s311c858dt1eceaceffd086012@mail.gmail.com> <6127fc200605021144v6e9d4229x251cbe6c4163ad1e@mail.gmail.com>
Message-ID:

Correct. The -h option affects which sequences are allowed in building the PSSM, and -e affects the final alignment threshold. This is a useful thing, because you might typically want a more lenient threshold with -h to maximize the entropy of the PSSM, while keeping -e more stringent so that you don't get a million crummy hits at the end.

Marty

From dankoc at gmail.com Tue May 2 17:15:08 2006
From: dankoc at gmail.com (Charles Danko)
Date: Tue, 2 May 2006 17:15:08 -0400
Subject: [BiO BB] Trouble installing R in perl
Message-ID: <8adccabf0605021415y311ccfcx22dc54b8b6d9b3ff@mail.gmail.com>

Hi,

I am trying to install R in Perl on SUSE 10. I have downloaded the latest R distribution (2.3.0).

I ran
>./configure --enable-R-shlib
It completes without problems.

I ran
>make
It completes without problems, but bin/libR is not present (should it be present at this step?).

After running make install, bin/libR is still not present in the main R directory. R works fine after this step.

I then installed the R-from-Perl package using:
>R CMD INSTALL -c --configure-args='--with-in-perl' RSPerl
It installs the package without warning or error messages, but the directory "blib" is not present anywhere in the main R directory (/usr/local/lib/R).

Running
>perl test.pl
asks for the directory of R.pm and R.so. Providing these using the -I switch does not return an error, but it does not return any output either; the output is blank.

Any ideas on getting RSPerl working within Perl?

Thanks very much!

From mmarchywka at eyewonder.com Tue May 2 19:44:17 2006
From: mmarchywka at eyewonder.com (Mike Marchywka)
Date: Tue, 2 May 2006 19:44:17 -0400
Subject: [BiO BB] How can I retrieve the biggest transposable element from NCBI?
Message-ID: <73CA026E5E77C74398C69F3338C5967C07553E10@atlexc01.atlanta.eyewonder.com>

I got some time, so I modified my script to do this. Please check, since I have never bothered to search for genes, just proteins and papers.

$ eutilsnew -out transpx -v -nuc "transposable element"
Count is 7254
--18:33:51-- http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?tool=biot
echmarchywka&email=marchywka at hotmail.com&rettype=gb&retmode=text&retstart=0&retm
ax=7254&db=nucleotide&query_key=1&WebEnv=0cY3im7_1EoYdn6YdoWhXDgFBoXmBY08HIOiFw7
caoA5sCVabQVX5c at w92iPIIOFuEAAAsMPkIAAAAB
           => `transpx'
Resolving eutils.ncbi.nlm.nih.gov... 130.14.29.110
Connecting to eutils.ncbi.nlm.nih.gov[130.14.29.110]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [pubmed/text]

    [ <=> ] 664,162,678  315.71K/s

19:15:57 (256.82 KB/s) - `transpx' saved [664162678]
...............

This is pretty sloppy, but it more or less worked. You can check this list manually and clean it up, but it illustrates how you can do arbitrary or one-off stuff. Obviously, you need to be careful too (you can run various checksums with "wc", for example, or, better, write scripts that check syntax more accurately; you could parse each entry and then pick a name and length, not just look for things that look about right):

$ more transpx | grep "ACCESSION\|source" > names_and_length

$ more names_and_length | sed -e 's/\.\./ /'|grep "^ACCESSION\|^ source" |awk '{ if ($1=="source") print acc" "$3-$2; else acc=$2}' >tentative_list

$ cat tentative_list | sort -g -r -k 2 | more

NT_107239 28196692
NC_003076 26992727
NC_003074 23470804
NC_003071 19705358
NC_003075 18585041
NT_079899 17249720
NT_079927 14818988
AE005173 14668882
AE005172 14221814
NT_079879 11570171
NT_036312 10050052
NT_107181 9248308
NC_003888 8667506
NT_079926 8469245
NT_107178 7645237
NC_004578 6397125
AE016853 6397125
NC_002947 6181862
AE015451 6181862
NT_107180 6019142
NT_107179 4877844
NC_007355 4837407
CP000099 4837407
NT_080067 4809258
NC_003198 4809036
NC_003143 4653727
AP009048 4646331
AC_000091 4646331
NT_080068 4609299
NC_000962 4411531
NC_002945 4345491
NT_080060 4008076
NT_079961 3480659
NT_079923 2970703
NT_080061 2697501
NT_079854 2592258
NC_002935 2488634
NC_002950 2343475
AE015924 2343475
NT_080065 2206061
NC_004116 2160266
NT_079947 1767735
NT_107183 1680143
NT_080064 1671486
NT_107176 1593846
NT_080066 1531813
NT_107077 1516643
NT_107224 975436
NC_002771 963878
NT_080062 569062
BA000027 425934
NC_003903 356022
BX248360 349658
BX842574 349563
BX248354 348516
AF172282 339484
AL445563 327649
AL445564 321249
BX248336 320049
AL445565 315078
AL939126 295149
AL939116 293049
AF427791 261264
AL627283 249049
AL645702 205102
AJ414160 203727
AL161495 199614
AL161493 198219
AL161505 198175
AL161494 194891
AP005160 194110
AC092748 192266
AC092172 191396
AL161533 190025
AP005298 189892
AC068654 189348
AC153856 188026
AC079852 185903
--More--

From hzheng_hotml at hotmail.com Wed May 3 11:07:47 2006
From: hzheng_hotml at hotmail.com (zheng hui)
Date: Wed, 03 May 2006 23:07:47 +0800
Subject: [BiO BB] How can I retrieve the biggest transposable element from NCBI?
In-Reply-To: <73CA026E5E77C74398C69F3338C5967C07553E10@atlexc01.atlanta.eyewonder.com>
Message-ID:

Thank you for your enthusiastic help. I have some experience in Perl programming but didn't know about eUtils before. It is a really powerful utility, and I am very glad that you introduced it to me.

From mheusel at gmail.com Wed May 3 11:17:43 2006
From: mheusel at gmail.com (Martin Heusel)
Date: Wed, 3 May 2006 17:17:43 +0200
Subject: [BiO BB] Parameter -e blastpgp/PSI-BLAST
In-Reply-To:
References: <6127fc200605021111s311c858dt1eceaceffd086012@mail.gmail.com> <6127fc200605021144v6e9d4229x251cbe6c4163ad1e@mail.gmail.com>
Message-ID: <6127fc200605030817x231676fel6ad8c4e2d0ce3c12@mail.gmail.com>

On 5/2/06, Martin Gollery wrote:
> Correct. The -h option affects what sequences are allowed in building the
> PSSM, and -e affects the final alignment threshold. This is a useful thing,
> because you might typically want a more lenient threshold with -h to
> maximize the entropy of the PSSM, while keeping -e more stringent so that
> you don't get a million crummy hits at the end.

That makes it clear, thank you!

martin

From areed at imdc.org Wed May 3 13:48:02 2006
From: areed at imdc.org (Ann Reed)
Date: Wed, 3 May 2006 12:48:02 -0500
Subject: [BiO BB] Please take the Bioinformatics Skills Requirements Survey
Message-ID: <7BDF6464945AB045B845C2AB588B9B40249B79@tachyon.imdc.org>

Dear Colleagues,

The following survey has been created to assess the skills required to find employment in the field of bioinformatics. Please assist us and complete the survey at the following link:

http://bioinformatics.bioengr.uic.edu/survey/

The survey takes only a few minutes to complete. Your responses are critical in assessing the near-term skills requirements of this growing industry.

We plan to use the findings from this survey to guide new curriculum development for the BiTmaP program. All respondents can elect to receive an emailed summary of the report findings. We also plan to post the summary report on the bioinformatics.org site.
Best regards,
Ann Reed
Director, BiTmaP: Bioinformatics Training Program
www.BiTmaPchicago.com

BiTmaP is a tuition-free online bioinformatics certificate training program sponsored by the U.S. Dept. of Labor, the Chicago Technology Park and the University of Illinois at Chicago.

From mmarchywka at eyewonder.com Wed May 3 14:07:44 2006
From: mmarchywka at eyewonder.com (Mike Marchywka)
Date: Wed, 3 May 2006 14:07:44 -0400
Subject: [BiO BB] How can I retrieve the biggest transposableelementfromNCBI?
Message-ID: <73CA026E5E77C74398C69F3338C5967C07553E12@atlexc01.atlanta.eyewonder.com>

Did I find the right answer? What are you trying to determine? If the "largest" is some weird thing, then you may have to probe a little with even more ad-hoc criteria. The validation phase of an exploratory investigation should be pretty obvious. If you can parse or just sed/grep the candidate answers, you can make a few observations pretty quickly. 664 Megs may be a bit much, but I've found quite often that it is faster to download the coarse results, browse a little, and then devise an ad-hoc selection criterion (which could be very much a quirk of the data set) than to play with a web interface. Once you have a few results, it is easy to reformat them and pipe them elsewhere. If you are going to do any analysis, you really need to get around the web stuff anyway.

*************************************************************************
Mike Marchywka
EyeWonder
Instant Streaming, Infinite Results
1447 Peachtree Street 9th Floor
Atlanta, GA 30309
w.678-891-2033 c. h.770-565-8101
mmarchywka at eyewonder.com
alt: marchywka at hotmail.com
Instant Streaming, Intelligent results.
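
A minimal sketch of the parse-and-rank idea discussed in this thread, assuming the downloaded results are a GenBank flat file in which each record has an ACCESSION line followed by a "source" feature with plain start..end coordinates. The function name acc_lengths is hypothetical and not part of any posted script. The sketch adds 1 to end minus start because the coordinates are inclusive, so its lengths come out one higher than a plain subtraction would give.

```shell
#!/bin/sh
# acc_lengths FILE (hypothetical helper): print "accession length" pairs,
# longest first, from a GenBank flat file.
acc_lengths() {
    awk '
        # Remember the most recent accession seen.
        $1 == "ACCESSION" { acc = $2 }
        # A source feature with simple "start..end" coordinates;
        # anything more complex (partial or joined ranges) is skipped.
        $1 == "source" && $2 ~ /^[0-9]+[.][.][0-9]+$/ {
            split($2, range, "[.][.]")
            print acc, range[2] - range[1] + 1
        }
    ' "$1" | sort -k2,2nr
}
```

For example, "acc_lengths transpx | head" would list the ten longest candidates, where transpx is the download file named earlier in the thread. Doing the whole job in one awk pass avoids the intermediate files, at the cost of silently ignoring records whose source coordinates are not a simple range.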
*************************************************************************

-----Original Message-----
From: bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics.org [mailto:bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics.org]On Behalf Of zheng hui
Sent: WednesdayMay-03-2006 11:08 AM
To: bio_bulletin_board at bioinformatics.org
Subject: RE: [BiO BB] How can I retrieve the biggest transposableelementfromNCBI?

Thank you for your enthusiastic help. I have some experience in Perl programming but didn't know about eUtils before. This tool is a really powerful utility and I am so glad that you introduced it to me.

From mahef111 at link.net Wed May 3 17:09:20 2006
From: mahef111 at link.net (Mhmoud Elhefnawi)
Date: Wed, 3 May 2006 23:09:20 +0200
Subject: [BiO BB] Re: A tool for gene identification
Message-ID: <004c01c66ef5$d97b7ef0$b298c952@pc>

Dear all,

Thank you very much to all those who sent me their feedback and suggestions. I actually tried the Trace search, and BLAST on the human genome. Both yielded no significant similarity for humans.

We previously submitted this sequence to GenBank under accession number AY083907. It is 91% similar to a gene in Escherichia coli. Its annotation reads: /note="delta 2-isopentenyl adenosine tRNA-like protein; possible member of the mevalonate pathway; down-regulated in several human tumors including liver, urinary tract, bladder, and submaxillary carcinoma; similar to Escherichia coli and Shigella flexneri delta-2 isopentenyl pyrophosphate transferases".

We were able to isolate it from different human cell lines, and also from differential display analysis on different cancers like urethral cancer. What is the probability that it is a novel gene not yet discovered at all? And what percentage of the human genome is still not sequenced, and so cannot be searched against?

Thanks a lot again for all your kind assistance,
Mahmoud

From hararid at bgu.ac.il Thu May 4 06:29:02 2006
From: hararid at bgu.ac.il (Daniel Harari)
Date: Thu, 04 May 2006 10:29:02 GMT
Subject: [BiO BB] Re: A tool for gene identification
In-Reply-To: <004c01c66ef5$d97b7ef0$b298c952@pc>
References: <004c01c66ef5$d97b7ef0$b298c952@pc>
Message-ID:

Dear Mahmoud,

A BlastN search of your sequence against the NCBI NR database provides rather compelling evidence that the DNA that you have cloned is bacterial in origin and unlikely to be human. Is it possible that your PCR reaction pulled out bacterial DNA accidentally? Perhaps you should repeat your PCR experiments using blank controls or using non-human tissues / cell lines (e.g. mouse), with the same solutions that you used in your earlier experiments. If you get the same sequence using your negative controls as in your human tissue samples, then this will verify that you have an artifact here. (I hope for you that I am wrong.)

Regards,
Daniel

----- Original Message -----
From: Mhmoud Elhefnawi
Date: Wednesday, May 3, 2006 23:11
Subject: [BiO BB] Re: A tool for gene identification
To: bio_bulletin_board at bioinformatics.org

> Dear all,
> Thank you very much to all those who sent me their feedback and suggestions.
> I actually tried the Trace search, and BLAST on the human
> genome. Both yielded no significant similarity for humans.
> We previously submitted this sequence to GenBank under
> accession number AY083907. It is 91% similar to a gene
> in Escherichia coli. Its annotation reads: /note="delta 2-isopentenyl
> adenosine tRNA-like protein;
> possible member of the mevalonate pathway; down-regulated
> in several human tumors including liver, urinary tract,
> bladder, and submaxillary carcinoma; similar to
> Escherichia coli and Shigella flexneri delta-2 isopentenyl
> pyrophosphate transferases".
> We were able to isolate it from different human cell lines, and
> also from differential display analysis on different cancers
> like urethral cancer. What is the probability that it is a
> novel gene not yet discovered at all? And what percentage
> of the human genome is still not sequenced, and so cannot
> be searched against?
> Thanks a lot again for all your kind assistance,
> Mahmoud

From rabidphage at gmail.com Thu May 4 13:45:47 2006
From: rabidphage at gmail.com (darx)
Date: Thu, 04 May 2006 18:45:47 +0100
Subject: [BiO BB] sequence extraction
Message-ID: <445A3DCB.9000402@gmail.com>

Greetings
I've got a gene and i'd like to find the promoters of it. I don't know how to access the upstream and downstream sequences of that gene from ncbi databases. Please help...
I facing a deadline.. :'(
Thanks in advance....

From dankoc at gmail.com Thu May 4 13:59:14 2006
From: dankoc at gmail.com (Charles Danko)
Date: Thu, 4 May 2006 13:59:14 -0400
Subject: [BiO BB] sequence extraction
In-Reply-To: <445A3DCB.9000402@gmail.com>
References: <445A3DCB.9000402@gmail.com>
Message-ID: <8adccabf0605041059v3198eaddp487bccf778ceaa7@mail.gmail.com>

This process is trivial using ENSEMBL. Just search for the gene of interest, select it, and the option is on the main page for the gene to view upstream & downstream regions.

Good luck!
Charles

On 5/4/06, darx wrote:
>
> Greetings
> I've got a gene and i'd like to find the promoters of it. I don't know
> how to access the upstream and downstream sequences of that gene from
> ncbi databases. Please help...
> I facing a deadline.. :'(
> Thanks in advance....
> _______________________________________________
> Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>
-------------- next part -------------- An HTML attachment was scrubbed...
URL: From rabidphage at gmail.com Thu May 4 14:31:43 2006 From: rabidphage at gmail.com (darx) Date: Thu, 04 May 2006 19:31:43 +0100 Subject: [BiO BB] sequence extraction In-Reply-To: <8adccabf0605041059v3198eaddp487bccf778ceaa7@mail.gmail.com> References: <445A3DCB.9000402@gmail.com> <8adccabf0605041059v3198eaddp487bccf778ceaa7@mail.gmail.com> Message-ID: <445A488F.90105@gmail.com> Thanks a lot mate.. :) u rock.. Charles Danko wrote: > This process is trivial using ENSEMBL. Just search for the gene of > interest, select it, and the option is on the main page for the gene > to view upstream & downstream regions. > > Good luck! > Charles > > On 5/4/06, *darx* > > wrote: > > Greetings > I've got a gene and i'd like to find the promoters of it. I don't know > how to access the upstream and downstream sequences of that gene from > ncbi databases. Please help... > I facing a deadline.. :'( > Thanks in advance.... > _______________________________________________ > Bioinformatics.Org general > forum - BiO_Bulletin_Board at bioinformatics.org > > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > From namal at bigred.unl.edu Fri May 5 10:30:15 2006 From: namal at bigred.unl.edu (namal at bigred.unl.edu) Date: Fri, 05 May 2006 09:30:15 -0500 (CDT) Subject: [BiO BB] Glycosylation prediction- bacteria In-Reply-To: <8adccabf0605041059v3198eaddp487bccf778ceaa7@mail.gmail.com> References: <445A3DCB.9000402@gmail.com> <8adccabf0605041059v3198eaddp487bccf778ceaa7@mail.gmail.com> Message-ID: <1146839415.445b61771a365@webmail02.unl.edu> Hi I am looking for a program (web site) which can predict glycosylation sites of bacterial protein, Can anyone help me? 
Thanks namal From rabidphage at gmail.com Fri May 5 13:53:23 2006 From: rabidphage at gmail.com (darx) Date: Fri, 5 May 2006 18:53:23 +0100 Subject: [BiO BB] nucleotide oreintation Message-ID: <87c8630a0605051053s7b6168e6la1cb131730b70fda@mail.gmail.com> how can i find the orientation of a nucleotide form nucleotide databases? thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From rabidphage at gmail.com Fri May 5 13:54:55 2006 From: rabidphage at gmail.com (darx) Date: Fri, 5 May 2006 18:54:55 +0100 Subject: [BiO BB] nucleotide oreintation Message-ID: <87c8630a0605051054i4250eca2q110f4160a7d294c5@mail.gmail.com> how to figure out the orientation(5' or 3') from databases? -------------- next part -------------- An HTML attachment was scrubbed... URL: From hamid at ibb.ut.ac.ir Fri May 5 13:31:19 2006 From: hamid at ibb.ut.ac.ir (hamid) Date: Fri, 05 May 2006 22:01:19 +0430 Subject: [BiO BB] nucleotide oreintation In-Reply-To: <87c8630a0605051054i4250eca2q110f4160a7d294c5@mail.gmail.com> References: <87c8630a0605051054i4250eca2q110f4160a7d294c5@mail.gmail.com> Message-ID: Note that all the sequences are oriented in 5' to 3' direction. /* Hamid Nikbakht, M.Sc of Cell and Molecular Biology, Laboratory of Biophysics and Molecular Biology, Bioinformatics Center, Institute of Biochemistry and Biophysics(IBB), University of Tehran, Tehran,Iran. Tel: +98-21-6111-3322 Fax: +98-21-6640-4680 Alt. E-mail: nikbakht at ibb.ut.ac.ir */ From rabidphage at gmail.com Fri May 5 17:57:19 2006 From: rabidphage at gmail.com (darx) Date: Fri, 05 May 2006 22:57:19 +0100 Subject: [BiO BB] nucleotide oreintation In-Reply-To: References: <87c8630a0605051054i4250eca2q110f4160a7d294c5@mail.gmail.com> Message-ID: <445BCA3F.1010901@gmail.com> Thanks buddy.. hope everything is fine in Iran hamid wrote: > Note that all the sequences are oriented in 5' to 3' direction. 
> > /* > Hamid Nikbakht, > M.Sc of Cell and Molecular Biology, > Laboratory of Biophysics and Molecular Biology, > Bioinformatics Center, > Institute of Biochemistry and Biophysics(IBB), > University of Tehran, > Tehran,Iran. > Tel: +98-21-6111-3322 > Fax: +98-21-6640-4680 > Alt. E-mail: nikbakht at ibb.ut.ac.ir > */ > > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > From rabidphage at gmail.com Fri May 5 18:06:59 2006 From: rabidphage at gmail.com (darx) Date: Fri, 05 May 2006 23:06:59 +0100 Subject: [BiO BB] region identification Message-ID: <445BCC83.40900@gmail.com> hi, I'n looking for a tool to identify the basic region of a sequence. Google didn't help much. help please thanks in advance From idoerg at burnham.org Fri May 5 18:20:12 2006 From: idoerg at burnham.org (Iddo Friedberg) Date: Fri, 05 May 2006 15:20:12 -0700 Subject: [BiO BB] region identification In-Reply-To: <445BCC83.40900@gmail.com> References: <445BCC83.40900@gmail.com> Message-ID: <445BCF9C.4050406@burnham.org> darx wrote: > hi, > I'n looking for a tool to identify the basic region of a sequence. > Google didn't help much. > help please > thanks in advance > _______________________________________________ > Bioinformatics.Org general forum - > BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > What is "the basic region of a sequence"? Unless you mean "A basic region", as in containng a cluster of basic amino acids. -- Iddo Friedberg, Ph.D. Burnham Institute for Medical Reseach 10901 N. Torrey Pines Rd. 
La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 713 9949 http://iddo-friedberg.org http://BioFunctionPrediction.org From rabidphage at gmail.com Fri May 5 18:24:21 2006 From: rabidphage at gmail.com (darx) Date: Fri, 05 May 2006 23:24:21 +0100 Subject: [BiO BB] region identification In-Reply-To: <445BCF9C.4050406@burnham.org> References: <445BCC83.40900@gmail.com> <445BCF9C.4050406@burnham.org> Message-ID: <445BD095.1080509@gmail.com> Iddo Friedberg wrote: > darx wrote: > >> hi, >> I'n looking for a tool to identify the basic region of a sequence. >> Google didn't help much. >> help please >> thanks in advance >> _______________________________________________ >> Bioinformatics.Org general forum - >> BiO_Bulletin_Board at bioinformatics.org >> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board >> > > What is "the basic region of a sequence"? > > Unless you mean "A basic region", as in containng a cluster of basic > amino acids. > Forgive my english. Yes thats what I meant. A cluster of basic amino acids. I'm studiying bHLH proteins for an assignment. And i'm pretty much a noob. :) From rabidphage at gmail.com Fri May 5 19:32:13 2006 From: rabidphage at gmail.com (darx) Date: Sat, 6 May 2006 00:32:13 +0100 Subject: [BiO BB] region identification In-Reply-To: <445BD095.1080509@gmail.com> References: <445BCC83.40900@gmail.com> <445BCF9C.4050406@burnham.org> <445BD095.1080509@gmail.com> Message-ID: <87c8630a0605051632mab36b94t226c8a039730df15@mail.gmail.com> yes the basic amino acid cluster :) On 05/05/06, darx wrote: > > Iddo Friedberg wrote: > > darx wrote: > > > >> hi, > >> I'n looking for a tool to identify the basic region of a sequence. > >> Google didn't help much. 
> >> help please > >> thanks in advance > >> _______________________________________________ > >> Bioinformatics.Org general forum - > >> BiO_Bulletin_Board at bioinformatics.org > >> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > >> > > > > What is "the basic region of a sequence"? > > > > Unless you mean "A basic region", as in containng a cluster of basic > > amino acids. > > > Forgive my english. Yes thats what I meant. A cluster of basic amino > acids. I'm studiying bHLH proteins for an assignment. And i'm pretty > much a noob. :) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rabidphage at gmail.com Sat May 6 07:44:44 2006 From: rabidphage at gmail.com (darx) Date: Sat, 6 May 2006 12:44:44 +0100 Subject: [BiO BB] basic amino cluster Message-ID: <87c8630a0605060444i540dde83k393caa426092e7ba@mail.gmail.com> is there a too available to check for a cluster amino acids based on their properties. eg cluster of basic amino acids?? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmb at mrc-dunn.cam.ac.uk Sat May 6 09:56:02 2006 From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser) Date: Sat, 06 May 2006 14:56:02 +0100 Subject: [BiO BB] basic amino cluster In-Reply-To: <87c8630a0605060444i540dde83k393caa426092e7ba@mail.gmail.com> References: <87c8630a0605060444i540dde83k393caa426092e7ba@mail.gmail.com> Message-ID: <445CAAF2.4060704@mrc-dunn.cam.ac.uk> darx wrote: > is there a too available to check for a cluster amino acids based on > their properties. > eg cluster of basic amino acids?? > You can try ProtScale... 
http://www.expasy.org/cgi-bin/protscale.pl?Q7U3V6 > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board From rabidphage at gmail.com Sat May 6 10:46:05 2006 From: rabidphage at gmail.com (darx) Date: Sat, 6 May 2006 15:46:05 +0100 Subject: [BiO BB] basic amino cluster In-Reply-To: <445CAAF2.4060704@mrc-dunn.cam.ac.uk> References: <87c8630a0605060444i540dde83k393caa426092e7ba@mail.gmail.com> <445CAAF2.4060704@mrc-dunn.cam.ac.uk> Message-ID: <87c8630a0605060746o96cb39ewb472447d607a220f@mail.gmail.com> I couldn't figure out what option to chose. However, since I had a .pdb of the file, I worked around by coloring the basic residues in pymol. Thanks the link will be useful... On 06/05/06, Dan Bolser wrote: > > darx wrote: > > is there a too available to check for a cluster amino acids based on > > their properties. > > eg cluster of basic amino acids?? > > > > You can try ProtScale... > > http://www.expasy.org/cgi-bin/protscale.pl?Q7U3V6 > > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > Bioinformatics.Org general forum - > BiO_Bulletin_Board at bioinformatics.org > > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeff at bioinformatics.org Sun May 7 07:39:17 2006 From: jeff at bioinformatics.org (J.W. 
Bizzaro) Date: Sun, 07 May 2006 07:39:17 -0400 Subject: [BiO BB] Re: region identification In-Reply-To: <445BD095.1080509@gmail.com> References: <445BCC83.40900@gmail.com> <445BCF9C.4050406@burnham.org> <445BD095.1080509@gmail.com> Message-ID: <445DDC65.40709@bioinformatics.org> darx wrote: > I'm studiying bHLH proteins for an assignment. And i'm pretty > much a noob. :) Darx, I hope you're not expecting mailing list subscribers to answer every question on a homework assignment :-) That is unfair to yourself, your classmates, and the subscribers. Please reference your course materials and/or ask your instructor. Thanks, Jeff -- J.W. Bizzaro Bioinformatics Organization, Inc. (Bioinformatics.Org) E-mail: jeff at bioinformatics.org Phone: +1 508 890 8600 -- From dmb at mrc-dunn.cam.ac.uk Sun May 7 09:49:21 2006 From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser) Date: Sun, 07 May 2006 14:49:21 +0100 Subject: [BiO BB] Re: region identification In-Reply-To: <445DDC65.40709@bioinformatics.org> References: <445BCC83.40900@gmail.com> <445BCF9C.4050406@burnham.org> <445BD095.1080509@gmail.com> <445DDC65.40709@bioinformatics.org> Message-ID: <445DFAE1.80804@mrc-dunn.cam.ac.uk> J.W. Bizzaro wrote: > darx wrote: > >> I'm studiying bHLH proteins for an assignment. And i'm pretty >> much a noob. :) > > > Darx, I hope you're not expecting mailing list subscribers to answer > every question on a homework assignment :-) That is unfair to yourself, > your classmates, and the subscribers. Please reference your course > materials and/or ask your instructor. 
It's hard to know where to draw the line when it comes to 'research assignments' (kinda different from something like a maths assignment I guess); however, it is always good to acknowledge your sources :)

By coincidence I found this (SAPS);

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=1549558&query_hl=28

We describe several protein sequence statistics designed to evaluate distinctive attributes of residue content and arrangement in primary structure. Considered are global compositional biases, local clustering of different residue types (e.g., charged residues, hydrophobic residues, Ser/Thr), long runs of charged or uncharged residues, periodic patterns, counts and distribution of homooligopeptides, and unusual spacings between particular residue types. The computer program SAPS (statistical analysis of protein sequences) calculates all the statistics for any individual protein sequence input and is available for the UNIX environment through electronic mail on request to V.B. (volker/genomic at stanford.edu).

While reading an excellent summary email posted here;

http://bioinformatics.org/pipermail/ssml-general/2005-July/000203.html

Darx, if you have time, it is always good to send a final email back to the list summarizing what you have found.

Cheers,
Dan.

> Thanks,
> Jeff

From rres.bab-announce at bbsrc.ac.uk Thu May 4 11:16:11 2006
From: rres.bab-announce at bbsrc.ac.uk (rres bab-announce (RRes-Roth))
Date: Thu, 4 May 2006 16:16:11 +0100
Subject: [BiO BB] 3rd Integrative Bioinformatics Workshop - NEW PAPER SUBMISSION DEADLINE JUNE 5
Message-ID:

*** CALL FOR PAPERS - NEW PAPER SUBMISSION DEADLINE JUNE 5 ***

we have extended the deadlines for the workshop.
Several people have requested this as the current deadline clashes with other events. 3rd Integrative Bioinformatics Workshop September 4-6, 2006 Rothamsted Research, Harpenden, Hertfordshire, United Kingdom http://www.rothamsted.bbsrc.ac.uk/bab/conf/ibiof/ Accepted papers will also appear in the Journal of Integrative Bioinformatics http://journal.imbio.de/ DESCRIPTION Biological data are scattered across hundreds of biological databases and thousands of scientific journals. Current high throughput genomics technologies generate large quantities of high dimensional data. Microarray, NMR, mass spectrometry, protein chips, gel electrophoresis data, Yeast-Two-Hybrid, QTL mapping, gene silencing and knockout experiments are all examples of technologies that capture thousands of data points, often in single experiments. The challenge for Integrative Bioinformatics is to capture, model, integrate and analyse these data in a consistent way to provide new and deeper insights into complex biological systems. This, third workshop on Integrative Bioinformatics will be of interest to Bioinformaticians, Computer Scientists and others working in, or interested in finding out more about, the developing area of integrative bioinformatics. There will be opportunities to present and discuss methods, theoretical approaches or their practical applications. 
TOPICS

Database Integration
Combined dry and wetlab studies
Molecular Databases / Data Warehouses
Errors and inconsistencies in biological databases
Prediction and Integration of Metabolic and Regulatory Networks
Genotype - phenotype linkage
Protein-Protein-Interactions
Microarray Modeling and Analyses
Integrative Approaches for Drug Design
Computational Infrastructure for Biotechnology
Virtual Cell Modeling
Gene Identification, Regulation and Expression
Identification of Gene Regulatory Networks
Computational Systems Biology
Computational Proteomics
Optimization of Workflow Management in Bioinformatics
Bio Ontologies
Quality and consistency of ontologies
Integrative modeling and simulation frameworks
Integrative data and text mining approaches

IMPORTANT DATES

5th June 2006     Paper submission deadline
10th July 2006    Notification of acceptance for papers
24th July 2006    Camera ready paper submission deadline
1st August 2006   Registration deadline
15th August 2006  Poster submission deadline

ORGANISING COMMITTEE

Julio Collado-Vides  UNAM, Mexico
Ralf Hofestädt       Bielefeld University, Germany
Paul Kersey          EBI, UK
Jacob Koehler        RRes, UK
Chris Rawlings       RRes, UK
Uwe Scholz           IPK Gatersleben, Germany
Paul Verrier         RRes, UK

CONTACT

Karen Morris, BAB Division, Rothamsted Research, West Common, Harpenden, Herts. AL5 2JQ, UK
karen.morris at bbsrc.ac.uk
tel: 01582 763133 ext 2813
fax: 01582 467907

From mmarchywka at eyewonder.com Sun May 7 10:59:52 2006
From: mmarchywka at eyewonder.com (Mike Marchywka)
Date: Sun, 7 May 2006 10:59:52 -0400
Subject: [BiO BB] How can I retrieve the biggest transposableelementfromNCBI?
Message-ID: <73CA026E5E77C74398C69F3338C5967C0750E190@atlexc01.atlanta.eyewonder.com>

The reason I was so enthusiastic is I had been meaning to do this for a while as I had a backlog of proteins I wanted to compare and contrast. This, and the recent set of more basic links posted over the last few days got me to the point of making the needed additions to my scripts.
In particular, manual selection by size is important to avoid precursors, stupid fragments, variants, etc. The link from Dan Bolser pointed me to a bunch of uniformly coded ranking systems

http://www.expasy.org/cgi-bin/protscale.pl?Q7U3V6

that I could parse with another script. Getting the text color modules from CPAN, I now have a pretty easy way to compare/color code just the proteins I want with clustalw. So, I get things like this in color:

$ more cd28comp.aln |proteinparse -tr aa/dolittle
Table loaded ok
CLUSTAL W

AAA51944 -MLRLLLALNLFPSIQVTGNKILVKQSPMLVAYDNAVNLSCKYSYNLFSREFRASLHKGL
AAK37601 -MLRLLLALNLLPSIRVTGNKILVKQSPMLVAYDNAVNLSCKYSYNLFSREFRASLHKGL
AAK37604 -MLRLLLVLNLFPSIQATGIKILVKQSPMLEAYDNTVNLTCKYSCNLFSRQFQASLHKGV
CAA63707 -MLRLLLALNFFPSIQVAENKILVKQSPMLVVNDNEVNLSCKYTYNLFSKEFRASLYKGA
BAA92349 MILRLLLALNFFPSIQVTENKILVKQLPRLVVYNNEVNLSCKYTHNLFSKEFRASLYKGV
AAB53574 MILRLLLALNFFPSIQVTENKILVKQLPRLVVYNNEVNLSCKYTHNFFSKEFRASLYKGV
AAF72533 MILRLLLALNFFPSIQVTENKILVKQLPRLVVYNNEVNLSCKYTYNLFSKEFRASLYKGV
CAA39003 MTLRLLFLALSFFSVQVTENKILVKQSPLLVVDNNEVSLSCRYSYNLLAKEFRASLYKGV

Anyone have comments on bioperl? This seems to keep coming up in google as I look for modules to install. I keep building my own stuff and have a pretty big collection of works in progress (kluges). There are probably canned packages that do these things better. FWIW, I anticipate looking at things like the following:

1) Immune signalling: the immediate interest is tgn1412 (comes up on google) and species translation issues when dealing with immune pathways (cd28 and ctla-4 in the present case).

2) If I ever get back to it, an earlier interest in defective ribosomes/translation and immunotherapy. That is, Dendreon had one paper claiming that frame shifted proteins were unique to and may be general features of cancers (animal data on an immunotherapy targeting a frame shifted protein looked pretty compelling).
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=15568617&query_hl=7&itool=pubmed_docsum You can appreciate the needs here- finding candidate alternative-reading frame peptides that can be observed by the immune system suggests that you may want to try speculative translation schemes applied to genome sequences and then feed to epitope prediction stuff etc. Or, look at published peptides and see what mis-translated genes they may derive from. 3) General issues in translating animal results into people as they may be described in small differences in key proteins- AA sequence and postranslational modifications. 4) Data mining from scientific abstracts- nothing particularly novel here but I have been amazed at how easy it is to overlook simple search strategies until a script output some phrases/ideas. -----Original Message----- From: bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics.org [mailto:bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics.org]On Behalf Of zheng hui Sent: WednesdayMay-03-2006 11:08 AM To: bio_bulletin_board at bioinformatics.org Subject: RE: [BiO BB] How can I retrieve the biggest transposableelementfromNCBI? Thank you for your enthusiastic help. I have some exprience in perl programing but don't know eUtils before. This tool is a real powerful utility and I feel so glad that you introduce it to me. From massimo.ubaldi at unicam.it Mon May 8 06:26:52 2006 From: massimo.ubaldi at unicam.it (Massimo Ubaldi) Date: Mon, 8 May 2006 12:26:52 +0200 Subject: [BiO BB] Summer School Microarray and Bioinformatics Message-ID: The University of Camerino (Italy) is pleased to announce the II edition of the Summer School "Microarray Technology and Bioinformatics" to be held in Camerino from August 28 to September 1, 2006. 
Topics of the course include: Microarray experiment design, SNP and MicroRNA microarrays, data preprocessing and normalization, Illumina BeadArray data analysis, Affymetrix quality control, preprocessing and normalisation, clustering and classification, Web tols for microarray data analysis, microarray data mining. The complete informations about topics, registration and accomodation can be found at the web site: http://web.unicam.it/microarray. Best Regards Massimo Ubaldi Massimo Ubaldi PhD Department of Experimental Medicine and Public Health University of Camerino (UNICAM) Via Scalzino 3 62032 Camerino (MC) Italy phone: +39 0737 403322 fax: +39 0737 630618 From paolo at nettab.org Mon May 8 08:09:01 2006 From: paolo at nettab.org (Paolo Romano) Date: Mon, 8 May 2006 14:09:01 +0200 (DFT) Subject: [BiO BB] CfP: NETTAB 2006 GRID Infrastructures for Bioinformatics Message-ID: <200605081209.k48C92O9005742@ibm43p.biotech.ist.unige.it> Workshop NETTAB 2006 on Distributed Applications, Web Services, Tools and GRID Infrastructures for Bioinformatics July 10-13, 2006 Santa Margherita di Pula, Sardinia, Italy http://www.nettab.org/2006/ CALL FOR PAPERS New deadline for oral communications: May 12, 2006 Due to a number of requests, we decided to postpone the deadline for the submission of oral presentations. The new deadline is now defined as May 12, 2006. This change should not influence the other deadlines. Oral presentations due: May 12, 2006 Notification to authors: May 22, 2006 Posters and position papers due: May 31, 2006 Early registration: May 31, 2006 We are now planning for a special issue of IEEE Transactions on Nanobiosciences for the best papers of our workshop. A special call will soon be delivered. The expected deadline is around next June 15, 2006. 
TOPICS

A non-exhaustive list of topics relevant to the workshop includes:

Technologies and technological platforms:
- Standards and protocols for Web Services
- Choreography and Orchestration of Web Services
- Comparison of available technologies, limitations, pros and cons
- Knowledge representation and knowledge modeling tools for biological data
- Ontologies, databases and applications of semantics in bioinformatics
- Semantic Web technologies

New tools for bioinformatics:
- Web Services and related tools
- Workflow management systems and enactment portals
- Technologies for GRID infrastructures
- Semantic Web tools

Applications in bioinformatics:
- Case studies, scenarios and use cases
- Remote applications for life sciences analysis
- Workflows for actual data analysis in life sciences
- Data intensive applications on GRID solutions
- Remote biological data mining

SCIENTIFIC PROGRAMME

OPENING LECTURE

Bioinformatics GRID based projects overview
Luciano Milanesi, Biomedical Technologies Institute (ITB), National Research Council (CNR), Milan, Italy

INVITED LECTURES

ICT technologies involved with GRID infrastructure (title to be confirmed)
Roberto Barbera, University of Catania and National Nuclear Physics Institute (INFN), Catania, Italy

Setting up a Bioinformatics service Center in a distributed environment
Patricia Rodriguez-Tome', Center for Advanced Studies, Research and Development in Sardinia (CRS4), Cagliari, Italy

The EELA project: Biomedical Applications (title to be confirmed)
Rafael Mayo Garcia, Research Centre for Energy, Environment and Technology (CIEMAT), Madrid, Spain

HealthGrid (title to be confirmed)
Vincent Breton, CNRS, France

TUTORIALS

Agents in bioGRID computing
Emanuela Merelli, University of Camerino, Camerino (MC), Italy

Agent-based Infrastructures and Tools for Bioinformatics
Giuliano Armano and Eloisa Vargiu, DIEE, University of Cagliari, Cagliari, Italy

New bioinformatics applications based on Web Services technologies and GRID computing
Tiziana Castrignano', CASPUR, Rome, Italy

bioMOBY concept, architecture, central repository and related tools
Martin Senger, IRRI, Manila, Philippines

BioinfoGRID: theory and practice (title to be confirmed)
Roberto Barbera, University of Catania and INFN, Catania, Italy and Giorgio Pietro Maggi, INFN, Bari, Italy

Tutorials will be held at Polaris Science and Technology Park of Sardinia, very close to the location of the workshop. A shuttle bus will be available to reach Polaris from Is Morus Relais. A presentation of groups working in Polaris and of their scientific activity will also be given on Thursday 13 at the same location.

Best regards.
On behalf of the Organizing Committee
Paolo Romano

---
Paolo Romano (paolo.romano at istge.it)
Bioinformatics and Structural Proteomics
National Cancer Research Institute (IST)
Largo Rosanna Benzi, 10, I-16132, Genova, Italy
Tel: +39-010-5737-288 Fax: +39-010-5737-295
Skype: p.romano
Web: http://www.nettab.org/promano/

From sswang at berkeley.edu Mon May 8 09:47:42 2006
From: sswang at berkeley.edu (Sarah Wang)
Date: Mon, 08 May 2006 06:47:42 -0700
Subject: [BiO BB] E.coli K12 W3110 and MG1655 gene id conversion
In-Reply-To:
Message-ID:

Hi,

I need to convert the KEGG gene IDs of E.coli K12 W3110 (like "JW0001") to the corresponding MG1655 gene IDs, ideally to Blattner numbers (like "b0001"). Does anybody know where to download this conversion table, or does anybody have one?

Thanks
Sarah

From kannaiah at bsd.uchicago.edu Mon May 8 17:48:39 2006
From: kannaiah at bsd.uchicago.edu (kannaiah at bsd.uchicago.edu)
Date: Mon, 8 May 2006 16:48:39 -0500
Subject: [BiO BB] Extracting upstream sequence of a gene
Message-ID: <1147124919.445fbcb75d225@netmail.bsd.uchicago.edu>

Hello,

I have seen a few posts asking similar questions. I am looking to do something similar too.

I want to extract the upstream sequence of genes (up to 3000 bp upstream) in human. Going through the Ensembl website is OK if one has only a few genes.
But I have a few hundred genes. I was wondering what would be the best way to automate this. Should I try BLASTing the gene sequences against the human chromosome files, then parse the BLAST output to get the positions of the genes, and go back and read the chromosome sequence where each was found to get the upstream sequence?

That would be a long way round; hopefully there is some other, shorter way to do this which I am not aware of. Any suggestions would be welcome :)

Thank you

-hak

-------------------------------------------------
This email is intended only for the use of the individual or entity to which it is addressed and may contain information that is privileged and confidential. If the reader of this email message is not the intended recipient, you are hereby notified that any dissemination, distribution, or copying of this communication is prohibited. If you have received this email in error, please notify the sender and destroy/delete all copies of the transmittal. Thank you.
-------------------------------------------------

From pnuin at terra.com.br Mon May 8 17:56:44 2006
From: pnuin at terra.com.br (Paulo Nuin)
Date: Mon, 08 May 2006 17:56:44 -0400
Subject: [BiO BB] Extracting upstream sequence of a gene
In-Reply-To: <1147124919.445fbcb75d225@netmail.bsd.uchicago.edu>
References: <1147124919.445fbcb75d225@netmail.bsd.uchicago.edu>
Message-ID: <445FBE9C.3070407@terra.com.br>

Hi

If you have the IDs of these genes you can do that on the UCSC genome browser. You can set a region to download automatically from a multiple search.

Regards

Paulo

From kannaiah at bsd.uchicago.edu Mon May 8 18:03:21 2006
From: kannaiah at bsd.uchicago.edu (kannaiah at bsd.uchicago.edu)
Date: Mon, 8 May 2006 17:03:21 -0500
Subject: [BiO BB] Extracting upstream sequence of a gene
In-Reply-To: <445FBE9C.3070407@terra.com.br>
References: <1147124919.445fbcb75d225@netmail.bsd.uchicago.edu> <445FBE9C.3070407@terra.com.br>
Message-ID: <1147125801.445fc0294a892@netmail.bsd.uchicago.edu>

Hi Paulo,

I didn't really see a way to do a multiple search on the genome browser. To extract from a certain region, I guess I will then have to compile the start and end positions of the genes in the chromosomes beforehand?

-Kiran

From pnuin at terra.com.br Mon May 8 18:10:34 2006
From: pnuin at terra.com.br (Paulo Nuin)
Date: Mon, 08 May 2006 18:10:34 -0400
Subject: [BiO BB] Extracting upstream sequence of a gene
In-Reply-To: <1147125801.445fc0294a892@netmail.bsd.uchicago.edu>
References: <1147124919.445fbcb75d225@netmail.bsd.uchicago.edu> <445FBE9C.3070407@terra.com.br> <1147125801.445fc0294a892@netmail.bsd.uchicago.edu>
Message-ID: <445FC1DA.4060408@terra.com.br>

Hi Kiran

Try this

http://genome.ucsc.edu/cgi-bin/hgTables

It is the Table Browser. You can set several different parameters and do a search with hundreds of gene IDs. Another option is to write a script and search for genes using the Gene Sorter, but then you would need to parse the output.

HTH

Paulo

From mmarchywka at eyewonder.com Mon May 8 18:38:09 2006
From: mmarchywka at eyewonder.com (Mike Marchywka)
Date: Mon, 8 May 2006 18:38:09 -0400
Subject: [BiO BB] Extracting upstream sequence of a gene
Message-ID: <73CA026E5E77C74398C69F3338C5967C0750E19B@atlexc01.atlanta.eyewonder.com>

In my prior post, I discussed one reason that I may want to download the entire genome and apply various translation schemes (to model faulty ribosomes), pick out certain peptides, and then run them through BLAST to see if they have been observed anywhere. Someone posted the Ensembl link the other day:

ftp://ftp.ensembl.org/pub/current_homo_sapiens/data/fasta/dna/

and, sure enough, if I gunzip it I have what I need; one chromosome at a time may even fit in memory. Now, I just need some codon libraries. I can find simple things like this:

http://www.cbs.dtu.dk/courses/27613/codon.html

but are there species-specific probabilities anywhere? I did find this:

http://www.cbs.dtu.dk/dtucourse/27611spring2006/Lecture03/virtualribosome.pdf

Thanks.

*************************************************************************
Mike Marchywka
EyeWonder
Instant Streaming, Infinite Results
1447 Peachtree Street 9th Floor
Atlanta, GA 30309
w.678-891-2033 c. h.770-565-8101
mmarchywka at eyewonder.com
alt: marchywka at hotmail.com
Instant Streaming, Intelligent results.
*************************************************************************

From nagesh.chakka at anu.edu.au Tue May 9 03:58:44 2006
From: nagesh.chakka at anu.edu.au (Nagesh Chakka)
Date: Tue, 09 May 2006 17:58:44 +1000
Subject: [BiO BB] Fish genomics
Message-ID: <44604BB4.7000606@anu.edu.au>

Hi All,

I have just started working with fish genomes for my comparative study. I am a bit puzzled as to why three different fish species were selected for sequencing (Danio rerio, Fugu rubripes, and Tetraodon nigroviridis). Does each of these species offer some advantage for sequencing that the others lack? I am addressing this question to this forum as I have a feeling that someone out there may be working with fish genomes and may have extensive information about what I am looking for. Please also note that I could not find any straightforward answer to my question by searching the web.
Thanks
Nagesh

From rb at hcl.in Tue May 9 04:09:40 2006
From: rb at hcl.in (Balamurugan.R)
Date: Tue, 09 May 2006 13:39:40 +0530
Subject: [BiO BB] Glycosylation prediction- bacteria
In-Reply-To: <1146839415.445b61771a365@webmail02.unl.edu>
References: <445A3DCB.9000402@gmail.com><8adccabf0605041059v3198eaddp487bccf 778ceaa7@mail.gmail.com> <1146839415.445b61771a365@webmail02.unl.edu>
Message-ID: <44604E44.2050704@hcl.in>

hi,

Try the CBS Prediction Servers at http://www.cbs.dtu.dk/services/

Best Regards,
Balamurugan.R

namal at bigred.unl.edu wrote:

>Hi
>I am looking for a program (web site) which can predict glycosylation sites of
>bacterial protein, Can anyone help me?
>Thanks
>namal
>_______________________________________________
>Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>
>DISCLAIMER:
>-----------------------------------------------------------------------------------------------------------------------
>
>The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only.
>It shall not attach any liability on the originator or HCL or its affiliates. Any views or opinions presented in
>this email are solely those of the author and may not necessarily reflect the opinions of HCL or its affiliates.
>Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of
>this message without the prior written consent of the author of this e-mail is strictly prohibited. If you have
>received this email in error please delete it and notify the sender immediately. Before opening any mail and
>attachments please check them for viruses and defect.
>
>-----------------------------------------------------------------------------------------------------------------------

From roy at colibase.bham.ac.uk Tue May 9 07:56:11 2006
From: roy at colibase.bham.ac.uk (Roy Chaudhuri)
Date: Tue, 09 May 2006 12:56:11 +0100
Subject: [BiO BB] E.coli K12 W3110 and MG1655 gene id conversion
Message-ID: <4460835B.5010605@colibase.bham.ac.uk>

Hi Sarah,

> I need to convert gene ID at KEGG of E.coli K12 W3110 (like "JW0001") to the
> corresponding MG1655 geneID, ideally to Blattner Number (like "b0001").
> Does anybody know where to download this conversion table or does anybody
> have one?

Our recent reannotation of the E.coli K-12 genome includes a table with this (and lots of other) information in the supplementary data. See:

http://nar.oxfordjournals.org/cgi/content/full/34/1/1
http://nar.oxfordjournals.org/content/vol34/issue1/images/data/1/DC1/Supplementary_Table_1_Annotation_E._coli_Genes.txt

Roy.

--
Dr. Roy Chaudhuri
Bioinformatics Research Fellow
Division of Immunity and Infection
University of Birmingham, U.K.
http://xbase.bham.ac.uk

From kannaiah at bsd.uchicago.edu Tue May 9 10:00:26 2006
From: kannaiah at bsd.uchicago.edu (kannaiah at bsd.uchicago.edu)
Date: Tue, 9 May 2006 09:00:26 -0500
Subject: [BiO BB] Fish genomics
In-Reply-To: <444DD1A5.20801@anu.edu.au>
References: <444DD1A5.20801@anu.edu.au>
Message-ID: <1147183226.4460a07a92fa1@netmail.bsd.uchicago.edu>

Hi Nagesh,

When I did some comparative work in eukaryotes, I used both Fugu and Danio in my studies.

Tetraodon and Fugu, the two types of pufferfish, were sequenced because of their small genomes, which are among the smallest known in vertebrates. Follow these links:

http://genome.jgi-psf.org/Takru4/Takru4.home.html
http://www.cns.fr/externe/English/Projets/Projet_C/organisme_C.html#resume

I know that zebrafish is used quite a lot in various biological studies and hence was a popular candidate for sequencing.
Follow this link and you will get a better idea as to why...

http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/Z/Zebrafish.html

Hope that helps

-Kiran

From lray at albany.edu Tue May 9 10:22:07 2006
From: lray at albany.edu (Lipika Ray)
Date: Tue, 9 May 2006 10:22:07 -0400 (EDT)
Subject: [BiO BB] Percentage sequence identity calculation program
Message-ID: <1369.169.226.137.206.1147184527.squirrel@webmail.albany.edu>

Hello,

I have a set of fragments of sequences which are of different length.
I want to calculate the percentage sequence identity of those sequences. Which program would be best suited to this type of calculation? I have seen that a lot of databases report the percentage sequence identity of the sequences used, but I have no idea which program they use to calculate it. Please help me if anyone has any clue.

Thanks in advance,
Lipika

From christoph.gille at charite.de Tue May 9 10:55:49 2006
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Tue, 9 May 2006 16:55:49 +0200 (CEST)
Subject: [BiO BB] Percentage sequence identity calculation program
In-Reply-To: <1369.169.226.137.206.1147184527.squirrel@webmail.albany.edu>
References: <1369.169.226.137.206.1147184527.squirrel@webmail.albany.edu>
Message-ID: <52013.141.42.56.114.1147186549.squirrel@webmail.charite.de>

If the number of sequences is less than 2000, the program STRAP can compute a matrix of pairwise sequence identities. You can produce a large alignment and compute the matrix, or you can handle each pair on its own. You can use an alignment score or the crude percentage of identical residues. There is a choice of several standard alignment programs. This matrix can then be exported as ASCII text to be further processed in a spreadsheet.

From golharam at umdnj.edu Tue May 9 11:14:54 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Tue, 09 May 2006 11:14:54 -0400
Subject: [BiO BB] Percentage sequence identity calculation program
In-Reply-To: <1369.169.226.137.206.1147184527.squirrel@webmail.albany.edu>
Message-ID: <045601c6737b$57937110$2f01a8c0@GOLHARMOBILE1>

If you are referring to comparing two sequences, you can use a global alignment tool (such as needle from EMBOSS).
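
[Editorial sketch] To make the global-alignment suggestion concrete, here is a minimal Python illustration of how a percentage identity could be derived from a pairwise global alignment. It is not what needle, STRAP, or any database actually runs: the scoring values (match +1, mismatch -1, linear gap -2) and the function names global_align and percent_identity are assumptions chosen for illustration, whereas real tools typically use substitution matrices and affine gap penalties. The two denominator options are there because the reported percentage depends on what the match count is divided by.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Scoring values are illustrative assumptions, not needle's defaults.
    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])   # residue of a aligned to a gap
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")        # residue of b aligned to a gap
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a, aln_b, denominator="alignment"):
    """Percentage of identical aligned residues.

    denominator="alignment": divide by alignment length (gaps included).
    denominator="shorter":   divide by the length of the shorter sequence.
    """
    matches = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    if denominator == "alignment":
        length = len(aln_a)
    elif denominator == "shorter":
        length = min(len(aln_a.replace("-", "")), len(aln_b.replace("-", "")))
    else:
        raise ValueError("denominator must be 'alignment' or 'shorter'")
    return 100.0 * matches / length if length else 0.0


if __name__ == "__main__":
    x, y = global_align("ACGT", "AGT")
    print(x)
    print(y)
    print(percent_identity(x, y), percent_identity(x, y, "shorter"))
```

For fragments of different lengths the two denominators can differ considerably, so it is worth stating which one was used whenever a percentage identity is reported.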

From golharam at umdnj.edu Tue May 9 11:14:54 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Tue, 09 May 2006 11:14:54 -0400
Subject: [BiO BB] Fish genomics
In-Reply-To: <444DD1A5.20801@anu.edu.au>
Message-ID: <045701c6737b$593931d0$2f01a8c0@GOLHARMOBILE1>

You should read the papers that published the genomes of these different species. They mention the reasons for sequencing each genome. I remember from the Fugu paper: Fugu is a distant cousin of humans and has a very compact genome, about 1/8 the size. As such, it lacks a lot of the "junk DNA" (in introns and intergenic regions), leaving its genome mostly functional. It should help in determining what is functional in human versus what is non-functional.

Also, look at the websites that make the genomes available: NCBI, UCSC, Ensembl, etc. If you read their "About" pages, you should get more information...
Ryan -----Original Message----- From: bio_bulletin_board-bounces+golharam=umdnj.edu at bioinformatics.org [mailto:bio_bulletin_board-bounces+golharam=umdnj.edu at bioinformatics.org ] On Behalf Of Nagesh Chakka Sent: Tuesday, April 25, 2006 3:37 AM To: bio_bulletin_board at bioinformatics.org Subject: [BiO BB] Fish genomics Hi All, I just started work with fish genomes for my comparative study. I am a bit puzzled as to why three different fish species were selected for sequencing (Danio rerio , Fugu rubripes, and Tetraodon nigroviridis). Is there is any advantage in each of these species selected for sequencing which is not there in the other? I am addressing this question to this forum as I have a feeling that someone out there may be working with fish genome and may be having extensive information about what I was looking for. Please also note that I could not find any straight forward answer to my question searching the web. Thanks Nagesh _______________________________________________ Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org https://bioinformatics.org/mailman/listinfo/bio_bulletin_board From bukowski at tc.cornell.edu Tue May 9 11:58:03 2006 From: bukowski at tc.cornell.edu (Robert Bukowski) Date: Tue, 9 May 2006 11:58:03 -0400 Subject: [BiO BB] ParseBlastXmlReport in InterProScan 4.2 Message-ID: <395BBF33F0ED3F42876CB9C653729F4A8A60AA@mail.tc.cornell.edu> Hi, I was wondering if anybody knows where I could obtain the source of the program ParseBlastXmlReport. The executable is distributed with the latest version of InterProScan (v 4.2), but the source code is not provided. Thanks in advance, Robert -------------- next part -------------- An HTML attachment was scrubbed... 
From kannaiah at bsd.uchicago.edu Tue May 9 13:30:11 2006 From: kannaiah at bsd.uchicago.edu (kannaiah at bsd.uchicago.edu) Date: Tue, 9 May 2006 12:30:11 -0500 Subject: [BiO BB] Extracting upstream sequence of a gene In-Reply-To: <445FBE9C.3070407@terra.com.br> References: <1147124919.445fbcb75d225@netmail.bsd.uchicago.edu> <445FBE9C.3070407@terra.com.br> Message-ID: <1147195811.4460d1a332659@netmail.bsd.uchicago.edu> Hi Guys, I was able to get the upstream sequences using BioMart, thanks to Amir (Bauer Center at Harvard Univ). Here is the link to BioMart: http://www.ensembl.org/Multi/martview Steps: 1) Under Dataset: select (Ensembl 38, Homo sapiens genes). 2) Filters: GENE - ID LIST LIMIT - "HGNC Symbols"; enter symbols or upload a list. 3) Output: ATTRIBUTE - (select Sequences) - SEQUENCES - (select Flank (Gene)) - check the box "Upstream Flank". Choose as many other attributes as you need in your output file. -Kiran Quoting Paulo Nuin: > Hi > > If you have the IDs of these genes you can do that on the UCSC genome browser. You can set a region to download automatically from a multiple search. > > Regards > > Paulo > > kannaiah at bsd.uchicago.edu wrote: > > Hello, > > > > I have seen a few posts asking similar questions. I am looking to do something similar too. > > > > I want to extract the upstream sequence of genes (up to 3000 bp upstream) in human. Going through the Ensembl website is OK if one has a few genes, > > but I have a few hundred genes. I was wondering what would be the best way to automate this. > > Should I try BLASTing the gene sequences against the human chromosome files, then parse the BLAST output to get the positions of the genes, and go back and read the chromosome sequence where each was found to get the upstream sequence? > > > > That would be a long way; hopefully there is some other, shorter way to do this which I am not aware of. > > Any suggestions would be welcome :) > > > > Thank you > > > > -hak > > > > ------------------------------------------------- > > This email is intended only for the use of the individual or entity to which it is addressed and may contain information that is privileged and confidential. If the reader of this email message is not the intended recipient, you are hereby notified that any dissemination, distribution, or copying of this communication is prohibited. If you have received this email in error, please notify the sender and destroy/delete all copies of the transmittal. Thank you. > > -------------------------------------------------
From martin_jambon at emailuser.net Tue May 9 14:00:33 2006 From: martin_jambon at emailuser.net (Martin Jambon) Date: Tue, 9 May 2006 11:00:33 -0700 (PDT) Subject: [BiO BB] Percentage sequence identity calculation program In-Reply-To: <1369.169.226.137.206.1147184527.squirrel@webmail.albany.edu> References: <1369.169.226.137.206.1147184527.squirrel@webmail.albany.edu> Message-ID: Hi, A summary of the issues concerning percentage identity can be read (and extended) at wikiomics.org: http://wikiomics.org/wiki/Percentage_identity Martin On Tue, 9 May 2006, Lipika Ray wrote: > Hello, > > I have a set of sequence fragments of different lengths. I want to calculate the percentage sequence identity of those sequences. Which program is best suited to this type of calculation? I have seen that a lot of databases report the percentage sequence identity of the sequences used, but I have no idea which program they use to calculate it. Please help me if anyone has any clue. > Thanks in advance, > > Lipika -- Martin Jambon, PhD http://martin.jambon.free.fr Edit http://wikiomics.org, bioinformatics wiki
From penghanchuan at yahoo.com Tue May 9 23:24:56 2006 From: penghanchuan at yahoo.com (Hanchuan Peng) Date: Tue, 9 May 2006 20:24:56 -0700 (PDT) Subject: [BiO BB] Call for Talk Abstracts/Papers: 2006 International Workshop on Multiscale Biological Imaging, Data Mining and Informatics, Santa Barbara, Sept 7-8, 2006 Message-ID: <20060510032456.84815.qmail@web34611.mail.mud.yahoo.com> ** We apologize if you receive multiple copies of this announcement ** Call for Talk Abstracts/Papers: 2006 International Workshop on Multiscale Biological Imaging, Data Mining and Informatics, Santa Barbara, Sept 7-8, 2006 Important dates: * June 30 - Talk abstract submission deadline * July 15 - Notification of acceptance/rejection * July 20 - Poster abstract submission deadline * July 20 - Demonstration abstract submission deadline * Aug 20 - Early registration deadline * Sept 7-8 - Workshop dates Conference Web site: http://www.bioimageinformatics.org/2006 Talk abstracts and papers are solicited for the 2006 Workshop on Multiscale Biological Imaging, Data Mining and Informatics, Santa Barbara, Sept 7-8, 2006. This workshop aims to bring together interdisciplinary researchers to identify problems and present answers in multiscale bioimage data mining and informatics using cutting-edge imaging technology (including fluorescence imaging, electron microscopy imaging, etc.) and quantitative analysis methods (including image data analysis, computer vision, data mining, machine learning, as well as other informatics methods).
Abstracts for presentations, posters or software/hardware demonstrations related to all aspects of bioimage data mining and informatics are welcome. Appropriate topics include but are not limited to: * Acquisition of cellular, molecular and other bioimages * Novel bioimaging techniques; novel bioimage data * Bioimage feature measurement, description, extraction, and selection * Bioimage registration and comparison * Object segmentation and tracking in bioimages * Clustering/classification of bioimages or patterns derived from bioimages * Object/pattern recognition and understanding in bioimages * Bioimage ontology and related data mining * Bioimage data visualization * Other bioimaging related techniques, including transmission, compression, storage, database, etc. * Tools/software for bioimage data processing and data mining * Bioimage related biology, bioinformatics, and biomedicine applications, e.g. 3D protein structure reconstruction, gene regulatory network/pathway modeling, etc. * Joint analysis using both bioimages and other data (e.g. sequences, microarray, protein interaction, etc.) * Other bioinformatics problems where advanced imaging and image analysis methods can be applied. Those who intend to give a talk should submit an abstract of 1 to 2 pages no later than June 30, 2006. The abstracts will be reviewed and the authors will be notified of the results. A selected set of submissions will be invited for extension into formal papers for a special issue of the open-access journal BMC Cell Biology. *** Program Committee Members * Manfred Auer - Lawrence Berkeley National Laboratory (co-chair) * Hanchuan Peng - Howard Hughes Medical Institute (co-chair) * Ambuj Singh - University of California, Santa Barbara (co-chair) * Xuewen Chen - University of Kansas * Wah Chiu - National Center for Macromolecular Imaging, Baylor College of Medicine * Gaudenz Danuser - Scripps Institute * Mary Dickinson -
Baylor College of Medicine * Robert DuBose - Amgen, Inc. * Robert Dunkle - Scimagix * Christos Faloutsos - Carnegie Mellon University * Steven Fisher - University of California, Santa Barbara * Ilya Goldberg - National Institute on Aging, NIH * Amarnath Gupta - SDSC * Bernd Hamann - University of California, Davis * David Knowles - Lawrence Berkeley National Laboratory * Richard Levenson - Cambridge Research & Instrumentation, Inc. (CRI) * Chung-sheng Li - IBM T. J. Watson Research Center * Fuhui Long - Howard Hughes Medical Institute * B.S. Manjunath - University of California, Santa Barbara * May Wang - Georgia Institute of Technology * Stephen Wong - Harvard Medical School More information is available at http://www.bioimageinformatics.org/2006
From sankar.achuth at gmail.com Wed May 10 08:23:54 2006 From: sankar.achuth at gmail.com (Dr. Achuthsankar S. Nair) Date: Wed, 10 May 2006 17:53:54 +0530 Subject: [BiO BB] MPhil in Bioinformatics Message-ID: <2b168b460605100523v110027dcg64232d13a6c7e98c@mail.gmail.com> Hi, the University of Kerala in Thiruvananthapuram has announced admissions for its MPhil in Bioinformatics, a one-year advanced programme. Full details are available at www.cbi.keralauniversity.edu. Please bring this to the notice of anyone who may be interested. Regards -- Dr Achuthsankar S Nair Hon. Director Centre for Bioinformatics University of Kerala, Trivandrum 695581, INDIA Tel (O) 471-2412759 (R) 471-2542220 www.achu.keralauniversity.edu
From mmarchywka at eyewonder.com Wed May 10 08:56:44 2006 From: mmarchywka at eyewonder.com (Mike Marchywka) Date: Wed, 10 May 2006 08:56:44 -0400 Subject: [BiO BB] RE: need simple verification for ribosome Message-ID: <73CA026E5E77C74398C69F3338C5967C0750E1A6@atlexc01.atlanta.eyewonder.com> I went ahead and wrote my own ribosome script in Perl and wanted to know if there is some easy way to verify it. Could I translate some test sequences with a standard translation table and check that it finds known start/stop codons? I have human chromosomes 13 and 19; are there any handy links or suggestions for comparisons against known accurate results? The virtual ribosome link I posted earlier seems to work, and a quick look suggested similar outputs in a few test cases (although I don't understand what it did with the first codon in one case where it did not match). Thanks. ************************************************************************* Mike Marchywka EyeWonder Instant Streaming, Infinite Results 1447 Peachtree Street 9th Floor Atlanta, GA 30309 w.678-891-2033 c. h.770-565-8101 mmarchywka at eyewonder.com alt: marchywka at hotmail.com Instant Streaming, Intelligent results. *************************************************************************
From lichunjiang at sibs.ac.cn Thu May 11 00:36:11 2006 From: lichunjiang at sibs.ac.cn (lichunjiang) Date: Thu, 11 May 2006 12:36:11 +0800 (CST) Subject: [BiO BB] tif file transfering Message-ID: <3627.10.10.224.88.1147322171.squirrel@webmail.sibs.ac.cn> Hi, I can't figure out how to convert my blast2 result into a .tif file. Any suggestions? Thanks for your help! Lichun Jiang -- lichunjiang at sibs.ac.cn Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences
From aloraine at gmail.com Sat May 13 12:22:35 2006 From: aloraine at gmail.com (Ann Loraine) Date: Sat, 13 May 2006 11:22:35 -0500 Subject: [BiO BB] question regarding data management systems for microarray data Message-ID: <83722dde0605130922r53af379dgc5a2f0aa24347d0e@mail.gmail.com> Hi, Can anyone recommend a data management system for 'CEL' files from Affymetrix microarray scans? I'm looking for something like an inventory system that would allow us to track experimental groups, CEL file names, array type, and so on. I'd like to use the system to run commands like: "get me all the 'CEL' files for experiments with 5 biological replicates per group, U133A microarray, and stick them in directory /home/users/mary/data." I'm also looking for systems that are built with GEO in mind, e.g., ones that know about GEO's identifiers and codes and include parsers for 'soft' files. Any tips or pointers or even just random rumors on this topic would be much appreciated! -Ann -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org
From aloraine at gmail.com Sat May 13 12:40:32 2006 From: aloraine at gmail.com (Ann Loraine) Date: Sat, 13 May 2006 11:40:32 -0500 Subject: [BiO BB] ontology/controlled vocabulary for SNP "molecular phenotype", effects on coding Message-ID: <83722dde0605130940x62271e8fi90e76ea7e94626c2@mail.gmail.com> Hi, I have two questions for the list: First question: I'm looking for a classification system (ontology, controlled vocabulary) to describe the location and effects of various SNP alleles on protein and gene sequence. For example, this system would have controlled vocabulary terms to describe when a SNP is located in a translated region and changes (or doesn't change) the amino acid sequence of a protein. Ultimately, I'd like to be able to treat these controlled vocabulary terms as categorical variables in a QTL analysis, for example. Does such a thing exist?
Second question: I'm looking for a system that can classify SNP alleles given their genomic location and their "base" relative to a reference genomic sequence. For example, I would tell the system that my SNP allele is base 'A' at position 100,345 on chromosome 1, and then it would tell me how the allele affects known gene(s) in that region, ideally using the controlled vocabulary from Question 1. Any tips, pointers, rumors, advice would be much appreciated! -Ann -- Ann Loraine Assistant Professor Section on Statistical Genetics University of Alabama at Birmingham http://www.ssg.uab.edu http://www.transvar.org
From nikbakht at ibb.ut.ac.ir Sun May 14 10:54:34 2006 From: nikbakht at ibb.ut.ac.ir (Hamid Nikbakht) Date: Sun, 14 May 2006 19:24:34 +0430 Subject: [BiO BB] MARS Bioinformatics Institute Message-ID: MARS Bioinformatics Institute
From sankar.achuth at gmail.com Sun May 14 12:40:19 2006 From: sankar.achuth at gmail.com (Dr. Achuthsankar S. Nair) Date: Sun, 14 May 2006 22:10:19 +0530 Subject: [BiO BB] MPhil Bioinformatics in INDIA Message-ID: <2b168b460605140940u4e741d5dwd3b842b66f0642e4@mail.gmail.com> ONE YEAR MPHIL BIOINFORMATICS IN INDIA, 2007 ADMISSIONS OPEN Applications are invited for admission in Jan 2007 to the one-year M.Phil (Bioinformatics) programme of the University of Kerala, India. For the application form and brochure see www.cbi.keralauniversity.edu The fees for one year come to US$ 1000 only. -- Dr Achuthsankar S Nair Hon. Director Centre for Bioinformatics University of Kerala, Trivandrum 695581, INDIA Tel (O) 471-2412759 (R) 471-2542220 www.achu.keralauniversity.edu
From akunthavai at yahoo.co.in Mon May 15 04:12:34 2006 From: akunthavai at yahoo.co.in (A KUNTHAVAI) Date: Mon, 15 May 2006 09:12:34 +0100 (BST) Subject: [BiO BB] Doubt regarding paiwise alignment Message-ID: <20060515081234.34340.qmail@web8902.mail.in.yahoo.com> Hi, I am working on pairwise local alignment. While comparing two sequences, I first found all matches. To find the highest similarity, I took a matching point as the starting point, but I do not know where I should stop. That is, starting from the first match, do I have to search the string until the last letter of one of the sequences, or is there some other stopping condition? A. Kunthavai
From hershel.safer at weizmann.ac.il Mon May 15 07:22:07 2006 From: hershel.safer at weizmann.ac.il (Hershel Safer) Date: Mon, 15 May 2006 14:22:07 +0300 Subject: [BiO BB] European Conf. Computational Biology: Fellowships & Posters Message-ID: <4468645F.90808@weizmann.ac.il> 5th European Conference on Computational Biology - ECCB '06 Eilat, Israel September 10-13, 2006 www.eccb06.org The European Conference on Computational Biology (ECCB) is the leading European and a primary international conference in computational biology and bioinformatics. The fifth meeting in this series will be held on September 10-13, 2006, in the resort town of Eilat, Israel. ** Travel fellowships are available for students and postdocs. The deadline for applications is Monday, May 29, 2006. Please apply at: http://www.eccb06.org/new_pages/submission/sub_travel.html ** Posters are an important part of ECCB. The deadline for submitting posters is Thursday, June 8, 2006. Submit your poster at: http://www.eccb06.org/new_pages/submission/sub_posters.html ** Keynote speakers Prof.
Naama Barkai, Weizmann Institute of Science Prof. Sir Tom Blundell, FRS, University of Cambridge Prof. Richard Karp, University of California, Berkeley Prof. Jeffrey Skolnick, Georgia Institute of Technology Marc Vidal, Ph.D., Harvard Medical School Prof. Martin Vingron, Max Planck Institute for Molecular Genetics ** Help us spread the word! Please download the conference poster and hang it on your door or a nearby bulletin board, and display it at the beginning of seminars: http://www.eccb06.org/other/eccbposter.pdf ** For further information: See the conference website, http://www.eccb06.org, or send e-mail to the conference secretariat at eccb06 at diesenhaus.com.
From narcis at fiserlab.org Mon May 15 08:18:37 2006 From: narcis at fiserlab.org (Narcis Fernandez-Fuentes) Date: Mon, 15 May 2006 08:18:37 -0400 Subject: [BiO BB] European Conf. Computational Biology: Fellowships & Posters In-Reply-To: <4468645F.90808@weizmann.ac.il> References: <4468645F.90808@weizmann.ac.il> Message-ID: <4468719D.3000607@fiserlab.org> Hi Andras, I am thinking of applying for a fellowship to attend this conference. We could present a poster about the multi-template work for protein modeling. Narcis -- Narcis Fernandez-Fuentes, PhD Seaver Center for Bioinformatics Albert Einstein College of Medicine 1300 Morris Park Ave, Bronx, NY 10461, USA phone: (718)430-3233 fax: (718) 430-8565 mailto:narcis at fiserlab.org (http://www.fiserlab.org)
From boris.steipe at utoronto.ca Mon May 15 08:18:00 2006 From: boris.steipe at utoronto.ca (Boris Steipe) Date: Mon, 15 May 2006 08:18:00 -0400 Subject: [BiO BB] Doubt regarding paiwise alignment In-Reply-To: <20060515081234.34340.qmail@web8902.mail.in.yahoo.com> References: <20060515081234.34340.qmail@web8902.mail.in.yahoo.com> Message-ID: No, you start from the highest score; the alignment is then constructed from all pairs of characters that contributed to it. HTH, Boris On 15 May 2006, at 04:12, A KUNTHAVAI wrote: > Hai, > I am working in pairwise local alignment. While comparing two sequences, first I have found all matches. To find the highest similarity I took the matching point as a starting point.
I do not know where I should stop. That is, starting from the first match, do I have to search the string until the last letter of one of the sequences, or is there any other condition to stop? > > A.Kunthavai
From beenagpatel at hotmail.com Mon May 15 08:56:29 2006 From: beenagpatel at hotmail.com (beena patel) Date: Mon, 15 May 2006 12:56:29 +0000 Subject: [BiO BB] Screening Message-ID: Hi, I am looking for a method for screening Bifidobacterium longum from a mixed culture of L. acidophilus and S. thermophilus. I would like a conventional method for quantification of B. longum from the product. I tried MRS, MRS + cysteine, MRS pH 5, MRS + Na thiosulphate, ST agar, MRS + salicin and Reinforced Clostridial Agar (RCA). I had some success with RCA but I still get interference from S. thermophilus. I would appreciate your help with this question. Thank you, Beena Patel
From landman at scalableinformatics.com Mon May 15 14:47:41 2006 From: landman at scalableinformatics.com (Joe Landman) Date: Mon, 15 May 2006 14:47:41 -0400 Subject: [BiO BB] Scalable HMMer Message-ID: <4468CCCD.4070300@scalableinformatics.com> (sorry for spamming) Scalable Informatics has released Scalable HMMer, an optimized version of HMMer 2.3.2 that is 1.6-2.5x faster on benchmark tests run on Opteron systems. Please see http://www.scalableinformatics.com for details on where to download it and how to use the yum repository. Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman at scalableinformatics.com web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 8452 cell : +1 734 612 4615
From niryo_f at yahoo.com Tue May 16 04:28:37 2006 From: niryo_f at yahoo.com (Nir Yosef) Date: Tue, 16 May 2006 01:28:37 -0700 (PDT) Subject: [BiO BB] Cross reference Message-ID: <20060516082837.62374.qmail@web38113.mail.mud.yahoo.com> Dear All, I would like to use combined PPI data from the DIP and GenoBase databases.
I would like to know where I can find an appropriate cross-reference (e.g. between GenoBase and GI numbers or PIR). Thank you, Nir.
From contact at hospital-os.com Wed May 17 05:53:42 2006 From: contact at hospital-os.com (contact) Date: Wed, 17 May 2006 16:53:42 +0700 Subject: [BiO BB] PR opportunity to promote the Free hospital software to share from Hospital OS, Thailand In-Reply-To: <20060517083031.M9480@hospital-os.com> References: <20060517065412.M70812@hospital-os.com> <20060517072058.M61696@hospital-os.com> <20060517083031.M9480@hospital-os.com> Message-ID: <20060517095145.M89409@hospital-os.com> Dear Bioinformatics.org staff, As we are a non-profit organization now seeking international partners, I would like to ask for a PR opportunity to promote our open source Hospital Information System for small hospitals, Hospital OS. "Hospital OS" is the name of the software as well as of our project, which originated in Thailand. The ideal partners we wish to reach are international hospitals, healthcare-focused IT groups, and humanitarian organizations. More details about our project can be found at our website: www.hospital-os.com/en/ How may I share some resources with Opensource.org? As the marketing & public relations lead of the project, I have some e-files regarding the Hospital OS project, i.e. a brochure and a newsletter, available to send. Please let me know if I may send them to you. If you could forward this information to those who might be interested in our project, I would greatly appreciate it! Any advice is also very welcome.
Sincerely, Nalinee Chanyavanich Marketing & Public Relations Lead Hospital OS Internationalization Project Email: contact at hospital-os.com : nalinee at innovasystems.co.th Website: http://www.hospital-os.com/en/
From clement at cs.byu.edu Wed May 17 16:36:10 2006 From: clement at cs.byu.edu (Mark Clement) Date: Wed, 17 May 2006 14:36:10 -0600 Subject: [BiO BB] BIOT Symposium Message-ID: <25002DD5-D625-45B6-BB75-16CD9A34F42F@cs.byu.edu> ======================== Call for Papers Biotechnology and Bioinformatics Symposium (BIOT-2006) Provo, Utah October 20-21, 2006 http://www.biotconf.org/ ========================= Research and development in biotechnology requires the collaboration of scientists and engineers in fields such as biology, chemistry, computer science, chemical engineering, and electrical engineering. This symposium will bring together scientists, engineers and scholars from relevant fields with practitioners from industry in order to help each group to understand progress made in the area as a whole. The Biotechnology and Bioinformatics Symposium 2006 will be held in Provo, Utah on October 20-21, 2006. It will be hosted by Brigham Young University. Topics of interest include: Bio-molecular and Phylogenetic Databases; Molecular Evolution and Phylogenetic analysis; Drug Delivery Systems; Bio-Ontology and Data Mining; Sequence Search and Alignment; Microarray Analysis; System Biology; Pathway analysis; Identification and Classification of Genes; Protein Structure Prediction and Molecular Simulation; Functional Genomics; Proteomics; Tertiary structure prediction; Drug Docking; Gene Expression Analysis; Biomedical Imaging. Submissions An extended abstract or a paper must report significant research results, findings or advances within its own field.
However, since the symposium is geared toward a diverse audience of biologists, computer scientists, chemists, engineers, technology transfer individuals, graduate students, professors, industry individuals, etc., the papers or extended abstracts must be presented in a lucid manner accessible to such individuals. A PDF version of your paper can be submitted at http://www.biotconf.org/papersubmission/openconf.php Extended Abstract Submission You should submit a two-page, single-spaced extended abstract by the submission date given below. Each extended abstract must be in 10 point type, in 2-column format. The extended abstract must show the names of the authors, their mailing and electronic addresses, and up to 3 keywords. An extended abstract must contain a one-paragraph summary of the work followed by additional sections. Please note that the sections in an extended abstract must contain enough information for reviewers to judge the quality of the work being reported. Thus, an extended abstract is like a mini paper. Each extended abstract will be reviewed. Authors of accepted extended abstracts will be given several weeks after acceptance notification for revisions based on reviewers' comments. You must submit an updated camera-ready extended abstract following a given format by the date specified below. These extended abstracts will be printed in the Symposium proceedings. Full Paper Submission You should submit a full paper of at most 10 pages (following the tradition used in the field of Computer Science) by the submission date given below. Each paper must be in 10 point type, 2-column format. The paper must show the names of the authors, their mailing and electronic addresses, and up to 3 keywords on the top page. Each paper must contain a one-paragraph abstract or summary followed by other sections. Each paper will be reviewed. Authors of accepted papers will be given several weeks after acceptance notification for revision based on reviewers' comments.
You must submit an updated camera-ready paper by the date specified below. The papers will be printed in the Symposium proceedings. In keeping with the tradition of the bioinformatics and computational biology areas, authors in these sub-fields are strongly encouraged to submit full papers. Journal Publication The best papers from BIOT-2006 will be published in the International Journal of Bioinformatics Research and Applications (IJBRA). Important Dates Submission Deadline: June 2, 2006 (Two pages of Extended Abstracts or 6 pages of Full Papers). Acceptance Decision: July 17, 2006 Revised Camera Ready Extended Abstracts and Full papers due after revision: August 11, 2006 Symposium Date: October 20 and 21, 2006 ---------------- Dr. Mark Clement Department of Computer Science Brigham Young University 3370 TMCB Provo, Utah 84602 (801) 422-7608 clement at cs.byu.edu
From Peter.Andrews at Dartmouth.EDU Wed May 17 17:35:02 2006 From: Peter.Andrews at Dartmouth.EDU (Peter Andrews) Date: Wed, 17 May 2006 17:35:02 -0400 Subject: [BiO BB] Know how to use sort in NCBI Entrez web services ESearch? Message-ID: I am using the Entrez web service from Java. I am having difficulty finding documentation or examples on how to control the sort order of results for esearch queries. Any help will be appreciated. Thank you, Peter ------------------------------------------- Peter Andrews Software Engineer Dartmouth Medical School Computational Genetics Rubin 708 (603) 653-6017 Peter.Andrews at dartmouth.edu
From mmarchywka at eyewonder.com Wed May 17 18:09:40 2006 From: mmarchywka at eyewonder.com (Mike Marchywka) Date: Wed, 17 May 2006 18:09:40 -0400 Subject: [BiO BB] Know how to use sort in NCBI Entrez web services ESearch?
Message-ID: <73CA026E5E77C74398C69F3338C5967C07553E2D@atlexc01.atlanta.eyewonder.com> I think I have accused them of not supporting just about everything but they keep answering my questions. The eutil help desk is very responsive- I would suggest trying them and get back to us with the answer or a link to whatever you are talking about. I used to do all this stuff with java until a linux guy got me hooked on scripts. So far, I haven't found any text processing that isn't easier using the simple search/fetch in a perl or bash script. Do you have a link describing what you are using and what it is supposed to do? One complaint I have with perl under cygwin is that I quickly end up with hash tables thrashing through VM. I could probably fix this with c++ or java but generally it is just an annoyance rather than a barrier. ************************************************************************* Mike Marchywka EyeWonder Instant Streaming, Infinite Results 1447 Peachtree Street 9th Floor Atlanta, GA 30309 w.678-891-2033 c. h.770-565-8101 mmarchywka at eyewonder.com alt: marchywka at hotmail.com Instant Streaming, Intelligent results. ************************************************************************* -----Original Message----- From: bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics.org [mailto:bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics.org]On Behalf Of Peter Andrews Sent: WednesdayMay-17-2006 05:35 PM To: bio_bulletin_board at bioinformatics.org Subject: [BiO BB] Know how to use sort in NCBI Entrez web services ESearch? I am using the Entrez web service from java. I am having difficulty finding documentation or examples on how to control the sort order of results for esearch queries. Any help will be appreciated. 
Thank you, Peter

-------------------------------------------
Peter Andrews
Software Engineer
Dartmouth Medical School
Computational Genetics
Rubin 708
(603) 653-6017
Peter.Andrews at dartmouth.edu

From ulimard at yahoo.com.br Wed May 17 23:38:55 2006
From: ulimard at yahoo.com.br (Ulisses Dias)
Date: Thu, 18 May 2006 00:38:55 -0300 (ART)
Subject: [BiO BB] Protein Features Extractors
Message-ID: <20060518033855.59744.qmail@web50506.mail.yahoo.com>

Hi all,

I'd like to find papers or algorithms that extract protein features like fold, 3d-motifs, functional-linkages, domain, protein clefts or surface cavities and other relevant aspects from protein structures.

I'd be glad if someone could help me.

Sincerely,
Ulisses Dias

From jayananth at gmail.com Wed May 17 23:48:05 2006
From: jayananth at gmail.com (jay ananth)
Date: Thu, 18 May 2006 09:18:05 +0530
Subject: [BiO BB] Re: Protein Features Extractors
In-Reply-To: <20060518033855.59744.qmail@web50506.mail.yahoo.com>
References: <20060518033855.59744.qmail@web50506.mail.yahoo.com>
Message-ID: <9c9e3ba10605172048k856abbejaebc97160f7bd373@mail.gmail.com>

hi,
i have a doubt... how to find orthologous sequences and how to extract them... its important for my project ....

On 5/18/06, Ulisses Dias wrote:
> Hi all,
>
> I'd like to find papers or algorithms that extract protein features like
> fold, 3d-motifs, functional-linkages, domain, protein clefts or surface
> cavities and others relevant aspects from protein structures.
>
> I'd be glad if someone could help me.
>
> Atenciosamente
> Ulisses Dias
>
> ---------------------------------
> Novidade no Yahoo! Mail: receba alertas de novas mensagens no seu celular.
> Registre seu aparelho agora!
>

From lichunjiang at sibs.ac.cn Thu May 18 01:15:41 2006
From: lichunjiang at sibs.ac.cn (lichunjiang)
Date: Thu, 18 May 2006 13:15:41 +0800 (CST)
Subject: [BiO BB] Re: Protein Features Extractors
In-Reply-To: <9c9e3ba10605172048k856abbejaebc97160f7bd373@mail.gmail.com>
References: <20060518033855.59744.qmail@web50506.mail.yahoo.com> <9c9e3ba10605172048k856abbejaebc97160f7bd373@mail.gmail.com>
Message-ID: <1644.10.10.224.88.1147929341.squirrel@webmail.sibs.ac.cn>

hi,
try this website: http://www.treefam.org/
TreeFam (Tree families database) is a database of phylogenetic trees of animal genes.

HTH
lichun Jiang

jay ananth wrote:
> hi,
> i have a doubt... how to find orthologous sequences and how to
> extract them... its important for my project ....
>
> On 5/18/06, Ulisses Dias wrote:
>> Hi all,
>>
>> I'd like to find papers or algorithms that extract protein features
>> like
>> fold, 3d-motifs, functional-linkages, domain, protein clefts or surface
>> cavities and others relevant aspects from protein structures.
>>
>> I'd be glad if someone could help me.
>>
>> Atenciosamente
>> Ulisses Dias
>>
>> ---------------------------------
>> Novidade no Yahoo! Mail: receba alertas de novas mensagens no seu
>> celular.
>> Registre seu aparelho agora!
>>
> _______________________________________________
> Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>
>

--
lichunjiang at sibs.ac.cn
Institute of Biochemistry and Cell Biology,
Shanghai Institutes for Biological Sciences,
Chinese Academy of Sciences
-------------- next part -------------- An HTML attachment was scrubbed...
URL: From mmarchywka at eyewonder.com Thu May 18 08:08:01 2006 From: mmarchywka at eyewonder.com (Mike Marchywka) Date: Thu, 18 May 2006 08:08:01 -0400 Subject: [BiO BB] Re: Protein Features Extractors Message-ID: <73CA026E5E77C74398C69F3338C5967C07553E2F@atlexc01.atlanta.eyewonder.com> Have you looked at the ncbi tools? There is a lot buried in conserved domains. I just happen to be looking at Ig domains: http://www.ncbi.nlm.nih.gov/Structure/cdd/cddsrv.cgi?uid=smart00409 -----Original Message----- From: bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics.org [mailto:bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformati cs.org]On Behalf Of jay ananth Sent: WednesdayMay-17-2006 11:48 PM To: The general forum at Bioinformatics.Org Subject: [BiO BB] Re: Protein Features Extractors hi, i have a doubt... how to find orthologous sequences and how to extract them... its important for my project .... On 5/18/06, Ulisses Dias wrote: > Hi all, > > I'd like to find papers or algorithms that extract protein features like > fold, 3d-motifs, functional-linkages, domain, protein clefts or surface > cavities and others relevant aspects from protein structures. > > I'd be glad if someone could help me. > > Atenciosamente > Ulisses Dias > > --------------------------------- > Novidade no Yahoo! Mail: receba alertas de novas mensagens no seu celular. > Registre seu aparelho agora! 
> _______________________________________________ Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org https://bioinformatics.org/mailman/listinfo/bio_bulletin_board From mmarchywka at eyewonder.com Thu May 18 09:04:51 2006 From: mmarchywka at eyewonder.com (Mike Marchywka) Date: Thu, 18 May 2006 09:04:51 -0400 Subject: [BiO BB] Re: Protein Features Extractors Message-ID: <73CA026E5E77C74398C69F3338C5967C0750E1C9@atlexc01.atlanta.eyewonder.com> FYI, I ran into some related things on google: http://www.ncbi.nih.gov/Structure/lexington/lexington.cgi?cmd=rps linked from http://vivo.cornell.edu/entity?home=1&id=839 and it links to a paper: "The Conserved Domain Architecture Retrieval Tool (CDART) performs similarity searches of the NCBI Entrez Protein Database based on domain architecture, defined as the sequential order of conserved domains in proteins. The algorithm finds protein similarities across significant evolutionary distances using sensitive protein domain profiles rather than by direct sequence similarity. Proteins similar to a query protein are grouped and scored by architecture. Relying on domain profiles allows CDART to be fast, and, because it relies on annotated functional domains, informative. Domain profiles are derived from several collections of domain definitions that include functional annotation. Searches can be further refined by taxonomy and by selecting domains of interest. CDART is available at http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi. " http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12368255&dopt=Abstract ************************************************************************* Mike Marchywka EyeWonder Instant Streaming, Infinite Results 1447 Peachtree Street 9th Floor Atlanta, GA 30309 w.678-891-2033 c. h.770-565-8101 mmarchywka at eyewonder.com alt: marchywka at hotmail.com Instant Streaming, Intelligent results. 
*************************************************************************

From jaudall at iastate.edu Thu May 18 09:32:29 2006
From: jaudall at iastate.edu (Joshua A Udall)
Date: Thu, 18 May 2006 08:32:29 -0500
Subject: [BiO BB] compile tgicl on OSX
Message-ID: <6.2.3.4.2.20060518082259.02c809a0@jaudall.mail.iastate.edu>

I'd like to use the TIGR cluster tools (tgicl) on OSX. The available binaries didn't work, and I've been trying to compile and install the individual packages separately. I am having problems getting zmsort and mgblast to compile. If others have found a better way to get it running on OSX, I'd like to hear it.

Anyway, I've included my compile errors below and could use some help or tips about what is not configured correctly.

Thanks,
Josh

Specifically for mgblast:

makefile:81: warning: overriding commands for target `clean'
makefile:70: warning: ignoring old commands for target `clean'
rm -f mgblast megablast *.[ao]
gcc -o megablast -O2 -fast -mdynamic-no-pic -I/Users/jaudall/bin/tgicl/ncbi/include -lncbitool -lncbiobj -lncbi -lm -O -I/Users/jaudall/bin/tgicl/ncbi/include -L/Users/jaudall/bin/tgicl/ncbi/lib -L/Users/jaudall/bin/tgicl/ncbi/build -L/Users/jaudall/bin/tgicl/ncbi/corelib
/usr/bin/ld: Undefined symbols:
_Nlm_Main
_CFBundleCopyBundleURL
_CFBundleCopyExecutableURL
_CFBundleGetMainBundle
_CFURLGetFileSystemRepresentation
_FSPathMakeRef
_DisposeHandle
_HLock
_HUnlock
_MemError
_NewHandle
_SetHandleSize
collect2: ld returned 1 exit status
make: *** [megablast] Error 1

Specifically, for zmsort:

g++ -O2 -Wall -iquote . -iquote ../tgi_cl -fno-exceptions -fno-rtti -D_REENTRANT -fast -mdynamic-no-pic -c zmsort.cpp -o zmsort.o
zmsort.cpp:32: error: missing terminating " character
zmsort.cpp:136: error: missing terminating " character
zmsort.cpp:137: error: missing terminating " character
zmsort.cpp:152: error: missing terminating " character
zmsort.cpp:156: error: missing terminating " character
zmsort.cpp: In function `int main(int, char* const*)':
zmsort.cpp:136: error: expected primary-expression before ',' token
zmsort.cpp:137: error: expected primary-expression before ')' token
zmsort.cpp:152: error: expected primary-expression before ')' token
zmsort.cpp:156: error: expected primary-expression before ',' token
make: *** [zmsort.o] Error 1

From Daniele.Santoni at caspur.it Thu May 18 07:25:35 2006
From: Daniele.Santoni at caspur.it (Daniele Santoni)
Date: Thu, 18 May 2006 13:25:35 +0200
Subject: [BiO BB] collaboration
Message-ID: <20060518112535.B64AE53E88@smtp.caspur.it>

Hello,

My name is Daniele Santoni. I'm a mathematician, specialized in theoretical informatics, with a master's in bioinformatics. Currently I am a research fellow at IUSM in Rome, Italy, working on a grant in bioinformatics. I have interests in many areas of bioinformatics; my recent research deals with bacterial typing and whole genome analysis.

I would like to spend a period (about three months, depending on the project) abroad (Latin America, Asia, Europe or anywhere else) to gain experience and improve my skills within a leading group in bioinformatics. The collaboration may be formalized by any kind of agreement between IUSM and the host institution.

I will be glad to be contacted (santoni at caspur.it) by any party that could be interested in this proposal.

Best regards
Daniele

----------------------------------------------------------
Daniele Santoni
mail: daniele.santoni at caspur.it
CASPUR V. dei Tizii 6/b 00185 Rome Italy
IUSM P.zza L.
de Bosis 6 00194 Rome Italy ----------------------------------------------------------- From Peter.Andrews at Dartmouth.EDU Fri May 19 09:05:47 2006 From: Peter.Andrews at Dartmouth.EDU (Peter Andrews) Date: Fri, 19 May 2006 09:05:47 -0400 Subject: [BiO BB] Know how to use sort in NCBI Entrez web services ESearch? In-Reply-To: <73CA026E5E77C74398C69F3338C5967C07553E2D@atlexc01.atlanta.eyewonder.com> Message-ID: I received a prompt reply from the Entrez help desk. Unfortunately the answer was not what i hoped for: >Dear NCBI user: > >Sort parameter is not available for Entrez Gene eutilities calls. >Currently it works for Pubmed database only. > >Best regards, > >A. Gabrielian >NCBI Help desk My project is to create a backend database for use in a microarray visual analysis tool described here: http://helix-web.stanford.edu/psb05/abstracts/p296.html I need a backend db because of performance considerations -- Entrez's web service and 3 second inter-call limits would be too slow for my needs. ------------------------------------------- Peter Andrews Software Engineer Dartmouth Medical School Computational Genetics Rubin 708 (603) 653-6017 Peter.Andrews at dartmouth.edu > -----Original Message----- > From: > bio_bulletin_board-bounces+peter.andrews=dartmouth.edu at bioinformatics.or > g > [mailto:bio_bulletin_board-bounces+peter.andrews=dartmouth.edu at bioinform > atics.org]On Behalf Of Mike Marchywka > Sent: Wednesday, May 17, 2006 6:10 PM > To: The general forum at Bioinformatics.Org > Subject: RE: [BiO BB] Know how to use sort in NCBI Entrez web services > ESearch? > > > I think I have accused them of not supporting just about > everything but they keep > answering my questions. The eutil help desk is very responsive- I > would suggest trying > them and get back to us with the answer or a link to whatever you > are talking about. > > I used to do all this stuff with java until a linux guy got me > hooked on scripts. 
> So far, I haven't found any text processing that isn't easier > using the simple > search/fetch in a perl or bash script. Do you have a link > describing what you are using > and what it is supposed to do? > > One complaint I have with perl under cygwin is that I quickly end up with > hash tables thrashing through VM. I could probably fix this with > c++ or java but generally > it is just an annoyance rather than a barrier. > > > > ************************************************************************* > Mike Marchywka > EyeWonder > Instant Streaming, Infinite Results > > 1447 Peachtree Street > 9th Floor > Atlanta, GA 30309 > > w.678-891-2033 > c. > h.770-565-8101 > mmarchywka at eyewonder.com > alt: marchywka at hotmail.com > Instant Streaming, Intelligent results. > ************************************************************************* > > > > -----Original Message----- > From: > bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics > .org [mailto:bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics.o rg]On Behalf Of Peter Andrews Sent: WednesdayMay-17-2006 05:35 PM To: bio_bulletin_board at bioinformatics.org Subject: [BiO BB] Know how to use sort in NCBI Entrez web services ESearch? I am using the Entrez web service from java. I am having difficulty finding documentation or examples on how to control the sort order of results for esearch queries. Any help will be appreciated. Thank you, Peter ------------------------------------------- Peter Andrews Software Engineer Dartmouth Medical School Computational Genetics Rubin 708 (603) 653-6017 Peter.Andrews at dartmouth.edu _______________________________________________ Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org https://bioinformatics.org/mailman/listinfo/bio_bulletin_board From jeff at bioinformatics.org Mon May 22 21:15:11 2006 From: jeff at bioinformatics.org (J.W. 
Bizzaro)
Date: Mon, 22 May 2006 21:15:11 -0400
Subject: [BiO BB] CfP: Scientific Development Poster Session at WWDC 2006
Message-ID: <4472621F.9050302@bioinformatics.org>

The following announcement comes from Robert Kehrer at Apple Computer:

------------

CfP: Scientific Development Poster Session at WWDC 2006

There's a HOT new addition to the WWDC this year (Aug 7-11 in SF) that the Bioinformatics.Org community might be interested in.

* Apple heard loud and clear last year that the ability to present at a poster session would help many academic researchers get funding to attend the conference.
* Apple also clearly heard that scientific attendees want to be able to showcase their work and collaborate with researchers in similar disciplines.

This year's WWDC will feature a Scientific Development Poster Session as part of the many Science Community activities. We are inviting the entire science community to submit an abstract of their work for approval and, if accepted, prepare a poster to present at the conference.

Posters presented at the WWDC 2006 Scientific Development Poster Session will focus on software or hardware development techniques using Apple development tools and/or Mac OS X technologies to address key issues in the sciences. Possible topics include, but are not limited to:

* Use of Mac OS X "Tiger" technologies
* Scientific or medical visualization
* Multithreading and parallelization
* Scientific computation and simulation
* Signal/data processing
* Database and data streaming using Mac OS X
* High-performance computing and clustering
* Scripting and automation

http://developer.apple.com/wwdc/science_poster.html

The deadline for submission is Friday July 7th 11:59pm PDT.

Questions about the WWDC 2006 Scientific Development Poster Session should be sent to: wwdcposterquery at apple.com

http://developer.apple.com/wwdc/science.html

From icdm06 at biomap.org Wed May 24 15:26:50 2006
From: icdm06 at biomap.org (ICDM DMB 2006)
Date: Thu, 25 May 2006 05:26:50 +1000
Subject: [BiO BB] 1st CFP: ICDM 2006 Data Mining in Bioinformatics Workshop
Message-ID: <4474B37A.5050609@biomap.org>

IEEE ICDM 2006 Workshop on Data Mining in Bioinformatics (DMB 2006)
18 December 2006, Hong Kong
http://icdm06.biomap.org/

Data Mining deals with the use of data analysis techniques and methodologies in the design, development and assessment of data and information systems for biomedical computing. The goal of this workshop is to share research solutions using data mining approaches to problems of today's biomedical systems and to identify new issues and directions for future research in biomedical data mining. This workshop will broadly address the following areas:

* Conceptual Models for Biological and Medical Data.
* Biomedical Data Integration, Analysis and Interoperability.
* Biomedical Query Processing, Query Optimization, and Information Retrieval.
* Ontology-driven Biomedical Systems.
* Biomedical Data Privacy and Security.
* Data mining applications in bioinformatics, biomedicine, health care and other biomedical domain areas.

Techniques and methodologies proposed in this workshop will help address two major challenges in incorporating this vast biological knowledge into the data mining cycle: (i) designing efficient data mining frameworks; and (ii) adapting existing data mining algorithms to understand constantly varying and changing biomedical data. In this workshop we hope to present to the audience the state-of-the-art frameworks for bringing background biomedical knowledge into the pattern recognition task for biomedical data.

Authors are invited to submit original papers to the workshop exploring data mining theories, techniques, and applications for Bioinformatics. Papers are invited on (but not limited to) the following themes:

* Conceptual Models for Biological and Medical Data
* Microarray Data Analysis
* Protein/RNA Structure Prediction
* Genomics and Proteomics
* Drug Design
* Biomedical Literature Mining
* Modeling of Biochemical Pathways
* Comparative Genomics
* Biological data Visualization
* Phylogenetics
* Biomedical Ontologies
* Biomedical Data Engineering using Ontologies
* System Biology and Pathways
* Biological Database Management
* Interoperation of Biomedical Databases
* Biomedical Query Processing, Query Optimization, and Information Retrieval
* Biomedical Data Privacy and Security
* Data mining applications in bioinformatics, biomedicine, health care and other biomedical domain areas

Important Dates

July 30, 2006        Paper Submission Deadline
September 8, 2006    Notification of acceptance
September 29, 2006   Final camera-ready paper due
December 18, 2006    Workshop Day

Paper Submission

Only electronic submission of original technical contributions will be accepted. All submissions should be done electronically via the IEEE ICDM 2006 web submission system at http://www.comp.hkbu.edu.hk/~wii06/icdm/?index=submission (available soon). Authors will be notified of acceptance after a review process by two independent experts.
For further questions, please contact technical program chair: icdm06 at biomap.org Workshop General Chairs * Tharam S. Dillon (University of Technology Sydney, Australia) * T. Y. Lin (San Jose State University, USA) * Elizabeth Chang (Curtin University of Technology, Australia) Workshop Program Chairs * Amandeep S. Sidhu (University of Technology Sydney, Australia) * Xiaohua (Tony) Hu (Drexel University, USA) * Jason Wang (New Jersey Institute of Technology, USA) Program Committee Members * Daniel Rubin (National Center of Biomedical Ontology, USA) * Michael Ng (Hong Kong Baptist University, Hong Kong) * Zhoujun Li (National University of Defense Technology, China) * Sun Kim (Indiana University, USA) * Shuigeng Zhou (Fudan University, China) * Illhoi Yoo (Drexel University, USA) * Xiaohua Zhou (Drexel University, USA) * Robert Meersman (Vrije Universiteit Brussel, Belgium) * Mustafa Jarrar (Vrije Universiteit Brussel, Belgium) * Zoren Obradovic (Temple University, USA) * Ernesto Damiani (Computer Science Department, Milan University, Italy) * Ling Feng (University of Twente, Netherlands) * Jake Chen (Indiana University, USA) * Yanqing Zhang (Georgia State University, USA) * Ying Liu (University of Texas, USA) * Suzanna Lewis (Berkeley Drosophila Genome Project, USA) * Jimmy Huang (York University, Canada) * Mihaela Ulieru (University of New Brunswick, Canada) * Farookh K. 
Hussain (Curtin University of Technology, Australia)
* Hans-Dieter Ehrich (Technical University of Braunschweig, Germany)
* Fabio Porto (Database Laboratory, EPFL, Switzerland)
* Paul Kennedy (University of Technology Sydney, Australia)
* Manish Bhide (IBM India Research Lab, India)
* Henry Tan (Microsoft, USA)
* Tony Jan (University of Technology Sydney, Australia)
* David Taniar (Monash University, Australia)
* Wenny Rahayu (La Trobe University, Australia)
* Pornpit Wongthongtham (Curtin University of Technology, Australia)
* Maja Hadzic (Curtin University of Technology, Australia)

--
---------------------------------
Amandeep S. Sidhu
Program Chair
ICDM 2006 Workshop on Data Mining in Bioinformatics
http://icdm06.biomap.org/

From andreas.bender at complife.org Fri May 26 11:35:29 2006
From: andreas.bender at complife.org (Andreas Bender (CompLife'06))
Date: Fri, 26 May 2006 11:35:29 -0400
Subject: [BiO BB] "Free Life Science Software" Session at CompLife, Cambridge/UK
Message-ID:

Call for Contributions

=====================================================
LIFE SCIENCE FREE SOFTWARE SESSION
held at CompLife 2006 (http://www.complife.org)
in Cambridge, United Kingdom, on September 27 - 29, 2006
=====================================================

In recent years more and more free and open source software has been developed for chemo- and bioinformatics, molecular modelling or other Life Science applications, but many of the programs are not well known. During the CompLife 2006 conference we will organize a special session dedicated to this type of free software. The demo session will be preceded by a short session with room for brief introductory presentations, whereas the demo session itself will allow attendees to see the tools in action. Authors of free software will have the opportunity to present their program to the CompLife audience, which will consist of researchers and users from computer science, biology, chemistry and everything in between.

In case you are interested in the free software session, send us an email at fss at complife.org and briefly describe your program and how you intend to present it at the conference (1-2 pages max - please include URL to downloadable version where available). The only restrictions are that the program must be freely available for everyone or even open source and that it must be related to Life Science applications. The deadline for these proposals is June, 16th 2006. In mid July we will notify you if your software demo was accepted. -- Computational Life Sciences '06 Cambridge/UK, 27-29 September 2006: Visit http://www.complife.org for more information! Andreas Kieron Patrick Bender - http://www.andreasbender.de ICQ#: 166 835 816 - Yahoo Messenger: andreasbender Novartis Institutes for BioMedical Research, Cambridge/MA From idoerg at burnham.org Tue May 30 22:25:59 2006 From: idoerg at burnham.org (Iddo Friedberg) Date: Tue, 30 May 2006 19:25:59 -0700 Subject: [BiO BB] Automated Function Prediction 2006: registration and posters Message-ID: <447CFEB7.3020301@burnham.org> (Please post and pass on as appropriate, thanks) The Second Automated Function Prediction Meeting will be held August 30 -- September 1 2006, at the University of California San Diego AFP 2006: registration now open! Early registration deadline: July 15, 2006 http://biofunctionprediction.org/AFP/register/ Poster submission deadline extended to June 15, 2006. http://biofunctionprediction.org/AFP/afp06/posterabstracts/ General information: http://BioFunctionPrediction.org/AFP/afp06 Posters are sought in, but not limited to, the following topics: * Function prediction using sequence based methods. This would include "classic" methods such as detection of functional motifs and inferring function from sequence similarity. * Function from genomic information: prediction by genomic location; locus comparison with other organisms; function gain and loss. 
* Function prediction in metagenomics * Phylogeny based methods * Function from molecular interactions * Function from structure * Function prediction using combined methods * "Meta-talks" discussing the limitations and horizons of computational function prediction. * Assessing function prediction programs Sequence and structure genomics have generated a wealth of data, but extracting meaningful information from genomic information is becoming an increasingly difficult challenge. Both the number and the diversity of discovered genes is increasing. This increase means that established annotation methods, such as homology transfer, are annotating less data. In addition, there is a need for annotation which is standardized so that it could be incorporated into function annotation on a large scale. Finally, there is a need to assess the quality of the function prediction software which is out there. We probably know the sequence of the target for next generation antibiotics or cancer treatment. We just do not realize that because the target is currently annotated as a "domain of unknown function". For these reasons and many more, automated protein function prediction is rapidly gaining interest among computational biologists in academia and industry. The second AFP meeting will be a three day event, August 30-September 1st , 2006 at the campus of University of California, San Diego,California, USA. AFP 2006 will feature: * Plenary talks delivered by leading researchers in the field * Submitted talks * Conference proceedings published as research papers in BMC Bioinformatics * A special discussion panel on gene and protein annotation * A poster session Plenary speakers: * Philip E. Bourne, University of California, San Diego, USA * Steven E. 
Brenner, University of California, Berkeley, USA * Terry Gaasterland, Scripps Institute of Oceanography, La Jolla, USA * Adam Godzik, Burnham Institute for Medical Research and University of California, San Diego USA * Christos Ouzounis European Bioinformatics Institute, Cambridge, UK * Anna Tramontano, University of Rome, "La Sapienza", Rome, Italy * Shoshana Wodak, Hospital for Sick Children, and Departments of Biochemistry and Medical Genetics, University of Toronto, Canada. Talk submission is now closed. The talk program will soon be on the conference web site. For more information please see the meeting site: http://BioFunctionPrediction.org/AFP/afp06 Sincerely, Iddo Friedberg, in the name of the AFP 2006 organizing committee -- Iddo Friedberg, Ph.D. Burnham Institute for Medical Research 10901 N. Torrey Pines Rd. La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 713 9949 http://iddo-friedberg.org http://BioFunctionPrediction.org From iain.m.wallace at gmail.com Wed May 31 06:33:38 2006 From: iain.m.wallace at gmail.com (Iain Wallace) Date: Wed, 31 May 2006 11:33:38 +0100 Subject: [Bio BB] Convert PFAM to FASTA Message-ID: <8cff3eb80605310333s4b03012m98bd76029e9a8bb8@mail.gmail.com> Hi all, Does anyone know of a programme that can take a pfam alignment and convert it into fasta? Thanks Iain -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From MEC at Stowers-Institute.org Wed May 31 12:31:59 2006
From: MEC at Stowers-Institute.org (Cook, Malcolm)
Date: Wed, 31 May 2006 11:31:59 -0500
Subject: [Bio BB] Convert PFAM to FASTA
Message-ID:

If you happen to have perl and bioperl installed, you can convert myfile.pfam to myfile.pfam.fa with the following unix one-liner:

cat myfile.pfam | perl -MBio::AlignIO -e 'select Bio::AlignIO->newFh(-format => "fasta"); $in = Bio::AlignIO->newFh(-format => "pfam", -fh => \*STDIN); print while <$in>' > myfile.pfam.fa

________________________________
From: bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org [mailto:bio_bulletin_board-bounces+mec=stowers-institute.org at bioinformatics.org] On Behalf Of Iain Wallace
Sent: Wednesday, May 31, 2006 5:34 AM
To: bio_bulletin_board at bioinformatics.org
Subject: [Bio BB] Convert PFAM to FASTA

Hi all,

Does anyone know of a programme that can take a pfam alignment and convert it into fasta?

Thanks
Iain

From ahmed at pobox.com Wed May 31 13:04:28 2006
From: ahmed at pobox.com (Ahmed Moustafa)
Date: Wed, 31 May 2006 12:04:28 -0500
Subject: [BiO BB] SNP detection from ESTs
Message-ID: <447DCC9C.1060909@pobox.com>

Hi All!

Which tool would you recommend for EST-based SNP detection? I have been checking PolyBayes, PolyPhred, novoSNP, PolyFreq, ... but in general I have not been able to install and set up any of them.

In our case, we need to detect SNPs from ESTs; some have reference genomes and some do not.

Your help will be appreciated so much!

Ahmed

From biopctgi at yahoo.es Wed May 31 13:24:03 2006
From: biopctgi at yahoo.es (Txema Gonzalez Izarzugaza)
Date: Wed, 31 May 2006 19:24:03 +0200
Subject: [BiO BB] GOslim
Message-ID: <447DD133.8030809@yahoo.es>

Hello everyone!

Does anybody know how I should generate a GOslim ontology using the map2slim script?
I have tried this:

"map2slim GOslim.obo gene_ontology.obo gene_association.fb"

where:

GOslim.obo is the GO slim I have already downloaded from the GO site.
gene_ontology.obo is the full GO ontology.
gene_association.fb is the association file, also downloaded from the GO site.

It seems to run OK, but it generates loads of errors like this one:

illegal header entry: %chondroitin sulfate metabolism ; GO:0030204 ; synonym:chondroitin sulphate metabolism < chondroitin sulfate proteoglycan metabolism ; GO:0050654 goslim_generic.obo GO::Parsers::obo_text_parser

Does anybody know what is happening?

Thanks in advance

Regards
Txema

From clement at cs.byu.edu Wed May 31 16:43:12 2006
From: clement at cs.byu.edu (Mark Clement)
Date: Wed, 31 May 2006 14:43:12 -0600
Subject: [BiO BB] Biotechnology and Bioinformatics Symposium
Message-ID: <668805C8-7FE5-456F-A944-711E94540BB3@cs.byu.edu>

========================
Call for Papers
Biotechnology and Bioinformatics Symposium (BIOT-2006)
Provo, Utah
October 20-21, 2006
http://www.biotconf.org/
=========================

Research and development in biotechnology requires the collaboration of scientists and engineers in fields such as biology, chemistry, computer science, chemical engineering, and electrical engineering. This symposium will bring together scientists, engineers and scholars from relevant fields with practitioners from industry in order to help each group to understand progress made in the area as a whole.

The Biotechnology and Bioinformatics Symposium 2006 will be held in Provo, Utah on October 20-21, 2006. It will be hosted by Brigham Young University.

Topics of interest include: Bio-molecular and Phylogenetic Databases Molecular Evolution and Phylogenetic analysis Drug Delivery Systems Bio-Ontology and Data Mining Sequence Search and Alignment Microarray Analysis System Biology Pathway analysis Identification and Classification of Genes Protein Structure Prediction and Molecular Simulation Functional Genomics Proteomics Tertiary structure prediction Drug Docking Gene Expression Analysis Biomedical Imaging Submissions An extended abstract or a paper must report significant research results, findings or advances within its own field. However, since the symposium is geared toward a diverse audience of biologists, computer scientists, chemists, engineers, technology transfer individuals, graduate students, professors, industry individuals, etc., the papers or extended abstracts must be presented in a lucid manner accessible to such individuals. A pdf version of your paper can be submitted at http://www.biotconf.org/papersubmission/openconf.php Extended Abstract Submission You should submit a two-page, single-spaced extended abstract by the submission date given below. Each extended abstract must be in 10 point type, in 2-column format. The extended abstract must show the names of the authors, their mailing and electronic addresses, and up to 3 keywords. An extended abstract must contain a paragraph summary of work followed by additional sections. Please note that sections in an extended abstract must contain enough information so that reviewers can judge the quality of work being reported. Thus, an extended abstract is like a mini paper. Each extended abstract will be reviewed. The accepted extended abstracts will be given several weeks after acceptance notification for revisions based on reviewers' comments. You must submit an updated camera-ready extended abstract following a given format by the date specified below. These extended abstracts will be printed in the Symposium proceedings. 

Full Paper Submission

You should submit a full paper of up to 10 pages (following the tradition used in the field of Computer Science) by the submission date given below. Each paper must be in 10 point type, 2-column format. The paper must show the names of the authors, their mailing and electronic addresses, and up to 3 keywords on the top page. Each paper must contain a one-paragraph abstract or summary followed by other sections. Each paper will be reviewed. Authors of accepted papers will be given several weeks after acceptance notification to revise them based on reviewers' comments. You must submit an updated camera-ready paper by the date specified below. The papers will be printed in the Symposium proceedings. In keeping with the tradition of the bioinformatics and computational biology areas, authors in these sub-fields are strongly encouraged to submit full papers.

Journal Publication

The best papers from BIOT-2006 will be published in the International Journal of Bioinformatics Research and Applications (IJBRA).

Important Dates

Submission Deadline: June 2, 2006 (Two pages of Extended Abstracts or 6 pages of Full Papers)
Acceptance Decision: July 17, 2006
Revised Camera Ready Extended Abstracts and Full Papers due after revision: August 11, 2006
Symposium Date: October 20 and 21, 2006

----------------
Dr. Mark Clement
Department of Computer Science
Brigham Young University
3370 TMCB
Provo, Utah 84602
(801) 422-7608
clement at cs.byu.edu

From mahef111 at link.net Tue May 30 10:19:01 2006
From: mahef111 at link.net (Mhmoud Elhefnawi)
Date: Tue, 30 May 2006 17:19:01 +0300
Subject: [BiO BB] quasispecies or ambiguities
Message-ID: <000201c685aa$8da604c0$5a98c952@pc>

Dear members,

I have sequences obtained by PCR and direct sequencing of hepatitis C virus from patients. The sequencer output has some positions called as A/G, C/T, etc. I would like to know whether these represent quasispecies or ambiguities.
If they do represent quasispecies, what is the best way to deal with that? Should I write out all the different combinations of these positions as separate quasispecies sequences, or do something else?

Thank you all in advance for your kind assistance,
Mahmoud
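
The "write all different combinations" option mentioned in the question can be pictured with a minimal Python sketch. It assumes the mixed calls (A/G, C/T, ...) have first been recorded as standard IUPAC ambiguity codes (R = A or G, Y = C or T, and so on). The function name expand_ambiguities, the limit parameter and the example sequence are hypothetical and do not come from the original message. Note that the number of combinations doubles (or more) with every mixed position, and that listing them says nothing about which variants actually coexist in the patient, because a direct-sequencing read does not show which bases at different positions sit on the same molecule.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_ambiguities(seq, limit=1024):
    """Return every unambiguous sequence consistent with `seq`.

    `limit` (a hypothetical safety cap) stops the expansion when the
    number of combinations would become impractically large.
    """
    # One string of candidate bases per position in the sequence.
    choices = [IUPAC[base] for base in seq.upper()]

    # Count the combinations before building them.
    total = 1
    for candidates in choices:
        total *= len(candidates)
    if total > limit:
        raise ValueError(f"{total} combinations exceed the limit of {limit}")

    # Cartesian product over the per-position candidates.
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    # Hypothetical read with two mixed positions: R (A/G) and Y (C/T).
    for variant in expand_ambiguities("ARTY"):
        print(variant)
```

For the hypothetical read ARTY this lists four candidate sequences: AATC, AATT, AGTC and AGTT. Whether any of them corresponds to a real variant would still need to be checked, for example by cloning and sequencing individual molecules.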