From parry_tomar at yahoo.com Fri Apr 1 02:09:11 2005
From: parry_tomar at yahoo.com (Pradeep Tomar)
Date: Thu, 31 Mar 2005 23:09:11 -0800 (PST)
Subject: [BiO BB] BIOT-05 papers and abstracts due
In-Reply-To: 6667
Message-ID: <20050401070911.15821.qmail@web90101.mail.scd.yahoo.com>

Respected Sir,

Please give me 5 more days; I will send you the copy of my paper for
the conference as soon as possible.

Pradeep Tomar

"J. Kalita" wrote:

Papers and abstracts in all aspects of Bioinformatics and Biotechnology
are officially due today (3/31/05). The symposium features peer-reviewed
papers that are presented orally or in poster sessions. Extended
abstracts (2 pages long) or full papers (up to 6 pages long) are
requested. Please visit http://bioinfo.uccs.edu and click on
"Symposium 2005 (BIOT-05)". The symposium will be held in Colorado
Springs, Colorado on August 15th and 16th.

We had a very successful symposium in 2004. There was a printed
proceedings document with all the accepted abstracts and papers. The
printed proceedings are available on-line at the http://bioinfo.uccs.edu
Web site.

Please go to the Web site and submit your extended abstract or paper as
soon as possible. If you want a little extra time (say, up to an extra
week), please write to me.

Jugal Kalita
Department of Computer Science
University of Colorado at Colorado Springs

PS: We are looking for any other universities or organizations (in the
US or Canada) that want to be co-sponsors of this symposium. We can hold
it in several locations in alternate years.
_______________________________________________
Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bio_bulletin_board

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From MAG at Stowers-Institute.org Fri Apr 1 11:36:35 2005
From: MAG at Stowers-Institute.org (Goel, Manisha)
Date: Fri, 1 Apr 2005 10:36:35 -0600
Subject: [BiO BB] Parsing taxonomy from blast output
Message-ID: <20050401163215.BCDB5D1F03@www.bioinformatics.org>

Hi All,

I need to parse the BLAST output to get the taxonomy information.
If I could get the taxonomy nodes associated with each gi number, that
would also work.
I have been trying SEALS taxonomy commands, but somehow quite a few
sequences turn up "not_retrieved", although we have tried updating the
database etc.
I do not want to use the BLAST web server because I have too many files
to run.
Please suggest any program/script that might be useful.

Thanks,
-Manisha
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From karplus at soe.ucsc.edu Fri Apr 1 12:23:40 2005
From: karplus at soe.ucsc.edu (Kevin Karplus)
Date: Fri, 1 Apr 2005 09:23:40 -0800
Subject: [BiO BB] Re: pdb-l: Parsing taxonomy from blast output
In-Reply-To: <200504011636.j31Gaoj24444@postal.sdsc.edu> (MAG@Stowers-Institute.org)
References: <200504011636.j31Gaoj24444@postal.sdsc.edu>
Message-ID: <200504011723.j31HNeVV027133@cheep.cse.ucsc.edu>

Here is a piece of a perl module that we have used for retrieving
taxonomy information:

#
# get_docsum_from_ncbi($query_gi_list)
#
# Queries the NCBI database to retrieve the document summary for each
# gi ID in a comma delimited list ($query_gi_list).  This document
# summary will be parsed and its values placed into the global hash
# variable, $xml_docsum.  This hash is keyed on the accession number in
# each sequence ID (this is not the same as the gi ID used to retrieve
# the summary).  Each entry in the hash contains a sub-hash with
# name/value pairs from the document summary.  The main value we're
# interested in is 'TaxId', which is the taxonomy ID of the organism
# the sequence comes from.
#
sub get_docsum_from_ncbi($) {
    my($query_gi_list) = @_;

    # NCBI will return its document summary in XML format.  Here, we
    # set up an XML::Parser object with callback functions that will
    # be used to parse the XML.  See the callback functions themselves
    # for more information.
    # Note that if you call XML::Parser->new(Style => 'Debug'), you
    # can see the full structure of the XML document printed to
    # STDOUT.  If you do this, you will need to comment out the
    # setHandlers call below, since only the default handlers will
    # output the debug info.
    my $xml_parser = XML::Parser->new();
    $xml_parser->setHandlers(Start => \&xml_start_handler,
                             Char  => \&xml_char_handler,
                             End   => \&xml_end_handler);

    # NCBI provides tools made for batch utilities to interface with
    # their database.  They call them "Entrez Programming Utilities", or
    # EUtils for short, and you can find their documentation here:
    # http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
    #
    # One EUtil is called ESearch, and the following code can use
    # ESearch to query NCBI with a list of accession numbers (joined
    # with '+OR+' in the 'term' parameter, as in the commented example
    # below) and return a list of gi IDs to be used with other EUtils
    # that only take gi IDs as input.  Unfortunately, the search term
    # is limited in length, and you can only fit about 20 accession
    # numbers per query.  NCBI's usage rules say you can only perform
    # one query every 3 seconds, so it would take 25 mins for 10,000
    # records (500 queries at 3 seconds each).  Therefore, I had to
    # drop this idea as a solution, but I leave the code here in
    # comments for future reference.
    #
    #$query_url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi";
    #$response = $browser->post($query_url,
    #    [ 'db' => 'protein',
    #      'term' => $$query_accession_nums_ref, #'NP_828860.1+OR+P05040',
    #      'usehistory' => 'n',
    #      'tool' => NCBI_TOOL,
    #      'email' => NCBI_EMAIL,
    #    ]);
    #if(!$response->is_success) {
    #    print STDERR "Error '" . $response->status_line .
    #        "' retrieving url '"
    #        . $query_url . "'.\n";
    #    return "";
    #}
    #
    #$xml_parser->parse($response->content);
    #chop($query_gi_list); # Cut off trailing comma

    # Another way to convert accession numbers to gi ids is by using
    # "batch entrez".  This part of NCBI's web site is made to be used
    # by humans, so scripting it and parsing the output is a lot
    # harder.  However, the advantage of batch entrez over ESearch is
    # that it can return up to 10,000 results at a time (at least
    # according to my interpretation of the documentation - I never
    # tested with that many).  Nevertheless, to avoid straining NCBI's
    # systems and parsing result data from a human interface that
    # might change format, we decided it was better to just use the gi
    # ids stored in each a2m file, even though they can become out of
    # date over time.
    #
    # The following is code that can be adapted to getting gi ids using batch
    # Entrez.  The code was written originally as an attempt to find taxonomy
    # ids, before I knew about EUtils.  Note that $$batch_accession_nums_ref
    # needs to be set to a \n delimited list of accession numbers.
    # One warning: if you give batch entrez an accession number it can't find,
    # it may return an error and no results.  I didn't get to handling that.
    # Also note that if you need to display a next page of results, or
    # results in a different format, extract "WebEnv" and "query_num" to
    # pass on to the next page.  Each new page seems to include new values
    # for these two things.  In other words, one WebEnv can't be used to
    # page through one set of results - each page of the results gives you
    # a new WebEnv.  The EUtils docs say WebEnv is a key into a server side
    # cache that can hold up to 10,000 records.
    #
    # Use LWP's multipart/form-data POST method to upload a batch
    # of accession numbers to look up.
    # Documentation for this is found in question 10 of the LWP FAQ
    # found here:
    # http://groups.google.com/groups?q=perl+lwp+upload+file&hl=en&lr=lang_en&ie=UTF-8&oe=UTF-8&safe=off&selm=8ek1n2%24ej1%241%40cherry.rt.ru&rnum=2
    # Better documentation here:
    # http://www.perldoc.com/perl5.8.0/lib/HTTP/Request/Common.html
    # but even those docs are not clear how to upload file content without a
    # file.  The key was found in this post:
    # http://groups.google.com/groups?q=lwp+form-data+undef+content&hl=en&lr=lang_en&ie=UTF-8&oe=UTF-8&safe=off&selm=u6594vsb9c2l1bpp6c1rg6tpf1tcs8o8r9%404ax.com&rnum=2
    #my $request = POST (
    #    'http://www.ncbi.nlm.nih.gov:80/entrez/batchentrez.cgi',
    #    Content_Type => 'form-data',
    #    Content =>
    #    [
    #      cmd_current => '',
    #      cmd => 'Retrieve',
    #      db => 'protein',
    #      orig_db => 'nucleotide', # This is probably not needed
    #      dispmax => '10000', # Untested, but this should make it return up to 10,000 results
    #
    #      # To fake the upload of a file without writing one out requires
    #      # simulating a bunch of extra headers.  To upload a real file is
    #      # much simpler, requiring just one value of ['./batch.entrez'],
    #      file => [undef, 'batch.entrez',
    #               'Content-Length' => length($$batch_accession_nums_ref),
    #               'Content-Type' => 'text/plain',
    #               'Content' => $$batch_accession_nums_ref],
    #
    #    ],
    #);
    #$response = $browser->request($request);

    # To find the taxonomy ID for each gi ID, we use the NCBI ESummary utility.
    # This utility takes a comma delimited list of gi IDs and returns summary
    # data about each one in XML format.  This summary information
    # includes the taxonomy ID, the gi ID, the sequence ID, and some other
    # things we don't care to know.  To see what ESummary returns, try visiting
    # this URL:
    # http://www.ncbi.nih.gov/entrez/eutils/esummary.fcgi?db=protein&id=29837495,114475
    #
    # Documentation for ESummary is here:
    # http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esummary_help.html
    my $query_url = "http://www.ncbi.nih.gov/entrez/eutils/esummary.fcgi";

    # Older versions of LWP (like the one on the Condor cluster) don't
    # support the post() method, so use the older request(...POST())
    # method instead.
    #my $response = $browser->post($query_url,
    my $response = $browser->request(HTTP::Request::Common::POST($query_url,
        [ 'db' => 'protein',       # Interestingly, "protein" and "taxonomy" databases both return the same results
          'id' => $query_gi_list,
          'tool' => NCBI_TOOL,     # Let NCBI know what script is accessing ESummary
          'email' => NCBI_EMAIL,   # Let NCBI know who to e-mail if the script is causing problems on their servers.
        ]));
    if(!$response->is_success) {
        print STDERR "get_docsum_from_ncbi: Error '" . $response->status_line .
            "' retrieving url '" . $query_url . "'.\n";
        return;
    }

    # Parse the XML response to our ESummary web query.  The callback
    # functions like xml_start_handler will take care of putting all
    # the data into the global $xml_docsum hash.
    $xml_parser->parse($response->content);

    # ESummary is not the only way to get the taxonomy IDs we're looking
    # for.  You can also use EFetch through a URL like this:
    # http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=11036339,1078661&rettype=native&retmode=xml
    # where "11036339,1078661" is the comma separated list of gi ids
    # (although it seems you can also use accession numbers).  The
    # problem with this method is that the amount of information it
    # returns is enormous, and the time it takes to process and return
    # it to you in XML is quite noticeable even on just two records.
    # It goes faster using the default non-xml format like this:
    # http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=11036339,1078661
    # but it is still long.  However, if you ever need more data than the
    # taxonomy ID, this may be the only way to get it.
    #
    # I also discovered that you can request multiple ids to be displayed
    # at the same time through the human "query.fcgi" interface using a
    # url like this:
    # http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=protein&list_uids=11036339,1078661&dopt=GenPept
    # Again, I don't trust that this output will remain stable like the EUtils
    # output, and I don't know how many ids you can request at once, and
    # this output is probably based on the EFetch backend data anyway
    # (so you're probably not saving their servers any work), but it's worth
    # looking into.  E-mail info at ncbi.nlm.nih.gov if you really want to be
    # sure what is most efficient on their end.
}

------------------------------------------------------------
This code is not complete---you may need some of the following packages:
------------------------------------------------------------

# LWP (Library for WWW in Perl) is used to retrieve documents
# from web servers.  See
# http://www.perl.com/pub/a/2002/08/20/perlandlwp.html for documentation.
use HTTP::Request::Common;
use LWP::UserAgent;

# There are a bewildering number of modules written to parse XML,
# but I think the following advice found on a newsgroup made the
# most sense:
#
#   Go to http://search.cpan.org/ and search for "XML".
#   Have a look at XML::DOM, XML::Grove, XML::Twig, and maybe
#   XML::Simple, if you don't mind the uglyish data structures it
#   exposes to the user.  XML::QL might be of interest, even though
#   it is still immature.
#   If you are not interested in a tree, but want to just pull out
#   elements while parsing the file, try XML::Node, or XML::Parser
#   directly.
#
# Based on this advice, I have chosen XML::Parser as a standard
# module which works efficiently for what we need here.  We
# don't need to convert the whole document into a memory based
# tree structure, we just need to pull out a few values we need.  If
# you do need a tree structure, I would lean towards XML::Grove
# from the list above, though I have little to base that opinion
# on.
# Note that I'm not too happy with the hoops I had to jump through to
# use Parser's event driven methodology.  The code is not so easy to
# understand.
use XML::Parser;

From dmb at mrc-dunn.cam.ac.uk Fri Apr 1 12:40:15 2005
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Fri, 1 Apr 2005 18:40:15 +0100 (BST)
Subject: [BiO BB] Re: [ssml] Parsing taxonomy from blast output
In-Reply-To: <20050401163215.BCDB5D1F03@www.bioinformatics.org>
Message-ID:

On Fri, 1 Apr 2005, Goel, Manisha wrote:

>Hi All,
>
>I need to parse the blast output to get the taxonomy information.
>If I could get the taxonomy nodes associated with each gi number .. This
>would also work.

Yeah, this data is here...

ftp://ftp.ncbi.nih.gov/pub/taxonomy/

See...

ftp://ftp.ncbi.nih.gov/pub/taxonomy/gi_taxid.readme

"The gi_taxid_prot.dmp is about 17 MB and contains two columns: the
protein's gi and taxid."

You can then use the 'taxdump' to get the names.dmp (for the names) and
nodes.dmp (for the structure of the taxonomic tree) files (if you need
them).

See...

ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_readme.txt

All the best,
Dan.

>I have been trying SEALS taxonomy commands but somehow quite a few
>sequences turn up "not_retrieved", although we have tried updating the
>database etc.
>I do not want to use the BLAST web server because I have too many files
>to run.
>Please suggest any program/script that might be useful.
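
For what it's worth, the two-column gi_taxid_prot.dmp file described above lends itself to a very small lookup script. What follows is only a minimal sketch (in Python, for brevity), not a tested pipeline. It assumes that the BLAST report contains identifiers of the form "gi|<number>|", and that each line of the dump is "gi <whitespace> taxid". The helper names extract_gis and load_taxids are hypothetical, and the gi/taxid values in the demo are made up.

```python
import io
import re

# Assumption: hits are identified by deflines such as "gi|12345|ref|...|".
GI_PATTERN = re.compile(r"gi\|(\d+)")


def extract_gis(blast_text):
    """Return the distinct gi numbers found in the text, in first-seen order."""
    seen = {}
    for match in GI_PATTERN.finditer(blast_text):
        seen.setdefault(int(match.group(1)), None)
    return list(seen)


def load_taxids(dump_lines, wanted_gis):
    """Scan a two-column 'gi taxid' dump once, keeping only the gis we need.

    Streaming through the file avoids loading the whole dump into memory.
    """
    wanted = set(wanted_gis)
    found = {}
    for line in dump_lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        gi = int(fields[0])
        if gi in wanted:
            found[gi] = int(fields[1])
    return found


if __name__ == "__main__":
    # Made-up stand-ins for a BLAST report and for gi_taxid_prot.dmp.
    blast_text = (
        ">gi|111|ref|XX_000001.1| hypothetical protein\n"
        ">gi|222|sp|P00000| example protein\n"
    )
    dump = io.StringIO("111\t9606\n222\t10090\n333\t7227\n")

    gis = extract_gis(blast_text)
    taxids = load_taxids(dump, gis)
    for gi in gis:
        print(gi, taxids.get(gi, "not_found"))
```

With real data you would pass an open file handle for the dump instead of the StringIO object. The taxids can then be looked up in names.dmp and nodes.dmp if names or lineage are needed.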
>
>Thanks,
>-Manisha
>

From idonalds at blueprint.org Fri Apr 1 15:11:35 2005
From: idonalds at blueprint.org (Ian Donaldson)
Date: Fri, 01 Apr 2005 15:11:35 -0500
Subject: [BiO BB] Re: [ssml] Parsing taxonomy from blast output
In-Reply-To:
Message-ID:

Hi all,

I should also mention that you can retrieve this information using the
SeqHound remote Perl API (or Java/C/C++). No need to use up disk space
or wait for downloads.

The call is SHoundTaxIDFromGi, described here:
http://www.blueprint.org/seqhound/apifunctsdet.html#SHoundTaxIDFromGi

You can download the API from here:
ftp://ftp.blueprint.org/pub/SeqHound/Code/
and follow the enclosed instructions to get started, or look at the
first few pages of the SeqHound Manual:
http://www.blueprint.org/seqhound/seqhound_documentation.html

Taxid assignments to GIs are updated daily as part of the core module.
Check here:
http://seqhound.blueprint.org/report.html

Other API calls can also provide you with names of taxons.

Cheers
Ian

From idoerg at burnham.org Mon Apr 4 17:19:36 2005
From: idoerg at burnham.org (Iddo Friedberg)
Date: Mon, 04 Apr 2005 14:19:36 -0700
Subject: [BiO BB] Reminder: 3 weeks to abstract submission for AFP
Message-ID: <4251AF68.2030307@burnham.org>

Hello all,

The deadline for abstract submission for the First Automated Function
Prediction (AFP) meeting is approaching: April 26, 2005.

The AFP meeting will be held as a Special Interest Group (SIG) meeting
just before ISMB 2005, on Friday, June 24, 2005 at The Detroit Marriott
Renaissance Center, Detroit, MI, USA.

The meeting will feature talks from various groups involved in protein
function prediction. Additionally, an open assessment of function
prediction servers will be held in the afternoon session.

If you are involved in computational function prediction, we would like
to see you there, preferably on the podium, telling us about your work.
An abstract submitted to the AFP will be in the conference proceedings,
but is not considered a publication, so you may publish your results
elsewhere as well.

Talks are sought in, but not limited to, the following topics:

* Function prediction using sequence based methods. This would
  include "classic" methods such as detection of functional motifs and
  inferring function from sequence similarity.
* Function from genomic information: prediction by genomic location;
  locus comparison with other organisms; function gain and loss.
* Phylogeny based methods
* Function from molecular interactions
* Function from structure
* Function prediction using combined methods
* "Meta-talks" discussing the limitations and horizons of
  computational function prediction.
* Assessing function prediction programs

For more information please see: http://ffas.burnham.org/AFP

Abstract submission instructions and templates can be found at:
http://ffas.burnham.org/AFP/Speakers/

We are looking forward to seeing you in June.

Iddo Friedberg, in the name of the AFP organizing committee.

--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
http://ffas.ljcrf.edu/~iddo
==========================
The First Automated Protein Function Prediction SIG
Detroit, MI June 24, 2005
http://ffas.burnham.org/AFP

From willy_valdivia at orionbiosciences.com Fri Apr 1 12:43:37 2005
From: willy_valdivia at orionbiosciences.com (Willy Valdivia-Granda)
Date: Fri, 1 Apr 2005 10:43:37 -0700
Subject: [BiO BB] Call for Papers: Fifth Virtual Conference on Genomics and Bioinformatics
Message-ID: <20050401174337.28845.qmail@webmail11.prod.mesa1.secureserver.net>

Apologies for multiple postings

***********************************************************************
Call for Papers: Fifth Virtual Conference on Genomics and Bioinformatics
***********************************************************************

http://www.virtualgenomics.org/vcgb/conference_2005.htm

Submission Deadline: June 30, 2005

The Virtual Conference on Genomics and Bioinformatics provides an
advanced collaborative environment where high profile researchers
discuss the challenges and opportunities in the understanding of living
systems. There are NO REGISTRATION FEES to participate in this event.

The main objective of the Proceedings of the Virtual Conference is to
establish a prestigious compilation of research advances, discussions
and reviews on emerging issues related to genomics and bioinformatics.
To attain maximum interaction among the authors and their readers, each
submitted paper will be subject to a stringent double-blind review
process, and accepted manuscripts will be invited for oral presentation
and published in both paper and electronic forms.

Topics covered by the 2005 Issue will include:

Artificial and Synthetic Life
Biological Knowledge Representation
Biological Data Mining
Data Visualization
Educational Experiences and Policies Related to Genomics and Bioinformatics
Evolutionary Genomics
Genomic Data Standardization, Management, and Integration
High Throughput and GRID Computing
Machine Learning Techniques for Genomic Analysis
Nano-biotechnology
Proteomic Analysis
Protein Structural Analysis and Modeling
Systems Biology
Structural Genomics

Submissions of up to four pages will be considered for publication as
posters, and submissions of up to twelve pages as papers. All accepted
manuscripts will be considered for publication in PLoS Computational
Biology.

Review Committee:

Kim Baldridge. Universität Zürich. Switzerland
Patsy Babbitt. University of California, San Francisco. USA
Eric Davidson. California Institute of Technology. USA
Joaquin Dopazo. Spanish National Cancer Center. Spain
Inna Dubchak. Lawrence Berkeley National Laboratory. USA
Keith Dunker. University of Indiana. USA
Kevin Karplus. University of California, Santa Cruz. USA
Maricel Kann. NCBI, NLM, National Institutes of Health. USA
Phillip Lord. University of Manchester. UK
Anna Panchenko. NCBI, NLM, National Institutes of Health. USA
Teresa Przytycka. NCBI, NLM, National Institutes of Health. USA
Richard Simon. NCI, National Institutes of Health. USA
Amit Sheth. University of Georgia. USA
Robert Stevens. University of Manchester. UK
Deanne Taylor. Serono Reproductive Institute. USA
Alfonso Valencia. Centro Nacional de Biotecnologia. Spain

From sourangshu at csa.iisc.ernet.in Tue Apr 5 14:27:16 2005
From: sourangshu at csa.iisc.ernet.in (Sourangshu Bhattacharya)
Date: Tue, 5 Apr 2005 23:57:16 +0530 (IST)
Subject: [BiO BB] Scop Domain Names
Message-ID:

Hi,

I just noticed that some domain names (in the sequence files) start with
'g' instead of 'd' as usual. Is there any specific reason for it?

Thank you.

Sourangshu

Sourangshu Bhattacharya
PhD Student,
Dept. of Computer Science & Automation,
IISc, Bangalore.

http://people.csa.iisc.ernet.in/sourangshu

From dmb at mrc-dunn.cam.ac.uk Tue Apr 5 15:16:05 2005
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Tue, 5 Apr 2005 20:16:05 +0100 (BST)
Subject: [BiO BB] Scop Domain Names
In-Reply-To:
Message-ID:

On Tue, 5 Apr 2005, Sourangshu Bhattacharya wrote:

>Hi,
>I just noticed that some domain names (in the sequence files) start with
>'g' instead of 'd' as usual. Is there any specific reason for it ??

See the readme at ASTRAL.

'g' stands for genetic. It means that the rare SCOP domains which are
composed of separate chains are concatenated in the sequence (with an X)
in 'genetic' order (from the DNA).

It is also used for those domains which are several fragments of the
same chain, for example when one domain is inserted into the loop of
another domain.

If you can understand PDB entry 1dan in SCOP you will know what is going
on.

dan

>
>Thank you.
>
>Sourangshu
>
>
>Sourangshu Bhattacharya
>PhD Student,
>Dept. of Computer Science & Automation,
>IISc, Bangalore.
>
>http://people.csa.iisc.ernet.in/sourangshu
>

From jeff at bioinformatics.org Wed Apr 6 18:06:12 2005
From: jeff at bioinformatics.org (J.W. Bizzaro)
Date: Wed, 06 Apr 2005 18:06:12 -0400
Subject: [BiO BB] Bioinformatics.Org Annual Meeting (BiOAM) 2005
Message-ID: <42545D54.5030207@bioinformatics.org>

2005 ANNUAL MEETING OF THE BIOINFORMATICS ORGANIZATION
MAY 17-19, 2005
HYNES CONVENTION CENTER, BOSTON

Come join us as we hold our fifth Annual Meeting, in conjunction with
the Bio-IT World Conference + Expo. Show Management is offering our
members a great deal: all Bioinformatics.Org members who register online
by May 16, 2005 will receive a 25% conference discount or a FREE Exhibit
Only Pass*** to attend the event, taking place May 17-19, 2005 at The
Hynes Convention Center in Boston, Mass.

REGISTRATION: http://www.bio-itworldexpo.com/live/26/register

To receive the 25% discount, please register online by May 16, 2005 with
the exclusive Bioinformatics.Org Members PRIORITY CODE: B0638. Early-bird
ends April 15, 2005!

SPECIAL HIGHLIGHTS SPONSORED BY Bioinformatics.Org:

- The Benjamin Franklin Award Ceremony:
  http://www.bio-itworldexpo.com/live/26/events/26BOS05A/keynotes

- Laureate seminar by Ewan Birney, Ph.D., Head of Ensembl, European
  Bioinformatics Institute (EBI), Cambridge, UK

- Filippo Rusconi, Ph.D., Researcher, Centre National de la Recherche
  Scientifique (CNRS) and Museum National d'Histoire Naturelle (MNHN),
  Paris

Website for Bioinformatics.Org seminars:
http://www.bio-itworldexpo.com/live/26/events/26BOS05A/conference/tracksessions/Special+Sessions/QMONYA04NF8C

For further details, download the conference brochure:
http://www.bio-itworldexpo.com/live/26/events/26BOS05A/conference/CC881012

WITH YOUR FREE EXHIBIT-ONLY PASS, ALSO ATTEND THESE GREAT KEYNOTES:

- Tim Berners-Lee, Director, World Wide Web Consortium (W3C); Senior
  Research Scientist, CSAIL, MIT

- Lawrence J. Lesko, Ph.D., FCP, Director, Office of Clinical
  Pharmacology and Biopharmaceutics, Center for Drug Evaluation and
  Research (CDER), Food and Drug Administration

- J. Craig Venter, Ph.D. (following Ewan Birney!), Founder & Chairman of
  the Board, The Institute for Genomic Research (TIGR); President &
  Founder, J. Craig Venter Science Foundation (JCVSF); and President &
  Founder, The J. Craig Venter Institute

For more keynote details, go to:
http://www.bio-itworldexpo.com/live/26/events/26BOS05A/keynotes

We look forward to seeing you at Bio-IT World Conference + Expo in
Boston.

***NOTE: Valid for new registrations only. This offer cannot be redeemed
for cash or used in conjunction with any other offer. All registration
fees are non-refundable and credentials are non-transferable.

Cheers.
Jeff
--
J.W. Bizzaro
Bioinformatics Organization, Inc. (Bioinformatics.Org)
E-mail: jeff at bioinformatics.org
Phone: +1 508 890 8600
--

From vxg189 at bham.ac.uk Fri Apr 8 08:17:35 2005
From: vxg189 at bham.ac.uk (Vibhor Gupta)
Date: Fri, 8 Apr 2005 13:17:35 +0100
Subject: [BiO BB] IMAGE IDs to chromosomal locations
Message-ID: <1B4C5BA3CB5F2849B0C8DB99156046A43AC4EE@med-ex1.bham.ac.uk>

Hello all,

I have been working on the analysis of some microarray data. I am at a
stage where I have a selection of genes and I am interested in looking
at their specific chromosomal locations. Since the GPR files only give
me the IMAGE IDs (referring to the EST sequence present on the spot), I
am looking for software or a database engine that provides me with the
corresponding GenBank IDs (or even better, the chromosomal locations).
The GAL file provides some GenBank IDs, but the list is not complete
(and so it would be good to have a list of GenBank IDs as well). I would
be grateful to anyone who could help me in this matter. Thank you.

Mr. Vibhor Gupta
Research Associate (Chromatin and Gene Expression Group)
Division of Immunity and Infection - Anatomy
Institute of Biomedical Research
University of Birmingham
Birmingham - B15 2TT
Email: v.gupta.1 at bham.ac.uk
Telephone number: 0121-4158684

From Hegedus.Tamas at mayo.edu Fri Apr 8 21:59:21 2005
From: Hegedus.Tamas at mayo.edu (Tamas Hegedus)
Date: Fri, 08 Apr 2005 18:59:21 -0700
Subject: [BiO BB] storage_software
Message-ID: <425736F9.7030800@mayo.edu>

Hi,

I am looking for software for storage management in the lab. Something
like ItemTracker (http://www.itemtracker.co.uk), but cheaper (free?). It
could also be simpler, as we are a small research group. A server-side
solution (to be accessed from different computers) would be best.

Thanks,
Tamas

--
Tamas Hegedus, Research Fellow | phone: (1) 480-301-6041
Mayo Clinic Scottsdale         | fax: (1) 480-301-7017
13000 E. Shea Blvd             | mailto:hegedus at mayo.edu
Scottsdale, AZ, 85259          | http://hegedus.brumart.org

From logan at cacs.louisiana.edu Mon Apr 11 13:58:46 2005
From: logan at cacs.louisiana.edu (logan at cacs.louisiana.edu)
Date: Mon, 11 Apr 2005 12:58:46 -0500 (CDT)
Subject: [BiO BB] 3rd Call for Participation in the Bioinformatics Symposium
Message-ID: <1213.68.226.154.226.1113242326.squirrel@webmail.cacs.louisiana.edu>

Second Annual Bioinformatics Symposium

Place: Lafayette, Louisiana
Date: April 18, 2005
Symposium web site:
http://www.cacs.louisiana.edu/bioinformatics/symp05/index.html

Please register to be part of an exciting event in Lafayette, Louisiana
on April 18: the 2nd Annual Bioinformatics Symposium. The symposium
features keynote presentations, invited talks, and research
contributions. Seating is limited to 70, and registration will close
once we reach that limit. Some seats are still available, so if you are
interested, I urge you to register online soon.

The symposium's goal is to build a bridge among researchers working in
industry, academia, and research institutions. This forum will provide
an opportunity to learn about ongoing research activities, which may
help spark new collaborations for future research and funding
opportunities in this area.

Keynote Speakers include:

Professor Philip E. Bourne, UCSD, Co-Director, Protein Data Bank
Topic: Recent Developments in Structural Bioinformatics

Professor Ed Seidel, Center for Computation & Technology, Louisiana
State University
Topic: Grid Computing

Dr. Mark Gosink, Head of Computational Biology, Scripps Florida
Topic: Informatics at Scripps Florida

Participation:
Registration ($25 for students and $50 for others) is required to attend
the symposium. The registration fee covers all the sessions, breakfast,
lunch, and two coffee breaks. Online registration is open while seats
are available (first come, first served).

Thank you for your time

Raja Logananatharaj
Bioinformatics Symposium Organizer

From forward at hongyu.org Mon Apr 11 18:49:30 2005
From: forward at hongyu.org (Hongyu Zhang)
Date: Mon, 11 Apr 2005 15:49:30 -0700
Subject: [BiO BB] multiple sequence alignment displaying program
Message-ID: <1113259770.425afefae30c4@hongyu.org>

Dear all,

Does anyone know a multiple sequence alignment displaying program that
can output a view like this picture:
http://hongyu.org/misc/alignment.jpg

In this example, the conserved amino acids in the multiple alignment are
framed using lines instead of the usual color or black/white scheme. Our
outside attorneys like it better, so I need to find a program for us.

Thanks!

--
Hongyu Zhang, Ph.D.
Computational biologist
Ceres Inc.
From mkgovindis at yahoo.com Mon Apr 11 22:57:16 2005
From: mkgovindis at yahoo.com (govind mk)
Date: Mon, 11 Apr 2005 19:57:16 -0700 (PDT)
Subject: [BiO BB] IMAGE IDs to chromosomal locations
In-Reply-To: 6667
Message-ID: <20050412025716.32211.qmail@web54504.mail.yahoo.com>

Hi Vibhor,

Have a look at this web site:

http://source.stanford.edu

I hope this helps.

Regards

--- Vibhor Gupta wrote:
> Hello all,
>
> I have been working on the analysis of some microarray data. I am at
> a stage where I have a selection of genes and I am interested in
> looking at their specific chromosomal locations. Since the GPR files
> provide only the IMAGE IDs (referring to the EST sequence present on
> the spot), I am looking for software or a database engine that
> provides the corresponding GenBank IDs (or, even better, the
> chromosomal locations). The GAL file provides some GenBank IDs, but
> the list is not complete (so it would be good to have a list of
> GenBank IDs as well). I would be grateful to anyone who could help
> me in this matter. Thank you.
>
> Mr. Vibhor Gupta
> Research Associate (Chromatin and Gene Expression Group)
> Division of Immunity and Infection - Anatomy
> Institute of Biomedical Research
> University of Birmingham
> Birmingham - B15 2TT
> Email: v.gupta.1 at bham.ac.uk
> Telephone number: 0121-4158684

From stefanielager at fastmail.ca Tue Apr 12 03:47:31 2005
From: stefanielager at fastmail.ca (Stefanie Lager)
Date: Tue, 12 Apr 2005 07:47:31 +0000 (UTC)
Subject: [BiO BB] multiple sequence alignment displaying program
Message-ID: <20050412074732.24546861853@mail.interchange.ca>

The EMBOSS program prettyplot can do this:

http://emboss.sourceforge.net/apps/prettyplot.html

It is available for local installation and also on the web at several sites.

Stefanie

From pankaj at nii.res.in Tue Apr 12 04:01:00 2005
From: pankaj at nii.res.in (Pankaj)
Date: Tue, 12 Apr 2005 13:31:00 +0530
Subject: [BiO BB] modeller help
Message-ID: <20050412080100.M82201@nii.res.in>

Hello everybody,

I am using the MODELLER package to build models of my sequence. There is an option to specify how many models to build for the query sequence, but how do I know which is the best model for my sequence? Any help would be highly appreciated.

Thanking you all in advance,
Pankaj Kamra

From christoph.gille at charite.de Tue Apr 12 04:00:09 2005
From: christoph.gille at charite.de (Dr.
Christoph Gille)
Date: Tue, 12 Apr 2005 10:00:09 +0200 (CEST)
Subject: [BiO BB] multiple sequence alignment displaying program
In-Reply-To: <1113259770.425afefae30c4@hongyu.org>
References: <1113259770.425afefae30c4@hongyu.org>
Message-ID: <49772.192.168.220.204.1113292809.squirrel@webmail.charite.de>

TeXshade can colorize the conserved residues, and a threshold can be given, but the result does not look exactly like your figure. TeXshade output is best printed in color. There is a convenient Java front end for TeXshade.

From vxg189 at bham.ac.uk Tue Apr 12 05:06:29 2005
From: vxg189 at bham.ac.uk (Vibhor Gupta)
Date: Tue, 12 Apr 2005 10:06:29 +0100
Subject: [BiO BB] IMAGE IDs to chromosomal locations
Message-ID: <1B4C5BA3CB5F2849B0C8DB99156046A43AC4F5@med-ex1.bham.ac.uk>

Thank you very much for your help, Mr. Govind.

I had visited this website, but it does not support batch queries and asks for accession numbers to be entered one by one. It is a useful resource if one is interested in a small number of genes. Since I am dealing with around 700-800 genes, I was looking for something that would let me enter the GenBank accession numbers of all these genes at once and then return their respective chromosomal locations and band positions. Thanking you again.

Mr. Vibhor Gupta
Research Associate (Chromatin and Gene Expression Group)
Division of Immunity and Infection - Anatomy
Institute of Biomedical Research
University of Birmingham
Birmingham - B15 2TT
Email: v.gupta.1 at bham.ac.uk
Telephone number: 0121-4158684

From omoya at uib.es Tue Apr 12 14:16:23 2005
From: omoya at uib.es (Oscar)
Date: Tue, 12 Apr 2005 11:16:23 -0700
Subject: [BiO BB] multiple sequence alignment displaying program
In-Reply-To: <1113259770.425afefae30c4@hongyu.org>
References: <1113259770.425afefae30c4@hongyu.org>
Message-ID: <425C1077.5040704@uib.es>

The output you show looks quite similar to the alignment report from the program MegAlign, which belongs to the DNASTAR Lasergene package. Try:
http://www.dnastar.com/web/r10.php

Good luck.

Oscar

From vetdebarshi at gmail.com Tue Apr 12 17:08:57 2005
From: vetdebarshi at gmail.com (debarshi roy)
Date: Tue, 12 Apr 2005 16:08:57 -0500
Subject: [BiO BB] multiple sequence alignment displaying program
In-Reply-To: <1113259770.425afefae30c4@hongyu.org>
References: <1113259770.425afefae30c4@hongyu.org>
Message-ID: <95907580504121408276252f0@mail.gmail.com>

Hello all,

I am a graduate student in biology in the USA, and I would like to build a career in bioinformatics. I would like to know whether any summer training positions are available in the USA for graduate students at bioinformatics companies or institutions.

Thanks.
Debarshi Roy

From christoph.gille at charite.de Tue Apr 12 19:02:24 2005
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Wed, 13 Apr 2005 01:02:24 +0200 (CEST)
Subject: [BiO BB] (no subject)
Message-ID: <32906.217.86.170.155.1113346944.squirrel@webmail.charite.de>

Hi everybody,

I am writing an interface between BioJava and STRAP so that people who write plugins for STRAP, or who use STRAP for scripting, can take advantage of the BioJava API.

Please send comments and suggestions!

Christoph

From mkgovindis at yahoo.com Tue Apr 12 23:13:22 2005
From: mkgovindis at yahoo.com (govind mk)
Date: Tue, 12 Apr 2005 20:13:22 -0700 (PDT)
Subject: [BiO BB] IMAGE IDs to chromosomal locations
In-Reply-To: 6667
Message-ID: <20050413031322.42922.qmail@web54506.mail.yahoo.com>

Hi Vibhor,

I guess you missed the link on that page that allows batch queries:

http://genome-www5.stanford.edu/cgi-bin/source/sourceBatchSearch

That page lets you upload a file and retrieve the corresponding data.

-Govind

From Philippe.Hupe at curie.fr Wed Apr 13 11:56:15 2005
From: Philippe.Hupe at curie.fr (Philippe Hupé)
Date: Wed, 13 Apr 2005 17:56:15 +0200
Subject: [BiO BB] New tools for transcriptome and array CGH experiments
Message-ID: <425D411F.8070302@curie.fr>

Dear colleague,

The Bioinformatics Unit of Institut Curie (Paris, France) is involved in the development of algorithms for microarray data analysis, including transcriptome, CGH, and chromatin immunoprecipitation microarrays.
For example, we have developed the following software:

- GLAD, for breakpoint detection in array CGH experiments
- MAIA, for automatic microarray image analysis
- MANOR, for normalisation of microarray data
- VAMP, a Java graphical interface for visualisation and analysis of CGH array, transcriptome, and other molecular profiles

A demo version of VAMP can be run on our Web server with several public CGH and transcriptome array data sets: Douglas et al. (2004), Veltman et al. (2003), Pollack et al. (CGH and transcriptome array data, 2002), Snijders et al. (2001), Nakao et al. (2004) and de Leeuw et al. (2004).

VAMP can be requested at vamp at curie.fr, GLAD at glad at curie.fr, and MANOR at manor at curie.fr.

Best regards,
The Bioinformatics Unit of Institut Curie

PS: The VAMP demo is available at http://bioinfo.curie.fr/vamp (click on "Launch Vamp", then File->Import).

Two movies give an overview of the VAMP software's capabilities:
- http://bioinfo-out.curie.fr/tutorial/vamp/vamp-demo1.html
- http://bioinfo-out.curie.fr/tutorial/vamp/vamp-demo2.html

We are developing a Web site and have several publications in preparation in the field; if you are interested, we will let you know when they are available. You can visit our Web site at http://bioinfo.curie.fr

--
Philippe Hupé
UMR 144 - Service Bioinformatique
Institut Curie
Laboratoire de Transfert (4ème étage)
26 rue d'Ulm
75005 Paris - France

Email : vamp at curie.fr
Tél : +33 (0)1 44 32 42 75
Fax : +33 (0)1 42 34 65 28

From christoph.gille at charite.de Thu Apr 14 06:44:49 2005
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Thu, 14 Apr 2005 12:44:49 +0200 (CEST)
Subject: [BiO BB] extract regions from chomosome
Message-ID: <49284.192.168.220.204.1113475489.squirrel@webmail.charite.de>

I have a list of start and end positions on human chromosomes and would like to extract the respective nucleotide sequences.
It is a list of about 200 entries.

Does anybody know an efficient way?

Thanks

Christoph

From mike.fursov at gmail.com Thu Apr 14 06:54:36 2005
From: mike.fursov at gmail.com (Mikhail Fursov)
Date: Thu, 14 Apr 2005 17:54:36 +0700
Subject: [BiO BB] extract regions from chomosome
In-Reply-To: <49284.192.168.220.204.1113475489.squirrel@webmail.charite.de>
References: <49284.192.168.220.204.1113475489.squirrel@webmail.charite.de>
Message-ID:

I know a simple, though not efficient, way to extract a subsequence from a whole sequence. We use GenomeBrowser (http://genome.unipro.ru); the trial version is fully functional.

The sequence of actions: load your genome -> choose the select-region option -> select the region you need -> save the selection to FASTA or just copy it to the clipboard.

From bader at cbio.mskcc.org Fri Apr 15 14:07:21 2005
From: bader at cbio.mskcc.org (Gary Bader)
Date: Fri, 15 Apr 2005 14:07:21 -0400
Subject: [BiO BB] Announcing Cytoscape Version 2.1
In-Reply-To:
References: <49284.192.168.220.204.1113475489.squirrel@webmail.charite.de>
Message-ID: <426002D9.5010803@cbio.mskcc.org>

Greetings,

Cytoscape is an open-source bioinformatics software platform for visualizing molecular interaction networks and integrating these interactions with gene expression profiles and other state data.
http://www.cytoscape.org

New version 2.1 features:

For users:
- Major performance improvements
- Complete on-line help system
- cPath plugin for downloading protein interactions
- Better network filtering, including large network support
- Layout and rotate selected nodes
- Progress bars on long tasks
- Support for very large networks (>100K nodes, edges)
- Numerous bug fixes
- Numerous plugins available for feature expansion and network analysis, e.g. Active Modules, MCODE, LitSearch. See http://www.cytoscape.org/plugins2.php

For programmers:
- Significantly expanded API technical documentation
- New task framework for long-term tasks
- Headless mode (command-line-only operation) for advanced scripting

Thanks,
The Cytoscape Collaboration
http://cytoscape.org/people.php

From alama456 at hotmail.com Mon Apr 18 03:17:59 2005
From: alama456 at hotmail.com (alain M.)
Date: Mon, 18 Apr 2005 09:17:59 +0200
Subject: [BiO BB] Harvester or Genecard
Message-ID:

Dear colleague,

I currently work in a biology lab in Louvain (Belgium) and I am involved in the creation of a biotech company. We will develop an HTS platform to identify putative targets in the fields of cancer and CNS diseases. Bioinformatics will be a key tool for performing such analyses, and we have identified two integrated databases that may be useful for us: GeneCards from XenneX Inc. and Harvester from Biomax Informatics AG. Another possibility is to build our own integrated database in-house.

What is your opinion of these two systems? Do you have any information about pricing for a small biotech? Do the private versions offer additional functionality compared to the free online academic versions?
Thanks for your help.

Best regards,
A M

From ruth.gull at continuing-education.oxford.ac.uk Fri Apr 15 10:06:04 2005
From: ruth.gull at continuing-education.oxford.ac.uk (Gull, Ruth)
Date: Fri, 15 Apr 2005 15:06:04 +0100
Subject: [BiO BB] Oxford Bioinformatics Programme Virtual Open Day: 22nd April
Message-ID: <250459F5945E84439644F58754F565A4029B6832@wildrice.conted.ox.ac.uk>

OXFORD BIOINFORMATICS PROGRAMME

Interested in studying bioinformatics? We have it all: short courses on specialised subjects, online courses to develop your skills, and part-time courses leading to a Certificate or Masters qualification.

Want to find out more? Then come to the Oxford Bioinformatics Programme online open day:

April 22nd, 12.00 - 16.00, held live at:
http://www.conted.ox.ac.uk/bioinformatics

You will find information on the Programme and its various options for study, sample lectures, virtual tours of teaching centres, live chat sessions with tutors and students, experiences of other students, and course life.

We hope to see you there! Please pass this on to anyone you think might be interested.

From magda_mansour at yahoo.fr Tue Apr 19 04:56:23 2005
From: magda_mansour at yahoo.fr (Magda Mansour)
Date: Tue, 19 Apr 2005 10:56:23 +0200 (CEST)
Subject: [BiO BB] makemat cannot open the profile file?
Message-ID: <20050419085623.79469.qmail@web25707.mail.ukl.yahoo.com>

Hello,

I am a computer engineering student currently working on a bioinformatics project using PSI-BLAST. I am completely new to this domain and I have a small problem with the PSI-BLAST program "makemat".

I first created a profile using blastpgp:

    blastpgp -d dataBase\nr -i files\protein.txt -o files\protein.out -C data\matrix -j 3

Then I used makemat to create an ASCII file from this matrix file:

    makemat -d DataBase\nr -P data\matrix

It gives me the error:

    [NULL_Caption] FATAL ERROR: Unable to open file ||@

I tried constructing the same matrix with different extensions: bin, chk, sn, mn, aux...
It gives me the output:

    [NULL_Caption] FATAL ERROR: Unable to open profiles file data\matrix.ext.pn

I did not find any documentation on the pn extension on the Internet.

However, using this matrix for a new iteration worked properly:

    blastpgp -d dataBase\nr -i files\protein.txt -o files\protein.out -R data\matrix

If somebody can help me on this point, I would be very grateful.

Sincerely yours,
Magda Mansour

From gary at www.bioinformatics.org Tue Apr 19 09:48:02 2005
From: gary at www.bioinformatics.org (Gary Van Domselaar)
Date: Tue, 19 Apr 2005 09:48:02 -0400 (EDT)
Subject: [BiO BB] makemat cannot open the profile file?
In-Reply-To: <20050419085623.79469.qmail@web25707.mail.ukl.yahoo.com>
Message-ID:

Hi Magda,

Although I am no expert on PSI-BLAST, I do recall seeing 'psipred' use the following code with success:

    $ncbidir/blastpgp -b 0 -j 3 -h 0.001 -d $dbname -i psitmp.fasta -C psitmp.chk >& $rootname.blast
    echo psitmp.chk > psitmp.pn
    echo psitmp.fasta > psitmp.sn
    $ncbidir/makemat -P psitmp

so I assume you need a common root name, with '.pn' and '.sn' files that name your matrix (checkpoint) file and FASTA file, respectively, and then call makemat on the root name.

Best of luck,
g.

--
Gary Van Domselaar, Ph.D.
gary at bioinformatics.org

From chea at mail.nih.gov Tue Apr 19 10:22:47 2005
From: chea at mail.nih.gov (Anney Che)
Date: Tue, 19 Apr 2005 10:22:47 -0400
Subject: [BiO BB] Question about selection in evolution
In-Reply-To: <20050419085623.79469.qmail@web25707.mail.ukl.yahoo.com>
Message-ID:

Hi everyone,

Does anyone know which kinds of genes have undergone positive selection the most? Could you also refer me to some related papers on gene evolution?

Thanks,
Anney

Anney Che, M.S.
Biocomputing Specialist
Laboratory of Molecular Microbiology/NIAID
4 Center Drive, Room 301
Bethesda, MD 20892
301-451-2851 (Office)
301-480-2716 (Fax)

From danny at amelang.net Tue Apr 19 17:22:31 2005
From: danny at amelang.net (Daniel Amelang)
Date: Tue, 19 Apr 2005 15:22:31 -0600
Subject: [BiO BB] Question about selection in evolution
In-Reply-To:
References:
Message-ID: <42657697.5040604@amelang.net>

>Does anyone know which kinds of genes have undergone positive selection
>the most?
>Could you also refer me to some related papers on gene evolution?

The other day I was playing around with some software that determines the level of positive/negative selection at the amino acid level for a given gene. Perhaps it would help. It's called TreeSAAP:

http://inbio.byu.edu/faculty/dam83/cdm/

Here's the paper introducing it:

http://bioinformatics.oupjournals.org/cgi/content/abstract/19/5/671

Although it may be more fine-grained than what you need, you could in principle use it to determine whether a gene underwent positive selection. TreeSAAP will tell you where in the sequence the selection occurred and for which amino acid property. Sounds like more than you need :) Hey, it was worth mentioning.

Dan

From idoerg at burnham.org Wed Apr 20 04:11:13 2005
From: idoerg at burnham.org (Iddo Friedberg)
Date: Wed, 20 Apr 2005 01:11:13 -0700
Subject: [BiO BB] One week for abstract submission for the AFP SIG
Message-ID:

{Please pass on!}

The Automated Function Prediction meeting will take place as a Special Interest Group (SIG) satellite meeting of ISMB 2005 in Detroit, MI, USA.

Current speakers include: Michael Sternberg, Imperial College, London; Olivier Lichtarge, Baylor College of Medicine; Russ Altman, Stanford University; Patricia Babbitt, UCSF; and Adam Godzik, The Burnham Institute.

We are also seeking talk abstracts and poster abstracts from researchers in the field of automated function prediction. Abstracts are due April 26, 2005, a week from today.
An abstract submitted to the AFP will appear in the conference proceedings but is not considered a publication, so you may publish your results elsewhere as well. In addition, the journal Protein Science will publish selected talks from the meeting in a special section devoted to the topic of automated function prediction.

Talks and posters are sought in, but not limited to, the following topics:

* Function prediction using sequence-based methods. This would include "classic" methods such as detection of functional motifs and inferring function from sequence similarity.
* Function from genomic information: prediction by genomic location; locus comparison with other organisms; function gain and loss.
* Phylogeny-based methods
* Function from molecular interactions
* Function from structure
* Function prediction using combined methods
* "Meta-talks" discussing the limitations and horizons of computational function prediction.
* Assessing function prediction programs

For more information please see: http://ffas.burnham.org/AFP

Abstract submission instructions and templates can be found at:
http://ffas.burnham.org/AFP/Speakers
http://ffas.burnham.org/AFP/posters

We are looking forward to seeing you in June.

Iddo Friedberg, in the name of the AFP organizing committee.

--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037, USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://ffas.ljcrf.edu/~iddo
-------------------------------------
Automated Protein Function Prediction Meeting, June 24, 2005
http://ffas.burnham.org/AFP

From dmb at mrc-dunn.cam.ac.uk Wed Apr 20 04:56:35 2005
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Wed, 20 Apr 2005 09:56:35 +0100 (BST)
Subject: [BiO BB] Question about selection in evolution
In-Reply-To: <42657697.5040604@amelang.net>
Message-ID:

On Tue, 19 Apr 2005, Daniel Amelang wrote:

>TreeSAAP will tell you where in the sequence the selection
>occurred and for which amino acid property. Sounds like more than you
>need :) Hey, it was worth mentioning.
>
>Dan

Hey! We need another Dan to form "The Three Dans"!

Try googling for 'codon volatility'. It is one proposed measure of selection pressure.

From landman at scalableinformatics.com Wed Apr 20 09:25:51 2005
From: landman at scalableinformatics.com (Joe Landman)
Date: Wed, 20 Apr 2005 09:25:51 -0400
Subject: [BiO BB] General question on time consuming problems
Message-ID: <4266585F.10200@scalableinformatics.com>

Hi folks:

Sorry for the "spam"; I will try to keep this short and simple. Basically, these are some questions on the needs of computational folks. We are looking at what computational needs exist today and what people think they will need tomorrow, and of course clusters and their ilk are a major factor in this.

What we would like to hear about (either on-list or off-list) is which computational bottlenecks exist in your processes today, and what you perceive as rate-limiting factors for the future.
As usual, we have our own particular biases, but we really want to hear from folks working in academic/industrial research and development, biotech/pharma, ... .

Are computational bottlenecks the major problem you are running into today? What do you see as the rate-limiting factors in the future? If you had an "infinitely fast" cluster (like a Blue Gene from IBM), how would it impact your work/processes?

Our goal is to broadly sound out the community and see what people need. No salespeople will call. This is mostly about making sure we are barking up the right trees and are not off in left field in terms of our performance focus. We figured that the biocluster/biobb lists are good places to ask these questions, as most folks who are running/using clusters are doing so because they need the extra cycles that clusters offer.

Thanks in advance!

Joe

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615

From mike.fursov at gmail.com Wed Apr 20 09:38:20 2005
From: mike.fursov at gmail.com (Mikhail Fursov)
Date: Wed, 20 Apr 2005 20:38:20 +0700
Subject: [BiO BB] extract regions from chomosome
In-Reply-To:
References: <49284.192.168.220.204.1113475489.squirrel@webmail.charite.de>
Message-ID:

BTW, is this problem still relevant? I think I could write a simple plugin that extracts a series of subsequences from a given file, just as a programming exercise.

AFAIK each human chromosome is represented as a set of contigs. Do you have per-contig coordinates? What does your list look like?

Mikhail Fursov.

From christoph.gille at charite.de Wed Apr 20 10:28:58 2005
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Wed, 20 Apr 2005 16:28:58 +0200 (CEST)
Subject: [BiO BB] extract regions from chomosome
In-Reply-To:
References: <49284.192.168.220.204.1113475489.squirrel@webmail.charite.de>
Message-ID: <34326.192.168.220.204.1114007338.squirrel@webmail.charite.de>

Mikhail,

This is really very kind. We found a solution ourselves: with the web browser lynx you can supply POST data on the command line, so we can query the Ensembl server.

Thanks for your kind offer.

Christoph

From tjrc at sanger.ac.uk Wed Apr 20 10:27:28 2005
From: tjrc at sanger.ac.uk (Tim Cutts)
Date: Wed, 20 Apr 2005 15:27:28 +0100
Subject: [BiO BB] Re: [Bioclusters] General question on time consuming problems
In-Reply-To: <4266585F.10200@scalableinformatics.com>
References: <4266585F.10200@scalableinformatics.com>
Message-ID: <8635b5e35f31ed96e9364ba53cbe4e7f@sanger.ac.uk>

On 20 Apr 2005, at 2:25 pm, Joe Landman wrote:

> Hi folks:
>
> Are computational bottlenecks the major problem you are running into
> today? What do you see as the rate-limiting factors in the future?
> If you had an "infinitely fast" cluster (like a Blue Gene from IBM),
> how would it impact your work/processes?

The major bottlenecks are based around IO. We have plenty of CPU grunt. Scalable databases and parallel filesystems are what we need to sort out now. It's no use having infinite amounts of CPU power if you have to force all the output through a very tiny pipe.
A lot of this can be solved by programming expertise, but most scientists aren't interested in coding for scalability, they're only interested in quickly producing something which produces "the right answer", whatever that means. Having scalable filesystems and databases would allow them to carry on coding in their current less-than-perfect ways and still maintain some half-decent performance. Tim -- Dr Tim Cutts Informatics Systems Group, Wellcome Trust Sanger Institute GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5 860B 3CDD 3F56 E313 4233 From rcamacho at fe.up.pt Thu Apr 21 10:19:45 2005 From: rcamacho at fe.up.pt (rcamacho at fe.up.pt) Date: Thu, 21 Apr 2005 15:19:45 +0100 Subject: [BiO BB] CFP - Workshop on Computational Methods in Bioinformatics Message-ID: <20050421151945.1iahl7ocpsk40o08@webmail.fe.up.pt> Workshop on Computational Methods in Bioinformatics cmb.epia05 at di.ubi.pt as part of the 12th Portuguese Conference on Artificial Intelligence Covilha, Portugal, 5-8 December 2005 http://epia05.di.ubi.pt email: epia05 at di.ubi.pt =============== Call for Papers =============== The success of bioinformatics in recent years has been prompted by research in molecular biology and molecular medicine in initiatives like the human genome project. These initiatives gave rise to an exponential increase in the volume and diversification of data, including protein and gene data, nucleotide sequences and biomedical literature. The accumulation and exploitation of large-scale data bases prompts for new computational technology and for research into these issues. In this context, many widely successful computational models and tools used by biologists in these initiatives, such as clustering and classification methods for gene expression data, are based on artificial intelligence (AI) techniques. 
Hence, this workshop brings the opportunity to discuss applications of AI with an interdisciplinary character, exploring the interactions between sub-areas of AI and Bioinformatics. Topics and Technical issues --------------------------- Papers should deal with bio-medical data sets. Computer science and mathematical modelling papers must contain a concise description of the biological problem being solved, and biology papers should show how computation or analysis improves the results. Biological areas of interest include, but are not limited to: - sequence analysis, - comparison and alignment methods; - motif, gene and signal recognition; - molecular evolution; - phylogenetics and phylogenomics; - determination or prediction of the structure of RNA and protein in two and three dimensions; - DNA twisting and folding; - gene expression and gene regulatory networks; - deduction of metabolic pathways; - microarray design and analysis; - proteomics; - functional genomics; - molecular docking and drug design; - computational problems in genetics such as linkage and QTL analysis, linkage disequilibrium analysis in populations, and haplotype determination; - systems biology. Computational areas of interest include, but are not limited to: - Knowledge Discovering techniques and Data Mining; - Text Mining and Language Processing; - Machine Learning and Pattern Recognition; - Rough, Fuzzy and Hybrid Techniques; - Hidden Markov Models; - Bayesian Approaches; - Artificial Neural Networks; - Support Vector Machines; - Evolutionary Computing; - Non-linear dynamical analysis methods and Intelligent signal processing Sponsors -------- The 12th Portuguese Conference on Artificial Intelligence is sponsored by APPIA, the Portuguese Association of AI (http://www.appia.pt/). Goals and Intended Audience --------------------------- The workshop's aim is to discuss computational aspects and challenges of important biological problems and, in doing so, identify new research opportunities. 
A second goal is to establish a forum that will bring researchers in Artificial Intelligence face to face with researchers in bioinformatics. The hope is that this workshop will provide a forum where people from the two communities can come together to exchange ideas and discuss different approaches. Proceedings ___________ Authors may submit full papers with a maximum of 12 pages or short papers with a maximum of 4 pages. Both kinds of papers may report on innovative work, work reported elsewhere, ongoing research or student projects. Only innovative full papers are candidates for publication in the Springer proceedings as specified below. All papers must follow the Springer instructions for authors as indicated at http://www.springer.de/comp/lncs/authors.html/ . A selection of the accepted full workshop papers will be published by Springer in the Lecture Notes in Artificial Intelligence sub-series of LNCS. All accepted papers not published in the Springer proceedings will be published by UBI in the Local Workshop Proceedings, in hard copy, on CD-ROM and on the web. To be a candidate for publication in the main proceedings the following rules must be met: * Submissions must be full technical papers on substantial, original, and previously unpublished research. * Submissions must be formatted according to the Springer LNCS Series guidelines, and must not exceed 12 pages. Each selected paper will be allowed 12 pages in the Proceedings. Up to two additional pages may be purchased by the authors. At least one of the authors must be registered at the conference for the paper to be published in the proceedings. Posters ------- The workshop will have a poster session running during the coffee breaks. Posters are intended to report on work in progress, student projects, open problems and research issues, as well as new application challenges.
Important Dates
_______________

Here are some important dates:

Conference begins       December 5th, 2005
Paper submission        May 27th, 2005
Author notification     July 15th, 2005
Camera-ready copies     July 28th, 2005

Submission and reviewing Information
-----------------------------------------------

The workshop will follow a blind reviewing policy. In order to make blind reviewing possible, submissions must be anonymous. This requires that authors exercise some care not to identify themselves in their contributions. Authors should omit their names and affiliations from the paper. Also, while the references should include all published literature relevant to the paper, including previous works of the authors, they should not include unpublished works. When referring to one's own work, authors must use the third person rather than the first person.

To submit a paper, the author fills in an electronic form with the author and paper information and then uploads the paper in PDF format. All papers (short or full papers) will follow the same reviewing process. Submission is made through the submission page available on the EPIA 2005 website http://epia05.di.ubi.pt .

Organization
____________

Program Chairs

Rui Camacho
LIACC and FEUP, Universidade do Porto, Portugal
rcamacho at fe.up.pt

Alexessander Alves
LIACC, Universidade do Porto, Portugal
alves at ieee.org

Joaquim Pinto da Costa
LIACC and FCUP, Universidade do Porto, Portugal
jpcosta at fc.up.pt

Paulo Azevedo
Departamento de Informatica, Universidade do Minho, Portugal
pja at di.uminho.pt

Program Committee

P. Alexandrino (Portugal)    D. Gilbert (Scotland, UK)
A. Alves (Portugal)          A. Jorge (Portugal)
P. Azevedo (Portugal)        R. King (Wales, UK)
P. Bourne (USA)              I. C. Lerman (France)
V. Brusic (Singapore)        M. P. Monteiro (Portugal)
C. Bystroff (USA)            M. Sagot (France)
R. Camacho (Portugal)        T. Scheffer (Germany)
A. Carvalho (Brasil)         S. Schulze-Kremer (Germany)
J. P. Costa (Portugal)       A. Srinivasan (India)
V. Costa (Brasil)            A. Valencia (Spain)
I. Dutra (Brasil)            J. Vieira (Portugal)
L. Dehaspe (Belgium)         M. Zaki (USA)

From aa056 at chebucto.ns.ca Wed Apr 20 14:22:39 2005
From: aa056 at chebucto.ns.ca (George White)
Date: Wed, 20 Apr 2005 15:22:39 -0300 (ADT)
Subject: [BiO BB] Re: [Bioclusters] General question on time consuming problems
In-Reply-To: <4266585F.10200@scalableinformatics.com>
Message-ID:

I work in remote sensing. Typical processing is a pipeline where raw data from a satellite is processed in chunks to get to some useful product. This falls into the embarrassingly parallel category, but new satellites have >2x more bits in each channel, >2x more channels, and >2x more pixels, so the size of the chunks is >8x bigger, which means you are looking at bigger storage, faster pipes, and more clock time to get processing done.

10 years ago a 512x512 image was about all we could handle. Now we are looking at 4kx8k images, so a couple orders of magnitude larger. Our processors have gone from 30 to 3000 MHz, but processing time per pixel has only decreased by a factor of 10 as I/O becomes the bottleneck. People are looking at ways to transmit data in compressed form, but then you need to find ways to quickly get at a specific small region (e.g., to look at a time series of observations where there is ground truth).

The other problem is that many real-world clusters are lucky to get 50% uptime. The one down the hall was fried when the A/C died. They fixed all that, took a couple weeks to get a new A/C installed, and then a cable to the RAID stopped working, so now they have to get the cable and hope the files weren't damaged. You hear the success stories from people who have been lucky with A/C hardware, etc., but there are also lots of cluster owners who are swamped by the upkeep and/or poorly maintained physical plant (power problems, A/C, etc.).

--
George White
189 Parklea Dr., Head of St.
Margarets Bay, Nova Scotia B3Z 2G6 From jakechen at iupui.edu Thu Apr 21 17:30:40 2005 From: jakechen at iupui.edu (Chen, Jake) Date: Thu, 21 Apr 2005 16:30:40 -0500 Subject: [BiO BB] CFP: ISPA '05 International Workshop on Bioinformatics at Nankin, China Nov 2-5 2005 Message-ID: <9BA28484EAC4B843B4EA7834449D6FFE025FD8A8@iu-mssg-mbx05.exchange.iu.edu> ISPA'05 International Workshop on Bioinformatics 2005 ====================================================== Date: November 2-5, 2005 Place: Nanjing University, Nanjing, P.R. China Paper submission Deadline: May 6th, 2005 Conference Web Site: =================== http://www.cs.iupui.edu/~bioin/bioinformatics05/ Scope: ====== Colorful analogies to catastrophic events, including floods, avalanches, tidal waves, and even explosions, have often been used to describe the overwhelming nature of high-throughput, biological data. The deluge of data shows no sign of abating, particularly as new technologies appear (protein chips) and established technologies are improved (mass spectrometers, DNA micro-arrays) or re- implemented on industrial scales. Finding new ways to integrate, manage, visualize, and interpret data from diverse sources is one of the grand challenges for modern bioinformatics, which merges information technology, computer science, statistics, applied mathematics, and biology. 
Topics of interest to this workshop include, but are not limited to:

* Protein-protein interactions
* mRNA and protein expression
* Functional annotation of genes and proteins
* Ontological classifications
* Signaling and regulatory pathways
* Biology-specific knowledge representation
* Biological data preparation and cleansing
* High-throughput experimental data monitoring and tracking
* Sequence- or structure-based data analysis
* Knowledge curation, annotation, and reporting
* Use of natural language processing techniques and/or artificial intelligence techniques to automatically extract multiple biological objects such as gene names, protein names, drugs, organisms, disease, etc., from free-text.
* Information and knowledge extraction such as object-object interactions (ex: protein interactions, functions, etc.).
* Software systems to support biological research that integrates multi-format and multi-type data from heterogeneous databases.
* Information visualization techniques for biological networks and integrated biological systems.
* Application of machine learning in the mining of very large dimensional data such as microarray and mass-spectrometry data.
* Computational methods that model cellular mechanisms, the protein machine, pathways, and regulatory networks.
* Algorithms for processing and interpreting large-scale mass-spectrometry data
* Comparative genomics and genome dynamics (i.e., evolution of whole genomes, e.g., by translocations, reversals, duplications, etc.)
* Modeling of small molecule ligand binding to proteins.
* An informatics technique, strategy, and tool that combine multiple types of data.
* Investigation of genome, transcriptome, proteome, or metabolome data, using multiple computational techniques, strategies, and tools.
* High-performance systems engineering ideas and strategies using concepts learned from bioinformatics.
* Significant biological discoveries using a consistent suite of investigative tools.
* Other novel bioinformatics topics in life sciences may also be considered, as long as the topics contribute to the expansion of a system-scale understanding of biological processes and/or the creation of practical informatics solutions for real-world life science problems.

Important Dates:
================
Paper submission due:      April 30, 2005
Acceptance notification:   July 1, 2005
Camera-ready due:          July 30, 2005

Conference:
===========
November 2-5, 2005

Instructions for Paper Submission:
==================================
All papers should represent original and previously unpublished works that are currently not under review in any conference or journal. Both basic and applied research papers are welcome.

Submissions should include an abstract, key words, the e-mail address of the corresponding author, and must not exceed 15 pages, including tables and figures, in PDF format. Please send your submission to: bioin at cs.iupui.edu .

All enquiries and questions should be directed to the Workshop Chairs. Additional details are available at the Workshop home page at http://www.cs.iupui.edu/~bioin/bioinformatics05/ and at the conference home page at http://keysoftlab.nju.edu.cn/ispa2005/

Workshop Chairs:
===============
Mathew Palakal
Department of Computer & Information Science
Indiana University Purdue University Indianapolis , USA
mpalakal at cs.iupui.edu

Jake Chen
School of Informatics/Computer Science
Indiana University Purdue University Indianapolis , USA
jakechen at iupui.edu

Zongben Xu
Xian Jiaotong University, China
zbxu at mail.xjtu.edu.cn

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From pfern at igc.gulbenkian.pt Fri Apr 22 03:00:46 2005 From: pfern at igc.gulbenkian.pt (Pedro Fernandes) Date: Fri, 22 Apr 2005 08:00:46 +0100 (WEST) Subject: [BiO BB] CFP: ISPA '05 International Workshop on Bioinformatics at Nankin, China Nov 2-5 2005 In-Reply-To: <9BA28484EAC4B843B4EA7834449D6FFE025FD8A8@iu-mssg-mbx05.exchange.iu.edu> References: <9BA28484EAC4B843B4EA7834449D6FFE025FD8A8@iu-mssg-mbx05.exchange.iu.edu> Message-ID: <2974.83.132.110.159.1114153246.squirrel@webmail.igc.gulbenkian.pt> Chen, Jake said: > ISPA'05 International Workshop on Bioinformatics 2005 > [...] From tjrc at sanger.ac.uk Fri Apr 22 05:34:32 2005 From: tjrc at sanger.ac.uk (Tim Cutts) Date: Fri, 22 Apr 2005 10:34:32 +0100 Subject: [BiO BB] Re: [Bioclusters] General question on time consuming problems In-Reply-To: References: Message-ID: On 20 Apr 2005, at 7:22 pm, George White wrote: > The other problem is that many the real-world clusters are lucky to get > 50% uptime. The one down the hall was fried when the A/C died. They > fixed all that, took a couple weeks to get a new A/C installed, and > then a > cable to the RAID stopped working, so now they have to get the cable > and > hope the files weren't damaged. You hear the success stories from > people > who have been lucky with A/C hardware, etc., but there are also lots of > cluster owners who are swamped by the upkeep and or poorly maintained > physical plant (power problems, A/C, etc.). But then, as you say, if your problem is really embarrassingly parallel, and you code it right, losing a few nodes here and there isn't a problem. One of the nice things about embarrassingly parallel problems is that they tend to allow for gradual loss of capacity.
It's quite useful for us; it allows us to wait for a number of nodes to fail before we batch them up and send them back for repair. This saves a lot of money in support costs, as well as effort on our part. Tim -- Dr Tim Cutts Informatics Systems Group, Wellcome Trust Sanger Institute GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5 860B 3CDD 3F56 E313 4233 From mgollery at unr.edu Fri Apr 22 15:12:19 2005 From: mgollery at unr.edu (Martin Gollery) Date: Fri, 22 Apr 2005 12:12:19 -0700 Subject: [BiO BB] Harvester or Genecard In-Reply-To: References: Message-ID: <42694C93.3080103@unr.edu> Generally, it will be much cheaper to buy a system than to reinvent the wheel. This will allow you to focus on your core competency. Xennex is available online, so you should be able to have a look at it before buying. It is an excellent way to pull a vast amount of data about your genes of interest in a very short period of time. Harvester is more of an analysis pipeline system to automatically clean, mask, cluster and assemble EST data. If you have a lot of ESTs to work through, go with Harvester, or perhaps 'Magic' from the Pratt lab in Georgia. Regards, Marty alain M. wrote: > Dear colleague, > > I currently work in a biology lab in Louvain (Belgium) and I'm involved > in the creation of a biotech company. > We will develop an HTS platform to identify putative targets in the > field of cancer and CNS diseases. > > Bioinformatics will be a key tool to perform such analyses and we have > identified two integrated databases that may be useful for us: Genecard > from Xennex inc. and Harvester from Biomax Informatics AG. Another > possibility is to build our own integrated database in-house. > What is your opinion about these two systems? Do you have any information > about the price for a small biotech? Are the private versions enhanced > with other functionalities compared to the online free academic versions?
> > thanks for your help, > > Best regards, > > A M > > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board -- Martin Gollery Associate Director Center For Bioinformatics University of Nevada at Reno Dept. of Biochemistry / MS330 775-784-7042 ----------- Marty's Law- the number of Bioinformatics acronyms will double every 18 months. From chea at mail.nih.gov Mon Apr 25 15:07:14 2005 From: chea at mail.nih.gov (Anney Che) Date: Mon, 25 Apr 2005 15:07:14 -0400 Subject: [BiO BB] Re: [Bioclusters] General question on time consuming problems In-Reply-To: Message-ID: Hi, Does anyone know if there is a program that will find a specific repeat in a DNA sequence and highlight the repeats in the sequence? Something like the Find function in MS Word, but Word will not find the pattern if the pattern is interrupted by a tab or newline (\n). Thanks, Anney Anney Che, M.S. Biocomputing Specialist Laboratory of Molecular Microbiology/NIAID 4 Center Drive, Room 301 Bethesda, MD 20892 301-451-2851 (Office) 301-480-2716 (Fax) From mike.fursov at gmail.com Mon Apr 25 16:00:43 2005 From: mike.fursov at gmail.com (Mikhail Fursov) Date: Tue, 26 Apr 2005 03:00:43 +0700 Subject: [BiO BB] Re: [Bioclusters] General question on time consuming problems In-Reply-To: References: Message-ID: We use GenomeBrowser (http://genome.unipro.ru). It is able to find, visualize and store as sequence annotations both exact and inexact patterns in DNA or its translations. I wrote some open-source plugins for this tool (http://genome.sf.net). You can look for HMM signals using the "HMM Seach" plugin by preparing an initial model from a protein alignment. Mikhail Fursov. On 4/26/05, Anney Che wrote: > Hi, > > Does anyone know if there is a program that will find a specific repeat in a > DNA sequence and highly the repeats in the sequence?
> > Similar to find function in MS Work but it will not find the pattern if the > pattern is interrupted by a tab or newline (\n). > > Thanks, > > Anney > > > Anney Che, M.S. > Biocomputing Specialist > Laboratory of Molecular Microbiology/NIAID > 4 Center Drive, Room 301 > Bethesda, MD 20892 > 301-451-2851 (Office) > 301-480-2716 (Fax) > > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > From mike.fursov at gmail.com Mon Apr 25 16:14:16 2005 From: mike.fursov at gmail.com (Mikhail Fursov) Date: Tue, 26 Apr 2005 03:14:16 +0700 Subject: [BiO BB] Re: [Bioclusters] General question on time consuming problems In-Reply-To: References: Message-ID: Moreover, I want to base my PhD on string comparison algorithms and similarity searches, so someday I will have to write a tool or a plugin that looks for repeats and visualizes them. If simple repeats in a genome are all you need, I can write such a tool within the next week. (Use my email instead of the board.) Mikhail Fursov. On 4/22/05, Tim Cutts wrote: > > On 20 Apr 2005, at 7:22 pm, George White wrote: > > > The other problem is that many the real-world clusters are lucky to get > > 50% uptime. The one down the hall was fried when the A/C died. They > > fixed all that, took a couple weeks to get a new A/C installed, and > > then a > > cable to the RAID stopped working, so now they have to get the cable > > and > > hope the files weren't damaged. You hear the success stories from > > people > > who have been lucky with A/C hardware, etc., but there are also lots of > > cluster owners who are swamped by the upkeep and or poorly maintained > > physical plant (power problems, A/C, etc.). > > But then, as you say, if your problem is really embarrassingly > parallel, and you code it right, losing a few nodes here and there > isn't a problem.
One of the nice things about embarrassingly parallel > problems is that they tend to allow for gradual loss of capacity. It's > quite useful for us; it allows us to wait for a number of nodes to fail > before we batch them up and send them back for repair. This saves a > lot of money in support costs, as well as effort on our part. > > Tim > > -- > Dr Tim Cutts > Informatics Systems Group, Wellcome Trust Sanger Institute > GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5 860B 3CDD 3F56 E313 4233 > > _______________________________________________ > Bioinformatics.Org general forum - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > From grunske at itee.uq.edu.au Tue Apr 26 01:04:48 2005 From: grunske at itee.uq.edu.au (Lars Grunske) Date: Tue, 26 Apr 2005 15:04:48 +1000 Subject: [BiO BB] Systems Engineering/ Test and Evaluation Conference (SETE 2005) Message-ID: <200504260504.j3Q54lGe018298@luma.itee.uq.edu.au> SETE 2005 CALL FOR PAPERS, PRESENTATIONS, TUTORIALS, EXHIBITORS & SPONSORS www.iceaustralia.com/sete2005/ Venue ----- Monday 7th to Wednesday 9th November 2005 Carlton Crest Hotel, Brisbane, Queensland, Australia A Conference of The Systems Engineering Society of Australia (SESA) www.sesa.org.au The Southern Cross Chapter of The International Test and Evaluation Association (ITEA) www.itea.org The International Council on Systems Engineering (INCOSE) Region IV (www.incose.org) Theme ----- The SETE 2005 theme is "A Decade of Growth and Beyond." in recognition of the 10th anniversary of SESA. Previous SETE conferences have addressed a range of themes concerning complexity and Systems thinking. The conference program will include a combination of keynote speakers, plenary sessions, panels, papers, tutorials, workshops and trade exhibits. 
Possible topics include, but are not limited to: Systems engineering applications Soft systems methodologies Systems of systems Promoting systems engineering Measurement Modelling and tools Systems engineering management Processes and methods Software engineering Education and training Experiments, exercises and trials Infrastructure planning & support Research and education in priorities Prediction tools Observation tools Operations analysis Networking and connectivity Collaborations and alliances Requirements engineering Measures of effectiveness (MOEs) T&E visions of the future Modelling & simulation for T&E T&E for systems of systems Planning and management of T&E T&E investment planning Virtual T&E Design of experiments Error budgeting Value added by T&E Legal aspects of T&E Papers and presentations, particularly case studies, relating to this theme as well as to broader areas of interest to SESA and ITEA, are invited from industry and academia. Papers and presentations are also invited from both graduate and undergraduate students in the fields of Systems Engineering and Test and Evaluation (T&E). Successful papers and presentations will compete for the ARC Centre for Complex Systems (ACCS) award for the best student paper. Refereed Papers --------------- This category is primarily designed for academics who need to meet Australian Government requirements for published papers. Refereed papers will be reviewed anonymously by at least two peer reviewers. If needed, comments will be provided to authors prior to submission of final papers. Guidance for authors of refereed papers will shortly be made available on the conference web site. Non-refereed Presentations -------------------------- This category is intended to facilitate contributions from industry practitioners who wish to communicate their experiences and ideas but who do not need to meet formal academic requirements. 
Presentation abstracts will be reviewed internally to ensure compatibility with the conference theme. Presentations are not required in a given format; however, use of MS PowerPoint is encouraged.

Presentations and papers accepted for the conference will be published in the conference proceedings to be provided to all participants on CD-ROM. Abstracts of all presentations and papers will be published as part of the conference program booklet to be issued at the commencement of the conference.

Call for Tutorials, Exhibitors, & Sponsors
------------------------------------------
A tutorial program on the first day of the conference is planned. Expressions of interest are invited from companies or individuals willing to deliver a half-day tutorial on a subject related to the conference theme.

Expressions of interest are sought from potential exhibitors. SETE 2005 will include both static exhibitions and presentations by vendors.

Expressions of interest are also sought from potential sponsors. SETE 2005 sponsorship opportunities will include 'gold' and 'silver' levels of support. The conference dinner, luncheons, morning/afternoon teas and conference items will all be available to sponsors. Each 'gold' sponsor is entitled to a presentation opportunity during a plenary session.
Key Dates
---------
Submission of 150-word abstracts for refereed papers        01 June 2005
Submission of 150-word abstracts for non-refereed papers    15 June 2005
Submission of refereed papers for review                    15 June 2005
Submission of tutorial proposals                            15 June 2005
Notification and comments to refereed paper authors         05 August 2005
Submission of final papers                                  02 September 2005

Enquiries
---------
Please direct all enquiries about papers, presentations and tutorials to the SETE 2005 Program Chair, Professor Peter Lindsay, University of Queensland at sete2005progchair at itee.uq.edu.au

Please submit expressions of interest from potential sponsors and exhibitors to the SETE 2005 Conference Chair, Mr Dennis Sommers of Queensland Transport at Dennis.Z.Sommers at transport.qld.gov.au

--------------------------------------------------------
Dr. rer. nat. Lars Grunske
Boeing Postdoctoral Research Fellow, School of ITEE
University of Queensland, St Lucia, Brisbane, Australia
Telephone: +61 7 3365 1648
Webpage: http://www.itee.uq.edu.au/~grunske/

From jakechen at iupui.edu Thu Apr 28 21:09:32 2005
From: jakechen at iupui.edu (Chen, Jake)
Date: Thu, 28 Apr 2005 20:09:32 -0500
Subject: [BiO BB] Last Call: ISPA 05 International Workshop on Bioinformatics Nanjing, China Nov. 2-5, 2005
Message-ID: <9BA28484EAC4B843B4EA7834449D6FFE0270D738@iu-mssg-mbx05.exchange.iu.edu>

ISPA 05 International Workshop on Bioinformatics
Nanjing, China
Nov. 2-5, 2005

Conference Web Site:
===================
http://www.cs.iupui.edu/~bioin/bioinformatics05/

**Papers will be published in Lecture Notes in Computer Science (LNCS) and indexed in SCI.

Scope:
======
Colorful analogies to catastrophic events, including floods, avalanches, tidal waves, and even explosions, have often been used to describe the overwhelming nature of high-throughput, biological data.
The deluge of data shows no sign of abating, particularly as new technologies appear (protein chips) and established technologies are improved (mass spectrometers, DNA micro-arrays) or re-implemented on industrial scales. Finding new ways to integrate, manage, visualize, and interpret data from diverse sources is one of the grand challenges for modern bioinformatics, which merges information technology, computer science, statistics, applied mathematics, and biology.

Topics of interest to this workshop include, but are not limited to:

* Protein-protein interactions
* mRNA and protein expression
* Functional annotation of genes and proteins
* Ontological classifications
* Signaling and regulatory pathways
* Biology-specific knowledge representation
* Biological data preparation and cleansing
* High-throughput experimental data monitoring and tracking
* Sequence- or structure-based data analysis
* Knowledge curation, annotation, and reporting
* Use of natural language processing techniques and/or artificial intelligence techniques to automatically extract multiple biological objects such as gene names, protein names, drugs, organisms, disease, etc., from free-text.
* Information and knowledge extraction such as object-object interactions (ex: protein interactions, functions, etc.).
* Software systems to support biological research that integrates multi-format and multi-type data from heterogeneous databases.
* Information visualization techniques for biological networks and integrated biological systems.
* Application of machine learning in the mining of very large dimensional data such as microarray and mass-spectrometry data.
* Computational methods that model cellular mechanisms, the protein machine, pathways, and regulatory networks.
* Algorithms for processing and interpreting large-scale mass-spectrometry data
* Comparative genomics and genome dynamics (i.e., evolution of whole genomes, e.g., by translocations, reversals, duplications, etc.)
* Modeling of small molecule ligand binding to proteins.
* An informatics technique, strategy, and tool that combine multiple types of data.
* Investigation of genome, transcriptome, proteome, or metabolome data, using multiple computational techniques, strategies, and tools.
* High-performance systems engineering ideas and strategies using concepts learned from bioinformatics.
* Significant biological discoveries using a consistent suite of investigative tools.
* Other novel bioinformatics topics in life sciences may also be considered, as long as the topics contribute to the expansion of a system-scale understanding of biological processes and/or the creation of practical informatics solutions for real-world life science problems.

Important Dates:
================
Paper submission due: May 6, 2005
Acceptance notification: July 1, 2005
Camera-ready due: July 30, 2005

Conference:
===========
November 2-5, 2005

Instructions for Paper Submission:
==================================
All papers should represent original and previously unpublished works that are currently not under review in any conference or journal. Both basic and applied research papers are welcome. Submissions should include an abstract, key words, the e-mail address of the corresponding author, and must not exceed 15 pages, including tables and figures, in PDF format. Please send your submission to: bioin at cs.iupui.edu . All enquiries and questions should be directed to the Workshop Chairs.
Additional details are available at the Workshop home page at http://www.cs.iupui.edu/~bioin/bioinformatics05/ and at the conference home page at http://keysoftlab.nju.edu.cn/ispa2005/

Workshop Chairs:
===============
Mathew Palakal
Department of Computer & Information Science
Indiana University Purdue University Indianapolis, USA
mpalakal at cs.iupui.edu

Jake Chen
School of Informatics/Computer Science
Indiana University Purdue University Indianapolis, USA
jakechen at iupui.edu

Zongben Xu
Xian Jiaotong University, China
zbxu at mail.xjtu.edu.cn

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rwang at bccrc.ca Fri Apr 29 14:29:20 2005
From: rwang at bccrc.ca (Renxue Wang)
Date: Fri, 29 Apr 2005 11:29:20 -0700
Subject: [BiO BB] could not get annotation for DBSOURCE in GenBank
Message-ID: <0BE438149FF2254DB4199E2682C8DFEB73B6A7@crcmail1.BCCRC.CA>

Hi there,

I am working on a project that needs to extract "DBSOURCE" from GenBank records. I used Bio::AnnotationCollectionI for that, but somehow I cannot get "DBSOURCE" at all. When I tried to print out $key from my script, only "comment", "reference" and "origin" showed up. I tried many different ways but could not find "DBlink" or "DBSOURCE" anywhere. Does anybody have an idea what is going on?

Thanks a lot.
Renxue

Here is my code,
________________________________________________________
#!/usr/bin/perl -w
use strict;
use vars qw($USAGE);
use Getopt::Long;
use Bio::SeqIO;    # for importing sequences and finding annotations
use Bio::AnnotationCollectionI;

$USAGE = "./test.pl [--help] [--notstrict] [--format seqformat] --input infile --output outfile\n";

my ($input_pt, $sequenceformat, $notstrict, $printwidth, $output_nt, $help) =
    (undef, 'genbank', undef, 50, undef, undef);

&GetOptions('input|i=s'   => \$input_pt,
            'output|o=s'  => \$output_nt,
            'format|f=s'  => \$sequenceformat,
            'notstrict|n' => \$notstrict,
            'help|h'      => \$help,
            );

# Print the usage message if help is requested or a file name is missing.
die $USAGE if $help or !defined $input_pt or !defined $output_nt;

# Use the --format value (default 'genbank') rather than a hard-coded format.
my $seq_in  = Bio::SeqIO->new('-file' => "<$input_pt",  '-format' => $sequenceformat);
my $seq_out = Bio::SeqIO->new('-file' => ">$output_nt", '-format' => 'fasta');

# List every annotation key the parser stored for each record.
while (my $inseq = $seq_in->next_seq) {
    my $annotation = $inseq->annotation;
    for my $key ( $annotation->get_all_annotation_keys ) {
        print $key, "\n";
    }
}
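
A possible fallback for the DBSOURCE question, offered only as a sketch: if the annotation collection returned by the installed BioPerl version does not contain the field, it can be read straight from the flat file. The pure-Perl helper below assumes the usual GenBank/GenPept layout, where a field keyword starts in column 1 and continuation lines are indented. The name extract_dbsource is hypothetical (it is not a BioPerl function), and the sample record is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the DBSOURCE text of one
# GenBank/GenPept flat-file record, or undef if the record has no such field.
sub extract_dbsource {
    my ($record) = @_;
    my @parts;
    my $in_field = 0;
    for my $line (split /\n/, $record) {
        if ($line =~ /^DBSOURCE\s+(.*\S)\s*$/) {
            # Start of the field: keep the text after the keyword.
            push @parts, $1;
            $in_field = 1;
        }
        elsif ($in_field && $line =~ /^\s+(.*\S)\s*$/) {
            # Indented continuation line: append it to the field.
            push @parts, $1;
        }
        elsif ($in_field) {
            # A line starting in column 1 (or an empty line) ends the field.
            last;
        }
    }
    return @parts ? join(' ', @parts) : undef;
}

# Made-up sample record, used only for illustration.
my $sample = <<'END';
LOCUS       EXAMPLE_1                 10 aa            linear   UNA
DEFINITION  hypothetical example protein.
ACCESSION   EXAMPLE_1
DBSOURCE    REFSEQ: accession EX_000001.1
KEYWORDS    .
ORIGIN
        1 mkvlaagtcw
//
END

print extract_dbsource($sample), "\n";
```

For a file holding many records, one option is to set the input record separator to "//\n" so that each read returns one whole record, and to call the helper on each record in turn.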