From MAG at Stowers-Institute.org Wed Dec 1 16:01:58 2004
From: MAG at Stowers-Institute.org (Goel, Manisha)
Date: Wed, 1 Dec 2004 15:01:58 -0600
Subject: [BiO BB] Extracting blocks of seq alignment
Message-ID: <20041201210203.5F705D1F00@www.bioinformatics.org>

Hi All,

Thanks for the various suggestions I received on this problem. They were really helpful.

Regards,
-Manisha

> Hi All,
>
> I have been looking for an alignment editor or script that will extract
> blocks of a multiple sequence alignment (when given the residue numbers
> etc.) from a longer multiple sequence alignment. Any suggestions will
> be helpful.
>
> Thanks,
> - Manisha Goel

_______________________________________________
BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bio_bulletin_board

From letondal at pasteur.fr Fri Dec 3 08:58:55 2004
From: letondal at pasteur.fr (Catherine Letondal)
Date: Fri, 3 Dec 2004 14:58:55 +0100
Subject: [BiO BB] Looking for a sequence cleaner
Message-ID: <78FE8E38-4533-11D9-B977-000D93B0BD32@pasteur.fr>

Hi,

This may seem strange, but I can't find any online tool for cleaning sequences (I mean removing extraneous characters such as positions, etc., in order to get, at least, a correct raw sequence). The only tools I could find are gap or vector cleaners.
Thanks in advance,
--
Catherine Letondal

From tove.airola at medsci.uu.se Fri Dec 3 09:05:43 2004
From: tove.airola at medsci.uu.se (Tove Airola)
Date: Fri, 03 Dec 2004 15:05:43 +0100
Subject: [BiO BB] Looking for a sequence cleaner
In-Reply-To: <78FE8E38-4533-11D9-B977-000D93B0BD32@pasteur.fr>
References: <78FE8E38-4533-11D9-B977-000D93B0BD32@pasteur.fr>
Message-ID: <41B072B7.8020607@medsci.uu.se>

Maybe the Nucleic Acid Sequence Massager can be helpful:
http://www.attotron.com/cybertory/analysis/seqMassager.htm

/Tove

Catherine Letondal wrote:
> Hi,
>
> This may seem strange, but I can't find any online tool for cleaning
> sequences (I mean removing extraneous characters such as positions,
> etc., in order to get, at least, a correct raw sequence). The only
> tools I could find are gap or vector cleaners.
>
> Thanks in advance,
>
> --
> Catherine Letondal

--
Tove Airola, PhD Student
Clinical Virology, Dpt of Medical Sciences, Uppsala University;
Dpt of Biology and Chemical Engineering, Mälardalen University
E-mail: tove.airola at mdh.se, tove.airola at medsci.uu.se
Phone: +46 (0)18-611 3953
Mobile: +46 (0)707-162364
Address: Dag Hammarskjölds väg 17, 751 85 Uppsala, Sweden

From stefanielager at fastmail.ca Fri Dec 3 09:14:40 2004
From: stefanielager at fastmail.ca (Stefanie Lager)
Date: Fri, 3 Dec 2004 14:14:40 +0000 (UTC)
Subject: [BiO BB] Looking for a sequence cleaner
In-Reply-To: <78FE8E38-4533-11D9-B977-000D93B0BD32@pasteur.fr>
Message-ID: <20041203141440.01C5586121F@mail.interchange.ca>

Also, SMS has a tool that can do that:
http://bioinformatics.org/sms2/

Stefanie

> Hi,
>
> This may seem strange, but I can't find any online tool for cleaning
> sequences (I mean removing extraneous characters such as positions,
> etc., in order to get, at least, a correct raw sequence).
> The only tools I could find are gap or vector cleaners.
>
> Thanks in advance,
>
> --
> Catherine Letondal

From christoph.gille at charite.de Fri Dec 3 11:42:52 2004
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Fri, 3 Dec 2004 17:42:52 +0100 (MET)
Subject: [BiO BB] Looking for a sequence cleaner
In-Reply-To: <78FE8E38-4533-11D9-B977-000D93B0BD32@pasteur.fr>
References: <78FE8E38-4533-11D9-B977-000D93B0BD32@pasteur.fr>
Message-ID: <40821.192.168.220.204.1102092172.squirrel@webmail.charite.de>

sed 's|[^A-Z]||g'

> Hi,
>
> This may seem strange, but I can't find any online tool for cleaning
> sequences (I mean removing extraneous characters such as positions, etc.,
> in order to get, at least, a correct raw sequence). The only tools I
> could find are gap or vector cleaners.
>
> Thanks in advance,
>
> --
> Catherine Letondal

From christoph.gille at charite.de Mon Dec 6 09:04:21 2004
From: christoph.gille at charite.de (Dr.
Christoph Gille)
Date: Mon, 6 Dec 2004 15:04:21 +0100 (MET)
Subject: [BiO BB] organism code
Message-ID: <58124.192.168.220.204.1102341861.squirrel@webmail.charite.de>

Hi all,

Where do I find a complete table mapping organism names to short codes like

Aeropyrum pernix ==> ape
or
Pyrococcus horikoshii ==> pho

Thanks
Christoph

From bader at cbio.mskcc.org Mon Dec 6 09:22:53 2004
From: bader at cbio.mskcc.org (Gary Bader)
Date: Mon, 6 Dec 2004 09:22:53 -0500
Subject: [BiO BB] organism code
In-Reply-To: <58124.192.168.220.204.1102341861.squirrel@webmail.charite.de>
References: <58124.192.168.220.204.1102341861.squirrel@webmail.charite.de>
Message-ID: <515049C6-4792-11D9-9CCA-000A95A8840E@cbio.mskcc.org>

As far as I know, there is no standard for 3-letter organism names, and each resource that uses these names keeps its own list. Three-letter codes can't cover more than 26^3 = 17,576 species, and there are many more species than this. If you want to understand the codes from a specific resource, you need to find that information in the documentation of that resource.

If you just want to use existing codes for convenience, KEGG maintains a list of its 3-letter codes for sequenced genomes here:
http://www.genome.jp/kegg/kegg2.html

Gary

On Dec 6, 2004, at 9:04 AM, Dr. Christoph Gille wrote:

> Hi all,
> where do I find a complete table mapping organism names to short codes
> like
> Aeropyrum pernix ==> ape
> or
> Pyrococcus horikoshii ==> pho
> Thanks Christoph

From lxyiwc at yahoo.com Mon Dec 6 17:29:25 2004
From: lxyiwc at yahoo.com (l x yi)
Date: Mon, 6 Dec 2004 14:29:25 -0800 (PST)
Subject: [BiO BB] a validation study
Message-ID: <20041206222925.79367.qmail@web52005.mail.yahoo.com>

I have developed a new method for searching protein banks with PSSMs (position-specific scoring matrices).
To do a validation analysis of the method on real protein sequences, I need to decide on:

-- a selection of some position-specific matrices for major protein domains
-- a databank of single sequences
-- a gold standard of which sequences truly belong to which protein domain

Could anyone suggest some reliable sources of references as to how to conduct this validation analysis? Any suggestions are appreciated.

Lily

From idoerg at burnham.org Mon Dec 6 17:45:51 2004
From: idoerg at burnham.org (Iddo Friedberg)
Date: Mon, 06 Dec 2004 14:45:51 -0800
Subject: [BiO BB] a validation study
In-Reply-To: <20041206222925.79367.qmail@web52005.mail.yahoo.com>
References: <20041206222925.79367.qmail@web52005.mail.yahoo.com>
Message-ID: <41B4E11F.6020105@burnham.org>

l x yi wrote:

>I have developed a new method for searching protein
>banks with PSSMs (position-specific scoring matrices). To do a
>validation analysis of the method on real protein
>sequences, I need to decide on:
>
>-- a selection of some position-specific matrices for
>major protein domains
>-- a databank of single sequences
>-- a gold standard of which sequences truly belong
>to which protein domain
>
>Could anyone suggest some reliable sources of
>references as to how to conduct this validation
>analysis? Any suggestions are appreciated.
>
>Lily

How about looking at the papers which have already used PSSM-based searching, and choosing your benchmarks from there?

--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9930
http://ffas.ljcrf.edu/~iddo

From dmb at mrc-dunn.cam.ac.uk Mon Dec 6 21:20:57 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Tue, 7 Dec 2004 02:20:57 +0000 (GMT)
Subject: [BiO BB] a validation study
In-Reply-To: <20041206222925.79367.qmail@web52005.mail.yahoo.com>
Message-ID:

On Mon, 6 Dec 2004, l x yi wrote:

>I have developed a new method for searching protein
>banks with PSSMs (position-specific scoring matrices). To do a
>validation analysis of the method on real protein
>sequences, I need to decide on:

There are papers which ask the same questions (I don't have references to hand), so you could try to extend the analysis in them to cover your approach.

As a gold standard, lots of people pick SCOP as a definition of true (but very distant) relationships. You can check your results at the family, superfamily and fold levels (folds are considered to be non-homologous but structurally similar protein domains).

Generally there are two criteria for an alignment: quality and length. If you are trying to do homology modeling you prefer length; for pure fold recognition you could go for quality.

I am sure lots of people can add much more detail to this very shaky description.

All the best

>
>-- a selection of some position-specific matrices for
>major protein domains
>-- a databank of single sequences
>-- a gold standard of which sequences truly belong
>to which protein domain
>
>Could anyone suggest some reliable sources of
>references as to how to conduct this validation
>analysis? Any suggestions are appreciated.
>
>Lily
From stefanielager at fastmail.ca Tue Dec 7 01:17:20 2004
From: stefanielager at fastmail.ca (Stefanie Lager)
Date: Tue, 7 Dec 2004 06:17:20 +0000 (UTC)
Subject: [BiO BB] organism code
In-Reply-To: <58124.192.168.220.204.1102341861.squirrel@webmail.charite.de>
Message-ID: <20041207061721.46E9C861503@mail.interchange.ca>

The UniProt species coding system can be found at:
http://www.expasy.org/cgi-bin/speclist

> Hi all,
> where do I find a complete table mapping organism names to short codes
> like Aeropyrum pernix ==> ape
> or
> Pyrococcus horikoshii ==> pho
> Thanks Christoph

From vnadeem at gmail.com Tue Dec 7 08:34:39 2004
From: vnadeem at gmail.com (V.Nadeem Ahmad)
Date: Tue, 7 Dec 2004 19:04:39 +0530
Subject: [BiO BB] protein interaction
Message-ID:

Hi,

Is it possible to find the proteins of some organism that may interact with my protein of interest? I only have the sequence of my protein. How can I find the proteins that may interact with it?

--
V.Nadeem Ahmad

From christoph.gille at charite.de Tue Dec 7 08:51:35 2004
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Tue, 7 Dec 2004 14:51:35 +0100 (MET)
Subject: [BiO BB] protein interaction
In-Reply-To:
References:
Message-ID: <45636.192.168.220.204.1102427495.squirrel@webmail.charite.de>

Perhaps there is data in the yeast two-hybrid databases.

> Hi, is it possible to find the proteins of some organism that may interact
> with my protein of interest? I only have the sequence of my protein. How
> can I find the proteins that may interact with it?
From boris.steipe at utoronto.ca Tue Dec 7 09:02:14 2004
From: boris.steipe at utoronto.ca (Boris Steipe)
Date: Tue, 7 Dec 2004 09:02:14 -0500
Subject: [BiO BB] protein interaction
In-Reply-To: <45636.192.168.220.204.1102427495.squirrel@webmail.charite.de>
Message-ID: <9932B5F0-4858-11D9-A6AD-000A9577512E@utoronto.ca>

Access the BIND database at http://bind.ca and use BINDBlast. This should give you the most comprehensive results that are currently available. (BTW: you should not restrict your thinking to "protein" unless you really intend to exclude protein-RNA or protein-DNA interactions; BIND contains many of these as well.)

Regards,

Boris

On Tuesday, Dec 7, 2004, at 08:51 Canada/Eastern, Dr. Christoph Gille wrote:

> Perhaps there is data in the yeast two-hybrid databases.
>
>> Hi, is it possible to find the proteins of some organism that may
>> interact with my protein of interest? I only have the sequence of my
>> protein. How can I find the proteins that may interact with it?

From val at vtek.com Tue Dec 7 10:35:03 2004
From: val at vtek.com (val)
Date: Tue, 7 Dec 2004 10:35:03 -0500
Subject: [BiO BB] protein interaction
References:
Message-ID: <04fe01c4dc72$528b9b80$c400a8c0@sony>

This seems a rather meaningless question to me. Take a much simpler question: which molecules interact with the simplest molecule of all, H2? Not to mention that you only have a sequence... That's the reality, unfortunately.

It's a good idea to take a look at intermolecular interaction dynamics and see how rich they can be. See any book on molecular physics, chemical physics, physical chemistry, etc.
my best for you,
val

----- Original Message -----
From: "V.Nadeem Ahmad"
To:
Sent: Tuesday, December 07, 2004 8:34 AM
Subject: [BiO BB] protein interaction

> Hi,
> is it possible to find the proteins of some organism that may
> interact with my protein of interest? I only have the sequence of my
> protein. How can I find the proteins that may interact with it?
>
> --
> V.Nadeem Ahmad

From lxyiwc at yahoo.com Tue Dec 7 10:45:02 2004
From: lxyiwc at yahoo.com (l x yi)
Date: Tue, 7 Dec 2004 07:45:02 -0800 (PST)
Subject: [BiO BB] Re: a validation study
Message-ID: <20041207154502.27970.qmail@web52007.mail.yahoo.com>

I was reading some references, but different people were using different datasets, and I got confused. For example, there are at least several ways to simulate random sequences:

-- simulate sequences with the frequencies of each of the 20 amino acids as in SWISS-PROT
-- simulate sequences with the frequencies of Robinson and Robinson (1991), PNAS, 88, 8880-4, as used in the BLAST paper
-- simulate sequences with the frequencies of McCaldon et al. (1988), "Oligopeptide biases in protein sequences and their use in predicting protein coding regions in nucleotide sequences"

Also, for the set of profiles, one way is to use the top 20 seed alignments of profiles in Pfam, http://pfam.wustl.edu/browse.shtml, but the profiles always have several sections. Could I randomly cut out a section of a profile from each of the top 20 profiles? See http://pfam.wustl.edu/cgi-bin/getalignment for an example.

Thanks so much for all the suggestions.

Lily
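The simplest of the options listed above, drawing each residue independently from a background composition, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up here, and the composition is estimated from whatever example sequences are passed in. A published table such as the SWISS-PROT or Robinson and Robinson frequencies would have to be supplied separately as a dictionary of the same shape.

```python
import random
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def background_frequencies(sequences):
    """Estimate amino-acid frequencies from example sequences.

    Characters outside the 20 standard residues (gaps, X, digits) are ignored.
    """
    counts = Counter(
        aa for seq in sequences for aa in seq.upper() if aa in AMINO_ACIDS
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard residues found in the input sequences")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def random_sequence(length, freqs, rng=random):
    """Draw `length` residues independently according to `freqs`.

    `rng` may be the random module or a seeded random.Random instance.
    """
    residues = list(freqs)
    weights = [freqs[aa] for aa in residues]
    return "".join(rng.choices(residues, weights=weights, k=length))
```

Passing a seeded generator, for example `random_sequence(300, freqs, random.Random(42))`, makes a simulated set reproducible, which helps when comparing methods on the same random background.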
From dmb at mrc-dunn.cam.ac.uk Tue Dec 7 12:57:35 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Tue, 7 Dec 2004 17:57:35 +0000 (GMT)
Subject: [BiO BB] protein interaction
In-Reply-To: <9932B5F0-4858-11D9-A6AD-000A9577512E@utoronto.ca>
Message-ID:

In addition to blasting against databases of experimentally confirmed interactions, you can look up predicted interactions for your protein based on a variety of other interaction predictors. See some of the databases at the following page:

http://www.hgmp.mrc.ac.uk/GenomeWeb/prot-interaction.html

For example, predicting domain-level interactions from the domain architecture of your protein can provide clues that a whole-sequence BLAST will not show as clearly.

Other useful links can be found here

http://mips.gsf.de/genre/proj/mpact/

and here

http://mips.gsf.de/proj/ppi/

I am not sure how much of MIPS is contained in BIND, as I can't find that information on the BIND website (although I did ask ;)

All the best,

On Tue, 7 Dec 2004, Boris Steipe wrote:

>Access the BIND database at http://bind.ca and use BINDBlast. This
>should give you the most comprehensive results that are currently
>available. (BTW: you should not restrict your thinking to "protein"
>unless you really intend to exclude protein-RNA or protein-DNA
>interactions; BIND contains many of these as well.)
>
>Regards,
>
>Boris
>
>On Tuesday, Dec 7, 2004, at 08:51 Canada/Eastern, Dr. Christoph Gille
>wrote:
>
>> Perhaps there is data in the yeast two-hybrid databases.
>>
>>> Hi, is it possible to find the proteins of some organism that may
>>> interact with my protein of interest? I only have the sequence of my
>>> protein. How can I find the proteins that may interact with it?
From bader at cbio.mskcc.org Tue Dec 7 15:31:12 2004
From: bader at cbio.mskcc.org (Gary Bader)
Date: Tue, 7 Dec 2004 15:31:12 -0500
Subject: [BiO BB] protein interaction
In-Reply-To:
Message-ID: <010201c4dc9b$b178ebb0$f349a8c0@cbio.mskcc.org>

Hi,

And if you run out of options after following the advice of the other replies so far, you can find a quite exhaustive list of interaction and pathway databases here:
http://www.cbio.mskcc.org/prl/index.php

Regards,
Gary

> -----Original Message-----
> From: bio_bulletin_board-bounces at bioinformatics.org
> [mailto:bio_bulletin_board-bounces at bioinformatics.org] On Behalf Of
> V.Nadeem Ahmad
> Sent: Tuesday, December 07, 2004 8:35 AM
> To: BiO_Bulletin_Board at bioinformatics.org
> Subject: [BiO BB] protein interaction
>
> Hi,
> is it possible to find the proteins of some organism that may
> interact with my protein of interest? I only have the sequence of my
> protein. How can I find the proteins that may interact with it?
>
> --
> V.Nadeem Ahmad

From idonalds at blueprint.org Tue Dec 7 17:04:01 2004
From: idonalds at blueprint.org (Ian Donaldson)
Date: Tue, 07 Dec 2004 17:04:01 -0500
Subject: [BiO BB] protein interaction
In-Reply-To:
Message-ID:

All of MIPS is in BIND.
1) Go to the interface at http://bind.ca
2) Select the browse option under the Search menu (the magnifying glass)
3) Click on BIND record Browse options
4) Beside "Restrict by Divisions" select "MIPS"
5) Click on the browse button.

There are currently 3365 records in this division.

Ian

-----Original Message-----
From: bio_bulletin_board-bounces at bioinformatics.org
[mailto:bio_bulletin_board-bounces at bioinformatics.org]On Behalf Of Dan Bolser
Sent: December 7, 2004 12:58 PM
To: The general forum at Bioinformatics.Org
Subject: Re: [BiO BB] protein interaction

In addition to blasting against databases of experimentally confirmed interactions, you can look up predicted interactions for your protein based on a variety of other interaction predictors. See some of the databases at the following page:

http://www.hgmp.mrc.ac.uk/GenomeWeb/prot-interaction.html

For example, predicting domain-level interactions from the domain architecture of your protein can provide clues that a whole-sequence BLAST will not show as clearly.

Other useful links can be found here

http://mips.gsf.de/genre/proj/mpact/

and here

http://mips.gsf.de/proj/ppi/

I am not sure how much of MIPS is contained in BIND, as I can't find that information on the BIND website (although I did ask ;)

All the best,

On Tue, 7 Dec 2004, Boris Steipe wrote:

>Access the BIND database at http://bind.ca and use BINDBlast. This
>should give you the most comprehensive results that are currently
>available. (BTW: you should not restrict your thinking to "protein"
>unless you really intend to exclude protein-RNA or protein-DNA
>interactions; BIND contains many of these as well.)
>
>Regards,
>
>Boris
>
>On Tuesday, Dec 7, 2004, at 08:51 Canada/Eastern, Dr. Christoph Gille
>wrote:
>
>> Perhaps there is data in the yeast two-hybrid databases.
>>
>>> Hi, is it possible to find the proteins of some organism that may
>>> interact
>>> with my protein of interest? I only have the sequence of my protein.
>>> How can I find the proteins that may interact with it?

From MAG at Stowers-Institute.org Wed Dec 8 16:08:58 2004
From: MAG at Stowers-Institute.org (Goel, Manisha)
Date: Wed, 8 Dec 2004 15:08:58 -0600
Subject: [BiO BB] (no subject)
Message-ID: <20041208210838.5A88ED1F05@www.bioinformatics.org>

Dear All,

I have been using protein sequences of proteins with known structures (PDB database) derived from the SEQRES records. But now that I need to run either DSSP or STRIDE on them, I cannot map the secondary structure back to the sequence alignments because the SEQRES and ATOM records do not agree on the residue numbering. So a residue numbered as 278 in the SEQRES record is listed as 268 in the ATOM record, messing up my alignments (I was using this numbering to map the secondary structure onto the sequence).

I have come across quite a bit of discussion in some mailing lists about the need for, and proposed methods of, modifying the PDB files so that they can be made consistent. BUT... meanwhile, can somebody suggest a method or resource which could either fix this discrepancy or offer a roundabout way of taking care of it? I guess with people working with structural/sequence mapping so often, some such fix would definitely have been devised by somebody.

I just want to be able to modify the residue numbers in the ATOM records to match the SEQRES records, or something to that effect.
CIF does not work because it does not segregate by chain numbers/id.

Thanks for any input,
-Manisha

From MAG at Stowers-Institute.org Wed Dec 8 16:10:17 2004
From: MAG at Stowers-Institute.org (Goel, Manisha)
Date: Wed, 8 Dec 2004 15:10:17 -0600
Subject: [BiO BB] SEQRES and ATOM record mismatch
Message-ID: <20041208210957.82729D1F25@www.bioinformatics.org>

Dear All,

I have been using protein sequences of proteins with known structures (PDB database) derived from the SEQRES records. But now that I need to run either DSSP or STRIDE on them, I cannot map the secondary structure back to the sequence alignments because the SEQRES and ATOM records do not agree on the residue numbering. So a residue numbered as 278 in the SEQRES record is listed as 268 in the ATOM record, messing up my alignments (I was using this numbering to map the secondary structure onto the sequence).

I have come across quite a bit of discussion in some mailing lists about the need for, and proposed methods of, modifying the PDB files so that they can be made consistent. BUT... meanwhile, can somebody suggest a method or resource which could either fix this discrepancy or offer a roundabout way of taking care of it? I guess with people working with structural/sequence mapping so often, some such fix would definitely have been devised by somebody.

I just want to be able to modify the residue numbers in the ATOM records to match the SEQRES records, or something to that effect. CIF does not work because it does not segregate by chain numbers/id.

Thanks for any input,
-Manisha

From christoph.gille at charite.de Wed Dec 8 16:58:57 2004
From: christoph.gille at charite.de (Dr.
Christoph Gille)
Date: Wed, 8 Dec 2004 22:58:57 +0100 (MET)
Subject: [BiO BB] SEQRES and ATOM record mismatch
In-Reply-To: <20041208210957.82729D1F25@www.bioinformatics.org>
References: <20041208210957.82729D1F25@www.bioinformatics.org>
Message-ID: <1185.217.81.122.113.1102543137.squirrel@webmail.charite.de>

I understand the problem as follows: you have two sequences of the same protein, one derived from the C-alpha positions and one from SEQRES. In loops that are not resolved, no coordinates are recorded, so the two sequences will differ there.

Suggestion: align both with, e.g., ClustalW.

What scripting language are you using? If you use Java, you are welcome to use the STRAP API. It includes a very fast PDB parser (30 ms per PDB file), which also reads SEQRES, and a DSSP file parser.

Christoph

> Dear All,
>
> I have been using protein sequences of proteins with known structures
> (PDB database) derived from the SEQRES records.
> But now that I need to run either DSSP or STRIDE on them, I cannot map
> the secondary structure back to the sequence alignments because the SEQRES
> and ATOM records do not agree on the residue numbering. So a residue
> numbered as 278 in the SEQRES record is listed as 268 in the ATOM record,
> messing up my alignments (I was using this numbering to map the secondary
> structure onto the sequence). I have come across quite a bit of discussion
> in some mailing lists about the need for, and proposed methods of,
> modifying the PDB files so that they can be made consistent. BUT...
> meanwhile, can somebody suggest a method or resource which could
> either fix this discrepancy or offer a roundabout way of taking care of
> it? I guess with people working with structural/sequence mapping so
> often, some such fix would definitely have been devised by somebody.
>
> I just want to be able to modify the residue numbers in the ATOM records
> to match the SEQRES records, or something to that effect.
> CIF does not work because it does not segregate by chain numbers/id.
>
> Thanks for any input,
> -Manisha

From jstroud at mbi.ucla.edu Wed Dec 8 17:42:08 2004
From: jstroud at mbi.ucla.edu (James Stroud)
Date: Wed, 8 Dec 2004 14:42:08 -0800
Subject: [BiO BB] SEQRES and ATOM record mismatch
In-Reply-To: <20041208210957.82729D1F25@www.bioinformatics.org>
References: <20041208210957.82729D1F25@www.bioinformatics.org>
Message-ID: <200412081442.08248.jstroud@mbi.ucla.edu>

Will a tool that pulls the sequence from the ATOM records help? If so, I have written one that does this. It can do it programmatically (i.e. automatically for a large number of sequences).

I can send it to you if you like. It is part of a Python API. I am still developing it, so there is no documentation, but I can write the two or three lines of code required to use it for this purpose.

James

On Wednesday 08 December 2004 01:10 pm, Goel, Manisha wrote:
> Dear All,
>
> I have been using protein sequences of proteins with known structures
> (PDB database) derived from the SEQRES records.
> But now that I need to run either DSSP or STRIDE on them, I cannot map
> the secondary structure back to the sequence alignments because the
> SEQRES and ATOM records do not agree on the residue numbering.
> So a residue numbered as 278 in the SEQRES record is listed as 268 in the
> ATOM record, messing up my alignments (I was using this numbering to
> map the secondary structure onto the sequence).
> I have come across quite a bit of discussion in some mailing lists about
> the need for, and proposed methods of, modifying the PDB files so that
> they can be made consistent.
> BUT...
> meanwhile, can somebody suggest a method or resource which could
> either fix this discrepancy or offer a roundabout way of taking care of
> it?
> I guess with people working with structural/sequence mapping so often,
> some such fix would definitely have been devised by somebody.
>
> I just want to be able to modify the residue numbers in the ATOM records
> to match the SEQRES records, or something to that effect. CIF does not
> work because it does not segregate by chain numbers/id.
>
> Thanks for any input,
> -Manisha

--
James Stroud, Ph.D.
UCLA-DOE Institute for Genomics and Proteomics
611 Charles E. Young Dr. S.
MBI 205, UCLA 951570
Los Angeles CA 90095-1570
http://www.jamesstroud.com/

From dmb at mrc-dunn.cam.ac.uk Wed Dec 8 20:55:13 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Thu, 9 Dec 2004 01:55:13 +0000 (GMT)
Subject: [BiO BB] Re: [ssml] (no subject)
In-Reply-To: <20041208210838.5A88ED1F05@www.bioinformatics.org>
Message-ID:

Hello Manisha!

Several groups have done semi-automatic and manual ATOM to SEQRES mappings. I only know about two mappings in any detail.

One is done by the MSD in their relational database. If you ask the people at the MSD, they will give you their mapping table as a tab-delimited dump (I think it is called the swissprot table), which includes ATOM, SEQRES and SWISSPROT residue numbering, and the usual PDB residue identifiers (RES_NO, CHAIN, ALT_LOC, ICODE).

The other mapping is done by SCOP and is maintained by ASTRAL in the easy-to-parse 'RAF' files. The RAF format gives the same information as above, and is manually curated (and they maintain a list of edits, which unfortunately the MSD don't). Sadly the current version of SCOP is quite old, and hence the RAF files for the current version are also old. Perhaps if you ask the people at ASTRAL they can give you a 'beta' release.

If you like, I can send you my RAF parser, which creates a nice tab-delimited format, but actually the files are trivial to parse, and the tab-delimited file from the MSD is just as good.

Let me know how you get on,

All the best,
Dan.
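For cases where neither mapping table is at hand, the alignment idea suggested earlier in the thread can be approximated with the Python standard library. The sketch below is neither the MSD nor the RAF mapping, and it is not a real sequence alignment: it assumes the ATOM-derived sequence is simply the SEQRES sequence with the unresolved residues missing, and uses difflib to find the matching blocks. The function name and the example numbering are hypothetical, and repeats or short fragments can be placed wrongly, so results should be spot-checked.

```python
from difflib import SequenceMatcher


def map_seqres_to_atom(seqres, atom_residues):
    """Map 1-based SEQRES positions to ATOM residue identifiers.

    seqres         -- one-letter SEQRES sequence of a single chain
    atom_residues  -- list of (residue_id, one_letter_code) tuples, in file
                      order, for the residues that have coordinates
    Returns a dict {seqres_position: residue_id or None}; None marks
    positions with no matching ATOM residue (e.g. unresolved loops).
    """
    atom_seq = "".join(aa for _, aa in atom_residues)
    # autojunk=False stops difflib from discarding frequent residues
    # in long sequences.
    matcher = SequenceMatcher(None, seqres, atom_seq, autojunk=False)
    mapping = {pos: None for pos in range(1, len(seqres) + 1)}
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            seqres_pos = block.a + offset + 1
            mapping[seqres_pos] = atom_residues[block.b + offset][0]
    return mapping
```

As a made-up example, with the SEQRES sequence `MKTAYIAKQRGSH` and ATOM residues T268 to K273 and G276 to H278, positions 3 to 8 map to 268 to 273, positions 11 to 13 map to 276 to 278, and the unresolved positions 1, 2, 9 and 10 map to `None`. Per-residue DSSP or STRIDE output keyed by ATOM residue number can then be looked up through this dictionary.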
On Wed, 8 Dec 2004, Goel, Manisha wrote:

>Dear All,
>
>I have been using protein sequences of proteins with known structures
>(PDB database) derived from the SEQRES records.
>But now that I need to run either DSSP or STRIDE on them, I cannot map
>the secondary structure back to the sequence alignments because the
>SEQRES and ATOM records do not agree on the residue numbering.
>So a residue numbered as 278 in the SEQRES record is listed as 268 in the
>ATOM record, messing up my alignments (I was using this numbering to
>map the secondary structure onto the sequence).
>I have come across quite a bit of discussion in some mailing lists about
>the need for, and proposed methods of, modifying the PDB files so that
>they can be made consistent.
>BUT...
>meanwhile, can somebody suggest a method or resource which could
>either fix this discrepancy or offer a roundabout way of taking care of
>it?
>I guess with people working with structural/sequence mapping so often,
>some such fix would definitely have been devised by somebody.
>
>I just want to be able to modify the residue numbers in the ATOM records
>to match the SEQRES records, or something to that effect. CIF does not
>work because it does not segregate by chain numbers/id.
>
>Thanks for any input,
>-Manisha

From MAG at Stowers-Institute.org Thu Dec 9 11:03:47 2004
From: MAG at Stowers-Institute.org (Goel, Manisha)
Date: Thu, 9 Dec 2004 10:03:47 -0600
Subject: [BiO BB] SEQRES and ATOM record mismatch
Message-ID: <20041209160325.94DF7D1F0E@www.bioinformatics.org>

Dear James,

Thank you so much for the generous offer. That would have been really helpful, except that I already have the alignments done according to the SEQRES record sequences, and pulling the sequences out of the ATOM records would mean that I would have to repeat that step. So at present I am looking for a shortcut method to just transform the residue IDs from the SEQRES to the ATOM records.
But in case that does not work too well (for many as yet unforeseen reasons, I guess), can I still come back for help later? Best Regards, -Manisha -----Original Message----- From: bio_bulletin_board-bounces at bioinformatics.org [mailto:bio_bulletin_board-bounces at bioinformatics.org] On Behalf Of James Stroud Sent: Wednesday, December 08, 2004 4:42 PM To: The general forum at Bioinformatics.Org Subject: Re: [BiO BB] SEQRES and ATOM record mismatch Will a tool that pulls sequence from the ATOM records help? If so, I have written one that does this. It can do it programmatically (i.e. automatically for a large number of sequences). I can send it to you if you like. It is part of a Python API. I am still developing it, so there is no documentation, but I can write the two or three lines of code required to use it for this purpose. James On Wednesday 08 December 2004 01:10 pm, Goel, Manisha wrote: > Dear All, > > I have been using protein sequences of proteins with known structures > (PDB database) derived from the SEQRES records. But now that I need to > run either DSSP or STRIDE on them, I cannot map the secondary > structure back to the sequence alignments because the SEQRES and ATOM > records do not agree on the residue number id. So a residue numbered > as 278 in the SEQRES record is listed as 268 in the ATOM record, messing > up my alignments (I was using this numbering to map the secondary > structure on to the sequence). I have come across quite a bit of > discussion in some mailing lists about the need for, and proposed methods of, > modification of the PDB files, so that they can be made consistent. > BUT .. > Meanwhile can somebody suggest a method or resource which could > either fix this discrepancy or maybe a roundabout way of taking care of > this. > I guess with people working with structural/sequence mapping so often, > some such fix would have definitely been devised by somebody.
> > I just want to be able to modify the residue numbers in the ATOMS > record to match the SEQRES records or something to that effect. CIF > does not work because it does not segregate by chain numbers/id. > > Thanks for any input, > -MAnisha -- James Stroud, Ph.D. UCLA-DOE Institute for Genomics and Proteomics 611 Charles E. Young Dr. S. MBI 205, UCLA 951570 Los Angeles CA 90095-1570 http://www.jamesstroud.com/ _______________________________________________ BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org https://bioinformatics.org/mailman/listinfo/bio_bulletin_board From jstroud at mbi.ucla.edu Thu Dec 9 15:29:11 2004 From: jstroud at mbi.ucla.edu (James Stroud) Date: Thu, 9 Dec 2004 12:29:11 -0800 Subject: [BiO BB] SEQRES and ATOM record mismatch In-Reply-To: <20041209160325.94DF7D1F0E@www.bioinformatics.org> References: <20041209160325.94DF7D1F0E@www.bioinformatics.org> Message-ID: <200412091229.11943.jstroud@mbi.ucla.edu> I will absolutely help if you need it. Let me know. James On Thursday 09 December 2004 08:03 am, Goel, Manisha wrote: > Dear James, > Thank you so much for the generous offer. > That would have been really helpful, except that I already have the > alignments done according to the SEQRES record sequences, and this > pulling seqs out of the ATOMS would mean that I would have to repeat > that step. > So at present I am looking for any shortcut method to just transform the > residue id's from SEQRES to ATOMS record. > But In case that does not work too well (for many yet unforseen reasons, > I guess) Can I still come back for help later ? > Best Regards, > -Manisha -- James Stroud, Ph.D. UCLA-DOE Institute for Genomics and Proteomics 611 Charles E. Young Dr. S. 
MBI 205, UCLA 951570 Los Angeles CA 90095-1570 http://www.jamesstroud.com/ From robertsc at uvic.ca Fri Dec 10 02:34:12 2004 From: robertsc at uvic.ca (Christine and Bob) Date: Thu, 09 Dec 2004 23:34:12 -0800 Subject: [BiO BB] Same problem as Sheryl Maher Message-ID: <5.2.1.1.0.20041209233337.00bf1cc0@POP.uvic.ca> I just downloaded the ClustalX files for my Windows XP machine. I unzipped the folder and it seemed I didn't need to install the program; it was already up and running when I clicked on the icon. I also downloaded a file called XP, which seems to be just a single item, and I haven't done anything with it since I don't see any instructions anywhere for it. I don't know if this is why, when I load sequences and try a multiple alignment, I always get the error message that it cannot open the output file. I have tried making a number of blank .aln files using notebook, renaming a copy of the .in file, using an old clustalw.aln file renamed to match my .in file, trying an empty folder, and routing the output to another folder. Nothing works, and I have trawled all over the internet to find something about it in the help files and there is nothing. It seems ClustalX is supposed to create the .aln file automatically, but it is not doing this; it is looking for one instead. I tried making a new input file as a text file (instead of a BioEdit file); it loads both OK but it still gave me the "cannot open output file" message. What else can I do? Christine From kelly at etechhi.com Mon Dec 13 17:24:34 2004 From: kelly at etechhi.com (Kelly Russell) Date: Mon, 13 Dec 2004 14:24:34 -0800 Subject: [BiO BB] (no subject) Message-ID: <000b01c4e162$86be2770$0602a8c0@recruiter4> Good afternoon. I'm a Senior Recruiter and received a job order from a client in the Bay Area (CA) for a Biostatistician II. My question is this.
And I apologize for sounding uninformed; however, I am a recruiter, not a Biostatistician. They are looking for a non-clinical Biostatistician, not a clinical Biostatistician. What is the difference? Thank you for any help! Sincerely Kelly Russell Sr. Executive Recruiter Etech Hi, Inc. Direct: (951) 296-0629 Fax: (951) 296-6735 E-mail: kelly at etechhi.com Website: www.etechhi.com "Every job is a self-portrait of the person who did it. Autograph your work with excellence." From n.bell at selectconferences.com Mon Dec 13 05:20:42 2004 From: n.bell at selectconferences.com (Natalie Bell) Date: Mon, 13 Dec 2004 10:20:42 -0000 Subject: [BiO BB] Press Release for immediate release Message-ID: MedChem Europe 2005 Agenda Complete Scientific Update and Select Conferences are pleased to announce their first joint European conference on Medicinal Chemistry, which will provide the main European forum for cutting edge work within this dynamic field and an unmissable event for those wishing to keep abreast of new developments. MedChem Europe, to be held in Berlin on the 13-14th April 2005, will include topics of high interest in the pharmaceutical industry and encompasses major target families such as Kinases, GPCRs, and Proteases. Topics at this conference will include: theoretical considerations as well as case histories; interactive workshops; recent developments in the field of kinase & protease inhibitors; the design of ligands for challenging GPCR targets; the burgeoning field of nanotechnology. This multidisciplinary conference focuses on current approaches to the design of molecules with therapeutic indications and is relevant to chemists in both industry and academia.
MedChem Europe will be held at the Messe Berlin in conjunction with the BioFine 2005 exhibition for fine chemicals and drug discovery. Scientific Update will be hosting two pre-conference half-day seminars: "SAQUINAVIR - More than a success story" and "Speeding up Chemistry: Organic Synthesis using Microwaves". Speakers will include top scientists from Bayer, Novartis, Amgen, Moscow State University, GlaxoSmithKline, Johnson & Johnson, Lilly and Schering, to name a few. For the full downloadable agenda please visit www.MedChemEurope.com or, to register, visit www.scientificupdate.co.uk. About Select Conferences www.SelectConferences.com The Select Conferences division of Select Biosciences Ltd. is focused on organising specialist biomedical meetings. Experts from both academia and commerce are invited to present timely information from current research through to commercial implementation of new technologies. These events also provide a unique networking facility and the opportunity to reach a highly targeted scientific audience. About Scientific Update www.scientificupdate.co.uk Scientific Update LLP is the leading organic chemistry training, consultancy and conference specialist and is renowned for the high quality of its scientific events. True to the founder's aims, the company has consistently provided the most up-to-date information on cutting edge technology and developments in organic process chemistry and related fields. For further information on MedChem Europe, contact: Natalie Bell n.bell at selectconferences.com From yzz100 at psu.edu Thu Dec 16 14:35:37 2004 From: yzz100 at psu.edu (Anne Ya Zhang) Date: Thu, 16 Dec 2004 14:35:37 -0500 Subject: [BiO BB] All-again-all protein sequence comparison Message-ID: <006201c4e3a6$6b1b6410$7092cb82@ist.local> Hi, All I have been working on obtaining the BLAST e-score for an all-against-all comparison of the protein sequences of two genomes. Is there a tool or script for this function? Any suggestions will be helpful.
Thanks, Anne -------------- next part -------------- An HTML attachment was scrubbed... URL: From christoph.gille at charite.de Thu Dec 16 15:37:28 2004 From: christoph.gille at charite.de (Dr. Christoph Gille) Date: Thu, 16 Dec 2004 21:37:28 +0100 (MET) Subject: [BiO BB] All-again-all protein sequence comparison In-Reply-To: <006201c4e3a6$6b1b6410$7092cb82@ist.local> References: <006201c4e3a6$6b1b6410$7092cb82@ist.local> Message-ID: <47903.192.168.220.204.1103229448.squirrel@webmail.charite.de> the ncbi toolkit works well. I can loop over all proteins in one genome and run blast against the other. > Hi, All > > > I have been working on obtain the BLAST e-score for all-against-all > protein sequences of two genomes. Is there is tool for script for this > function? Any suggestions will be helpful. > > Thanks, > > > Anne_______________________________________________ > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > From idoerg at burnham.org Thu Dec 16 16:47:15 2004 From: idoerg at burnham.org (Iddo Friedberg) Date: Thu, 16 Dec 2004 13:47:15 -0800 Subject: [BiO BB] All-again-all protein sequence comparison In-Reply-To: <47903.192.168.220.204.1103229448.squirrel@webmail.charite.de> References: <006201c4e3a6$6b1b6410$7092cb82@ist.local> <47903.192.168.220.204.1103229448.squirrel@webmail.charite.de> Message-ID: <41C20263.4000501@burnham.org> Use ncbi toolkit, write a script around bl2seq for the all-vs-all. If the genomes are really large, I would try and cluster each genome first at 90% Sequence ID, to remove redundancies, using CD-HIT. I wouldn't go with the strategy of having one genome as a database, and another as a query pool, because that would skew your BLAST statistics to give you false-positive hits. I would go with the all-vs-all pairwise BLAST. ./I Dr. Christoph Gille wrote: >the ncbi toolkit works well. 
>I can loop over all proteins in one genome >and run blast against the other. > > > > >>Hi, All >> >> >>I have been working on obtain the BLAST e-score for all-against-all >>protein sequences of two genomes. Is there is tool for script for this >>function? Any suggestions will be helpful. >> >>Thanks, >> >> >>Anne_______________________________________________ >>BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board >> >> >> >> > > >_______________________________________________ >BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > > -- Iddo Friedberg, Ph.D. The Burnham Institute 10901 N. Torrey Pines Rd. La Jolla, CA 92037 Tel: (858) 646 3100 x3516 Fax: (858) 713 9930 http://ffas.ljcrf.edu/~iddo From dmb at mrc-dunn.cam.ac.uk Thu Dec 16 17:15:39 2004 From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser) Date: Thu, 16 Dec 2004 22:15:39 +0000 (GMT) Subject: [BiO BB] All-again-all protein sequence comparison In-Reply-To: <41C20263.4000501@burnham.org> Message-ID: On Thu, 16 Dec 2004, Iddo Friedberg wrote: > >Use ncbi toolkit, write a script around bl2seq for the all-vs-all. Does bl2seq use fastacmd or does it expect two sequences only? >If the genomes are really large, I would try and cluster each genome >first at 90% Sequence ID, to remove redundancies, using CD-HIT. Agreed. Run this over a combined database and you already have some interesting data. Has anyone played with the new -L coverage cutoff threshold in cd-hit? >I wouldn't go with the strategy of having one genome as a database, and >another as a query pool, because that would skew your BLAST statistics >to give you false-positive hits. I would go with the all-vs-all pairwise >BLAST. 
I never used bl2seq, but it might be usefull to run formatdb on the two databases anyway, only because it lets you use fastacmd to get any sequence (or pair of sequences) out of the database very easily. > >./I > > >Dr. Christoph Gille wrote: > >>the ncbi toolkit works well. >>I can loop over all proteins in one genome >>and run blast against the other. >> >> >> >> >>>Hi, All >>> >>> >>>I have been working on obtain the BLAST e-score for all-against-all >>>protein sequences of two genomes. Is there is tool for script for this >>>function? Any suggestions will be helpful. >>> >>>Thanks, >>> >>> >>>Anne_______________________________________________ >>>BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >>>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board >>> >>> >>> >>> >> >> >>_______________________________________________ >>BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board >> >> >> >> > > > From christoph.gille at charite.de Fri Dec 17 04:15:53 2004 From: christoph.gille at charite.de (Dr. Christoph Gille) Date: Fri, 17 Dec 2004 10:15:53 +0100 (MET) Subject: [BiO BB] All-again-all protein sequence comparison In-Reply-To: References: <41C20263.4000501@burnham.org> Message-ID: <52002.192.168.220.204.1103274953.squirrel@webmail.charite.de> bl2seq is different to blast since it does list all local matches, not just one. There is a web interface that uses the bl2seq output for drawing nice diagrams. pipmaker I think > On Thu, 16 Dec 2004, Iddo Friedberg wrote: > > >> >> Use ncbi toolkit, write a script around bl2seq for the all-vs-all. >> > > Does bl2seq use fastacmd or does it expect two sequences only? > > > >> If the genomes are really large, I would try and cluster each genome >> first at 90% Sequence ID, to remove redundancies, using CD-HIT. > > Agreed. Run this over a combined database and you already have some > interesting data. 
Has anyone played with the new -L coverage cutoff > threshold in cd-hit? > > >> I wouldn't go with the strategy of having one genome as a database, >> and another as a query pool, because that would skew your BLAST >> statistics to give you false-positive hits. I would go with the >> all-vs-all pairwise BLAST. >> > > I never used bl2seq, but it might be usefull to run formatdb on the two > databases anyway, only because it lets you use fastacmd to get any sequence > (or pair of sequences) out of the database very easily. > > > > >> >> ./I >> >> >> >> Dr. Christoph Gille wrote: >> >> >>> the ncbi toolkit works well. I can loop over all proteins in one >>> genome and run blast against the other. >>> >>> >>> >>> >>>> Hi, All >>>> >>>> >>>> >>>> I have been working on obtain the BLAST e-score for all-against-all >>>> protein sequences of two genomes. Is there is tool for script for >>>> this function? Any suggestions will be helpful. >>>> >>>> Thanks, >>>> >>>> >>>> >>>> Anne_______________________________________________ >>>> BiO_Bulletin_Board maillist - >>>> BiO_Bulletin_Board at bioinformatics.org >>>> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board >>>> >>>> >>>> >>>> >>>> >>> >>> >>> _______________________________________________ >>> BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >>> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board >>> >>> >>> >>> >>> >> >> >> > > _______________________________________________ > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > From christoph.gille at charite.de Fri Dec 17 11:20:09 2004 From: christoph.gille at charite.de (Dr. 
Christoph Gille) Date: Fri, 17 Dec 2004 17:20:09 +0100 (MET) Subject: [BiO BB] ncbi post and get Message-ID: <59485.192.168.220.204.1103300409.squirrel@webmail.charite.de> Hi can one encode a query for ncbi pubmed in the url or does pubmed work only with http post ? I want to generate search terms automatically and open ncbi Thanks From mgollery at unr.edu Fri Dec 17 12:08:57 2004 From: mgollery at unr.edu (Martin Gollery) Date: Fri, 17 Dec 2004 09:08:57 -0800 Subject: [BiO BB] ncbi post and get In-Reply-To: <59485.192.168.220.204.1103300409.squirrel@webmail.charite.de> References: <59485.192.168.220.204.1103300409.squirrel@webmail.charite.de> Message-ID: <41C312A9.2010908@unr.edu> Yes, you can- here is a search for all mouse data: http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi?term=Mouse%5Borgn%5D Martin Gollery Dr. Christoph Gille wrote: >Hi > >can one encode a query for ncbi pubmed in the url >or does pubmed work only with http post ? > >I want to generate search terms automatically and open ncbi >Thanks > > >_______________________________________________ >BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > From forward at hongyu.org Fri Dec 17 13:24:06 2004 From: forward at hongyu.org (Hongyu Zhang) Date: Fri, 17 Dec 2004 10:24:06 -0800 Subject: [BiO BB] Re: All-again-all protein sequence comparison (Iddo Friedberg) In-Reply-To: <20041217170010.1A715D1F35@www.bioinformatics.org> References: <20041217170010.1A715D1F35@www.bioinformatics.org> Message-ID: <1103307846.41c32446a397a@hongyu.org> > > I wouldn't go with the strategy of having one > genome as a database, and > another as a query pool, because that would skew > your BLAST statistics > to give you false-positive hits. I would go with the > all-vs-all pairwise > BLAST. 
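The database-size dependence debated in this thread can be made concrete with the standard relation between a BLAST bit score S' and the E-value, E = m * n * 2^(-S'), where m is the query length and n the total database length. The Python lines below are a minimal sketch of that relation only: they ignore the effective-length corrections that BLAST applies, the function name blast_evalue is made up for illustration, and the two database sizes are arbitrary example figures rather than real genome or nr sizes.

```python
def blast_evalue(bit_score, query_len, db_len):
    """Expected number of chance hits scoring at least bit_score:
    E = m * n * 2**(-S'), without effective-length corrections."""
    return query_len * db_len * 2.0 ** (-bit_score)


# The same 40-bit hit for a 300-residue query, scored against a small
# single-genome database and against a much larger one (example sizes).
small_db = blast_evalue(40, 300, 2e6)
large_db = blast_evalue(40, 300, 1e9)
print(f"small database: E = {small_db:.2e}")
print(f"large database: E = {large_db:.2e}")
```

Because E scales linearly with n, fixing the database size in both search directions gives the same E-value for the same pair and score, whichever genome is used as the query.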
> The problem with all-vs-all pairwise comparison is that it will be slower than the strategy of using one genome as a database and the other as the query. The statistics issue, I think, only arises when you do reciprocal BLASTs, i.e., blast genome A against B and then genome B against A; then you will probably get two slightly different E-values for the same pair of sequences. The problem, however, can be mostly circumvented by setting the database size the same in both BLAST directions (parameter "-z" in NCBI-BLAST and "Z=" in WU-BLAST) -- Hongyu Zhang, Ph.D. Computational biologist Ceres Inc. From idoerg at burnham.org Fri Dec 17 14:24:21 2004 From: idoerg at burnham.org (Iddo Friedberg) Date: Fri, 17 Dec 2004 11:24:21 -0800 Subject: [BiO BB] Re: All-again-all protein sequence comparison (Iddo Friedberg) In-Reply-To: <1103307846.41c32446a397a@hongyu.org> References: <20041217170010.1A715D1F35@www.bioinformatics.org> <1103307846.41c32446a397a@hongyu.org> Message-ID: <41C33265.5020308@burnham.org> The problem I see with the e-values is that the e-value is dependent upon the search database size. The e-value gives you the number of expected false positives, given the database you are searching. If your database is the queried genome(s) only, you may receive skewed values because a hit which would be considered to have a high e-value (low significance, more false positives expected by chance) when searched against nr would have a low e-value (high significance) when searched against the genome(s). Similarities may be mistaken to be significant simply because the predicted number of false positives will always be small due to a small database size. ./I Hongyu Zhang wrote: >>I wouldn't go with the strategy of having one >>genome as a database, and >>another as a query pool, because that would skew >>your BLAST statistics >>to give you false-positive hits. I would go with the >>all-vs-all pairwise >>BLAST.
>> >> >> > >The problem with all-vs-all pairwise comparison is that it will be >slower than the strategy of using one genome as a database and the >other as the query. The statistics issue, I think, only comes when you >do reciprocal BLASTs, ie., blast genome A agaist B and then genome B >against A, then you probably will get two slightly different E-values >for the same pair of sequeneces. The problem, however, can be mostly >circumvented by setting the database size the same in both BLAST >directions (parameter "-z" in NCBI-BLAST and "Z=" in WU-BLAST) > >-- >Hongyu Zhang, Ph.D. >Computational biologist >Ceres Inc. > > > >_______________________________________________ >BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > > -- Iddo Friedberg, Ph.D. The Burnham Institute 10901 N. Torrey Pines Rd. La Jolla, CA 92037 Tel: (858) 646 3100 x3516 Fax: (858) 713 9930 http://ffas.ljcrf.edu/~iddo From hchen at utmem.edu Fri Dec 17 14:52:58 2004 From: hchen at utmem.edu (Hao Chen) Date: Fri, 17 Dec 2004 13:52:58 -0600 Subject: [BiO BB] ncbi post and get In-Reply-To: <59485.192.168.220.204.1103300409.squirrel@webmail.charite.de> References: <59485.192.168.220.204.1103300409.squirrel@webmail.charite.de> Message-ID: <20041217195258.GA26827@utmail.utmem.edu> Hi, On Fri, Dec 17, 2004 at 05:20:09PM +0100, Dr. Christoph Gille wrote: > Hi > > can one encode a query for ncbi pubmed in the url > or does pubmed work only with http post ? > You may want to take a look at the Entrez utilities: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html Hao > I want to generate search terms automatically and open ncbi > Thanks > > > _______________________________________________ > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board -- - : Hao Chen, Ph.D. 
: Research Associate : Department of Pharmacology : University of Tennessee Health Science Center : Memphis, TN 38163 USA : Office: 901 448 3201 : Mobil: 901 826 1845 Mining PubMed: http://www.chilibot.net - From thomas.m.keane at MAY.IE Thu Dec 16 05:59:03 2004 From: thomas.m.keane at MAY.IE (Thomas Keane) Date: Thu, 16 Dec 2004 10:59:03 +0000 Subject: [BiO BB] Software: Modelgenerator Message-ID: <41C16A77.3000502@may.ie> Hi, We have just completed the development of a fully self-contained substitution model selection application. Modelgenerator supports 28 amino acid models and 24 nucleotide models (base model with three optional rate distributions: Gamma, Invariable sites, and Gamma+Invariable sites). Model selection is performed using both the hLRT and AIC tests (Posada and Crandall, 2001). We have devised a modified hLRT for the amino acid models that has been shown to be extremely effective in our simulated data tests (manuscript in preparation). Our tests have also shown the importance of using Modelgenerator when choosing a particular amino acid model, as arbitrarily choosing a model can lead to incorrect or suboptimal phylogenies. Unlike other popular model-testing software, Modelgenerator does not require the installation of any other software package (e.g. PAUP*). We invite you to try this application on your datasets and welcome any positive or negative feedback on the application. The application can be downloaded from: http://bioinf.may.ie/software/modelgenerator Cheers, Thomas Posada, D. and Crandall, K.A. (2001) Selecting the Best-Fit Model of Nucleotide Substitution, Systematic Biology, 50(4), 580-601 -- Thomas Keane, Bioinformatics and Pharmacogenomics Lab, Department of Biology, National University of Ireland, Maynooth, Ireland.
http://www.cs.may.ie/distributed http://www.cs.may.ie/~tkeane E: thomas.m.keane at may.ie P: +353 1 708 6043 F: +353 1 708 3845 From idonalds at blueprint.org Fri Dec 17 09:37:46 2004 From: idonalds at blueprint.org (Ian Donaldson) Date: Fri, 17 Dec 2004 09:37:46 -0500 Subject: [BiO BB] All-again-all protein sequence comparison In-Reply-To: <41C20263.4000501@burnham.org> Message-ID: Dear Anne There is a pre-computed BLAST of all pairwise proteins in the NCBI's nr database available at ftp://ftp.blueprint.org/pub/SeqHound/Data/NBLAST/ These results are also available via a remote API (in Perl/Java/C/C++). You can read http://www.blueprint.org/seqhound/seqhound_documentation.html for how to get started with this API if it meets your needs. Best regards Ian -----Original Message----- From: bio_bulletin_board-bounces at bioinformatics.org [mailto:bio_bulletin_board-bounces at bioinformatics.org]On Behalf Of Iddo Friedberg Sent: December 16, 2004 4:47 PM To: The general forum at Bioinformatics.Org Subject: Re: [BiO BB] All-again-all protein sequence comparison Use ncbi toolkit, write a script around bl2seq for the all-vs-all. If the genomes are really large, I would try and cluster each genome first at 90% Sequence ID, to remove redundancies, using CD-HIT. I wouldn't go with the strategy of having one genome as a database, and another as a query pool, because that would skew your BLAST statistics to give you false-positive hits. I would go with the all-vs-all pairwise BLAST. ./I Dr. Christoph Gille wrote: >the ncbi toolkit works well. >I can loop over all proteins in one genome >and run blast against the other. > > > > >>Hi, All >> >> >>I have been working on obtain the BLAST e-score for all-against-all >>protein sequences of two genomes. Is there is tool for script for this >>function? Any suggestions will be helpful. 
>> >>Thanks, >> >> >>Anne_______________________________________________ >>BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board >> >> >> >> > > >_______________________________________________ >BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > > -- Iddo Friedberg, Ph.D. The Burnham Institute 10901 N. Torrey Pines Rd. La Jolla, CA 92037 Tel: (858) 646 3100 x3516 Fax: (858) 713 9930 http://ffas.ljcrf.edu/~iddo _______________________________________________ BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org https://bioinformatics.org/mailman/listinfo/bio_bulletin_board From christoph.gille at charite.de Sat Dec 18 09:21:29 2004 From: christoph.gille at charite.de (Dr. Christoph Gille) Date: Sat, 18 Dec 2004 15:21:29 +0100 (MET) Subject: [BiO BB] compile c++ MacOsX and Win32 Message-ID: <49071.192.168.220.204.1103379689.squirrel@webmail.charite.de> Hi all, I am preparing a uniform GUI + API for several sequence alignment and 3D superposition procedures. Has someone already compiled the 3D superposition program CE for MacOSX and Win32 or the sequence alignment program Muscle for MacOSX ? Or perhaps someone could do it with little effort ? These are powerful GNU-compatible C++ programs. I am not sure whether it is possible to compile this into normal win32. Can one execute a program compiled for Win32 cygwin using Runtime.exec(argv[]) from a java program ? If yes does cygwin delay the start significantly ? Or do cygwin programs only run within cygwin environment ? 
Both are excellent programs I can highly recommend and their URLs are ftp://ftp.sdsc.edu/pub/sdsc/biology/CE/src/ http://www.drive5.com/muscle/ Many thanks Christoph From e_hongyu at yahoo.com Sun Dec 19 18:15:49 2004 From: e_hongyu at yahoo.com (Hongyu Zhang) Date: Sun, 19 Dec 2004 15:15:49 -0800 (PST) Subject: [BiO BB] Re: All-again-all protein sequence comparison In-Reply-To: <20041218170008.C7120D1F2F@www.bioinformatics.org> Message-ID: <20041219231550.63944.qmail@web51406.mail.yahoo.com> > The problem I see with the e-values is that the > e-value is dependent > upon the search database size. The e-value gives you the > number of expected > false positives, given the database you are > searching. If your database > is the queried genome(s) only, you may receive > skewed values because a > hit which would be considered to have a high > e-value (low significance, > more false positives expected by chance) when > searched against nr would > have a low e-value (high significance) when searched > against the > genome(s). Similarities may be mistaken to be > significant simply because > the predicted number of false positives will always > be small due to a > small database size. That's correct. But as long as you can "normalize" your search using a fixed database size (like NR), I don't see why you need to sacrifice the computer time. From pranathi at strandgenomics.com Mon Dec 20 06:53:52 2004 From: pranathi at strandgenomics.com (Pranathi) Date: Mon, 20 Dec 2004 17:23:52 +0530 (IST) Subject: [BiO BB] Regulatory approval Message-ID: <20041220115352.A8FBF4BCD1@mail.strandgenomics.com> Hi, Can any of you give me an idea of the regulatory process that a medical implement and a drug have to go through in order to get onto the market? Especially for medical implements: do we need to get approval from DPCO? Thanking you in advance, Regards, Pranathi Bibireddy, M.S.
Associate ( LifeSciences ) Strand Genomics Pvt. Ltd. Bangalore, India Ph: 91 (80) 23611349-51 Extn: 204 Fax: 91 (80) 23618996 Mobile: 9341342402 www.strandgenomics.com From christoph.gille at charite.de Mon Dec 20 07:58:39 2004 From: christoph.gille at charite.de (Dr. Christoph Gille) Date: Mon, 20 Dec 2004 13:58:39 +0100 (MET) Subject: [BiO BB] DNA vector design, What software ? Message-ID: <53962.192.168.220.204.1103547519.squirrel@webmail.charite.de> Scientists at the medical school Charite Berlin are looking for a program for DNA vector design. Since the design is the most critical part of DNA cloning, it should be a good and reliable program, and it should be appropriate for MDs, biologists and biochemists. It should also not be too expensive. We need a license for about 50 floating accounts. Does anyone have a good suggestion or recommendation? Is there a free program that does the job? Thanks Christoph From stefanielager at fastmail.ca Tue Dec 21 00:33:09 2004 From: stefanielager at fastmail.ca (Stefanie Lager) Date: Tue, 21 Dec 2004 05:33:09 +0000 (UTC) Subject: [BiO BB] DNA vector design, What software ? In-Reply-To: <53962.192.168.220.204.1103547519.squirrel@webmail.charite.de> Message-ID: <20041221053309.06CD5861628@mail.interchange.ca> Hi, The free program GENtle http://gentle.magnusmanske.de/ looks promising; most other free programs can't work with sequence data. Stefanie > Scientists at the medical school Charite Berlin are looking for > a program for DNA vector design. > Since the design is the most critical part of DNA cloning, it should be > a good and reliable program, and it should be appropriate for MDs, > biologists and biochemists. > > It should also not be too expensive. We need a license for about 50 > floating accounts. > > Does anyone have a good suggestion or recommendation? > > Is there a free program that does the job?
> > Thanks > Christoph > > > > > _______________________________________________ > Bioinformatics.Org general forum - > BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board From davidg at lsi.upc.edu Sat Dec 25 08:11:20 2004 From: davidg at lsi.upc.edu (davidg at lsi.upc.edu) Date: Sat, 25 Dec 2004 14:11:20 +0100 (MET) Subject: [BiO BB] Iterating over all the sequences in NR database Message-ID: <4645347379davidg@lsi.upc.es> Hello. Is there any way to access every element of the nr database, iterating over all the sequences in that database? I thought about getting the nr database as a FASTA format text file containing every protein in the nr database, but I haven't found it. And even if I did, I don't know if there's a better option. Thank you. -- David García Cortés Instituto Nacional de Bioinformática (INB) Nodo Computacional GNHC-2 UPC-CIRI c/. Jordi Girona 1-3 Modul C6-E201 Tel. : 934 011 650 E-08034 Barcelona Fax : 934 017 014 Catalunya (Spain) e-mail: davidg at lsi.upc.edu From dmb at mrc-dunn.cam.ac.uk Sat Dec 25 08:15:53 2004 From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser) Date: Sat, 25 Dec 2004 13:15:53 +0000 (GMT) Subject: [BiO BB] Iterating over all the sequences in NR database In-Reply-To: <4645347379davidg@lsi.upc.es> Message-ID: You can try a sequence retrieval API. If you find the fasta file, fastacmd is a great way to access the individual sequences (used with formatdb). Try using unipark90 if you can't find an up-to-date nrdb90. Cheers, On Sat, 25 Dec 2004 davidg at lsi.upc.edu wrote: >Hello. >Is there any way to access every element of the nr database, iterating >over all the sequences in that database? I thought about getting the >nr database as a FASTA format text file containing every protein >in the nr database, but I haven't found it.
>And even if I did, I don't
>know if there's a better option.
>Thank you.

From cdwan at bioteam.net Sat Dec 25 08:32:26 2004
From: cdwan at bioteam.net (Chris Dwan)
Date: Sat, 25 Dec 2004 08:32:26 -0500
Subject: [BiO BB] Iterating over all the sequences in NR database
In-Reply-To: <4645347379davidg@lsi.upc.es>
References: <4645347379davidg@lsi.upc.es>
Message-ID: <6AC43231-5679-11D9-910B-000A95CE2714@bioteam.net>

FASTA file: ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz
Formatted NCBI BLAST target: ftp://ftp.ncbi.nih.gov/blast/db/nr.tar.gz

-Chris Dwan
 The BioTeam

On Dec 25, 2004, at 8:11 AM, wrote:

> Hello.
> Is there any way to access every element of the nr database, iterating
> over all the sequences in that database? I thought about getting the
> nr database in a FASTA format text file which contained every protein
> in the nr database, but I haven't found it. And even if I did, I don't
> know if there's a better option.
> Thank you.

From dmb at mrc-dunn.cam.ac.uk Sun Dec 26 08:28:27 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Sun, 26 Dec 2004 13:28:27 +0000 (GMT)
Subject: [BiO BB] Iterating over all the sequences in NR database
In-Reply-To: <6AC43231-5679-11D9-910B-000A95CE2714@bioteam.net>
Message-ID:

On Sat, 25 Dec 2004, Chris Dwan wrote:

>FASTA file: ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz
>Formatted NCBI BLAST target: ftp://ftp.ncbi.nih.gov/blast/db/nr.tar.gz
>
>-Chris Dwan
> The BioTeam
>
>On Dec 25, 2004, at 8:11 AM, wrote:
>
>> Hello.
>> Is there any way to access every element of the nr database, iterating
>> over all the sequences in that database? I thought about getting the

By the way, the default behaviour of BLAST is to search all the
sequences in the database.

Best,
dan.

>> nr database in a FASTA format text file which contained every protein
>> in the nr database, but I haven't found it. And even if I did, I don't
>> know if there's a better option.
>> Thank you.

From jrambla at hotmail.com Mon Dec 27 16:37:34 2004
From: jrambla at hotmail.com (JRambla)
Date: Mon, 27 Dec 2004 22:37:34 +0100
Subject: [BiO BB] RV: Informació curs Bioinformàtica
Message-ID:

Hi everybody,

Just in case someone is interested, this is an announcement of two
courses (in Spanish):
- Molecular databases and sequence analysis
- Perl for bioinformatics

The courses will take place in Barcelona from January 17th to 20th and
from January 24th to 27th, respectively.

You can find more info at http://www.ebiointel.com/curso

Best regards,
Jordi Rambla

-----Original Message-----
From: Joan Colomer - ebioIntel [mailto:joan.colomer at ebiointel.com]
Sent: Thursday, 16 December 2004 10:53
To: Jordi Rambla
Subject: Informació curs Bioinformàtica (Bioinformatics course information)

The course is divided into 2 modules that can be taken independently.
Places are limited to a maximum of 15 students.

MODULE I: MOLECULAR DATABASES AND SEQUENCE ANALYSIS
17, 18, 19 and 20 January 2005, mornings from 9:00 to 14:15.
Price: 450 € (20 hours)

MODULE II: FUNDAMENTALS OF PROGRAMMING IN BIOINFORMATICS
Perl programming for designing and maintaining a local system for
managing biomedical information
24, 25, 26 and 27 January 2005, mornings from 9:00 to 14:15.
Price: 400 € (20 hours)

Course features:

- An essentially practical course, with one computer per student and
  each concept learned through selected exercises.
- Use of the most modern teaching methods with multimedia support: all
  classes with a PowerPoint presentation, an Internet connection for
  each student, tutorials, self-tests, and advice by e-mail for fifteen
  days after the end of the course.
- Local access to the standard similarity search and alignment
  programs, SNP analysis modules, DPDB.
- Plenty of supplementary material: a folder of selected articles, a
  CD-ROM with the slides, tutorials, self-tests, animations.
- Students receive a certificate of attendance and are added to the
  ebioIntel job pool.

For more information, visit: http://www.ebiointel.com/curso

You can enrol via the web or by contacting ebioIntel by phone:
93 591 29 37.

ebioIntel
Biocampus UAB
Masia Can Fatjó
Parc Tecnològic del Vallès
08290 Cerdanyola del Vallès
http://www.ebiointel.com

From davidg at lsi.upc.edu Tue Dec 28 11:35:53 2004
From: davidg at lsi.upc.edu (David García Cortés)
Date: Tue, 28 Dec 2004 17:35:53 +0100
Subject: [BiO BB] BioPerl problem
Message-ID: <002701c4ecfb$4d30d990$30b01950@latadecervecix>

Hello.

I'm trying to parse the results of a BLAST query using BPLite. The
problem is that I can't access the E-value of an HSP. Accessing the
score is no problem: I only have to access the HSP's "score" field.
But what about the E-value? If I do the same using "evalue", there's an
error: it can't find that field on the HSP.
Here you can see the main part of the code:

    $factory = Bio::Tools::Run::StandAloneBlast->new(@pars);
    my $blst_rprt = $factory->blastall($seq);
    my $exists_results = parse_blast($blst_rprt);

And the function parse_blast is this:

    sub parse_blast {
        my $blast_report = shift;
        while (my $result = $blast_report->nextSbjct) {
            while (my $hsp = $result->nextHSP) {
                my $score  = $hsp->score;
                my $evalue = $hsp->evalue;   # this is the call that fails
            }
        }
    }

A different question, although closely related to the one above: I
thought it might be better to use SearchIO for the parsing, but when
creating a SearchIO object you have to pass the name of the file that
contains the BLAST result to parse. As you can see in the code, my BLAST
result is in the variable $blst_rprt (or $blast_report), not in a file.
So should I write the content of the $blast_report variable to a file
and pass this file to SearchIO? I don't like this solution; I would be
grateful if you could suggest an alternative.

Thank you in advance.

--
David García Cortés
Instituto Nacional de Bioinformática (INB)
Nodo Computacional GNHC-2 UPC-CIRI
c/. Jordi Girona 1-3
Modul C6-E201        Tel.  : 934 011 650
E-08034 Barcelona    Fax   : 934 017 014
Catalunya (Spain)    e-mail: davidg at lsi.upc.edu

From idoerg at burnham.org Tue Dec 28 14:09:35 2004
From: idoerg at burnham.org (Iddo Friedberg)
Date: Tue, 28 Dec 2004 11:09:35 -0800
Subject: [BiO BB] GO <--> SCOP mapping?
Message-ID: <41D1AF6F.4090100@burnham.org>

Hi all,

Is there a GO to SCOP mapping? I know of GOA_PDB, I need a SCOP domain
based table.

Apologies for cross-posting, Happy New Year,

Iddo

--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9930
http://ffas.ljcrf.edu/~iddo

From dmb at mrc-dunn.cam.ac.uk Wed Dec 29 08:41:31 2004
From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser)
Date: Wed, 29 Dec 2004 13:41:31 +0000 (GMT)
Subject: [BiO BB] GO <--> SCOP mapping?
In-Reply-To: <41D1AF6F.4090100@burnham.org>
Message-ID:

On Tue, 28 Dec 2004, Iddo Friedberg wrote:

>Hi all,
>
>Is there a GO to SCOP mapping? I know of GOA_PDB, I need a SCOP domain
>based table.

There is SCOPEC, for SCOP to EC mapping, and I think they were moving into
the GO arena.

http://www.enzome.com/databases/scopec.php

IMHO EC is a better ontology than the GO Molecular Function Ontology, but
I am happy to be slammed on that point.

I was working on a SCOP <-> GO Biological Process Ontology mapping (which
isn't trivial (so far as I can tell)), but gave up due to lethargy.

>Apologies for cross-posting, Happy New Year,

You too, all the best,
Dan.

>Iddo

From idoerg at burnham.org Wed Dec 29 12:07:26 2004
From: idoerg at burnham.org (Iddo Friedberg)
Date: Wed, 29 Dec 2004 09:07:26 -0800
Subject: [BiO BB] GO <--> SCOP mapping?
In-Reply-To:
References:
Message-ID: <41D2E44E.9020308@burnham.org>

Dan Bolser wrote:

>On Tue, 28 Dec 2004, Iddo Friedberg wrote:
>
>>Hi all,
>>
>>Is there a GO to SCOP mapping? I know of GOA_PDB, I need a SCOP domain
>>based table.
>
>There is SCOPEC, for SCOP to EC mapping, and I think they were moving into
>the GO arena.
>
>http://www.enzome.com/databases/scopec.php
>
>IMHO EC is a better ontology than the GO Molecular Function Ontology, but
>I am happy to be slammed on that point.

E.C. only annotates enzymatic functions. Yes, for enzymes E.C. is probably
more suitable. However, E.C. does not annotate non-enzymatic functions,
and definitely does not annotate biological process and cellular location.

--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9930
http://ffas.ljcrf.edu/~iddo

From maria.mirto at unile.it Thu Dec 30 06:00:31 2004
From: maria.mirto at unile.it (Maria Mirto)
Date: Thu, 30 Dec 2004 12:00:31 +0100 (CET)
Subject: [BiO BB] CFP: (IEEE CBMS2005) Special Track on Grids for Biomedicine and Bioinformatics
Message-ID: <3173.193.204.86.210.1104404431.squirrel@webmail2.unile.it>

Dear all,

Please find attached the Call For Papers for:

18th IEEE Symposium on Computer-Based Medical Systems (CBMS) -
Track on Grids for Biomedicine and Bioinformatics.
Dublin, Ireland
23-24 June 2005
http://datadog.unile.it/cbms2005/cfp.htm
http://conferences.computer.org/CBMS2005/index.html
sponsored by IEEE Computer Society

The main goal of the track is to discuss well-known and emerging bio
data-intensive systems in the context of Grids and to analyse
technologies and methodologies useful to develop such systems in these
environments. In particular, this Conference Track aims at offering a
forum of discussion where young researchers and PhD students could
present their research activities, either at an early or mature phase.

Best regards,
Maria Mirto.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Maria Mirto, CACT/ISUFI (Center for Advanced Computing Technology)
Engineering Faculty, Department of Innovation Engineering
University of Lecce, Via per Monteroni, 73100 Lecce, Italy
phone: +39-0832-297304, fax: +39-0832-297279
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

******************************************************************
We apologize if you received multiple copies of this Call for Papers.
Please feel free to distribute it to those who might be interested.
******************************************************************

------------------------------------------------------------------
Special Track on Grids for Biomedicine and Bioinformatics.
CBMS 2005: 18th IEEE Symposium on Computer-Based Medical Systems (CBMS)
Sponsored by the IEEE Computer Society
June 23-24, 2005
Dublin, Ireland
------------------------------------------------------------------

Call for Papers
http://datadog.unile.it/cbms2005/cfp.htm
http://conferences.computer.org/CBMS2005/index.html
*****************

Bioinformatics, genomics, proteomics and medical image analysis are
emerging methods in health care. Navigating between phenotype and
genotype means that clinical data and genetic assessment are integrated
in patient investigations. What is missing today is:

- the full integration of these methods and technologies to enhance all
  phases of health care, including diagnosis, prognosis, etc.;
- the dissemination of such methods in clinical practice, whenever
  they are developed, deployed and maintained.

Such a vision requires the design and implementation of computer tools,
methods and platforms for the seamless integration of biomedical data
and bioinformatics tools. The main issues in realizing such a vision are:

- Integration of multiple laboratories collecting genomics and
  post-genomics data, so that biology or bioinformatics research
  laboratories:
    - can continue to maintain their own biological, biomedical and
      computing resources autonomously;
    - can cope effectively with the growth of the data they need to
      manage and process, exploiting recent techniques such as data
      mining and taking into account that biomedical data are produced
      and stored continuously;
- Provision of large computing power, especially in areas such as:
    - the medical image processing community, which faces a growing
      need to analyse 2D, 3D and 4D images, to simulate medical
      treatments or surgeries (radiotherapy, plastic surgery, etc.),
      and to develop computer-aided surgery;
    - integration of, and physicians' access to, all of their patients'
      medical data from their office.

The Grid paradigm offers CPU and data handling capabilities and allows
users and laboratories to share their facilities (computing and data
storage resources, instruments, knowledge, etc.) through high-bandwidth
networks between dynamically formed Virtual Organizations. Grid
middleware currently offers basic services for Grid management and for
application development and deployment. To cope with the complexity of
novel, cooperative, distributed Health and Bioinformatics applications,
new specialized Grid services have to be developed, so that Grids can be
deployed to address the needs of the biomedical community.

The main goal of the track is to discuss well-known and emerging bio
data-intensive systems in the context of Grids and to analyse
technologies and methodologies useful to develop such systems in these
environments. In particular, this Conference Track aims at offering a
forum of discussion where young researchers and PhD students could
present their research activities, either at an early or mature phase.

TOPICS OF INTEREST include, but are not limited to:

- Grid solutions for bio data-intensive applications
- Grid infrastructures for bio data analysis
- High-performance computing for bio data-intensive applications
- Grid computing infrastructures, middleware and tools for Health
- Grid computing biomedical services
- Collaboration technologies
- Bio data analysis and management
- Databases and the Grid in the biomedical field
- Extracting knowledge from bio data Grids
- Data Grids for bioinformatics
- Security in bio data Grids

IMPORTANT DATES

January 26, 2005    Submission of (6-page, maximum) paper
March 1, 2005       Author notification
March 24, 2005      Final camera-ready paper due
March 24, 2005      Pre-registration deadline

SUBMISSION DETAILS

No hardcopy submissions are being accepted. Electronic submissions of
original technical research papers will only be accepted in PDF format.
File size is limited to 2 MB.
Use a maximum of six A4 pages, including figures and references. Include
one cover sheet stating the paper title, authors, technical area(s)
covered in the article, the corresponding author's information
(telephone, fax, mailing address, e-mail address), and your preference
for oral or poster presentation. Author names should appear only on the
cover sheet, not on the summary.

Submit your manuscript no later than January 26, 2005. Authors will be
notified of acceptance by March 1, 2005, after a review process by three
independent experts. Each accepted paper will be published in the
conference proceedings by IEEE CS Press, conditional upon the author's
advance registration. Submission in the IEEE Computer Science Press
6x9-inch format is encouraged. Formatting instructions, LaTeX macros and
MS Word templates are available at
ftp://pubftp.computer.org/press/outgoing/proceedings/.

Authors should indicate the special track title on the cover sheet. All
submissions must be made electronically via the CBMS web submission
system, at
http://www.cs.tcd.ie/research_groups/mlg/CBMS2005/openconf/openconf.php.

For further questions, please contact:

Maria Mirto,
CACT/ISUFI (Center for Advanced Computing Technology) & SPACI (Southern
Partnership for Advanced Computational Infrastructures) Consortium,
c/o Engineering Faculty, Department of Innovation Engineering,
University of Lecce, Via per Monteroni, 73100 Lecce, Italy,
Voice: +39-0832-297304, Fax: +39-0832-297279,
Email: maria.mirto at unile.it

Electronic submission (PostScript or PDF) is strongly encouraged.