From biopolak at yahoo.co.uk Mon Nov 3 08:57:44 2003
From: biopolak at yahoo.co.uk (Peter Oledzki)
Date: Mon, 3 Nov 2003 13:57:44 +0000 (GMT)
Subject: [BiO BB] Molecular Mechanics (Universal Force Fields models)
In-Reply-To: <193701c39ffd$e467dc60$6400a8c0@vt1000>
Message-ID: <20031103135744.96112.qmail@web25105.mail.ukl.yahoo.com>

Hello,

quoting you, val: "it does not help much in answering real questions in (bio)chemistry and bioinformatics". I would have to disagree: these sorts of calculations are used in structure-based drug design all the time. The big pharma companies are using these sorts of calculations to screen for lead compounds, and the preliminary results for current methods are encouraging.

Pete

--- val wrote:
> Hi Miguel,
> Google is your best friend. Try, e.g., Google with 'molecule energy software'.
> But I wouldn't rely much on this type of calculation; it does not help much in answering real questions in (bio)chemistry and bioinformatics.
> my best,
> val
>
> ----- Original Message -----
> From: Miguel Pignatelli
> To: bio_bulletin_board at bioinformatics.org
> Sent: Thursday, October 30, 2003 4:43 AM
> Subject: [BiO BB] Molecular Mechanics (Universal Force Fields models)
>
> Hi,
>
> I'm trying to make a program in C that calculates the energy of a molecule from the coordinates X,Y,Z of its atoms. We want to apply the Universal Force Fields (UFF) described by Rappe. Once we have the total energy of the molecule (contributions from each chemical bond (bond stretching), angle bending, torsional terms, improper torsions, out-of-plane bending motions and non-bonded interactions (electrostatic and Van der Waals forces)), we will look for energy minimisation through the conjugate gradient minimisation method.
>
> I am looking for similar software to look at or any algorithm that could help me. First of all I would like to know what the best data structure is for this kind of software, and whether there is any similar software whose code I could look at, or any algorithm that implements the UFF.
>
> The software input will be atoms and coordinates, so it will have to determine which atoms are bonded. Is there any efficient algorithm for that?
>
> Any suggestions?
>
> Thanks all
>
> _______________________________________________
> BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board

=====
Peter Oledzki
Department of Biochemistry and Molecular Biology
University of Leeds, UK

From val at vtek.com Tue Nov 4 12:05:49 2003
From: val at vtek.com (val)
Date: Tue, 4 Nov 2003 12:05:49 -0500
Subject: [BiO BB] Molecular Mechanics (Universal Force Fields models)
References: <20031103135744.96112.qmail@web25105.mail.ukl.yahoo.com>
Message-ID: <21ff01c3a2f5$e5a2ba00$6400a8c0@vt1000>

Hi Pete,

Thanks for your comment. Well, generally, you are right. My point is the comparative inefficiency of the screening. An experienced chemist can do much better, and the reason is that energy is *not* a good/sensitive indicator of the fine structural and functional effects involved in the discovery of a *successful* drug. This is why a successful drug takes huge efforts to find - roughly $1B and 8-12 years; to me, an indication of a wrong approach.

cheers,
val

----- Original Message -----
From: "Peter Oledzki"
To:
Sent: Monday, November 03, 2003 8:57 AM
Subject: Re: [BiO BB] Molecular Mechanics (Universal Force Fields models)

> Hello,
>
> quoting you, val: "it does not help much in answering real questions in (bio)chemistry and bioinformatics". I would have to disagree: these sorts of calculations are used in structure-based drug design all the time. The big pharma companies are using these sorts of calculations to screen for lead compounds, and the preliminary results for current methods are encouraging.
>
> Pete
>
> [...]

From B.A.T.Svensson at lumc.nl Wed Nov 5 05:41:12 2003
From: B.A.T.Svensson at lumc.nl (Svensson, B.A.T. (HKG))
Date: 05 Nov 2003 11:41:12 +0100
Subject: [BiO BB] about biojava.........
In-Reply-To: <00b001c39e33$9e956f20$43c17b3e@atarippc>
References: <20031023130609.29007.qmail@web8106.mail.in.yahoo.com> <00b001c39e33$9e956f20$43c17b3e@atarippc>
Message-ID: <1068028872.2999.6.camel@ander>

On Wed, 2003-10-29 at 16:45, Andrea Franceschini wrote:
[...]
> This is one of the articles I've found:
> http://java.sun.com/features/2001/10/genome2.html
>
> I would be happy if someone could help me better understand the possible role of Java in bioinformatics software (now and in the next few years).

As the article you quoted from Sun says:

"The Future [...] The need for scalability, cross-platform compatibility, network awareness, and security, all call out to Java technology's core strengths."

> Thank you
> Andrea Franceschini
> atari at portalis.it

From anstrom at yahoo.com Sat Nov 1 14:18:02 2003
From: anstrom at yahoo.com (Hongyu Zhang)
Date: Sat, 1 Nov 2003 11:18:02 -0800 (PST)
Subject: [BiO BB] Molecular Mechanics (Universal Force Fields models)
In-Reply-To: <20031101170106.81029D2448@www.bioinformatics.org>
Message-ID: <20031101191802.32399.qmail@web41111.mail.yahoo.com>

MM3 is a pure molecular mechanics program, but I don't know whether it's free now. Also check programs like GROMOS, NAMD, etc. They are molecular dynamics (MD) packages, but I think they also contain molecular mechanics subroutines.

From maria.mirto at unile.it Sun Nov 2 08:21:52 2003
From: maria.mirto at unile.it (Maria Mirto)
Date: Sun, 2 Nov 2003 14:21:52 +0100
Subject: [BiO BB] Extended deadline for ITCC2004: track on distributed and Grid systems
Message-ID: <1067779312.3fa504f0bfa2a@wm.unile.it>

Dear all,

the deadline of ITCC 2004 has been extended, so we have extended the Distributed and Grid Systems track deadline as well. New and updated submissions are possible until November 21, 2003. Please visit the ITCC 2004 website prior to your final submission and use the online submission system. If any problem occurs during electronic paper submission, please contact the Track Chair via email.

Best regards,
Maria Mirto.
Submission Page: http://www.softconf.com/start/ITCC2004/submit.html
Track Page: http://datadog.unile.it/itcc2004/cfp.htm

IMPORTANT DATES
November 21, 2003   Paper Due
December 19, 2003   Author Notification
January 9, 2004     Camera-Ready Copy

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Maria Mirto, CACT/ISUFI (Center for Advanced Computing Technology)
Engineering Faculty, Department of Innovation Engineering
University of Lecce, Via per Monteroni, 73100 Lecce, Italy
phone: +39-0832-297304, fax: +39-0832-297279
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We apologize if you receive multiple copies of this email.

--------------------------------------------------------------------------------
Track on Methodologies, Technologies and Applications in distributed and Grid systems.
ITCC 2004: IEEE International Conference on Information Technology: Coding and Computing
Sponsored by the IEEE Computer Society
April 5-7, 2004
The Orleans, Las Vegas, Nevada
--------------------------------------------------------------------------------

Call for Papers
http://datadog.unile.it/itcc2004/cfp.htm
http://www.itcc.info
******************

Computational Grids, initially used for sharing distributed computation resources in scientific applications, are starting to be used in different application domains, offering basic services for application definition and execution in heterogeneous distributed systems.

In health systems, the Grid offers the power and ubiquity needed for the acquisition of biomedical data and for the processing and delivery of biomedical images (CT, MRI, PET, SPECT, etc.) located in different hospitals within a wide area. The Grid thus acts as a Collaborative Working Environment: doctors often want to aggregate not only medical data but also human expertise, and might want colleagues around the world to visualize the examinations in the same way and at the same time, so that the group can discuss the diagnosis in real time.

The Grid offers a dynamic infrastructure for retrieving and on-demand processing of remote sensing data: for instance, retrieving SAR metadata related to terabytes of SAR data, starting on-demand processing on raw data, starting on-demand post-processing on focused data, and creating a complex application by composing simple tasks.

For atmospheric and climate modeling, a Grid offers tools for simulating and forecasting meteorological phenomena, simulating the emission and dispersion of pollutants for air quality studies, and simulating complex phenomena related to the impact of global climate change.

Grid Computing techniques can be used in the motor industry to reduce the optimization time for improving diesel engine emission performance, using, for instance, micro-genetic algorithms for engine chamber geometry optimization and Kiva3 code to calculate chamber geometry fitness.

In computer-aided medicine, a new research area involves the use of Grid technologies for surgical simulations. Some simulations could be performed in a distributed system to allow surgeons to practise particular surgical procedures. Analysis of the problems relevant to the use of GRID in medical virtual environments will be appreciated.

Finally, bioinformatic applications call for the ability to read large datasets (e.g. protein databases) and to create new datasets (e.g. mass spectrometry proteomic data). They may also need to change (update) existing datasets; consequently a Data Grid, i.e. a distributed infrastructure for storing large datasets, is needed. In the bioinformatics field, a Data Grid could prove useful for building Electronic Patient Record systems (EPRs) for the management of patient information (data, metadata and images), for supporting data replication, for allowing the integration and sharing of biological databases and, generally, for the development of efficient bioinformatics (in particular proteomic) applications.

The main goal of the Conference Track is to discuss well-known and emerging data-intensive applications in the context of distributed systems and Grid systems, and to analyze technologies and methodologies useful for developing such applications in such environments. In particular, this Conference Track aims to offer a forum for discussion where young researchers and PhD students can present their research activities, whether at an early or a mature phase.

Topics include, but are not limited to:

Data intensive applications in distributed and Grid systems:
- Grid for biomedical imaging;
- Grid for remote sensing and GIS applications;
- Grid for Atmospheric and Climate Modeling;
- Grid for motor industry (diesel engine simulation);
- Grid for surgery simulations;
- Bioinformatics for:
  o Biomedical Imaging;
  o Proteomics and genomics;
  o Electronic Patient Records;
  o Medical images, data and metadata management;
  o Image Recognition, Processing and Analysis.

Technologies and methodologies in distributed and grid-based applications:
- Grid technologies (Grid portals, Web & Grid services, portlets);
- Grid Information and Monitoring services and related (OO, Relational, XML) data models;
- Grid Security;
- Grid Workload and Data management services;
- Grid Resource management;
- Parallel and Distributed applications (cluster and grid based);
- Simulation and Applications of Modeling.

Note: The Proceedings will be published by IEEE Computer Society. A special issue of an international journal, consisting of selected papers from this conference, is being planned. Authors of selected papers will be invited to submit an extended version for the journal.

SUBMISSION DETAILS

Papers should be original and contain contributions of theoretical or experimental nature.
Interested authors should submit a paper (up to 8 pages, formatted in the style of the IEEE Proceedings format - http://computer.org/cspress/instruct.htm), including keywords, using the submission form (http://www.softconf.com/start/ITCC2004/submit.html), before November 21, 2003. Instructions about submission are also available (http://www.ee.unlv.edu/%7Eit/Files/start/how-to-submit.html). If any problem occurs during electronic paper submission, please contact the track chair:

Maria Mirto, CACT/ISUFI (Center for Advanced Computing Technology),
c/o Engineering Faculty, Department of Innovation Engineering,
University of Lecce, Via per Monteroni, 73100 Lecce, Italy,
Voice: +39-0832-297304, Fax: +39-0832-297279,
Email: maria.mirto at unile.it

Electronic submission (PostScript or PDF) is strongly encouraged.

From idoerg at burnham.org Mon Nov 10 19:24:39 2003
From: idoerg at burnham.org (Iddo Friedberg)
Date: Mon, 10 Nov 2003 16:24:39 -0800
Subject: [BiO BB] Room-mate for PSB
Message-ID: <3FB02C47.6010509@burnham.org>

(Apologies for cross-posting).

Posting this for a friend, please reply directly to her:

Female, nonsmoker looking for same as a roommate for PSB Jan 5-10 2004.
Please reply to: jsasin at burnham.org

./I

--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://ffas.ljcrf.edu/~iddo

From risea at cecalc.ula.ve Tue Nov 11 04:04:35 2003
From: risea at cecalc.ula.ve (Raul Isea)
Date: Tue, 11 Nov 2003 05:04:35 -0400 (VET)
Subject: [BiO BB] Portal Iberoamericano de Bioinformatica en español
Message-ID: <64450.150.185.178.152.1068541475.squirrel@death.ula.ve>

Hello everyone,

This message is to invite you to get to know and take part (by registering) in the "Portal Iberoamericano de Bioinformática". As its name indicates, it is a portal focused on bioinformatics across the whole Ibero-American region. The website of the Portal Iberoamericano de Bioinformática is at:

http://portal-bio.ula.ve/

The portal aims to organize and channel information on computational biology from centers and laboratories in the Spanish- and Portuguese-speaking regions of the world.

The Portal Iberoamericano de Bioinformática (http://portal-bio.ula.ve) offers several channels and indexes of information, among them:

- International news in the field of bioinformatics, such as "Genetic mutation found in obsessive-compulsive disorder", among others.
- Scholarship calls for both undergraduate and postgraduate students. The current calls for study in Europe are the Alban and AECI calls.
- Events, courses and workshops around the world.
- Recommendations of specialized books on computational biology, for example "Fundamental concepts on Bioinformatics".
- The ability to run Google searches.
- Address books.
- A public and private calendar.
- Coming soon: a download section for free (or demo) bioinformatics programs, for both Windows and Linux.

To stay informed and have access to this information it is important to subscribe/register at the portal, which takes only a minute. Visit the following link to subscribe: http://portal-bio.ula.ve/user.php

In the coming days a private CHAT area will be active for all members of the portal, and a videoconference channel is expected to follow in the near future, as the most suitable way to bring together centers and laboratories worldwide that work on, or look for information about, bioinformatics in Spanish and Portuguese.

It is worth stressing that the main advantage of working in a portal is that all the information is dynamic: it is handled automatically, so there is no waiting time before it is published.

The Portal Iberoamericano de Bioinformática also hosts La Revista Iberoamericana de Bioinformática, which will launch its first issue in January 2004 (all the information needed to submit papers and/or notes to the journal is on the portal). Do not miss the opportunity to submit a paper that characterizes your line of research, or to highlight or comment on the center and/or laboratory where you work. To guarantee the quality of the journal, the Editorial Committee is made up of researchers from several countries (Spain, Chile, Argentina, Brazil and Venezuela). For more information, you can write directly to any member of the Editorial Committee.

Finally, the added value of the portal will depend on all of us, and it should become a reference for centers and universities, as well as a source of useful information for industry with an interest in bioinformatics.

Remember that you can register at the portal: http://portal-bio.ula.ve/

Kind regards,
Raúl Isea

****
Dr.
Raul Isea
PhD in Chemistry
Email: risea at cecalc.ula.ve
http://www.cecalc.ula.ve/~risea
(Mirror at http://www.geocities.com/lrisea)

From idoerg at burnham.org Mon Nov 10 17:40:12 2003
From: idoerg at burnham.org (Iddo Friedberg)
Date: Mon, 10 Nov 2003 14:40:12 -0800
Subject: [BiO BB] Room-mate for PSB
Message-ID: <3FB013CC.9040501@burnham.org>

(Apologies for cross-posting).

Posting this for a friend, please reply directly to her:

Female, nonsmoker looking for same as a roommate for PSB Jan 5-10 2004.
Please reply to: jsasin at burnham.org

./I

--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://ffas.ljcrf.edu/~iddo

From maria.mirto at unile.it Sat Nov 15 08:55:02 2003
From: maria.mirto at unile.it (Maria Mirto)
Date: Sat, 15 Nov 2003 14:55:02 +0100
Subject: [BiO BB] LAST REMIND CFP: ITCC04-Distributed and Grid Systems
Message-ID: <200311151455.03284.maria.mirto@unile.it>

Dear all,

the deadline of the IEEE ITCC 2004 has been extended. Submissions for ITCC and the "Distributed and Grid Systems" track are possible until November 21, 2003.

Call for Papers: http://datadog.unile.it/itcc2004/cfp.htm
Submission Page: http://www.softconf.com/start/ITCC2004/submit.html

Best Regards,
Maria Mirto

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Maria Mirto, CACT/ISUFI (Center for Advanced Computational Technologies)
Engineering Faculty, Department of Innovation Engineering
University of Lecce, Via per Monteroni, 73100 Lecce, Italy
phone: +39-0832-297304, fax: +39-0832-297279
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We apologize if you receive multiple copies of this email.

From mjmccorm at mtu.edu Sat Nov 15 19:33:59 2003
From: mjmccorm at mtu.edu (Matt McCormick)
Date: Sat, 15 Nov 2003 19:33:59 -0500
Subject: [BiO BB] perl help
Message-ID: <1068942839.28328.5.camel@bioinfpc1>

Hi,

I am trying to construct a sequence, with start position x and end position y, from A. thaliana chromosome 1. Does anyone have any code they would be willing to share that would do this? I have the current perl and bioperl modules installed. Thank you.

-Matt McCormick

From landman at scalableinformatics.com Sat Nov 15 19:47:16 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Sat, 15 Nov 2003 19:47:16 -0500
Subject: [BiO BB] perl help
In-Reply-To: <1068942839.28328.5.camel@bioinfpc1>
References: <1068942839.28328.5.camel@bioinfpc1>
Message-ID: <1068943635.10170.8.camel@protein.scalableinformatics.com>

Hi Matt:

On Sat, 2003-11-15 at 19:33, Matt McCormick wrote:
> Hi,
> I am trying to construct a sequence, with start position x and end
> position y, from A.thaliana chromosome 1. Does anyone have any code
> they would be willing to share that would do this? I have the current
> perl and bioperl modules installed. Thank you.

I might suggest giving a better specification of the problem. You have the chromosome 1 data in some format, and you need to extract a particular section from it? If this is the case, then is there a particular reason why the substr function will not work?
my $small = substr $sequence,$x,$y-$x+1 ; Joe > > -Matt McCormick > > > _______________________________________________ > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board -- Joe Landman Scalable Informatics email: landman at scalableinformatics.com phone: +1 734 612 4615 web: http://scalableinformatics.com From mjmccorm at mtu.edu Sun Nov 16 16:28:21 2003 From: mjmccorm at mtu.edu (Matt McCormick) Date: Sun, 16 Nov 2003 16:28:21 -0500 Subject: [BiO BB] perl help In-Reply-To: <1068943635.10170.8.camel@protein.scalableinformatics.com> References: <1068942839.28328.5.camel@bioinfpc1> <1068943635.10170.8.camel@protein.scalableinformatics.com> Message-ID: <1069018101.6429.22.camel@bioinfpc1> Please allow me to elaborate a bit more. I'm trying to pull a random 100 kb section from the chromosomal sequence(in genbank format, Accession NC_003070). The chromosome sequence is 30494425 bp long, and the positions x and y represent the base i want my sequence to start and end respectively. I'm just beginning to use perl, and for me to use substr function, i would have to find and join all the lines starting at the ORIGIN. I m not sure on the best way to do this. Any help would be appreciated. Thank you. On Sat, 2003-11-15 at 19:47, Joe Landman wrote: > Hi Matt: > > > On Sat, 2003-11-15 at 19:33, Matt McCormick wrote: > > Hi, > > I am trying to construct a sequence, with start position x and end > > position y, from A.thaliana chromosome 1. Does anyone have any code > > they would be willing to share that would do this? I have the current > > perl and bioperl modules installed. Thank you. > > I might suggest giving a better specification of the problem. You > have the chromosome 1 data in some format, and you need to extract a > particular section from it? If this is the case, then is there a > particular reason why the substr function will not work? 
> > my $small = substr $sequence,$x,$y-$x+1 ; > > Joe > > > > > -Matt McCormick > > > > > > _______________________________________________ > > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org > > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board From pooja at igc.gulbenkian.pt Mon Nov 17 06:15:10 2003 From: pooja at igc.gulbenkian.pt (Pooja Jain) Date: Mon, 17 Nov 2003 11:15:10 -0000 (WET) Subject: [BiO BB] Matching and Filtering Message-ID: <49731.194.117.22.137.1069067710.squirrel@webmail.igc.gulbenkian.pt> Dear List members, I am having a txt file with a list of accession numbers for few of the seqeuence from entire Arabidopsis thaliana genome. I have another tab delimited txt file with all the accession numbers and other details about every sequence peresent in the genome of it (row wise). From this later file I want to filter the details about only those sequences which have the same accesion numbers as in the former file. Could some one please suggest some simple way to do this matching and filtering? I tried using the simple shell scripts commands like cmp and diff but none of them worked. Is ther any other command I can use with the shell. Any other way to do so with perl is also welcome. Thank you. Best Regards, -Pooja From dig at bioinformatics.org Mon Nov 17 07:12:13 2003 From: dig at bioinformatics.org (Dmitri I GOULIAEV) Date: Mon, 17 Nov 2003 06:12:13 -0600 Subject: [BiO BB] Matching and Filtering -- try grep In-Reply-To: <49731.194.117.22.137.1069067710.squirrel@webmail.igc.gulbenkian.pt>; from pooja@igc.gulbenkian.pt on Mon, Nov 17, 2003 at 11:15:10AM -0000 References: <49731.194.117.22.137.1069067710.squirrel@webmail.igc.gulbenkian.pt> Message-ID: <20031117061212.J6754@lifebook> Hi, Pooja Jain ! On Mon, Nov 17, 2003 at 11:15:10AM -0000, Pooja Jain wrote: > I am having a txt file with a list of accession numbers for few of the > seqeuence from entire Arabidopsis thaliana genome. 
I have another tab > delimited txt file with all the accession numbers and other details about > every sequence peresent in the genome of it (row wise). From this later > file I want to filter the details about only those sequences which have > the same accesion numbers as in the former file. > > Could some one please suggest some simple way to do this matching and > filtering? I tried using the simple shell scripts commands like cmp and > diff but none of them worked. Is ther any other command I can use with the > shell. Any other way to do so with perl is also welcome. From man pages: grep, egrep, fgrep - print lines matching a pattern You should use grep. If file-with-a-list is a txt file with a list of accession numbers and file-with-all-the-details is the other file, then this shell one-liner user at host$ cat file-with-a-list \ | while read AN ; do \ grep "^$AN" file-with-all-the-details ; \ done >> file-with-the-details-for-the-listed-accnum should work for you (if the accession numbers are at the beginning of the lines in the "other" file). If they are not, but there are some white-space characters at the beginning of each lines, then change "^$AN" to "[ \t]$AN" (with quotation marks). Hope this helps, -- DIG (Dmitri I GOULIAEV) http://www.bioinformatics.org/~dig/ 1024D/63A6C649: 26A0 E4D5 AB3F C2D4 0112 66CD 4343 C0AF 63A6 C649 From arunnuraa at rediffmail.com Mon Nov 17 09:21:57 2003 From: arunnuraa at rediffmail.com (Arun . A) Date: 17 Nov 2003 14:21:57 -0000 Subject: [BiO BB] Computer program portal?? Message-ID: <20031117142157.5622.qmail@webmail7.rediffmail.com> An HTML attachment was scrubbed... URL: -------------- next part -------------- Dear Can any one tell me that any links that contain free computer programs/Algorithms that solve the biological problems. 
With regards A.A From pooja at igc.gulbenkian.pt Mon Nov 17 13:30:25 2003 From: pooja at igc.gulbenkian.pt (Pooja Jain) Date: Mon, 17 Nov 2003 18:30:25 -0000 (WET) Subject: [BiO BB] Matching and Filtering -- try grep- thanks In-Reply-To: <20031117061212.J6754@lifebook> References: <49731.194.117.22.137.1069067710.squirrel@webmail.igc.gulbenkian.pt> <20031117061212.J6754@lifebook> Message-ID: <54082.194.117.22.137.1069093825.squirrel@webmail.igc.gulbenkian.pt> Hi Dmitri I Gouliaev , Thank you for your suggestion. I followed the grep man pages and used grep -f and it worked. grep -f 'file1.txt' file2.txt > file3.txt Where file1.txt has the list of accession numbers corresponding to which I would like to filter the details from file2.txt. But the above command writes the contents of the file2.txt to file3.txt. thanks again. Regards, -Pooja > Hi, Pooja Jain ! > > On Mon, Nov 17, 2003 at 11:15:10AM -0000, Pooja Jain wrote: > >> I am having a txt file with a list of accession numbers for few of the >> seqeuence from entire Arabidopsis thaliana genome. I have another tab >> delimited txt file with all the accession numbers and other details >> about >> every sequence peresent in the genome of it (row wise). From this later >> file I want to filter the details about only those sequences which have >> the same accesion numbers as in the former file. >> >> Could some one please suggest some simple way to do this matching and >> filtering? I tried using the simple shell scripts commands like cmp and >> diff but none of them worked. Is ther any other command I can use with >> the >> shell. Any other way to do so with perl is also welcome. > > From man pages: > > grep, egrep, fgrep - print lines matching a pattern > > You should use grep. 
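A note on the `grep -f` approach used in this thread: by default grep treats each line of the list file as a regular expression and matches it anywhere in a row, so a short accession can also pull in rows for longer accessions that contain it. If the accession is the first tab-separated column of the details file, an exact comparison on that column may be safer. A minimal sketch, assuming made-up file names and IDs (none of them come from the thread):

```shell
#!/bin/sh
# Sketch: exact-match filtering of a tab-delimited table by a list of IDs.
# All file names and accession-like IDs below are made up for illustration.

workdir=$(mktemp -d)
ids="$workdir/ids.txt"
details="$workdir/details.tsv"

printf 'AB0001\nAB0003\n' > "$ids"
printf 'AB0001\talpha\nAB00012\tdecoy\nAB0002\tbeta\nAB0003\tgamma\n' > "$details"

# While reading the first file (NR == FNR), remember each ID.
# While reading the second file, print rows whose first column is a remembered ID.
filter_by_id() {
    awk -F '\t' 'NR == FNR { want[$1] = 1; next } $1 in want' "$1" "$2"
}

filter_by_id "$ids" "$details"
```

With this sample data the `AB00012` row is skipped, whereas a plain `grep -f` on the same files would keep it because `AB0001` is a substring of it. The sketch assumes the ID list is not empty and has no trailing whitespace on its lines.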
> > If > file-with-a-list is a txt file with a list of accession numbers > and > file-with-all-the-details is the other file, > > then this shell one-liner > > user at host$ cat file-with-a-list \ > | while read AN ; do \ > grep "^$AN" file-with-all-the-details ; \ > done >> file-with-the-details-for-the-listed-accnum > > should work for you (if the accession numbers are at the beginning of the > lines in the "other" file). If they are not, but there are some > white-space characters at the beginning of each lines, then change "^$AN" > to "[ \t]$AN" (with quotation marks). > > Hope this helps, > > -- > DIG (Dmitri I GOULIAEV) http://www.bioinformatics.org/~dig/ > 1024D/63A6C649: 26A0 E4D5 AB3F C2D4 0112 66CD 4343 C0AF 63A6 C649 > _______________________________________________ > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > From idoerg at burnham.org Mon Nov 17 13:32:28 2003 From: idoerg at burnham.org (Iddo Friedberg) Date: Mon, 17 Nov 2003 10:32:28 -0800 Subject: [BiO BB] Tree/hierarchcal clustering software? Message-ID: <3FB9143C.9000902@burnham.org> Hi, (Apologies for cross-posting). I have data in the form of a distance matrix (well, similarity, but easily convertible). I would like a hierarchical clustering software which: 1) Has a good number of amalgamation rules. I want to throw as many things as I can at my data set. A good number of distance rules (Eucledian, squared, Chebychev etc.) would be nice too. 2) A nice tree display. Failing that, an output in one of the standard formats which I can display using TreeView, or somesuch. I know I can use Neighbor from Phylip, I just wondered what else is out there. Any reply would be helpful. Thanks, Iddo -- Iddo Friedberg, Ph.D. The Burnham Institute 10901 N. Torrey Pines Rd. La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 646 3171 http://ffas.ljcrf.edu/~iddo -- Iddo Friedberg, Ph.D. 
The Burnham Institute 10901 N. Torrey Pines Rd. La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 646 3171 http://ffas.ljcrf.edu/~iddo From stophebat at yahoo.fr Mon Nov 17 13:50:30 2003 From: stophebat at yahoo.fr (Christophe Battail) Date: Mon, 17 Nov 2003 18:50:30 +0000 Subject: [BiO BB] Tree/hierarchcal clustering software? In-Reply-To: <3FB9143C.9000902@burnham.org> References: <3FB9143C.9000902@burnham.org> Message-ID: <3FB91876.1080400@yahoo.fr> Hi, I know two software that seems to correspond to your needs: visual interface, hierarchical agglomerative clustering implemented, choice between various similarity metrics, output visualized using a tree. Their names: Genesis or J-express (free to download). cheers, christophe Iddo Friedberg wrote: > > Hi, > > (Apologies for cross-posting). > > I have data in the form of a distance matrix (well, similarity, but > easily convertible). I would like a hierarchical clustering software > which: > > 1) Has a good number of amalgamation rules. I want to throw as many > things as I can at my data set. A good number of distance rules > (Eucledian, squared, Chebychev etc.) would be nice too. > > 2) A nice tree display. Failing that, an output in one of the standard > formats which I can display using TreeView, or somesuch. > > I know I can use Neighbor from Phylip, I just wondered what else is out > there. Any reply would be helpful. > > Thanks, > > Iddo > From cdwan at mail.ahc.umn.edu Mon Nov 17 14:13:53 2003 From: cdwan at mail.ahc.umn.edu (Chris Dwan (CCGB)) Date: Mon, 17 Nov 2003 13:13:53 -0600 (CST) Subject: [BiO BB] Tree/hierarchcal clustering software? In-Reply-To: <3FB9143C.9000902@burnham.org> References: <3FB9143C.9000902@burnham.org> Message-ID: We have a clustering toolkit under a web interface at: http://cluto.ccgb.umn.edu/cgi-bin/wCluto/wCluto.cgi I've never used it myself. -Chris Dwan University of Minnesota On Mon, 17 Nov 2003, Iddo Friedberg wrote: > > Hi, > > (Apologies for cross-posting). 
> > I have data in the form of a distance matrix (well, similarity, but > easily convertible). I would like a hierarchical clustering software which: > > 1) Has a good number of amalgamation rules. I want to throw as many > things as I can at my data set. A good number of distance rules > (Eucledian, squared, Chebychev etc.) would be nice too. > > 2) A nice tree display. Failing that, an output in one of the standard > formats which I can display using TreeView, or somesuch. > > I know I can use Neighbor from Phylip, I just wondered what else is out > there. Any reply would be helpful. > > Thanks, > > Iddo > > -- > Iddo Friedberg, Ph.D. > The Burnham Institute > 10901 N. Torrey Pines Rd. > La Jolla, CA 92037 > USA > Tel: +1 (858) 646 3100 x3516 > Fax: +1 (858) 646 3171 > http://ffas.ljcrf.edu/~iddo > > > -- > Iddo Friedberg, Ph.D. > The Burnham Institute > 10901 N. Torrey Pines Rd. > La Jolla, CA 92037 > USA > Tel: +1 (858) 646 3100 x3516 > Fax: +1 (858) 646 3171 > http://ffas.ljcrf.edu/~iddo > > _______________________________________________ > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > From idoerg at burnham.org Mon Nov 17 14:22:21 2003 From: idoerg at burnham.org (Iddo Friedberg) Date: Mon, 17 Nov 2003 11:22:21 -0800 Subject: [BiO BB] Tree/hierarchcal clustering software? In-Reply-To: References: <3FB9143C.9000902@burnham.org> Message-ID: <3FB91FED.5070601@burnham.org> Hi Chris, Good to hear from you. Yes, I know Cluto, but it does not seem to have hierarchical clustering by distance matrix, only presentation. Cheers, ./I Chris Dwan (CCGB) wrote: > We have a clustering toolkit under a web interface at: > > http://cluto.ccgb.umn.edu/cgi-bin/wCluto/wCluto.cgi > > I've never used it myself. > > -Chris Dwan > University of Minnesota > > On Mon, 17 Nov 2003, Iddo Friedberg wrote: > > >>Hi, >> >>(Apologies for cross-posting). 
>> >>I have data in the form of a distance matrix (well, similarity, but >>easily convertible). I would like a hierarchical clustering software which: >> >>1) Has a good number of amalgamation rules. I want to throw as many >>things as I can at my data set. A good number of distance rules >>(Eucledian, squared, Chebychev etc.) would be nice too. >> >>2) A nice tree display. Failing that, an output in one of the standard >>formats which I can display using TreeView, or somesuch. >> >>I know I can use Neighbor from Phylip, I just wondered what else is out >>there. Any reply would be helpful. >> >>Thanks, >> >>Iddo >> >>-- >>Iddo Friedberg, Ph.D. >>The Burnham Institute >>10901 N. Torrey Pines Rd. >>La Jolla, CA 92037 >>USA >>Tel: +1 (858) 646 3100 x3516 >>Fax: +1 (858) 646 3171 >>http://ffas.ljcrf.edu/~iddo >> >> >>-- >>Iddo Friedberg, Ph.D. >>The Burnham Institute >>10901 N. Torrey Pines Rd. >>La Jolla, CA 92037 >>USA >>Tel: +1 (858) 646 3100 x3516 >>Fax: +1 (858) 646 3171 >>http://ffas.ljcrf.edu/~iddo >> >>_______________________________________________ >>BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board >> > > _______________________________________________ > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > -- Iddo Friedberg, Ph.D. The Burnham Institute 10901 N. Torrey Pines Rd. 
La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 646 3171 http://ffas.ljcrf.edu/~iddo From hjm at tacgi.com Mon Nov 17 16:08:38 2003 From: hjm at tacgi.com (Harry Mangalam) Date: Mon, 17 Nov 2003 13:08:38 -0800 Subject: [BiO BB] Matching and Filtering -- try grep- thanks In-Reply-To: <54082.194.117.22.137.1069093825.squirrel@webmail.igc.gulbenkian.pt> References: <49731.194.117.22.137.1069067710.squirrel@webmail.igc.gulbenkian.pt> <20031117061212.J6754@lifebook> <54082.194.117.22.137.1069093825.squirrel@webmail.igc.gulbenkian.pt> Message-ID: <3FB938D6.9060102@tacgi.com> Try this. so named b/c it can act as a 'super cut' or can perform a lot of slicing/dicing scut work for you I wrote it do exactly what (I think) you're describing. Here's the description: Usage: scut [options, below] > output_file --f1=[file1] - the shorter or 'needle' file. If using as a smarter cut, use STDIN. --f2=[file2] - the 'longer' or 'haystack' file --k1=col# - the key column from file1 (numbered from ZERO, not 1) i.e the number of the column (starting from 0) that has the key column name for file1 (see example below) --k2=col# - the key column from file2 (ditto) --c1='# # ..' - the numbers of the columns from file1 that you want or printed out in the order in which you want them. If --c1='A C F ..' you DON'T want any columns from the file, just enter it as '' (2 single quotes) or omit it completely. If you want the whole line, type 'all' Notes: 1) #s are split on whitespace, not commas. or 2) scut also supports Excel-style column specifiers --c1='A C F ..' (A B F AD BG etc) for up to 78 columns (->BZ) If you want more, add them to the %excel_ids hash above or create an algo that does it right. --c2='# # ..' - ditto for file2 or --c2='A C F ..' --id1='...' - the delimiter string for file1; defaults to whitespace (specify TAB = '\t'), but can be a multicharacter string as well such as '_|_' --id2='...' - ditto for file2 --od='...' 
- the delimiter string for the output (defaults to TAB) --noerr - stops most stderr from being generated (for large files, most of the CPU is dedicated to processing the STDERR text stream (thanks for stressing it, Peter), but if you need this output, you'll just have to deal with it. NB: the following 3 options: --begin, --end, --excl currently only work with the single file version (as a smarter cut, not the merging functions). Stay tuned for the 2 file version.. --begin=[#|regex] - specifies the line to START processing data at (for example, if the file has 2 format sections and you only want to process one of them). The option can be either an integer value to specify the line number, or a non- repeating regular expression that unambiguously identifies the line. --end=[#|regex] - as above, but specifies the line to STOP processing data at. --excl - if added to the arguments, excludes the lines specified by --begin and --end (in case you need to exclude the defining header lines). --version - gives the version of the software and dies. --nocase - makes the merging key case INSENSITIVE. --sync - whether you want the output sync'ed on file2. The sync will insert blank lines where there are comments as well. --help - dumps these lines to stdout and dies. Notes: = there have to be the same number of columns in each line or it will get confused. The matches are case-sensitive, unless you use the '--nocase' option to turn it off. = scut sends its output to stdout, so if you want to catch the output in a file, use redirection '>' (see below) and if you want to catch the stderr you'll have to catch that as well ( >& out ). = scut ignores any line that starts with a '#', so you can document what the columns mean, add column numbering, etc, as long as those lines start with a '#' = scut always puts the matched key in the 1st column of the output = under Win/DOS execution, you will probably need to run it with the perl prefix i.e. 
perl scut [options] and will also have to enclose the option strings with DOUBLE QUOTES (\"opts\") instead of single quotes('opts'). Pooja Jain wrote: > Hi Dmitri I Gouliaev , > Thank you for your suggestion. I followed the grep man pages and used > grep -f and it worked. > > grep -f 'file1.txt' file2.txt > file3.txt > > Where file1.txt has the list of accession numbers corresponding to which I > would like to filter the details from file2.txt. But the above command > writes the contents of the file2.txt to file3.txt. > > thanks again. > > Regards, > -Pooja > > >>Hi, Pooja Jain ! >> >> On Mon, Nov 17, 2003 at 11:15:10AM -0000, Pooja Jain wrote: >> >> >>>I am having a txt file with a list of accession numbers for few of the >>>seqeuence from entire Arabidopsis thaliana genome. I have another tab >>>delimited txt file with all the accession numbers and other details >>>about >>>every sequence peresent in the genome of it (row wise). From this later >>>file I want to filter the details about only those sequences which have >>>the same accesion numbers as in the former file. >>> >>>Could some one please suggest some simple way to do this matching and >>>filtering? I tried using the simple shell scripts commands like cmp and >>>diff but none of them worked. Is ther any other command I can use with >>>the >>>shell. Any other way to do so with perl is also welcome. >> >>From man pages: >> >> grep, egrep, fgrep - print lines matching a pattern >> >>You should use grep. >> >>If >> file-with-a-list is a txt file with a list of accession numbers >>and >> file-with-all-the-details is the other file, >> >>then this shell one-liner >> >> user at host$ cat file-with-a-list \ >> | while read AN ; do \ >> grep "^$AN" file-with-all-the-details ; \ >> done >> file-with-the-details-for-the-listed-accnum >> >>should work for you (if the accession numbers are at the beginning of the >>lines in the "other" file). 
If they are not, but there are some >>white-space characters at the beginning of each lines, then change "^$AN" >>to "[ \t]$AN" (with quotation marks). >> >>Hope this helps, >> >>-- >>DIG (Dmitri I GOULIAEV) http://www.bioinformatics.org/~dig/ >>1024D/63A6C649: 26A0 E4D5 AB3F C2D4 0112 66CD 4343 C0AF 63A6 C649 >>_______________________________________________ >>BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board >> > > > _______________________________________________ > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > -- Cheers, Harry Harry J Mangalam - 949 856 2847 (v&f) - hjm at tacgi.com <> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: scut URL: From dmb at mrc-dunn.cam.ac.uk Tue Nov 18 12:52:46 2003 From: dmb at mrc-dunn.cam.ac.uk (Dan Bolser) Date: Tue, 18 Nov 2003 17:52:46 -0000 (GMT) Subject: [BiO BB] perl help In-Reply-To: <1069018101.6429.22.camel@bioinfpc1> References: <1068942839.28328.5.camel@bioinfpc1> <1068943635.10170.8.camel@protein.scalableinformatics.com> <1069018101.6429.22.camel@bioinfpc1> Message-ID: <48127.193.60.81.207.1069177966.squirrel@www.mrc-dunn.cam.ac.uk> Not sure about the GenBank format, but could you do something like this (assuming the sequence lines follow the ORIGIN line and the record ends with a // line):

    my ($sequence, $in_origin) = ('', 0);
    while (<>) {                                  # iterate through the file given on the command line
        if (/^ORIGIN/) { $in_origin = 1; next; }  # sequence lines follow the ORIGIN line
        last if m{^//};                           # "//" ends the record
        next unless $in_origin;
        s/[\d\s]//g;                              # strip position numbers and whitespace
        $sequence .= $_;
    }
    my $len   = 100_000;                          # 100 kb
    my $start = int rand(length($sequence) - $len + 1);   # random 0-based start that leaves room for $len bases
    my $small = substr($sequence, $start, $len);
    print "$small\n";

Matt McCormick said: > Please allow me to elaborate a bit more. I'm trying to pull a random 100 kb > section from the chromosomal sequence(in genbank format, > Accession NC_003070). The chromosome sequence is 30494425 bp long, and the > positions x and y represent the base i want my sequence to start and end > respectively.
I'm just beginning to use perl, and for me to use substr function, > i would have to find and join all the lines starting at the ORIGIN. I m not sure > on the best way to do this. Any help would be appreciated. Thank you. > > > On Sat, 2003-11-15 at 19:47, Joe Landman wrote: >> Hi Matt: >> >> >> On Sat, 2003-11-15 at 19:33, Matt McCormick wrote: >> > Hi, >> > I am trying to construct a sequence, with start position x and end position y, >> from A.thaliana chromosome 1. Does anyone have any code they would be willing >> to share that would do this? I have the current perl and bioperl modules >> installed. Thank you. >> >> I might suggest giving a better specification of the problem. You >> have the chromosome 1 data in some format, and you need to extract a particular >> section from it? If this is the case, then is there a particular reason why the >> substr function will not work? >> >> my $small = substr $sequence,$x,$y-$x+1 ; >> >> Joe >> >> > >> > -Matt McCormick >> > >> > >> > _______________________________________________ >> > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org >> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board > > > _______________________________________________ > BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/bio_bulletin_board From mbussolati at gujm.it Wed Nov 19 04:29:48 2003 From: mbussolati at gujm.it (Mariella Bussolati) Date: Wed, 19 Nov 2003 10:29:48 +0100 Subject: [BiO BB] Re: BiO_Bulletin_ search for free software In-Reply-To: <20031118170107.06DB4D25D4@www.bioinformatics.org> Message-ID: Hi, i am an italian science journalist . 
As a volunteer with a group named Laser (www.e-laser.org), which takes a critical perspective on research and the work of researchers and is involved in issues such as the application of free software to science, the concept of common goods applied to scientific knowledge, and awareness of the consequences of copyright, I am trying to help put together a CD with some examples of open source software used in research. We are trying to do that before the World Summit on the Information Society, which will be held in Geneva on 10-12 December, as a reminder that scientific information also needs to be shared for the benefit of all human beings. If you would like to make a suggestion, please reply (privately if you prefer) with some information on the software and a link for more information and uploading. Best regards, Mariella From johannes.huesing at uni-essen.de Wed Nov 19 07:18:16 2003 From: johannes.huesing at uni-essen.de (Johannes Hüsing) Date: Wed, 19 Nov 2003 13:18:16 +0100 Subject: [BiO BB] Re: BiO_Bulletin_ search for free software In-Reply-To: ; from mbussolati@gujm.it on Wed, Nov 19, 2003 at 10:29:48AM +0100 References: <20031118170107.06DB4D25D4@www.bioinformatics.org> Message-ID: <20031119131816.A33658@spi.power.uni-essen.de> On Wed, Nov 19, 2003 at 10:29:48AM +0100, Mariella Bussolati wrote: > > Hi, i am an italian science journalist .
As a volunteer of a group named > laser (www.e-laser.org) which works on a critics perspective of research > and the work of researchers, and is involved in issue as free software > applicaton to science, the concept of common goods translated to the science > knowledge, the awareness of the consequences of copyrights, i am trying to > help to collect a Cd with some examples of open source software used in some > research. Have you looked at the Quantian project? http://dirk.eddelbuettel.com/quantian.html [...] > If you want to make your suggestion, please make a private answer if you > prefer, give some info on the software and a link for more information and > uploading. The reply-to list "feature" does not lend very well to private answers. From brodo at itc.it Wed Nov 19 11:13:27 2003 From: brodo at itc.it (Linda Brodo) Date: Wed, 19 Nov 2003 17:13:27 +0100 Subject: [BiO BB] Job positions at ITC-irst Message-ID: <3FBB96A7.7070504@itc.it> Dear all, the Bioinformatics group of the Institute ITC-irst (http://www.itc.it) in Trento (Italy) offers two positions (researcher and research assistant) in the Bioinformatic area. The research field concerns the application of computer science Formal Methods for the simulation and the analysing of the behavior of biological systems. Please, have a look at the call at http://sra.itc.it/positions/bias_call.html There will be also the possibility of applying for a Ph.D. in this topic at the international Ph.D school at the University of Trento, sponsored by ITC-irst For more information and submissions, please reply to bioinfo at itc.it Best Regards, Linda Brodo ITC-irst The Centre for Scientific and Technological Research (ITC-irst) was founded in the 1976 as a public research center of the Autonomous Province of Trento, Italia. The Centre has developed research in the sectors of Information Technologies, Microsystems, and Physical Chemistry of Surfaces and Interfaces. 
The basic and applied research activities of ITC-irst aim at solving real problems driven by the need for technological innovation in the economic world. The Centre fulfills its mission by disseminating its results and through technology transfer projects which involve enterprises and public entities. ITC-irst is composed of five divisions and two application areas. For more information on the ITC Institute, please see http://www.irst.itc.it/

Bioinformatics group

The Bioinformatics group is characterized by a multi-disciplinary approach and attention to real bio-medical problems and challenges. We have established active collaborations with pathologists, biologists, and physicians in order to create a multi-disciplinary co-operative work environment, with the ultimate goal of defining methods for supporting bio-medical research and physicians' daily activity. The main part of our activity takes place in the fields of Knowledge Discovery and Data Mining and of Formal Methods, two current topics in bioinformatics research. We are now working on machine learning methods for the analysis of bio-genetic data provided by high-throughput experiments and related to clinical information, and on modelling and analysing complex biological systems by means of logical formalisms. The group is currently made up of 5 researchers with different backgrounds (computer science, mathematics, physics) and with expertise in statistics, machine learning, formal methods, image processing, web-based systems, etc. For more information on the Bioinformatics group, please send an email to sboner at itc.it

SRA division

The Automated Reasoning Systems (SRA) division consists of about 60 people and develops methodologies and technologies that help increase the autonomy, quality, safety and reliability of software systems. The members of SRA have been actively working in the field of formal verification since 1990, developing techniques and tools for automated deduction and model checking.
SRA has been active in the development of NuSMV, an open architecture for model checking, and of bounded model checking techniques based on decision procedures for propositional satisfiability (SAT). Model checking techniques have been applied to the design and verification of safety critical systems, in particular in the field of railways, avionics, and industrial control. For more information on the SRA division of ITC-irst, please see http://sra.itc.it From dils04 at izbi.uni-leipzig.de Tue Nov 18 12:35:21 2003 From: dils04 at izbi.uni-leipzig.de (DILS04) Date: Tue, 18 Nov 2003 18:35:21 +0100 Subject: [BiO BB] Final Call: Data Integration in the Life Sciences, DILS04 Message-ID: <3FBA5859.5060503@izbi.uni-leipzig.de> Final CALL FOR PAPERS Int. Workshop on Data Integration in the Life Sciences (DILS 2004) Deadline: Nov. 30, 2003 Industrial exhibits welcome Proceedings will be published in Springer LNCS http://izbi.uni-leipzig.de/dils04 Workshop date: March 25-26, 2004, Univ. of Leipzig, Germany ------------------------------------------------------- AIM AND SCOPE New advances in life sciences, e.g. molecular biology, biodiversity, drug discovery and medical research, increasingly depend on bioinformatics methods to manage and analyze vast amounts of highly diverse data. The volume of data is increasing at an unprecedented pace, fueled by world-wide research activities producing publicly available data, and new technologies, e.g. high-throughput devices such as microarrays. Thus, data mining and analysis require comprehensive integration of heterogeneous data, that is typically distributed across many data sources on the web and often structured only to a limited extent. Despite new interoperability technologies such as XML and web services, data integration is a highly difficult and still largely manual task, especially due to the high degree of semantic heterogeneity and varying data quality as well as specific application requirements. 
DILS 2004 aims at providing a new forum for presenting novel research results and assessing the state of the art in the field of data integration for life sciences. It addresses researchers, professionals, and industrial practitioners to share their knowledge on this highly important bioinformatics subject. In an effort to bring together academics and industrial practitioners, we solicit both research papers and application / experience papers. It is planned to publish accepted papers by Springer-Verlag in the Lecture Notes in Computer Science (LNCS) series. TOPICS OF INTEREST Topics of interest include, but are not limited to: * Challenges for data integration in life sciences * Architectures for data integration in life sciences * Ontology-based data integration and analysis * Metadata and annotation management * Data quality and data cleaning * Tool integration and experimental workflows * Evaluation of data integration approaches in life sciences * Data integration for specific applications * Prototypes and commercial solutions PAPER SUBMISSION Authors are invited to submit original, previously unpublished papers. All submitted papers will be peer-refereed for quality, correctness, originality and relevance. Accepted papers will be published in the workshop proceedings which will be available at the workshop. Submissions must not exceed 15 pages and should be formatted according to the LNCS guidelines under: http://www.springer.de/comp/lncs/authors.html All submissions will be handled electronically. Please send your submissions in PDF format to dils04 at izbi.uni-leipzig.de Please also visit the workshop website for further information http://izbi.uni-leipzig.de/dils04 IMPORTANT DATES Paper submissions: November 30, 2003 Author notification: January 8, 2004 Camera-ready due: January 28, 2004 Workshop date: March 25-26, 2004 WORKSHOP CHAIR Erhard Rahm, Univ. 
of Leipzig

PROGRAM COMMITTEE
Howard Bilofsky, GlaxoSmithKline, USA
Terence Critchlow, Lawrence Livermore National Laboratory, USA
Peter Gray, Univ. of Aberdeen, UK
Barbara Heller, Univ. of Leipzig, Germany
Ralf Hofestaedt, Univ. of Bielefeld, Germany
Jessie Kennedy, Napier Univ. Edinburgh, UK
Ulf Leser, HU Berlin, Germany
Bertram Ludäscher, San Diego Supercomputer Center, USA
Sergey Melnik, Microsoft Research, USA
Peter Mork, Univ. of Washington, Seattle, USA
Felix Naumann, HU Berlin, Germany
Frank Olken, Lawrence Berkeley National Laboratory, USA
Norman Paton, Univ. of Manchester, UK
Erhard Rahm, Univ. of Leipzig, Germany
Louiqa Raschid, Univ. of Maryland, USA
Kai-Uwe Sattler, TU Ilmenau, Germany
Steffen Schulze-Kremer, FU Berlin, Germany
Robert Stevens, Univ. of Manchester, UK
Sharon Wang, IBM Life Sciences, USA
Limsoon Wong, Institute for Infocomm Research, Singapore

EXHIBITION
Sponsors of the event can exhibit their software and other products.

ORGANIZATION
The workshop is organized by the Bioinformatics centre of the University of Leipzig (www.izbi.de).

------------------------------- Prof. Dr. Erhard Rahm http://dbs.uni-leipzig.de From deepan_3356 at yahoo.co.in Sun Nov 23 03:57:14 2003 From: deepan_3356 at yahoo.co.in (deepan chakravarthy n) Date: Sun, 23 Nov 2003 08:57:14 +0000 (GMT) Subject: [BiO BB] combinational chemistry In-Reply-To: <3FBB96A7.7070504@itc.it> Message-ID: <20031123085714.8274.qmail@web8203.mail.in.yahoo.com> Hello, where can I find good information about combinational chemistry on the web? Please guide me. ===== --------------------------------------------- deepan chakravarthy n 2nd year,(2nd sem), b.tech(biotech), anna university , chennai. ph no: hostel:22354862(044)room no205, home:04287-241199,04287244399, address: ac tech hostel (jh 207), anna university, chennai-25. ________________________________________________________________________ Yahoo! India Mobile: Download the latest polyphonic ringtones.
Go to http://in.mobile.yahoo.com From pvd4s at cms.mail.virginia.edu Tue Nov 25 16:35:52 2003 From: pvd4s at cms.mail.virginia.edu (Peter V. Decker) Date: Tue, 25 Nov 2003 16:35:52 -0500 Subject: [BiO BB] ScanArray 4000 Message-ID: Greetings, If anyone with experience with the Lumonics (now Perkin-Elmer) ScanArray 4000 scanner can assist me, please reply to pvd4s at virginia.edu. Can anyone please tell me what software can be used for data recording and analysis with the ScanArray 4000? I know that for the GenePix system, there is an output text file with the .gpr extension, but I can't seem to figure out what the file output for the ScanArray system is (if there is one). If it does not exist, then how can we quantify signal intensity, subtract background, etc. with just a simple TIFF image? Many thanks, Peter V. Decker From yhuang4 at memphis.edu Wed Nov 26 09:47:39 2003 From: yhuang4 at memphis.edu (yhuang4 at memphis.edu) Date: Wed, 26 Nov 2003 08:47:39 -0600 Subject: [BiO BB] QBLAST Message-ID: <235565239c0d.239c0d235565@memphis.edu> Dear Malcolm, i just saw your post about QBLAST on Sep 2003. I wrote code for batch BLAST through QBLAST at begining of this year (my Master project of bioinformatics). The code was written in JAVA and Perl (Bioperl). The flexibility of the code is: 0.good for single query or batch queries, BLASTn and BLASTp 1. observe the length of the first sequence of the batch queries, and automatically decide the parameters 2.dynamically check the working process 3.output format: HTML XML, Text 4.parse the result (Bioperl)and extraxt info by the user, e.g. top 1, or 2.. hit. If you are still interested in QBLAST, please contact me. 
Yong Huang
Feinstone Center for Genomic Research
University of Memphis
(901) 678 2458
Email: yhuang4 at memphis.edu

From tsucheta at hotmail.com Wed Nov 26 15:35:50 2003
From: tsucheta at hotmail.com (Sucheta Tripathi)
Date: Wed, 26 Nov 2003 20:35:50 +0000
Subject: [BiO BB] SignalP batch submission
Message-ID:

An HTML attachment was scrubbed...
URL:

From Natalio.Krasnogor at nottingham.ac.uk Fri Nov 28 06:10:34 2003
From: Natalio.Krasnogor at nottingham.ac.uk (Natalio Krasnogor)
Date: Fri, 28 Nov 2003 11:10:34 +0000
Subject: [BiO BB] Bioinformatics Survey
Message-ID: <3FC72D2A.275977A5@nottingham.ac.uk>

Dear Colleague,

I am conducting a webioner (i.e. Web based Questionnaire) with the aim of capturing recognized researchers' and educators' views regarding bioinformatics education in academia. The information you provide will help me to devise a robust core bioinformatics curriculum for the School of Computer Science and IT at the University of Nottingham.

More specifically, this poll is aimed at understanding your perceptions of what should be included (and what left out) in core bioinformatics and auxiliary modules, what skills must be developed in both teachers and learners, etc.

I would very much appreciate your input as part of my information gathering efforts. The webioner can be found at:

http://www.cs.nott.ac.uk/~nxk/POLL2/natWebioner.html

Thanks in advance for your time and for sharing your expertise.

Yours,

Dr. N.Krasnogor
Lecturer
School of Computer Science and Information Technology
University of Nottingham
United Kingdom.

--
----------------------------------------------------------------------------------
NATALIO KRASNOGOR, Ph.D.
Automated Scheduling, Planning and Optimisation Group
Lecturer
School of Computer Sciences and Information Technology
Jubilee Campus
University of Nottingham
Tel.: +44 - 0115 - 8467592
Nottingham, NG81BB
United Kingdom

URL: http://www.cs.nott.ac.uk/~nxk/
e-mail: Natalio.Krasnogor-replace all this by at symbol-nottingham.ac.uk
----------------------------------------------------------------------------------
We have recently published a book on Fuzzy Sets, details at
http://www.springer-ny.com/detail.tpl?isbn=354000551X

From boris.steipe at utoronto.ca Sun Nov 30 12:35:35 2003
From: boris.steipe at utoronto.ca (Boris Steipe)
Date: Sun, 30 Nov 2003 12:35:35 -0500
Subject: [BiO BB] Bioinformatics Survey
References: <3FC72D2A.275977A5@nottingham.ac.uk>
Message-ID: <3FCA2A67.AB207C1C@utoronto.ca>

Natalio Krasnogor wrote:
>
> Dear Colleague,
>
> I am conducting a webioner (i.e. Web based Questionnaire) with the aim
> of capturing recognized researchers' and educators' views regarding
> bioinformatics education in academia.
[...]

I would be willing to participate if you would pledge to make the results of the survey available, either publicly or at least to those participating in the survey. I am sure this would be of interest and value to many who are subscribed here.

Best regards,

Boris

---
Prof. Boris Steipe
University of Toronto
Program in Proteomics & Bioinformatics
Departments of Biochemistry & Molecular and Medical Genetics
http://biochemistry.utoronto.ca/steipe