From ykalidas at gmail.com  Fri Sep  1 12:39:58 2006
From: ykalidas at gmail.com (Kalidas Yeturu)
Date: Fri, 1 Sep 2006 22:09:58 +0530
Subject: [Bio BB] Rasmol not working in Power PC architecture
Message-ID: <5632703b0609010939w55296e8cpa4ba7505102797d4@mail.gmail.com>

Hi,
 I have been using RasMol scripts in my project. Until now I have had no
problems running RasMol on i686 Linux machines.
uname -a gives: "Linux threonine 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54 EST
2003 i686 i686 i386 GNU/Linux"
 Now I have to run my programs in an MPI environment on the PowerPC
architecture.
uname on the MPI cluster gives: "Linux cnode39 2.6.5-7.139-pseries64 #1 SMP Fri
Jan 14 15:41:33 UTC 2005 ppc64 ppc64 ppc64 GNU/Linux"

 When I type ./rasmol, the error is: "bash: ./rasmol: cannot execute binary
file"

 I searched the internet for RasMol FAQs and tried various installations
of RasMol for the PowerPC architecture, but could not solve the problem.

 I hope someone can help me solve this problem.

Thanking You
Regards
-- 
Kalidas Y
http://ssl.serc.iisc.ernet.in/~kalidas

From janderson_net at yahoo.com  Fri Sep  1 15:03:20 2006
From: janderson_net at yahoo.com (James Anderson)
Date: Fri, 1 Sep 2006 12:03:20 -0700 (PDT)
Subject: [BiO BB] question about Agilent microarray data format
Message-ID: <20060901190320.74897.qmail@web31210.mail.mud.yahoo.com>

Hi, 
  I have some Agilent microarray data. I am not familiar with the format (I am more familiar with Affy data). There are some columns named "gProcessedSignal", "rProcessedSignal", "LogRatio", etc. I guess it's more like cDNA data with two channels. So should I use the LogRatio value for the next steps of the analysis (gene selection, PCA, clustering, etc.)?

  Thanks,
  James

 		

From christoph.gille at charite.de  Fri Sep  1 16:50:50 2006
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Fri, 1 Sep 2006 22:50:50 +0200 (CEST)
Subject: [Bio BB] Rasmol not working in Power PC architecture
In-Reply-To: <5632703b0609010939w55296e8cpa4ba7505102797d4@mail.gmail.com>
References: <5632703b0609010939w55296e8cpa4ba7505102797d4@mail.gmail.com>
Message-ID: <61253.84.190.71.56.1157143850.squirrel@webmail.charite.de>

I am not really familiar with PPC,
but I know that there is a package "Fink" (http://fink.sourceforge.net/)
which turns a PPC into a UNIX workstation with compilers, X-Windows etc.

To compile RasMol you will need C, X-Windows and Tcl/Tk.

Another idea: Jmol is an excellent RasMol-like program in Java which accepts
nearly all RasMol commands. Perhaps you could use Jmol instead.
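
For reference, bash usually prints "cannot execute binary file" when the
file was built for a different CPU architecture than the machine it is run
on, which would fit an i386 RasMol binary copied to a ppc64 node. A minimal
Python sketch for checking this from the ELF header follows. The names
elf_machine and describe_binary and the small lookup table are illustrative
choices, not part of RasMol or of any tool mentioned in this thread.

```python
import struct

# A few e_machine values from the ELF specification (not an exhaustive list).
EM_NAMES = {3: "Intel 80386 (i386)", 20: "PowerPC", 21: "PowerPC64", 62: "x86-64"}

def elf_machine(header):
    """Return (word size in bits, architecture name) for the first 20 bytes
    of a file, or None if the bytes are not an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    bits = {1: 32, 2: 64}.get(header[4])          # EI_CLASS
    byte_order = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)  # e_machine
    return bits, EM_NAMES.get(machine, "unknown (%d)" % machine)

def describe_binary(path):
    """Read the start of the file at `path` and decode its ELF header."""
    with open(path, "rb") as handle:
        return elf_machine(handle.read(20))
```

Calling describe_binary("./rasmol") on the cluster node would show whether
the binary targets i386 or PowerPC64; the standard `file` command reports
the same information.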


From ykalidas at gmail.com  Sat Sep  2 04:47:39 2006
From: ykalidas at gmail.com (Kalidas Yeturu)
Date: Sat, 2 Sep 2006 14:17:39 +0530
Subject: [Bio BB] Rasmol not working in Power PC architecture
In-Reply-To: <61253.84.190.71.56.1157143850.squirrel@webmail.charite.de>
References: <5632703b0609010939w55296e8cpa4ba7505102797d4@mail.gmail.com>
	<61253.84.190.71.56.1157143850.squirrel@webmail.charite.de>
Message-ID: <5632703b0609020147he1821bagbc77726d67d14f73@mail.gmail.com>

Thank you.
I will try it out.

On 9/2/06, Dr. Christoph Gille <christoph.gille at charite.de> wrote:
>
> I am not really familiar with PPC,
> but I know that there is a package "Fink" (http://fink.sourceforge.net/)
> which turns a PPC into a UNIX workstation with compilers, X-Windows etc.
>
> To compile RasMol you will need C, X-Windows and Tcl/Tk.
>
> Another idea: Jmol is an excellent RasMol-like program in Java which accepts
> nearly all RasMol commands. Perhaps you could use Jmol instead.
>
>
>
> _______________________________________________
> General Forum at Bioinformatics.Org -
> BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>


-- 
Kalidas Y
http://ssl.serc.iisc.ernet.in/~kalidas

From christoph.gille at charite.de  Tue Sep  5 09:40:39 2006
From: christoph.gille at charite.de (Dr. Christoph Gille)
Date: Tue, 5 Sep 2006 15:40:39 +0200 (CEST)
Subject: [BiO BB] Yet another Web 3D alignment viewer
Message-ID: <50647.141.20.6.60.1157463639.squirrel@webmail.charite.de>

Hi,

I have made a free Java Web Start 3D alignment viewer and would
like to ask for your opinions and suggestions.

It has already been tested on Linux, Windows XP (German),
Windows 2000 (German), Windows 98 (German) and Mac OS X 10.4.7.
It still needs to be tested on English MS Windows, Intel Macintosh, Solaris
and Irix.

When the application is started by clicking the jnlp file,
the specified protein files are downloaded and stored on the hard disk.

The browser should invoke .../bin/javaws on the jnlp file.

Computation is performed locally on the user's computer.

Here are a few examples:

http://3d-alignment.eu/pdb/a1.jnlp
This is a pure sequence alignment

http://3d-alignment.eu/pdb/a2.jnlp
This is a  multiple 3D alignment

http://3d-alignment.eu/pdb/a3.jnlp
This is mixed 3D and sequence alignment

http://3d-alignment.eu/pdb/a4.jnlp
This example demonstrates the alternative syntax for PDB chain identifiers.


The following links load a pdb file with a given pdb Id and
search for structurally similar proteins.

http://3d-alignment.eu/pdb/1prn.jnlp
This is a simple case of an  X-ray structure with only one chain.

http://3d-alignment.eu/pdb/1aab.jnlp
This is an NMR structure. Only model 1 is loaded to save time.

http://3d-alignment.eu/pdb/1ryp.jnlp
This structure has 28 PDB chains.
There are 14 different sequences.
Results are shown in a tabbed pane with 14 tabs.

There is also a README describing how the Web link is formed.
The syntax is quite simple.

Is it working smoothly on your computers?


From pascual at cnb.uam.es  Tue Sep  5 04:38:45 2006
From: pascual at cnb.uam.es (Alberto Pascual Montano)
Date: Tue, 5 Sep 2006 10:38:45 +0200
Subject: [BiO BB] question about Agilent microarray data format
References: <20060901190320.74897.qmail@web31210.mail.mud.yahoo.com>
Message-ID: <009001c6d0c6$b336e8a0$7257f496@ANDREA>

Hi James,

You can download the manual for the Agilent Feature Extraction software at:

http://microarray.onc.jhmi.edu/forms/ImageAnalysisManual.pdf#search=%22Agilent%20Feature%20Extraction%20software%20%22

There you will find details of the data format. In summary, "gProcessedSignal" and "rProcessedSignal" are the Cy3 and Cy5 processed signals (the normalization algorithms used are described in the data file), "LogRatio" is the base-10 log ratio, log10(rProcessedSignal/gProcessedSignal), and "PValueLogRatio" is the significance level of the calculated log ratio.
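
As a small illustration of the LogRatio definition above, the following Python sketch computes the base-10 log ratio from the two processed signals and converts it to base 2, which many downstream tools expect. The function names and the example signal values are made up for illustration; only the formula comes from the description above.

```python
import math

def log_ratio(r_signal, g_signal):
    """Base-10 log ratio of the red (Cy5) to the green (Cy3) processed signal."""
    if r_signal <= 0 or g_signal <= 0:
        raise ValueError("signals must be positive to take a log ratio")
    return math.log10(r_signal / g_signal)

def log10_to_log2(value):
    """Convert a base-10 log ratio to a base-2 log ratio."""
    return value / math.log10(2)

ratio = log_ratio(200.0, 100.0)        # a two-fold higher red signal
print(round(ratio, 4))                 # 0.301
print(log10_to_log2(ratio))            # 1.0
```

A two-fold change therefore appears as about 0.301 in the LogRatio column and as 1.0 on the log2 scale.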

Regards,
Alberto



From xiaowei.jiang at msn.com  Tue Sep  5 13:19:04 2006
From: xiaowei.jiang at msn.com (Xiaowei JIANG)
Date: Tue, 5 Sep 2006 19:19:04 +0200
Subject: [BiO BB] Re: question about Agilent microarray data format
In-Reply-To: <20060902160106.DF6533686D3@primary.bioinformatics.org>
Message-ID: <BAY119-DAV8C2F9B21AC98958DB0F12E4300@phx.gbl>

 Dear,

I asked the group the same question weeks ago.
I used the raw Agilent microarray data in the JMP Microarray analysis
platform. I did data preprocessing, normalization, gene selection,
multidimensional scaling, PCA, clustering analysis, and annotation
analysis on the same platform.

The data input engine of JMP Microarray used the columns "gProcessedSignal"
and "rProcessedSignal" to produce the SAS data set; I used a log2
transformation to make the data more normally distributed. Depending on your
experiment and design, you choose a specific normalization method; there
are some books that discuss this situation. So you need specific
normalization software and other analysis platforms to perform data
preprocessing and data analysis.
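
The per-channel log2 transformation mentioned above can be sketched in a few
lines of Python. This is only an illustration under simple assumptions: the
function name log2_channels is hypothetical, the input is a list of
(gProcessedSignal, rProcessedSignal) pairs already read from the file, and
spots with a non-positive signal are returned as None rather than handled by
any particular normalization method.

```python
import math

def log2_channels(spots):
    """Log2-transform (green, red) signal pairs; None marks unusable spots."""
    transformed = []
    for green, red in spots:
        if green > 0 and red > 0:
            transformed.append((math.log2(green), math.log2(red)))
        else:
            transformed.append(None)   # a log is undefined for zero or negative signals
    return transformed

example = [(100.0, 400.0), (0.0, 50.0)]
print(log2_channels(example))
```

The difference between the two transformed channels of a spot is its log2
ratio, so the first example spot corresponds to a four-fold change.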

So for the Agilent data, you don't actually need to care too much
about the data format itself. Instead, you should think more about the
preprocessing and data analysis methods you are going to use; the whole
analysis procedure, including the streamlined analysis software platforms,
should be considered in advance.


Kind regards,
Xiaowei JIANG 


From J.Hane at murdoch.edu.au  Thu Sep  7 02:32:52 2006
From: J.Hane at murdoch.edu.au (James Hane)
Date: Thu, 7 Sep 2006 14:32:52 +0800
Subject: [BiO BB] Metabolomics software
Message-ID: <E5BEA780290ACE4DB8267F4ECFCA23DB0602FBB3@MERCURY.ad.murdoch.edu.au>

Hi, I was wondering what metabolomics software you use and/or recommend?

Many thanks,
James Hane


From varvenne at genoway.com  Thu Sep  7 03:39:42 2006
From: varvenne at genoway.com (Benoit VARVENNE)
Date: Thu, 07 Sep 2006 09:39:42 +0200
Subject: [BiO BB] Restriction sites frequencies in mouse genome
In-Reply-To: <200609061243.09299.harry.mangalam@uci.edu>
Message-ID: <C125995E.6AA%varvenne@genoway.com>

Hello,

Harry,
Thanks for your answer. I'd be very interested in having this code.

At first I only had to calculate frequencies in the mouse genome, but now
things have changed... I'm interested in having the positions of hits and in
calculating their distribution, fragment lengths, ...
The next step will be to link the hits found to the corresponding
features available in the Ensembl databases (site in an existing gene,
centromere, repeat regions, ...).
I think I'm going to use the Ensembl Perl API to do so.

If anyone has other ideas, I'd be very interested in them.

If anyone's interested, I've got an optimized (for memory and
performance) general Perl script for counting the hits of a sequence
(or, in another version, a pattern) in very big sequences (like chromosomes
or a genome).
Let me know if you want it.
For the moment it does not handle a list of entries and does not
store positions.
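
For readers who want to try the idea, here is a minimal Python sketch of
memory-friendly hit counting. It is not the Perl script mentioned above: the
function name count_hits is hypothetical, it only handles exact sites (no
patterns), and it counts the top strand only, which is sufficient for
palindromic sites such as GAATTC. It reads FASTA text line by line and keeps
just the last len(site) - 1 bases, so hits that span a line break are still
counted.

```python
def count_hits(fasta_lines, site):
    """Count (possibly overlapping) occurrences of `site` per FASTA record."""
    site = site.upper()
    counts = {}
    name = None
    tail = ""                               # end of the previous line(s)
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):            # new record: reset the carry-over
            parts = line[1:].split()
            name = parts[0] if parts else "unnamed"
            counts[name] = 0
            tail = ""
            continue
        if name is None:                    # sequence without a header line
            name = "unnamed"
            counts[name] = 0
        chunk = tail + line.upper()
        pos = chunk.find(site)
        while pos != -1:
            counts[name] += 1
            pos = chunk.find(site, pos + 1)
        # The tail is shorter than the site, so no hit is counted twice.
        tail = chunk[-(len(site) - 1):] if len(site) > 1 else ""
    return counts

lines = [">chr1 test", "AAGAAT", "TCGGAA", "TTC", ">chr2", "GAATTCGAATTC"]
print(count_hits(lines, "GAATTC"))          # {'chr1': 2, 'chr2': 2}
```

Because an open file is itself an iterable of lines, the same function can
be called as count_hits(open(path), site) on a chromosome-sized file without
loading it into memory.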


Regards,

Benoit Varvenne,
Bioinformatics person in charge,
Genoway Lyon - France.



From harry.mangalam at uci.edu  Wed Sep  6 15:43:08 2006
From: harry.mangalam at uci.edu (Harry Mangalam)
Date: Wed, 6 Sep 2006 12:43:08 -0700
Subject: [BiO BB] Restriction sites frequencies in mouse genome
In-Reply-To: <C11A0140.59A%varvenne@genoway.com>
References: <C11A0140.59A%varvenne@genoway.com>
Message-ID: <200609061243.09299.harry.mangalam@uci.edu>

If by calculating frequencies you mean finding all the sites in a 
genome, tacg will do this.  It will find all the sites you give it 
(I've tested it on all human chromosome assemblies) as well as the 
predicted frequency based on the base pair distribution.

It can theoretically do the entire genome in one shot if you have 
enough RAM, but I've never tried it and the output would be pretty 
ferocious.
For example, for chromosome 21 (a paltry 33.6 MB), the summary output 
is:

## Sequence: #1; from file: UNAVAILABLE
   Format: FASTA; ID: gi:89161201; Description: Homo sapiens chromosome 21, alternate assembly (based on Celera assembly), whole genome shotgun sequence.

== Sequence info:

    NB: sequence length > A+C+G+T due to -> 224404 <- IUPAC degeneracies.
    # of:  N:224404  Y:0  R:0  W:0  S:0  K:0  M:0  B:0  D:0  H:0  V:0

   #s below are for top strand; 'sites exp' values calculated on the basis of both strands.
   33216610 bases; 9772353 A(29.42 %)  6752472 C(20.33 %)  6753971 G(20.33 %)  9713410 T(29.24 %)

== Enzymes that DO NOT MAP to this sequence:

        There were NO NON-matches - ALL patterns matched at least ONCE.


== Total Number of Hits per Enzyme:
   AatII   1068     BsiEI   1803     EcoRV   4841      PsiI  20384
    AccI  12230   BsiHKAI  23981      FauI  18509     PspGI 112279
   AccII   9733     BsiWI    174    Fnu4HI  74994    PspOMI   6067
  Acc65I   3021      BslI  91011      FokI  59656      PstI  15561
    AciI  52859      BsmI  13955      FseI    235      PvuI    181
    AclI   2047     BsmAI  73662      FspI   1211     PvuII  12841
    AfeI   1406     BsmBI   7619     HaeII   7030      RsaI  56361
   AflII   7226     BsmFI  45828    HaeIII  99508     RsrII    126
  AflIII  18426  Bsp1286I  57995      HgaI   8115      SacI   6829
    AgeI    676     BspEI   1246      HhaI  21013     SacII    893
    AhdI   3149     BspHI  11844    HinP1I  21013      SalI    392
    AluI 143869     BspMI  16591    HincII  13046     SanDI   3409
    AlwI  37296      BsrI  63802   HindIII   9457      SapI   4316
   AlwNI  16140     BsrBI   2994     HinfI  96900    Sau96I  77627
    ApaI   6067     BsrDI  16179      HpaI   4478    Sau3AI  79640
   ApaLI   6042     BsrFI   4609     HpaII  29934      SbfI   1068
    ApoI  74171     BsrGI   9408      HphI  67904      ScaI   5880
    AscI     47    BssHII    890      KasI   2793     ScrFI 137189
    AseI  17631     BssKI 137189      KpnI   3021     SexAI   3472
    AvaI  12916     BssSI   5101     MaeII  28783     SfaNI  42093
   AvaII  31938    BstAPI   9253    MaeIII  83257      SfcI  39408
   AvrII   6112     BstBI   1256     MboII 100007      SfiI    599
    BaeI   2868    Bst4CI  87767      MfeI   6359      SfoI   2793
    BaeI   2868    BstDSI  14918      MluI    334      SgfI     13
   BamHI   4165    BstEII   4065      MlyI  44962     SgrAI    214
    BanI  18704    BstF5I  59661      MnlI 308118      SmaI   4948
   BanII  27893     BstNI 112279      MscI  14579      SmlI  29332
    BbeI   2793     BstUI   9733      MseI 226716     SnaBI   1598
    BbsI  16623     BstXI  19685      MslI  38862      SpeI   4362
    BbvI  63057     BstYI  24349    MspA1I  17762      SphI   6477
   BbvCI  14806   BstZ17I   4605      MwoI  73785      SrfI    302
    BcgI   3733    Bsu36I  10646      NaeI   1898      SspI  28450
    BcgI   3733      BtgI  14918      NarI   2793      StuI   8988
   BciVI   7495      BtrI   3836      NciI  24927      StyI  34781
    BclI   8350     Cac8I  66066      NcoI   8941      SwaI   2801
    BfaI  83296      ClaI   1121      NdeI  10096      TaiI  28783
    BglI   6550     Csp6I  56361    NgoMIV   1898      TaqI  17908
   BglII   8895     CviJI 507227      NheI   2770      TatI  30303
    BlpI   6131     CviRI 168208    NlaIII 161486      TfiI  51945
    BmrI  19063      DdeI 155096     NlaIV  87348      TliI   1496
    BplI  11478      DpnI  79640      NotI    127      TseI  63101
    BpmI  32957      DraI  41466      NruI    209    Tsp45I  47283
  Bpu10I  25858    DraIII   6989      NsiI  11383   Tsp509I 254887
    BsaI  18254      DrdI   3165      NspI  36783     TspRI  98632
   BsaAI   9382      EaeI  20232      PacI   1946   Tth111I   7783
   BsaBI   4988      EagI   1139      PciI  12666      XbaI   9158
   BsaHI   6162      EarI  25525     PflMI  11275      XcmI   9507
   BsaJI 121468      EciI   6774      PleI  44962      XhoI   1496
   BsaWI   3529  Ecl136II   6829      PmeI    539      XmaI   4948
  BseMII 104754    Eco57I  24123      PmlI   4081      XmnI  11146
   BseRI  23673     EcoNI   8774    Ppu10I  11383
   BseSI  25059  EcoO109I  28937     PpuMI  12989
    BsgI  24191     EcoRI   8938     PshAI   3251

To get the actual predicted number of sites, you have to generate the 
Sites info, which would be enormous but easily sed-able to extract 
what you need.
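
As a rough illustration of what a prediction "based on the base pair
distribution" can look like, the Python sketch below multiplies the observed
base frequencies for each position of a site, assuming that bases occur
independently. Whether tacg uses exactly this model is an assumption; the
function names are hypothetical, only plain A/C/G/T sites are supported, and
the N positions are simply ignored. The base counts are the chromosome 21
values from the summary above.

```python
BASE_COUNTS = {"A": 9772353, "C": 6752472, "G": 6753971, "T": 9713410}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(site):
    return "".join(COMPLEMENT[base] for base in reversed(site))

def site_probability(site, base_counts):
    """Probability of the site at one position under independent bases."""
    total = sum(base_counts.values())
    probability = 1.0
    for base in site:
        probability *= base_counts[base] / total
    return probability

def expected_hits(site, base_counts, both_strands=True):
    """Expected number of occurrences of `site` in the sequence."""
    site = site.upper()
    windows = sum(base_counts.values()) - len(site) + 1
    expected = site_probability(site, base_counts) * windows
    other = reverse_complement(site)
    # A palindromic site is the same physical site on both strands,
    # so it is only counted once.
    if both_strands and other != site:
        expected += site_probability(other, base_counts) * windows
    return expected

print(round(expected_hits("GAATTC", BASE_COUNTS)))   # EcoRI
```

For EcoRI this simple model gives a value of roughly ten thousand, compared
with the 8938 hits reported in the table, a reminder that real genomes are
not random sequences.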

This took 9.5 s on a 2 GHz Opteron running 64-bit Linux.

If you want, I'll send you the source tarball in a separate email.

hjm


On Tuesday 29 August 2006 05:35, Benoit VARVENNE wrote:
> Hello everybody,
>
> Thanks to all for your ideas and suggestions. I think I'm going to
> consider Perl programming to calculate restriction site frequencies,
> as the software mentioned in your mails (plus software I found) doesn't
> seem to be usable at whole-genome scale. Programming was to be
> avoided for this study, but it seems to be the only solution. I'm
> really surprised not to be able to find an existing study of this kind.
>
> Thanks again,
> Regards,
>
> Benoît Varvenne,
> Bioinformatics person in charge,
> Genoway Lyon - France.
>
> On 28/08/06 11:34, Benoit VARVENNE <varvenne at genoway.com> wrote:
> > Dear Members,
> >
> > I am a new member of this mailing list and I don't know if such a
> > post will draw the attention of anyone here. So excuse me in
> > advance if my subject is not appropriate.
> > I am searching for a way to calculate restriction site frequencies
> > in the mouse genome (so sequences from 6 to 13 bp). I have already
> > tried to do so using BLAST (or BLAST-like) tools, configuring
> > them as needed, but it gave no results, because of too many
> > hits I think.
> >
> > I would be very grateful if someone could help me on this topic.
> >
> > Thanks a lot for your help,
> > Best regards,
> >
> > Benoît Varvenne,
> > Bioinformatics person in charge,
> > Genoway Lyon - France

-- 
Harry Mangalam - Research Computing at NACS, E2148, Engineering Gateway, 
UC Irvine 92697  949 824 0084(o), 949 285 4487(c) 
harry.mangalam at uci.edu


From harry.mangalam at uci.edu  Thu Sep  7 12:23:39 2006
From: harry.mangalam at uci.edu (Harry Mangalam)
Date: Thu, 7 Sep 2006 09:23:39 -0700
Subject: [BiO BB] Restriction sites frequencies in mouse genome
In-Reply-To: <C125995E.6AA%varvenne@genoway.com>
References: <C125995E.6AA%varvenne@genoway.com>
Message-ID: <200609070923.39999.harry.mangalam@uci.edu>

On Thursday 07 September 2006 00:39, Benoit VARVENNE wrote:
> Hello,
>
> Harry,
> Thanks for your answer. I'd be very interested in having this code.
>
> First i only had to calculate frequencies in mouse genome but now
> things have changed... I'm interested in having positions of hits
> and in calculating distribution, fragment length ...

It can do the above things with no problems besides the size of the output 
(if you ask for all the hits for a 4-cutter in 200 MB, you'll get lots of 
output). tacg can generate output for gnuplotting directly for these 
kinds of distribution plots, or in a few different table formats 
(see the -G option).
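
To make the positions-to-fragment-lengths step concrete, here is a small
Python sketch. It does not use tacg or reproduce its output formats; the
function names and the toy sequence are illustrative, the molecule is
treated as linear, and cut_offset=1 reflects EcoRI cutting after the first
base of its site (G^AATTC).

```python
from collections import Counter

def site_positions(sequence, site):
    """0-based start positions of all (possibly overlapping) hits."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    pos = sequence.find(site)
    while pos != -1:
        positions.append(pos)
        pos = sequence.find(site, pos + 1)
    return positions

def fragment_lengths(positions, sequence_length, cut_offset=0):
    """Lengths of the fragments of a linear sequence cut at every hit."""
    cuts = sorted(p + cut_offset for p in positions)
    bounds = [0] + cuts + [sequence_length]
    return [end - start for start, end in zip(bounds, bounds[1:])]

seq = "AAGAATTCGGAATTCAA"
hits = site_positions(seq, "GAATTC")
fragments = fragment_lengths(hits, len(seq), cut_offset=1)
print(hits)                 # [2, 9]
print(fragments)            # [3, 7, 7]
print(Counter(fragments))   # length distribution, ready for plotting
```

The Counter gives the fragment length distribution directly, and the
position list is what would later be matched against Ensembl features.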


> The next step will be to make the link between hits found and
> corresponding features available in Ensembl databases (site in an
> existing gene, centromere, repeat regions, ...).
> I think i'm going to use Ensembl Perl API to do so.

Unfortunately, it will not do this directly at present.  Your stated 
approach is probably best.  

The src is on its way.

hjm


> If anyone has got other ideas, i'd be very interested in them.
>
> If anyone's interested, i've got an optimized (program memory and
> performance) general perl script for finding number of hits of a
> sequence (or a pattern version) in very big sequences (like
> chromosomes or genome). Let me know if you want it.
> There is no management of a list of program entries for the moment
> and no management of storing positions, ....
>
>
> Regards,
>
> Benoit Varvenne,
> Bioinformatics pearson in charge,
> Genoway Lyon - France.
>
> Le 6/09/06 21:43, ??Harry Mangalam?? <harry.mangalam at uci.edu> 
a ?crit?:
> > If by calculating frequencies, you want to find all the sites in
> > a genome, tacg will do this.  It will find all the sites you give
> > it (I've tested it on all human chromosome assemblies) as well as
> > the predicted frequency based on the base pair distribution.
> >
> > It can theoretically do the entire genome in one shot if you have
> > enough RAM, but I've never tried it and the output would be
> > pretty ferocious.
> > for example, for chromosome 21 (a paltry 33.6MB), the summary
> > output is:
> >
> > ## Sequence: #1; from file: UNAVAILABLE
> >  Format: FASTA; ID: gi:89161201; Description: Homo sapiens
> > chromosome 21, alternate assembly (based on Celera assembly),
> > whole genome shotgun sequence.
> >
> > == Sequence info:
> >
> >   NB: sequence length > A+C+G+T due to -> 224404 <- IUPAC
> > degeneracies.
> >   # of:  N:224404  Y:0  R:0  W:0  S:0  K:0  M:0  B:0  D:0  H:0 
> > V:0
> >
> >  #s below are for top strand; 'sites exp' values calculated on
> > the basis of both strands.
> >  33216610 bases; 9772353 A(29.42 %)  6752472 C(20.33 %)  6753971
> > G(20.33 %)  9713410 T(29.24 %)
> >
> > == Enzymes that DO NOT MAP to this sequence:
> >
> >       There were NO NON-matches - ALL patterns matched at least
> > ONCE.
> >
> >
> > == Total Number of Hits per Enzyme:
> >      AatII  1068       BsiEI  1803       EcoRV  4841        PsiI
> > 20384
> >       AccI 12230     BsiHKAI 23981        FauI 18509
> > PspGI112279
> >      AccII  9733       BsiWI   174      Fnu4HI 74994      PspOMI
> > 6067
> >    Acc65I   3021       BslI  91011       FokI  59656       PstI  15561
> >      AciI  52859       BsmI  13955       FseI    235       PvuI    181
> >      AclI   2047      BsmAI  73662       FspI   1211      PvuII  12841
> >      AfeI   1406      BsmBI   7619      HaeII   7030       RsaI  56361
> >     AflII   7226      BsmFI  45828     HaeIII  99508      RsrII    126
> >    AflIII  18426   Bsp1286I  57995       HgaI   8115       SacI   6829
> >      AgeI    676      BspEI   1246       HhaI  21013      SacII    893
> >      AhdI   3149      BspHI  11844     HinP1I  21013       SalI    392
> >      AluI 143869      BspMI  16591     HincII  13046      SanDI   3409
> >      AlwI  37296       BsrI  63802    HindIII   9457       SapI   4316
> >     AlwNI  16140      BsrBI   2994      HinfI  96900     Sau96I  77627
> >      ApaI   6067      BsrDI  16179       HpaI   4478     Sau3AI  79640
> >     ApaLI   6042      BsrFI   4609      HpaII  29934       SbfI   1068
> >      ApoI  74171      BsrGI   9408       HphI  67904       ScaI   5880
> >      AscI     47     BssHII    890       KasI   2793      ScrFI 137189
> >      AseI  17631      BssKI 137189       KpnI   3021      SexAI   3472
> >      AvaI  12916      BssSI   5101      MaeII  28783      SfaNI  42093
> >     AvaII  31938     BstAPI   9253     MaeIII  83257       SfcI  39408
> >     AvrII   6112      BstBI   1256      MboII 100007       SfiI    599
> >      BaeI   2868     Bst4CI  87767       MfeI   6359       SfoI   2793
> >      BaeI   2868     BstDSI  14918       MluI    334       SgfI     13
> >     BamHI   4165     BstEII   4065       MlyI  44962      SgrAI    214
> >      BanI  18704     BstF5I  59661       MnlI 308118       SmaI   4948
> >     BanII  27893      BstNI 112279       MscI  14579       SmlI  29332
> >      BbeI   2793      BstUI   9733       MseI 226716      SnaBI   1598
> >      BbsI  16623      BstXI  19685       MslI  38862       SpeI   4362
> >      BbvI  63057      BstYI  24349     MspA1I  17762       SphI   6477
> >     BbvCI  14806    BstZ17I   4605       MwoI  73785       SrfI    302
> >      BcgI   3733     Bsu36I  10646       NaeI   1898       SspI  28450
> >      BcgI   3733       BtgI  14918       NarI   2793       StuI   8988
> >     BciVI   7495       BtrI   3836       NciI  24927       StyI  34781
> >      BclI   8350      Cac8I  66066       NcoI   8941       SwaI   2801
> >      BfaI  83296       ClaI   1121       NdeI  10096       TaiI  28783
> >      BglI   6550      Csp6I  56361     NgoMIV   1898       TaqI  17908
> >     BglII   8895      CviJI 507227       NheI   2770       TatI  30303
> >      BlpI   6131      CviRI 168208     NlaIII 161486       TfiI  51945
> >      BmrI  19063       DdeI 155096      NlaIV  87348       TliI   1496
> >      BplI  11478       DpnI  79640       NotI    127       TseI  63101
> >      BpmI  32957       DraI  41466       NruI    209     Tsp45I  47283
> >    Bpu10I  25858     DraIII   6989       NsiI  11383    Tsp509I 254887
> >      BsaI  18254       DrdI   3165       NspI  36783      TspRI  98632
> >     BsaAI   9382       EaeI  20232       PacI   1946    Tth111I   7783
> >     BsaBI   4988       EagI   1139       PciI  12666       XbaI   9158
> >     BsaHI   6162       EarI  25525      PflMI  11275       XcmI   9507
> >     BsaJI 121468       EciI   6774       PleI  44962       XhoI   1496
> >     BsaWI   3529   Ecl136II   6829       PmeI    539       XmaI   4948
> >    BseMII 104754     Eco57I  24123       PmlI   4081       XmnI  11146
> >     BseRI  23673      EcoNI   8774     Ppu10I  11383
> >     BseSI  25059   EcoO109I  28937      PpuMI  12989
> >      BsgI  24191      EcoRI   8938      PshAI   3251
> >
> > To get the actual predicted number of sites, you have to generate
> > the Sites info which would be enormous but easily sed-able to
> > extract what you needed.
> >
> > This took 9.5s on a 2GHz Opteron running 64bit Linux
> >
> > If you want, I'll send you the source tarball in a separate
> > email.
> >
> > hjm
> >
> > On Tuesday 29 August 2006 05:35, Benoit VARVENNE wrote:
> >> Hello everybody,
> >>
> >> Thanks to all for your ideas and suggestions. I think I'm going
> >> to consider Perl programming to calculate restriction site
> >> frequencies, as the software mentioned in your mails (plus software I
> >> found) doesn't seem to be usable at whole-genome scale.
> >> Programming was to be avoided for this study, but it seems to be
> >> the only solution. I'm really surprised not to be able to find
> >> such a study already done.
> >>
> >> Thanks again,
> >> Regards,
> >>
> >> Benoît Varvenne,
> >> Bioinformatics person in charge,
> >> Genoway Lyon - France.
> >>
> >> On 28/08/06 11:34, "Benoit VARVENNE" <varvenne at genoway.com> wrote:
> >>> Dear Members,
> >>>
> >>> I am a new member of this mailing list and I don't know if such
> >>> a post will draw the attention of anyone here. So excuse me in
> >>> advance if my subject is not appropriate.
> >>> I am searching for a way to calculate restriction site
> >>> frequencies in the mouse genome (so sequences from 6 to 13 bp). I have
> >>> already tried to do so using BLAST (or BLAST-like) tools and
> >>> configuring them as needed, but it gave no results, because of
> >>> too many hits I think.
> >>>
> >>> I would be very grateful if someone could help me on this
> >>> topic.
> >>>
> >>> Thanks a lot for your help,
> >>> Best regards,
> >>>
> >>> Benoît Varvenne,
> >>> Bioinformatics person in charge,
> >>> Genoway Lyon - France
> >>>
> >>> _______________________________________________
> >>> General Forum at Bioinformatics.Org -
> >>> BiO_Bulletin_Board at bioinformatics.org
> >>> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
> >>
> >> _______________________________________________
> >> General Forum at Bioinformatics.Org -
> >> BiO_Bulletin_Board at bioinformatics.org
> >> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
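
For readers following the thread, the counting task being discussed can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program whose output is quoted above: the function names (reverse_complement, count_sites) are made up for this sketch, recognition sites are assumed to contain only A, C, G and T (no degenerate bases), and the sequence is assumed to fit in memory. Palindromic sites are counted once per occurrence; non-palindromic sites are searched on both strands.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_sites(sequence, sites):
    """Count (possibly overlapping) occurrences of each recognition site.

    `sites` maps an enzyme name to its recognition sequence. Both strands
    are searched; a palindromic site is only searched once.
    """
    sequence = sequence.upper()
    counts = Counter()
    for name, site in sites.items():
        # A set removes the duplicate when site == its reverse complement.
        for target in {site, reverse_complement(site)}:
            start = sequence.find(target)
            while start != -1:
                counts[name] += 1
                # Step by one so overlapping occurrences are also found.
                start = sequence.find(target, start + 1)
    return counts


if __name__ == "__main__":
    demo_sites = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "NotI": "GCGGCCGC"}
    print(count_sites("GAATTCGGATCCGAATTC", demo_sites))
```

For a whole genome one would read each chromosome from a FASTA file and sum the per-chromosome counters; sites spanning chromosome boundaries do not exist, so no special handling is needed there.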

-- 
Harry Mangalam - Research Computing at NACS, E2148, Engineering Gateway, 
UC Irvine 92697  949 824 0084(o), 949 285 4487(c) 
harry.mangalam at uci.edu


From keshet1 at umbc.edu  Thu Sep  7 14:37:25 2006
From: keshet1 at umbc.edu (Ben Keshet)
Date: Thu, 7 Sep 2006 14:37:25 -0400
Subject: [BiO BB] How to read Naccess .asa .rsa files?
Message-ID: <001e01c6d2ac$a9ce3830$29ad5582@umbc80a173302c>

Hello,

 
I installed Naccess (accessibility calculations, Simon Hubbard, University
College London) and am trying to use it.  Does anyone know how to read the .asa
and .rsa files that are produced after running the program on a .pdb file? I
read the README file of the program, but could not understand what the
different columns represent.

 
I suspect that the key to understanding them is knowing how to interpret a PDB
file, so if someone knows that, please share it with me.

 
Thanks a lot.

Ben

 

From clement at cs.byu.edu  Tue Sep 12 16:04:49 2006
From: clement at cs.byu.edu (Mark Clement)
Date: Tue, 12 Sep 2006 14:04:49 -0600
Subject: [BiO BB] Biotechnology and Bioinformatics Symposium (BIOT-2006)
Message-ID: <2EFA73C2-7C7F-4F58-A737-936EE7827611@cs.byu.edu>

We invite you to participate in the Biotechnology and Bioinformatics
Symposium (BIOT-2006) on October 20-21 in Provo, Utah. The symposium
will include a keynote discussion on the use of knockout mice in drug
development as well as discussions of the Cancer Biomedical
Informatics Grid (caBIG). Accepted papers will be presented
describing research into pharmacogenomics, cost-effective genotyping,
human genomic sequencing, protein-DNA interactions, hardware for exon
prediction, protein folding, data mining and secondary structure
prediction.

October 20-21, 2006
Provo, Utah
http://www.biotconf.org/index.shtml

----------------
Dr. Mark Clement
Department of Computer Science
Brigham Young University
3370 TMCB
Provo, Utah 84602
(801) 422-7608
clement at cs.byu.edu


From viveksr56 at hotmail.com  Wed Sep 13 03:01:45 2006
From: viveksr56 at hotmail.com (vivekanandan ramanathan)
Date: Wed, 13 Sep 2006 07:01:45 +0000
Subject: [BiO BB] New setup Bioinformatics
In-Reply-To: <20060816045318.M95213@hcl.in>
Message-ID: <BAY123-F11805AFA82F20812831EF1AA280@phx.gbl>

Dear sir

I am interested in setting up a bioinformatics lab in a forestry research
institute. What are the potential areas of application of bioinformatics in
forestry?

With best regards
R.Vivekanandan

>From: "Balamurugan R" <rb at hcl.in>
>Reply-To: "General Forum at Bioinformatics.Org" 
><bio_bulletin_board at bioinformatics.org>
>To: "General Forum at Bioinformatics.Org" 
><bio_bulletin_board at bioinformatics.org>
>Subject: Re: [BiO BB] New setup Bioinformatics
>Date: Wed, 16 Aug 2006 10:32:05 +0530
>
>On Sat, 12 Aug 2006 04:56:36 -0700 (PDT), Rajib Borpuzari wrote
> > Dear Member,
> >
> > I want to set up a new bioinformatics centre in an
> > institute and, through it, create a germplasm database of
> > almost 2000 entries. Could you therefore please answer
> > the following queries?
> >
> > 1.Initial infrastructure requirement.
>Depends on how you want to set up your lab.
>
>HARDWARE:
>a. At least a workstation or a server machine to host your database.
>b. You may require some desktop machines as clients (depending on the
>number of intended users in the lab).
>
>
> > 2.Software to create database.
>If you opt for an all-Linux solution, then you can use the open-source
>databases PostgreSQL (BSD-style license) and MySQL (GPL).
>
> > 3.Total cost in Indian rupees.
>With an all-Linux solution, you will be spending only on your hardware and,
>of course, probably on your internet connectivity.
>
> >
> > Thanking you.
> > With best regards,
> > R.Borpuzari
>
>Best Regards,
>Balamurugan.R
>Bioinformatics Solutions Group
>HCL Infosystems Ltd.
>Pondicherry.


From alex.li at pioneer.com  Wed Sep 13 15:33:05 2006
From: alex.li at pioneer.com (Li, Alex)
Date: Wed, 13 Sep 2006 14:33:05 -0500
Subject: [BiO BB] Seqio and fmtseq
Message-ID: <A67011DC63BEB446A1EC6BEF9919CB48596054@jhms18.phibred.com>

We have fixes that allow James Knight's legendary seqio and fmtseq to
compile and work on Linux.

 
Let me know if anyone is still interested in compiling seqio on Linux or
newer Unix machines.

 
Alex Li

Bioinformatics

515-334-4736

 
Alex.li at pioneer.com

 

From witch_of_agnessi at yahoo.com  Thu Sep 14 12:22:58 2006
From: witch_of_agnessi at yahoo.com (Skull Crossbones)
Date: Thu, 14 Sep 2006 09:22:58 -0700 (PDT)
Subject: [BiO BB] A question on Smith-Waterman algorithm
Message-ID: <20060914162258.89586.qmail@web37901.mail.mud.yahoo.com>

Hello all,

In the SW algorithm, mismatches are given negative scores.
Does this mean I cannot use an identity scoring
matrix (1 for match and 0 for mismatch) for aligning
DNA sequences? Does the term "mismatch" apply to
protein scoring matrices like PAM and BLOSUM?

Thanks in advance
WoA



From sankar.achuth at gmail.com  Thu Sep 14 12:30:31 2006
From: sankar.achuth at gmail.com (Dr. Achuthsankar S. Nair)
Date: Thu, 14 Sep 2006 22:00:31 +0530
Subject: [BiO BB] A question on Smith-Waterman algorithm
In-Reply-To: <20060914162258.89586.qmail@web37901.mail.mud.yahoo.com>
References: <20060914162258.89586.qmail@web37901.mail.mud.yahoo.com>
Message-ID: <2b168b460609140930p5d38a648w8b805a851f5fd707@mail.gmail.com>

When you use PAM or BLOSUM, the simple 1/0 scoring no longer applies;
it is replaced by the substitution scores in the PAM/BLOSUM matrices themselves.
-- 
Dr Achuthsankar S Nair
Hon. Director
Centre for Bioinformatics
University of Kerala, Trivandrum 695581, INDIA
Tel (O) 471-2412759 (R) 471-2542220
www.cbi.keralauniversity.edu
www.achu.keralauniversity.edu
===================================================================
Admissions to MPhil Bioinformatics for Jan 2007 Open - Brochure and
Application forms can be downloaded from www.cbi.keralauniversity.edu

On 9/14/06, Skull Crossbones <witch_of_agnessi at yahoo.com> wrote:
>
> Hello all,
>
> In the SW algorithm, mismatches are given negative scores.
> Does this mean I cannot use an identity scoring
> matrix (1 for match and 0 for mismatch) for aligning
> DNA sequences? Does the term "mismatch" apply to
> protein scoring matrices like PAM and BLOSUM?
>
> Thanks in advance
> WoA
>

From marty.gollery at gmail.com  Thu Sep 14 12:36:52 2006
From: marty.gollery at gmail.com (Martin Gollery)
Date: Thu, 14 Sep 2006 09:36:52 -0700
Subject: [BiO BB] A question on Smith-Waterman algorithm
In-Reply-To: <20060914162258.89586.qmail@web37901.mail.mud.yahoo.com>
References: <20060914162258.89586.qmail@web37901.mail.mud.yahoo.com>
Message-ID: <bdd10c2a0609140936x67875eej7535c592dcb5c944@mail.gmail.com>

Yes, you can use an identity matrix with nucleotide sequences.

Marty

On 9/14/06, Skull Crossbones <witch_of_agnessi at yahoo.com> wrote:
>
> Hello all,
>
> In the SW algorithm, mismatches are given negative scores.
> Does this mean I cannot use an identity scoring
> matrix (1 for match and 0 for mismatch) for aligning
> DNA sequences? Does the term "mismatch" apply to
> protein scoring matrices like PAM and BLOSUM?
>
> Thanks in advance
> WoA
>


-- 
-- 
Martin Gollery
Associate Director
Center For Bioinformatics
University of Nevada at Reno
Dept. of Biochemistry / MS330
775-784-7042
-----------

From jeff at bioinformatics.org  Thu Sep 14 14:54:20 2006
From: jeff at bioinformatics.org (J.W. Bizzaro)
Date: Thu, 14 Sep 2006 14:54:20 -0400
Subject: [BiO BB] Open Source Bioinformatics for Researchers (reminder)
Message-ID: <4509A55C.6050905@bioinformatics.org>

This course is being held on Tuesday and Wednesday of next week.  Some space is still available.

---------------------

                OPEN SOURCE BIOINFORMATICS FOR RESEARCHERS
                         RADISSON HOTEL CAMBRIDGE
                               CAMBRIDGE, MA
                            SEPT. 19 & 20, 2006

                    http://edu.bioinformatics.org/06a

(CS101 Introduction to Bioinformatics Programming: Perl I & R I for Biologists)

September 19th & 20th, 2006
Radisson Hotel Cambridge <http://www.radisson.com/cambridgema>, 777 Memorial Drive, Cambridge, Massachusetts

Poster <http://edu.bioinformatics.org/06a/poster.pdf> (172 KB PDF)


DESCRIPTION

This is a course on the fundamentals of open-source programming, to help biologists understand how and when to use the right computer tools for solving computational biology problems, whether sequence analysis, gene expression, mass spectrometry, or systems biology.

This course is modularized so that researchers can understand two distinct tools: a programming and scripting language such as Perl and a data analysis and visualization language such as R. Armed with knowledge and some hands-on experience with these tools (including add-on modules like BioPerl and Bioconductor), scientists will be able to appreciate and use software better in their organization, and also be able to put research questions in the context of these tools. They will be able to do basic computational tasks themselves and better communicate with their IT group.


PREREQUISITES

There are no prerequisites for this course other than having a need to
learn some of the fundamentals of programming, in case the scientist has
any bioinformatic tasks in their day-to-day work.


COURSE OUTLINE

Day 1:

    Session 1:
    08:00 - 08:30 am: Registration
    08:30 - 10:00 am: Installation, Fundamentals of Perl
    10:00 - 10:30 am: Exercises - simple Perl programs
    10:30 - 10:45 am: BREAK
    10:45 - 11:30 am: Perl loops, file i/o, list operations, conditional statements
    11:30 - 12:00 pm: Exercises - more Perl programs

    Session 2:
    01:00 - 02:00 pm: Perl syntax - regular expressions, hash functions
    02:00 - 03:00 pm: Exercises - manipulate DNA sequence, annotation data
    03:00 - 03:15 pm: BREAK
    03:15 - 03:30 pm: Additional Perl syntax - subroutines
    03:30 - 05:00 pm: Exercises - automate BLAST queries

Day 2:

    09:00 - 12:00 pm: Basics of R
    12:00 - 01:00 pm: LUNCH
    01:00 - 04:30 pm: Bioconductor for microarray analysis 

Breakfast and afternoon tea are provided, but lunch is not.


LOGISTICS

When: September 19th & 20th, 2006
Where: Radisson Hotel Cambridge <http://www.radisson.com/cambridgema>, 777 Memorial Drive, Cambridge, Massachusetts

All attendees are encouraged to bring their own notebook computers, since this will be a hands-on workshop. CDs will be provided to install the necessary software, and lecture notes and exercises will be provided.


CERTIFICATION

This course is certified by the Bioinformatics Organization, Inc. <http://bioinformatics.org/about/>, the largest international affiliation in the field, and it will count as *16* "Continuing Scientific Education" (CSE) credits (one credit per contact hour) within the Organization. Students completing the course will receive a certificate attesting to that.


REGISTRATION

Commercial tuition: $600/person
Academic tuition: $300/person

Registration deadline: September 18th, 2006 or when filled

Available payment methods:

1. Online Registration Form <http://edu.bioinformatics.org/course/view.php?id=2> (account required)
Use this form only if paying by credit card (via secured PayPal).

2. Mail-in Registration Form <http://edu.bioinformatics.org/download/registration_form.pdf> (158 KB PDF)
Use this form for all other methods of payment.

You may also go to the course website <http://edu.bioinformatics.org/course/view.php?id=2> and click on "login as a guest" to view the online course materials.

Please send questions to <edu at bioinformatics.org>.

-- 
J.W. Bizzaro
Bioinformatics Organization, Inc. (Bioinformatics.Org)
E-mail: jeff at bioinformatics.org
Phone:  +1 508 890 8600
--


From boris.steipe at utoronto.ca  Thu Sep 14 13:56:55 2006
From: boris.steipe at utoronto.ca (Boris Steipe)
Date: Thu, 14 Sep 2006 13:56:55 -0400
Subject: [BiO BB] A question on Smith-Waterman algorithm
In-Reply-To: <20060914162258.89586.qmail@web37901.mail.mud.yahoo.com>
References: <20060914162258.89586.qmail@web37901.mail.mud.yahoo.com>
Message-ID: <324FAF37-4F07-4CAF-B87F-1812A28B2442@utoronto.ca>

If you use a matrix that gives a positive expectation value for a  
random match, a >>local<< alignment algorithm like SW will simply  
extend the alignment into random noise, since the mismatches it  
encounters do not reduce the score.

Remember that a scoring matrix is only a tool to represent a model of  
how similarity came about. The 1/0 matrix implicitly states that  
there is information if you observe matches and no information if you  
observe mismatches. This is not a model of evolution however, since  
evolution implies that mismatches are less likely and thus should be  
penalized if two sequences are related.
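
The "positive expectation value" point can be checked with a little arithmetic. The sketch below is only an illustration: the function name expected_score is made up here, and it assumes unrelated DNA sequences with independent, uniform base frequencies (0.25 each) unless other frequencies are passed in. It computes the average score of aligning two random bases under a simple match/mismatch scheme.

```python
def expected_score(match, mismatch, freqs=None):
    """Expected score of one aligned pair of independent random bases.

    `freqs` maps each base to its background frequency; uniform DNA
    frequencies are assumed when it is omitted.
    """
    if freqs is None:
        freqs = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    total = 0.0
    for base_a, p_a in freqs.items():
        for base_b, p_b in freqs.items():
            score = match if base_a == base_b else mismatch
            total += p_a * p_b * score
    return total


if __name__ == "__main__":
    # 1/0 identity matrix: every extra random column gains score on average.
    print(expected_score(1, 0))    # 0.25
    # +1/-1 matrix: random columns lose score on average.
    print(expected_score(1, -1))   # -0.5
```

With the 1/0 matrix the expectation is positive, so a local alignment keeps growing through unrelated sequence; with +1/-1 it is negative, so random extension is penalized on average.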

HTH.


Boris


On 14-Sep-06, at 12:22 PM, Skull Crossbones wrote:

> Hello all,
>
> In the SW algorithm, mismatches are given negative scores.
> Does this mean I cannot use an identity scoring
> matrix (1 for match and 0 for mismatch) for aligning
> DNA sequences? Does the term "mismatch" apply to
> protein scoring matrices like PAM and BLOSUM?
>
> Thanks in advance
> WoA


From aloraine at gmail.com  Sun Sep 17 03:19:38 2006
From: aloraine at gmail.com (Ann Loraine)
Date: Sun, 17 Sep 2006 02:19:38 -0500
Subject: [BiO BB] command-line (scriptable) ORF finders?
Message-ID: <83722dde0609170019n17c690f4xde230b88626d76d9@mail.gmail.com>

Hello all,

I'm hoping someone on the list who is involved with EST or full-length
cDNA sequencing projects can help me with something (well..two
things):

(1) I am looking for a command-line, scriptable tool that can take as
input an EST, cDNA, or assembled EST contig ("unigene") sequence and
return the most likely or longest open reading frame. This is for a
plant EST project.  It should also pay attention to codon usage rules.

(2) I am also looking for a tool that can take as input a set of exon
annotations (or mRNA-to-genome alignments) and return the most likely
CDS start and end for the given gene structure. Tools that can jigger
the alignment/exon boundaries to optimize the ORF *and* which pay
attention to codon usage rules would be extra great. This is for
deducing novel gene structures from cross-species mRNA-to-genome
alignments. Maybe there is a gene-finder that does this?

I've found a variety of web sites that claim to do this, but, as you
know, Web sites don't really cut it when you are working with
thousands of sequences. And also, I would like to see the code in case
I run into problems.

Any thoughts or suggestions (other than pointers to Web tools, please)
would be greatly appreciated!

Sincerely,

Ann Loraine

-- 
Ann Loraine
Assistant Professor
Section on Statistical Genetics
University of Alabama at Birmingham
http://www.ssg.uab.edu
http://www.transvar.org


From pmr at ebi.ac.uk  Fri Sep 15 05:09:58 2006
From: pmr at ebi.ac.uk (pmr at ebi.ac.uk)
Date: Fri, 15 Sep 2006 10:09:58 +0100 (BST)
Subject: [BiO BB] A question on Smith-Waterman algorithm
In-Reply-To: <20060914162258.89586.qmail@web37901.mail.mud.yahoo.com>
References: <20060914162258.89586.qmail@web37901.mail.mud.yahoo.com>
Message-ID: <1230.86.133.36.221.1158311398.squirrel@webmail.ebi.ac.uk>

Dear WoA

> In the SW algorithm, mismatches are given negative scores.
> Does this mean I cannot use an identity scoring
> matrix (1 for match and 0 for mismatch) for aligning
> DNA sequences? Does the term "mismatch" apply to
> protein scoring matrices like PAM and BLOSUM?

No, you cannot use a matrix with only 1 and 0. Well, you can - but it will
not work.

This is because of the way the Smith-Waterman algorithm works. It
calculates scores for all pairwise matches, allows for gap penalties,
resets any score that would become negative to zero, finds the highest
score anywhere in the matrix and works back until the score reaches zero.

It is the "would become negative" that catches you. With no negative scores
in the matrix you will effectively get a global (Needleman-Wunsch) alignment
instead, starting at one terminating edge of the matrix (because scores will
never go down) and ending at one of the starting edges.

For nucleotides, mismatches are simply mismatches, usually all with the
same score (though you can adjust for G:U base pairing in RNA) - there is
not the same concept of partial matches that you have with protein matrices.

So, pick a reasonable identity score (it doesn't have to be 1, you can try
10 to avoid a +1 and -1 matrix) and something negative for everything
else.
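
To make the effect concrete, here is a minimal Smith-Waterman sketch in Python. It is an illustration under stated assumptions, not a reference implementation: it uses a single linear gap penalty rather than affine gaps, the function name smith_waterman and the default scores (+1, -1, gap -2) are chosen only for this example, and ties are broken by taking the first best-scoring cell.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for a local alignment.

    Simple match/mismatch scoring with a linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Scores are floored at zero: this is what makes it "local".
            H[i][j] = max(0, H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    x, y = "ACGTACGGT", "ACGTACTTT"
    print(smith_waterman(x, y))               # stops after the shared core
    print(smith_waterman(x, y, mismatch=0))   # runs on through the mismatches
```

In this toy case the +1/-1 scheme reports only the shared ACGTAC core, while the 1/0 scheme extends across the two mismatched columns to pick up one more match, because crossing mismatches costs nothing.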

Hope that helps,

Peter Rice


From landman at scalableinformatics.com  Sun Sep 17 13:39:10 2006
From: landman at scalableinformatics.com (Joe Landman)
Date: Sun, 17 Sep 2006 13:39:10 -0400
Subject: [BiO BB] command-line (scriptable) ORF finders?
In-Reply-To: <83722dde0609170019n17c690f4xde230b88626d76d9@mail.gmail.com>
References: <83722dde0609170019n17c690f4xde230b88626d76d9@mail.gmail.com>
Message-ID: <450D883E.7010806@scalableinformatics.com>

Hi Ann:

Ann Loraine wrote:
> Hello all,
> 
> I'm hoping someone on the list who is involved with EST or full-length
> cDNA sequencing projects can help me with something (well..two
> things):
> 
> (1) I am looking for a command-line, scriptable tool that can take as
> input an EST, cDNA, or assembled EST contig ("unigene") sequence and
> return the most likely or longest open reading frame. This is for a
> plant EST project.  It should also pay attention to codon usage rules.

Would getorf from EMBOSS help?
http://emboss.sourceforge.net/apps/cvs/getorf.html
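
For anyone who wants to see what the "longest ORF" part of the task involves before reaching for a packaged tool, here is a minimal Python sketch. It is not getorf and makes no claim about how getorf works: the function name longest_orf is made up, an ORF is assumed to run from an ATG to the first in-frame stop codon, partial ORFs without a stop (common in ESTs) are ignored, and codon usage is not considered at all.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (strand, start, end, orf) for the longest ATG..stop ORF.

    All six reading frames are scanned. Coordinates are 0-based,
    end-exclusive, and for strand "-" they refer to the reverse complement.
    An empty ORF string means no complete ORF was found.
    """
    seq = seq.upper()
    best = ("", 0, 0, "")
    strands = (("+", seq), ("-", seq.translate(COMPLEMENT)[::-1]))
    for strand, s in strands:
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start > len(best[3]):
                        best = (strand, start, i + 3, s[start:i + 3])
                    start = None                   # look for the next ORF
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGAAATTTGGGTAACC"))
```

A real EST pipeline would also need to handle ambiguous bases, truncated ORFs and frame shifts, which is where the dedicated tools mentioned in this thread come in.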

Joe

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615


From sariego9 at yahoo.com  Sun Sep 17 14:06:58 2006
From: sariego9 at yahoo.com (Diego Martinez)
Date: Sun, 17 Sep 2006 11:06:58 -0700 (PDT)
Subject: [BiO BB] command-line (scriptable) ORF finders?
In-Reply-To: <83722dde0609170019n17c690f4xde230b88626d76d9@mail.gmail.com>
Message-ID: <20060917180658.52071.qmail@web32506.mail.mud.yahoo.com>

Hello, 

There is also the SEALS package from Koonin's group at NCBI;
we use it a lot. It has a bunch of command-line tools, and I believe it
is all in Perl, so you can gut it and reuse the parts you need.

http://www.ncbi.nlm.nih.gov/CBBresearch/Walker/SEALS/

If you are looking at ESTs, you may also want to look at ESTScan,

http://www.ch.embnet.org/software/ESTScan2.html

or there is a GeneWise-like EST gene-modeling tool in the Wise2
package by Birney and Durbin that you may want to look at.

Diego
 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
             .=$=.   .=$=.           .=$=.   .=$=.           .=$=.   .=$=.
   @       @ | | | @ | | | @       @ | | | @ | | | @       @ | | | @ | | |
   | @   @ | | | @   @ | | | @   @ | | | @   @ | | | @   @ | | | @   @ | |
   | | @ | | | @       @ | | | @ | | | @       @ | | | @ | | | @       @ |
    ~'   `~$~'           `~$~'   `~$~'           `~$~'   `~$~'           `~



From aloraine at gmail.com  Sun Sep 17 17:39:45 2006
From: aloraine at gmail.com (Ann Loraine)
Date: Sun, 17 Sep 2006 16:39:45 -0500
Subject: [BiO BB] command-line (scriptable) ORF finders?
In-Reply-To: <20060917180658.52071.qmail@web32506.mail.mud.yahoo.com>
References: <83722dde0609170019n17c690f4xde230b88626d76d9@mail.gmail.com>
	<20060917180658.52071.qmail@web32506.mail.mud.yahoo.com>
Message-ID: <83722dde0609171439j1575a722m280fbcb3b4428dbd@mail.gmail.com>

Thanks!

-Ann

On 9/17/06, Diego Martinez <sariego9 at yahoo.com> wrote:
> Hello,
>
> There is also the SEALS package from Koonin's group at NCBI;
> we use it a lot. It has a bunch of command-line tools, and I believe it
> is all in Perl, so you can gut it and reuse the parts you need.
>
> http://www.ncbi.nlm.nih.gov/CBBresearch/Walker/SEALS/
>
> If you are looking at ESTs, you may also want to look at ESTScan,
>
> http://www.ch.embnet.org/software/ESTScan2.html
>
> or there is a GeneWise-like EST gene-modeling tool in the Wise2
> package by Birney and Durbin that you may want to look at.
>
> Diego


-- 
Ann Loraine
Assistant Professor
Section on Statistical Genetics
University of Alabama at Birmingham
http://www.ssg.uab.edu
http://www.transvar.org


From stefan.rensing at biologie.uni-freiburg.de  Mon Sep 18 02:44:55 2006
From: stefan.rensing at biologie.uni-freiburg.de (Stefan Rensing)
Date: Mon, 18 Sep 2006 08:44:55 +0200
Subject: [BiO BB] command-line (scriptable) ORF finders?
In-Reply-To: <83722dde0609170019n17c690f4xde230b88626d76d9@mail.gmail.com>
References: <83722dde0609170019n17c690f4xde230b88626d76d9@mail.gmail.com>
Message-ID: <450E4067.9030305@biologie.uni-freiburg.de>

Hi,

> (1) I am looking for a command-line, scriptable tool that can take as
> input an EST, cDNA, or assembled EST contig ("unigene") sequence and
> return the most likely or longest open reading frame. This is for a
> plant EST project.  It should also pay attention to codon usage rules.

We found FrameD to be superior to ESTScan and Estwise in predicting ORFs
in moss (P. patens). We are using species-specific (i)HMMs, respectively.
http://bioinfo.genopole-toulouse.prd.fr/apps/FrameD/FD
http://bioinfo.genopole-toulouse.prd.fr/apps/FrameD/Help/FrameDWeb_5.html

> (2) I am also looking for a tool that can take as input a set of exon
> annotations (or mRNA-to-genome alignments) and return the most likely
> CDS start and end for the given gene structure. Tools that can jigger
> the alignment/exon boundaries to optimize the ORF *and* which pay
> attention to codon usage rules would be extra great. This is for
> deducing novel gene structures from cross-species mRNA-to-genome
> alignments. Maybe there is a gene-finder that does this?

You might want to have a look at GenomeThreader,
http://www.genomethreader.org/, which allows spliced alignments using
non-identical mRNA/protein sequences (i.e., homologs from other species).

Cheers, Stefan


-- 
Dr. Stefan Rensing, Group Leader Computational Biology
Plant Biotechnology, Faculty of Biology, University of Freiburg
Schaenzlestr. 1, D-79104 Freiburg, Fon: +49 761 203-6974, Fax: -6945
http://www.plant-biotech.net/  http://www.cosmoss.org/
stefan.rensing at biologie.uni-freiburg.de

"There is science, logic, reason;
 there is thought verified by experience.
 And then there is California."
					  Edward Abbey


From aloraine at gmail.com  Mon Sep 18 11:05:18 2006
From: aloraine at gmail.com (Ann Loraine)
Date: Mon, 18 Sep 2006 10:05:18 -0500
Subject: [BiO BB] command-line (scriptable) ORF finders?
In-Reply-To: <450E4067.9030305@biologie.uni-freiburg.de>
References: <83722dde0609170019n17c690f4xde230b88626d76d9@mail.gmail.com>
	<450E4067.9030305@biologie.uni-freiburg.de>
Message-ID: <83722dde0609180805v7294cce9ifbc90f65950539c8@mail.gmail.com>

Thank you very much for the pointers...it was very helpful.

Sincerely,

Ann

On 9/18/06, Stefan Rensing <stefan.rensing at biologie.uni-freiburg.de> wrote:
> Hi,
>
> > (1) I am looking for a command-line, scriptable tool that can take as
> > input an EST, cDNA, or assembled EST contig ("unigene") sequence and
> > return the most likely or longest open reading frame. This is for a
> > plant EST project.  It should also pay attention to codon usage rules.
>
> We found FrameD to be superior to ESTScan and Estwise in predicting ORFs
> in moss (P. patens). We are using species-specific (i)HMMs, respectively.
> http://bioinfo.genopole-toulouse.prd.fr/apps/FrameD/FD
> http://bioinfo.genopole-toulouse.prd.fr/apps/FrameD/Help/FrameDWeb_5.html
>
> > (2) I am also looking for a tool that can take as input a set of exon
> > annotations (or mRNA-to-genome alignments) and return the most likely
> > CDS start and end for the given gene structure. Tools that can jigger
> > the alignment/exon boundaries to optimize the ORF *and* which pay
> > attention to codon usage rules would be extra great. This is for
> > deducing novel gene structures from cross-species mRNA-to-genome
> > alignments. Maybe there is a gene-finder that does this?
>
> You might want to have a look at GenomeThreader,
> http://www.genomethreader.org/, which allows spliced alignments using
> non-identical mRNA/protein sequences (i.e., homologs from other species).
>
> Cheers, Stefan
>
>
> --
> Dr. Stefan Rensing, Group Leader Computational Biology
> Plant Biotechnology, Faculty of Biology, University of Freiburg
> Schaenzlestr. 1, D-79104 Freiburg, Fon: +49 761 203-6974, Fax: -6945
> http://www.plant-biotech.net/  http://www.cosmoss.org/
> stefan.rensing at biologie.uni-freiburg.de
>
> "There is science, logic, reason;
>  there is thought verified by experience.
>  And then there is California."
>                                           Edward Abbey
> _______________________________________________
> General Forum at Bioinformatics.Org - BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>


-- 
Ann Loraine
Assistant Professor
Section on Statistical Genetics
University of Alabama at Birmingham
http://www.ssg.uab.edu
http://www.transvar.org


From divyaps at ncbs.res.in  Mon Sep 18 12:49:48 2006
From: divyaps at ncbs.res.in (divyaps at ncbs.res.in)
Date: Mon, 18 Sep 2006 22:19:48 +0530 (IST)
Subject: [BiO BB] ncbi entry retrieval
Message-ID: <32933.192.168.1.1.1158598188.squirrel@192.168.1.1>

dear all,
     I was doing a PSI-BLAST search with organism-specific peptide
sequences downloaded from Ensembl. Now I have the BLAST output
sequences with Ensembl IDs. I need to retrieve the corresponding NCBI
entries for these PSI-BLAST hits. Is there any way to do this, such as
software, a server, or a Perl script? Any suggestion or solution will be highly
appreciated.

thanks in advance

divya p syamala
NCBS


From basu at pharm.sunysb.edu  Mon Sep 18 16:35:50 2006
From: basu at pharm.sunysb.edu (Siddhartha Basu)
Date: Mon, 18 Sep 2006 16:35:50 -0400
Subject: [BiO BB] ncbi entry retrieval
In-Reply-To: <32933.192.168.1.1.1158598188.squirrel@192.168.1.1>
References: <32933.192.168.1.1.1158598188.squirrel@192.168.1.1>
Message-ID: <450F0326.20403@pharm.sunysb.edu>

divyaps at ncbs.res.in wrote:
> dear all,
>      I was doing a PSI-BLAST search with organism-specific peptide
> sequences downloaded from Ensembl. Now I have the BLAST output
> sequences with Ensembl IDs. I need to retrieve the corresponding NCBI
> entries for these PSI-BLAST hits. Is there any way to do this, such as
> software, a server, or a Perl script? Any suggestion or solution will be highly
> appreciated.
> 
> thanks in advance
> 
> divya p syamala
> NCBS
> 

Hi,
Presuming that you are looking to convert your Ensembl IDs to Entrez 
IDs, BioMart (http://www.ensembl.org/Multi/martview) should be a good 
option. On the first screen, choose your organism; on the second, load 
your Ensembl IDs (in the ID list limit); and on the third, select 
"EntrezGene ID" in the "External References" section. Lastly, select the 
output format you prefer and click export. Hopefully, that will do the 
conversion for you.
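
For a scripted version of the same conversion, BioMart also accepts XML
queries. The Python 3 sketch below only builds such a query; the endpoint
URL, the dataset name (hsapiens_gene_ensembl), and the filter and attribute
names (ensembl_peptide_id, entrezgene_id) are assumptions that should be
checked against the current BioMart configuration for your organism.

```python
import urllib.parse
import urllib.request
from xml.sax.saxutils import quoteattr

# Assumed endpoint; verify against the current Ensembl BioMart documentation.
MART_URL = "http://www.ensembl.org/biomart/martservice"


def biomart_query(ids,
                  dataset="hsapiens_gene_ensembl",      # assumed dataset name
                  id_filter="ensembl_peptide_id",       # assumed filter name
                  attributes=("ensembl_peptide_id", "entrezgene_id")):
    """Build a BioMart XML query mapping Ensembl IDs to the given attributes."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<!DOCTYPE Query>',
        '<Query virtualSchemaName="default" formatter="TSV" header="0" '
        'uniqueRows="1">',
        '<Dataset name=%s interface="default">' % quoteattr(dataset),
        '<Filter name=%s value=%s/>' % (quoteattr(id_filter),
                                        quoteattr(",".join(ids))),
    ]
    lines += ['<Attribute name=%s/>' % quoteattr(a) for a in attributes]
    lines += ['</Dataset>', '</Query>']
    return "\n".join(lines)


def run_query(xml_query):
    """POST the query and return the tab-separated response (needs network)."""
    data = urllib.parse.urlencode({"query": xml_query}).encode("ascii")
    with urllib.request.urlopen(MART_URL, data=data) as response:
        return response.read().decode("utf-8")
```

Keeping query construction separate from the network call makes it easy to
print and inspect the XML before sending anything.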

-siddhartha


> 
> 
> 
> _______________________________________________
> General Forum at Bioinformatics.Org - BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board


From ethan.strauss at promega.com  Tue Sep 19 12:04:43 2006
From: ethan.strauss at promega.com (Ethan Strauss)
Date: Tue, 19 Sep 2006 11:04:43 -0500
Subject: [BiO BB] NCBI web service
In-Reply-To: <450F0326.20403@pharm.sunysb.edu>
Message-ID: <D8D8119118899D4A8EB5AD9BD24C1932034DCA49@MADMSG003.promega.com>

Hi, 
	I have been using the NCBI eutils web service from a C#
application to automatically retrieve additional information (species,
links to the Gene db etc) about BLAST hits from the gi numbers. This
works when it works, but I have been having a lot of trouble with the
web service. It is slow and frequently I get Web service errors (The
underlying connection was closed: An unexpected error occurred on a
send) which seem to be related to timeout and proxies and keepalive and
other stuff I don't really understand. 
	Anyway, Is there another web service that I might use to get the
same sorts of information from gi numbers or accession numbers? I need
to get species and associations with the gene database. If you know how
to get the NCBI service to work better for me, that would be good too. I
would like to get the full description line, but could live without it.
It is important for this application that I can get the info from a web
service. 
Thanks!
Ethan


From ethan.strauss at promega.com  Tue Sep 19 16:02:42 2006
From: ethan.strauss at promega.com (Ethan Strauss)
Date: Tue, 19 Sep 2006 15:02:42 -0500
Subject: [BiO BB] NCBI web service
In-Reply-To: <D8D8119118899D4A8EB5AD9BD24C1932034DCA49@MADMSG003.promega.com>
Message-ID: <D8D8119118899D4A8EB5AD9BD24C1932034DCA4D@MADMSG003.promega.com>

Hello everyone, 
	I have figured out part of my problem, but not how to fix it...
	What is happening is that I am running blast and just grabbing
gi numbers from the XML blast results. When I send these gi numbers to
NCBI, it sends me back all the data associated with each gi number. This
data includes the sequence. It turns out that a few of the gi numbers in
my test set point to complete chromosomes! I think that NCBI is
returning the information completely, but that the connection can't
support the many millions of characters being passed. 
	What I would like to do is somehow query just for the size of
the sequences being returned and not ask for info on sequences which
are too large. I have not dug through NCBI's documentation in great
depth yet, but a quick look turns up nothing. If anyone knows how to do
this already, I would appreciate it. 
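
One possible approach, sketched in Python rather than C#: the E-utilities
ESummary call returns a short document summary per record instead of the
full sequence, and for sequence databases that summary includes a "Length"
item. The sketch below assumes the version 1 ESummary XML layout
(DocSum/Id/Item) and the "nucleotide" database name; the function names and
the size threshold are hypothetical.

```python
import urllib.parse
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def esummary_url(ids, db="nucleotide"):
    """Build an ESummary URL for a batch of record identifiers."""
    query = urllib.parse.urlencode(
        {"db": db, "id": ",".join(str(i) for i in ids)})
    return EUTILS + "?" + query


def parse_lengths(xml_text):
    """Map each record Id to its sequence length from ESummary XML."""
    lengths = {}
    for doc in ET.fromstring(xml_text).iter("DocSum"):
        uid = doc.findtext("Id")
        for item in doc.findall("Item"):
            if item.get("Name") == "Length" and item.text:
                lengths[uid] = int(item.text)
    return lengths


def fetchable(lengths, max_len=1_000_000):
    """Return the Ids whose sequences are short enough to fetch in full."""
    return sorted(uid for uid, n in lengths.items() if n <= max_len)
```

Only the Ids returned by fetchable() would then be passed on to the full
retrieval step, so chromosome-sized records never travel over the
connection.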
Thanks!
Ethan



From mkgovindis at yahoo.com  Tue Sep 19 23:21:38 2006
From: mkgovindis at yahoo.com (govind mk)
Date: Tue, 19 Sep 2006 20:21:38 -0700 (PDT)
Subject: [BiO BB] Yet another Web 3D alignment viewer
In-Reply-To: <50647.141.20.6.60.1157463639.squirrel@webmail.charite.de>
Message-ID: <20060920032138.60911.qmail@web34410.mail.mud.yahoo.com>

Hi 
   
  Is anyone aware of any database that keeps track of the sequence revision history of NCBI accessions?
   
  If such a database is available, is the data available in a downloadable format, or can it be accessed by a program?
   
  I have had a look at NCBI. The NCBI Sequence Revision History db is not available for download.
   
   
  Thank you
   
  Regards,
  Govind

 		

From mkgovindis at yahoo.com  Tue Sep 19 23:21:55 2006
From: mkgovindis at yahoo.com (govind mk)
Date: Tue, 19 Sep 2006 20:21:55 -0700 (PDT)
Subject: [BiO BB] Re: NCBI Sequence revision history data
In-Reply-To: <50647.141.20.6.60.1157463639.squirrel@webmail.charite.de>
Message-ID: <20060920032155.89467.qmail@web34404.mail.mud.yahoo.com>

Hi 
   
  Is anyone aware of any database that keeps track of the sequence revision history of NCBI accessions?
   
  If such a database is available, is the data available in a downloadable format, or can it be accessed by a program?
   
  I have had a look at NCBI. The NCBI Sequence Revision History db is not available for download.
   
   
  Thank you
   
  Regards,
  Govind

 		

From floris at crs4.it  Wed Sep 20 02:14:27 2006
From: floris at crs4.it (Matteo Floris)
Date: Wed, 20 Sep 2006 08:14:27 +0200
Subject: [BiO BB] Re: ncbi entry retrieval (divyaps@ncbs.res.in) 
Message-ID: <2AF9E22C-7C55-4E45-B280-318461DE55B6@crs4.it>

Hi,

you can use BioMart for that.
See http://www.ensembl.org/biomart/martview

you can submit a list of ensembl IDs, then export their ncbi IDs.
It is very easy.

Regards,

Matteo Floris


From basu at pharm.sunysb.edu  Wed Sep 20 11:07:15 2006
From: basu at pharm.sunysb.edu (Siddhartha Basu)
Date: Wed, 20 Sep 2006 11:07:15 -0400
Subject: [BiO BB] Re: NCBI Sequence revision history data
In-Reply-To: <20060920032155.89467.qmail@web34404.mail.mud.yahoo.com>
References: <20060920032155.89467.qmail@web34404.mail.mud.yahoo.com>
Message-ID: <45115923.3030500@pharm.sunysb.edu>

govind mk wrote:
> Hi
>  
> Is anyone aware of any database that keeps track of the sequence 
> revision history of NCBI accessions?
>  
> If such a database is available, is the data available in a 
> downloadable format, or can it be accessed by a program?
Hi,
If you are familiar with BioPerl and have it installed, Bio::DB::SeqVersion is 
the module that can access the sequence revision history of NCBI.

-siddhartha


>  
> I have had a look at NCBI. The NCBI Sequence Revision History db is 
> not available for download.
>  
>  
> Thank you
>  
> Regards,
> Govind
> 
> _______________________________________________
> General Forum at Bioinformatics.Org - BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board


From janderson_net at yahoo.com  Thu Sep 21 12:34:36 2006
From: janderson_net at yahoo.com (James Anderson)
Date: Thu, 21 Sep 2006 09:34:36 -0700 (PDT)
Subject: [BiO BB] question on low level processing of Liquid chromatography
	(LC)
Message-ID: <20060921163436.25862.qmail@web31202.mail.mud.yahoo.com>

Hi,
  I am new to LC and have a question about its low-level processing. I am quite familiar with the low-level processing of SELDI/MALDI, which has the following steps:
1. Smoothing, 2. Baseline removal, 3. Normalization, 4. Peak detection, 5. Peak alignment. 

Does low-level processing of LC have the same steps, especially baseline removal (or background subtraction) and normalization? For SELDI/MALDI, baseline removal removes the artifact caused by the energy-absorbing matrix, but for LC, do I need to subtract the baseline as well? If so, what is the physical reason behind this? In addition, the normalization of SELDI/MALDI uses TIC. What should I do for the normalization of LC? 

Another question: is every point in LC the sum of every point of the mass spectrum at the same retention time? 
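
To make the quantity in the last question concrete, here is a minimal
Python sketch assuming LC-MS data in which every retention time carries a
full mass spectrum. Under that assumption, the total ion chromatogram (TIC)
is the per-scan sum of intensities over all m/z values, and TIC
normalization divides each scan by that sum. Whether a given instrument
trace is a TIC, a base-peak trace, or a separate detector signal depends on
the setup. The data layout and function names are hypothetical.

```python
def total_ion_chromatogram(scans):
    """Sum intensities over all m/z values in each scan.

    scans: list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, total_intensity).
    """
    return [(rt, sum(i for _mz, i in peaks)) for rt, peaks in scans]


def tic_normalize(scans):
    """Scale each scan so that its intensities sum to 1 (empty scans stay 0)."""
    normalized = []
    for rt, peaks in scans:
        total = sum(i for _mz, i in peaks)
        scaled = [(mz, i / total if total else 0.0) for mz, i in peaks]
        normalized.append((rt, scaled))
    return normalized
```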

Thanks,
James

 		

From hulk.norris at googlemail.com  Thu Sep 21 10:57:24 2006
From: hulk.norris at googlemail.com (Hulk Norris)
Date: Thu, 21 Sep 2006 15:57:24 +0100
Subject: [BiO BB] Extreme 3` EST Assembly
Message-ID: <98f499160609210757v55875681n9b21a2c1eb2cfdd9@mail.gmail.com>

Hi,

I have started work on the clustering and assembly of 3` sequenced ESTs.
Because of the nature of the sequencing process we can be certain that each
EST represents the extreme 3` end of an expressed transcript.

In order to allow for incorrect base calling and determine which transcripts
are detected with greater frequency I wish to cluster and assemble these
ESTs to form consensus sequences and generate contigs.

My problem is that clustering and assembly software does not take into
account the fact that all the ESTs under investigation are confirmed extreme
3` and will assimilate genuine terminal 3` sequence into upstream positions
of longer transcripts in cases of alternative polyadenylation of a single
gene.

Does anyone have experience of similar problems or approaches?  Any help or
direction would be sincerely appreciated.

Regards,

Dr Hulk Norris
Principal Bioinformatician

From schuerer at genomining.com  Fri Sep 22 04:19:40 2006
From: schuerer at genomining.com (Katja Schuerer)
Date: Fri, 22 Sep 2006 10:19:40 +0200
Subject: [BiO BB] Programming for bioinformatics
Message-ID: <45139C9C.30407@genomining.com>

Hi,

  *************************************************************************
     Course in informatics for biology 2007 at Institut Pasteur
        http://www.pasteur.fr/formation/infobio-en.html

***    ATTENTION : Registration will be closed on October 15 2006.   ***

*************************************************************************

     In the series of courses offered at the Pasteur Institute, a course 
will be offered in informatics in biology. The next session will take 
place from January to end of April 2007.

     The main goal of this course is to provide researchers in biology 
an initial exposure to informatics. Admittance to the course is reserved 
for those with a degree in biology or a related discipline.

     With more and more bioinformatics tools available, it becomes 
increasingly important for researchers in biology to be able to 
manage their data, implement their ideas, and judge for themselves the 
usefulness of new algorithms and software.

     This course will emphasize fundamental aspects of computer science 
and apply them to biological examples. Theoretical aspects (algorithm 
development, logic, problem modeling and design methods), and technical 
applications (databases and web technologies) that are relevant for 
biologists will be thoroughly discussed.


     Programming is presented through the object-oriented paradigm, 
using a modern high-level language, Python, provided with tools for 
biology and enabling both prototyping or scripting and the building of 
important software systems. Learning of an additional language (C) will 
be available for interested students.

     Learning during the course will be reinforced with computing 
exercises, and effective training will be provided by a 2 month research 
project.

     The working language of the course is French.

For further information, please consult:

     http://www.pasteur.fr/formation/infobio-en.html

  *** Registration will be closed on October 15 2006. ***

Sincerely,

-- 
Catherine Letondal, Institut Pasteur & Katja Schuerer, Genomining
Course informatics for biology


From ykalidas at gmail.com  Sat Sep 23 19:15:19 2006
From: ykalidas at gmail.com (Kalidas Yeturu)
Date: Sun, 24 Sep 2006 04:45:19 +0530
Subject: [BiO BB] Seperation of Protein PDB into multiple units having
	binding sites
Message-ID: <5632703b0609231615h4506e921qca2c107904f9904e@mail.gmail.com>

Hello Everyone
 I am working on protein binding sites. I am not yet very familiar with
the terminology (subunits, domains, etc.).

 My work requires obtaining or splitting a protein PDB file into structural
units such that each has a binding site.
 For example 1A4G neuraminidase has two structural units - chain A and chain
B both having binding sites.
 But splitting a PDB based on chain-id alone, may not always be correct.
Some manually curated database would be better.
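
For reference, the naive chain-based split mentioned above can be sketched
in a few lines of Python. It relies only on the fixed-column PDB format, in
which the chain identifier is column 22 of ATOM and HETATM records. As noted,
this may not always be correct: for example, a binding site formed at the
interface of two chains would be cut in half. The function name is
hypothetical.

```python
from collections import defaultdict


def split_by_chain(pdb_lines):
    """Group ATOM/HETATM records by chain identifier (PDB column 22)."""
    chains = defaultdict(list)
    for line in pdb_lines:
        # Column 22 is index 21 in 0-based Python indexing.
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains[line[21]].append(line)
    return dict(chains)
```

Each value in the returned dictionary can be written out as its own file,
e.g. one file per chain of 1A4G.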

 I would be grateful if anyone can cite a database where protein-PDB files
corresponding to structural units having binding sites are provided for each
protein.

Thanking You

-- 
Kalidas Y
http://ssl.serc.iisc.ernet.in/~kalidas

From skhadar at gmail.com  Mon Sep 25 03:44:09 2006
From: skhadar at gmail.com (Shameer Khadar)
Date: Mon, 25 Sep 2006 00:44:09 -0700
Subject: [BiO BB] Seperation of Protein PDB into multiple units having
	binding sites
In-Reply-To: <5632703b0609231615h4506e921qca2c107904f9904e@mail.gmail.com>
References: <5632703b0609231615h4506e921qca2c107904f9904e@mail.gmail.com>
Message-ID: <b6ff81950609250044l57c673cesde3809cce95fd447@mail.gmail.com>

To get a hold of all the terminology related to subunits and subdomains,
grab this book and read it once or twice :)
Bioinformatics: Genes, Proteins & Computers
Eds. Christine Orengo and JM Thornton

To split your proteins, try the PROTEIN PEELING approach.
A web server is available here: http://www.ebgm.jussieu.fr/~gelly/

Happy splitting with your proteins,
Shameer Khadar
NCBS - TIFR


On 9/23/06, Kalidas Yeturu <ykalidas at gmail.com> wrote:
>
> Hello Everyone
>  I am working on protein binding sites. I am not yet very much familiar
> with terminology - subunits,domains etc.,
>
>  My work requires obtaining/splitting a protein PDB into various
> structural units such that each has binding site.
>  For example 1A4G neuraminidase has two structural units - chain A and
> chain B both having binding sites.
>  But splitting a PDB based on chain-id alone, may not always be correct.
> Some manually curated database would be better.
>
>  I would be grateful if anyone can cite a database where protein-PDB files
> corresponding to structural units having binding sites are provided for each
> protein.
>
> Thanking You
>
> --
> Kalidas Y
> http://ssl.serc.iisc.ernet.in/~kalidas
> _______________________________________________
> General Forum at Bioinformatics.Org -
> BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>
>
>

From dtheobald at brandeis.edu  Tue Sep 26 12:23:39 2006
From: dtheobald at brandeis.edu (Douglas L. Theobald)
Date: Tue, 26 Sep 2006 12:23:39 -0400
Subject: [BiO BB] THESEUS,
	a program for maximum likelihood superpositioning of macromolecules
Message-ID: <CB3EC48F-2389-4447-BB3C-E6259F9C91CD@brandeis.edu>

Announcing a fundamentally new way to superimpose structures: Maximum
likelihood instead of least squares.

http://www.theseus3d.org/


The Program:

THESEUS is a unix command line program for performing maximum likelihood
(ML) superpositions and analysis of macromolecular structures. While all
conventional superpositioning methods use ordinary least-squares as the
optimization criterion, THESEUS uses maximum likelihood, which provides
superpositions with substantially improved accuracy (see the figure at
http://www.theseus3d.org/ for an example). When superpositioning
macromolecules with different residue sequences, other programs and
algorithms currently discard residues that are aligned with gaps.
THESEUS, however, uses a novel ML algorithm that includes all of the
available data.


The Rationale:

Over 30 years ago, Cox, Diamond, McLachlan, Kabsch, and others
investigated and solved the least-squares superposition problem for
macromolecular structures (Flower 1999), and the least-squares method
has been used effectively ever since for comparing structures. However,
least-squares is not ideal. As a fitting criterion, least-squares is
based theoretically on two strong assumptions: (1) that all atoms in a
structure have the same variability and (2) that all atoms are
independent and uncorrelated. We know that both of these assumptions are
false. Some regions of a structure are more variable than others, and
atoms are connected to each other via chemical bonds. The ML method used
by THESEUS properly down-weights variable structural regions and
corrects for correlations among atoms.
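
As a rough illustration of the contrast (a simplified sketch, not the full
THESEUS model): for N structures with K atoms each, with x_{ik} the
coordinates of atom k in structure i, least squares picks rotations R_i,
translations t_i, and a mean structure m that minimize an unweighted sum of
squared deviations. Assuming hypothetical per-atom variances \sigma_k^2, a
variance-weighted criterion down-weights the variable atoms instead. This
diagonal weighting does not capture the correlations among atoms mentioned
above.

```latex
% Ordinary least squares: every atom contributes equally.
\min_{\{R_i,\,t_i\},\,m}\;
  \sum_{i=1}^{N}\sum_{k=1}^{K}
  \left\lVert R_i x_{ik} + t_i - m_k \right\rVert^{2}

% Variance-weighted variant (illustrative assumption): atom k is
% down-weighted by its variance \sigma_k^2.
\min_{\{R_i,\,t_i\},\,m}\;
  \sum_{i=1}^{N}\sum_{k=1}^{K}
  \frac{1}{\sigma_k^{2}}
  \left\lVert R_i x_{ik} + t_i - m_k \right\rVert^{2}
```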


The Benefits:

ML superpositioning is robust and insensitive to the specific atoms
included in the analysis.  In current practice, regions of structures
that are considered "unsuperimposable" or divergent are subjectively
excluded from the superposition.  However, when doing a ML
superposition, you do not need to hand prune selected variable atomic
coordinates, since the variability is already accounted for in the ML
method. ML superpositioning will greatly improve our ability to
accurately compare biological macromolecules in many applications,
including analysis of NMR families, alternate crystal structures,
evolutionarily homologous molecules, molecular dynamics simulations, and
de novo structure predictions.


Output from THESEUS includes both likelihood-based and frequentist
statistics for evaluation of the adequacy of a superposition and for
reliable analysis of structural similarities and differences. Residue
ranges for excluding/including in the superposition can be specified on
the command line. For ease of comparison, THESEUS also calculates
least-squares superpositions.  Additionally, THESEUS performs principal
components analysis (PCA) for analyzing the complex correlations found
among the atoms and residues within a structural ensemble.


Source code and binaries for several platforms are available from:

http://www.theseus3d.org/


Refs:

Theobald, D.L. and Wuttke, D.S. (2006)
"THESEUS: Maximum likelihood superpositioning and analysis of
macromolecular structures."
Bioinformatics 22(17):2171
http://bioinformatics.oxfordjournals.org/cgi/content/abstract/22/17/2171

Overview of mathematical results and algorithm (supplementary materials
from Theobald & Wuttke 2006):
http://www.theseus3d.org/pdfs/Theobald_Wuttke_2006_Bioinformatics_THESEUS_SuppMat.pdf

Theobald, D. L. and Wuttke, D. S. (2006)
"Empirical Bayes hierarchical models for regularizing maximum likelihood
estimation in the matrix Gaussian Procrustes problem."
PNAS, in press


Cox, J. M. (1967)
"Mathematical methods used in the comparison of the quaternary structures."
J Mol Biol, 28, 151-156.

Diamond, R. (1966)
"A mathematical model-building procedure for proteins."
Acta Crystallogr, 21, 253-266.

Diamond, R. (1976)
"On the comparison of conformations using linear and quadratic transformations."
Acta Crystallogr A, 32, 1-10.

Flower, D. R. (1999)
"Rotational superposition: A review of methods."
J Mol Graph Model, 17, 238-244.

Kabsch, W. (1978)
"A discussion of the solution for the best rotation to relate two sets
of vectors."
Acta Crystallogr A, 34, 827-828.

McLachlan, A. (1972)
"A mathematical procedure for superimposing atomic coordinates of proteins."
Acta Crystallogr A, 28, 656-657.


From john_abraham_bio at yahoo.com  Thu Sep 28 02:58:00 2006
From: john_abraham_bio at yahoo.com (John Abraham)
Date: Wed, 27 Sep 2006 23:58:00 -0700 (PDT)
Subject: [BiO BB] Re: NCBI Sequence revision history data
In-Reply-To: <20060920032155.89467.qmail@web34404.mail.mud.yahoo.com>
Message-ID: <20060928065800.15987.qmail@web57007.mail.re3.yahoo.com>

The README file keeps track of such changes.

govind mk <mkgovindis at yahoo.com> wrote:    Hi 
   
  Is anyone aware of any database that keeps track of the sequence revision history of NCBI accessions?
   
  If such a database is available, is the data available in a downloadable format, or can it be accessed by a program?
   
  I have had a look at NCBI. The NCBI Sequence Revision History db is not available for download.
   
   
  Thank you
   
  Regards,
  Govind
    

From invitation at iaria.org  Thu Sep 28 03:51:11 2006
From: invitation at iaria.org (invitation at iaria.org)
Date: Thu, 28 Sep 2006 00:51:11 -0700 (PDT)
Subject: [BiO BB] Second Call for Submissions ||  ICCGI 2007 || ICWMC 2007, 
 Guadeloupe, March 2007
Message-ID: <7089827.1159429871479.JavaMail.Onitza@Oana2>

CALL FOR PAPERS

The Third International Conference on Wireless and Mobile Communications

ICWMC 2007

Date: March 4-9, 2007

Place: Guadeloupe, French Caribbean

Site: http://www.iaria.org/conferences2007/ICWMC07.html

Submit: http://www.iaria.org/conferences2007/SubmitICWMC07.html
 

Important deadlines:

Full paper submission, October 15, 2006

Author notification, November 15, 2006 

Registration/camera ready, December 1, 2006


CALL FOR PAPERS

The Second International Multi-Conference on Computing in the Global Information Technology

ICCGI 2007

Date: March 4-9, 2007

Place: Guadeloupe, French Caribbean

Site: http://www.iaria.org/conferences2007/ICCGI07.html

Submit: http://www.iaria.org/conferences2007/SubmitICCGI07.html
        
includes:

- IPv6TD 2007 : The Second International Workshop on IPv6 Today - Technology and Deployment

  http://www.iaria.org/conferences2007/IPV6TD.html

- MOC 2007 : The Second International Workshop on Modeling, Optimization, and Complexity

  http://www.iaria.org/conferences2007/MOC.html


Important deadlines:

Full paper submission, October 15, 2006 

Author notification, November 15, 2006 

Registration/camera ready, December 1, 2006


Note:

For ICWMC 2006 and ICCGI 2006 programs, awards, photos, see:

http://www.iaria.org/conferences/ICW06.html

http://www.iaria.org/conferences/ICCGI06.html


Publicity Board

======================================================================= 

To be removed from this announcement list, please reply to this email with UNSUBSCRIBE in the subject line. 


From phoebe at deakin.edu.au  Fri Sep 29 00:32:24 2006
From: phoebe at deakin.edu.au (Phoebe Chen)
Date: Fri, 29 Sep 2006 14:32:24 +1000
Subject: [BiO BB] APBC2007 Call for Posters/Tutorials by Tomorrow
Message-ID: <5.1.1.5.2.20060929143141.0387e470@mail.deakin.edu.au>

Dear Colleagues,

We apologize if you receive multiple copies of this
call for posters and tutorials.

Regards,
Organizing Committee of APBC 2007
------------
CALL FOR POSTERS/TUTORIALS (APBC 2007)
Asia-Pacific Bioinformatics Conference, APBC 2007
will be held in Hong Kong during 15-17 January 2007.
http://www.cs.hku.hk/apbc2007

Please consider submitting a poster or holding a tutorial
at the conference. The deadline for submission is
30 Sept, 2006.

The details of the call for posters and call
for tutorials can be found here:
http://www.cs.hku.hk/apbc2007/callforposters.htm
http://www.cs.hku.hk/apbc2007/callfortutorial.htm
We look forward to seeing you in Hong Kong, an
exciting place to explore.

Regards, Organizing Committee of APBC 2007


From asidhu at biomap.org  Sat Sep 30 13:32:04 2006
From: asidhu at biomap.org (Amandeep S. Sidhu)
Date: Sun, 1 Oct 2006 03:32:04 +1000 (EST)
Subject: [BiO BB] (no subject)
Message-ID: <49639.144.136.102.223.1159637524.squirrel@biomap.org>

2007 IEEE Workshop on Biomedical Applications for Digital Ecosystems (BADE
2007) with

Inaugural IEEE International Digital Ecosystems and Technology Conference
2007

20 February 2007, Cairns, Australia

http://bade07.biomap.org/

Scope of the workshop:

The primary focus of the BADE 2007 workshop is to share research applications
using biomedical data and to identify new issues and directions for future
research in biomedical applications. Authors are invited to submit
original papers to the workshop exploring theories, techniques, and
applications for biomedicine. Papers are invited on, but not limited to, the
following themes:

* Bioinformatics and Computational Biology
* Data Representation and Visualization
* Biological Databases & Data Integration
* Microarray analysis
* Protein and RNA structure prediction
* Feature selection and pattern discovery in biological data
* Systems Biology and Pathways
* Biomedical Ontologies and taxonomies
* Text Mining
* Health Care Information Systems
* Electronic Health Records
* Clinical Assessment and Patient Diagnosis
* Disease Control and Prevention
* Privacy and Security in Healthcare

Important Dates:

* Submission of Full Papers:		November 10, 2006
* Notification of Acceptance:		December 10, 2006
* Camera-ready Copies of papers:	December 31, 2006

Paper Submission Procedures:

All paper submissions will be handled electronically at:

http://bradleyuniversityvolunteer.ieee-ies.org/submit/dest07/

* Authors are invited to submit electronically a full paper (6 pages,
about 4500 words, PDF file) of their original work.
* Select "W01: Biomedical Applications" in the Technical Track drop-down menu
on the Author Page.

High quality papers in biomedical applications are solicited. Original
papers exploring new directions will receive especially careful and
supportive reviews. Papers that have already been accepted or are
currently under review for other conferences or journals will not be
considered for publication at IEEE DEST 2007.

Paper submissions should be in the IEEE 2-column format, and will be
reviewed by the Program Committee on the basis of technical quality,
relevance, originality, significance, and clarity. Accepted IEEE BADE 2007
papers will be published in the conference proceedings by the IEEE Industrial
Electronics Society and will be included in the EI index and IEEE Xplore.

General Chairs:

Tharam S. Dillon (University of Technology Sydney, Australia)
Elizabeth Chang (Curtin University of Technology, Australia)

Program Chairs:

Amandeep S. Sidhu (University of Technology Sydney, Australia)
Xiaohua Hu (Drexel University, USA)
Farookh Hussain (Curtin University of Technology, Australia)
Maja Hadzic (Curtin University of Technology, Australia)

For further inquiries, please contact bade07 at biomap.org

