Home Page

Project: Simulation and Analysis of putative Effects of the Cancer Genome on lineage selection during somatic cancer evolution

Most clinically distinguishable malignant tumors are characterized by specific mutations, specific patterns of chromosomal rearrangements and a predominant mechanism of genetic instability. It remains unresolved, however, whether modifications of cancer genomes can be explained solely by mutations and by positive or negative selection through the cancer microenvironment.

It has been suggested that the internal dynamics of genomic modifications, as opposed to external evolutionary forces, have a significant and complex impact on Darwinian species evolution. A similar situation can be expected for somatic cancer evolution, because key molecular mechanisms encountered in species evolution also constitute prevalent mutation mechanisms in human cancers.

The principal hypothesis is that permissive or restrictive effects of the genome architecture on lineage selection during somatic cancer evolution exist and have an impact which is comparable in magnitude to the effects of selection by the tumor microenvironment. The goal of the project is to develop software applications which demonstrate, analyze or predict the impact of the genome architecture on lineage selection during somatic cancer evolution.

The starting point is a simulation of somatic cancer evolution with unequal sister chromatid exchange as the only mutation mechanism (CSim-2000 6.14). Improvements of this program or additional approaches for data mining of cancer genomes and cancer protein function networks are welcome.

Link to Independent Institute of Systems Sciences Aachen: http://www.iiss-aachen.de/41563/home.html.

The program can be downloaded under the GNU Free Documentation License at http://ftp.bioinformatics.org/pub/canevolvability/ or at http://www.iiss-aachen.de/41563/43374.html.

What the program does:

The program simulates the evolution of a cancer starting with one cancer cell which divides and acquires mutations through the process of unequal sister chromatid exchange (USCE). The simulation starts with one cell with one chromosome pair, each chromosome containing 16 genes: four essential genes (E1 to E4), four neutral genes (N1 to N4), four oncogenes (O1 to O4) and four tumor suppressor genes (T1 to T4). The order of these genes on the chromosomes can be varied, as can the frequency of mutations, the maximum number of genes involved in a USCE event and the number of cell divisions. The simulation only considers the end result of reciprocal USCE, where one chromatid loses and the sister chromatid gains a contiguous stretch of genetic material.

The simplified mutation process consists of four random steps. First, a random generator selects one cell to mutate (R1). After doubling of the chromosomes of this cell, the random generator chooses one chromatid to change (R2). It then picks one gene on the chromatid as the first breakpoint (R3) and finally selects a second breakpoint within a predefined number of genes (R4). The chromatid segment defined by these two breakpoints is transferred to the sister chromatid, resulting in a duplication on the sister chromatid and a deletion on the donor chromatid.

Example:

1. The first cell contains two identical chromosomes (diploid state) with the following genes:

Chromosome 1: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4
Chromosome 2: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4

Cells proliferate with this genome until a random generator selects one cell to mutate (R1).

The cell genome will mutate as follows (example):

2. After doubling of the chromosomes of this cell, a random generator chooses one chromatid to change (R2):

Chromosome 1-chromatid-1: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4
Chromosome 1-chromatid-2: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4 (this one will mutate)

Chromosome 2-chromatid-1: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4
Chromosome 2-chromatid-2: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4

3. The random generator then picks one gene on the chromatid as first breakpoint (R3 example = between E4 and T1) and the random generator selects a second breakpoint within a predefined length of genes (R4 example = between T4 and O1).

Chromosome 1-chromatid-1: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4
Chromosome 1-chromatid-2: E1-E2-E3-E4 | T1-T2-T3-T4 | O1-O2-O3-O4-N1-N2-N3-N4

Chromosome 2-chromatid-1: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4
Chromosome 2-chromatid-2: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4

4. The chromatid segment defined by these two breakpoints is transferred to the sister chromatid resulting in a duplication on the sister chromatid and a deletion on the donor chromatid.

Chromosome 1-chromatid-1: E1-E2-E3-E4-T1-T2-T3-T4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4
Chromosome 1-chromatid-2: E1-E2-E3-E4-O1-O2-O3-O4-N1-N2-N3-N4

Chromosome 2-chromatid-1: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4
Chromosome 2-chromatid-2: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4

5. The cell divides and the chromatids are distributed equally to the daughter cells:

Daughter cell 1:

Chromosome 1: E1-E2-E3-E4-T1-T2-T3-T4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4
Chromosome 2: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4

Daughter cell 2:

Chromosome 1: E1-E2-E3-E4-O1-O2-O3-O4-N1-N2-N3-N4
Chromosome 2: E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4

The daughter cells continue to divide until the next mutation occurs.
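The worked example above can be retraced with a short Python sketch. This is not the source code of the program; the function names `usce` and `divide_with_usce` are hypothetical, and the sketch assumes that the second breakpoint always lies to the right of the first one and that the transferred segment is inserted directly after the sister chromatid's own copy, as in step 4 of the example.

```python
import random

START = "E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4".split("-")


def usce(chromosome, first, length):
    """Return (donor, sister) after one reciprocal USCE event.

    After doubling, both sister chromatids are copies of `chromosome`.
    The donor loses `length` genes starting at index `first`; the sister
    keeps its own copy and receives the transferred segment right after it.
    """
    segment = chromosome[first:first + length]
    donor = chromosome[:first] + chromosome[first + length:]
    sister = chromosome[:first + length] + segment + chromosome[first + length:]
    return donor, sister


def divide_with_usce(cell, max_len, rng):
    """Apply steps R2 to R4 to `cell` (a list of chromosomes) and divide it."""
    candidates = [i for i, chrom in enumerate(cell) if chrom]
    if not candidates:                      # no genes left to exchange
        return list(cell), list(cell)
    idx = rng.choice(candidates)            # R2: chromosome whose chromatid mutates
    chrom = cell[idx]
    first = rng.randrange(len(chrom))       # R3: breakpoint left of gene `first`
    length = rng.randint(1, min(max_len, len(chrom) - first))  # R4 (assumed range)
    donor, sister = usce(chrom, first, length)
    daughter1 = list(cell)
    daughter1[idx] = sister                 # daughter with the duplication
    daughter2 = list(cell)
    daughter2[idx] = donor                  # daughter with the deletion
    return daughter1, daughter2


# Breakpoints of the example: between E4 and T1, and between T4 and O1.
donor, sister = usce(START, first=4, length=4)
print("-".join(sister))  # E1-E2-E3-E4-T1-T2-T3-T4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4
print("-".join(donor))   # E1-E2-E3-E4-O1-O2-O3-O4-N1-N2-N3-N4
```

Calling `divide_with_usce([START, START], 8, random.Random())` produces two daughter cells with a random exchange. The total number of genes across both daughters stays constant, because every gene deleted from one chromatid is duplicated on its sister.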

The computer simulation allows simulating either a linear chromosome, where the mutation probability of each gene also depends on its relative position on the chromatid, or a ring chromosome with uniform mutation probability. In the linear chromosome model, the maximum number of genes transferred from the donor chromatid to the acceptor chromatid depends on the number of genes in the telomere (right) direction of the first breakpoint, whereas in the ring chromosome model the maximum number of transferred genes is the total number of genes on the donor chromatid.
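The difference between the two chromosome models can be illustrated with a small Python sketch. It is an interpretation rather than the program's actual code: the names `max_transfer` and `ring_segment` are hypothetical, "depends on the number of genes in telomere direction" is read as "equals that number", and the ring model is assumed to let a segment wrap around the end of the gene list. Mode 3 corresponds to the user-defined maximum (between 1 and 8) offered by the program.

```python
def max_transfer(n_genes, first, model, user_max=None):
    """Upper bound on the number of genes moved in one exchange.

    n_genes: number of genes on the donor chromatid
    first:   index of the first gene right of the first breakpoint
    model:   1 = linear, 2 = ring, 3 = ring with user-defined maximum
    """
    if model == 1:
        return n_genes - first        # only genes towards the right telomere
    if model == 2:
        return n_genes                # the whole ring can be transferred
    if model == 3:
        if user_max is None or not 1 <= user_max <= 8:
            raise ValueError("user_max must be between 1 and 8")
        return min(user_max, n_genes)
    raise ValueError("model must be 1, 2 or 3")


def ring_segment(chromatid, first, length):
    """Genes of a segment on a ring chromosome (assumed to wrap around)."""
    n = len(chromatid)
    return [chromatid[(first + k) % n] for k in range(length)]


genes = "E1-E2-E3-E4-T1-T2-T3-T4-O1-O2-O3-O4-N1-N2-N3-N4".split("-")
print(max_transfer(len(genes), 12, model=1))  # 4
print(max_transfer(len(genes), 12, model=2))  # 16
print(ring_segment(genes, 14, 4))             # ['N3', 'N4', 'E1', 'E2']
```

In this reading, genes near the right end of a linear chromosome can only take part in short exchanges, which is one way the position of a gene could influence its mutation probability.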

The simulation differentiates between five stages.

The simulation starts with cells in stage one. If one cell loses either tumor suppressor gene one or two (T1 or T2) it proceeds to stage two. If one cell acquires, in addition, an extra copy of oncogenes one or two (O1 or O2) it advances to stage three. With an additional loss of tumor suppressor genes three or four, the cell enters stage four and an additional extra copy of oncogenes three or four (O3 or O4) allows progression to the final stage five. Duplications or losses of neutral genes have no effects. If a cell loses both copies of one of the four essential genes E1 to E4, the cell is eliminated. The cell is also eliminated when the total number of genes exceeds 64.

Within the simulation, proliferation rates and mutation rates do not differ between stages. The only selection effect occurs through elimination of non-viable cells with either homozygous loss of essential genes or a total gene number exceeding 64.

How to use the program:

The program runs under Linux. It has to be started in a terminal window with the command: “./sim”. The result data will then be saved as “ergebnis.sim”. To save the result file under another name, for example “results01”, type: “./sim results01”.

After starting the program, you will be asked for the program parameters.

1. The program first asks whether you want to enter a new gene sequence for the chromosome (option 1, "ein neues Genmuster eingeben") or use the sequence of the last simulation (option 2, "das Genmuster der letzten Simulation benutzen"). The file "genmuster.sim" contains the chromosome sequence of the latest simulation run.

If option 1 is chosen, the program asks for each gene in turn: the first gene ("1. Gen"), the second gene and so on. Genes E1 to E4, T1 to T4, O1 to O4 and N1 to N4 can be entered. One can enter multiple copies of a gene or omit genes.

2. The program then asks whether the user wants to enter new simulation parameters (option 1, "die Simulationsparameter eingeben") or use the simulation parameters of the last simulation (option 2, "die Parameter der letzen Simulation benutzen").

If option 1 is used, the program will then ask for the maximum number of cell divisions, which should be below 50 depending on memory space and mutation probability.

3. The program will then ask for the mutation probability, which is entered as its reciprocal value; for example, for a mutation probability of 1:100, enter "100".

4. The program will ask how the simulation should be stopped. Option 1 ("Durchlauf des Programms bis zur x. Teilung") means that it will stop after the last cell division. Option 2 means that it will stop after x cells have reached stage 5, or after the last cell division if this goal has not been reached.

If option 2 is entered, the program asks for the number x of cells which have to reach stage 5 in order to stop the program early.

5. The program asks for one of the following mutation options:

      1: This is the linear chromosome model. The maximum number of genes transferred from the donor chromatid to the acceptor chromatid depends on the number of genes in the telomere (right) direction of the first breakpoint.
      2: This is the ring chromosome model. The maximum number of genes transferred from the donor chromatid to the acceptor chromatid is the total number of genes on the donor chromatid.
      3: The user enters the maximum number of transferred genes (between 1 and 8). The model otherwise behaves like model 2.

6. The simulation starts computing after input of these parameters. When computing has ended, the program displays the computing time.
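The parameters requested in steps 2 to 5 can be collected in a small Python data structure, for example when planning a series of runs. This is only an illustrative sketch: the class `SimulationParameters` and its field names are hypothetical and are not part of the program, which reads its parameters interactively.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimulationParameters:
    max_divisions: int                     # should be below 50 (see step 2)
    mutation_reciprocal: int               # 100 means a probability of 1:100
    stop_option: int = 1                   # 1 = run to last division, 2 = stop early
    stage5_target: Optional[int] = None    # x, only needed for stop option 2
    mutation_option: int = 1               # 1 = linear, 2 = ring, 3 = user maximum
    max_transferred: Optional[int] = None  # 1 to 8, only for mutation option 3

    def __post_init__(self):
        if self.max_divisions < 1:
            raise ValueError("max_divisions must be at least 1")
        if self.mutation_reciprocal < 1:
            raise ValueError("mutation_reciprocal must be at least 1")
        if self.stop_option not in (1, 2):
            raise ValueError("stop_option must be 1 or 2")
        if self.stop_option == 2 and (self.stage5_target is None
                                      or self.stage5_target < 1):
            raise ValueError("stop option 2 needs a positive stage5_target")
        if self.mutation_option not in (1, 2, 3):
            raise ValueError("mutation_option must be 1, 2 or 3")
        if self.mutation_option == 3 and (self.max_transferred is None
                                          or not 1 <= self.max_transferred <= 8):
            raise ValueError("mutation option 3 needs max_transferred in 1..8")

    @property
    def mutation_probability(self):
        """Probability corresponding to the reciprocal value that is entered."""
        return 1.0 / self.mutation_reciprocal


params = SimulationParameters(max_divisions=20, mutation_reciprocal=100)
print(params.mutation_probability)  # 0.01
```

The validation mirrors the ranges stated in the steps above; the recommendation to stay below 50 cell divisions is left as a comment because it depends on memory space and mutation probability.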

Data output:

The data are contained in the result file, which can be read with a text editor.

The first two lines contain the simulation parameters: “Verwende Mutationswahrscheinlichkeit” = mutation frequency and “Verwende Mutationsart” = mutation option.

The following lines, labelled “Stufe”, indicate the cell division numbers at which the progression stages (“Progressionsstufe”, 2 to 5) were reached.

The following five lines display the chromosome sequences of the starting cell (“Startzelle”).

Then the result file displays the number of cells (“Gesamtanzahl”) with this sequence which are present at the end of the simulation. The "Anzahl der reproduzierten Zellen" indicates how often this cell has been reproduced through mutation from cells with different genomes. “Geburt, Evolutionsstufe” indicates the division number (“Geburt”) at which the starting cell first occurred and the number of mutation steps (“Evolutionsstufe”) which were necessary to produce the cell (for the starting cell, both are necessarily 0).

Then the result file displays the total number of cells at the end of the simulation (“Gesamtanzahl an Zellen”), after which a table displays the progression stages (“stufe”) 1 to 5, the number of cells in each progression stage and the number of different cell types (“verschiedene Zelltypen”) in each progression stage.

This is followed by a line “Anzahl Zelltypen insgesamt..“ with the total number of different cell types at the end of the simulation.

Then the file displays statistics on the number of genes in the cells. The first line shows the mean number of genes per cell type (“Durchschnittliche Genanzahl eines Zelltyps”). The table shows the distribution, ranging from 4 genes (the minimum possible number) to 64 genes (the maximum allowed number).

The following line (“Folgende Stufenuebergaenge …“) indicates the number of stage transitions from stages 1 to 4 into stages 2 to 5 which occurred during the simulation.

Then the result file displays the genome of the cell which first reached stage 5 (“Erste Zelle der Stufe 5”) as well as the number of cells (“Gesamtanzahl”) with this sequence in stage 5 at the end of the simulation. The "Anzahl der reproduzierten Zellen" indicates how often this cell has been reproduced through mutation from cells with different genomes. “Geburt, Evolutionsstufe” indicates the cell division number (“Geburt”) at which the cell first occurred and the number of mutation steps which were necessary to produce the cell.

If stop option 2 has been used, the result file also displays the ancestry ("Erblinie") of the cells which have reached stage 5. For each cell type of the lineage leading to the first cell in stage 5, the result file displays “Geburt” and “Evolutionsstufe”.
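Because the result file uses German labels, a small Python helper can make it easier to locate the sections described above. The following is a hedged sketch: it only searches for the labels quoted in this README, without assuming anything about the exact layout of the file, and the names `LABELS`, `index_result_sections` and `index_result_file` are hypothetical. The spelling of the labels in an actual result file may differ from the spelling quoted here, so the search ignores upper and lower case.

```python
# German labels quoted in this README, with their English meaning.
LABELS = {
    "Verwende Mutationswahrscheinlichkeit": "mutation frequency",
    "Verwende Mutationsart": "mutation option",
    "Startzelle": "starting cell",
    "Gesamtanzahl": "number of cells",
    "Anzahl der reproduzierten Zellen": "times reproduced by mutation",
    "Durchschnittliche Genanzahl eines Zelltyps": "mean genes per cell type",
    "Erste Zelle der Stufe 5": "first cell in stage 5",
    "Erblinie": "ancestry",
}


def index_result_sections(text):
    """Return (line number, label, meaning) for every label found in `text`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for label, meaning in LABELS.items():
            if label.lower() in lowered:
                hits.append((lineno, label, meaning))
    return hits


def index_result_file(path="ergebnis.sim"):
    """Read a result file and index its labelled lines.

    The encoding of the file is not documented, so undecodable bytes
    are replaced instead of raising an error.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        return index_result_sections(handle.read())
```

Calling `index_result_file()` in the directory of a finished run returns the line numbers of the labelled sections, which can then be inspected in a text editor.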

The program package consists of the following files:

sim: the executable file
README: this file

The program produces the file "genmuster.sim" and the results file.

LICENCE AGREEMENT:

The program and all accompanying software, files, data and materials are distributed and provided "AS IS" and with no warranties of any kind, whether express or implied, including, without limitation, any warranty of merchantability or fitness for a particular purpose. Neither Albert Rübben nor the authors of the program warrant, guarantee, or make any representations regarding the use of, or the results of the use of, the program. Neither Albert Rübben nor the authors of the program are liable for any damage or financial loss arising out of the use of, or inability to use, the program. By using, installing, copying, distributing, or transmitting the program, the user agrees to all of the terms of this license agreement.