Occupancy modeling, maximum contig size probabilities and
designing metagenomics experiments
Stephen A. Stanhope
University of Chicago, Biological Sciences Division
June 14, 2010
These R scripts accompany "Occupancy modeling, maximum contig size probabilities and designing metagenomics
experiments," Stephen A. Stanhope (2010). They enable the user to replicate most of the results presented in
the paper and to compute the experimental designs described therein. To use them, edit each script for the
desired number of genomes, genome length, number of reads, etc., then run it with the "source" command in an
R session or with "Rscript" from the command line. They should be considered research code and assume
reasonable proficiency in R. Please contact the author with any questions.
Included are the following:
maximum_contig_length_simulation.R - This code performs simulations of genome assemblies and related
Wendl and expected overlap tiling discretizations, and reports distributions of maximum contig sizes over
a number of iterations. Additionally, it produces the Poisson approximation of the maximum contig size.
These results are described in "Largest contig size probabilities for a single genome."
design_single_novel.R - This code computes experimental designs as described in the "Detecting a single novel
species in a pool of known species" subsection.
design_fixed_pool.R - This code computes experimental designs described in "Obtaining contigs representative
of a pool of species."
design_fixed_pool_distributed_size_abundance.R - This code computes experimental designs described in "Fixed pool
sizes with distributed genome sizes and abundances."
design_stochastic_pool_distributed_size_abundance.R - This code computes experimental designs described in
"Stochastic pools with distributed genome sizes and abundances."
metagenome_assembly_simulation.R - This code simulates whole metagenome assemblies and is used to verify the
experimental designs computed in "Stochastic pools with distributed genome sizes and abundances."
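As a rough illustration of what the two simulation scripts do, the sketch below simulates maximum contig
sizes by direct read placement. It is not the author's code, and it does not implement the Wendl or expected
overlap tiling discretizations or the Poisson approximation. It assumes linear genomes, equal-length reads
with uniformly distributed start positions, a minimum overlap of one base to join reads, and reads allocated
to genomes in proportion to abundance times genome length. All function names, parameter names and numeric
values are hypothetical.

```r
# Lengths of the contigs formed by equal-length reads with the given start
# positions. Two consecutive reads join when they overlap by >= min_overlap.
contig_lengths <- function(starts, read_length, min_overlap = 1) {
  starts <- sort(starts)
  # A new contig begins wherever the gap between consecutive starts is too
  # large for the two reads to overlap by min_overlap bases.
  new_contig <- c(TRUE, diff(starts) > read_length - min_overlap)
  contig_id <- cumsum(new_contig)
  as.vector(tapply(starts, contig_id,
                   function(s) max(s) - min(s) + read_length))
}

# Largest contig from one simulated assembly of a single linear genome.
simulate_max_contig <- function(genome_length, n_reads, read_length) {
  if (n_reads == 0) return(0)
  starts <- sample.int(genome_length - read_length + 1, n_reads,
                       replace = TRUE)
  max(contig_lengths(starts, read_length))
}

# Single genome: empirical distribution of the maximum contig size over
# a number of iterations (illustrative values only).
set.seed(1)
single <- replicate(200, simulate_max_contig(5000, 100, 100))
print(quantile(single, c(0.05, 0.5, 0.95)))

# Pool of genomes: reads are split among genomes by a multinomial draw with
# probability proportional to abundance * genome length (an assumption).
genome_lengths <- c(5000, 8000, 3000)
abundances <- c(0.6, 0.3, 0.1)

simulate_pool <- function(n_reads, read_length) {
  reads_per_genome <- as.vector(
    rmultinom(1, n_reads, abundances * genome_lengths))
  mapply(simulate_max_contig, genome_lengths, reads_per_genome,
         MoreArgs = list(read_length = read_length))
}

# One row per genome, one column per simulated metagenome assembly.
pool <- replicate(200, simulate_pool(300, 100))
print(rowMeans(pool))
```

For example, contig_lengths(c(1, 5, 20), 10) joins the reads starting at positions 1 and 5 into a contig of
length 14 and leaves the read starting at position 20 as a separate contig of length 10. A design could then
be checked by estimating, from the simulated matrix, how often each genome yields a contig of at least a
target size.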