OVERVIEW

The files used for this study fall into three main categories: the files necessary to run an evolutionary simulation, the files necessary to analyze the output, and some R code to plot the output. Almost all of the code is written in Python, though a small amount of Cython code is used to link to ViennaRNA, the real workhorse of the code, which estimates RNA secondary structure from sequence input.

COMPILATION AND DEPENDENCIES

Most of the code is Python, but it also depends on ViennaRNA (a C library) and Cython, which provides linker code to let Python interact with ViennaRNA.

First, download and install ViennaRNA (http://www.tbi.univie.ac.at/RNA/) and Cython (http://cython.org/). Then modify the Setup2.py file so that the path to ViennaRNA matches where you installed it. To compile the Cython module (pyfold.so), run "python Setup2.py build_ext -i".

SIMULATION

Simulations use a total of four files: pymut_rec.py, pymutfun_rec.py, pyrna.so (compiled from pyrna.pyx, a Cython module), and globs.py. To run a simulation, use Python to run pymut_rec.py, along with the required and optional inputs described below.

REQUIRED INPUT AND OPTIONAL SWITCHES

Minimally, you have to feed the program a mostly useless input file and the selection coefficient (the fitness function parameter in the examples below). I say mostly useless because most of the settings can now be assigned as switches at run time, making it easier to run on a cluster. Here are the settings in line order:
    
    1. Run name
    2. file containing initial genotype(s)
    3. number of chromosomes per genotype
    4. maximum population size
    5. number of generations to run the simulation
    6. write current genotypes and associated data to a file every X generations
    7. write population summary data to a file every X generations
    8. per base probability of no mutation (i.e. 1 - mutation rate)
    9. average number of offspring produced by best phenotype
    10. recombination rate
    11. segregation rate
    12. optimal phenotype
    13 onwards. optimal phenotype(s) of additional chromosomes
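
As a rough illustration of the line order above, the following sketch builds the contents of a settings file with one value per line. The one-value-per-line layout, the function and argument names, and every value shown (including the dot-bracket placeholder for the optimal phenotype) are assumptions made for illustration. Check an existing input file such as pymut_1 for the exact format the program expects.

```python
# Hypothetical helper: arranges settings in the line order listed above.
# The one-value-per-line layout is an assumption, not confirmed by the source.

def settings_lines(run_name, genotype_file, chrom_num, max_pop, generations,
                   dump_every, summary_every, mutation_rate, best_offspring,
                   rec_rate, seg_rate, optimal_phenotypes):
    """Return the settings as a list of strings, one per input-file line."""
    fidelity = 1.0 - mutation_rate  # line 8 stores 1 - mutation rate
    values = [run_name, genotype_file, chrom_num, max_pop, generations,
              dump_every, summary_every, fidelity, best_offspring,
              rec_rate, seg_rate]
    # Line 12 onwards: one optimal phenotype per chromosome.
    values.extend(optimal_phenotypes)
    return [str(value) for value in values]


def write_settings(path, **settings):
    """Write the settings to `path`, one value per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(settings_lines(**settings)) + "\n")


# Placeholder values only; none of these come from the study.
example = settings_lines(
    run_name="demo_run",
    genotype_file="start_genotypes.txt",
    chrom_num=1,
    max_pop=1000,
    generations=5000,
    dump_every=500,
    summary_every=10,
    mutation_rate=0.001,
    best_offspring=2,
    rec_rate=0.0,
    seg_rate=0.0,
    optimal_phenotypes=["((((....))))"],
)
```

Taking the mutation rate as an argument and converting it to fidelity in one place avoids mixing up the two quantities when editing files by hand.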
    
Any of these values is ignored in favor of the corresponding command-line option if one is given.
Here are the switches. For example, you can assign the input population file with -p or --popfile.

    --noben (short: -n)
        prevent beneficial mutations from occurring
    --rs_mut (short: -m)
        mutation rate for rate of segregation or recombination
    --ext (short: -e)
        if true, populations cannot go extinct
    --fitfun
        fitness function as hyperbolic ('h') or linear ('l')
    --comp_freq (short: -t)
        frequency of asexual individuals at start of populations
    --fidelity (short: -f)
        per base probability of no mutation (i.e. 1 - mutation rate)
    --rec (short: -r)
        rate of crossover events per generation
    --seg (short: -s)
        rate of chromosomal segregation per generation
    --output (short: -o)
        run name
    --popfile (short: -p)
        read in a given population output file from a previous simulation, only for asexual populations
    --chrom_num (short: -c)
        number of chromosomes per individual
    --dom (short: -d)
        for multiple allele simulations, how is fitness calculated?
        d=0 means least fit allele does not bear into fitness calculation
        d=0.5 means fitness is intermediate
        d=1.0 means fitness determined by least-fit allele
    --rs_linkage (short: -l)
        linkage between fitness genes and recombination allele
    --evpop (short: -v)
        popfile analog for sexual population output
    --c2_fitpar (short: -x)
        fitness function for second chromosomes
    --ploidy
        number of alleles per gene
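
One possible reading of the --dom description is a linear weighting between the fittest and the least-fit allele. The sketch below illustrates that interpretation only. The function name and the interpolation formula are assumptions, and the actual calculation in pymutfun_rec.py may differ.

```python
def combined_fitness(allele_fitnesses, d):
    """Blend the best and worst allele fitness using dominance weight d.

    Assumed formula (not taken from the source code):
        w = (1 - d) * max(w_i) + d * min(w_i)
    so d=0 ignores the least-fit allele and d=1 uses only the least-fit one.
    """
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must lie between 0 and 1")
    best = max(allele_fitnesses)
    worst = min(allele_fitnesses)
    return (1.0 - d) * best + d * worst
```

Under this assumed formula, two alleles with fitnesses 1.0 and 0.5 give 1.0 at d=0, 0.75 at d=0.5, and 0.5 at d=1.0.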
    
Many of the switches relate to recombination options. For example, --rs_mut relates to whether the rate of recombination/segregation itself can evolve; -t runs an evolutionary competition between genotypes that can recombine/segregate and genotypes that do not recombine in any fashion; -r and -s set the rates of recombination (crossover) and segregation, respectively.

EXAMPLES

python pymut_rec.py pymut_2 2.5 -r 0.0
    reads in the input file pymut_2, with a fitness function parameter setting of 2.5 (corresponding to "weak selection" in the manuscript) and a recombination rate of 0.
    
python pymut_rec.py pymut_1 15.0 -s 0.1 -f .999
    reads in the input file pymut_1, with a fitness function parameter setting of 15, a segregation rate between chromosomes of 0.1 per individual per generation, and a fidelity of 0.999.
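
Because the settings can be passed as switches, a batch of runs for a cluster can be generated from a short script. The following is a minimal sketch that only assembles and prints command lines in the form shown in the examples above. The helper name build_command and the swept segregation rates are illustrative assumptions; only the -r, -s, and -f switches documented above are used.

```python
def build_command(input_file, selection, rec=None, seg=None, fidelity=None):
    """Return the argument list for one pymut_rec.py run (hypothetical helper)."""
    cmd = ["python", "pymut_rec.py", input_file, str(selection)]
    # Append only the switches that were explicitly requested.
    for flag, value in (("-r", rec), ("-s", seg), ("-f", fidelity)):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


# One command per segregation rate; the rates are placeholders.
commands = [build_command("pymut_1", 15.0, seg=rate, fidelity=0.999)
            for rate in (0.0, 0.1, 0.5)]

for cmd in commands:
    print(" ".join(cmd))
```

Each printed line can be pasted into a job script, or the argument lists can be passed to subprocess.run to launch the runs directly.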

ANALYZING OUTPUT

The main analysis program is "adjfdel_cluster_rec.py". It analyzes the population dumps, going through every genotype and documenting the effect of every possible single mutation. In this incarnation, it also documents the effect of recombination. The upshot is that it provides insight into how robustness in the population has evolved.

The program "adjfdel_masher.py" combines the output from the individual runs into one main csv file.
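
The general idea of combining per-run output can be sketched as follows. This is a generic illustration, not the code in adjfdel_masher.py: it assumes every per-run file is a CSV with the same header row, and the function name merge_csv is hypothetical.

```python
import csv


def merge_csv(paths, out_path):
    """Concatenate CSV files that share a header into one file.

    Returns the number of data rows written. Assumes `out_path` is not
    one of the input paths.
    """
    header = None
    rows_written = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in sorted(paths):
            with open(path, newline="") as handle:
                reader = csv.reader(handle)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    # The first non-empty file supplies the shared header.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError("header mismatch in " + path)
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

A call such as merge_csv(glob.glob("runs/*.csv"), "combined.csv") would merge every file in a hypothetical runs directory, given an import of the standard glob module.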

The program "calc_harmonic_evolved.py" was used to calculate the harmonic mean of mutation selection values. 
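
For reference, the harmonic mean of n positive values is n divided by the sum of their reciprocals. The snippet below is a generic sketch of that calculation, not the code in calc_harmonic_evolved.py; how that program reads and groups its input is not described here.

```python
def harmonic_mean(values):
    """Return n / sum(1 / x) for a collection of positive numbers."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean requires positive values")
    return len(values) / sum(1.0 / v for v in values)
```

The harmonic mean is dominated by the smallest values, so it never exceeds the arithmetic mean of the same numbers. Python's standard library offers statistics.harmonic_mean for the same calculation.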

The program "fit_dis_R.py" creates a csv file of the fitnesses of individuals across different mutation rates so they can be plotted using R (Figures 1B and 4B). The files ml2_fit_dis.csv and mlhrd_fit_dis.csv were created using this program.

The file l2_fit_time.csv was used to create Figure 5.