The Founder Effect Simulation

The founder effect is an evolutionary phenomenon that occurs when a small group of individuals becomes isolated from a larger population and establishes a new population. This new population can have a significantly different genetic makeup compared to the original population due to the limited genetic variation carried by the founding members. The significant features of the founder effect include:

Let's consider a hypothetical example to illustrate the founder effect.

Original Population

Imagine a population of 1,000 birds in which a particular gene (let's call it gene A) has two alleles, A1 and A2, with the following distribution.

Founding Population

A small group of 10 birds becomes isolated on an island. The proportions of the A1 and A2 alleles in this small group are determined largely by chance. The one factor that constrains the outcome is the allele frequency in the original population: if it is extreme (for example, A1 = 0.95 and A2 = 0.05, or vice versa), the founders will most likely carry mostly the common allele. Let's say, for example, this founding population has the following distribution of the alleles by chance:

Impact of the Founder Effect

The allele frequencies in the founding population differ significantly from those in the original population. This disparity can lead to several consequences:

The following simulation tracks allele frequencies across generations, illustrating the founder effect in action. Note that the founder allele frequency in this code is drawn at random, so each run can follow a different trajectory in the subsequent generations. Adjust the simulation parameters (population size, allele frequencies, and number of generations) to see how they affect the outcome.
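To get a feel for how much the founder allele frequency can vary by chance alone, here is a minimal Python sketch that draws many independent founding groups of 10 from a population of 1,000 and reports the spread of the resulting A1 frequencies. The function name founder_a1_frequency, the starting frequency of 0.5, and the simplification that each bird carries a single allele are assumptions made for illustration; they are not taken from the original simulation.

```python
import random


def founder_a1_frequency(rng, original_freq_a1=0.5,
                         original_pop_size=1000, founder_pop_size=10):
    """Draw one founding group and return its A1 frequency.

    Assumption: each individual carries exactly one allele (A1 or A2).
    """
    # Build the original population with the requested share of A1.
    n_a1 = round(original_freq_a1 * original_pop_size)
    original = ["A1"] * n_a1 + ["A2"] * (original_pop_size - n_a1)

    # The founding event: a small random subset, drawn without replacement.
    founders = rng.sample(original, founder_pop_size)
    return founders.count("A1") / founder_pop_size


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    draws = [founder_a1_frequency(rng) for _ in range(1000)]
    print("lowest founder A1 frequency: ", min(draws))
    print("highest founder A1 frequency:", max(draws))
    print("mean founder A1 frequency:   ", sum(draws) / len(draws))
```

Rerunning with an extreme value such as original_freq_a1=0.95 should show the draws clustering near the original frequency, while a smaller founder_pop_size should widen the spread.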

Algorithm for Simulation

The function simulateFounderEffect models the founder effect by simulating allele frequency dynamics over multiple generations. It first generates an original population of a specified size (original_pop_size), in which the two alleles (A1 and A2) are assigned according to a given probability (freq_A1). A subset of individuals is then randomly sampled from this population to form a small founding population (founder_pop_size), and the allele frequencies within the founders are computed. The function then iterates over a predefined number of generations (generations). In each generation, the population is resampled with replacement, which mimics genetic drift, and the allele frequencies are recalculated. If an allele is missing from the sample, its frequency is recorded as zero.

The frequencies of both alleles are stored in a data frame, with one row per generation. This data frame is returned as the output, providing a record of the allele frequency shifts driven by genetic drift over time. The complete code for this simulation is available on my GitHub repository.
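The steps described above can be sketched in Python as follows. This is a minimal illustration written for this article, not the code from the repository. It assumes that each individual carries a single allele, that the founders' frequencies are stored as generation 0, and that a plain list of dictionaries stands in for the data frame. The snake_case names and the seed parameter are likewise assumptions.

```python
import random
from collections import Counter


def simulate_founder_effect(original_pop_size=1000, founder_pop_size=10,
                            freq_a1=0.5, generations=20, seed=None):
    """Return one row per generation with the A1 and A2 frequencies."""
    rng = random.Random(seed)

    # Original population: each individual carries A1 with probability
    # freq_a1 and A2 otherwise (assumption: one allele per individual).
    original = ["A1" if rng.random() < freq_a1 else "A2"
                for _ in range(original_pop_size)]

    # Founding event: a small random subset, drawn without replacement.
    population = rng.sample(original, founder_pop_size)

    def frequencies(pop):
        counts = Counter(pop)  # a missing allele is counted as 0
        return counts["A1"] / len(pop), counts["A2"] / len(pop)

    # Generation 0 holds the founders' frequencies (an assumption here).
    f1, f2 = frequencies(population)
    rows = [{"generation": 0, "A1": f1, "A2": f2}]

    for gen in range(1, generations + 1):
        # Genetic drift: the next generation is resampled with replacement.
        population = rng.choices(population, k=len(population))
        f1, f2 = frequencies(population)
        rows.append({"generation": gen, "A1": f1, "A2": f2})
    return rows


if __name__ == "__main__":
    for row in simulate_founder_effect(seed=1)[:5]:
        print(row)
```

Because the population stays at founder_pop_size in every generation, one allele often drifts to a frequency of 1 and stays there. Once that happens the other allele cannot reappear, since the sketch includes no mutation or migration.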