Genomic segment sharing

Theoretical arguments suggest that in founder populations, even unrelated individuals share a number of genomic segments that are nearly identical over megabases. Such segments are called identical-by-descent (IBD). The recent sharp decline in genotyping costs has revealed that shared IBD segments are indeed abundant in many populations worldwide, including Ashkenazi Jews. When such segments exist, they are invaluable: from the number of shared segments, we can infer the population’s history, detect the effect of natural selection, or map genes associated with a disease. Likewise, using the sequence of one individual, we can infer, with high accuracy, part of the sequence of another individual, even if that individual was only sparsely genotyped.

Theoretical modeling of the expected levels of IBD sharing in populations is crucial for many applications and for study design. In a number of papers, we have developed the relevant theory, covering the following aspects of IBD sharing:

  • The number of segments and the amount of genetic material covered by segments under models of population evolution. Specifically, we studied the coalescent with recombination and its approximations, including a novel “renewal” approximation [1,2].
  • The average amount and the genomic span of segments shared between a single individual and a reference panel. Those quantities are important for effective design of sequencing studies [3].
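As a rough illustration of the first quantity, the sketch below computes the expected fraction of the genome covered by segments longer than a threshold under a deliberately simplified model. It assumes a constant population in which two haplotypes coalesce at rate 1/N per generation, and that the segment around a site whose common ancestor lived t generations ago has a length (in Morgans) equal to the sum of two independent exponentials with rate 2t. These are assumptions made for this example only; it is not the coalescent with recombination or the renewal approximation of [1,2], and all function names and parameter values are hypothetical.

```python
import math
import random

def prob_segment_longer(t, m):
    """P(segment around a site exceeds m Morgans | ancestor t generations ago).

    Assumed model: length ~ Gamma(shape=2, rate=2t), whose survival
    function is (1 + 2tm) * exp(-2tm).
    """
    x = 2.0 * t * m
    return (1.0 + x) * math.exp(-x)

def expected_fraction_shared(n_hap, m):
    """Expected genome fraction in segments longer than m Morgans.

    Averages prob_segment_longer over t ~ Exponential(mean n_hap);
    the integral has the closed form (1 + 4mN) / (1 + 2mN)^2.
    """
    x = 2.0 * m * n_hap
    return (1.0 + 2.0 * x) / (1.0 + x) ** 2

def simulate_fraction(n_hap, m, n_sites=200_000, seed=1):
    """Monte Carlo check of the closed form, one random site at a time."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sites):
        t = rng.expovariate(1.0 / n_hap)  # time to common ancestor
        # Unit-rate Erlang-2 draw; the segment length is e / (2t) Morgans,
        # so "length > m" is equivalent to "e > 2tm".
        e = rng.expovariate(1.0) + rng.expovariate(1.0)
        hits += e > 2.0 * t * m
    return hits / n_sites

if __name__ == "__main__":
    n_hap, m = 1000, 0.01  # hypothetical values: 1000 haplotypes, 1 cM threshold
    print(expected_fraction_shared(n_hap, m), simulate_fraction(n_hap, m))
```

With these hypothetical values the closed form evaluates to 41/441, or roughly 9.3% of the genome, and the simulated fraction should agree up to sampling noise.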

In parallel, we applied IBD analysis to reconstruct the history of Ashkenazi Jews and Druze [3,4], and developed a tool for fast extraction of IBD segments from simulated genomes [5].

Current and future work focuses on developing improved methods for demographic inference (i.e., reconstruction of the population history) based on sharing between multiple individuals, building more realistic models of IBD segment detection in practice, accounting for sex imbalance, and extending the methods to ancient genomes. Other projects include efficient algorithms for inferring missing variants (“imputation”) based on segment sharing, with specific applications to extremely low-coverage sequencing and to reproductive genetics. The methods we develop can be applied to any founder population, and in particular to Ashkenazi Jewish genetics.

[1] Carmi et al., Genetics, 2013
[2] Carmi et al., Theoretical Population Biology, 2014
[3] Carmi et al., Nature Communications, 2014
[4] Zidan et al., European Journal of Human Genetics, 2014
[5] Yang et al., RECOMB 2015

An illustration of a hypothetical history leading to sharing of a genomic segment. The ancestral chromosome (red) is broken by recombination at each generation. However, if the common ancestor of A and B lived recently, the shared segment will remain long and easily detectable.
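The intuition in this caption can be made quantitative with a back-of-the-envelope sketch. Assume that A and B are separated by 2g meioses (g generations back to the common ancestor along each lineage) and that crossovers fall as a Poisson process at a rate of one per Morgan per meiosis. The distance from a shared site to the nearest breakpoint on each side is then exponential with rate 2g, so the expected segment length is 1/g Morgans, or 100/g cM. The function names are hypothetical, and the model ignores crossover interference and detection error.

```python
import random

def expected_length_cm(g):
    """Expected length (cM) of the segment around a shared site, ancestor g generations ago."""
    return 100.0 / g

def simulate_mean_length_cm(g, n_rep=100_000, seed=7):
    """Average simulated segment length (cM) under the assumed Poisson crossover model."""
    rng = random.Random(seed)
    total_morgans = 0.0
    for _ in range(n_rep):
        # Distance to the nearest breakpoint on the left and on the right,
        # each accumulated over the 2g meioses separating A and B.
        total_morgans += rng.expovariate(2.0 * g) + rng.expovariate(2.0 * g)
    return 100.0 * total_morgans / n_rep

if __name__ == "__main__":
    for g in (5, 20, 100):  # hypothetical ancestor ages, in generations
        print(g, expected_length_cm(g), round(simulate_mean_length_cm(g), 2))
```

Under these assumptions, a common ancestor 5 generations back leaves a segment of about 20 cM on average, while one 100 generations back leaves about 1 cM, which is why recent ancestry yields long, easily detectable segments.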

The simulated distribution of the number of shared segments (longer than 1 cM) in a chromosome of length 200 cM, for three values of the effective population size. Theoretical predictions were computed using renewal processes. From Carmi et al., Theor. Popul. Biol., 2014.