Questions and answers on our paper: Screening human embryos for polygenic traits has limited utility

Our paper on the utility of screening embryos for polygenic traits has been published in Cell. The questions and answers below address some of the key points and caveats.

Since when did it become possible to screen embryos for complex traits?

Testing embryos for an aneuploidy (a loss or a duplication of an entire chromosome) or for the genotype of a single mutation (in the context of severe Mendelian disorders) has been possible for over two decades, and is known as preimplantation genetic diagnosis (PGD). Testing embryos for complex traits (e.g., height, weight, blood pressure) became possible due to two recent developments.

It is now relatively easy and affordable to generate accurate genome-wide data from single cells of IVF embryos (e.g., our recent paper or other papers such as (1, 2, 3, 4, 5)).
The genome sequence of an individual can be used to compute a “polygenic score”, and these scores were shown to predict, with increasing accuracy, traits such as height or intelligence.

Thus, prospective parents who are interested in “improving” the height or IQ of their future children can generate embryos by IVF, genotype them, and implant only the top-scoring embryo. Companies are already starting to offer related tests (1, 2; see also below). We call this procedure “embryo screening for polygenic traits”.

What is the problem?

There are multiple ethical and societal issues with screening embryos for complex traits. Many people are concerned about the return of eugenics, a dark period in early 20th-century science that eventually led to the forced sterilization and mass murder of people with undesired traits. The prospect of “designer babies” raises concerns about unequal opportunities, stigmatization, and changes to the fundamental meaning of parenthood. All these are and should be subject to debate among all stakeholders.

What is crucially missing from these debates is a thorough, evidence-based evaluation of the effectiveness of the procedure. An evolving blog provides a modeling effort. But no empirical data existed on the expected outcomes of embryo screening.

What was the goal of our research?

Our goal was to evaluate the expected outcomes, in terms of the increase in trait value, when screening embryos for continuous traits such as height or IQ.

How did we perform the study?

We obviously cannot experiment with actual embryos, nor can we wait years in order to measure their adult phenotypes. Instead, we used theory and data-driven simulations. We used currently available polygenic scores for height and IQ and assumed that five embryos are available per couple (this is the typical number per IVF cycle). We simulated the genomes of embryos based on real genomic data from families and unrelated individuals. In parallel, we derived a theoretical statistical model. Our purpose was to compute the gain, or the increase in trait value compared to the family average, when selecting the embryo top-scoring for a certain trait.
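To make the notion of "gain" concrete, here is a minimal simulation sketch. It is not the model from the paper: it assumes that each embryo's score deviates from the family average by an independent normal draw whose variance is half the population variance explained by the score (r2 * trait_sd**2 / 2). The function name and the parameter values (r2 = 0.25, a 6cm standard deviation) are illustrative assumptions only.

```python
import random
import statistics


def simulate_mean_gain(n_embryos, r2, trait_sd, n_families=20000, seed=0):
    """Monte Carlo estimate of the mean gain from implanting the top-scoring embryo.

    Assumption (ours, for illustration): each embryo's score deviates from the
    family average by an independent normal draw with variance r2 * trait_sd**2 / 2.
    The gain is the selected embryo's score deviation from the family average.
    """
    rng = random.Random(seed)
    within_family_sd = (r2 / 2) ** 0.5 * trait_sd
    gains = []
    for _ in range(n_families):
        scores = [rng.gauss(0.0, within_family_sd) for _ in range(n_embryos)]
        gains.append(max(scores))  # keep only the top-scoring embryo
    return statistics.fmean(gains)


if __name__ == "__main__":
    # Hypothetical inputs: five embryos, a score explaining 25% of the
    # trait variance, and a trait standard deviation of 6 (e.g., cm).
    print(round(simulate_mean_gain(5, 0.25, 6.0), 2))
```

With a single embryo there is nothing to select from, so the simulated gain is close to zero; it grows as embryos are added and as the score becomes more accurate.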

What were the main findings?

We found that by selecting the embryo top-scoring for height, parents can increase the height of their child by an average of ~2.5cm. If selecting for intelligence, the increase is about 2.5 IQ points. The numerical similarity is coincidental; the gain in height is much larger in relative terms, as it is 40% of the (per sex) standard deviation in the population (2.5cm/6cm), compared to just 15% for IQ (2.5/15). This difference exists because height was studied in many more subjects and has a more accurate genetic score.

Can these gains be improved?

We found that generating more embryos will increase the gain but will quickly have diminishing returns. For example, the increase in height or any other trait will not double even if it were miraculously possible to generate hundreds of embryos (which is currently science fiction).
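The diminishing returns can be illustrated with a short sketch, under the simplifying assumption (ours, not necessarily the paper's) that embryo scores within a family are independent normal draws. In that case the gain is proportional to the expected maximum of n standard normal values, which the hypothetical helper below computes by numerical integration.

```python
import math


def expected_max_standard_normal(n, lo=-8.0, hi=8.0, steps=16000):
    """E[max of n independent standard normal draws].

    Computed by trapezoidal integration of x * n * pdf(x) * cdf(x)**(n - 1),
    the density of the maximum multiplied by x.
    """
    def integrand(x):
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        return x * n * pdf * cdf ** (n - 1)

    h = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    return total * h


if __name__ == "__main__":
    # Gain in units of the within-family standard deviation of the score.
    for n in (1, 2, 5, 10, 20, 40):
        print(n, round(expected_max_standard_normal(n), 3))
```

The printed values are in units of the within-family standard deviation of the score. Each additional embryo adds less than the previous one.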

Polygenic scores are expected to improve, due to the ever increasing size of genetic studies, and this can substantially increase the gain. However, the accuracy of the scores cannot increase indefinitely, because it is limited by the “heritability” of the trait, i.e., by the contribution of genetics to the determination of the trait. Current polygenic scores are further limited because they are only based on common genetic markers.

Is the gain guaranteed?

The fact that height increases by 2-3cm on average does not guarantee that the child will indeed be 2-3cm taller than a naturally conceived child would have been. Three factors add noise:

Children inherit a random mixture of the two chromosomes of each parent. Thus, the scores of a given set of embryos must vary to some degree due to chance alone.
There are genetic factors that are not modeled in the polygenic score. For most traits, the majority of these factors are unknown. They can increase or decrease the trait beyond what is expected by the score.
Genetics alone does not fully determine the trait. Environmental factors have an important influence on virtually all polygenic traits.

Thus, an embryo selected for its top score for (say) height may end up taller or shorter than expected based on the average gain cited above. When we analyzed data from large nuclear families (an average of 10 children per family), we found that in only a quarter of the families was the child with the top score for height the tallest as an adult.
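The effect of this noise can be explored with a toy simulation. It is a sketch under stated assumptions, not a reanalysis of the family data: within a family, each child's trait is modeled as a score plus an independent residual, and frac_explained (a hypothetical parameter) is the fraction of the within-family trait variance captured by the score.

```python
import random


def prob_top_score_is_top_trait(n_children, frac_explained, n_families=20000, seed=0):
    """Estimate how often the top-scoring sibling also has the highest trait value.

    Assumption (ours, for illustration): within a family, trait = score + residual,
    with independent normal terms and total within-family variance of 1.
    """
    rng = random.Random(seed)
    score_sd = frac_explained ** 0.5
    resid_sd = (1.0 - frac_explained) ** 0.5
    hits = 0
    for _ in range(n_families):
        scores = [rng.gauss(0.0, score_sd) for _ in range(n_children)]
        traits = [s + rng.gauss(0.0, resid_sd) for s in scores]
        top_by_score = max(range(n_children), key=scores.__getitem__)
        top_by_trait = max(range(n_children), key=traits.__getitem__)
        hits += top_by_score == top_by_trait
    return hits / n_families


if __name__ == "__main__":
    # Hypothetical values of frac_explained, for families of 10 children.
    for frac in (0.0, 0.1, 0.3, 1.0):
        print(frac, prob_top_score_is_top_trait(10, frac))
```

When the score explains nothing, the top-scoring child is the tallest only by chance (1 in 10 for ten children). The probability reaches 1 only if the score explains all of the within-family variation.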

What else can limit the utility of embryo screening?

A number of additional factors will further reduce the practical gain and appeal of embryo screening. We did not quantitatively model most of them, which is a limitation of our study. But these factors are important to mention.

The number of viable embryos generated by IVF decreases with maternal age, and the gain from embryo screening will decrease very fast once the number of embryos goes under five. Further, not all embryos that seem viable after growing for a few days in the lab will eventually lead to a live birth.
IVF is an invasive procedure, which causes substantial discomfort and some health risks to the prospective mother. It is also very expensive.
As people tend to marry others with similar traits (“assortative mating”), the two chromosomes of each parent may be more genetically similar to each other than expected by chance. If ancestors are all equal in their “genetically-determined” (say) intelligence, embryos will be similar as well, and the gain smaller.
Scores are less predictive within families than originally reported when evaluated across unrelated individuals. This is because polygenic scores owe some of their success to simply predicting the ancestry of the individual, or predicting the genomes (and hence the behavior) of the parents and thus the environment they induce. Within a family, all siblings have the same ancestry and the same parents, and hence the scores are less accurate for the purpose of distinguishing between siblings (the embryos).
The scores used to predict height or IQ work considerably worse in non-European populations. The accuracy of polygenic scores also varies across age, sex, and socio-economic status.
It is plausible that prospective parents will select the embryo for implantation based on more than one trait. After all, once embryos have been generated and genotyped, screening for additional traits comes at no cost. However, when selecting for multiple traits, the gain for each individual trait decreases substantially. This is because an embryo scoring high for one trait may not necessarily score high for another.
A related problem is that when selecting for one desired trait, we may be unknowingly selecting for undesired traits or health risks. For example, a tendency towards greater IQ is genetically correlated with risks for anorexia and autism.
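The drop in per-trait gain under multi-trait selection can be sketched with a small simulation. The assumptions are ours: scores for different traits are uncorrelated, standardized within the family, and the selected embryo is the one with the highest equal-weight sum of scores. Genetic correlations between traits, as in the point above, would change these numbers. The function name is hypothetical.

```python
import random
import statistics


def per_trait_gain(n_embryos, n_traits, n_families=20000, seed=0):
    """Mean gain in the first trait's score, in within-family SD units.

    Assumption (ours, for illustration): scores for the n_traits traits are
    uncorrelated standard normal draws, and the implanted embryo is the one
    with the highest equal-weight sum of scores.
    """
    rng = random.Random(seed)
    gains = []
    for _ in range(n_families):
        embryos = [
            [rng.gauss(0.0, 1.0) for _ in range(n_traits)]
            for _ in range(n_embryos)
        ]
        best = max(embryos, key=sum)  # select on the combined index
        gains.append(best[0])         # record the gain for one trait only
    return statistics.fmean(gains)


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, round(per_trait_gain(5, k), 2))
```

Under these assumptions, the gain for any one trait shrinks roughly in proportion to one over the square root of the number of traits selected for.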

What does it all mean?

Our results suggest that currently, the average gain from screening embryos for complex traits is rather small and limited by several practical factors. However, even these small gains may be of interest to some prospective couples, in particular for height, for which the gain is larger in relative terms. Also, the gain for IQ is likely to substantially increase in the near future due to increasing study sizes. As the technology for embryo screening is already available, and embryo screening is legal at least in the United States, there is an urgent need for the public and policy makers to discuss the ethics, societal implications, and regulation of this procedure.

Our goal in this work was to support the debate by providing an empirical and principled evaluation of the expected outcomes of embryo screening for traits. We hope that our results will contribute to an informed and balanced discussion. At the very least, our results could be used to regulate what may be advertised to consumers, and to allow consumers to critically evaluate what may be offered.

Is this related to the company Genomic Prediction?

Some readers may have noticed recent media coverage of a company called Genomic Prediction (recent articles appeared, for example, in the Economist and the MIT Technology Review). The company offers genetic testing of embryos, including the calculation of polygenic risk scores for a panel of diseases. We are not affiliated or associated with Genomic Prediction. Further, our study has evaluated selection for continuous (quantitative) traits (e.g., height, IQ), whereas Genomic Prediction is concerned with binary, disease traits. Our study did not evaluate the gains for disease risk reduction and thus does not apply to the test offered by Genomic Prediction.

Moreover, it would be irresponsible to draw any conclusions based on our study regarding the possible utility of embryo screening for disease risk reduction. This is because (1) there could be multiple ways to select the embryo to be implanted, and the procedure must be precisely defined; (2) a metric of risk reduction must similarly be properly defined; and (3) disease risk depends on the score in a non-linear (and thus more complex) manner, as opposed to the linear dependence observed for continuous traits. We are currently working on quantifying the utility of embryo screening for reducing disease risk, and we expect to share our results in the near future.

Shai Carmi's lab @ The Hebrew University

Statistical and population genetics