Our paper on the utility of screening embryos for polygenic traits has been published in Cell. The questions and answers below address some of the key points and caveats.
Since when did it become possible to screen embryos for complex traits?
Testing embryos for an aneuploidy (a loss or a duplication of an entire chromosome) or for the genotype of a single mutation (in the context of severe Mendelian disorders) has been possible for over 25 years, and is known as preimplantation genetic diagnosis (PGD). Genetically testing embryos for complex traits (e.g., height, weight, blood pressure) became possible due to two recent developments.
- It is now relatively easy and affordable to generate accurate genome-wide data from single cells of IVF embryos (e.g., our recent paper or other papers such as (1, 2, 3, 4, 5)).
- The genome sequence of an individual can be used to compute a “polygenic score”, and these scores were shown to predict, with increasing accuracy, traits such as height or intelligence.
As a consequence of these developments, prospective parents who are interested in “improving” the height or IQ of their future children can generate embryos by IVF, genotype them, and implant only the top-scoring embryo. Companies are already starting to offer related tests (1, 2; see also below). We call this procedure “embryo screening for polygenic traits”.
What is the problem?
There are multiple ethical and societal issues with screening embryos for complex traits. Many people are concerned about the return of eugenics, a dark cloud over early 20th-century science that led to the forced sterilization and mass murder of people with undesired traits. The prospect of “designer babies” raises concerns about unequal opportunities, stigmatization, and changes to the fundamental meaning of parenthood. All of these are, and should be, subject to debate among all stakeholders.
What is crucially missing from these debates is a thorough, evidence-based evaluation of the effectiveness of the procedure. An evolving blog provides a modeling effort, but no empirical data existed on the expected outcomes of embryo screening.
What was the goal of your research?
Our goal was to evaluate the expected outcomes, in terms of the increase in trait value, when screening embryos for continuous traits such as height and IQ.
How did you perform the study?
We obviously cannot experiment with actual embryos, nor can we wait years in order to measure their phenotypes. Instead, we used theory and data-driven simulations. We used currently available polygenic scores for height and IQ and assumed that five embryos are available per couple (this is the typical number per IVF cycle). We simulated the genomes of embryos based on real genomic data from families and unrelated individuals. In parallel, we derived a theoretical statistical model. Our purpose was to compute the gain, i.e., the increase in trait value compared to the family average, when selecting the top-scoring embryo for a given trait.
What were the main findings?
We found that by selecting the top-scoring embryo for height, parents can increase the height of their child by an average of 2.5 cm. When selecting for intelligence, the increase is around 2.5 IQ points. The numerical similarity is coincidental; the gain in height is much larger in relative terms, as it is about 40% of the (per-sex) standard deviation in the population (~2.5 cm / 6 cm), compared to just 15% for IQ (2.5/15). This difference exists because height was studied in many more subjects and therefore has a more accurate polygenic score.
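The comparison in relative terms is simple arithmetic, sketched below. The inputs are the rounded figures quoted above (a 2.5 cm gain against a per-sex standard deviation of about 6 cm, and 2.5 IQ points against a standard deviation of 15), so the resulting ratios are approximate; the function name is ours and purely illustrative.

```python
def gain_in_sd_units(gain, population_sd):
    """Express an absolute gain as a fraction of the population standard deviation."""
    return gain / population_sd

# Rounded figures quoted in the text (approximate inputs, not exact study results).
height_gain = gain_in_sd_units(2.5, 6.0)   # cm gained / per-sex SD of height in cm
iq_gain = gain_in_sd_units(2.5, 15.0)      # IQ points gained / SD of IQ

print(f"height: {height_gain:.2f} SD, IQ: {iq_gain:.2f} SD")
```

With these round inputs, the height gain is more than twice as large as the IQ gain when both are measured in standard deviations.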
Can these gains be improved?
We found that generating more embryos will increase the gain but will quickly have diminishing returns. The increase in height or any other trait will not even double, even if it were miraculously possible to generate hundreds of embryos (this is currently science fiction).
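The shape of the diminishing returns can be illustrated with a toy model. The sketch below assumes, as a simplification that is ours and not a description of the study's full model, that embryo scores within a family are independent draws from a normal distribution around the family average, so that the gain from picking the top embryo is proportional to the expected maximum of n standard normal draws. It computes that expectation by numerical integration using only the standard library.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_max(n, lo=-8.0, hi=8.0, steps=8000):
    """Expected maximum of n independent standard normal draws.

    Integrates x * n * pdf(x) * cdf(x)**(n - 1), the density of the maximum,
    with the trapezoid rule over [lo, hi].
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * n * normal_pdf(x) * normal_cdf(x) ** (n - 1)
    return total * h

# Gain in units of the within-family score SD (toy model, assumed normal scores).
for n in (1, 2, 5, 10, 20):
    print(n, round(expected_max(n), 3))
```

In this toy model each additional embryo adds less than the previous one: going from 5 to 10 embryos adds far less than going from 1 to 5.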
Polygenic scores are expected to improve, due to the ever-increasing size of genetic studies. This can substantially increase the gain. However, the accuracy of the scores cannot increase indefinitely, because it is limited by the “heritability” of the trait, i.e., by the contribution of genetics to the determination of the trait. Current polygenic scores are further limited because they are based only on common genetic markers.
Is the gain guaranteed?
The fact that height improves by 2-3 cm on average does not guarantee that the child will indeed be 2-3 cm taller than a child conceived naturally would have been. Three factors add noise:
- Children inherit a random mixture of the two chromosomes of each parent. The scores of each given set of embryos will thus vary to some degree due to chance.
- There are genetic factors that are not modeled in the polygenic score. For most traits, the majority of these factors are unknown. They can increase or decrease the trait beyond what is expected by the score.
- Genetics alone does not fully determine the trait. Environmental factors have an important influence, in particular for intelligence.
Thus, an embryo selected for its top score for (say) height may end up taller or shorter than expected based on the average gains cited above. When we analyzed data from large nuclear families (an average of 10 children per family), we found that in only a quarter of the families was the child with the top score for height also the tallest as an adult.
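The effect of this noise can be illustrated with a small simulation. This is a minimal sketch under assumptions of our own, not the analysis performed on the real families: each child's trait is modeled as a score plus an independent residual (standing in for unmodeled genetics and environment), and the values of `score_sd` and `residual_sd` are hypothetical, chosen only for illustration. The function reports how often the top-scoring child is also the child with the highest trait value, along with the average gain relative to the family mean.

```python
import random

def simulate_families(n_children=10, score_sd=1.0, residual_sd=2.0,
                      n_families=20000, seed=0):
    """Toy within-family simulation: trait = score + independent residual noise.

    Returns (fraction of families where the top-scoring child also has the
    highest trait, average trait gain of the top-scoring child over the
    family mean). All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    hits = 0
    total_gain = 0.0
    for _ in range(n_families):
        scores = [rng.gauss(0.0, score_sd) for _ in range(n_children)]
        traits = [s + rng.gauss(0.0, residual_sd) for s in scores]
        top = max(range(n_children), key=lambda i: scores[i])
        if traits[top] == max(traits):
            hits += 1
        total_gain += traits[top] - sum(traits) / n_children
    return hits / n_families, total_gain / n_families

fraction_top_is_best, average_gain = simulate_families()
print(fraction_top_is_best, average_gain)
```

Even in this simplified setting, the average gain is positive while the top-scoring child is the one with the highest trait value in only a minority of families; setting `residual_sd=0.0` removes the noise and makes the two always coincide.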
What else can limit the utility of embryo screening?
A number of additional factors will further reduce the practical gain and appeal of embryo screening. We did not quantitatively model most of them, which is a limitation of our study, but these factors are important to mention.
- The two chromosomes of each parent may be more similar to each other than expected by chance (“assortative mating”). In other words, if ancestors are all equal in their “genetically-determined” intelligence, embryos will be similar as well, and the gain smaller.
- The number of viable embryos generated by IVF decreases sharply with maternal age. The gain from embryo screening will decrease very fast once the number of embryos goes under five. Further, not all embryos that seem viable after growing for a few days in the lab will eventually lead to a live birth.
- IVF is an invasive procedure, which causes substantial discomfort and some health risks to the prospective mother. It is also very expensive.
- Scores are less predictive within families than originally reported when tested across unrelated individuals. This is because polygenic scores owe some of their success to simply predicting the ancestry of the individual, or predicting the genomes (and hence the behavior) of the parents. Within a family, all siblings have the same ancestry and the same parents, and hence the scores are less accurate for the purpose of distinguishing between siblings (the embryos).
- The scores used to predict height or IQ work considerably worse in non-European populations. The accuracy of polygenic scores also varies across age, sex, and socio-economic status.
- It is plausible that prospective parents will select the embryo for implantation based on more than one trait. After all, once embryos have been generated and genotyped, screening for additional traits comes at no cost. However, when selecting for multiple traits, the gain for each individual trait decreases substantially. This is because an embryo scoring high for one trait may not necessarily score high for another.
- A related problem is that when selecting for one desired trait, we may be unknowingly selecting for undesired traits or health risks. For example, a tendency toward higher IQ is genetically correlated with risk for anorexia and autism.
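The point above about selecting for multiple traits can be illustrated with a toy simulation. The sketch assumes, hypothetically, that each embryo has independent, standardized scores for each trait and that parents pick the embryo with the highest sum of scores; real scores can be correlated (positively or negatively), and other selection rules are possible. It compares the average gain in the first trait when selecting on one trait versus two.

```python
import random

def per_trait_gain(n_embryos=5, n_traits=1, n_families=20000, seed=1):
    """Average gain in trait 0 (in score SD units) when the embryo with the
    highest summed score over n_traits independent traits is selected.

    Toy model: scores are independent standard normal draws (an assumption).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_families):
        embryos = [[rng.gauss(0.0, 1.0) for _ in range(n_traits)]
                   for _ in range(n_embryos)]
        best = max(embryos, key=sum)  # embryo with the highest summed score
        total += best[0]              # its score for the first trait
    return total / n_families

single = per_trait_gain(n_traits=1)
double = per_trait_gain(n_traits=2)
print(single, double)
```

Under these assumptions the gain in the first trait is noticeably smaller when a second trait also influences the choice, because the selected embryo is no longer necessarily the best one for either trait alone.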
What does it all mean?
Our results suggest that currently, the average gain from screening embryos for complex traits is rather small and limited by several practical factors. However, even these small gains may be of interest to some prospective couples, in particular for height, for which the gain is larger in relative terms. Also, the gain for IQ is likely to substantially rise in the near future due to increasing study sizes. As the technology for embryo screening is already available, and embryo screening is legal at least in the United States, there is an urgent need for the public and policy makers to discuss the ethics, societal implications, and regulation of this procedure.
Our goal in this work was to support the debate by providing empirical and principled evaluation of the expected outcomes of embryo screening. We hope that our results will contribute to an informed and balanced discussion. At the very least, our results could be used to regulate what may be advertised to consumers and to allow consumers to critically evaluate what may be offered.
Is this related to the company Genomic Prediction?
Some readers may have noticed recent media coverage of a company called Genomic Prediction (recent articles appeared, for example, in the Economist and the MIT Technology Review). The company offers genetic testing of embryos, including the calculation of polygenic risk scores for a panel of diseases. We are not affiliated or associated with Genomic Prediction. Further, our study has evaluated selection for continuous (quantitative) traits (e.g., height, IQ), whereas Genomic Prediction is concerned with binary disease traits. Our study did not evaluate the gains for disease risk reduction and thus does not apply to the test offered by Genomic Prediction.
Moreover, it would be irresponsible to draw any conclusions based on our study regarding the possible utility of embryo screening for disease risk reduction. This is because (1) there could be multiple ways to select the embryo to be implanted, and the procedure must be precisely defined; (2) a metric of risk reduction must similarly be properly defined; and (3) disease risk depends on the score in a non-linear (and thus more complex) fashion, as opposed to the linear dependence observed for continuous traits. We are currently working on quantifying the utility of embryo screening for reducing disease risk, and we expect to share our results in the near future.