
Developing Polygenic Risk Scores with Data from the Mass General Brigham Biobank

Contributor: Pradeep Natarajan, MD, MMSC

In celebration of its 15-year anniversary, the Mass General Brigham Biobank — research infrastructure that offers a bank of blood samples and data to researchers studying the causes of common diseases — is running a series of articles. Topics will highlight innovative research using Biobank samples, data, and more.


Polygenic risk scores (PRSs) are new tools for evaluating someone’s genetic predisposition to developing a disease. Some genetic conditions are caused by a single rare gene variant. Many others, particularly common conditions, result from a combination of many common gene variants. Heart disease, for example, has been associated with at least three hundred different variants. However, not all variants contribute equally to disease. Having one variant may put a person at a 1% increased risk of developing heart disease, while a different variant may put a person at a 5% increased risk. Individually, these variants do not provide meaningful information, but a PRS summarizes in a single number how many risk variants an individual has and how impactful each one is.
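To make the idea of "one number" concrete, here is a minimal Python sketch of a PRS as a weighted sum: each variant's weight is multiplied by the number of copies a person carries, and the results are added up. The variant names, weights, and the helper polygenic_risk_score are invented for illustration and are not taken from Dr. Natarajan's work; in practice, weights are usually derived from GWAS effect estimates.

```python
# Hypothetical per-variant weights, invented for illustration only.
# A larger weight means the variant contributes more to the score.
WEIGHTS = {"variant_a": 0.01, "variant_b": 0.05, "variant_c": 0.02}


def polygenic_risk_score(genotype, weights=WEIGHTS):
    """Return the weighted sum of risk-variant copies.

    genotype maps a variant name to the number of copies carried (0, 1, or 2).
    Variants missing from genotype are treated as 0 copies.
    """
    total = 0.0
    for variant, weight in weights.items():
        copies = genotype.get(variant, 0)
        if copies not in (0, 1, 2):
            raise ValueError(f"{variant}: copies must be 0, 1, or 2")
        total += weight * copies
    return total


# A made-up person: two copies of variant_a, one of variant_b, none of variant_c.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = polygenic_risk_score(person)
print(f"PRS: {score:.2f}")
```

The raw score has no meaning on its own; it becomes useful only when compared with the scores of other people in a reference population.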

Using PRSs, doctors can intervene with high-risk patients before they begin experiencing any symptoms of disease.

Pradeep Natarajan, MD, MMSC

Director, Preventive Cardiology Program

Massachusetts General Hospital

Using PRS for heart disease

Pradeep Natarajan, MD, MMSC, directs the Preventive Cardiology program at Massachusetts General Hospital and uses Biobank samples and data in his research to build PRSs for heart disease. Dr. Natarajan’s research includes both developing PRSs for heart disease and understanding how to bring them into clinical care.

To identify variants associated with a health condition and their level of impact, researchers first perform a genome-wide association study (GWAS). In a GWAS, researchers identify genetic variants that are observed more often among people with a health condition than among those without it. GWASs require large data sets for accurate results. For this reason, the Biobank’s genetic data from 65,000 participants has been a valuable tool for Dr. Natarajan’s work.
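The comparison at the heart of a GWAS can be sketched for a single variant as an odds ratio: how much more common the variant is among people with the condition than among those without it. The counts and the helper allele_odds_ratio below are invented for illustration. A real GWAS is considerably more involved, typically repeating a test like this across a very large number of variants with statistical adjustments.

```python
import math


def allele_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds ratio for carrying the risk allele, cases versus controls.

    Each argument is an allele count; all counts must be positive.
    """
    counts = (case_risk, case_other, control_risk, control_other)
    if min(counts) <= 0:
        raise ValueError("all allele counts must be positive")
    odds_in_cases = case_risk / case_other
    odds_in_controls = control_risk / control_other
    return odds_in_cases / odds_in_controls


# Invented counts: the risk allele is seen 300 times in cases (700 other
# alleles) and 200 times in controls (800 other alleles).
odds_ratio = allele_odds_ratio(300, 700, 200, 800)

# The natural log of the odds ratio is a common choice of per-variant weight.
log_odds_ratio = math.log(odds_ratio)
print(f"odds ratio: {odds_ratio:.3f}, log odds ratio: {log_odds_ratio:.3f}")
```

An odds ratio above 1 means the variant is more common among people with the condition; the further above 1, the larger the weight it would receive in a score.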

Next, researchers construct a PRS by identifying the variants more commonly observed in those with a health condition and assigning each variant a weight based on how much more common it is in that group. By plotting the distribution of PRSs across a population, physicians can identify the patients at highest risk for developing a health condition compared with everyone else. Unlike environmental risk factors for disease, a person’s inherited genes stay the same throughout life, starting at birth. This offers the chance to identify high-risk individuals early in life and lower their risk through healthy behaviors.
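Finding the patients at the top of the distribution can be sketched as a percentile ranking. The example below uses ten made-up scores and a hypothetical cutoff at the 90th percentile; the function names and the cutoff are assumptions for illustration, not a clinical threshold.

```python
def percentile_rank(score, population_scores):
    """Percentage of population scores that are at or below the given score."""
    at_or_below = sum(1 for s in population_scores if s <= score)
    return 100.0 * at_or_below / len(population_scores)


def flag_high_risk(scores_by_person, cutoff_percentile=90.0):
    """Return the sorted IDs of people whose score ranks above the cutoff."""
    population = list(scores_by_person.values())
    return sorted(
        person
        for person, score in scores_by_person.items()
        if percentile_rank(score, population) > cutoff_percentile
    )


# Ten invented people with scores 0.01, 0.02, ..., 0.10.
scores = {f"p{i}": i / 100 for i in range(1, 11)}
flagged = flag_high_risk(scores)
print(flagged)
```

With these made-up numbers, only the person with the highest score ranks above the 90th percentile. The choice of cutoff determines how many people are flagged, so it would need to be set with the condition and the available interventions in mind.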

Heart disease is the leading cause of death worldwide, and its risk is often recognized or managed too late, which makes it a particularly good application for preventive tools like PRSs. PRSs also have the potential to help manage risk for other polygenic conditions, such as diabetes, cancers, and psychiatric conditions.

Diversifying training data to improve PRSs for all

One weakness of PRSs is that their ability to generalize depends on the training data used to build them. A PRS created using samples from an entirely European population is likely to yield accurate predictive results for someone of European ancestry. The same score would be less reliable for someone of Asian or African ancestry. For this reason, it is important to perform GWASs and PRS training in diverse cohorts. In fact, greater diversity generally improves PRS performance in all populations.

This year, the Biobank is opening new recruitment sites at community health centers in an effort to increase the diversity of Biobank participants. We want to make sure the Biobank is representative of all the patients who receive their care at Mass General Brigham. This effort will help investigators like Dr. Natarajan produce robust, generalizable tools, such as polygenic risk scores, to improve outcomes for all patients.

Mass General Brigham Biobank celebrates 15-year anniversary and 150k participants

The Biobank started as a simple idea fifteen years ago, to work together to reduce the cost of research while accelerating its pace and to minimize the burden of research participation on our patients.
