TY - JOUR
T1 - Quantifying factors that affect polygenic risk score performance across diverse ancestries and age groups for body mass index
AU - Hui, Daniel
AU - Xiao, Brenda
AU - Dikilitas, Ozan
AU - Freimuth, Robert R.
AU - Irvin, Marguerite R.
AU - Jarvik, Gail P.
AU - Kottyan, Leah
AU - Kullo, Iftikhar
AU - Limdi, Nita A.
AU - Liu, Cong
AU - Luo, Yuan
AU - Namjou, Bahram
AU - Puckelwartz, Megan J.
AU - Schaid, Daniel
AU - Tiwari, Hemant
AU - Wei, Wei Qi
AU - Verma, Shefali
AU - Kim, Dokyoon
AU - Ritchie, Marylyn D.
N1 - Publisher Copyright: © 2022 The Authors.
PY - 2023
Y1 - 2023
N2 - Polygenic risk scores (PRS) have led to enthusiasm for precision medicine. However, it is well documented that PRS do not generalize across groups differing in ancestry or sample characteristics (e.g., age). Quantifying PRS performance across different groups of study participants, using genome-wide association study (GWAS) summary statistics from multiple ancestry groups and sample sizes, and using different linkage disequilibrium (LD) reference panels, may clarify which factors limit PRS transferability. To evaluate these factors in the PRS generation process, we generated body mass index (BMI) PRS (PRS_BMI) in the Electronic Medical Records and Genomics (eMERGE) network (N = 75,661). Analyses were conducted in two ancestry groups (European and African) and three age ranges (adults, teenagers, and children). For PRS_BMI calculations, we evaluated five LD reference panels and three sets of GWAS summary statistics of varying sample size and ancestry. PRS_BMI performance increased for both African and European ancestry individuals when using cross-ancestry GWAS summary statistics compared to European-only summary statistics (6.3% and 3.7% relative increase in R², respectively; p_African = 0.038, p_European = 6.26 × 10⁻⁴). The effects of LD reference panels were more pronounced in African ancestry study datasets. PRS_BMI performance degraded in children; R² was less than half that of teenagers or adults. The effect of GWAS summary statistics sample size was small when modeled with the other factors. Additionally, the potential of using a PRS generated for one trait to predict risk for comorbid diseases is not well understood, especially in the context of cross-ancestry analyses. We explored clinical comorbidities from the electronic health record associated with PRS_BMI and identified significant associations with type 2 diabetes and coronary atherosclerosis. In summary, this study quantifies the effects that ancestry, GWAS summary statistic sample size, and LD reference panel have on PRS performance, especially in cross-ancestry and age-specific analyses.
KW - diversity
KW - polygenic risk scores (PRS)
KW - risk prediction
KW - transferability
UR - http://www.scopus.com/inward/record.url?scp=85144312487&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85144312487&partnerID=8YFLogxK
U2 - 10.1142/9789811270611_0040
DO - 10.1142/9789811270611_0040
M3 - Conference article
C2 - 36540998
AN - SCOPUS:85144312487
SN - 2335-6928
SP - 437
EP - 448
JO - Pacific Symposium on Biocomputing
JF - Pacific Symposium on Biocomputing
IS - 2023
T2 - 28th Pacific Symposium on Biocomputing, PSB 2023
Y2 - 3 January 2023 through 7 January 2023
ER -