Introduction
Depressive disorders represent the highest disease burden globally in terms of years lived with disability (World Health Organization, 2017). Their heterogeneous symptom profiles complicate diagnosis and the development of prevention and intervention strategies (Zimmerman et al., Reference Zimmerman, Ellison, Young, Chelminski and Dalrymple2015). Also, subclinical symptoms are highly common, creating a burden on individuals and society, and these symptoms might develop into clinical diagnoses (Murray et al., Reference Murray, Lin, Austin, McGrath, Hickie and Wray2021; Shah et al., Reference Shah, Scott, McGorry, Cross, Keshavan, Nelson, Wood, Marwaha, Yung, Scott, Öngür, Conus, Henry and Hickie2020). Successfully addressing the phenotypic diversity of depression demands a thorough comprehension of its multifaceted presentations and associated genetic and environmental risk factors.
Family-based studies have consistently shown that major depressive disorder has a heritability estimate ranging from 30 to 40% (Polderman et al., Reference Polderman, Benyamin, de Leeuw, Sullivan, van Bochoven, Visscher and Posthuma2015; Sullivan et al., Reference Sullivan, Neale and Kendler2000). Genome-wide association studies (GWAS) underscore the highly polygenic nature of the disorder, characterized by the involvement of numerous common variants, each contributing small increments of risk, jointly explaining 5–6% of variance in European cohorts (Als et al., Reference Als, Kurki, Grove, Voloudakis, Therrien, Tasanko, Nielsen, Naamanka, Veerapen, Levey, Bendl, Bybjerg-Grauholm, Zeng, Demontis, Rosengren, Athanasiadis, Bækved-Hansen, Qvist, Bragi Walters and Børglum2023; Howard et al., Reference Howard, Adams, Clarke, Hafferty, Gibson, Shirali, Coleman, Hagenaars, Ward, Wigmore, Alloza, Shen, Barbu, Xu, Whalley, Marioni, Porteous, Davies and Deary2019; Levey et al., Reference Levey, Stein, Wendt, Pathak, Zhou, Aslan, Quaden, Harrington, Nuñez, Overstreet, Radhakrishnan, Sanacora, McIntosh, Shi, Shringarpure, Research, Million Veteran, Concato, Polimanti and Gelernter2021; McIntosh & Lewis, Reference McIntosh and Lewis2024; Meng et al., Reference Meng, Navoly, Giannakopoulou, Levey, Koller, Pathak, Koen, Lin, Adams, Rentería, Feng, Gaziano, Stein, Zar, Campbell, van Heel, Trivedi, Finer, McQuillin and BioBank Japan2024; Wray et al., Reference Wray, Ripke, Mattheisen, Trzaskowski, Byrne, Abdellaoui, Adams, Agerbo, Air, Andlauer, Bacanu, Bækvad-Hansen, Beekman, Bigdeli, Binder, Blackwood, Bryois, Buttenschøn and Bybjerg-Grauholm2018). Subsequently, polygenic risk scores have been developed to quantify individual genetic liability (Als et al., Reference Als, Kurki, Grove, Voloudakis, Therrien, Tasanko, Nielsen, Naamanka, Veerapen, Levey, Bendl, Bybjerg-Grauholm, Zeng, Demontis, Rosengren, Athanasiadis, Bækved-Hansen, Qvist, Bragi Walters and Børglum2023; Halldorsdottir et al., Reference Halldorsdottir, Piechaczek, Soares de Matos, Czamara, Pehl, Wagenbuechler, Feldmann, Quickenstedt-Reinhardt, Allgaier, Freisleder, Greimel, Kvist, Lahti, Räikkönen, Rex-Haffner, Arnarson, Craighead, Schulte-Körne and Binder2019; Howard et al., Reference Howard, Adams, Clarke, Hafferty, Gibson, Shirali, Coleman, Hagenaars, Ward, Wigmore, Alloza, Shen, Barbu, Xu, Whalley, Marioni, Porteous, Davies and Deary2019; Kosciuszko et al., Reference Kosciuszko, Steptoe and Ajnakina2023; Kwong et al., Reference Kwong, Morris, Pearson, Timpson, Rice, Stergiakouli and Tilling2021a; McIntosh & Lewis, Reference McIntosh and Lewis2024; Mitchell et al., Reference Mitchell, Thorp, Wu, Campos, Nyholt, Gordon, Whiteman, Olsen, Hickie, Martin, Medland, Wray and Byrne2021; Musliner et al., Reference Musliner, Mortensen, McGrath, Suppli, Hougaard, Bybjerg-Grauholm, Bækvad-Hansen, Andreassen, Pedersen, Pedersen, Mors, Nordentoft, Børglum, Werge and Agerbo2019; Wray et al., Reference Wray, Ripke, Mattheisen, Trzaskowski, Byrne, Abdellaoui, Adams, Agerbo, Air, Andlauer, Bacanu, Bækvad-Hansen, Beekman, Bigdeli, Binder, Blackwood, Bryois, Buttenschøn and Bybjerg-Grauholm2018). However, their predictive power remains limited, and their clinical utility needs further study.
Although these genetic risk scores have been studied in clinical and, to lesser extent, in population-based samples, the majority of studies focused on cohorts with younger participants (up to 35 years of age, including children) (Halldorsdottir et al., Reference Halldorsdottir, Piechaczek, Soares de Matos, Czamara, Pehl, Wagenbuechler, Feldmann, Quickenstedt-Reinhardt, Allgaier, Freisleder, Greimel, Kvist, Lahti, Räikkönen, Rex-Haffner, Arnarson, Craighead, Schulte-Körne and Binder2019; Kwong et al., Reference Kwong, Morris, Pearson, Timpson, Rice, Stergiakouli and Tilling2021a; Musliner et al., Reference Musliner, Mortensen, McGrath, Suppli, Hougaard, Bybjerg-Grauholm, Bækvad-Hansen, Andreassen, Pedersen, Pedersen, Mors, Nordentoft, Børglum, Werge and Agerbo2019), or on broad age ranges across adulthood (Als et al., Reference Als, Kurki, Grove, Voloudakis, Therrien, Tasanko, Nielsen, Naamanka, Veerapen, Levey, Bendl, Bybjerg-Grauholm, Zeng, Demontis, Rosengren, Athanasiadis, Bækved-Hansen, Qvist, Bragi Walters and Børglum2023; Howard et al., Reference Howard, Adams, Clarke, Hafferty, Gibson, Shirali, Coleman, Hagenaars, Ward, Wigmore, Alloza, Shen, Barbu, Xu, Whalley, Marioni, Porteous, Davies and Deary2019; McIntosh & Lewis, Reference McIntosh and Lewis2024; Mitchell et al., Reference Mitchell, Thorp, Wu, Campos, Nyholt, Gordon, Whiteman, Olsen, Hickie, Martin, Medland, Wray and Byrne2021; Wray et al., Reference Wray, Ripke, Mattheisen, Trzaskowski, Byrne, Abdellaoui, Adams, Agerbo, Air, Andlauer, Bacanu, Bækvad-Hansen, Beekman, Bigdeli, Binder, Blackwood, Bryois, Buttenschøn and Bybjerg-Grauholm2018). Studies into genetic risk assessment in middle-aged and older adults remain limited, despite the ageing population globally and a peak of depression prevalence in middle and late adulthood (World Health Organization, 2017). Although twin studies generally suggest stable heritability estimates across adulthood, the relative influence of genetic factors may decrease as environmental factors play an increasingly important role at older ages (Kendler et al., Reference Kendler, Eaves, Loken, Pedersen, Middeldorp, Reynolds, Boomsma, Lichtenstein, Silberg and Gardner2011; Nivard et al., Reference Nivard, Dolan, Kendler, Kan, Willemsen, van Beijsterveldt, Lindauer, van Beek, Geels, Bartels, Middeldorp and Boomsma2015). Consistently, polygenic risk scores have shown stronger associations with early-onset than late-onset depression (Power et al., Reference Power, Tansey, Buttenschøn, Cohen-Woods, Bigdeli, Hall, Kutalik, Lee, Ripke, Steinberg, Teumer, Viktorin, Wray, Arolt, Baune, Boomsma, Børglum, Byrne, Castelao and Lewis2017). Moreover, depression in late life may involve different biological mechanisms and symptom profiles, including somatic and neurodegenerative processes (Schaakxs et al., Reference Schaakxs, Comijs, Lamers, Beekman and Penninx2017; Szymkowicz et al., Reference Szymkowicz, Gerlach, Homiack and Taylor2023), further emphasizing potential age-related variation in genetic liability.
In this context, Kosciuszko et al. (Reference Kosciuszko, Steptoe and Ajnakina2023) showed that genetic risk scores are related to depressive symptoms in adults aged 50 and over, but only used one self-reported depression phenotype based on the Center for Epidemiological Studies Depression (CES-D) scale (Kosciuszko et al., Reference Kosciuszko, Steptoe and Ajnakina2023). However, previous studies have shown that associations may differ depending on depression phenotype, episode severity, or the presence of situational triggers, such as the COVID-19 pandemic (Halldorsdottir et al., Reference Halldorsdottir, Piechaczek, Soares de Matos, Czamara, Pehl, Wagenbuechler, Feldmann, Quickenstedt-Reinhardt, Allgaier, Freisleder, Greimel, Kvist, Lahti, Räikkönen, Rex-Haffner, Arnarson, Craighead, Schulte-Körne and Binder2019; Huang et al., Reference Huang, Tang, Rietkerk, Appadurai, Krebs, Schork, Werge, Zuber, Kendler and Cai2023; Kwong et al., Reference Kwong, Morris, Pearson, Timpson, Rice, Stergiakouli and Tilling2021a; Musliner et al., Reference Musliner, Agerbo, Vilhjálmsson, Albiñana, Als, Østergaard and Mortensen2021; Sullivan et al., Reference Sullivan, Neale and Kendler2000; Wray et al., Reference Wray, Ripke, Mattheisen, Trzaskowski, Byrne, Abdellaoui, Adams, Agerbo, Air, Andlauer, Bacanu, Bækvad-Hansen, Beekman, Bigdeli, Binder, Blackwood, Bryois, Buttenschøn and Bybjerg-Grauholm2018). Therefore, an extensive assessment of genetic risk scores across phenotypes and characteristics of depression in middle and older age is warranted.
In this study, we aimed to evaluate the predictive capabilities of genetic risk scores based on the summary statistics of the study by Howard et al. (Reference Howard, Adams, Clarke, Hafferty, Gibson, Shirali, Coleman, Hagenaars, Ward, Wigmore, Alloza, Shen, Barbu, Xu, Whalley, Marioni, Porteous, Davies and Deary2019). We constructed and validated two commonly used types of genetic risk scores: a clumping and thresholding (C + T) genome-wide risk score (GRS) and a restricted polygenic risk score (PRS), including variants meeting the genome-wide significance (p < 5 × 10–8) for depression. While the GRS aggregates a broad range of genetic variants across the genome and may offer greater predictive power, the PRS is more interpretable and potentially easier to translate to clinical or biological contexts. This dual approach allows us to explore whether the GRS and PRS are differentially associated with the occurrence of depression-related phenotypes in a prospective cohort of middle-aged and older Dutch adults. Additionally, we examined whether associations differed depending on when and how depression-related phenotypes were measured: standard cross-sectional symptom reports, cross-sectional symptoms during the COVID-19 pandemic as an acute environmental stressor, and longitudinal data (follow-up >10 years) using clinically defined phenotypes.
Methods
Study design and population
This study was embedded within the population-based Rotterdam Study, a prospective cohort of adults aged 40 years and over that originated from 1989 (Ikram et al., Reference Ikram, Kieboom, Brouwer, Brusselle, Chaker, Ghanbari, Goedegebure, Ikram, Kavousi, de Knegt, Luik, van Meurs, Pardo, Rivadeneira, van Rooij, Vernooij, Voortman and Terzikhan2024). The Rotterdam Study (RS) consists of four subcohorts of which three were included in this study: RS-I (started from 1989, N = 7,983 aged 55 years and over at initial recruitment), RS-II (started from 2000, N = 3,011 aged 55 years and over), and RS-III (started from 2006, N = 3,932 aged 45 years and over). Follow-up visits occur every 5 years on average.
In total, 9,282 participants from the three subcohorts were included in this study, with data available on genotype and any of the depression-related phenotypes of interest. Of these, a total of 7,316 participants contributed cross-sectional measured data on depression between 2002 and 2008. Longitudinal data were available for 9,198 participants based on continuous monitoring of medical records and additional assessments during study waves: RS-I (1993–2012, four waves), RS-II (2000–2012, three waves), and RS-III (2006–2019, two waves). Additionally, during a period in the COVID-19 pandemic (April 2020–July 2020), we collected information on depressive symptoms in 2,955 participants with genotype data available (Licher et al., Reference Licher, Terzikhan, Splinter, Velek, van Rooij, Heemst, Haarman, Thee, Geurts, Mens, van der Schaft, de Feijter, Pardo, Kieboom and Ikram2021).
The Rotterdam Study has been approved by the Medical Ethics Committee of Erasmus MC (registration number MEC 02.1015) and by the Dutch Ministry of Health, Welfare and Sport (Population Screening Act WBO, license number 1071272–159521-PG). The Rotterdam Study Personal Registration Data collection is filed with the Erasmus MC Data Protection Officer under registration number EMC1712001. The Rotterdam Study has been registered at the Netherlands National Trial Register (NTR; www.trialregister.nl) and the WHO International Clinical Trials Registry Platform (ICTRP; www.who.int/ictrp/network/primary/en/) under shared catalog number NTR6831. All participants provided written informed consent to participate in the study and to have their information obtained from treating physicians.
Measurements
Genotyping and genetic scores calculation
Genotyping of data within the Rotterdam Study cohorts has been described previously (Ikram et al., Reference Ikram, Brusselle, Ghanbari, Goedegebure, Ikram, Kavousi, Kieboom, Klaver, de Knegt and Luik2020). In short, samples were genotyped using Illumina’s 550 k or 610 k platform and imputed to the 1KGp3v5 reference panel.
The restricted PRS contains 102 independent SNPs with genome-wide significance (p < 5 × 10−8) selected from a large GWAS on depression, which included populations with a wide age range across adulthood (Howard et al., Reference Howard, Adams, Clarke, Hafferty, Gibson, Shirali, Coleman, Hagenaars, Ward, Wigmore, Alloza, Shen, Barbu, Xu, Whalley, Marioni, Porteous, Davies and Deary2019). We used the same study summary statistics to generate the GRS. Linkage disequilibrium clumping (r2 < 0.1 in a 1-Mb window) was performed on any overlapping SNPs with the 1000 Genomes Project European samples for reference. GRS was calculated using PLINK (v1.9) for eight different scores based on different p-value thresholds (p ≤ 5 × 10−8, p ≤ 1 × 10−5, p ≤ 1 × 10−3, p ≤ 0.01, p ≤ 0.05, p ≤ 0.1, p ≤ 0.5, and p ≤ 1). The threshold yielding the highest variance explained (R2) across three RS subcohorts was selected for the calculation of the genome-wide risk score.
The average weighted score of each GRS and PRS is determined for each participant based on the following formula:

where genetic risk scorei refers to the genetic score for subject i,
$ {x}_{ij} $
is the posterior probability of the alternate allele by a subject i of a variant j based on best-guess imputations, κ is the number of independent variants in the genetic risk score for subject i, and
$ {\hat{\beta}}_j $
is the weight for variant j obtained from GWAS summary statistics. Both GRS and PRS were standardized to a mean of zero and a standard deviation of one (z-transformation) for each subcohort.
Depression
Cross-sectional data
Depressive symptoms were assessed during home interviews using the Dutch version of the CES-D scale (Beekman et al., Reference Beekman, Deeg, Van Limbeek, Braam, De Vries and Van Tilburg1997; Radloff, Reference Radloff1977). The questionnaire, designed for research in the general population, consists of 20 questions referring to the presence and severity of depressive symptoms during the past week, such as ‘I felt depressed’ or ‘I could not get going’. Answer options range from zero (‘rarely or none of the time’) to three (‘most or all of the time’), resulting in a sum score between 0 and 60, with higher scores indicating more depressive symptoms. A validated cut-off score of 16 points or higher on the CES-D scale is considered indicative of clinically relevant depressive symptoms (Beekman et al., Reference Beekman, Deeg, Van Limbeek, Braam, De Vries and Van Tilburg1997). Depressive symptoms during a period in the COVID-19 pandemic (April 2020–July 2020) were assessed by survey using a validated shortened version of the CES-D scale, consisting of 10 items, with a sum score ranging from 0 to 30 and a commonly used cut-off score of 10 points or higher (Andresen et al., Reference Andresen, Malmgren, Carter and Patrick1994; Mooldijk et al., Reference Mooldijk, Dommershuijsen, de Feijter and Luik2022; van den Besselaar et al., Reference van den Besselaar, MacNeil Vroomen, Buurman, Hertogh, Huisman, Kok and Hoogendijk2021).
Longitudinal data
Depression-related phenotypes were identified using three information sources, as previously described in detail (Luijendijk et al., Reference Luijendijk, van den Berg, Dekker, van Tuijl, Otte, Smit, Hofman, Stricker and Tiemeier2008). First, the CES-D scale was assessed during the home interview at baseline and at each follow-up measurement. Second, participants screening positive for depressive symptoms on the CES-D scale were invited for a clinical semi-structured interview by trained personnel: The Dutch version of the Schedules for Clinical Assessment in Neuropsychiatry (SCAN). Based on the interview, depressive disorders were diagnosed and classified according to the Statistical Manual of Mental Disorders, Fourth revised edition (DSM-IV-TR). Third, monitoring of medical records for the occurrence of depression-related phenotypes took place from baseline onward. To assess the incidence of depression-related phenotypes during the follow-up period, trained research assistants reviewed the medical records of general practitioners, which include not only general practitioners’ notes but also letters from hospitals, specialists, and psychologists/psychiatrists. Each medical record was reviewed by two trained research assistants, and depressive episodes were categorized according to a predefined protocol. Disagreeing categorizations were discussed in consensus meetings. If no depression-related phenotype was identified, participants were censored if they moved, at death, at January 1, 2012 (RS-I and RS-II), or at January 1, 2019 (RS-III).
From these sources, three depression-related phenotypes were derived, ordered by increasing severity: (1) clinically relevant depressive symptoms (based on SCAN or clinician-reported symptoms); (2) depressive syndromes (DSM-IV-TR defined cases of dysthymic disorder, mood disorder due to medical condition, and mood disorder not otherwise specified); and (3) major depressive disorder (DSM-IV-TR defined, single or recurrent) (Luijendijk et al., Reference Luijendijk, van den Berg, Dekker, van Tuijl, Otte, Smit, Hofman, Stricker and Tiemeier2008).
We defined ‘any depression-related phenotype’ as the occurrence of at least one phenotype during follow-up. For assessment per phenotype, we categorized each participant based on the most severe phenotype during follow-up.
Covariates
Self-report data were available on sex and age.
Statistical analysis
The study sample was characterized using descriptive statistics. Pearson’s correlations between GRS and PRS were assessed for cases and non-cases based on any depression-related phenotype, and separately for each phenotype.
Generalized linear models (GLM) were used to estimate the cross-sectional association of the continuous GRS and PRS for depression with depressive symptoms (CES-D score). To assess whether genetic risk is longitudinally associated with the occurrence of depression-related phenotypes at middle and older age, binomial GLMs were used to estimate the association of the continuous GRS and PRS with any depression-related phenotype, and multinomial models to estimate the associations with the type of phenotype (i.e. clinically relevant depressive symptoms, depressive syndrome, and major depressive disorder). To establish relative risks for individuals surpassing specific percentiles of the risk score distribution, participants were grouped by their risk score percentile (i.e. 5th, 10th, 20th, 25th, 75th, 80th, 90th, or 95th percentiles). Each group was compared to the middle 50% of the relevant risk score distribution. Lastly, linear models were used to assess the association of GRS and PRS with depressive symptoms during the COVID-19 pandemic. We also explored the models for any depression-related phenotype in different age strata (45–54 years, 55–64 years, 65–74 years, and 75 years and older). All models were adjusted for sex. As the study population represented a genetically homogeneous European ancestry group, population structure did not warrant adjustment, as previously described (Ikram et al., Reference Ikram, Kieboom, Brouwer, Brusselle, Chaker, Ghanbari, Goedegebure, Ikram, Kavousi, de Knegt, Luik, van Meurs, Pardo, Rivadeneira, van Rooij, Vernooij, Voortman and Terzikhan2024). All analyses were performed per subcohort and then meta-analyzed (Meta). Fixed-effects models were used as the three subcohorts originate from the same district in Rotterdam and represent subpopulations of the population-based Rotterdam Study. Data were handled and analyzed using R version 4.0.2 with ‘rmeta’, ‘nnet’, and ‘ggplot2’ packages (The R Foundation for Statistical Computing, Vienna, Austria).
Results
Table 1 (and Supplementary Table S1) presents study sample characteristics. The mean follow-up time for subcohort RS-I (N = 4,312) was 12.6 years (standard deviation [SD]: 5.3 years), RS-II (N = 2,063) 10.1 years (SD: 2.4 years), and RS-III (N = 2,823) 10.3 years (SD: 2.2 years).
Table 1. Characteristics of the study sample

Note: Data are presented as N (%) or mean ± SD unless otherwise indicated. *Age of measurement refers to the date of interview for cross-sectionally measured data, and the start date of follow-up (FU) for longitudinal data. RS = Rotterdam Study, referring to subcohort, SD = standard deviation, GRS = genome-wide risk score, PRS = polygenic risk score, CES-D = Center for Epidemiological Studies – Depression scale, IQR = interquartile range.
To compute the GRS based on the best-fitting model, we opted for a cutoff of p < 0.05 as the preferred threshold across three RS subcohorts. This choice was made because it yielded the highest variance explained in RS-I (R2 = 8.1x10−7) and RS-III (R2 = 1.4x10−8). However, for RS-II, the highest variance explained was achieved with a cutoff of p < 0.5 (R2 = 1.0x10−3) (Supplementary Figure S1). The unstandardized GRS and PRS showed similar distributions across subcohorts (Table 1). Correlations between the GRS and PRS were r ≤ 0.20 (Supplementary Figure S2).
Genetic risk scores and associations with cross-sectionally measured depressive symptoms
One standard deviation higher GRS for depression was associated with a 0.71 points higher cross-sectionally measured depressive symptoms score on the CES-D (estimated difference βGRS = 0.71, 95% confidence interval [0.53–0.89]), and one standard deviation higher PRS for depression with a 0.36 points higher score (βPRS = 0.36 [0.19–0.54], Figure 1, Supplementary Table S2). Both GRS and PRS also showed a positive association with a depressive symptoms score above the validated cut-off, indicating clinically relevant depressive symptoms (Odds Ratio (OR)GRS = 1.29 [1.20–1.39]; ORPRS = 1.13 [1.05–1.22], Supplementary Table S2). For depressive symptoms measured during the COVID-19 pandemic using the shortened version (scale 0–30), higher GRS and PRS were also associated with higher CES-D scores (βGRS = 0.33 [0.17–0.49], βPRS = 0.26 [0.10–0.41], Figure 2, Supplementary Table S3).

Figure 1. Associations of the GRS for depression (top panel) and the PRS for depression (bottom panel) with cross-sectionally measured depressive symptoms. CES-D refers to the Center for Epidemiological Studies – Depression scale, RS to Rotterdam Study (subcohort), GRS to genome-wide risk score, PRS to polygenic risk score, and META to meta-analysis.

Figure 2. Associations of the GRS for depression (top panel) and the PRS for depression (bottom panel) with cross-sectionally measured depressive symptoms during the COVID-19 pandemic. CES-D refers to the Center for Epidemiological Studies – Depression scale, RS to Rotterdam Study (subcohort), GRS to genome-wide risk score, PRS to polygenic risk score, and META to meta-analysis.
Genetic risk scores and associations with longitudinally measured phenotypes of depression
The standardized GRS and PRS for depression were significantly associated with any depression-related phenotype during follow-up (ORGRS = 1.20 [1.15–1.26], ORPRS = 1.10 [1.05–1.16], Figure 3, Supplementary Table S4). When assessing per phenotype (clinically relevant depressive symptoms, depressive syndromes, and major depressive disorder), we observed the highest effect size for episodes of major depressive disorder (ORGRS = 1.57 [1.43–1.73], ORPRS = 1.19 [1.09–1.31]), while the effect sizes for clinically relevant depressive symptoms (ORGRS = 1.13 [1.06–1.19], ORPRS = 1.08 [1.02–1.14]) and depressive syndrome (ORGRS = 1.14 [1.02–1.28], ORPRS = 1.08 [0.97–1.20]) were more similar (Figure 3, Supplementary Table S4). When these associations for any depression-related phenotype were studied by 10-year age groups, effect estimates for both the GRS and PRS were consistent in direction and comparable in size across participants aged 45–74 years (Supplementary Table S5). In the oldest group (75+ years), estimates from meta-analyses were smaller and non-significant, but estimates per cohort differed in direction and were based on a limited number of cases.

Figure 3. Associations of the GRS for depression (top panels) and the PRS for depression (bottom panels) with longitudinally measured depression-related phenotypes: any depression-related phenotype, and then categorized into worst phenotype, that is depressive symptoms, depressive syndrome, or major depressive disorder. RS refers to Rotterdam Study (subcohort), GRS to genome-wide risk score, PRS to polygenic risk score, and META to meta-analysis.
Relative risks of any depression-related phenotype in the tails of the genetic risk score distributions
Participants in the lower and upper tails of the GRS distribution (25%, 20%, 10%, and 5%) showed significantly different risks of any depression-related phenotype compared to the middle 50% of the distribution (e.g. ORGRS,90% = 1.33 [1.14–1.55], ORGRS,95% = 1.46 [1.18–1.79]). Again, effect sizes were largest for episodes of major depressive disorder (e.g. ORGRS,90% = 1.99 [1.53–2.57], ORGRS,95% = 2.22 [1.58–3.06]). All lower and upper tails (25% and smaller) of the GRS distribution were significantly associated with each depression-related phenotype, except that all upper tails of the GRS distribution did not show a significantly higher risk for depressive syndrome (Supplementary Figure S3, Supplementary Table S6).
For the PRS distribution, similar trends were observed, although effect sizes were smaller and not consistently statistically significant across all lower and upper tails (e.g. for any depression-related phenotype: ORPRS,90% = 1.26 [1.08–1.48], ORPRS,95% = 1.16 [0.94–1.44]). Also for PRS, effect sizes were largest for major depressive disorder, but not all statistically significant (e.g. ORPRS,90% = 1.51 [1.13–1.99], ORPRS,95% = 1.45 [0.98–2.09], Supplementary Figure S3, Supplementary Table S6).
Discussion
Our findings indicated that a GRS for depression was associated with depression-related phenotypes in this population-based cohort of middle-aged and older adults, including cross-sectional, longitudinal, and COVID-19-related outcomes. Effect sizes varied slightly across these different outcomes. The PRS for depression showed similar patterns, with generally more modest associations.
Our results indicate that higher GRS and PRS for depression are more strongly associated with major depressive disorder, compared to clinically depressive symptoms or syndrome. This likely reflects the phenotype definitions used in the GWAS study (Howard et al., Reference Howard, Adams, Clarke, Hafferty, Gibson, Shirali, Coleman, Hagenaars, Ward, Wigmore, Alloza, Shen, Barbu, Xu, Whalley, Marioni, Porteous, Davies and Deary2019), ranging from depressive symptom scales to clinical diagnoses. Consequently, cases of major depressive disorder were probably captured across all cohorts, while less severe cases were only included in certain cohorts, potentially tailoring the risk scores to predict risk of more severe cases of depression. Genetic assessment might benefit from GWAS that differentiate between phenotypes (e.g. based on severity) (Mitchell et al., Reference Mitchell, Thorp, Wu, Campos, Nyholt, Gordon, Whiteman, Olsen, Hickie, Martin, Medland, Wray and Byrne2021), although this remains challenging due to the heterogeneous presentation of depression-related phenotypes (Olbert et al., Reference Olbert, Gala and Tupler2014). Also, due to greater heterogeneity and stronger influence of environmental factors, depressive symptoms might be less heritable than major depressive disorder (Fried et al., Reference Fried, Nesse, Zivin, Guille and Sen2014; Jang et al., Reference Jang, Livesley, Taylor, Stein and Moon2004). This may also explain the smaller effect sizes for continuous CES-D scores. Although statistically significant, a difference of 0.71 points on the CES-D scale per SD in GRS/PRS might be of uncertain clinical relevance. Overall, these findings might suggest greater potential for genetic risk scores for identifying individuals at risk of clinically relevant depression, rather than capturing subtle variation in symptom levels.
Our results suggest that genetic risk scores (GRS and PRS), based on a large GWAS study (Howard et al., Reference Howard, Adams, Clarke, Hafferty, Gibson, Shirali, Coleman, Hagenaars, Ward, Wigmore, Alloza, Shen, Barbu, Xu, Whalley, Marioni, Porteous, Davies and Deary2019), show modest associations with depression-related phenotypes in middle-aged and older adults. This aligns with previous studies reporting modest effect sizes and limited explained variance (Halldorsdottir et al., Reference Halldorsdottir, Piechaczek, Soares de Matos, Czamara, Pehl, Wagenbuechler, Feldmann, Quickenstedt-Reinhardt, Allgaier, Freisleder, Greimel, Kvist, Lahti, Räikkönen, Rex-Haffner, Arnarson, Craighead, Schulte-Körne and Binder2019; Howard et al., Reference Howard, Adams, Clarke, Hafferty, Gibson, Shirali, Coleman, Hagenaars, Ward, Wigmore, Alloza, Shen, Barbu, Xu, Whalley, Marioni, Porteous, Davies and Deary2019; Kosciuszko et al., Reference Kosciuszko, Steptoe and Ajnakina2023; Musliner et al., Reference Musliner, Mortensen, McGrath, Suppli, Hougaard, Bybjerg-Grauholm, Bækvad-Hansen, Andreassen, Pedersen, Pedersen, Mors, Nordentoft, Børglum, Werge and Agerbo2019). More specifically, a comparison with the validation cohorts in Howard et al. is possible: the odds ratio for MDD in the top versus bottom 10% of our GRS was 4.2, which is similar to the estimates reported in their study (ORs ranging from 2.0 to 3.5 across cohorts and p-value thresholds). Although the original GWAS included studies with broad age ranges across adulthood, older adults may have been underrepresented. Nevertheless, our findings suggest the derived genetic risk scores associate with depression-related phenotypes in a sample of middle-aged and older adults. Within our cohort, effect sizes for the GRS and PRS were consistent across age groups between 45 and 74 years. In the oldest group (75+), associations appeared weaker, but low event numbers and heterogeneity between cohorts limit interpretation. Although studies allowing for more direct comparisons across adulthood are needed, our and prior findings do not indicate that the role of genetic risk meaningfully changes with age.
Genetic risk scores were associated with cross-sectional and longitudinal ascertained phenotypes, even as depressive symptoms during the COVID-19 pandemic. This supports the potential applicability of genetic risk scores across various depression characteristics and settings. Associations present during the COVID-19 pandemic align with studies linking genetic risk scores to psychiatric symptoms following specific triggers or events (Kwong et al., Reference Kwong, Pearson, Adams, Northstone, Tilling, Smith, Fawns-Ritchie, Bould, Warne, Zammit, Gunnell, Moran, Micali, Reichenberg, Hickman, Rai, Haworth, Campbell, Altschul and Timpson2021b; Waszczuk et al., Reference Waszczuk, Docherty, Shabalin, Miao, Yang, Kuan, Bromet, Kotov and Luft2020). Direct comparison with pre-pandemic results is complicated by population (participants with available data during the pandemic were older on average) and measurement differences (full vs. shortened version of the CES-D; interview vs. survey data). Among older Dutch adults, self-reported symptoms tend to be higher than data from interviews (Geerlings et al., Reference Geerlings, Beekman, Deeg, Tilburg and Smit1999). Therefore, we could not determine whether the role of genetic predisposition is potentially stronger or weaker in the context of an external stressor such as the pandemic. Nonetheless, our findings suggest that cross-sectional depression measures may capture similar genetic associations as lifetime prevalence, which may be relevant for genetic research settings.
Regarding the two genetic risk scores we used, a notable trend emerges when comparing PRS and GRS: the effect sizes for genome-wide scores (GRS) tend to surpass those of restricted scores (PRS), although most differences were not statistically significant. The broader range of genetic variants included in the GRS likely contributed to its slightly better performance. For the GRS, we selected the p-value threshold that maximized explained variance in our study, although this variance remained very low across all thresholds and was not fully consistent across cohorts. This underscores the limited predictive power of current approaches for depression and highlights that optimal thresholds may be population specific. In contrast, PRS includes only a select set of the most significantly associated variants, which may facilitate biological interpretation and clinical translation. Particularly, biological annotation and interpretation of a restricted PRS could be used to explore the underlying disease mechanisms (Schreurs et al., Reference Schreurs, Ramón, Adank, Collée, Hollestelle, van Rooij, Schmidt and Hooning2024; Stikker et al., Reference Stikker, Trap, Sedaghati-Khayat, de Bruijn, van Ijcken, de Roos, Ikram, Hendriks, Brusselle, van Rooij and Stadhouders2024), although we acknowledge that this approach may oversimplify the biological complexity of depression by ignoring the thousands of more subtle associated SNPs. Future studies should consider advanced polygenic scoring methods such as SBayesRC or LDpred-funct, which integrate functional annotations and genome-wide LD modeling, thereby offering enhanced predictive accuracy and the potential for deeper biological insights.
In psychiatric phenotypes, genetic risk factors might ultimately contribute to early risk detection, accurate diagnosis, and personalized treatment strategies (Murray et al., Reference Murray, Lin, Austin, McGrath, Hickie and Wray2021). However, as genetic factors account for only a small part of the overall risk of psychiatric disorders and the variance explained by genetic risk scores is generally low, interpretation and communication of these results remain challenging. A more comprehensive and strongly associated risk score may be needed before application in clinical practice. Additionally, given the lack of a clear prevention strategy for psychiatric disorders, it is important to consider the potential applications and consequences of knowing an individual’s genetic risk. Understanding these factors is crucial before incorporating and communicating these scores in clinical practice (Murray et al., Reference Murray, Lin, Austin, McGrath, Hickie and Wray2021; Wray et al., Reference Wray, Lin, Austin, McGrath, Hickie, Murray and Visscher2021).
Strengths of this study include the population-based character of our sample, the inclusion of multiple and extensive measurements of depression, and the comparison of two common types of genetic risk scores. However, several limitations should be mentioned. First, the sample size was relatively small in some analyses, e.g. the analyses by age group or the RS-I subcohort in the COVID-19 data analyses. Second, the risk scores that we used were based on a GWAS including various definitions of depression rather than specific depressive (sub)phenotypes. Third, although we used one of the most extensive GWAS studies available so far, the explained variance of the PRS is still small (ranging from 1.5 to 3.2%) (Howard et al., Reference Howard, Adams, Clarke, Hafferty, Gibson, Shirali, Coleman, Hagenaars, Ward, Wigmore, Alloza, Shen, Barbu, Xu, Whalley, Marioni, Porteous, Davies and Deary2019). Probably, future GWAS studies will enable genetic risk scores with higher explained variance, as shown by the available preprint of McIntosh et al. (explained variance of 5.7%) (McIntosh & Lewis, Reference McIntosh and Lewis2024). Fourth, the risk scores we constructed were based on GWAS in European populations. Therefore, our findings are only applicable to individuals of European ancestry. Fifth, we were unable to assess the effects of antidepressant use; future sensitivity analyses should address this, while considering potential collider bias. Finally, our study focused on genetic risk only, while in real-life settings, it might be important to consider genetic risk in interaction with other individual and environmental risk factors.
Conclusions
In a population of middle-aged and older adults, both the GRS and, to a smaller extent, the PRS for depression showed modest associations with a range of depression-related phenotypes. Effect sizes for both GRS and PRS seem to increase with the severity of depression. Based on cross-sectional and longitudinal measured data, the GRS and PRS are associated with both prevalent and incident depression. These findings support the potential relevance of genetic risk scores in understanding vulnerability to depression. However, their explained variance remains limited, emphasizing the need for further improvement before their clinical utility can be meaningfully assessed.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S0033291725101943.
Data availability statement
Data can be obtained upon request. Requests should be directed toward the management team of the Rotterdam Study (secretariat.epi@erasmusmc.nl), which has a protocol for approving data requests. Because of restrictions based on privacy regulations and informed consent of the participants, data cannot be made freely available in a public repository.
Acknowledgments
We acknowledge the dedication, commitment, and contribution of the inhabitants, general practitioners, and pharmacists of the Ommoord district who took part in the Rotterdam Study. We acknowledge Frank van Rooij as data manager and Brenda C.T. Kieboom as study coordinator. We thank Jolande Verkroost-van Heemst for her invaluable contribution to the collection of the data.
We acknowledge the use of ChatGPT (version 4o and 5) for English language refinement and text editing. The tool was accessed between July and August 2025 via a secure account. Prompts included wording suggestions for short sentences or text fragments. The tool was not involved in study design, data analysis, or interpretation. All text suggested by ChatGPT was carefully reviewed and edited by the authors, who take full responsibility for the content.
Funding statement
The Rotterdam Study is funded by Erasmus Medical Center and Erasmus University, Rotterdam, Netherlands Organization for the Health Research and Development (ZonMw), the Research Institute for Diseases in the Elderly (RIDE), the Ministry of Education, Culture and Science, the Ministry for Health, Welfare and Sports, the European Commission (DG XII), and the Municipality of Rotterdam. This study was supported by the GOALL project. The GOALL project is funded by the internal Koers23 program from the Erasmus Medical Centre (project number 109433). The generation and management of genotype data for the Rotterdam Study were executed by the Human Genotyping Facility of the Genetic Laboratory of the Department of Internal Medicine, Erasmus MC, Rotterdam, The Netherlands.
Competing interests
The authors declare none.
Ethical standard
The Rotterdam Study has been approved by the Medical Ethics Committee of Erasmus MC (registration number MEC 02.1015) and by the Dutch Ministry of Health, Welfare and Sport (Population Screening Act WBO, license number 1071272–159521-PG). The Rotterdam Study Personal Registration Data collection is filed with the Erasmus MC Data Protection Officer under registration number EMC1712001. The Rotterdam Study has been registered at the Netherlands National Trial Register (NTR; www.trialregister.nl) and the WHO International Clinical Trials Registry Platform (ICTRP; www.who.int/ictrp/network/primary/en/) under shared catalog number NTR6831. All participants provided written informed consent to participate in the study and to have their information obtained from treating physicians.