Impact statement
This work addresses a critical gap in environmental health research: the inability of traditional observational epidemiology to fully elucidate the causal and interactive effects of plastic chemical exposures. By proposing a hybrid epidemiological framework that incorporates underpinning genetic susceptibility and molecular mechanisms, advanced statistical methods, machine learning and experimental tools, this perspective charts a path towards a more comprehensive understanding of exposure-disease relationships. This approach not only reveals the biological plausibility of the findings but also provides robust evidence for policymakers to regulate hazardous chemical mixtures effectively.
Introduction
For decades, observational epidemiological studies have formed the basis for our understanding of environmental health risks, particularly those posed by plastic chemicals (Grandjean and Landrigan, 2006; Woodruff et al., 2011; Landrigan et al., 2023). These studies have provided critical information on the associations between exposure to numerous ubiquitous chemicals, in particular during early life, and various health conditions, including adverse perinatal and childhood outcomes, endocrine disruption, adverse neurodevelopmental outcomes, obesity, diabetes, and cancer (Symeonides et al., 2024, and references therein). Despite their significant contributions, observational epidemiological studies often face limitations in establishing causal relationships. Such studies typically focus on the health impacts of individual chemicals, an approach that may not fully capture the complexities of endocrine-disrupting chemicals (EDCs) such as phthalates and bisphenols. These chemicals can exert effects through shared and/or overlapping mechanisms on endocrine systems, such as binding to specific hormone receptors in target tissues or interfering with biotransformation and excretion of endogenous hormones, enabling multiple chemicals to act additively or synergistically at lower concentrations to produce outcomes that would require higher concentrations if acting alone (Kortenkamp, 2008; Al-Gubory, 2014; Darbre, 2022).
The inability to account for the full exposome (Wild, 2005), encompassing all chemical and non-chemical factors that influence human well-being throughout life, e.g., lifestyle, socio-economic status, diet, adverse life-events and psychosocial factors, can bias findings further and obscure direct causal links between chemical exposure and health outcomes later in life. In addition, the long latency periods associated with conditions such as cancer or reproductive disorders make it difficult to attribute these health impacts to specific chemical exposures many years earlier in life (Amolegbe et al., 2022). Exposures to even very low concentrations during the critical developmental window in the foetal and childhood life stages could be related to a programming effect with an increased risk of disease later in life (Lanphear, 2015). Furthermore, as EDCs often show non-linear exposure–response relationships, assuming only linear relationships will obscure the true impact of these chemicals. These challenges highlight the need for a hybrid epidemiological approach that combines advances in mixture analysis, machine learning and molecular methods to improve causal inference. Integrating experimental models (e.g., in vitro, in vivo, in silico and mechanistic toxicology) with epidemiological data enables testing of biological plausibility, dose–response and temporality. This alignment helps move from statistical associations to identifying mechanistic pathways linking EDC exposure to health outcomes.
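The point about non-linear exposure–response relationships can be illustrated with a minimal sketch. The exposure and response values below are hypothetical and chosen only to form a symmetric U shape; they are not taken from any study. Under that assumption, a straight-line fit reports no association at all, while the same data regressed on a squared exposure term are explained completely.

```python
from statistics import mean

def slope_and_r2(x, y):
    """Return the least-squares slope and R^2 for the line y ~ a + b*x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sxx, (sxy ** 2) / (sxx * syy)

# Hypothetical centred log-exposure levels and a U-shaped response.
exposure = [-2.0, -1.0, 0.0, 1.0, 2.0]
response = [4.0, 1.0, 0.0, 1.0, 4.0]

# A linear model sees no trend in the U-shaped data.
linear_slope, linear_r2 = slope_and_r2(exposure, response)

# The same outcome regressed on the squared exposure term instead.
squared = [e ** 2 for e in exposure]
quad_slope, quad_r2 = slope_and_r2(squared, response)

print(f"linear:    slope={linear_slope:.2f}, R^2={linear_r2:.2f}")
print(f"quadratic: slope={quad_slope:.2f}, R^2={quad_r2:.2f}")
```

Real exposure–response curves are rarely this clean, so flexible terms (splines, kernels) are usually preferred over a single squared term; the sketch only shows why a linear-only assumption can hide an effect.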
The challenges that observational studies face in establishing causality closely mirror the Bradford Hill criteria, which include demonstrating strength, consistency, biological plausibility and temporality in observed associations (Woodside III and Davis, 2012; Fedak et al., 2015). Traditional epidemiology often produces small effect sizes (Symeonides et al., 2024), which leaves confidence in causal links to human harm weak and inconclusive under Bradford Hill’s criterion of strength of association (Woodside III and Davis, 2012; Fedak et al., 2015). While consistency across observational epidemiological studies, another Bradford Hill criterion, provides some support for causality, the strength of the association is the primary factor in considering ‘cause’ (Woodside III and Davis, 2012; Fedak et al., 2015). In contrast, adopting a hybrid epidemiology approach that explores the underlying biological mechanisms is likely to reveal larger effect sizes expressed as several-fold changes (Caporale et al., 2022; Symeonides et al., 2024; Elagali et al., 2025), which strengthens the evidence for causation.
Moving beyond traditional observational studies, this perspective advocates for innovative hybrid epidemiological strategies that will equip researchers with powerful tools to uncover causal relationships and assess the health impacts of complex plastic chemical mixtures. In this perspective, we offer key recommendations to support the shift towards strengthening causal evidence, with substantial potential to drive progress in precision medicine and improve population health outcomes.
Beyond one-chemical-at-a-time: mixture analysis and machine learning methods
Traditional epidemiological approaches are sometimes limited by their focus on the health risks of single chemical exposures, a focus that fails to capture the complexity of real-world scenarios and treats environmental exposures as isolated events. This reductionist approach has resulted in a limited understanding of the health impacts associated with chemical mixtures and presents a significant regulatory challenge. In practice, humans are exposed to numerous environmental chemicals throughout their lifespan, often concurrently, as shown by national human biomonitoring programs (Woodruff et al., 2011; Stanfield et al., 2024). These exposures are frequently correlated, may show non-linear dose–response relationships and can lead to cumulative or interactive health effects. Traditional regression models struggle with such complexity due to multicollinearity, limited power to detect interactions and challenges in selecting meaningful variables from high-dimensional data. To address this, advanced statistical techniques have been developed to more accurately model chemical mixtures. These approaches differ in assumptions, interpretability, ability to account for correlation and capacity to detect non-linear or interactive effects. Advancing methods to quantify the disease risks posed by chemical mixtures could help identify modifiable exposures, enabling targeted public health interventions and prevention strategies.
In adapting this approach, epidemiologists must consider that the current research on plastic chemicals is constrained by a focus on substances already known to be hazardous or measurable using existing methods, a limitation often referred to as the ‘street-light effect’ (J. M. Braun et al., 2016). With the increase in the use of non-targeted approaches to quantify chemical exposures, mixture studies can rely on these methods to prioritize which chemicals are included in mixture-based statistical approaches, i.e., ‘in silico’ prioritization. To support this advancement, it is crucial to develop efficient high-throughput screening assays that target key endpoints and account for the toxicity of both individual chemicals and their mixtures. G. Braun et al. (2024) demonstrated this approach by analysing the neurotoxic effects of hundreds of chemical mixtures extracted from the blood of pregnant women. Using unbiased extraction methods, automated target screening and high-throughput in vitro neurotoxicity assays, Braun et al. uncovered the impact of complex organic mixtures on neurite development and demonstrated how in silico prioritization can effectively narrow down chemicals of concern for neurodevelopmental endpoints.
As the study of chemical mixtures remains an evolving field of research, there is currently no established consensus on the most appropriate statistical or machine learning methods to investigate the health effects of chemical mixtures in epidemiological studies (Bobb et al., 2015; Lazarevic et al., 2019; Maitre et al., 2022; Miller and Consortium, 2025). In recent years, numerous methods have been introduced to address the complexities associated with multiple exposures and their interactions (Sun et al., 2013; Forns et al., 2016; Stafoggia et al., 2017). These include techniques for variable selection, shrinkage and grouping of correlated variables, such as the least absolute shrinkage and selection operator (LASSO; Sun et al., 2013), elastic net (Lenters et al., 2016) and adaptive elastic net (Zou and Zhang, 2009). Other approaches include dimension reduction methods like principal component analysis (Yang et al., 2013) and partial least squares (Sun et al., 2013), as well as Bayesian frameworks such as Bayesian model averaging (Bobb et al., 2011).
Notably, two methods specifically designed for mixture analyses in environmental epidemiology warrant attention: Weighted Quantile Sum Regression (WQSR; Carrico et al., 2015) and Bayesian Kernel Machine Regression (BKMR; Bobb et al., 2018). WQSR and BKMR provide robust measures of the health effects of mixtures, with BKMR offering additional advantages, such as accounting for non-linearity and interaction in multivariate exposure–response relationships. Despite their strengths, all of these methods exhibit some limitations, including instability in model selection (shrinkage approaches), difficulties in interpreting latent variables (dimension reduction) and the computational intensity of Bayesian models. Furthermore, their application to large and heterogeneous exposome datasets – comprising diverse variable types such as -omics data and mixed categorical and continuous variables – remains limited.
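To make the idea behind a weighted quantile sum concrete, the sketch below builds the mixture index only: each exposure is scored into quartiles and the scores are combined with non-negative weights that sum to one. The chemical names, concentrations and weights are hypothetical. In an actual WQSR analysis the weights are estimated from the data (typically over bootstrap samples) and the outcome is then regressed on the index; neither step is shown here.

```python
def quantile_scores(values, n_quantiles=4):
    """Score each value 0..n_quantiles-1 by its rank-based quantile.

    Ties are broken by input order, which is a simplification.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        scores[idx] = rank * n_quantiles // len(values)
    return scores

def wqs_index(exposures, weights, n_quantiles=4):
    """Weighted sum of quantile-scored exposures for each participant."""
    total = sum(weights.values())
    if any(w < 0 for w in weights.values()) or abs(total - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    scored = {name: quantile_scores(vals, n_quantiles)
              for name, vals in exposures.items()}
    n = len(next(iter(exposures.values())))
    return [sum(weights[name] * scored[name][i] for name in exposures)
            for i in range(n)]

# Hypothetical concentrations for eight participants and three chemicals.
exposures = {
    "chem_a": [0.5, 1.2, 0.8, 3.1, 2.2, 0.3, 1.9, 2.7],
    "chem_b": [10, 14, 9, 30, 22, 8, 25, 18],
    "chem_c": [1.1, 0.9, 1.5, 0.7, 1.3, 1.0, 0.6, 1.4],
}
# Hypothetical weights; a fitted model would estimate these.
weights = {"chem_a": 0.6, "chem_b": 0.3, "chem_c": 0.1}

index = wqs_index(exposures, weights)
print([round(v, 2) for v in index])
```

Because the weights sum to one, each weight can be read as that chemical's share of the mixture index, which is what makes the approach easy to interpret.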
We do not advocate for a rigid, one-size-fits-all approach; however, for complex research questions exploring the health effects of multiple exposures, a multi-staged statistical strategy is recommended. For example, an initial step could involve utilizing methods like LASSO as a pre-processing tool (Sun et al., 2013). This preliminary step helps refine the dataset and enhance model performance before implementing Bayesian approaches, such as: Bayesian Hierarchical Models (Gelman et al., 1995), which can handle data with nested or multilevel structures (e.g., repeated measures or multi-cohort studies); BKMR, which is particularly effective for modelling non-linear and interactive effects of correlated exposures; or Bayesian Additive Regression Trees (Chipman et al., 2010), a flexible ensemble method that captures complex, non-parametric relationships and has been used to identify high-order interactions in high-dimensional datasets. An alternative to multistage analysis is the use of machine learning (ML) techniques, which are increasingly applied in environmental health to manage high-dimensional exposure data and uncover patterns missed by regression models. ML is well-suited for modelling non-linear relationships and interactions in chemical mixtures. The choice of algorithm depends on the data structure and research aims. For example, neural networks are particularly suited for modelling complex, multilayered interactions in large datasets, while support vector regression (SVR) performs well in small- to medium-sized datasets with non-linear boundaries.
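The first, screening stage of such a multi-staged strategy can be sketched as follows. This is a minimal, dependency-free LASSO fitted by cyclic coordinate descent on a tiny hypothetical dataset; the exposure names, data and penalty value are assumptions for illustration. In practice an established implementation would be used, the penalty would be tuned (for example by cross-validation), and the retained exposures would be passed to a second-stage model such as BKMR, which is not shown.

```python
def soft_threshold(value, penalty):
    """Shrink value towards zero; return 0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso_coordinate_descent(columns, y, penalty, n_sweeps=100):
    """Minimise (1/2n)*||y - Xb||^2 + penalty*||b||_1 by coordinate descent.

    columns: list of centred predictor columns; y: centred outcome.
    """
    n, p = len(y), len(columns)
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j, xj in enumerate(columns):
            # Partial residual: remove the fit of every predictor except j.
            partial = [
                y[i] - sum(beta[k] * columns[k][i] for k in range(p) if k != j)
                for i in range(n)
            ]
            rho = sum(xj[i] * partial[i] for i in range(n)) / n
            scale = sum(v * v for v in xj) / n
            beta[j] = soft_threshold(rho, penalty) / scale
    return beta

# Hypothetical centred exposures for four participants.
names = ["chem_a", "chem_b", "chem_c"]
columns = [
    [1.0, -1.0, 1.0, -1.0],
    [1.0, 1.0, -1.0, -1.0],
    [1.0, -1.0, -1.0, 1.0],
]
# Outcome driven strongly by chem_a and only weakly by chem_c.
outcome = [2.1, -2.1, 1.9, -1.9]

beta = lasso_coordinate_descent(columns, outcome, penalty=0.5)
selected = [name for name, b in zip(names, beta) if b != 0.0]
print(selected)
```

The penalty sets weak coefficients exactly to zero, which is what makes LASSO usable as a filter, but it also shrinks the coefficients that survive, so the screened set rather than the shrunken estimates is what the later stage should rely on.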
Ensemble learning methods like gradient boosting (e.g., XGBoost, LightGBM, CatBoost) and adaptive boosting (AdaBoost) build highly predictive models and are especially effective for structured data with correlated features (see, for example, Argyri et al., 2024; Guimbaud et al., 2024, and references therein). However, a significant challenge with machine learning models lies in their lack of interpretability. For example, models like gradient boosting, which consist of numerous decision trees, make it difficult to determine how individual features contribute to a specific prediction. To address this limitation, Shapley Additive Explanations (SHAP) has emerged as a valuable tool, offering clear and detailed explanations of ML predictions (Lundberg, 2017). Ultimately, the choice of method should reflect the dimensionality of the data, sample size, confounding and research objectives. As environmental health data grow in complexity, so does the relevance of machine learning for analysis and interpretation.
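The attribution idea underlying SHAP can be shown with an exact Shapley value calculation on a toy model. The sketch below does not use the SHAP library: it enumerates every ordering of the features, which is only feasible for a handful of them, whereas SHAP relies on model-specific or sampling-based approximations. The risk model, its coefficients and the chemical names are hypothetical.

```python
from itertools import permutations

def shapley_values(model, instance, baseline):
    """Exact Shapley attribution of model(instance) - model(baseline).

    Averages each feature's marginal contribution over every ordering in
    which features are switched from baseline to observed values.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in ordering:
            current[name] = instance[name]
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_risk_model(x):
    """Hypothetical risk score with a synergistic term between two chemicals."""
    return 2.0 * x["chem_a"] + 1.0 * x["chem_b"] + 3.0 * x["chem_a"] * x["chem_b"]

instance = {"chem_a": 1.0, "chem_b": 1.0}   # participant exposed to both
baseline = {"chem_a": 0.0, "chem_b": 0.0}   # unexposed reference

phi = shapley_values(toy_risk_model, instance, baseline)
print(phi)
```

The attributions add up to the difference between the prediction for the participant and for the reference, and the synergistic term is split equally between the two chemicals. That additivity is convenient, but it also means an interaction is not reported separately unless interaction values are computed explicitly.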
Uncovering mechanisms: -omics and molecular pathways in the shift towards causal inference
As research on plastic chemicals advances, there is an increasing emphasis on the need to identify robust causal evidence linking these exposures to human health outcomes. While correlations between exposure and health effects may represent genuine associations, they do not inherently demonstrate causation. A promising strategy to strengthen causal inference involves the integration of evidence from multiple approaches, each with distinct and unrelated sources of potential bias – a method referred to as triangulation of evidence. For instance, combining observational epidemiological data with experimental or quasi-experimental approaches, such as toxicological studies, in vitro systems and mechanistic models, facilitates the identification of biologically plausible pathways linking plastic exposure to health outcomes. Similarly, the integration of ‘multi-omics’ technologies, encompassing comprehensive biological domains such as genomics, transcriptomics and proteomics, offers a robust framework for uncovering novel causal mediators of disease and revealing biological insights that may remain undetected through single-omics analyses (Hasin et al., 2017; Karczewski and Snyder, 2018; Miller and Consortium, 2025). Since each method operates under unique assumptions and exhibits distinct strengths and limitations, consistent findings across diverse methodologies significantly reduce the likelihood of artefactual results, thereby strengthening the inference of causality (see Figure 1).

Figure 1. Unlike the traditional epidemiological approach (represented by the grey arrow), which examines the relationship between a single exposure and a disease outcome while many critical aspects of the biological pathway remain uncovered, hybrid epidemiology (depicted by the purple arrow) adopts a more comprehensive view. The hybrid epidemiology approach considers the broader exposome and examines how factors such as genetic predisposition, molecular mechanisms, environmental factors and lifestyle variations within individuals can modulate this effect. Hybrid epidemiology goes a step further by integrating multiple analytical approaches and techniques – including machine learning, multi-omics, advanced statistical methods and animal or cell models – to triangulate and identify the most parsimonious effect of exposure on disease aetiology.
The approach of uncovering the biological pathways linking exposures to outcomes has gained significant attention in Europe and is the focus of ongoing research through initiatives such as the Horizon 2020 ENDpoiNTs project (Lupu et al., 2020) and the Psychiatric Disorders and Comorbidities Caused by Pollution in the Mediterranean Area (PsyCoMed) project. These projects combine scientific expertise in endocrine disruption and developmental neurotoxicity with advanced in silico and in vitro tools, innovative experimental approaches and sophisticated biostatistical analyses of human epidemiological and biomonitoring data. Their aim is to uncover both correlative and causal links between neurodevelopmental outcomes and endocrine pathways, addressing not only well-studied targets like estrogen, androgen and thyroid systems but also lesser-known pathways such as the retinoic acid system. Symeonides et al. (2024) employed a multimodal approach, combining human observational studies and preclinical mouse models, to investigate the link between prenatal bisphenol A (BPA) exposure and autism spectrum disorder (ASD) in males. In the human cohort, BPA exposure was associated with increased ASD symptoms and diagnosis in males with low aromatase activity, driven by BPA-induced hypermethylation of the CYP19A1 brain promoter, with over three-fold increased risk of ASD symptoms at age 2 and six-fold increased risk of diagnosis at age 9 (Symeonides et al., 2024). In mouse models, BPA exposure and aromatase knockout led to ASD-like behaviours, amygdala hypoactivation and brain alterations. In vitro experiments confirmed BPA’s suppression of aromatase in neuronal cells. These findings provide insights into how prenatal BPA disrupts aromatase signalling, causing changes characteristic of ASD in males.
To inform policy effectively, broader research covering a wider range of health outcomes beyond neurodevelopment is needed (Symeonides et al., 2024). Further, incorporating mixture analyses into triangulation methods is crucial, as interactions between individual EDCs can amplify their effects. For example, BPA has been shown to enhance oestrogen receptor expression, increasing cellular vulnerability to other EDCs (Hayes et al., 2016; Hamid et al., 2021). This complicates the assessment of combined effects based on individual exposures (Kortenkamp, 2008; J. M. Braun et al., 2016). Misattributing health effects to a single exposure rather than to a correlated harmful exposure carries significant regulatory consequences (Caporale et al., 2022; Duh-Leong et al., 2023). Policies built on such misattribution can increase global human exposure to plastic chemicals, resulting in considerable public health issues, disability and economic costs (the societal burden of chemical exposure is estimated at US$340 billion annually; see, for example, Duh-Leong et al., 2023). We therefore advocate for the integration of triangulation methods with mixture modelling to more accurately reflect real-world exposure scenarios. This combined approach enhances the identification of the most harmful chemicals within mixtures and provides insights into whether their effects are additive, antagonistic or synergistic – thus supporting more precise and actionable interventions.
The applicability of this hybrid epidemiological framework is particularly relevant in areas where environmental exposures act through diverse, interacting biological pathways, including endocrine disruption, neurodevelopment and immune modulation. In this context, the eight hallmarks of environmental insults proposed by Peters et al. (2021), namely oxidative stress and inflammation, genomic alterations and mutations, epigenetic alterations, mitochondrial dysfunction, endocrine disruption, altered intercellular communication, altered microbiome communities and impaired nervous system function, offer a valuable conceptual scaffold. Integrating multi-omics and mechanistic evidence with these hallmarks enables researchers to better identify plausible biological pathways and prioritize environmental exposures for regulatory action.
Additionally, the integration of advanced biomolecular techniques, such as genomics, proteomics and transcriptomics, plays a key role in enhancing the biological plausibility of exposure–disease associations and strengthening causal evidence. For instance, utilizing genetic data can effectively reduce confounding in epidemiological research through methods like pedigree-based analyses (e.g., sibling comparisons) and genetically informed approaches (e.g., polygenic scoring, genetic pathway score functions and Mendelian randomization). This capability to identify genotype-specific or individual responses to environmental exposures is crucial to unravelling gene–environment interactions (see, for example, Tanner et al., 2022; Elagali et al., 2025), as the aetiology of many diseases involves intricate interplays between genetic susceptibilities and environmental factors (Manrai et al., 2017; Vermeulen et al., 2020; Chang et al., 2024). Advancing this understanding is important for the development of predictive gene–environment interaction (GxE) models, with transformative potential in robust chemical regulation and large-scale public health monitoring (Motsinger-Reif et al., 2024).
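As one illustration of a genetically informed approach, the simplest Mendelian randomization estimator, the Wald ratio, divides a variant's association with the outcome by its association with the exposure. The sketch below uses hypothetical summary statistics and a first-order standard error that ignores uncertainty in the variant–exposure association. It also takes for granted the usual instrument assumptions (the variant is associated with the exposure, is independent of confounders and affects the outcome only through the exposure), which would need to be justified in any real analysis.

```python
def wald_ratio(beta_gene_exposure, beta_gene_outcome, se_gene_outcome):
    """Wald ratio estimate of the exposure-outcome effect from one variant.

    The standard error is a first-order approximation that ignores
    uncertainty in the gene-exposure association.
    """
    if beta_gene_exposure == 0:
        raise ValueError("the variant must be associated with the exposure")
    estimate = beta_gene_outcome / beta_gene_exposure
    std_error = se_gene_outcome / abs(beta_gene_exposure)
    return estimate, std_error

# Hypothetical per-allele summary statistics for a single variant.
estimate, std_error = wald_ratio(
    beta_gene_exposure=0.20,   # change in exposure biomarker per allele
    beta_gene_outcome=0.05,    # change in outcome per allele
    se_gene_outcome=0.02,
)
print(f"effect per unit exposure: {estimate:.2f} (SE {std_error:.2f})")
```

With several independent variants, the per-variant ratios are usually combined (for example by inverse-variance weighting), which is beyond this sketch.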
Further, recent advances in single-cell omics and human organoid modelling provide powerful opportunities to validate and contextualize associations identified through epidemiological studies (Cuomo et al., 2023; Farbehi et al., 2024; Caporale et al., 2025). For example, single-cell expression quantitative trait loci (eQTL) mapping reveals how genetic variation influences gene expression in specific cell types or states, helping identify where environmental exposures exert their effects. Likewise, organoid systems derived from human stem cells offer controlled platforms to model tissue-specific responses to plastic chemicals and test hypotheses about developmental disruption, endocrine signalling or neurotoxicity (Caporale et al., 2025). These tools strengthen causal inference by enabling experimental testing of epidemiologically derived hypotheses in human-relevant models and are a key complement to the hybrid epidemiology framework. By incorporating these technologies, researchers can bridge the gap between population-level associations and cell-level mechanisms, ultimately producing more translatable, policy-relevant insights (Caporale et al., 2022).
Limitations
While hybrid epidemiological approaches offer promising pathways for strengthening causal inference, several methodological and practical challenges remain. Data availability and accessibility are key limitations, particularly for high-resolution multi-omics and single-cell datasets, which are often derived from small cohorts and may be restricted due to privacy, consent or regulatory concerns. Even when accessible, harmonizing data across studies is difficult due to differences in sample processing, measurement platforms and annotation standards, hindering integrative and meta-analytic analyses.
Modelling the health effects of chemical mixtures also presents significant challenges. Although methods like WQSR, BKMR and elastic net are useful, several of them (for example, WQSR and elastic net fitted without explicit interaction terms) assume additive effects, despite the likelihood of synergistic or antagonistic interactions. Even with methods that can accommodate interactions, such as BKMR, accurately identifying and quantifying them typically requires large sample sizes, which in turn lead to substantial financial costs – a major limiting factor. These financial constraints are compounded by the need for substantial biospecimen volumes (e.g., blood, urine or plasma), which can be especially difficult to obtain from vulnerable populations such as neonates or young children. As the field progresses, overcoming data and modelling limitations will be essential to bridge existing gaps in our understanding of how environmental exposures affect human health.
Conclusion
Although observational epidemiological studies have provided important information on the health impacts of exposure to plastic chemicals, it is time to move beyond simple ‘black-box’ associations. Hybrid epidemiology, which combines mixture analyses, the underpinning -omics and molecular mechanisms, and machine learning, offers the opportunity to examine causal evidence, uncover the underlying biological pathways and understand the complex interactions between multiple chemical exposures. This shift is crucial for making informed decisions that protect human health in the face of increasing environmental challenges. By accounting for the totality of exposures and integrating mechanistic evidence, we can more effectively understand and prevent disease, ultimately improving population health, quality of life and economic outcomes such as Gross Domestic Product (GDP; Landrigan, 2017).
Open peer review
To view the open peer review materials for this article, please visit http://doi.org/10.1017/plc.2025.10011.
Data availability statement
No data were used for the research described in this article.
Acknowledgements
All authors made a substantial contribution to the conception, preparation, writing and editing of the manuscript. All authors reviewed the manuscript before submission.
Competing interests
The authors declare no competing interests.
Financial statement
This research is funded by Minderoo Foundation (Australia), an independent not-for-profit philanthropic organization. Neither the foundation nor its benefactors had any influence on the planning or conduct of this work.