The governance of federated learning: a decision framework for organisational archetypes

Tom Barbereau; Joaquin Delgado Fernandez; Sergio Potenciano Menci

doi:10.1017/dap.2025.10020

The governance of federated learning: a decision framework for organisational archetypes

Published online by Cambridge University Press: 28 July 2025

Tom Barbereau

Joaquin Delgado Fernandez

and

Sergio Potenciano Menci

Show author details

Tom Barbereau*: Affiliation:
Dutch Organization for Applied Scientific Research (TNO), The Hague, The Netherlands Institute for Information Law (IViR), University of Amsterdam, Amsterdam, The Netherlands
Joaquin Delgado Fernandez: Affiliation:
Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg , Esch-sur-Alzette, Luxembourg
Sergio Potenciano Menci: Affiliation:
Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg , Esch-sur-Alzette, Luxembourg
*: Corresponding author: Tom Barbereau; Email: tom.barbereau@tno.nl

Article contents

Abstract
Policy Significance Statement
Introduction
Background
Towards a framework of common layers
Organisational archetypes
Discussion
Outlook
Data availability statement
Author contribution
Funding statement
Competing interests
Ethical standard
Footnotes
References

Abstract

Federated learning (FL) is a machine learning technique that distributes model training to multiple clients while allowing clients to keep their data local. Although the technique allows one to break free from data silos keeping data local, to coordinate such distributed training, it requires an orchestrator, usually a central server. Consequently, organisational issues of governance might arise and hinder its adoption in both competitive and collaborative markets for data. In particular, the question of how to govern FL applications is recurring for practitioners. This research commentary addresses this important issue by inductively proposing a layered decision framework to derive organisational archetypes for FL’s governance. The inductive approach is based on an expert workshop and post-workshop interviews with specialists and practitioners, as well as the consideration of real-world applications. Our proposed framework assumes decision-making occurs within a black box that contains three formal layers: data market, infrastructure, and ownership. Our framework allows us to map organisational archetypes ex-ante. We identify two key archetypes: consortia for collaborative markets and in-house deployment for competitive settings. We conclude by providing managerial implications and proposing research directions that are especially relevant to interdisciplinary and cross-sectional disciplines, including organisational and administrative science, information systems research, and engineering.

Keywords

archetypes artificial intelligence federated learning Governance

Information

Type: Commentary
Information: Data & Policy , Volume 7 , 2025 , e53

DOI: https://doi.org/10.1017/dap.2025.10020 [Opens in a new window]
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press

Policy Significance Statement

This commentary proposes a framework that identifies decision-making layers leading to an organisational archetype for governance in federated learning. What type of entity controls the server in such systems and orchestrates clients is a pending question that hinders adoption. The proposed framework allows for the ex-ante selection of the appropriate organisation for each setting. We apply the framework on the basis of real-world applications.

1. Introduction

The integration of artificial intelligence (AI) is set to bring about unprecedented changes to industries, public institutions, and civil society (Dwivedi et al., Reference Dwivedi, Hughes, Ismagilova, Aarts, Coombs, Crick, Duan, Dwivedi, Edwards, Eirug, Galanos, Ilavarasan, Janssen, Jones, Kar, Kizgin, Kronemann, Lal and Lucini2021). However, data of sufficient quality and quantity are not always available. It is costly to acquire, requires advanced capabilities to process, and is often restricted to organisational boundaries due to competition or regulation (Winter et al., Reference Winter, Berente, Howison and Butler2014; Jordan & Mitchell, Reference Jordan and Mitchell2015; Berente et al., Reference Berente, Gu, Recker and Santhanam2021). These questions become especially relevant when data could leave the boundaries of a singular organisation and collaboration occurs, because it is here, where the “adoption of federated learning (FL) […] is expected to have a catalytic impact towards precision” (Sheller et al., Reference Sheller, Edwards, Anthony Reina, Martin and Bakas2019).

Introduced in 2016, FL addresses concerns related to data sharing (Konečný et al., Reference Konečný, McMahan, Ramage and Richtárik2016). FL is a machine learning (ML) technique that allows for the training of a model across multiple decentralised devices or servers holding local data samples without exchanging them (so-called clients). In other words, data remain in its original location, within organisational boundaries or on a device. The model learns from the data locally and sends the updates back to (usually) a central server. That server aggregates these updates to improve the global model. This local learning and aggregation process is iterative until a performant model is trained and distributed. The processes can be improved by the addition of privacy-enhancing methods and technologies (Truong et al., Reference Truong, Sun, Wang, Guitton and Guo2021). In sum, FL allows organisations to collaborate in training a model without sharing data across their own boundaries. Given this feat, beyond the device-centric applications of Google (see Hard et al., Reference Hard, Rao, Mathews, Ramaswamy, Beaufays, Augenstein, Eichner, Kiddon and Ramage2018), FL finds promising application in (secondary) health data use in oncology (Dayan et al., Reference Dayan, Roth, Zhong, Harouni, Gentili, Abidin, Liu, Costa, Wood, Tsai, Wang, Hsu, Lee, Ruan, Xu, Wu, Huang, Kitamura and Lacey2021; Pati et al., Reference Pati, Baid, Edwards, Sheller, Wang, Reina, Foley, Gruzdev, Karkada, Davatzikos, Sako, Ghodasara, Bilello, Mohan, Vollmuth, Brugnara, Preetha, Sahm and Maier-Hein2022, in energy demand and short-term load forecasting (Fernández et al., Reference Fernández, Potenciano Menci, Lee, Rieger and Fridgen2022; Fernández et al., Reference Fernández, Potenciano Menci and Pavic2023), in governmental data sharing for predictions (Amard et al., Reference Amard, Fernandez, Barbereau and Fridgen2023; Sprenkamp et al., Reference Sprenkamp, Fernández, Eckhardt and Zavolokina2024), in financial credit risk assessment and fraud prevention (Lee et al., Reference Lee, Fernández, Potenciano Menci, Rieger and Fridgen2023; Fernández et al., Reference Fernández, Barbereau, Baim, Rieger, Fridgen, Guggenberger, Sedlmeir and Urbach2024), as well as numerous other domains.

However, whoever controls the central server in the training process wields considerable power. That is because the party acts as the orchestrator of the learning process and coordinates between the training clients (Pati et al., Reference Pati, Baid, Edwards, Sheller, Wang, Reina, Foley, Gruzdev, Karkada, Davatzikos, Sako, Ghodasara, Bilello, Mohan, Vollmuth, Brugnara, Preetha, Sahm and Maier-Hein2022; Bujotzek et al., Reference Bujotzek, Akünal, Denner, Neher, Zenk, Frodl, Jaiswal, Kim, Krekiehn, Nickel, Ruppel, Both, Döllinger, Opitz, Persigehl, Kleesiek, Penzkofer, Maier-Hein, Bucher and Braren2024). It is also responsible for maintaining regulatory requirements, upholding data privacy, and system security during the entire learning process. The orchestrator must ensure that the learning process is fair and that the global model is not biased towards any particular type of data or device. It may also have purview over questions of commercialisation and ownership in the form of intellectual property (IP) rights. Broadly, in the study of AI more generally, these responsibilities fall under what is understood as questions of “governance” (Berente et al., Reference Berente, Gu, Recker and Santhanam2021).

Good governance is essential for organisations dealing with technology (Weill & Ross, Reference Weill and Ross2004). AI is no exception to it: governance is a necessary prerequisite to reap sustained benefits (Berente et al., Reference Berente, Gu, Recker and Santhanam2021; Zhang, Reference Zhang2023). Governance is about aligning the affordances of a technology or data with organisational goals; and because this alignment is context-based on asset and organisation, governance is never a one-size-fits-all (Weber et al., Reference Weber, Otto and Österle2009; Khatri & Brown, Reference Khatri and Brown2010). In the case of FL, and to be expected given the relatively low level of adoption, the knowledge base on its governance is scarce. Critical questions towards its adoption arise: how to convince clients to participate? Who should control the aggregation process? How to arrange ownership of the trained model? What type of organisation orchestrates?

This commentary engages with these questions. In particular, it considers what type of organisation orchestrates FL, and what common decision-making layers lead to the choice for that organisation. Given the scarcity of knowledge, we approach this question ex-ante and rely on knowledge gathered by hosting a workshop on the subject (Storvang et al., Reference Storvang, Mortensen and Clarke2017) and discussing with experts (Mergel et al., Reference Mergel, Edelmann and Haug2019).Footnote ¹ In total, we convened with 15 experts from the health, financial, and energy sectors to discuss questions related to decision-making in the governance of FL. The result of this process, presented in Section 3, yields a layered decision framework for the organisational governance of FL. In Section 4, we apply this framework to real-world applications—notably, Federated Tumor Segmentation in healthcare (Pati et al., Reference Pati, Baid, Edwards, Sheller, Wang, Reina, Foley, Gruzdev, Karkada, Davatzikos, Sako, Ghodasara, Bilello, Mohan, Vollmuth, Brugnara, Preetha, Sahm and Maier-Hein2022), where we identify the consortia archetype, and data aggregation by Google (Hard et al., Reference Hard, Rao, Mathews, Ramaswamy, Beaufays, Augenstein, Eichner, Kiddon and Ramage2018), where we identify the in-house archetype—and delineate their limitations. Finally, in Section 5, we discuss and propose future research directions that are relevant to interdisciplinary and cross-sectional disciplines. Specifically, we foresee researchers engaging with technological developments in FL and considering their implications for governance.

2. Background

In FL, the peculiarity is that while training takes place at the edges, a central orchestrating entity is still present. The presence of that entity has technical and organisational implications.

In technical terms, the orchestrator is responsible for the central server, or rather the training process, which aggregates updates from decentralised clients that locally process data, and then distributes the combined model back to these clients. Thus, the orchestrator will receive the model trained by the clients (or the delta) from the previous iteration and aggregate it before transmitting it back to the clients (McMahan et al., Reference McMahan, Moore, Ramage, Hampson and y Arcas2017).Footnote ² Consequently, the orchestrator plays a crucial role, as it ensures the integration of diverse, local insights into a global model without directly sharing or aggregating data. Doing so requires data curation, standardisation of reference labels, and the formalisation of workflows, among othersFootnote ³. Furthermore, the orchestrator is also responsible for the aggregation mechanism, is in control of the update schedule, and handles security and privacy aspects (Yin et al., Reference Yin, Zhu and Hu2021). Thereby, it holds the capability to affect the accuracy, bias, and overall integrity of the system. The organisation governing the central server wields significant power (Pati et al., Reference Pati, Baid, Edwards, Sheller, Wang, Reina, Foley, Gruzdev, Karkada, Davatzikos, Sako, Ghodasara, Bilello, Mohan, Vollmuth, Brugnara, Preetha, Sahm and Maier-Hein2022; Bujotzek et al., Reference Bujotzek, Akünal, Denner, Neher, Zenk, Frodl, Jaiswal, Kim, Krekiehn, Nickel, Ruppel, Both, Döllinger, Opitz, Persigehl, Kleesiek, Penzkofer, Maier-Hein, Bucher and Braren2024).

From an organisational perspective, in the case that FL is collaborative and outside of the boundaries of a single organisation (as is the case in, e.g., applications in healthcare under a consortium; Pati et al., Reference Pati, Baid, Edwards, Sheller, Wang, Reina, Foley, Gruzdev, Karkada, Davatzikos, Sako, Ghodasara, Bilello, Mohan, Vollmuth, Brugnara, Preetha, Sahm and Maier-Hein2022), the “structure” acting as the orchestrator must be trustworthy. This organisation—beyond technical tasks described previously—is also tasked with aligning participating clients to a set of agreed-upon rules and contracts, settling disputes, and ensuring the financial sustainability of the collaboration. These points each consider the “locus of accountability”—that is, here, some type of organisation, the one “who makes the decision” (Khatri & Brown, Reference Khatri and Brown2010). All participants must trust this accountable organisation before any collaboration and sharing of assets occur (Berente et al., Reference Berente, Gu, Recker and Santhanam2021; Pati et al., Reference Pati, Baid, Edwards, Sheller, Wang, Reina, Foley, Gruzdev, Karkada, Davatzikos, Sako, Ghodasara, Bilello, Mohan, Vollmuth, Brugnara, Preetha, Sahm and Maier-Hein2022; Bujotzek et al., Reference Bujotzek, Akünal, Denner, Neher, Zenk, Frodl, Jaiswal, Kim, Krekiehn, Nickel, Ruppel, Both, Döllinger, Opitz, Persigehl, Kleesiek, Penzkofer, Maier-Hein, Bucher and Braren2024). This is essential in order to create an environment where all parties are willing to collaborate in the short term, and benefit from shared insights or financial gains in the long term. An alternative case is when FL is deployed inside the boundaries of an accountable organisation, in which case governance is done in-house (see, e.g., Hard et al., Reference Hard, Rao, Mathews, Ramaswamy, Beaufays, Augenstein, Eichner, Kiddon and Ramage2018).

At last, it is noteworthy that the organisational structure for governance must be designed according to the context in which FL is trained. Technology governance is never a one-size-fits-all and depends on the asset plus organisational structure at hand (Brown & Grant, Reference Brown and Grant2005; Weber et al., Reference Weber, Otto and Österle2009; Khatri & Brown, Reference Khatri and Brown2010). In the case of training FL across devices by a single organisation, governance typically happens in-house (see McMahan et al., Reference McMahan, Moore, Ramage, Hampson and y Arcas2017; Hard et al., Reference Hard, Rao, Mathews, Ramaswamy, Beaufays, Augenstein, Eichner, Kiddon and Ramage2018). When between organisations, in a non-competitive and research-centric market for data, a consortium might be the right fit (see Mateus et al., Reference Mateus, Moonen, Beran, Jaarsma, van der Landen, Heuvelink, Birhanu, Harms, Bron, Wolters, Cats, Mei, Oomens, Jansen, Schram, Dekker and Bermejo2024). Alternatively, a single party—such as a university or non-profit—could be trusted to act altruistically as a third party and reap shared rewards (see Pati et al., Reference Pati, Baid, Edwards, Sheller, Wang, Reina, Foley, Gruzdev, Karkada, Davatzikos, Sako, Ghodasara, Bilello, Mohan, Vollmuth, Brugnara, Preetha, Sahm and Maier-Hein2022).

Taking past understandings on designing data governance and decision domains (Khatri & Brown, Reference Khatri and Brown2010), we formulate a set of non-exhaustive questions related to and informing model governance (see Table 1).Footnote ⁴ They are categorised in terms of decision domains appropriated from governance literature. Formally, Weill and Ross (Reference Weill and Ross2004) propose that IT governance design includes five major decision domains—principles, architecture, infrastructure, application needs, and investment and prioritisation. Khatri and Brown (Reference Khatri and Brown2010) adapt these to data governance to consider principles, quality, metadata, access, and the life cycle. We adhere to and adapt the latter as we consider data to be the core asset.

Table 1. Decision domains and questions tailored to FL governance, based on Khatri and Brown (Reference Khatri and Brown2010)

^a The questions formulated by domain are non-exhaustive.

^b Privacy enhancing techniques (see Yin et al., Reference Yin, Zhu and Hu2021).

The practical orientation of this article compels us to consider the framing of governance by Pati et al. (Reference Pati, Baid, Edwards, Sheller, Wang, Reina, Foley, Gruzdev, Karkada, Davatzikos, Sako, Ghodasara, Bilello, Mohan, Vollmuth, Brugnara, Preetha, Sahm and Maier-Hein2022). Their study presents results from training an FL model for tumour detection involving data from 71 sites across 6 continents. Here, governance is referred to as (1) the definition of the problem statement and (2) the coordination with the collaborating sites. In consideration of this framing, the scope of this commentary is on the latter: the coordination (i.e., model governance) with collaborators. The goal of this commentary is to advance understanding of the common layers leading to an organisational archetype for the governance of FL. After delineating assumptions, describing our inductive approach, we propose the framework and its layers. At last, we apply the model to real-world cases.

3. Towards a framework of common layers

3.1. Assumptions

The development of information systems is a process that is constructivist (Weigl et al., Reference Weigl, Barbereau and Fridgen2023). Decision-making about technology in organisations involves actors that are involved in a “set of games” (Mintzberg, Reference Mintzberg1983) and “processes of getting commitment” (Keen, Reference Keen1981). It is to be expected that actors who contributed more data to a model, a crucial piece to the model, or who simply have a more dominant market share can (ab)-use their “positional advantage” (Hård, Reference Hård1993) in negotiations over technology. In particular, this is expected to be the case in competitive data markets (see Lee et al., Reference Lee, Fernández, Potenciano Menci, Rieger and Fridgen2023). Because the set of games involves actors whose motivations may be “hidden” (Grover et al., Reference Grover, Lederer and Sabherwal1988), the assumption is that decisions over governance occur in a black box.

Conceptually, one may open this black box from a constructivist perspective. Given constraints in the technical domain, technology’s affordances are interpreted flexibility by different actors. These actors make decisions within some institutional context, based on the organisational status quo (operating procedures, culture, etc.) and environmental pressures (regulation, financing, etc.) (Weigl et al., Reference Weigl, Barbereau and Fridgen2023). Although we find value in the described perspective, given the practical orientation of this commentary, an alternative, inductive lens is put forward.

We assume that it is never possible to fully deconstruct the black box of decision-making (Winner, Reference Winner1993). As each context is different, generalisation of any deconstruction would be limited (Weber et al., Reference Weber, Otto and Österle2009). Given this assumption and the consequent limits to generalisation, based on our process of information gathering, we instead provide a framework that considers three common layers—quasi, context factors—that lead to an organisational archetype for the governance of FL.

3.2. Inductive approach

The reflection and layered framework we propose is inductively developed by following principles of conceptual research (Jaakkola, Reference Jaakkola2020; Makowski, Reference Makowski2021). In September 2024, two of the three co-authors attended the 22nd International Conference on Intelligent Systems Applications to Power Systems in Budapest, Hungary. There, they discussed organisational archetypes for the governance of FL as part of a themed panel and brought back input for the creation of a framework. Workshops are commonly used as an inductive approach to research (Storvang et al., Reference Storvang, Mortensen and Clarke2017). The workshop included academics and practitioners active in the field of FL. Some were operationally involved in the deployment of solutions; others were active in more fundamental research.

Between September and August 2024, we gathered post-workshop reflections and feedback with FL specialists from various industries (n.b. energy, finance, and defence). They are considered agents with implicit and factual knowledge about processes and decisions (Mergel et al., Reference Mergel, Edelmann and Haug2019). Among authors and with some of the specialists, we repeatedly met to review and iterate (Klein & Myers, Reference Klein and Myers1999) until the final framework.

After the framework was developed, we performed an ex-ante application of it to a set of real-world applications identified using a modified version of an online open-access tool to do automated searches in different databases of academic literature (Gerloff, Reference Gerloff2022). We primarily consulted Association for Computing Machinery (ACM), Institute of Electrical and Electronics Engineers (IEEE), and Scopus for technical literature, as well as PubMed, given that the main empirical results stem from there. We also consulted non-peer-reviewed academic works from arXiv, Medrxiv, and Biorxiv, although we do not refer to these specifically.

3.3. Layered framework

The result of this process of information gathering and iteration is the formulation of a conceptual framework of common layers to derive organisational archetypes for the governance of FL (Figure 1). The three common layers at play when deciding upon the locus of accountability and corresponding organisational archetype are (1) the nature of the data market (spectrum of collaborative–competitive), (2) the FL model’s architecture (spectrum of centralised–decentralised), and (3) the model’s ownership (spectrum of proprietary–shared). The result is archetypes.

Figure 1. A layered framework to derive organisational archetypes for the governance of FL. The line represents the virtual decision-making process throughout each of the layers, resulting in a selected archetype.

While in no order of priority, there is a temporal element to these layers such that the initial consideration is done with regard to the market, followed by architectural design choices, and, at last, the definition of the ownership structure. We subsequently discuss each of the three common layers individually. Thereafter, we put forward a set of non-exhaustive questions to help decide on the archetype.

3.3.1. Data market

First, stakeholders must evaluate the data market in which they are operating. There are four types of data markets: many-to-many, one-to-many, many-to-one, and one-to-one (Driessen et al., Reference Driessen, Monsieur and Van Den2022). However, in stylised economic terms, we can simplify markets for data and place them on a spectrum between competitive and collaborative markets (Fernández, Reference Fernández2023). Competitive markets are characterised by organisations that guard their data closely to maintain a competitive edge. Legal restrictions and other market pressures also play a role in the limits of data sharing. Organisations will avoid sharing data that could potentially benefit their rivals, as their competitive advantage hinges on exclusive access to valuable data resources (Kearns & Lederer, Reference Kearns and Lederer2004).

However, organisations have begun to shift towards collaborative business models and technologies to maximise returns. At once, data markets can be “inherently” collaborative such that organisations actively work together to share data, research on projects, and reap mutual benefits (Spiekermann, Reference Spiekermann2019). Examples include hospitals and research centres collaborating to create comprehensive datasets for better patient care.Footnote ⁵ Other organisations, such as banks and insurance companies, will traditionally compete, although they will look to collaborate to address a shared problem and reap joint rewards (Lee et al., Reference Lee, Fernández, Potenciano Menci, Rieger and Fridgen2023).

3.3.2. Infrastructure

Second, stakeholders must decide upon the system architecture. FL presents two main architectures (Figure 2): centralised and decentralised (Yin et al., Reference Yin, Zhu and Hu2021). The former involves a set of clients and an orchestration agent (central server) for collaborative training of an ML model. The process consists of three phases as follows: preparation, training and learning, and use. Clients prepare their data, agree on the ML model to train, and select initialisation parameters. The orchestrator then selects a subset of clients to download, train, and share their model parameters. This process iterates until the model reaches the agreed objective, usually a certain performance level or learning threshold, although it could be a limit in rounds or training expenses. This architecture has trade-offs, including privacy concerns, a single point of failure (the server), communication overhead, scalability issues, security risks, and fairness challenges.

Figure 2. Conceptual architectures for FL.

Decentralised FL operates differently and overcomes some of these trade-offs. It requires a protocol for peer-to-peer information sharing and a logic architecture for learning orchestration. The same three phases persist: prepare, train and learn, and use. By removing the dependency on a central coordination agent and distributing responsibility among clients, decentralised FL fosters a more resilient and efficient collaboration (Martínez Beltrán et al., Reference Martínez Beltrán, Pérez, Sánchez, Bernal, Bovet, Pérez, Pérez and Celdrán2023). It also removes the sharing of potentially compromising data with a coordination agent, avoids a single point of failure, and aims to improve fairness by shifting importance from the client’s data to every client, preventing clients with large datasets from dominating the learning process (Witt et al., Reference Witt, Heyer, Toyoda, Samek and Li2023).

In certain scenarios (involving highly sensitive or private data), an alternative take on the centralised architecture emerges, such that the physical location of the training process changes. Under specific conditions, the federation may choose to aggregate the model at a trusted third party (such as a European body like eu-LISA or a market authority) that has no economic interest in the outcome but is responsible for ensuring the correct functioning of the system. For instance, in a fraud detection federation across banks (Yang et al., Reference Yang, Zhang, Ye, Li and Xu2019), a financial market regulator could oversee the training process to ensure it is conducted properly, without having any interest in the model or its outputs.

Besides physical solutions to trust and ownership, researchers have implemented technological solutions. To uphold privacy, mechanisms like differential privacy add noise to obscure individual data contributions and mitigate reconstruction risks (Fu et al., Reference Fu, Hong, Ling, Wang, Ran, Sun, Wang, Chen and Cao2024). Homomorphic encryption enables computations directly on encrypted data, preserving confidentiality during aggregation without sacrificing model utility (Jin et al., Reference Jin, Yao, Han, Joe-Wong, Ravi, Avestimehr and He2023). Similarly, secure multi-party computation protocols allow for collaborative model training while ensuring no individual party’s data are exposed (Kaminaga et al., Reference Kaminaga, Awaysheh, Alawadi and Kamm2023).

To protect against adversarial behaviour, especially model poisoning, anomaly detection techniques have been developed to identify irregular or malicious updates (Vucovich et al., Reference Vucovich, Tarcar, Rebelo, Gade, Porwal, Rahman, Redino, Choi, Nandakumar and Schiller2022). Trust and reputation-based systems can further reinforce robustness by weighting client updates according to their historical reliability and contribution quality (Rashid et al., Reference Rashid, Xiang, Uddin, Tang, Sood and Gao2025). Furthermore, research has proposed to integrate blockchain-based solutions, such as smart contracts (Cassano et al., Reference Cassano, D’Abramo, Munir and Ferretti2024), and non-technical approaches utilising legal instruments like contracts and independent audits to ensure ethical, legal, and technical robustness in systems, thereby holding providers accountable (OECD, 2016).

Other emerging methods seek to enhance fairness and operational efficiency. Fairness-aware aggregation frameworks attempt to balance the influence of clients with heterogeneous data volumes, ensuring equitable model contributions (Ezzeldin et al., Reference Ezzeldin, Yan, He, Ferrara and Avestimehr2023), while model compression techniques reduce the communication burden associated with transmitting large updates, thereby broadening participation to include clients with limited resources (Yang et al., Reference Yang, Xiao, Motta, Beaufays, Mathews and Chen2022). Collectively, these techniques enable the development of federated systems that are not only decentralised but also more secure, private, and equitable.

3.3.3. Ownership

Finally, the stakeholders need to address questions about ownership. In the context of AI, these questions are manifold. Post-GenAI, the body of works considering the ownership of outputs of an AI has grown (see Tzimas, Reference Tzimas2021). Conceptually, we pragmatically consider ownership as who owns the IP rights to the trained model and how the benefits derived are distributed (Berente et al., Reference Berente, Gu, Recker and Santhanam2021). The benefits can be both financial in the commercialisation of the model or practical in its application.

The IP can be shared or in whole; correspondingly, ownership can be shared equally among parties, be distributed among parties, or be owned by one single party. In the former, parties can—for example, in the European Union—draw up a so-called Research and Development Agreement before development (European Commission, 2023). The agreement covers (1) IP ownership and access rights, (2) registration protocols, (3) exploitation strategy (in the form of licences), and (4) IP management in case of termination of the agreement. An alternative form of shared ownership can be formalised by the creation of a joint venture or consortium, where two or more independent organisations undertake a specific project together and share the IP accordingly. Here too, for both types of organisations, agreements typically cover the same four points (see also WIPO, 2005). By and large, IP rights define the ownership settings and access to the model. For example, whereas in-house governance will allow to fully leverage the IP rights, sharing IP as part of a consortium may put limits on commercialisation.

3.3.4. Archetype

Due to the limited number of scaled applications, the knowledge on resulting archetypes is scarce. However, we expect one to use our framework and for one archetype to emerge as a result of decisions taken along each layer of the framework. To assist practitioners in identifying the matching archetype, we propose a set of guiding questions along the layers (see Table 2). Effectively, the way one answers these questions will determine internal development choices, ultimately leading to a specific archetype. In the future, the patterns for which archetypes are more popular in which context may become clearer. We anticipate that context, application domains, verticals, country-specific regulations, and other variables to play a significant role in shaping them.

Table 2. Guiding questions by layer of our proposed framework

^a The questions formulated by layer are non-exhaustive. We see these as starting point for discussion.

4. Organisational archetypes

4.1. Application of the framework

In Table 3, we consider a limited set of real-world applications through the lens of our framework. It is noteworthy that we only selected applications that are mature and have progressed beyond initial proofs of concept, prototypes, or small-scale pilots and are actively utilised in real-world environments, of which, according to the UK’s Department for Science, Innovation, and Technology (2024), there are relatively few of them. We expected this to be the case given the general, risk-averse sentiment towards FL adoption (Müller et al., Reference Müller, Zahn and Matthes2024).

Table 3. Real-world applications of FL through the lens of the layered framework

^a We estimate the application’s technology readiness levels (see NASA, 2023) on the basis of available documentation.

^b Coll, collaboration; Comp, competition.

^c Dec, decentralised; Cen, centralised.

We note that the majority of applications are in the healthcare sector. This was to be expected (Rieke et al., Reference Rieke, Hancox, Li, Milletarì, Roth, Albarqouni, Bakas, Galtier, Landman, Maier-Hein, Ourselin, Sheller, Summers, Trask, Xu, Baust and Cardoso2020). In terms of archetypes, we observe that in collaborative data markets with decentralised infrastructure and shared IP ownership, the consortium appears as an evident organisational archetype. Popularised in the information technology industry at the turn of the century, consortia are a form of alliance between multiple organisations that collaborate on a shared objective while maintaining their independence (see Hawkins, Reference Hawkins1999). Literature has shown that such collaborative structure mitigates the risks of monopolisation, enhances collective innovation, and promotes fair competition by aligning incentives toward mutual benefit rather than market dominance (Doz et al., Reference Doz, Olk and Ring2000). In the context of FL, the archetype helps to balance power dynamics by ensuring that decision-making, resource allocation, and IP rights are distributed more equitably among stakeholders (Bujotzek et al., Reference Bujotzek, Akünal, Denner, Neher, Zenk, Frodl, Jaiswal, Kim, Krekiehn, Nickel, Ruppel, Both, Döllinger, Opitz, Persigehl, Kleesiek, Penzkofer, Maier-Hein, Bucher and Braren2024; Mateus et al., Reference Mateus, Moonen, Beran, Jaarsma, van der Landen, Heuvelink, Birhanu, Harms, Bron, Wolters, Cats, Mei, Oomens, Jansen, Schram, Dekker and Bermejo2024).

The second archetype we see emerging is labelled as “in-house.” Here, the application is Gboard, Google’s virtual keyboard, which utilises FL to improve predictive text and typing suggestions while maintaining user privacy. Instead of sending raw typing data to centralised servers, Gboard trains directly on users’ devices, aggregating only the necessary updates to improve the overall model (Hard et al., Reference Hard, Rao, Mathews, Ramaswamy, Beaufays, Augenstein, Eichner, Kiddon and Ramage2018). In terms of governance, the model principles and access (refer to Table 1) largely define and point towards a centralised, proprietary setting.

Beyond the two archetypes observed “in the wild,” we foresee other archetypes to emerge. One of these is the joint venture whereby independent organisations form a new organisation under which model training occurs or is coordinated, by which ownership is equitably shared and typically commercialised. This archetype is opposed to consortia, where it is typically not commercialised.Footnote ⁶ The conditions for such a form of alliance are relationship and firm-specific (see Pateli & Lioukas, Reference Pateli and Lioukas2011). Another archetype we foresee to be of interest is to delegate orchestration to a third party. This could be, for example, a financial market authority or central bank (Lee et al., Reference Lee, Fernández, Potenciano Menci, Rieger and Fridgen2023). A more speculative, experimental archetype is to delegate orchestration to a decentralised autonomous organisation (Majeed et al., Reference Majeed, Hassan, Han and Hong2023).

4.2. Limitations

The proposed framework is subject to limitations. For instance, when derived inductively, here using a workshop and reflections, there are limits to the generalisability of the conceptual contribution due to the context-specific nature of the data collection (Storvang et al., Reference Storvang, Mortensen and Clarke2017; Jaakkola, Reference Jaakkola2020; Makowski, Reference Makowski2021). We addressed this limitation by application to some real-world cases, an established method for ex-ante validation, and by delineating boundary conditions in the form of assumptions (Busse et al., Reference Busse, Kach and Wagner2016).

Additionally, the subjective interpretations of the researchers and the potential for bias in data collection and analysis further constrain the applicability of the findings (Klein & Myers, Reference Klein and Myers1999). We addressed this limitation through additional conversations and iterations with experts. Still, while our frameworks can provide conceptually rich understandings, their validity and reliability across different settings and populations remains limited, necessitating further validation and adaptation to enhance generalisability. As FL models mature, we see this as a future research direction.

5. Discussion

While writing this commentary, and in the discussions with respective experts working on FL, we encountered three distinct points that are worthwhile to discuss in the context of governance. All three are of organisational relevance; for all three, supportive literature is scarce. Therefore, we view each as an avenue for future research.

First, because the value of contributions in FL decreases as the number of participants increases (what we describe as the learning threshold), collaboration is at odds with utility and individual return. One key challenge to FL is that the model training can reach a ceiling of accuracy (Ω) after a certain number of participants ( $ n $ ) contributed. In other words, as more participants join and contribute, the model becomes more accurate, and for each new user joining at the time ( $ n $ +1), the accuracy peaks and the utility of the contribution decrease (see Figure 3). This may indeed lead to unintended outcomes. It may be that when Ω is reached, the sustainability of the collaboration is in jeopardy; those who contributed first may argue that the value of those who contributed later is lesser (Lee et al., Reference Lee, Fernández, Potenciano Menci, Rieger and Fridgen2023). The orchestrator must define clear participation and ownership (reward) structures.

Figure 3. Stylised visualisation of model accuracy versus contributors. The intersection between $ \Omega $ and $ n $ represents the learning threshold.

The second point is that who initiates the collaboration is typically in control. Checks and balances within the system are not only a question of contracts and agreements; it is also about who initiates the use case in the first place. Similar to the power plays in the formation of consortia, the initiator of an FL collaboration largely defines what kind of governance structure to set up and what the relationship is between participants. It may be that, in a commercial example, the initiator opts to create a vendor lock-in and abuse their market position (“winner-takes-all”). To do so, it would reduce the amount of power end users have by, among others, reducing or eliminating the orchestrator as a whole. Valuable lessons may be learned from literature on consortia formation and the power dynamics at play (Doz et al., Reference Doz, Olk and Ring2000; Sakakibara, Reference Sakakibara2002).

At last, in our conversations and the workshop, experts flagged private companies’ inherent aversion towards the adoption of FL due to their awareness of associated risk (see also, Müller et al., Reference Müller, Zahn and Matthes2024). On the one hand, the adoption of FL demands certain organisational capabilities—it requires necessary (technical) skills, governance structures, and communication channels between participants (Bujotzek et al., Reference Bujotzek, Akünal, Denner, Neher, Zenk, Frodl, Jaiswal, Kim, Krekiehn, Nickel, Ruppel, Both, Döllinger, Opitz, Persigehl, Kleesiek, Penzkofer, Maier-Hein, Bucher and Braren2024). On the other hand, companies adopting FL are—at worst—exposing themselves to market risks and cybersecurity risks in the form of data leakages, predatory behaviour, and so forth. At best, they need to trust an orchestrator. Either way, experts pointed out that FL may be in the limbo of a chicken-and-egg problem. This is sensible given that distributed technologies commonly face this problem (Drasch et al., Reference Drasch, Fridgen, Manner-Romberg, Nolting and Radszuwill2020).

6. Outlook

As with any emerging technology, FL faces the dynamics of typical technology adoption life cycles—be it Gartner’s hype cycle or others. While deterministic by nature (Pollock & Williams, Reference Pollock and Williams2010), these frameworks can provide a useful lens to estimate the stage of development for FL.

The academic corpus surrounding FL appears to be entering a stage of consolidation, marked by an increasing number of early adopters seeking to understand, explore, and advance the technology. These efforts are projected into the first real-world implementations in the healthcare sector (Pati et al., Reference Pati, Baid, Edwards, Sheller, Wang, Reina, Foley, Gruzdev, Karkada, Davatzikos, Sako, Ghodasara, Bilello, Mohan, Vollmuth, Brugnara, Preetha, Sahm and Maier-Hein2022; Bujotzek et al., Reference Bujotzek, Akünal, Denner, Neher, Zenk, Frodl, Jaiswal, Kim, Krekiehn, Nickel, Ruppel, Both, Döllinger, Opitz, Persigehl, Kleesiek, Penzkofer, Maier-Hein, Bucher and Braren2024; Mateus et al., Reference Mateus, Moonen, Beran, Jaarsma, van der Landen, Heuvelink, Birhanu, Harms, Bron, Wolters, Cats, Mei, Oomens, Jansen, Schram, Dekker and Bermejo2024). These examples illustrate how real-world applications are progressing beyond purely technical aspects, in finding answers to organisational questions. Their success relies on interdisciplinary and cross-domain collaboration (van Drumpt et al., Reference van Drumpt, Timan, Talie, Veugen and van de Burgwal2024).

By contrast, applications outside of the healthcare sector remain underdeveloped, often lagging in the depth and breadth of available research. This disparity suggests that while research in other sectors may be maturing, real-world adoption is still in its infancy, requiring more innovators and organisations to bridge the gap. Consequently, as also seen in the healthcare sector, we recommend to decision-makers and policy-makers that interdisciplinary and cross-sectional collaboration is key for FL’s adoption. The uptake of use cases requires more than technical expertise.

Research opportunities will continue to emerge as both industry and academia explore the implementation of FL across sectors. As collaboration markets emerge and evolve, it is essential to examine the technical advancements in FL and the governance concepts that will accompany them. For instance, we envision that organisational frameworks must be developed to address long-term and short-term adoption strategies. Short-term strategies are particularly relevant in the context of FL, as the emergence of short-term federations may necessitate agile organisational structures that enhance governance’s dynamism and account for such contingencies (see Sambamurthy & Zmud, Reference Sambamurthy and Zmud1999). In line with these developments, a clear research opportunity building from our theoretical framework could lead to the creation of a decision tree to guide organisations in the selection process of an organisational archetype from a practical perspective. We foresee managers and C-level executives within organisations to be the main counterparties. Moreover, we also foresee the necessity of understanding the trade-offs; in other words, the benefits and drawbacks of certain decisions when in the context of an FL organisation.

Additional research opportunities in governance arise as the global “building boom” of high-performance computers and data centres is kept alive (Weise, Reference Weise2025). Specifically, beyond those owned and operated by Big Tech, we see these facilities as providing on-demand computational resources and being non-competitive, trustworthy orchestrators for the training of FL. Still, they also require governance frameworks to establish the rules based on the type of archetype they will interact with. However, given the preliminary nature of our approach and the lack of post-data analysis, a significant question remains uncertain: will there be a multitude of different archetypes, or will they converge?

Data availability statement

The authors confirm that all data generated or analysed during this study are included in this published article.

Author contribution

Conceptualisation: T.B., J.D.F., and S.P.M. Formal analysis: T.B., J.D.F., and S.P.M. Investigation: T.B., J.D.F., and S.P.M. Methodology: T.B., J.D.F., and S.P.M. Writing—original draft: T.B., J.D.F., and S.P.M. Writing – review and editing: T.B., J.D.F., and S.P.M. All authors approved the final submitted draft.

Funding statement

T.B. is supported by TNO under the Early Research Program “Next Generation Cryptography.” J.D.F. and S.P.M. are supported by the Luxembourg National Research Fund (FNR) and PayPal, PEARL grant reference 13,342,933/Gilbert Fridgen, and by FNR grant reference HPC BRIDGES/2022_Phase2/17886330/DELPHI. For the purpose of open access and in fulfilling the obligations arising from the grant agreement, the authors have applied a Creative Commons Attribution 4.0 International (CC BY 4.0) license to any Author Accepted Manuscript version arising from this submission.

Competing interests

The authors declare none.

Ethical standard

The research meets all ethical guidelines, including adherence to the legal requirements of the study country.

Footnotes

¹ Additionally, we ought to mention that all authors are actively working on applied FL projects.

² Usually, this aggregation is a weighted arithmetic mean or a simple arithmetic mean.

³ By this, we understand the standardisation of model outputs, labels the models are trained against, and standardisation of the weights across the clients.

⁴ The technical and organisational implications are accounted for as part of the formulated questions.

⁵ Admittedly, they are now more restricted in doing so, given privacy regulations (Price & Cohen, Reference Price and Cohen2019).

⁶ For a comparison between a consortium and an equity joint venture, see Costa et al. (Reference Costa, Silva and Oliveira2017).

References

Amard, A, Fernandez, JD, Barbereau, T and Fridgen, G (2023) Federated learning in migration forecasting. In Proceedings of the 44th International Conference on System Sciences. https://aisel.aisnet.org/treos_icis2023/23/Google Scholar

Berente, N, Gu, B, Recker, J and Santhanam, R (2021) Managing artifical intelligence. MIS Quarterly, 45(3), 1433–1450. https://aisel.aisnet.org/misq/vol45/iss3/16/Google Scholar

Brown, AE and Grant, GG (2005) Framing the frameworks: A review of IT governance research. Communications of the AIS 15(1), 38. https://doi.org/10.17705/1CAIS.01538.Google Scholar

Bujotzek, MR, Akünal, Ü, Denner, S, Neher, P, Zenk, M, Frodl, E, Jaiswal, A, Kim, M, Krekiehn, NR, Nickel, M, Ruppel, R, Both, M, Döllinger, F, Opitz, M, Persigehl, T, Kleesiek, J, Penzkofer, T, Maier-Hein, K, Bucher, A and Braren, R (2024) Real-world federated learning in radiology: Hurdles to overcome and benefits to gain. Journal of the American Medical Informatics Association 32(1), 193–205. https://doi.org/10.1093/jamia/ocae259.CrossRef Google Scholar

Busse, C, Kach, AP and Wagner, SM (2016) Boundary conditions: What they are, how to explore them, why we need them, and when to consider them. Organizational Research Methods 20(4), 574–609. https://doi.org/10.1177/1094428116641191.CrossRef Google Scholar

Cassano, L, D’Abramo, J, Munir, S and Ferretti, S (2024) Trust and Resilience in Federated Learning Through Smart Contracts Enabled Decentralized Systems. Available at https://arxiv.org/abs/2407.06862 10.1109/Blockchain62396.2024.00097CrossRef Google Scholar

Costa, e, Silva, S and Oliveira, SM (2017) Partner selection in international joint ventures: A framework for the analysis of factors relevant to the selection of partners. The Marketing Review 17(2), 199–215. https://doi.org/10.1362/146934717X14909733966182.Google Scholar

Dayan, I, Roth, HR, Zhong, A, Harouni, A, Gentili, A, Abidin, AZ, Liu, A, Costa, AB, Wood, BJ, Tsai, C-S, Wang, C-H, Hsu, C-N, Lee, CK, Ruan, P, Xu, D, Wu, D, Huang, E, Kitamura, FC, Lacey, G, et al. (2021) Federated learning for predicting clinical outcomes in patients with COVID-19. Nature Medicine 27, 1735–1743. https://doi.org/10.1038/s41591-021-01506-3.CrossRef Google Scholar

Department for Science, Innovation and Technology (2024) Repository of Privacy Enhancing Technologies (PETs) Use Cases. United Kingdom Government. Available at https://www.gov.uk/guidance/repository-of-privacy-enhancing-technologies-petsuse-cases Google Scholar

Doz, YL, Olk, PM and Ring, PS (2000) Formation processes of R&D consortia: Which path to take? Where does it lead? Strategic Management Journal 21(3), 239–266. https://doi.org/10.1002/(SICI)1097-0266(200003)21:3<239::AIDSMJ97>3.0.CO;2-K.3.0.CO;2-K>CrossRef Google Scholar

Drasch, BJ, Fridgen, G, Manner-Romberg, T, Nolting, FM and Radszuwill, S (2020) The token’s secret: The two-faced financial incentive of the token economy. Electronic Markets 30(3), 557–567. https://doi.org/10.1007/s12525-020-00412-9.CrossRef Google Scholar

Driessen, SW, Monsieur, G and Van Den, W-J (2022) Data market design: A systematic literature review. IEEE Access 10, 33123–33153. https://doi.org/10.1109/ACCESS.2022.3161478.CrossRef Google Scholar

Dwivedi, YK, Hughes, L, Ismagilova, E, Aarts, G, Coombs, C, Crick, T, Duan, Y, Dwivedi, R, Edwards, J, Eirug, A, Galanos, V, Ilavarasan, PV, Janssen, M, Jones, P, Kar, AK, Kizgin, H, Kronemann, B, Lal, B, Lucini, B, et al. (2021) Artificial intelligence (AI): Multidisciplinary perspectives on emerging challenges, opportunities, and agenda for research, practice and policy. International Journal of Information Management 57, 101994. https://doi.org/10.1016/j.ijinfomgt.2019.08.002.CrossRef Google Scholar

European Commission (2023) Commission Regulation (EU) 2023/1066 of 1 June 2023 on the Application of Article 101(3) of the Treaty on the Functioning of the European Union to Certain Categories of Research and Development Agreements. Official Journal of the European Union. Available at http://data.europa.eu/eli/reg/2023/1066/oj Google Scholar

Ezzeldin, YH, Yan, S, He, C, Ferrara, E and Avestimehr, S (2023) Fairfed: Enabling group fairness in federated learning. Proceedings of the AAAI Conference on Artificial Intelligence 37(6), 7494–7502.10.1609/aaai.v37i6.25911CrossRef Google Scholar

Fernández, JD (2023) Breaking Data Silos with Federated Learning [Doctoral dissertation, University of Luxembourg]. https://hdl.handle.net/10993/57042 Google Scholar

Fernández, JD, Barbereau, T, Baim, R and Rieger, A (2024) Opportunities and applications of federated learning in the financial services industry. In Fridgen, G, Guggenberger, T, Sedlmeir, J, Urbach, N (eds), Decentralization Technologies. Financial Innovation and Technology. Springer, Cham. https://doi.org/10.1007/978-3-031-66047-4_11Google Scholar

Fernández, JD, Potenciano Menci, S, Lee, CM, Rieger, A and Fridgen, G (2022) Privacy-preserving federated learning for residential short-term load forecasting. Applied Energy 326, 119915. https://doi.org/10.1016/j.apenergy.2022.119915.CrossRef Google Scholar

Fernández, JD, Potenciano Menci, S and Pavic, I (2023) Towards a peer-to-peer residential short-term load forecasting with federated learning. In 2023 IEEE Belgrade PowerTech. pp. 1–6. https://doi.org/10.1109/PowerTech55446.2023.10202782.CrossRef Google Scholar

Fu, J, Hong, Y, Ling, X, Wang, L, Ran, X, Sun, Z, Wang, WH, Chen, Z and Cao, Y (2024) Differentially private federated learning: A systematic review. arXiv preprint arXiv:2405.08299.Google Scholar

Gerloff, C (2022) Christiangerloff/set-you-free: First step (Version v0.1.0). Zenodo. https://doi.org/10.5281/zenodo.6907681CrossRef Google Scholar

Grover, V, Lederer, AL and Sabherwal, R (1988) Recognizing the politics of mis. Information & Management 14(3), 145–156. https://doi.org/10.1016/0378-7206(88)90005-5.CrossRef Google Scholar

Hård, M (1993) Beyond harmony and consensus: A social conflict approach to technology. Science, Technology, & Human Values 18 (3), 408–432. https://doi.org/10.1177/016224399301800402.CrossRef Google Scholar

Hard, A, Rao, K, Mathews, R, Ramaswamy, S, Beaufays, F, Augenstein, S, Eichner, H, Kiddon, C and Ramage, D (2018) Federated Learning for Mobile Keyboard Prediction. arXiv. https://doi.org/10.48550/arXiv.1811.03604CrossRef Google Scholar

Hawkins, R (1999) The rise of consortia in the information and communication technology industries: Emerging implications for policy. Telecommunications Policy 23(2), 159–173. https://doi.org/10.1016/S0308-5961(98)00085-8.CrossRef Google Scholar

Jaakkola, E (2020) Designing conceptual articles: Four approaches. AMS review 10(1), 18–26. https://doi.org/10.1007/s13162-020-00161-0.CrossRef Google Scholar

Jin, W, Yao, Y, Han, S, Joe-Wong, C, Ravi, S, Avestimehr, S and He, C (2023) Fedml-he: An efficient homomorphicencryption-based privacy-preserving federated learning system. arXiv preprint arXiv:2303.10837.Google Scholar

Jordan, MI and Mitchell, TM (2015) Machine learning: Trends, perspectives, and prospects. Science 349(6245), 255–260. https://doi.org/10.1126/science.aaa8415.CrossRef Google Scholar

Kaminaga, H, Awaysheh, FM, Alawadi, S and Kamm, L (2023) Mpcfl: Towards multi-party computation for secure federated learning aggregation. In Proceedings of the 16th IEEE/ACM International Conference on Utility and Cloud Computing (UCC ’23). Association for Computing Machinery, New York, NY, USA, 19, 1–10. https://doi.org/10.1145/3603166.3632144.Google Scholar

Kearns, GS and Lederer, AL (2004) The impact of industry contextual factors on IT focus and the use of IT for competitive advantage. Information & Management 41(7), 899–919. https://doi.org/10.1016/j.im.2003.08.018.CrossRef Google Scholar

Keen, PGW (1981) Information systems and organizational change. Communications of the ACM 24(1), 24–33. https://doi.org/10.1145/358527.358543.CrossRef Google Scholar

Khatri, V and Brown, CV (2010) Designing data governance. Communications of the ACM 53(1), 148–152. https://doi.org/10.1145/1629175.1629210.CrossRef Google Scholar

Klein, H and Myers, M (1999) A set of principles for conducting and evaluating interpretive field studies in information systems. MIS Quarterly 23(1), 67–93. https://doi.org/10.2307/249410.CrossRef Google Scholar

Konečný, J, McMahan, B, Ramage, D and Richtárik, P (2016) Federated optimization: Distributed machine learning for on-device intelligence. https://arxiv.org/abs/1610.05492.Google Scholar

Lee, CM, Fernández, JD, Potenciano Menci, S, Rieger, A and Fridgen, G (2023) Federated learning for credit risk assessment. In Proceedings of the 56th Hawaii International Conference on System Sciences. pp. 386–395. https://hdl.handle.net/10125/102676 10.24251/HICSS.2023.048CrossRef Google Scholar

Majeed, U, Hassan, SS, Han, Z and Hong, CS (2023) DAO-FL: Enabling Decentralized Input and Output Verification in Federated Learning with Decentralized Autonomous Organizations. Authorea Preprints. https://doi.org/10.36227/techrxiv.24546502.v1CrossRef Google Scholar

Makowski, PT (2021) Optimizing concepts: Conceptual engineering in the field of management—The case of routines research. Academy of Management Review 46 (4), 702–724. https://doi.org/10.5465/amr.2019.0252?journalCode=amr.CrossRef Google Scholar

Martínez Beltrán, ET, Pérez, MQ, Sánchez, PMS, Bernal, SL, Bovet, G, Pérez, MG, Pérez, GM and Celdrán, AH (2023) Decentralized federated learning: Fundamentals, state of the art, frameworks, trends, and challenges. IEEE Communications Surveys & Tutorials 25(4), 2983–3013. https://doi.org/10.1109/COMST.2023.3315746.CrossRef Google Scholar

Mateus, P, Moonen, J, Beran, M, Jaarsma, E, van der Landen, SM, Heuvelink, J, Birhanu, M, Harms, AGJ, Bron, E, Wolters, FJ, Cats, D, Mei, H, Oomens, J, Jansen, W, Schram, MT, Dekker, A and Bermejo, I (2024) Data harmonization and federated learning for multi-cohort dementia research using the OMOP common data model: A Netherlands consortium of dementia cohorts case study. Journal of Biomedical Informatics 155, 104661. https://doi.org/10.1016/j.jbi.2024.104661.CrossRef Google Scholar

McMahan, B, Moore, E, Ramage, D, Hampson, S and y Arcas, BA (2017) Communication-efficient learning of deep networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), 54, pp. 1273–1282. Fort Lauderdale, Florida, USA. JMLR: W&CP. https://proceedings.mlr.press/v54/mcmahan17a/mcmahan17a.pdf Google Scholar

Mergel, I, Edelmann, N and Haug, N (2019) Defining digital transformation: Results from expert interviews. Government Information Quarterly 36(4), 101385. https://doi.org/10.1016/j.giq.2019.06.002.CrossRef Google Scholar

Mintzberg, H (1983) Power in and around Organizations. Prentice Hall: Hoboken, New Jersey, U.S. https://doi.org/10.1177/017084068400500419Google Scholar

Müller, T, Zahn, M and Matthes, F (2024) Revealing the impacting factors for the adoption of federated machine learning in organizations. In Proceedings of the 57th Hawaii International Conference on System Sciences. https://doi.org/10.24251/HICSS.2023.881CrossRef Google Scholar

NASA. (2023) Technology Readiness Levels—NASA. Available at https://www.nasa.gov/directorates/somd/space-communications-navigation-program/technology-readiness-levels Google Scholar

OECD (2016) Recommendation of the council for development co-operation actors on managing the risk of corruption. https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0431 Google Scholar

Pateli, A and Lioukas, S (2011) The choice of governance mode in ICT alliances: A property rights approach. Information & Management 48(1), 69–77. https://doi.org/10.1016/j.im.2011.01.002.CrossRef Google Scholar

Pati, S, Baid, U, Edwards, B, Sheller, M, Wang, S-H, Reina, GA, Foley, P, Gruzdev, A, Karkada, D, Davatzikos, C, Sako, C, Ghodasara, S, Bilello, M, Mohan, S, Vollmuth, P, Brugnara, G, Preetha, CJ, Sahm, F, Maier-Hein, K, et al. (2022) Federated learning enables big data for rare cancer boundary detection. Nature Communications 13(7346), 1–17. https://doi.org/10.1038/s41467-022-33407-5.CrossRef Google Scholar

Pollock, N and Williams, R (2010) The business of expectations: Howpromissory organizations shape technology and innovation. Social Studies of Science 40(4), 525–548. https://doi.org/10.1177/0306312710362275.CrossRef Google Scholar

Price, WN and Cohen, IG (2019) Privacy in the age of medical big data. Nature Medicine 25, 37–43. https://doi.org/10.1038/s41591-018-0272-7.CrossRef Google Scholar

Rashid, MM, Xiang, Y, Uddin, MP, Tang, J, Sood, K and Gao, L (2025) Trustworthy and fair federated learning via reputation-based consensus and adaptive incentives. IEEE Transactions on Information Forensics and Security 20, 2868–2882.10.1109/TIFS.2025.3546841CrossRef Google Scholar

Rieke, N, Hancox, J, Li, W, Milletarì, F, Roth, HR, Albarqouni, S, Bakas, S, Galtier, MN, Landman, BA, Maier-Hein, K, Ourselin, S, Sheller, M, Summers, RM, Trask, A, Xu, D, Baust, M and Cardoso, MJ (2020) The future of digital health with federated learning. npj Digital Medicine 3(119), 1–7. https://doi.org/10.1038/s41746-020-00323-1.CrossRef Google Scholar

Sakakibara, M (2002) Formation of R&D consortia: Industry and company effects. Strategic Management Journal 23(11), 1033–1050. https://doi.org/10.1002/smj.272.CrossRef Google Scholar

Sambamurthy, V and Zmud, RW (1999) Arrangements for information technology governance: A theory of multiple contingencies. MIS Quarterly 23(2), 261–290. https://doi.org/10.2307/249464.CrossRef Google Scholar

Sheller, M, Edwards, B, Anthony Reina, G, Martin, J and Bakas, S (2019) Federated learning in neuro-oncology for multi-institutional collaborations without sharing patient data. Neuro-Oncology, 21(Supplement 6), vi176–vi177. https://doi.org/10.1093/neuonc/noz175.737CrossRef Google Scholar

Spiekermann, M (2019) Data marketplaces: Trends and monetisation of data goods. Intereconomics 54(4), 208–216. https://doi.org/10.1007/s10272-019-0826-z.CrossRef Google Scholar

Sprenkamp, K, Fernández, JD, Eckhardt, S and Zavolokina, L (2024) Overcoming intergovernmental data sharing challenges with federated learning. Data & Policy 6, e27. https://doi.org/10.1017/dap.2024.19.CrossRef Google Scholar

Storvang, P, Mortensen, B and Clarke, AH (2017) Using workshops in business research: A framework to diagnose, plan, facilitate and analyze workshops. In Collaborative Research Design. Springer: Singapore. pp. 155–174. https://doi.org/10.1007/978-981-10-5008-4_7Google Scholar

Truong, N, Sun, K, Wang, S, Guitton, F and Guo, Y (2021) Privacy preservation in federated learning: An insightful survey from the GDPR perspective. Computers & Security 110, 102402. https://doi.org/10.1016/j.cose.2021.102402.CrossRef Google Scholar

Tzimas, T (2021) Legal and Ethical Challenges of Artificial Intelligence from an International Law Perspective. Springer International Publishing: Switzerland. https://doi.org/10.1007/978-3-030-78585-7CrossRef Google Scholar

van Drumpt, S, Timan, T, Talie, S, Veugen, T and van de Burgwal, L (2024) Digital transitions in healthcare: The need for transdisciplinary research to overcome barriers of privacy enhancing technologies uptake. Health and Technology 14(4), 709–723. https://doi.org/10.1007/s12553-024-00850-x.CrossRef Google Scholar

Vucovich, M, Tarcar, A, Rebelo, P, Gade, N, Porwal, R, Rahman, A, Redino, C, Choi, K, Nandakumar, D, Schiller, R, et al. (2022) Anomaly detection via federated learning. arXiv preprint arXiv:2210.06614.Google Scholar

Weber, K, Otto, B and Österle, H (2009) One size does not fit all–a contingency approach to data governance. Journal of Data and Information Quality 1(1), 1–27. https://doi.org/10.1145/1515693.1515696.CrossRef Google Scholar

Weigl, L, Barbereau, T and Fridgen, G (2023) The construction of self-sovereign identity: Extending the interpretive flexibility of technology towards institutions. Government Information Quarterly 40(4), 101873. https://doi.org/10.1016/j.giq.2023.101873.CrossRef Google Scholar

Weill, P and Ross, JW (2004) IT Governance: How Top Performers Manage IT Decision Rights for Superior Results. Harvard Business Press: New York City, New York, U.S.Google Scholar

Weise, K (2025) Big Tech keeps its A.I. Data center spending boom alive. New York Times. Available at https://www.nytimes.com/2025/02/08/technology/deepseek-data-centers-ai.html Google Scholar

Winner, L (1993) Upon opening the black box and finding it empty: Social constructivism and the philosophy of technology. Science, Technology, & Human Values 18(3), 362–378. https://doi.org/10.1177/016224399301800306.CrossRef Google Scholar

Winter, S, Berente, N, Howison, J and Butler, B (2014) Beyond the organizational ‘container’: Conceptualizing 21st century sociotechnicalwork. Information and Organization 24(4), 250–269. https://doi.org/10.1016/j.infoandorg.2014.10.003.CrossRef Google Scholar

WIPO (2005) Exchanging Value—Negotiating Technology Licensing Agreements: A Training Manual. World Intellectual Property Organisation. Available at https://www.wipo.int/edocs/pubdocs/en/licensing/906/wipo_pub_906.pdf Google Scholar

Witt, L, Heyer, M, Toyoda, K, Samek, W and Li, D (2023) Decentral and incentivized federated learning frameworks: A systematic literature review. IEEE Internet of Things Journal 10(4), 3642–3663. https://doi.org/10.1109/JIOT.2022.3231363.CrossRef Google Scholar

Yang, T-J, Xiao, Y, Motta, G, Beaufays, F, Mathews, R and Chen, M (2022) Online model compression for federated learning with large models. arXiv preprint arXiv:2205.03494.Google Scholar

Yang, W, Zhang, Y, Ye, K, Li, L and Xu, C-Z (2019) Ffd: A federated learning based method for credit card fraud detection. In BigData: 8th International Congress. pp. 18–32. https://doi.org/10.1007/978-3-030-23551-2_2CrossRef Google Scholar

Yin, X, Zhu, Y and Hu, J (2021) A comprehensive survey of privacy-preserving federated learning: A taxonomy, review, and future directions. ACM Computing Surveys 54(6), 1–36. https://doi.org/10.1145/3460427.CrossRef Google Scholar

Zhang, S (2023) Good governance essential for enterprises deploying AI. MIT Technology Review. Available at https://www.technologyreview.com/2023/07/18/1075972/good-governance-essential-for-enterprises-deploying-ai Google Scholar

Table 1. Decision domains and questions tailored to FL governance, based on Khatri and Brown (2010)

Figure 2. Conceptual architectures for FL.

Table 2. Guiding questions by layer of our proposed framework

Table 3. Real-world applications of FL through the lens of the layered framework

Figure 3. Stylised visualisation of model accuracy versus contributors. The intersection between $ \Omega $ and $ n $ represents the learning threshold.

Submit a response

Comments

No Comments have been published for this article.

Article contents

The governance of federated learning: a decision framework for organisational archetypes

Abstract

Keywords

Information

Policy Significance Statement

1. Introduction

2. Background

3. Towards a framework of common layers

3.1. Assumptions

3.2. Inductive approach

3.3. Layered framework

3.3.1. Data market

3.3.2. Infrastructure

3.3.3. Ownership

3.3.4. Archetype

4. Organisational archetypes

4.1. Application of the framework

4.2. Limitations

5. Discussion

6. Outlook

Data availability statement

Author contribution

Funding statement

Competing interests

Ethical standard

Footnotes

References

Comments

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests