Introduction
Intangible cultural heritage (ICH) is defined in the 2003 Convention for the Safeguarding of the Intangible Cultural Heritage (ICH Convention) as “the practices, representations, expressions, knowledge, skills … that communities, groups and, in some cases, individuals recognize as part of their cultural heritage.”Footnote 1 ICH is “transmitted from generation to generation [and] recreated by communities and groups in response to their environment, their interaction with nature and their history.”Footnote 2 The ICH Convention defines “safeguarding” as “measures aimed at ensuring the viability of the intangible cultural heritage, including the identification, documentation, research, preservation, protection, promotion, enhancement, transmission, particularly through formal and nonformal education, as well as the revitalization of the various aspects of such heritage.”Footnote 3 Despite its recognition and protection in international and domestic law worldwide,Footnote 4 ICH is increasingly at stake due to threats in connection with phenomena like globalization,Footnote 5 climate change,Footnote 6 and heritage recognition inequalities.Footnote 7 Indigenous ICH and traditional knowledge (TK)—including art techniques, music, oral history, and language—are particularly vulnerable to these threats.Footnote 8 Notably, Indigenous languages, which thanks to the historical reliance of Indigenous peoplesFootnote 9 on the intergenerational oral transmission of history, customs, and knowledge are closely interlinked with ICH and TK,Footnote 10 are all considered endangered,Footnote 11 with but few exceptions.Footnote 12 Thus, starting from the postulate that Indigenous cultures, heritage, and knowledge are “inseparable from Indigenous languages,”Footnote 13 this article analyzes the challenges that Indigenous communities face in safeguarding their ICH and TK,Footnote 14 particularly in relation to Indigenous languages, in the digital age.
Governments and international organizations are increasingly turning to digitization (that is, the “conversion, enhancement, or representation of analog or physical objects or real-world phenomena,”Footnote 15 including “text, music, images, and video,”Footnote 16 into digital format) to safeguard ICH. For instance, the Canadian Heritage Information Network (CHIN)Footnote 17 published a guide to assist museums, archives, and researchers in digitizing ICH.Footnote 18 The United Nations Educational, Scientific, and Cultural Organization (UNESCO) also promotes digitization projects, like the ICH digitization project in Bosnia and Herzegovina, which saw the use of digital technologies to preserve and document local communities’ cultural practices.Footnote 19 Another UNESCO-backed safeguarding project is the ongoing ICH digitization project in Kyrgyzstan to create an interactive multimedia digital platform to promote ICH awareness and protection in the country.Footnote 20 These initiatives suggest that the international community has generally received ICH digitization positively.
In line with the digitization trend, some Indigenous communities have been using digital tools, such as artificial intelligence (AI) and other digital technologies, to revitalize their languages, record TK, and preserve their ICH.Footnote 21 At the same time, ethical concerns have been raised, for instance, about the use of Indigenous data to train generative AI tools, especially in commercial activities.Footnote 22 Indigenous data, “born digital or not,” can be defined as “information, knowledge, specimens, and belongings about Indigenous Peoples or to that which they relate at both the individual and collective levels.”Footnote 23 When referring to Indigenous data, this article refers primarily to digital data related to Indigenous ICH and TK.Footnote 24
Corporations have shown interest in using Indigenous languages to create for-profit tools, even contacting Indigenous communities and proposing to pay them in exchange for speech recordings and other language-related data.Footnote 25 While ICH and TK digitization can raise several other concerns, such as freezingFootnote 26 and decontextualization,Footnote 27 this article examines the risks posed to Indigenous ICH and TK by the misappropriation and commercialization of Indigenous languages and data.
Thus, this article delves into a critical question: How can the control and agency of Indigenous communities over their cultures, languages, and knowledge be preserved in the context of Indigenous ICH and TK digitization projects?
The article is divided into three main sections. The findings of each of the first two sections will inform the proposal detailed in the third section. The first two sections open, after brief introductions, with two separate case studies. The first is set in Aotearoa (New Zealand),Footnote 28 and the second is set in Inuit Nunangat, more specifically Nunavut, Canada.Footnote 29
First, the article examines Indigenous communities’ and advocates’ concerns about the appropriation by corporate actors of their data in the context of commercial Indigenous language digitization projects. To support this analysis, the article refers to the Te Hiku Media case in New Zealand. This section highlights the shortcomings in the safeguards against the misappropriation and commercialization of Indigenous TK and ICH, with a focus on intellectual property rights (IPRs) and the dominant Western-centric intellectual property (IP) law system.Footnote 30 Data colonialism and Indigenous data sovereignty (ID-SOV) theories are then examined.
The second section examines public sector partnerships with private actors in Indigenous language digitization projects. This is illustrated by the case of the Government of Nunavut and Microsoft (GN–Microsoft) partnership to develop a free InuktitutFootnote 31 online translator. The article then analyzes the characteristics and dynamics of public–private partnerships (PPPs or P3s) in cultural heritage and other fields to identify both benefits and drawbacks of resorting to PPPs for Indigenous ICH and knowledge digitization projects. Finally, drawing on the observations from the previous two sections, the third section argues that ID-SOV principles, if implemented by design and by default, and the beneficial aspects of the PPP model can complement each other to address the appropriation threats to which Indigenous data, cultures, and knowledge are exposed in the digital arena.
Challenges to Indigenous data sovereignty
In light of the extinction risks faced by Indigenous languages, some Indigenous communities have been employing digital technologies in an attempt to preserve their cultural heritage and TK. All the while, these communities have sought to maintain or restore Indigenous control and stewardship over Indigenous cultural assets on the Internet.Footnote 32 For instance, while scholarship has found that resorting to data collection practicesFootnote 33 and Open Access contentFootnote 34 could contribute to ICH protection and dissemination, these practices may have adverse effects on Indigenous ICH and TK, “which have often, historically, been misappropriated and misused by powerful actors, including States and private companies.”Footnote 35 As such, these practices need nuanced restrictions to avoid abuse and misappropriation of Indigenous cultures.Footnote 36
With these considerations in mind, the following three subsections will examine the effects of data collection practices employed to train AI models,Footnote 37 such as large language models (LLMs),Footnote 38 on Indigenous ICH. First, we will examine the Te Hiku Media case, revealing the advantages and threats of private companies’ use of LLMs to preserve Indigenous languages. Then, we will argue that the legal protections that numerous scholars have suggested could provide safeguards for ICH and TK, specifically IPRs,Footnote 39 are incompatible with Indigenous ICH. At the same time, these same IPRs risk protecting corporate actors that misappropriate Indigenous ICH and TK. Finally, the third subsection will illustrate how the concept and principles of Indigenous data sovereignty may steer the digitization of Indigenous culture past the menace of data colonialism and, eventually, contribute to revitalizing that culture in the digital age.
The Te Hiku Media case
Kōrero MāoriFootnote 40 is a trailblazing Māori open-source app that collects oral recordings to train software to understand Indigenous languages through machine learning.Footnote 41 The app was developed by Te Hiku Media, a nonprofit organization in New Zealand. Te Hiku Media was founded in 1990, three years after the New Zealand government had finally declared te reo Māori (the language of the Māori people) an official language.Footnote 42 Since then, the organization has been devoted to revitalizing tikanga Māori (Māori cultural values) and te reo Māori, creating and broadcasting content in this Indigenous language for over 30 years.Footnote 43
In recent years, artificial intelligence (AI) has enabled technological breakthroughs in various fields.Footnote 44 The preservation and revitalization of endangered languages are no exception.Footnote 45 Along with this promising opportunity came challenges for communities whose languages are endangered or vulnerable. For example, since Indigenous languages are often spoken by a limited number of people, the small size of the market for Indigenous language services and products did not appeal to mainstream AI companies.Footnote 46 Observing this market gap, Te Hiku Media created Kōrero Māori, collecting vocal recordings in several Indigenous languages—including Māori and Hawaiian—to train its own AI models to understand these languages.Footnote 47
Around the time Te Hiku Media’s data team published Kōrero Māori’s first version, in 2018, a US-based corporation specializing in translation services, Lionbridge,Footnote 48 started offering $45 (US) an hour to several Māori academics and radio groups in exchange for te reo Māori audio recordings.Footnote 49 Other than te reo Māori, Lionbridge has also targeted the languages of Indigenous Hawaiians, Samoans, and other Indigenous communities in Canada and the United States.Footnote 50 Te Hiku Media turned Lionbridge’s offer down due to concerns about Indigenous data appropriation. According to Te Hiku Media’s general manager, Peter-Lucas Jones, the pursuit of profit is what drives these companies to Indigenous languages.Footnote 51 Thus, in pursuit of economic profit, private corporations are likely to prioritize the development of their proprietary software over Indigenous communities’ goal of safeguarding and revitalizing their languages.Footnote 52 Ironically, the main target audience of these products would be Indigenous peoples; hence, not only would they be these products’ true originators but also their primary clients. In this dynamic, corporations could freely extract a wealth of knowledge and money from Indigenous peoples.
In contrast to profit-oriented corporations, Te Hiku Media aims to ensure that the Māori and other Indigenous peoples can benefit from the technology developed through their languages. To that end, Te Hiku Media has developed the Kaitiakitanga license to protect all partnerships with external agencies.Footnote 53 The license ensures that (1) any collaborative projects using Māori data in the Kaitiakitanga license repository directly benefit Māori people and (2) works derived from the use of this data are bound by the license.Footnote 54 An important aspect of the Kaitiakitanga license is that it focuses on the role of data caretakers (kaitiaki)—rather than data owners—undertaken by Māori organizations, like its developer Te Hiku Media. As such, caretakers ensure that all uses of Māori data covered by this license are respectful of the data itself and “of the people from whom it descends.”Footnote 55
Te Hiku Media is a pioneer in digitizing Indigenous languages with AI tools, and its progress has appealed to other Indigenous communities with similar goals of language revitalization.Footnote 56 Of particular importance are Te Hiku Media’s precautionary actions, such as the creation of the Kaitiakitanga license, which provide a valuable example to other Indigenous communities aiming to preserve and promote their knowledge and culture.Footnote 57 Beyond language models, digitization has proven to be a promising approach to safeguarding ICH and TK, including Indigenous languages.Footnote 58 However, as the next subsection will address, existing legal frameworks, such as the current IP law regime, may hinder the beneficial aspects of digitization in safeguarding Indigenous ICH and TK.Footnote 59
Indigenous knowledge and intellectual property protection
Indigenous knowledge (or Indigenous TK) has no universal definition, with international instruments largely focusing on the term “‘traditional knowledge’ which goes beyond that which is referrable to Indigenous Peoples only.”Footnote 60 However, it is possible to identify recurring elements in national and international interpretations of Indigenous knowledge. For instance, according to the Government of Canada, Indigenous knowledge “describes complex knowledge systems embedded in the unique cultures, languages, values, and worldviews of Indigenous Peoples” and it “exists within Indigenous legal, political, and governance systems.”Footnote 61 The Canadian definition also recognizes the nuances of Indigenous knowledge and that “only Indigenous Knowledge holders, and Nations and communities are positioned to share their Indigenous Knowledge and provide guidance on its consideration.”Footnote 62 Other countries that exist on the ancestral lands of numerous Indigenous communities, such as the United StatesFootnote 63 and New Zealand,Footnote 64 have provided similar interpretations of Indigenous knowledge. Relevant elements to these definitions include the recognition of Indigenous knowledge’s connection to nature and land, as well as its intergenerational and ancestral development. In a similar light, according to UNESCO, “[l]ocal and indigenous knowledge refers to the understandings, skills and philosophies developed by societies with long histories of interaction with their natural surroundings. … This knowledge is integral to a cultural complex that also encompasses language, systems of classification, resource use practices, social interactions, ritual and spirituality.”Footnote 65 These definitions encompass all kinds of ICH created by and passed down from generation to generation within Indigenous communities.
The legal areas that address issues surrounding ICH include cultural heritage law and also IP law,Footnote 66 despite the latter not being originally envisioned to safeguard ICH. In particular, as Fiona MacMillan explains, while the “link” between cultural heritage law and IP law is “not recognized by the law,” it “exists in any case as a consequence of the overlapping application of these two regimes to certain artefacts.”Footnote 67 Similarly, scholars have found that IP law has an affinity to TK, recognizing how “[traditional] knowledge is remarkably similar to intellectual property rights.”Footnote 68 IPRs grant exclusive rights to owners of intangible properties, such as copyright, patents, and trademarks, entitling them to obtain commercial benefits from the use of their property. These legal protections allow IP law to promote creativity and innovation.
The link between IPRs, ICH, and TK is further illuminated by the work of the World Intellectual Property Organization (WIPO), which has been at the forefront of Indigenous TK and ICH safeguarding initiatives, promoting, for instance, grassroots-level support for Indigenous communities to develop “the practical tools and know-how necessary to use the existing IP system to best advantage.”Footnote 69 WIPO has also addressed the potentialityFootnote 70 of IP protection for TKFootnote 71 and traditional cultural expressions (TCEs),Footnote 72 categories which include Indigenous knowledge—that is, the TK of Indigenous peoples—and certain forms of ICH. However, IPRs do not protect most Indigenous ICH and TK.Footnote 73 As WIPO admits, “knowledge that has ancient roots and is often oral … is not protected by conventional intellectual property (IP) systems.”Footnote 74
Arguably, the incongruence between IP protection and Indigenous ICH and TK results in part from the tension between the individualistic and exclusive IPRs and the more collective and inclusive TK.Footnote 75 Western IPRs are limited to contemporary creations. Moreover, for specific IP protection systems to apply, certain requirements must be met. For instance, to be protected by copyright, a work must first be fixed in a tangible form (e.g., books, paintings, and works stored on digital devices).Footnote 76 Creators must be specific persons, either natural or legal, which excludes community and intergenerational attribution. Therefore, only some forms of TK and TCEs and their modern innovations qualify for IP protection. Selected examples are (1) copyrightable contemporary music and dance, (2) patentable pharmaceutical innovation, and (3) trademarks used for identifying authentic Indigenous arts.Footnote 77 As most Indigenous knowledge is developed through intergenerational and community contributions and is primarily transmitted orally, it often remains without IP protection.
While Indigenous knowledge, as well as TK and TCEs in general, can rarely be granted IPRs, some jurisdictions have implemented defensive protectionFootnote 78 to prevent any intangible property appropriating and misusing Indigenous knowledgeFootnote 79 from being protected by IPRs. For example, in New Zealand, two Māori advisory committees, one for trademarks and the other for patents, should advise their respective commissioners whether a trademark or an invention is opposed to Māori values.Footnote 80 Although New Zealand remains the only country that clearly stipulates defensive protection for Indigenous communities over their knowledge and culture,Footnote 81 in Australia, Indigenous entities can register certification marksFootnote 82 to oppose registered trademarks misappropriating Indigenous TK and TCEs.Footnote 83 While this legal framework was not specifically designed to protect Indigenous TK and TCEs, it is an alternative tool for Indigenous communities to use in case of commercial damage resulting from misappropriation.
Apart from trademark and patent protection,Footnote 84 copyright law can also provide some legal safeguards for Indigenous ICH.Footnote 85 However, copyright protections may only apply to a fraction of Indigenous ICH and knowledge. When examining the treatment of oral traditions, Pınar Oruç identifies different challenges that Indigenous communities might encounter in relation to copyright law. For instance, a central obstacle is represented by the documentation of Indigenous oral traditions by outsiders. Oruç proposes that “[w]hen the documentation is made by outsiders to the community, they become the authors of that particular fixation of the Indigenous oral traditions” and, in this instance, “copyright could emerge but provide control to the outsiders instead of to the [Indigenous] community.”Footnote 86
Both existing literature and the present analysis suggest that IP law does not provide satisfying protection for Indigenous ICH and TK, raising vulnerability concerns.Footnote 87 Concerns heighten further in those situations in which Indigenous ICH and TK are free to use, while their derivatives developed by non-Indigenous individuals or groups have IP protection.Footnote 88 The recognition of IPRs to non-Indigenous actors, such as private corporations, over derivative works further facilitates the exploitation and misappropriation of Indigenous heritage and knowledge. This protection discrepancy also risks hindering Indigenous communities’ use of their own ICH, as non-Indigenous actors’ IPRs might be considered to supersede Indigenous peoples’ rights to their own heritage. In the case of language models, since the software is protected by copyright law in various countries,Footnote 89 software based on Indigenous languages falls into the exclusive domain of copyright, whether the creators are Indigenous peoples or not. Just as Te Hiku Media warned, if Indigenous communities are not alert about non-Indigenous actors appropriating their data—for instance, Indigenous language audio recordings—these non-Indigenous actors may use Indigenous knowledge to profit from their target consumers: Indigenous peoples themselves.Footnote 90 These data-extractivist activities reveal exploitative patterns that echo the colonization of land and natural resources.Footnote 91 The anticolonial fight is now taking place in the digital arena, including in the domains of voice recognition, translation software, and language models.
Data colonialism and Indigenous data sovereignty
In recent years, activities of non-Indigenous private corporations using and profiting from Indigenous data have raised serious concerns over the data sovereignty of Indigenous communities. For instance, Te Hiku Media argued that Whisper, OpenAI’s recent speech recognition model, could constitute a new form of colonization.Footnote 92 OpenAI claims Whisper was trained with 1,381 hours of recordings of te reo Māori and 338 hours of ʻōlelo Hawaiʻi, yet the company fails to reveal the sources of such data.Footnote 93 Suspecting that the data was scraped from the Internet (not uncommon for AI model trainingFootnote 94), Te Hiku Media highlights a fundamental divergence: While Indigenous communities respect their data and look after it, Western corporations (so-called data colonizers) aim to establish ownership over it and economically benefit from Indigenous peoples.Footnote 95
The extraction, dispossession, and commodification of data have become prevalent on online platforms, with tech companies extracting data on the Internet to train their AI models for commercial purposes. While Indigenous data is exploited in the process of data mining, Indigenous peoples can rarely claim infringement the same way copyright owners have recently done to fight against AI companies’ unauthorized uses of their digitized copyrighted works.Footnote 96 Consequently, scholars and activists have started recognizing the phenomenon of so-called data colonialism. Nick Couldry and Ulises Mejias were the first to systematically discuss and define data colonialism as “[t]he extension of a global process of extraction that started under colonialism and continued through industrial capitalism, culminating in today’s new form: instead of natural resources and labor, what is now being appropriated is human life through its conversion into data.”Footnote 97
Following the lead of Couldry and Mejias, more scholarsFootnote 98 have highlighted the connection between capitalism and colonialism in the digital era.Footnote 99 In this dynamic, capitalism’s insatiable desire for profits leads to the incessant exploitation of data, including Indigenous data. Many categories of Indigenous data—such as health care data—have been the object of unauthorized extraction for various misuses.Footnote 100 The long-term systemic discrimination toward Indigenous peoples has resulted in a lack of resources to prevent data appropriation and exploitation. This makes Indigenous peoples among the most vulnerable to data colonialism.Footnote 101
In the face of increasing threats to their data, Indigenous peoples seek mechanisms to “encourage data collectors and users to be more aligned with Indigenous worldviews.”Footnote 102 This is the case of Te Hiku Media’s enduring resistance against private corporations’ encroachment on Indigenous data related to their culture and knowledge. Indigenous peoples have been proactively addressing data colonialism concerns, particularly by emphasizing their sovereignty over Indigenous data. In doctrinal works, Indigenous data sovereignty (ID-SOV) is widely accepted as “the rights of Indigenous Peoples to control the collection, access, analysis, interpretation, management, dissemination and reuse of Indigenous data.”Footnote 103 ID-SOV has developed into an Indigenous-led global movementFootnote 104 that empowers Indigenous peoples to control and have agency over the use of Indigenous data.
Several principles can be grasped from various authors’ interpretations and definitions of ID-SOV. Matthew Snipp presents three features of the control that ID-SOV affords to Indigenous peoples: (1) the power to determine who should belong to an Indigenous community and who should instead be excluded for data collection purposes (identity); (2) the collected data must reflect the values and priorities of Indigenous peoples (interest); and (3) the power to determine who has access to the collected data (access).Footnote 105 The overall control over the determination of identity, interest, and access relative to practices for collecting Indigenous data contributes to the concept of data ownership, which is, as illustrated throughout our article, a critical component of ID-SOV. Furthermore, scholars of Indigenous studies have proposed an Indigenous governance approach to systematizing ID-SOV principles. For instance, Indigenous governance expert Diane Smith understands ID-SOV as a governance matter and has designed a conceptual framework for Indigenous data governance in line with Snipp’s proposed three features.Footnote 106 The framework encompasses six principles and practices regarding how Indigenous peoples access, disseminate, use, monitor, interpret, maintain, and manage their data.Footnote 107 The efforts that Indigenous communities have made to gain sovereignty over their data are observable in practices worldwide.Footnote 108
Reflecting Indigenous peoples’ inherent self-determination, ID-SOV is arguably a form of Indigenous sovereignty recognized in the United Nations Declaration on the Rights of Indigenous Peoples (UNDRIP).Footnote 109 Although data sovereignty is not explicitly stipulated in UNDRIP, it fits the scope of the right to self-determination,Footnote 110 whereby Indigenous peoples “have an inherent right to be in control of their destinies and to create their own political and legal organisations.”Footnote 111 As such, ID-SOV scholars have characterized “Indigenous sovereignty over Indigenous data” as “an extension of Indigenous peoples’ fundamental right to self-determination.”Footnote 112 Additionally, the UN Special Rapporteur on the Right to Privacy recognized the relevance of ID-SOV in a crucial 2019 report providing guidance on the processing of health-related data.Footnote 113
Moreover, various Indigenous communities and advocacy groups have customized data governance practices to their specific contexts as part of the global, Indigenous-led ID-SOV movement.Footnote 114 For instance, in 2019, the Global Indigenous Data Alliance released the CARE principles (collective benefit, authority to control, responsibilities, and ethics) to complement open data and open science standards that were not originally designed to address Indigenous rights and interests over the use of Indigenous data.Footnote 115 The Māori people initiated a mechanism, Te Mana Raraunga, “to articulate, and advocate for, a wider set of Māori rights and interests in Māori data—that is, any data that is about or from Māori people, Māori language, culture, resources, or environments.”Footnote 116 In Canada, the First Nations principles of OCAP®Footnote 117 (ownership, control, access, and possession)Footnote 118 by the First Nations Information Governance Centre (FNIGC) assert that “First Nations alone have control over data collection processes in their communities, and that they own and control how this information can be stored, interpreted, used, or shared.”Footnote 119 The key aspect of ID-SOV, exhibited by various Indigenous communities, is control over their data. It is worth noting that the FNIGC registered the trademark for the OCAP® acronym to prevent its misuse and appropriation. Under circumstances where Western IP laws still dominate the protection of intangible goods, Indigenous peoples should be encouraged to make use of existing tools, such as trademarks, to benefit their interests. They should also have access to more compatible legal approaches, such as ID-SOV, to better preserve their culture.Footnote 120
Although ID-SOV principles emphasize Indigenous peoples’ self-determination rights over their data, they do not necessarily limit the uses of Indigenous data to Indigenous communities only. In fact, collaboration with non-Indigenous organizations that respect ID-SOV may produce positive results, provided it follows well-designed guidelines.
Public–private partnerships for Indigenous cultural heritage?
Building on the issues of data colonialism and the lack of protection for Indigenous ICH against corporate appropriation and commercialization, this section examines a potential alternative: public–private partnerships (PPPs). It investigates both positive and negative aspects of the project of the Government of Nunavut (Canada) and Microsoft to create translation tools for Indigenous languages. This analysis serves to emphasize both the potential benefits and drawbacks of PPPs focused on Indigenous ICH and knowledge digitization.
The Government of Nunavut and Microsoft partnership
In January 2021, Microsoft announced the addition of Inuktitut, one of the Inuit languages, to its Microsoft Translator apps, the Office package, and the Bing translator.Footnote 121 This project entails Microsoft’s collaboration with the Government of Nunavut (hereafter GN or “the Government”), which governs one of Canada’s three territories.Footnote 122 Nunavut’s population is about 80% Inuit,Footnote 123 Indigenous peoples whose ancestral lands are located in the Arctic and subarctic regions.
The GN–Microsoft partnership was strengthened after the 2019 ransomware attack on the computer infrastructure of Nunavut. In that instance, GN requested the support of Microsoft and its incident response team to implement new security measures and rebuild Nunavut’s computer infrastructure.Footnote 124 Since the ransomware events, GN–Microsoft collaborations have expanded and diversified to include language preservation projects. The latest collaboration with Microsoft came as part of GN’s ongoing efforts to preserve and promote Inuit languages.Footnote 125 Particularly, the Inuktitut preservation project can address three fundamental preservation goals. First, these translation tools can facilitate the use of Inuktitut by the Government itself, allowing public services to be increasingly available in Inuktitut and, eventually, other Inuit languages.Footnote 126 This is fundamental, especially for those who speak Inuit languages as their first or exclusive languages,Footnote 127 who may encounter difficulties accessing public services and information. Second, the Microsoft Translator project aims to guarantee access to Inuit languages to all Nunavut citizens, including Inuit and non-Inuit communities and businesses.Footnote 128 Broad access not only encourages non-Inuit communities to develop greater awareness of Inuit culture and languages but also allows further revitalization of these languages. Furthermore, access to these translation services can facilitate the development of more inclusive products and services by business enterprises, in compliance with Nunavut law.Footnote 129 Third, the development of these translation tools fulfills the larger aspiration of promoting the accessibility of Inuit language software to Inuit users and communities outside of Nunavut and Canada.Footnote 130
The GN–Microsoft partnership is a promising development for the digital preservation of Inuit languages. Indeed, the GN and Microsoft subsequently announced updated features to the Inuktitut translator and the expansion of its translation services to Inuinnaqtun,Footnote 131 one of the Inuit languages categorized by UNESCO as “definitely endangered.”Footnote 132 As also recognized by Microsoft Canada, the inclusion of Inuinnaqtun in their translator was only possible thanks to the work and contribution of the Kitikmeot Heritage Society,Footnote 133 which has the preservation of Inuit culture and languages as its mission.Footnote 134 The Inuktitut translator is a service built for and by Inuit communities, with the partnership of GN and Microsoft. As GN stated, moving forward, “[t]he more Inuktitut speakers use it and provide input, the more the translator ‘learns’ and increases its Inuktitut skills.”Footnote 135 The positive reception of this partnership has led to the development of new projects that expanded the GN–Microsoft partnership.Footnote 136 In November 2022, the minister of community and government services of Nunavut, David Joanasie, announced “the development of a sustainable process to update language models, including speech-to-text and text-to-speech capabilities.”Footnote 137 The announced text-to-speech services were released in December 2024.Footnote 138
Microsoft’s collaboration with local governments, communities, and interested organizations to preserve languages is not new. In 2004, the tech giant unveiled the Local Language Program (LLP) to develop its services and products in underrepresented languages globally.Footnote 139 The LLP also counted on the collaboration with Inuit language and dialect experts from the Pirurvik Centre for Inuit Culture, Language and Wellbeing (hereafter “Pirurvik Centre”). The project’s objective, completed in the spring of 2009, was to offer fully Inuktitut versions of the Office 2003 and Office 2007 packages and the Windows XP and Vista operating systems.Footnote 140
The Pirurvik Centre, led by Leena Evic and cofounder Gavin Nesbitt, was instrumental not only in the recording of existing terms in Inuktitut but also in the evolution of the Inuktitut language, with the creation of new tech-specific terms, such as the Inuktitut word for “Internet,” ikiaqqijjut.Footnote 141 However, the sheer volume of work that weighed on the Pirurvik Centre’s experts during the LLP project greatly distracted them from other projects and activities. This left Inuit language experts skeptical about such efforts. Gavin Nesbitt, recalling the intense workload that the LLP project entailed, said he would not “recommend it to a language that doesn’t have spare capacity,” as these programs are likely to take the top Indigenous language experts months and even years of their time.Footnote 142 Consequently, Nesbitt argued, the intense focus required by these projects could bar experts from dedicating their time to the preservation of languages through activities such as storytelling, which are foundational to Indigenous traditions.Footnote 143
The GN–Microsoft partnership could be a chance for both the GN and Microsoft to address the concerns and skepticism brought up by Inuit language experts who worked on the 2004 LLP project. Microsoft could renew its role as a collaborator in Indigenous ICH safeguarding efforts while also fostering good relations with the GN and the Indigenous communities involved.Footnote 144 The GN could further support the revitalization and promotion of Inuit cultural heritage, implementing the developed tools to improve its services to both Inuit communities and the larger public.
PPPs in cultural heritage and other domains
While there is arguably no univocal and universal definition of PPPs,Footnote 145 for the purposes of this article, PPPs are understood as long-term collaborations between public and private actors for the delivery, development, and management of public services or assets.Footnote 146 PPPs have been used by numerous governments and public authorities worldwide, not only in cultural heritage managementFootnote 147 but also more broadly in the delivery of public services.Footnote 148 The public services that are the object of PPPs can include both “infrastructure assets (such as bridges, roads) and social assets (such as hospitals, utilities, prisons).”Footnote 149
Governments often resort to PPPs to meet the requirements for funding, know-how, personnel, and other resources intrinsic to delivering public services.Footnote 150 Collaboration with private actors can provide financial and other resources while offering an alternative to the full privatization of services.Footnote 151 From this partnership, public actors can ideally offer more efficient services, while private actors—beyond a purely commercial gain—can also achieve their corporate social responsibility (CSR) goalsFootnote 152 and improve their public image and reputation. Different from projects that corporate actors undertake unilaterally, PPPs are characterized by a dynamic that rebalances a project’s stakes and decisional power away from the private sphere and into the public one. In this public–private dichotomy, the public interest seeks to subordinate the private interest, creating systems of accountability for the private actor.Footnote 153
PPPs have been employed in numerous domains, including energy, telecommunications, transportation,Footnote 154 and cultural heritage. In particular, the latter has unique characteristics that make PPPs particularly appealing to public actors. Notably, the management of tangible cultural heritage is characterized by high costs for renovations, protection, and maintenance.Footnote 155 Management of both tangible and intangible cultural heritage, moreover, requires the employment of expert staff, tools, and resources that, particularly in the case of digitization initiatives, might be unavailable to public actors.Footnote 156 This is true, for instance, with the use of advanced AI training tools to preserve languages, which is observable in the GN–Microsoft case. Specifically, AI tools and other digital technologies may be proprietary in nature, making them more costly and difficult to access. Moreover, these tools will likely require highly specialized staff in order to develop and implement the necessary digital services.
Another peculiar aspect of this collaboration model is that private actors’ incentives to partake in cultural heritage PPPs are primarily nonfinancial (or at least not directly financial), centering instead on aspects of CSR, publicity, and public image. As argued by Settembre Blundo et al., “[f]rom the economic and financial point of view, the preservation of cultural heritage can be classified among the ‘weak’ projects, because the remuneration generated through revenue from [users] is normally not sufficient to adequately remunerate the private investor.”Footnote 157 However, cultural heritage PPPs, or in the GN–Microsoft case, Indigenous ICH PPPs, can offer private sector partners benefits besides direct financial gain.
On paper, as in the case of the current collaboration between GN and Microsoft, the private actor steps in to contribute with its resources and know-how. Meanwhile, the public actor guarantees that the managed ICH remains in the hands of its legitimate keepers, which, in this case, are Inuit communities. The services based on Indigenous knowledge and traditions promise to serve Indigenous communities themselves, while developing and keeping alive their languages. In return, the private actor develops its own products—such is the case of the development of Microsoft’s own AI tools and translation servicesFootnote 158—while also obtaining good publicity and developing stronger bonds with both local governments and nongovernmental groups and individuals. These preservation projects, however, require a continuous contribution of Indigenous peoples, who provide knowledge through data and labor, as in the case of the Pirurvik Centre. Indigenous peoples are an essential actor supplementing the limiting public-private dichotomy in Indigenous ICH governance. Hence, it is necessary to clearly examine and address the positive and negative aspects that these projects may have for Indigenous communities, their data, and their culture.
The private–public dichotomy in practice: Benefits and drawbacks
Two key aspects observable in the GN–Microsoft partnership could particularly contribute to more sustainable and successful Indigenous language preservation projects: on the one hand, public oversight over corporate activities using Indigenous knowledge, culture, and data, and on the other hand, the availability of costly and proprietary AI tools and other technologies.Footnote 159
A PPP model could address the governance and power imbalance between private actors and Indigenous communities, which may lead to the privatization and appropriation of Indigenous knowledge and traditions. In theory, the public actor could act as a guarantor of Indigenous interest and as an oversight authority, strengthening regulatory protection and granting access to justice and reparation mechanisms. Moreover, the use of ever-evolving technologies could mitigate the workload that Nesbitt warned about, which weighs on Indigenous communities contributing, often on a voluntary basis, to the development of language preservation tools.Footnote 160 These technologies are costly, and meaningful access could be limited without direct collaboration on the part of the proprietary company. Furthermore, the active contribution of GN ensured broad implementation in government services of the developed language tools,Footnote 161 which were instead “never rolled … out on a large scale” after the completion of the 2004 LLP project in Nunavut.Footnote 162
A direct involvement of the public actor typical of PPPs could make these projects more viable for Indigenous communities, extending the implementation of translation software to fundamental services and information. Active deployment of language tools in public services would also strengthen Indigenous languages and promote their working use and development. Finally, the dynamic observable in the GN–Microsoft collaboration can ensure that the services and products developed through Indigenous knowledge primarily serve Indigenous communities themselves. These projects should not only aim to conserve Indigenous languages but also improve the everyday lives of Indigenous communities, making services more inclusive and mindful of Indigenous languages and cultures.
The valuable lessons that can be learned from the GN–Microsoft collaboration should not distract from its drawbacks. Although this partnership offers a more positive example of corporate actors’ participation in projects on the safeguarding of Indigenous ICH and TK, the essential issue of Indigenous data sovereignty remains unaddressed. In the GN–Microsoft case, Indigenous peoples maintain more agency, and the services developed are accessible to both Inuit and non-Inuit communities globally. Furthermore, the Public–Private Partnership Policy developed by the GN itself integrates Inuit values, translated into the so-called principles of Tamapta.Footnote 163 However, concerns related to data sovereignty in the specific context of PPPs that focus on Indigenous ICH and TK digitization, such as in the case of the partnership with Microsoft, remain. In particular, it is unclear to what extent the Indigenous data collected to develop these services are owned by Inuit communities and to what degree they have decision-making power over how Microsoft uses their data.Footnote 164
In addition, the GN case arguably represents in itself a unicum that may prove difficult to replicate elsewhere. Compared to governments and public authorities in other jurisdictions with a similar settler colonial background, GN’s representatives and public sector employees include a higher proportion of Indigenous people.Footnote 165 This may entail a greater interest on the part of the GN in supporting Indigenous communities compared to regions of the world where Indigenous peoples not only are a minority but are also greatly underrepresented in public office, even lacking any form of self-governance. In those cases, public authorities may, therefore, fail to represent Indigenous peoples’ interests when their cultures and data are on the line.
Looking back, the previous controversial attempts by Microsoft to help Inuit language preservation efforts have left some concerned that current and future projects may prove unsuccessful in the long run, once again letting down Inuit communities.Footnote 166 Furthermore, from a technical perspective, these translation tools have been prone to translation errors, needing time and an indefinitely large pool of data in order to be perfected.Footnote 167 This article argues that, in order to overcome the aforementioned issues and make otherwise-promising cases, such as the GN–Microsoft collaboration, more viable and sustainable for Indigenous communities, a complementary element is missing: the implementation of ID-SOV principles. The following section proposes a more refined dynamic to respond to the examined drawbacks while still considering and including the benefits of PPP models for Indigenous ICH management projects.
Collaborative governance for Indigenous ICH digitization
According to Article 31 of UNDRIP, “Indigenous peoples have the right to maintain, control, protect and develop their cultural heritage”Footnote 168 and “States shall take effective measures to recognize and protect the exercise of these rights.”Footnote 169 Hence, international law recognizes the public actor, in the entity of the state, as a facilitator or guarantor of Indigenous cultural heritage protection. In light of this, PPPs could be used by public actors seeking to fulfill their facilitator role to support the safeguarding of Indigenous ICH. At the same time, control and ownership over Indigenous ICH must remain in the hands of its legitimate keepers, requiring the restructuring of PPP models. In order to achieve sustainable and balanced projects, we argue that Indigenous cultural heritage PPPs should encompass aspects of collaborative governance that expand upon more traditional PPP models.
The literature on cultural heritage PPPs already recognizes the relevance of collaborative governance in the form of so-called third sector participation,Footnote 170 P4,Footnote 171 or “citizen engagement.”Footnote 172 Similarly, this paper contends that overcoming a strict public–private dynamic—ensuring the participation of Indigenous stakeholders—could reduce the risks related to data colonialismFootnote 173 and Indigenous ICH and TK appropriation and commercialization.Footnote 174 To ensure the participation of Indigenous peoples and to protect Indigenous control over Indigenous data, PPP projects on Indigenous ICH digitization should integrate ID-SOV principles “by design” and “by default.”
The terms “by design” and “by default” are borrowed from the scholarly literature on privacy and data protection,Footnote 175 and are also applied in groundbreaking regulations such as the General Data Protection Regulation (GDPR) of the European Union.Footnote 176 These concepts were theorized and systematized by Ann Cavoukian, privacy law scholar and former information and privacy commissioner of Ontario, Canada.Footnote 177 In her “7 Foundational Principles,” Cavoukian indicates how privacy should be approached by stakeholders, making it “integral to organizational priorities, project objectives, design processes, and planning operations.”Footnote 178
The framework of the “privacy by design” approach individuates a number of fundamental characteristics we seek to apply here to the integration of ID-SOV principles for cultural preservation projects undertaken by public and private actors in PPP-like collaborations. In particular, the inclusion of ID-SOV should be proactive and preventative, rather than being implemented solely as a remedial response to observed concerns.Footnote 179 ID-SOV should be a default when creating and operating software seeking to include, in any way and form, Indigenous data.Footnote 180 Furthermore, ID-SOV principles should be embedded not only in the content of agreements relative to cultural preservation projects but also in the software and services created as a result of these collaborations.Footnote 181 The principles of visibility and transparency should apply.Footnote 182 All stakeholders should be made aware of how Indigenous data are used in these projects, especially Indigenous communities and individuals themselves. Finally, Cavoukian recognizes a foundational principle, which is also paramount to ID-SOV, namely, respect for users’ interest.Footnote 183 As in the case of privacy by design, actors participating in the creation of services and systems that involve user data—in this case Indigenous data—must keep users’ interest at the center of their activities. ID-SOV principles prioritize Indigenous interest in the form of data governance and control.
In “conventional” PPPs—as opposed to P4 or other alternative partnership modelsFootnote 184—the partnership of the public and private sectors seeks to fulfill public interests and private self-interests, which are established and agreed upon by the two sectors.Footnote 185 While this dynamic provides a level of reciprocity—whereby both public and private interests are considered—the public–private dichotomy also entails that the public sector harnesses the self-interest of the private sector to serve the public interest.Footnote 186 However, in the case of PPPs seeking to support the safeguarding of Indigenous ICH and TK, a third interest should come into focus: Indigenous peoples’ interests. The dynamic of conventional PPPs risks remaining imbalanced, disfavoring Indigenous communities, without the clear retention of their control over Indigenous data and knowledge.
While Indigenous interests could be subsumed into the sphere of the public interest, they are formed independently of settler governmental authorities, within the context of Indigenous communities themselves. Specifically, Indigenous interests are to be determined by Indigenous peoples on a case-by-case basis. It is fundamental to recognize that Indigenous interests can vary from one Indigenous community to another. Hence, it is paramount to prioritize Indigenous peoples’ active participation—including through consultation and decision-making activities—in projects related to their cultural heritage, data, and even livelihood.
The public actor can support and integrate Indigenous interests within its activities and services. In turn, this will lead the private partner to follow through, modeling its own activities to meet Indigenous interests. To give primacy to the interest of Indigenous communities in Indigenous ICH digitization projects, it is essential to go back to principles of ID-SOV, as argued by scholars and Indigenous rights advocates, such as Te Hiku Media.Footnote 187 In this new dynamic inclusive of ID-SOV principles, the power of the public actor to exert the public interest over the private interest typical of PPPs can be harnessed to give primacy to Indigenous interests.
The need for enforceability of ID-SOV principles creates further functionality for PPPs and the public actor. On the one hand, these partnerships can contractually recognize the principles of ID-SOV, namely control and ownership over Indigenous data. On the other hand, governments should also integrate ID-SOV principles in their own regulatory frameworks, whether this be for Indigenous ICH projects or other endeavors that include Indigenous knowledge and, more specifically, Indigenous data. In so doing, the public actor would provide public oversight over the activities of the private actor, creating and enforcing accountability mechanisms, as well as implementing recourse venues for Indigenous communities seeking to protect their data and ICH.
All PPPs seeking to manage Indigenous cultural heritage—tangible or intangible—require, on the one hand, a collaborative approach and, on the other, the need to recognize Indigenous peoples’ control, ownership, and stewardship over their cultural property and data. For instance, as seen in the GN–Microsoft case, Indigenous language digitization needs extensive data collection and even the creation of neologismsFootnote 188 through contributions of entire Indigenous communities, including Indigenous researchers, scholars, and advocates. Hence, such projects should integrate ID-SOV principles, with the establishment of overseeing, decision-making, and participatory mechanisms for Indigenous communities. ID-SOV principles have the ultimate goal of restoring and maintaining control over Indigenous ICH and Indigenous data to its original caretakers. Moreover, the primary beneficiaries of services and products built through Indigenous knowledge should be Indigenous peoples themselves.Footnote 189 At the same time, the larger society can partake in revitalizing Indigenous ICH, as observed in the GN–Microsoft case, where the use of Inuktitut is promoted to Nunavut’s citizens and businesses through free public courses, with the intent to “preserve the Inuktut language through technology [and] promote the use of Inuktut every day, whether it be at home, in the office or around the world.”Footnote 190
Conclusion
While the creeping privatization and commercialization of ICH appears to be an increasingly concerning reality, solutions should be swiftly devised and applied to guarantee that cultural heritage remains within the control of its communities of origin. This is particularly the case for Indigenous peoples, as the ramifications of the commodification of their cultural heritage have roots in settler colonialism, characterized by the appropriation and destruction—beyond land—of their traditions and knowledge, including language.Footnote 191 These patterns threaten to repeat themselves, reinforcing the need to reclaim Indigenous sovereignty in the digital world.
The Te Hiku Media case shows how Indigenous communities’ endeavors to revitalize their languages constantly face threats from private enterprises. Corporations, often equipped with more financial and human resources, tend to take advantage of Indigenous ICH and TK, which risks replacing and hindering Indigenous peoples’ access to and control over their data. Moreover, Western law is too often unequipped to support Indigenous culture preservation. Current legal systems do not facilitate—arguably they even cripple—the protection of Indigenous ICH. In fact, although Indigenous knowledge provides abundant intangible sources for modern innovations, it often fails to obtain IP protection. The dominant Western IP system treasures individual and one-time innovation in contrast to the Indigenous values of collectivity and generational contribution. Inevitably, most Indigenous knowledge is left in the free-for-all public domain. Consequently, when Indigenous knowledge becomes the source of innovation for non-Indigenous actors, it is at risk of being misused and misappropriated, in turn harming Indigenous peoples’ interests and their culture.
Giving an answer to this protection lacuna, the concept of Indigenous data sovereignty seeks to reinforce Indigenous peoples’ inherent self-determination rights in issues related to their data, especially empowering Indigenous peoples while they collaborate with non-Indigenous corporations. The necessity to affirm ID-SOV grows ever stronger with regard to Indigenous ICH and the risk of its appropriation and commodification by corporate actors in the digital landscape. However, ID-SOV faces the obstacles of enforceability and power imbalances between Indigenous peoples and corporate actors.
As observed in the GN–Microsoft case, the cooperation between the public sector and the private sector can address concerns related to the power imbalances that might exist in purely private projects that seek to obtain data from Indigenous communities for the creation of proprietary products. Another interesting aspect of the GN–Microsoft partnership is the nature of the GN, which has a large number of Indigenous representatives and follows an Indigenous government system. On the one hand, this shows the importance of Indigenous peoples’ sovereignty more broadly and Indigenous participation in PPPs related to their cultural heritage. On the other hand, this can create a replicability challenge as most governments do not have such extensive Indigenous representation. Furthermore, while the GN’s partnerships with the private sector follow principles of Inuit sovereignty, the lack of any mention of data sovereignty and governance standards raises concerns and questions about the fate of Indigenous data shared and used with Microsoft. Thus, the transparent integration of ID-SOV principles and standards becomes necessary to address both the concerns of replicability and potential data sovereignty gaps.
Our critical analysis of PPPs shows that the public–private dichotomy that characterizes them cannot, on its own, meet the needs of Indigenous peoples in retaining agency over the management of their ICH, even if it has proven particularly promising in other cases of cultural heritage management.Footnote 192 Instead, a revisited model of PPPs should be implemented to protect Indigenous ICH and address the issues related to ID-SOV enforceability and power imbalance. As we argue, the public–private dichotomy can greatly benefit from integrating collaborative governance aspects and ID-SOV. These, paired with technical and financial resources (private actor) and the establishment of a guarantor of cultural heritage control retention (public actor), can result in the development of Indigenous ICH management projects that preserve Indigenous peoples’ control. ICH digitization projects, in particular, should be supplemented with the principles of ID-SOV. Projects employing Indigenous knowledge and tradition through the collection and processing of data should be characterized by Indigenous data sovereignty by design and by default. This includes projects involving digitization of Indigenous languages and other Indigenous ICH.