Every history is really two histories,
the story of what happened
and the story of the perception of what happened.
W.J.T. Mitchell (2011, 161)
An AI-generated image depicting tent camps that formed the phrase ‘All Eyes on Rafah’ (Figure 1) has become one of the most widely circulated pieces of content related to the Israel-Hamas war (Asmelash Reference Asmelash2024). Intended and often interpreted as a symbolic call for global attention and international scrutiny, it has, however, faced criticisms for sanitising the depiction of death and destruction in Gaza (Mahroof and Mehmood Reference Mahroof and Mehmood2024). The image was seen as a representation of a cleansed view of the war’s brutality, despite the existence of numerous authentic photographs and reports from journalists risking their lives to document the harsh realities on the ground.

Figure 1. ‘All Eyes on Rafah’ Instagram story template. Screenshot of the original by @shahv4012.
Images of war not only represent events but also shape interpretations, often legitimising specific actors or narratives depending on the context in which the images are shared and viewed (Parry Reference Parry, Lilleker and Veneti2023). Images also play a crucial role in creating collective memory and forming beliefs (Mitchell Reference Mitchell, Manghani, Piper and Simons2006), ‘performing symbolic and political functions at the same time’ (Aiello and Parry Reference Aiello and Parry2020, 163). Thus, ‘All Eyes on Rafah’ does more than inform; it becomes part of a broader media ecosystem that influences which aspects of the war are remembered and which are forgotten.
Given the limited transparency of visual generative AI, known as ‘the black boxing of vision’ (Gaboury 2015), assessing its impact on collective memory might prove challenging. In an agenda-setting article on the role of generative AI in the context of war atrocity memorialisation, Makhortykh et al. (2023) raise concerns about how inauthentic content produced by synthetic media offers a distorted representation of atrocities because AI lacks morality and cannot understand the true meaning of data or exercise moral judgment in areas such as memorialisation. Machine-made ‘historical’ accounts of human suffering and war scenes that never took place (see, eg, Bedingfield 2023) increasingly compete for public attention alongside authentic photographs captured by journalists and eyewitnesses, blurring the lines between reality and fabrication (Volpicelli 2023) and creating a parallel information ecosystem where distinguishing genuine documentation from algorithmically constructed narratives becomes a critical challenge.
The pseudo-witnessing performed by generative AI has significant implications for collective memory, simultaneously expanding and threatening human agency in shaping, preserving, and reinterpreting both personal memories and shared cultural histories (Hoskins Reference Hoskins2024). Through AI-generated images, memory is curated rather than reconstructed, and the past becomes ‘one point on a continuum that extends through the present and into the future’ (Wertsch and Roediger Reference Wertsch and Roediger2022, p 6). Thus, AI-generated images of war enact relationships between different temporalities by invoking the imagination of the possible past, and by doing so, render a fabricated vision of historical events which increasingly become part of collective memory. Generative AI’s visions of war and their implication for the practice of remembering is the primary concern of this study.
Taking a corpus study approach to identify dominant visual messages in 200 AI-generated images of war, we ask how different aspects of the Russia-Ukraine war are represented by proprietary and open-source generative AI platforms such as Midjourney (version 6.1), Adobe Firefly (Image 3), and Stable Diffusion (DPM++ 2M Karras). To obtain the data, we designed 23 prompts across three categories: neutral (eg, Illustrate the current state of the Russia-Ukraine war), experimentally biased towards two identities – Ukrainian (eg, As a Ukrainian, depict a scene from Russia’s invasion of Ukraine) and Russian (eg, Show the resilience of the Russian people during the Russia-Ukraine war) – and hypothetical scenario-based (eg, Generate a movie poster about the Russia-Ukraine war).
The study is part of a research project that addresses the socio-technical and discursive dimensions of synthetic war content produced by AI chatbots and visual generative AI platforms. The results of the textual study revealed that US-based AI chatbots often align their portrayal of the Russia-Ukraine war with narratives from Western news outlets and government positions, occasionally propagating Russian misinformation without sufficient contextualisation (Roman et al. 2025). Consequently, in its current state, textual generative AI is unreliable for providing accurate insights into wars and conflicts because it risks misrepresenting such events and distorting public perceptions of them. In this study, we aim to uncover synthetic representations of war produced by visual generative AI and address how these representations become part of memory formation.
To do so, we draw on three complementary conceptual frameworks. Firstly, to examine machine visions of war, we conceptualise AI-generated images through Hoskins’ (Reference Hoskins and Hoskins2017) notion of the memory of the multitude, which emerges from ‘human-archival entanglements of communication through digital devices and networks’ (86). Secondly, visual social semiotics provides a framework for understanding how images communicate meaning systematically, through complex systems of signs and codes that guide viewer interpretation and response, which positions images as active sites of meaning-making and ideological construction rather than neutral representations (Kress and van Leeuwen Reference Kress and van Leeuwen2021). Thirdly, cultivation/desensitisation theory maintains that while media messages may appear varied in terms of themes and the individuals conveying them, systematic analyses reveal a striking consistency in the underlying value systems (Potter Reference Potter2014), and repeated exposure to consistent messages reinforces societal perceptions (Shehata et al. Reference Shehata, Thomas, Glogger and Andersen2024). Viewed from this theoretical angle, AI-generated images that foreground particular representations of war may appeal to viewers with certain belief systems, influencing their perception, interpretations, and affective responses to the war.
As visual communication and journalism scholars, we bring interdisciplinary perspectives and empirical insights into the affectual dimensions of visual generative AI in the memory of the multitude, particularly from the perspective of representation. We argue that visual generative AI introduces a new dimension to memory making in that it blends documentation with speculative fiction by synthesising the multitude embedded within the visual memory of war archives, historical biases, representational limitations, and commercial risk aversion.
AI-generated images in ‘the memory of the multitude’
Participatory media and the new information infrastructures they engender have fundamentally reconfigured memory practices, creating hybrid systems where connective memory emerges through networked interactions between individuals, platforms, and algorithms (Hoskins 2011, 2017; Ekelund 2023). Horizontally integrated media conglomerates, such as social media platforms, provide affordances for the instantaneous flow of images, ideas, and narratives, demanding active modes of spectatorship that blur the lines between cultural production and consumption (Jenkins 2006). As Serafinelli (2020) observes, digital media now serve as both repositories and active sites for networked memory work, enabling new participatory practices of visual remembering. Digital media transform images from static representations into what Giaccardi and Plate (2017) call ‘memory objects’ (66). As memory objects for social interaction, images gain meaning through continuous recontextualisation – likes, shares, and algorithmic curation – turning individual contributions into nodes for interaction situated within broader visual culture. These interactions can be seen as ‘the memory of the multitude’ (Hoskins 2017, 86), dissolving traditional delineations between public-private and individual-collective remembrance.
Within this ecosystem, the practice of remembering can be seen as a mediated action (Wertsch Reference Wertsch2004), distributed between social actors and cultural tools as an active process that ‘involves contention and contestation among people rather than a static body of knowledge that they possess’ (Wertsch and Roediger Reference Wertsch and Roediger2022, 318). Collective memory also functions as an ‘identity project’ (Wertsch Reference Wertsch2004, 33) that often prioritises group cohesion over objective representation but always draws on collective knowledge of communicative artifacts by an interpretative community that is also a social entity (Wertsch Reference Wertsch2004, 27).
AI-generated images introduce a new dimension to memory-making processes and collective knowledge. Unlike traditional war photography, which captures historical moments within specific cultural and political contexts (Parry Reference Parry, Lilleker and Veneti2023), AI-generated images are ‘networked images’ (Dewdney and Sluis Reference Dewdney and Sluis2022) implicated in training data, model infrastructure, platforms, and people. Such entanglements impact how machine vision ‘remembers’ – interpolating patterns from its training data at the submission of a prompt rather than reflecting actual events or the lived experiences of people affected by wars. In this way, generative AI technologies ‘untether the human past from the present’ (Hoskins Reference Hoskins2024, 1) introducing a new discursively fabricated past as aggregated representational objects derived from training data, which blends past, present, and imaginary futures (Dewdney and Sluis Reference Dewdney and Sluis2022; Laba Reference Laba, Papadopoulos, Matar and Mitraforthcoming). As Ritchin (Reference Ritchin2025) writes, what is being shown in a synthetic image ‘is not a photograph depicting an event to be seen as the past, but something that continues into the future, as do the possibilities for engagement with it’ (209). Similar to digital stock images, AI-generated images reconfigure actuality and potentiality as a new form of what Frosh (Reference Frosh2003) terms ‘temporal collapse:’ the compression of the reference time of what is represented in the image and the moment of viewing (165) where ‘[t]emporal projections of the future of the images are caught up in the constant citation, performance and transformation of images past, much as present narratives of the future self recall and transform memories’ (159).
Frosh’s (2003) argument about temporal ambiguity relates to the contested notions of memory and imagination. In the philosophy of memory, Dos Santos and McCarroll (2022) distinguish between discontinuist (causalist) and continuist (simulationist) positions. Discontinuists maintain that memory and imagination are fundamentally different kinds of mental states: remembering, they argue, requires an appropriate causal connection to a past experience, often conceived as a memory trace linking the representation to the actual event. In this view, imagination lacks such a causal connection and therefore constitutes a distinct kind of mental activity. By contrast, continuists argue that memory and imagination are mental states of the same kind. From this perspective, remembering is a form of constructive imagining produced by a reliably functioning episodic construction system that aims to simulate past events, without requiring a causal link to them. As Dos Santos and McCarroll (2022) note, empirical evidence lends support to both positions, and both sides of the debate ‘agree that imagination and episodic memory typically involve mental imagery’ (48).
While this level of distinction between memory and imagination falls beyond the scope of our study, we acknowledge contested temporalities inherent in the memory of the multitude which arguably manifest in two key developments: first, AI produces a version of the past that never existed, and second, it introduces a ‘conversational past’, where memory is dialogically reconstructed in the present, remixing ‘individual human and collective memory’ and reconstituting ‘what it means and what is possible (and impossible) to forget’ (Hoskins Reference Hoskins2024, 10). These temporal frictions surface in synthetic narratives produced in AI-generated images, the representational possibilities of which are constrained by the socio-technical limits of AI models, training datasets, AI system interfaces, and moderation policies.
To understand these entanglements, post-digital memory scholarship suggests that remembering is no longer solely a human practice, but rather one shared and co-constructed by human and non-human actants (Jelewska Reference Jelewska2024). Post-digital memory intersects with post-humanist theories which reject anthropocentric models in favour of distributed agency across human- and machine-led processes through phygital (physical-digital) interfaces that decentralise control over historical narratives (Parui and Raj Reference Parui and Raj2024). These systems, according to Mandolessi (Reference Mandolessi2023), reflect ‘the reconfiguration of agency, in which a distributed memory is performed by human and non-human agents in a dynamic entanglement’ (1514).
Yet, in the context of AI-generated imagery, this seeming democratisation of memory curation is uneven: while generative AI enables broader participation in memory making, algorithmic bias and infrastructural inequalities persist, privileging those with access to advanced technologies and Western-centric perspectives that these technologies favour. Memory work, as Smit et al. (Reference Smit, Heinrich and Broersma2017) note, ‘is a discursive process – comprising practices, cultural forms, and technologies – wherein the past is shaped and constructed in the present and carried into the future’ (3120). Memory is also a representation imbued with meaning within existing cultural contexts (Smit Reference Smit, Merrill, Keightley and Daphi2020). In the case of AI-generated imagery, however, it is mediated by the representational constraints of training data – shaped not by lived experience or ethical deliberation, but by the synthetic aggregation of data patterns in response to textual or image prompts.
Media imagery, training datasets, and generated outputs
Image-text pairs used for training visual generative AI models play a critical role in shaping representation in AI-generated images of war. Vast datasets, often scraped from publicly available sources and used by models to learn patterns and produce visuals, reflect the cultural, social, and historical prejudices embedded in their original context of use (Crawford and Paglen 2021), including the ideological positioning of news outlets as well as prejudices stemming from civic reporting (Maier-Rothe et al. 2014).
Images in news media – such as those depicting wars and terrorist attacks – often focus on tragic and shocking depictions of death or people in dire situations, and many of these images can be graphic (Konstantinidou Reference Konstantinidou2008; Parry Reference Parry2010; Parry Reference Parry, Lilleker and Veneti2023). Fahmy (Reference Fahmy2010) found that the US and Saudi-owned newspapers featured 30–40% of graphic images (of various intensity) of 9/11 and the Afghan war. Another study of the 2006 Israel-Lebanon war revealed that graphic images accounted for 23–30% of all UK newspapers’ photographs and included ‘depictions of mourning and coffins/covered bodies’ (Parry Reference Parry2010, 78). Similarly, Hussain and Fahmy (Reference Hussain and Fahmy2024) looked at Twitter images from a 2018 terrorist attack in Pakistan and found that 33% of pictures featured death and nearly 11% injuries.
Scholarship on the visual framing of war highlights how the decision to show graphic images of violence or civilian casualties is frequently tied to the national identity of news outlets and their relationship to the military actors involved. Griffin’s (2004) analysis of US coverage during the Gulf and Iraq wars shows that American media outlets predominantly present images aligning with official narratives, often prioritising motifs of US military prowess or sanitised, non-confrontational scenes. Photographs of casualties or destruction caused by American forces are comparatively rare, and images depicting injuries or death show only the bodies of enemy participants, never those of US soldiers (Griffin 2004, 392). Similarly, Parry’s (2012) research on the British mainstream press during the 2003 Iraq invasion reveals that coverage of civilian casualties is influenced by the editorial distance from responsibility, noting that ‘the photographs’ inclusion, their graphic nature and verbal framing was determined not by technological or access issues but by editorial judgements’ (184). Furthermore, Taylor (1998) identifies a tendency in Western media to display graphic images when victims are ‘foreigners’: images of dead or wounded people from other countries are more frequently shown, while those implicating Western soldiers or civilians are often censored or sanitised (129–154). Together, these studies illustrate how the visual emphasis in wartime media coverage is shaped not only by ethical considerations but also by national allegiances and editorial positions – making depictions of suffering both a political and a cultural act. These editorial choices in visual representation reflect broader patterns in how news media select and frame war imagery to influence perceptions of wars and conflicts.
Although limited, recent research on the Russia-Ukraine war reveals marked differences in how international and national news organisations visually frame war. In the study by Young and Omosun (2025), photographs of injured military personnel appeared in The New York Times (NYT) and The Guardian but were entirely absent from the Ukrainian and Russian news outlets analysed. Images of injured Ukrainian civilians were rare across the sample – appearing in less than 3% of photographs in the NYT and the Ukrainian outlet – and were absent altogether from The Guardian and the Russian outlet. Similarly, Fernández-Castrillo and Ramos (2025) found that depictions of wounded or dead individuals accounted for only 1–2% of images in Russian and Ukrainian news sites. Interpretations of these omissions vary: Fernández-Castrillo and Ramos (2025) argue that the scarcity of death in images effectively ‘reorients the representation of the conflict towards a notion of war without civilian casualties’ (51). Young and Omosun (2025) suggest that the Russian outlet likely conformed to government propaganda directives, while ‘Ukraine might have refrained from showcasing images of affected civilians, possibly to avoid demoralizing the population and inadvertently acknowledging the successes of the Russian army’ (27). These findings indicate that both national context and ideological positioning – particularly whether an outlet belongs to a state directly involved in the conflict – shape visual framing choices.
Relatedly, Damanhoury and Saleh’s (Reference Damanhoury and Saleh2025) study of the 2011 Gaza war found that Al Jazeera Arabic displayed more death and ‘about-to-die’ images than Fox News, a pattern that partly echoes Young and Omosun’s (Reference Young and Omosun2025) findings about Ukrainian and Russian outlets’ inclusion and exclusion of death in visuals. However, our literature review shows that most research on the visual framing of wars and conflicts focuses on Western media, with far less attention given to the national outlets of countries directly involved in these wars.
Previous research has also examined the subjects and objects of photographs. Hussain and Fahmy (2024) found that images from the terrorist attack in Pakistan predominantly featured victims (40%) and elites (42%). Similarly, a study of the 2006 Israel-Lebanon war determined that Lebanese civilians appeared in less than a quarter and Israeli soldiers in 15–19% of UK news organisations’ photographs of this conflict, more often than any other subjects (Parry 2010). This pattern of selective visual emphasis is also apparent in coverage of the Russia-Ukraine war and in the analysis of Russian military and Ukrainian civilian casualties. For example, pictures of Ukrainian civilians and military dominated NYT and The Guardian photographs featuring the Russia-Ukraine war, and photographs of Russian military were less prevalent, ‘making it look like the war solely involved Ukrainian civilians and military men’ (Young and Omosun 2025, 26). In the same study, the Ukrainian news outlet featured more photographs of political leaders, followed by photographs of Ukrainian civilians and military, while Russian military and political leaders dominated the photographs of the Russian news outlet. Meanwhile, contextual backdrops of rubble were prominent in Western news media, accounting for 42% of The Guardian and 33% of NYT photographs, while appearing less in Ukrainian photographs (14%) and Russian photographs (2%) (Young and Omosun 2025). Such editorial decisions ultimately contribute to the construction of distinct narratives about the war and its human and infrastructural costs.
Importantly, these and other existing yet unaddressed representations directly feed into the datasets used to train visual generative AI systems. When such image-text pairs – with their embedded cultural perspectives and editorial biases – become training data for machine learning models, they perpetuate and potentially amplify these representational disparities (Laba 2024). AI systems ‘learn’ to reproduce selective framings, creating a cycle where AI-generated images inherit and further institutionalise visual biases present in training data. Because data is an essential component of the AI pipeline, the lack of data transparency, cultural knowledge, and data diversity affects representation (Mihalcea et al. 2025). This transfer of representational patterns from human-curated news imagery alongside vast volumes of unspecified data to machine-generated outputs prompts a critical examination of not just the technical architecture of AI systems, but also the sociopolitical contexts of the visual data that shapes their outputs and, ultimately, our collective understanding of wars and conflicts.
Representation, cultivation, and desensitisation
Visual social semiotics and cultivation are the theoretical perspectives that guide our analysis of AI-generated images of war. First, visual social semiotics recognizes that meanings in images do not arise just from what is represented in the image itself, but from all sorts of subtle choices such as, for example, at which distance image subjects are positioned relative to the viewer, which kind of vertical angle they appear at, whether they make eye contact with the viewer or not, how much setting is made visible in the frame, whether they appear as individuals or in groups, and so on (see Kress and van Leeuwen Reference Kress and van Leeuwen2021, 41–44). It is through such choices that the viewer is positioned to relate to those represented in images and form opinions about them; yet, as Bateman (Reference Bateman and Jewitt2014) emphasises, ‘neither of these aspects can be simply ‘read off’ from any artefact or behaviour’ but ‘need to be derived from analysis’ of ‘the artefacts or activities under empirical investigation’ (240, italics in original).
Cultivation theory is another perspective relevant to our study as it is concerned with the potential effects of media messages (in our case – how the memory of the multitude manifests in how specific aspects of war are represented, subsequently positioning the viewer to relate to what is represented in images in a specific way). Cultivation research first focused on understanding television’s effects (Gerbner and Gross Reference Gerbner and Gross1976) but has since expanded to social media which, as a recent meta-analysis found, has greater cultivation effects than television (Hermann et al. Reference Hermann, Morgan and Shanahan2023). While early cultivation scholars did macro-level research on television across different genres, many studies now have a micro-level focus on specific genres and types of media (Potter Reference Potter2014; Hermann et al. Reference Hermann, Morgan and Shanahan2023).
Desensitisation research provides another perspective on the potential consequences of AI-generated images of war and their potential impact on collective/connective memory. Those who are desensitised to violence have decreased sympathy for victims and are less willing to help them (Waddell et al. 2019). Desensitisation can be a short-term phenomenon or a long-term state, depending on the amount of repeated exposure to violent media (Scharrer 2008). Within the context of AI-generated war scenes, if most pictures sanitise the conflict by excluding images of death and suffering, viewers may be desensitised to the extent that they are unwilling to provide support to those being victimised, and may remember a depicted war as what Parry (2012) terms a ‘clean war’. However, it should be noted that the media landscape is radically different today than when Gerbner and Gross (1976) discussed cultivation in terms of how the ‘location, action, and characterization’ of issues on television are ‘discharged into the mainstream of community consciousness’ (182). Today’s community consciousness is made up of far more than the stories broadcast on television by a handful of networks to viewers. As Hoskins (2017) highlights, ‘a new mass constantly snap, post, record, edit, like, link, forward and chat in a digital ecology of media’ (86). The result is that the ‘memory of the multitude is all over the place, scattered yet simultaneous and searchable: connected, networked, archived’ (Hoskins 2017, 86).
Even in the current era of connective memory, cultivation and desensitisation are lenses through which researchers can see the possible impact of AI-generated war images. But, as Anderson (Reference Anderson2021) cautions, when addressing current media landscapes, ‘we need to once again become more confident on investigating the meaning of media texts and the interpretation of those texts in a way that is not reducible to effects, behaviourism, or stimulus and response’ (58). Similarly, Aiello (Reference Aiello2023) calls for a shift in attention away from seeking the links between images and outcomes like belief in misinformation or behavioural change, and towards the cultural and historical dimensions of visual political communication. Following this invitation, instead of asking how AI-generated images affect how people remember the war, we focus on how AI-generated representations of war are communicated in these images, acknowledging that particular kinds of representations can cultivate particular kinds of beliefs and assumptions.
Research design
Visual sample and analysis
To understand visual messages in AI-generated images of the Russia-Ukraine war by proprietary and open-source generative media platforms, this research mobilises a corpus studies approach to analyse 200 AI images generated in response to a set of control, identity-specific, and hypothetical prompts. Twenty-three prompts were developed to probe Midjourney (version 6.1), Adobe Firefly (Image 3), and Stable Diffusion (DPM++ 2M Karras) to produce visual narratives intended as either neutral (CP1–7), potentially biased toward a particular representation (RU1–7 and UU1–7), or hypothetical (H1–2). All images were generated on the same day, 29 August 2024 (Table 1). The motivation behind introducing both Russian- and Ukrainian-skewed prompts was to investigate whether national perspective modifiers impact the depiction of war scenes, abstract concepts (such as the resilience of Russians and Ukrainians), and political leaders from both countries in AI-generated images. For the RU and UU prompts, we opted for a single bias modifier per prompt to avoid compounding effects. For example, RU1 used the bias phrase ‘Russia’s special military operation in Ukraine,’ while RU2 incorporated the modifier ‘from a Russian perspective.’
Table 1. Prompts used to generate data across three visual generative AI systems

To analyse the data, we adapted generic annotation pipelines proposed by Hovy and Lavid (Reference Hovy and Lavid2010) and Bateman (Reference Bateman and Jewitt2014). After preparing a corpus of images in response to our prompts, we developed a preliminary annotation scheme based on 35 images from a test set. The development of the annotation scheme was informed by Kress and van Leeuwen’s (Reference Kress and van Leeuwen2021) visual grammar, in relation to the kinds of meanings communicated in images – representation (eg, inclusion/exclusion of people, whether people appear as individuals or in groups, which kinds of settings they are embedded in) and interaction (eg, social distance which ranges from personal to impersonal and perspective such as positioning of image subjects at certain points of horizontal and vertical axes of visual composition). As Laba (Reference Laba2024) notes, discourse analysis of AI-generated images of war ‘proves beneficial for considering nuanced sets of choices in AI images’ (1617) because these choices result in specific representations that position viewers of these images to relate to AI-generated content in a specific way.
The test set was analysed employing the preliminary annotation scheme with attributes to describe general themes as well as content-related aspects that describe image subjects. For example, settings were highly relevant for identifying general themes of each image, and so were presences/absences of cultural references such as flags and recognisable landmarks, while identifying genders and age groups of image subjects allowed for insights into predominant representational tropes of the people and ways in which they were shown as involved in or impacted by the war.
Two annotators individually coded the test set, and inter-annotator agreement (Cohen’s Kappa) was calculated in R. Agreement was almost perfect, substantial, or moderate for all attributes apart from ‘salient objects/subjects’, which showed only fair agreement (Kappa = 0.321, p = 0.0133) and was subsequently removed. This resulted in a final annotation scheme with 11 attributes (Table 2), which was used to annotate the whole corpus of 200 images.
Table 2. Annotation scheme (attribute/value pairs) developed and evaluated in the study

As Figure 2 shows, the attributes ‘people present’, ‘number of image subjects’, and ‘social distance’ show almost perfect agreement (Kappa ≥ 0.81), while ‘dominant representational structure’, ‘setting’, ‘image subject(s)’, and ‘dominant age’ show substantial agreement (Kappa = 0.61–0.80). The attributes ‘cultural references’, ‘perceived emotional tone’, ‘dominant gender’, and ‘perspective’ display moderate agreement (Kappa = 0.41–0.60). All Kappa values are statistically significant, indicating reliable coding across all attributes.

Figure 2. Heatmap of Cohen’s Kappa values for inter-coder agreement across attributes.
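The agreement labels used above follow the conventional bands for interpreting Kappa, commonly attributed to Landis and Koch. As a hedged illustration, the mapping can be sketched in Python as follows; the function name `agreement_band` is hypothetical, and the only Kappa value taken from the study is 0.321 for the removed attribute ‘salient objects/subjects’.

```python
def agreement_band(kappa):
    """Map a Cohen's Kappa value to its conventional verbal label.

    Bands: < 0 poor, 0.00-0.20 slight, 0.21-0.40 fair, 0.41-0.60 moderate,
    0.61-0.80 substantial, 0.81-1.00 almost perfect. Values are assumed to
    be rounded to two decimals, so each band is checked by its upper bound.
    """
    if kappa > 1.0:
        raise ValueError("Kappa cannot exceed 1.")
    if kappa < 0.0:
        return "poor"

    upper_bounds = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in upper_bounds:
        if kappa <= upper:
            return label


# The one value reported in the text: the attribute that was removed.
print(agreement_band(0.321))
```

Under these bands, 0.321 falls into the ‘fair’ range, which is consistent with the decision to drop that attribute from the final scheme.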
The iteratively developed model (Table 2) was aimed at describing data at higher levels of abstraction, capturing combinations of visual elements and structural configurations, and subsequently comparing the patterns in visual data with the modifiers in the textual prompts. In sum, such an approach with clear criteria for segmentation of the visual data allows for ‘the research question to become straightforward to address with respect to the annotated data’ (Bateman Reference Bateman and Jewitt2014, 242).
Results
Varied platform responsiveness
When generating the data, we noted that the responsiveness of the visual generative AI platforms varied. Open-source Stable Diffusion was the most agreeable, successfully generating images for all prompts (88 images). Midjourney, on the other hand, accepted most control and identity-specific prompts (72 images) but refused to generate images depicting Zelensky’s and Putin’s involvement in the war from a neutral perspective (CP6–7), a Russian identity perspective (RU6–7), and Putin’s involvement from a Ukrainian identity perspective (UU7). Midjourney did, however, generate images featuring Zelensky from a Ukrainian identity perspective (UU6).
Adobe Firefly, however, declined to respond to more than half of the prompts (40 images) – refusing all control prompts (except for CP5) and most Russian identity prompts (except for RU3). Nonetheless, it was more amenable when generating images for hypothetical scenarios (H1–2) and some Ukrainian identity prompts (UU1, 4, and 5–6). It appeared that Adobe Firefly was tuned to either reject prompts mentioning the Russia-Ukraine war or to obscure representations of war by placing subjects in peaceful settings devoid of any signs of conflict or destruction. Figure 3 summarises key patterns across platform responsiveness, predominant representational structures, common settings, presence of cultural symbols, and perceived emotional tone.

Figure 3. Platform responsiveness and general themes across the annotated corpus.
General themes
Generated images largely emphasised militaristic aspects of the Russia-Ukraine war, often depicting soldiers (66% or 103/156 images featuring human subjects) who are predominantly male, adult, and conventionally attractive. These subjects were usually portrayed from a detached or impersonal perspective, which may result in a limited understanding of the war, highlighting it as mainly a masculine, military event while neglecting the experiences of civilians, women, and the elderly. The majority of representations were static (145 instances), while dynamic representations were less common (55 instances). The frequent use of impersonal distance (ie, long shots, 75 instances) and detachment (ie, oblique horizontal angles, 51 instances) in these images could also foster a perception of the war that feels emotionally distant, potentially leading viewers to approach it in a more analytical rather than empathetic manner.
The perceived emotional tone was predominantly sombre, accounting for 43.5% (87 instances). The tone was determined through facial expressions, colour palettes (eg, greys, blues, and browns contributing to a subdued atmosphere), and composition (eg, simple compositions with backgrounds showing destruction). Undefined tones were present in 39.5% (73 instances), indicating that a significant portion had mixed or plain backgrounds, and the represented people had neutral facial expressions, particularly in medium shots and close-ups of individuals. Tense tones appeared in 14% (28 instances) of the images, generally characterised by dynamic representational structures representing combat, while peaceful tones were rare, accounting for just 3% (12 instances). Soldiers and civilians were often portrayed as stoic while looking out at the destruction, with no intense emotions. Overall, the images focused on static, sombre representations, with a significant emphasis on mixed settings and symbols associated with Ukraine rather than Russia, even when Russian identity was introduced in our prompts.
In terms of setting, mixed scenarios were most common, appearing in 40.5% of the images. These included natural environments with partially discernible backgrounds, as main subjects occupied most of the frame, as well as props in ‘staged’ photographs (eg, plain backgrounds, maps, particularly in the sub-corpus of images generated with Stable Diffusion). Scenes of destruction closely followed at 35.5%, highlighting the aftermath and impact of the war on infrastructure, civilians, and, at times, heritage sites. Other settings included active combat (10.5%), peaceful nature (9.5%), and maps (4%). Peaceful nature was favoured by Adobe Firefly, which also accounted for 64% (18 out of 28) of images featuring the Ukrainian flag, while the Russian flag appeared far less frequently (4 instances). Recognisable landmarks were included in images created by Midjourney (5 instances), Stable Diffusion (5 instances), and once by Adobe Firefly. Notably, almost all destruction depicted in the images occurred in urban areas, despite the war unfolding in both urban and rural areas.
Out of 20 AI-generated images of political leaders, only one utilised a dynamic representational structure, which involves depicting individuals as engaged in activities. Images of Zelensky and Putin were predominantly static. Most representations focused on the frontal plane, with the leaders making eye contact with the viewer. Notably, Zelensky was often portrayed at a personal distance, while Putin was primarily shown at a socio-consultative distance. In AI-generated imagery, the size of the frame can be seen as an interactional resource to position the imagined audience to relate to the representation in a certain way (Laba Reference Laba2024). Personal distance, realised by a close-up, has the potential to elicit a warmer connection to Zelensky, while socio-consultative distance, realised by a medium shot, positions Putin as an acquaintance, distanced from the viewer (Figure 4).

Figure 4. Typical representation of both political leaders.
Platform-specific ‘editorial’ biases
A cross-platform comparison reveals distinct platform biases, identity-aligned framing, and inconsistencies in symbolic representation, pointing to the role of training data and moderation policies in shaping AI-generated conflict imagery (Table 3). Midjourney prioritised dramatic, conflict-centric visuals, particularly for control prompts (CP1–CP7) and identity-specific requests. For example, 75% of CP1 images (‘Depict a scene from the Russia-Ukraine war’) featured active combat, often portraying soldiers in impersonal, tense settings. In contrast, Adobe Firefly leaned toward neutral or peaceful themes, even in hypothetical scenarios (H1–H2). This is perhaps partly due to its refusal to generate images in response to control prompts, even ones as neutral as CP1. Stable Diffusion produced generic militarised imagery and struggled with symbolic specificity, often defaulting to unspecified flags or omitting them entirely.
Table 3. Comparative summary of distinctive features of visual narratives to CP, RU, and UU prompts across the three visual generative AI systems

Identity-driven narratives
The studied platforms minimised destruction and emphasised resilience or neutrality when prompted from a Russian viewpoint (RU1–RU7). For example, RU4 (resilience of Russian people) generated peaceful civilian scenes in Adobe Firefly (eg, left in Figure 5) but destruction-focused imagery in Midjourney. Notably, cultural references like Russian flags were rare (only 2 out of 28 RU images), while Ukrainian flags appeared unintentionally in 4/28 Russian-identity prompts, suggesting training data traces (eg, right in Figure 5) or platform confusion. For Ukrainian identity prompts (UU1–UU7), the images amplified destruction and, to a lesser extent, the impact on civilians. In this set of images, symbolic emphasis is placed on national identity and emotional framing that aligns with Ukrainian narratives of resilience and victimhood. In contrast to the RU depictions of resilience, Ukrainian-identity prompts resulted in several depictions of (sombre) scenes including civilians. Most of these images depicted civilians through long, impersonal shots, with only 20 out of 70 using medium shots (eg, right in Figure 5) and nine close-ups.

Figure 5. Identity-driven narratives in Russian-identity prompts (left) and Ukrainian-identity prompts (right).
Hypothetical scenarios
National identifiers were inconsistently incorporated in hypothetical prompts (H1–H2). Adobe Firefly and Midjourney overrepresented Ukrainian flags, even when not specified in the prompt, while Stable Diffusion defaulted to militarised imagery without clear national symbols, highlighting model limitations in recognising explicit modifiers. All images generated in response to hypothetical scenarios featured human subjects, more often in small groups than as individuals, placed in the centre, which aligns with typical representation patterns seen on book covers and movie posters.
In contrast to the AI-generated images created in response to other prompts in the corpus, the depiction of gender was more balanced. Soldiers were the most frequently featured subjects, but their outfits, particularly in images generated by Midjourney, resembled military uniforms from the 19th century, especially during the later Tsarist era (Figure 6). It appears that visual generative AI interpreted some prompts as grounded in Ukrainian and Russian military history. This is evident in elements such as ornate epaulets, sashes, and high-collared coats, which were commonly worn by officers in European armies of that period, including those of the Russian Empire. Over 60% of hypothetical scenarios (H1–H2) included Ukrainian flags, even when unspecified in the prompt.

Figure 6. Representation of a hypothetical book cover (left) and a movie poster (right).
Discussion: Synthesising the multitude
By addressing representation in AI-generated images of war, we examined narratives resulting from the processes by which the Russia-Ukraine war is framed by non-human actants. AI-generated images of war illustrate how visual generative AI may selectively frame conflict and promote certain interpretations while obscuring more severe aspects of human suffering. Increasingly, decision making about which representations are made possible and which aspects are deemed too sensitive, violent, or inappropriate falls to technology companies serving as gatekeepers. This delegation of authority reconfigures collective memory by determining what can and what cannot be visualised.
First, we note that visual generative media platforms respond to prompts about the Russia-Ukraine war in varied ways. Stable Diffusion, the open-source model, was the most responsive to a wide range of prompts, whereas proprietary models like Midjourney and Adobe Firefly showed selective biases influenced by the composition of the prompt. Most images emphasised military aspects of the war, focusing on soldiers and political leaders, often with static, sombre tones. This type of representation results in narrowing of the visual narrative which overshadows civilian perspectives, human suffering, and loss (see Parry Reference Parry, Lilleker and Veneti2023).
Second, destroyed buildings and damaged infrastructure were dominant themes in most prompts; yet the omission of graphic elements such as death, injury, or bloodshed provided a sanitised view of the war. These filtered portrayals may desensitise viewers (Scharrer Reference Scharrer2008; Waddell et al. Reference Waddell, Bailey, Weber, Ivory and Downs2019), decreasing sympathy for the victims and diminishing viewer willingness to help. Unlike human-produced war photography, AI-generated images frequently exclude critical elements such as the suffering of others and representations of refugees and children (see, eg, Wells Reference Wells2007; Konstantinidou Reference Konstantinidou2008; Martikainen and Sakki Reference Martikainen and Sakki2024). Furthermore, uniform depictions of soldiers as attractive and stoic contribute to desensitisation. These AI-generated images consistently fail to capture the intense emotions and suffering experienced in war zones, contrasting sharply with human-produced photographs in traditional media, which often contain graphic depictions of war victims as well as dead and injured individuals (Parry Reference Parry2010; Hussain and Fahmy Reference Hussain and Fahmy2024).
As visual communication and journalism researchers, we face significant limitations in uncovering the internal mechanisms by which certain representations and prohibitions are enforced. The proprietary nature of most AI systems means that there is limited or no access to model infrastructure, making it difficult to determine whether content is blocked at the prompt level, filtered during image generation, or removed after the fact. While terms of service and community guidelines provide some insight into platform policies, the lack of transparency and access to moderation algorithms leaves many questions unanswered about how and why certain representations – particularly those depicting human suffering – are excluded from AI-generated imagery.
Among the three platforms addressed in the study, Midjourney’s community guidelines state to ‘avoid making visually shocking or disturbing content’ (Midjourney 2024) on the platform. Yet, this representational responsibility is placed on users rather than the platform itself, with no clear indication how the guideline is enforced. It is also not made explicit what qualifies as ‘disturbing content’ and to whom, and whether such content cannot be specified in the prompt or cannot appear in images ever, even if the prompt is more abstract like in our approach (eg, depict a scene from the Russia-Ukraine war).
Similarly, Adobe (2024) advises against using its generative AI features to create content that includes ‘promotion, glorification, or threats of violence’ and ‘graphic violence or gore,’ mentioning automated and manual methods for content filtering purposes. In contrast, Stable Diffusion (2024) makes no explicit mention of restrictions on content such as depictions of death, violence, or gore in the publicly available terms, and claims no responsibility ‘for any content created by users with the help of AI.’ However, many implementations of Stable Diffusion include their own moderation layers or community guidelines, which may restrict certain types of content (Hugging Face 2025), but this is neither universal nor specified in the core terms provided.
In the context of commemorative artefacts, both production and distribution technology transform how the memory of the multitude is formed and distributed. The standards of ‘taste and decency’ – long central to editorial decision-making in Western media – have historically shaped the ways war and atrocity are visually represented (Parry Reference Parry2012, 182). Human journalists do not indiscriminately publish graphic material – their choices reflect societal conceptions of what is appropriate to show. This tradition of self-censorship has often resulted in sanitised portrayals of war, particularly when the violence was perpetrated by one’s own national forces, thereby ‘protecting’ audiences from the most confronting aspects of conflict, as the literature reviewed in earlier sections shows. With AI-generated images of war, these same sensibilities are reproduced – though in different forms – within the algorithms, training datasets, and moderation policies that govern generative AI systems and constitute the memory of the multitude. Decisions about acceptable visual content are no longer made solely by human editors in visible production processes but are embedded into opaque platform guidelines and risk-averse content filters. The result is ‘genericity’ and ‘timelessness’ (Westberg and Kvåle Reference Westberg and Kvåle2025, 576) – representations stripped of visceral detail – producing generic, decontextualised, and often dehumanised depictions of conflict. These automated processes carry forward inherited cultural biases and aversions, including the avoidance of imagery associated with disgust (Taylor Reference Taylor1998), shaping not only how audiences engage with the present but also how the visual record of war is collectively remembered.
Albeit rooted in the training data, the aesthetic form of genericity increasingly carries into the present and future to illustrate and inform on public events and issues. Paik et al. (Reference Paik, Bonna, Novozhilova, Gao, Kim, Wijaya and Betke2023) demonstrate that specific AI models lean towards specific visual genres (such as graphic figures and posters), raising questions about how textual prompts influence the type of visuals produced. These past traces are occasionally revealed not through prompting but through system choices to represent war scenes that never took place in a particular style. In our study, this was evident in several images produced by Stable Diffusion, which included depictions resembling maps in contrast to the majority of AI ‘photographs’ produced by Midjourney and cartoon-like visualisations produced by Adobe Firefly. Although not the primary focus of this study, visual AI style and how it contributes to interpretation of war images is another layer manifested through human-archival-mechanistic entanglements of collective memory. Emerging research has already demonstrated how specific colour patterns and stylistic choices in AI-generated images affect perceptions of authenticity and how such images may contribute to reshaping collective memory by introducing fabricated yet plausible representations of historical events (García-Huete et al. Reference García-Huete, Ignacio-Cerrato, Pacios, Vázquez-Poletti, Pérez-Serrano, Donofrio, Cesarano, Schetakis and di Iorio2025).
The implications of algorithms, training datasets, and moderation policies that govern generative AI systems extend beyond the representations uncovered in our study to the emotional impact of AI-generated imagery. AI-generated images tailored to specific news headlines can evoke emotional responses similar to those triggered by human-produced images (Paik et al. Reference Paik, Bonna, Novozhilova, Gao, Kim, Wijaya and Betke2023). Depending on the topic, AI-generated images elicit a broader range of emotional responses than human-selected ones, particularly when viewed without accompanying headlines (Paik et al. Reference Paik, Bonna, Novozhilova, Gao, Kim, Wijaya and Betke2023). More broadly, interpretation of generic images – which AI-generated images can be seen as a continuation of – is guided by diverse emotions, experiences, and identities mobilising ‘the personal in engagement with the social issues portrayed’ (Kennedy et al. Reference Kennedy, Aiello, Annabell and Anderson2025, 7), and images within the ‘human-interest frame’ (ie, those featuring people) are found to lead to higher values regarding the emotional evaluations of news articles (Brantner et al. Reference Brantner, Lobinger and Wetzstein2011). Thus, future research might explore how these emotional responses vary across different audiences and contexts, aiming to better understand how AI-generated visual content shapes public perception.
Conclusions and further research
This study has addressed AI-generated images of war as a new dimension of memory making. By drawing on the memory of the multitude, visual social semiotics, and cultivation/desensitisation theory, we have examined how visual generative AI constructs visual ‘histories’ that impact collective understanding of conflicts and wars. AI-generated representations emerge not from neutral technical processes, but from human-archival-mechanistic relationships that encode particular worldviews and representational limitations into the visual record of war. As Hoskins (Reference Hoskins2024, 3) notes, they blend the ‘mechanistic and the human’ in complex ways. The resulting imagery reflects the constraints and biases of both the historical war archives and AI platform policies that prioritise safety over a more nuanced and holistic picture of the war.
A systematic examination of AI-generated war imagery reveals how what is represented in the image positions the viewer to relate to the image content in specific ways. Images function not as passive reflections of reality but as active sites for interpretation and affectual response between viewers and those depicted (Kress and van Leeuwen Reference Kress and van Leeuwen2021). However, unlike photographs taken by human war photographers – which, as our literature review highlighted, include scenes of death and suffering – AI-generated images consistently exclude the most emotionally charged aspects of conflict in favour of sanitised, less risky representations. The implications of this sanitisation extend beyond immediate desensitisation effects. When AI-generated war scenes systematically exclude images of death and human suffering, viewers may become desensitised to the point where they are less willing to provide support to those being victimised (Waddell et al. Reference Waddell, Bailey, Weber, Ivory and Downs2019). This creates a troubling disconnect between the synthetic visual narratives and the harsh realities of armed conflict, where ‘the past casts a shadow over (im)possible futures’ (Ferreday and Kuntsman Reference Ferreday and Kuntsman2011, 1), and where the ghosts of sanitised, commercially driven imagery cast shadows over future generations’ capacity to comprehend the true horror of armed conflict and respond with appropriate urgency to human suffering.
The study contributes to an interdisciplinary dialogue on collective memory at the intersections of visual communication studies, media studies, and memory studies by providing empirical insights into how generative AI mediates the visual representation of war through human-archival-mechanistic entanglements. The study invites future research on the broader cultural implications of AI-generated imagery of wars, prompting reflection on the ethical and perceptual consequences of machine-generated representations of wars. Our conclusions, however, should be read considering several limitations.
In this work, we primarily focused on AI images of war from a limited number of generative media platforms without accounting for temporal dynamics of generative models. Further research could expand this scope by including more platforms and a larger corpus, while longitudinal studies could provide insights into how AI representations might evolve. While surveys are traditionally used to find cultivation effects, our research incorporated content analysis of AI-generated images as a critical component of cultivation research methodology. By providing detailed documentation of what viewers might be exposed to, this ‘systematic analysis of content’ serves as an essential element of the ‘three-legged stool that supports a cultivation conclusion’ (Hermann et al. Reference Hermann, Morgan and Shanahan2023, 2506). The study was also constrained to default representations of AI-generated images, without considering the influence of stylistic variations. However, the way people interpret an image that resembles a naturalistic photograph, akin to those commonly seen in war reportage, can differ significantly from their perception of poster-like or cartoon-style depictions. Addressing these differences in future studies will be critical as varied modalities are likely to influence viewer responses to image narratives. Incorporating participant feedback could provide additional insights into affectual responses as, for example, Paik et al. (Reference Paik, Bonna, Novozhilova, Gao, Kim, Wijaya and Betke2023) did in relation to perceived newsworthiness in AI-generated images for visual journalism.
Ultimately, we hope that future studies could build on our findings and continue addressing new epistemological arrangements brought about by the memory of the multitude, with which come new ways of knowing the world – and remembering it – where, as Amoore et al. (Reference Amoore, Campolo, Jacobsen and Rella2024) note, generativity, latency, sequences, and pre-training become enduring forms of knowledge that redefine what can be known and acted upon.
Data availability statement
The data that support the findings of this study are available upon request.
Acknowledgements
We thank Andrew Hoskins, Anthony Downey, and Amanda Lagerkvist for their suggestions on developing the concept of collective memory. We are also grateful to the three anonymous reviewers for their detailed feedback, which greatly enhanced this work.
Competing interests
The authors declare none.
Nataliia Laba is an Assistant Professor in digital and multimodal communication / humane AI at the University of Groningen. Her research interests include representational issues in the context of visual and multimodal generative AI.
Nataliya Roman is an Associate Professor in the UNF School of Communication. Prior to her academic career, Dr. Roman worked as a reporter and documentary filmmaker for several prominent Ukrainian TV channels. Dr. Roman specializes in researching international and political communication. One of her main areas of expertise is Ukraine.
John H. Parmelee is a Professor and Director of the School of Communication at the University of North Florida. His research interests include how technology impacts political communication.