1. Introduction
Let w be a word on r letters, i.e. an element in the free group on the letters
$x_{1},\ldots,x_{r}$
. Let
$X_{1},\ldots,X_{r}$
be random
$d\times d$
unitary matrices, chosen independently at random according to the Haar probability measure, and consider the random matrix
$w(X_{1},\ldots,X_{r})$
, obtained by substituting
$X_{i}$
for
$x_{i}$
in w. For example, if
$w=x_{1}x_{2}x_{1}^{-1}x_{2}^{-1}$
, then
$w(X_{1},X_{2})=X_{1}X_{2}X_{1}^{-1}X_{2}^{-1}$
. In this paper, we study the distribution of the characteristic polynomial of
$w(X_{1},\ldots,X_{r})$
. To set notation, given a
$d\times d$
-matrix A and
$1\leq m\leq d$
, let
$c_{m}(A)$
be the coefficient of
$t^{d-m}$
in the characteristic polynomial
$\det(t\cdot\mathrm{Id}-A)$
of A. Note that
$c_{m}(A)=(-1)^{m}\,\mathrm{tr}\big(\bigwedge\nolimits^{\!m}A\big)$
, where
$\bigwedge^{\!m}A:\bigwedge^{\!m}\mathbb{C}^{d}\rightarrow\bigwedge^{\!m}\mathbb{C}^{d}$
is the mth exterior power of A. If A is unitary, all eigenvalues have absolute value 1, so we get the trivial bound
$|c_{m}(A)|\leq\binom{d}{m}$
.
Our main theorem is as follows.
Theorem 1.1. For every non-trivial word
$w\in F_{r}$
, there exists a constant
$\epsilon(w)>0$
such that

for every d and every
$1\leq m\leq d$
. In particular, we have

Remark 1.2. We make the following remarks.
-
(1) In the proof of Theorem 1.1, we show that, if the length of w is
$\ell$ and
$d\geq(25\ell)^{7\ell}$ , then one can take
$\epsilon(w)=\frac{1}{72}(25\ell)^{-2\ell}$ . We believe
$\epsilon(w)^{-1}$ can be taken to be a polynomial in
$\ell$ , for
$d\gg_{\ell}1$ .
-
(2) On the other hand, it follows from [Reference Elkasapy and ThomET15, Theorem 5.2] that, for a fixed d, one has to take
$\epsilon(w)\lesssim e^{-\sqrt{\ell}}$ , for some arbitrarily long words, even for
$m=1$ .
Theorem 1.1 relies on the following.
Theorem 1.3. For every
$m,\ell\in\mathbb{N}$
, every
$d\geq m\ell$
, and every word
$w\in F_{r}$
of length
$\ell$
, one has

In particular, if
$d\geq(22\ell)^{\ell}m$
, we have

In addition, we show that similar bounds hold for symmetric powers.
Theorem 1.4. For every
$\ell\in\mathbb{N}$
, every
$d\geq m\ell$
, and every word
$w\in F_{r}$
of length
$\ell$
, one has

In particular, if
$d\geq(16\ell)^{\ell}m$
, we have

and by the Cauchy–Schwarz inequality,

Remark 1.5. Theorem 1.4 is an analogue of Theorem 1.3. It is also an analogue of Theorem 1.1 for m at most linear in d. In contrast to exterior powers, the methods of this paper are insufficient for finding bounds similar to Theorem 1.1 for
$|\mathbb{E}(\mathrm{tr}(\mathrm{Sym}^{m}w(X_{1},\ldots,X_{r})))|$
, in the regime where m is superlinear in d.
1.1 Related work
Word maps on unitary groups and their eigenvalues have been studied extensively in the past few decades.
The case
$w=x$
, namely, the study of a Haar-random unitary matrix X, also known as the circular unitary ensemble (CUE), is an important object of study in random matrix theory (see, e.g., [Reference Anderson, Guionnet and ZeitouniAGZ10, Reference MeckesMec19] and the references therein). The joint density of the eigenvalues of X is given by the Weyl integration formula [Reference WeylWey39]. Schur’s orthogonality relations immediately imply that
$\mathbb{E}(|c_{m}(X)|^{2})=1$
for all
$1\leq m\leq d$
. Various other properties of the characteristic polynomial of a random unitary matrix X have been studied extensively (see, e.g., [Reference Keating and SnaithKS00, Reference Hughes, Keating and O’ConnellHKO01, Reference Conrey, Farmer, Keating, Rubinstein and SnaithCFK+03, Reference Bump and GamburdBG06, Reference Diaconis and GamburdDG06, Reference Bourgade, Hughes, Nikeghbali and YorBHNY08, Reference Arguin, Belius and BourgadeABB17, Reference Chhaibi, Madaule and NajnudelCMN18, Reference Paquette and ZeitouniPZ17]).
Diaconis and Shahshahani [Reference Diaconis and ShahshahaniDS94] have shown that, for a fixed
$m\in\mathbb{N}$
, the sequence of random variables
$\mathrm{tr}(X),\mathrm{tr}(X^{2}),\ldots,\mathrm{tr}(X^{m})$
converges in distribution, as
$d\rightarrow\infty$
, to a sequence of independent complex normal random variables. For the proof, which relies on the moment method, they computed the joint moments of those random variables and showed that

for
$d\geq\sum_{j=1}^{m}(a_{j}+b_{j})j$
. The rate of convergence was later shown to be super-exponential by Johansson [Reference JohanssonJoh97].
When
$w=x^{\ell}$
, (1.2) gives a formula for the moments of traces, and one can use Newton’s identities relating elementary symmetric polynomials and power sums, to deduce that

for
$d\geq2m\ell$
(see Appendix A). In [Reference RainsRai97, Reference RainsRai03], Rains partially extended (1.2) for small d and gave an explicit formula for the joint density of the eigenvalues of
$X^{\ell}$
(see [Reference RainsRai03, Theorem 1.3]).
We now move to general words
$w\in F_{r}$
. The case
$m=1$
, namely, the asymptotics as
$d\rightarrow\infty$
of the distribution of the random variable
$\mathrm{tr}(w(X_{1},\ldots,X_{r}))$
, was studied in the context of Voiculescu’s free probability (see, e.g., [Reference Voiculescu, Dykema and NicaVDN92, Reference Mingo and SpeicherMS17]). In particular, in [Reference VoiculescuVoi91, Reference RădulescuRăd06, Reference Mingo, ś niady and SpeicherMSS07] it was shown that, for a fixed
$w\in F_{r}$
, the sequence of random variables
$\mathrm{tr}(w(X_{1},\ldots,X_{r}))$
, for
$d=1,2,\ldots$
, converges in distribution, as
$d\rightarrow\infty$
, to a complex normal random variable (with suitable normalization). As a direct consequence, for a fixed
$m\in\mathbb{N}$
, the random variables
$c_{m}(w(X_{1},\ldots,X_{r}))$
converge, as
$d\rightarrow\infty$
, to a certain explicit polynomial of Gaussian random variables. This is done in Appendix A, Corollary A.4, following [Reference Diaconis and GamburdDG06].
In [Reference Magee and PuderMP19], Magee and Puder have shown that
$\mathbb{E}(\mathrm{tr}(w(X_{1},\ldots,X_{r})))$
coincides with a rational function of d, if d is sufficiently large, and bounded its degree in terms of the commutator length of w. They also found a geometric interpretation for the coefficients of the expansion of that rational function as a power series in
$d^{-1}$
, see [Reference Magee and PuderMP19, Corollaries 1.8 and 1.11]. See [Reference BrodskyBro24] for additional work in this direction.
1.2 Ideas of proofs
With a few exceptions, the results stated in § 1.1 are asymptotic in d, but not uniform in both m and d. We try to explain some of the challenges in proving results that are uniform in m, while explaining the idea of the proof of Theorem 1.1.
Our main tool (which is also used in the papers [Reference RădulescuRăd06, Reference Mingo, ś niady and SpeicherMSS07, Reference Magee and PuderMP19] cited previously) to study integrals on unitary groups is the Weingarten calculus [Reference WeingartenWei78, Reference CollinsCol03, Reference Collins and ŚniadyCS06]. Roughly speaking, the Weingarten calculus utilizes the Schur–Weyl duality to express integrals on unitary groups as sums of so-called Weingarten functions over symmetric groups. In our case, in order to prove Theorem 1.1, we need to estimate the integral

Using Weingarten calculus (Theorem 2.12), we express (1.3) as a finite sum

where
$\ell_{1},\ldots,\ell_{2r}\in\mathbb{N}$
and
$F:\prod_{i=1}^{2r}S_{m\ell_{i}}\rightarrow\mathbb{Z}$
are related to combinatorial properties of w, and each
$\mathrm{Wg}_{d}^{(i)}:S_{m\ell_{i}}\rightarrow\mathbb{R}$
is a Weingarten function (see Definition 2.10). There are two main difficulties when dealing with sums such as (1.4) in the region when m is unbounded.
-
(1) While the asymptotics of Weingarten functions
$\mathrm{Wg}_{d}:S_{m}\rightarrow\mathbb{R}$ are well understood when
$d\gg m$ (see [Reference CollinsCol03, Section 2.2] and [Reference Collins and MatsumotoCM17, Theorem 1.1]), much less is known in the regime where m is comparable with d.
-
(2) Even if we have a good understanding of a single Weingarten function, the number of summands in (1.4) is large and it is not enough to bound each individual Weingarten function.
Luckily, there are plenty of cancellations in the sum (1.4). To understand these cancellations, we identify a symmetry of (1.4). More precisely, we find a group H acting on
$\prod_{i=1}^{2r}S_{m\ell_{i}}$
such that F is equivariant with respect to H, and such that the contribution of any H-orbit to the sum (1.4) is a product of terms, each of which has the form

where
$\mathrm{sgn}(x)$
is the sign of x and the sum is over the Young subgroup
$S_{m}^{\ell_{i}}\subseteq S_{m\ell_{i}}$
, see Corollary 5.3.
Weingarten functions are class functions, so they are linear combinations of irreducible characters of
$S_{m\ell_{i}}$
. Explicitly, we have (see [Reference Collins and ŚniadyCS06, (13)])

where each
$\lambda$
is a partition of
$m\ell_{i}$
with at most d parts and
$\chi_{\lambda}$
and
$\rho_{\lambda}$
are the corresponding irreducible characters of
$S_{m\ell_{i}}$
and
$\mathrm{U}_{d}$
, respectively. The cancellations that we get in the sum (1.5) come from averaging irreducible characters of
$S_{m\ell_{i}}$
over
$S_{m}^{\ell_{i}}$
-cosets. Here
$S_{m}^{\ell_{i}}$
is a large subgroup of
$S_{m\ell_{i}}$
, so these cancellations will be significant as well. For example, all terms in (1.6) for which
$\lambda$
has more than
$\ell_{i}$
columns vanish. See Lemmas 2.7 and 2.8 for the precise bounds.
After we bound the average contribution of each H-orbit in the sum (1.4) by a function C(m,d,w), we bound (1.4) by
$|Z|\cdot C(m,d,w)$
for some finite set Z. This becomes a counting problem, which we solve in § 6, see Proposition 6.1.
The proof of Theorem 1.1 occupies §§ 4, 5, 6 and 7. Since the combinatorics of general words is a bit complicated, we prove a simplified version of Theorem 1.3 for the special case of the Engel word [[x,y],y] in § 3. The proof for this special case contains the main ideas of the paper, while being easier to understand.
1.3 Further discussion and some open questions
The results of this paper fit in the larger framework of the study of word measures and their Fourier coefficients.
Let G be a compact group, and let
$\mu_{G}$
be the Haar probability measure on G. To each word
$w(x_{1},\ldots,x_{r})\in F_{r}$
we associate the corresponding word map
$w_{G}:G^{r}\rightarrow G$
, defined by
$(g_{1},\ldots,g_{r})\mapsto w(g_{1},\ldots,g_{r})$
. The pushforward measure
$(w_{G})_{*}(\mu_{G}^{r})$
is called the word measure
$\tau_{w,G}$
associated with w and G. Let
$\mathrm{Irr}(G)$
be the set of irreducible characters of G. The Fourier coefficient of
$\tau_{w,G}$
at
$\rho\in\mathrm{Irr}(G)$
is

If
$w\neq1$
and G is a compact semisimple Lie group, then by Borel’s theorem [Reference BorelBor83], the map
$w_{G}:G^{r}\rightarrow G$
is a submersion outside a proper subvariety in
$G^{r}$
. It follows that
$\tau_{w,G}$
is absolutely continuous with respect to
$\mu_{G}$
and, therefore,
$\tau_{w,G}=f_{w,G}\cdot \mu_{G}$
, where
$f_{w,G}\in L^{1}(G)$
is the Radon–Nikodym density. In this case,
$f_{w,G}=\sum_{\rho\in\mathrm{Irr}(G)}\overline{a_{w,G,\rho}}\cdot\rho$
.
In [Reference Larsen, Shalev and TiepLST19, Theorem 4], Larsen, Shalev, and Tiep proved uniform
$L^{\infty}$
-mixing time for convolutions of word measures on sufficiently large finite simple groups. From this, the following can be deduced.
Theorem 1.6. For every
$w\in F_{r}$
, there exists
$N(w)\in\mathbb{N}$
such that if G is a finite simple group with at least N(w) elements, then

for
$\epsilon(w)=C\cdot\ell(w)^{-4}$
and some absolute constant C.
The proof of Theorem 1.6 is given at the end of § 7.
We believe that a similar statement should be true for compact semisimple Lie groups.
Conjecture 1.7. For every
$1\neq w\in F_{r}$
, there exists
$\epsilon(w)>0$
such that, for every compact connected semisimple Lie group G and every
$\rho\in\mathrm{Irr}(G)$
,

It is natural to estimate
$\epsilon(w)$
in terms of the length
$\ell(w)$
of the word w. For simple groups of bounded rank, item (2) of Remark 1.2 (i.e. [Reference Elkasapy and ThomET15, Theorem 5.2]) shows that there are arbitrarily long words w for which
$\epsilon(w)$
cannot be larger than
$e^{-\sqrt{\ell(w)}}$
. However, we believe that better Fourier decay can be achieved for the high-rank case.
Question 1.8. Can one take
$\epsilon(w)$
to be a polynomial in
$\ell(w)$
, if
$\mathrm{rk}(G)\gg_{\ell(w)}1$
?
Theorem 1.1 gives evidence to Conjecture 1.7 for
$G=\mathrm{SU}_{d}$
and the collection of fundamental representations
$\big\{ \bigwedge^{\!m}\mathbb{C}^{d}\big\}_{m=1}^{d}$
. Indeed, for every
$\rho\in\mathrm{Irr}(\mathrm{U}_{d})$
, since
$|\rho(\lambda A)|=|\rho(A)|$
for
$A\in\mathrm{SU}_{d}$
and
$\lambda\in\mathrm{U}_{1}$
, and since
$\mu_{\mathrm{U}_{d}}$
is the pushforward of
$\mu_{\mathrm{U}_{1}}\times\mu_{\mathrm{SU}_{d}}$
by the multiplication map
$(\lambda,A)\mapsto\lambda A$
, we have

Theorem 1.4 deals with another family of irreducible representations
$\{ \mathrm{Sym}^{m}\mathbb{C}^{d}\}_{m=1}^{\lfloor d/(16\ell)^{\ell}\rfloor }$
, giving further evidence for Conjecture 1.7.
Verifying Conjecture 1.7 will imply that, for every word w, the random walks induced by the collection of measures
$\{\tau_{w,G}\}_{G}$
, where G runs over all compact connected simple Lie groups, admit a uniform
$L^{\infty}$
-mixing time. Namely, using [Reference Guralnick, Larsen and ManackGLM12, Theorem 1], it will show the existence of
$t(w)\in\mathbb{N}$
such that

for every compact connected simple Lie group G. By the above discussion, t(w) grows at least exponentially with
$\sqrt{\ell(w)}$
under no restriction on the rank. If the condition (1.10) is replaced by the condition that
$\tau_{w,G}^{*t(w)}$
has bounded density, one might hope for polynomial bounds.
Question 1.9. Let
$1\neq w\in F_{r}$
. Can one find
$t(w)\in\mathbb{N}$
such that for every compact connected semisimple Lie group G,
$\tau_{w,G}^{*t(w)}$
has bounded density with respect to
$\mu_{G}$
? Can t(w) be chosen to have polynomial dependence on
$\ell(w)$
?
Question 1.9 can be seen as an analytic specialization of a geometric phenomenon. Let
$\varphi:X\rightarrow Y$
be a polynomial map between smooth
$\mathbb{Q}$
-varieties. We say that
$\varphi$
is (FRS) if it is flat and its fibers all have rational singularities. In [Reference Aizenbud and AvniAA16, Theorem 3.4], Aizenbud and the first author showed that if
$\varphi$
is (FRS), then for every non-Archimedean local field F and every smooth, compactly supported measure
$\mu$
on X(F), the pushforward
$\varphi_{*}\mu$
has bounded density. This result was extended in [Reference ReiserRei18] to the Archimedean case,
$F=\mathbb{R}$
or
$\mathbb{C}$
, and, moreover, if one runs over a large enough family of local fields, the condition of (FRS) is, in fact, necessary as well for the densities of pushforwards to be bounded (see [Reference Aizenbud and AvniAA16, Theorem 3.4] and [Reference Glazer, Hendel and SodinGHS24, Corollary 6.2]).
To rephrase Question 1.9 in geometric term, we further need the following notion from [Reference Glazer and HendelGH19, Reference Glazer and HendelGH21].
Definition 1.10. [Reference Glazer and HendelGH19, Definition 1.1]. Let
$\varphi:X\rightarrow G$
and
$\psi:Y\rightarrow G$
be morphisms from algebraic varieties X,Y to an algebraic group G. We define their convolution by

We denote by
$\varphi^{*k}:X^{k}\rightarrow G$
the k-fold convolution of
$\varphi$
with itself.
Based on the above discussion, a positive answer to the following question will answer Question 1.9 positively.
Question 1.11. [Reference Glazer and HendelGH24, Question 1.15]. Can we find
$\alpha,C>0$
such that, for every
$w\in F_{r}$
of length
$\ell$
and every simple algebraic group G, the word map
$w_{G}^{*C\ell^{\alpha}}$
is (FRS)?
In [Reference Glazer and HendelGH19, Reference Glazer and HendelGH21], the second author and Hendel have shown that any dominant map
$\varphi:X\rightarrow G$
from a smooth variety to a connected algebraic group becomes (FRS) after sufficiently many self-convolutions. Concrete bounds were given in [Reference Glazer, Hendel and SodinGHS24, Corollary 1.9]. Based on these results, we prove Conjecture 1.7 and answer Question 1.9 for the bounded rank case (see Proposition 7.2).
To conclude the discussion, we remark that a positive answer for Question 1.11 will answer Question 1.9 for compact semisimple p-adic groups as well. Significant progress was made in this direction in the work [Reference Glazer and HendelGH24], by the second author and Hendel, where singularities of word maps on semisimple Lie algebras and algebraic groups were studied.
1.4 Conventions and notation
-
(1) We denote the set
$\{1,\ldots,N\}$ by [N].
-
(2) For a finite set X, we denote the symmetric group on X by
$\mathrm{Sym}(X)$ and the space of functions
$f:X\rightarrow\mathbb{C}$ by
$\mathbb{C}[X]$ .
-
(3) We write
$(-1)^{\sigma}$ for the sign of a permutation
$\sigma$ .
-
(4) For a group G, a representation is a pair
$(\pi,V)$ , with
$\pi:G\rightarrow\mathrm{GL}(V)$ a homomorphism. We denote the character of
$(\pi,V)$ by
$\chi_{\pi}$ and denote its dual by
$(\pi^{\vee},V^{\vee})$ .
2. Preliminaries
2.1 Some facts in representation theory
For a compact group G, we denote the set of irreducible complex characters of G by
$\mathrm{Irr}(G)$
. Given a subgroup
$H\leq G$
and a character
$\chi\in\mathrm{Irr}(H)$
, we denote the induction of
$\chi$
to G by
$\mathrm{Ind}_{H}^{G}\chi$
. We normalize the Haar measure to be a probability measure and denote the expectation with respect to the Haar measure by
$\mathbb{E}$
. The standard inner product on functions on G is
$\langle f_{1},f_{2}\rangle_{G}=\mathbb{E}f_{1}\overline{f_{2}}$
.
2.1.1 Representation theory of the symmetric group
Given
$m\in\mathbb{N}$
, a partition of m is a non-increasing sequence
$\lambda=(\lambda_{1},\ldots,\lambda_{k})$
of non-negative integers that sum to m. In this case, we write
$\lambda\vdash m$
. Two partitions are equivalent if they differ only by a string of zeros at the end. A partition
$\lambda=(\lambda_{1},\ldots,\lambda_{k})$
, with
$\lambda_{k}>0$
, is graphically encoded by a Young diagram, which is a finite collection of boxes (or cells) arranged in k left-justified rows, where the jth row has
$\lambda_{j}$
boxes. The length
$\ell(\lambda)$
of a partition
$\lambda\vdash m$
is the number of non-zero parts
$\lambda_{i}$
or, equivalently, the number of rows in the corresponding Young diagram.
The irreducible representations of
$S_{m}$
are in bijection with partitions
$\lambda\vdash m$
. We write
$\chi_{\lambda}\in\mathrm{Irr}(S_{m})$
for the corresponding character. For each cell (i,j) in the Young diagram of
$\lambda$
, the hook length
$h_{\lambda}(i,j)$
is the number of cells (a,b) in the Young diagram of
$\lambda$
such that either
$a=i$
and
$b\geq j$
, or
$a\geq i$
and
$b=j$
. The hook-length formula states that

Definition 2.1.
-
(1) Fix a Young diagram
$\lambda$ and let
$n\in\mathbb{N}$ . An n-expansion of
$\lambda$ is any Young diagram obtained by adding n boxes to
$\lambda$ in such a way that no two boxes are added in the same column.
-
(2) Given a partition
$\lambda=(\lambda_{1},\ldots,\lambda_{l_{1}})\vdash k$ and a partition
$\mu=(\mu_{1},\ldots,\mu_{l_{2}})\vdash l$ , a
$\mu$ -expansion of
$\,\lambda$ is defined to be a
$\mu_{l_{2}}$ -expansion of a
$\mu_{l_{2}-1}$ -expansion of a
$\cdots$ of a
$\mu_{1}$ -expansion of the Young diagram of
$\lambda$ . For a
$\mu$ -expansion of
$\lambda$ , we label the boxes added in the
$\mu_{l_{j}}$ -expansion by the number j and order the boxes lexicographically by their position, first from top to bottom and then from right to left. We say that a
$\mu$ -expansion of
$\lambda$ is strict if, for every
$p\in\{1,\ldots,l_{2}-1\}$ and every box t, the number of boxes coming before t that are labeled p is greater than or equal to the number of boxes coming before t that are labeled
$(p+1)$ .
Theorem 2.2 (Littlewood–Richardson rule [Reference MacdonaldMac95, I.9]). Let
$\lambda\vdash k$
and
$\mu\vdash m$
. Then,

where
$N_{\lambda\mu\nu}$
is the number of strict
$\mu$
-expansions of
$\lambda$
that are a Young diagram of the partition
$\nu$
.
We need the following consequence of Theorem 2.2.
Lemma 2.3. Let
$l\in\mathbb{Z}_{\geq2}$
and identify
$S_{m}^{l}$
with its image in the standard embedding
$S_{m}^{l}\hookrightarrow S_{ml}$
. Then, each
$\chi_{\nu}\in\mathrm{Irr}(S_{ml})$
appearing in
$\mathrm{Ind}_{\mathrm{S}_{m}^{l}}^{S_{ml}}(1)$
(respectively,
$\mathrm{Ind}_{\mathrm{S}_{m}^{l}}^{S_{ml}}(\mathrm{sgn})$
) corresponds to a partition
$\nu\vdash ml$
with at most l rows (respectively, l columns).
Proof. We prove the statement for the trivial representation 1 by induction on l. The proof for sgn is similar. The character 1 of
$S_{m}$
corresponds to the partition
$\lambda$
consisting of one row of length m. By the induction hypothesis, we may assume that
$\mathrm{Ind}_{\mathrm{S}_{m}^{j}}^{S_{mj}}(1)=\bigoplus_{\mu\vdash mj}m_{\mu}\chi_{\mu}$
, with
$m_{\mu}>0$
only if
$\mu$
has at most j rows, for all
$j<l$
. Hence, we can write

By Theorem 2.2 and since a strict
$\lambda$
-expansion of
$\mu$
increases the number of rows by at most one, the lemma follows.
2.1.2 Representation theory of the unitary group
The irreducible representations of
$\mathrm{U}_{d}$
can be identified with the irreducible rational representations of
$\mathrm{GL}_{d}(\mathbb{C})$
. More precisely, the restriction map
$\rho\mapsto\rho|_{\mathrm{U}_{d}}$
induces a bijection
$\mathrm{Irr}(\mathrm{GL}_{d}(\mathbb{C}))\rightarrow\mathrm{Irr}(\mathrm{U}_{d})$
. Moreover, the set
$\mathrm{Irr}(\mathrm{U}_{d})$
is in bijection with the set
$\Lambda$
of dominant weights,

We denote the representation corresponding to
$\lambda\in\Lambda$
by
$(\rho_{\lambda},V_{\lambda})$
. The irreducible representations

are called the fundamental representations, and we have
$\bigwedge\nolimits^{\!m}\mathbb{C}^{d}\simeq V_{(1,\ldots,1,0,\ldots,0)}$
, with 1 appearing m times. In particular, the standard representation
$\mathbb{C}^{d}$
is
$V_{(1,0,\ldots,0)}$
. Note that
$\bigwedge^{\!d}\mathbb{C}^{d}$
is the determinant representation
$\chi_{\det}$
. We identify a weight
$\lambda\in\Lambda$
such that
$\lambda_{d}\geq0$
with a partition
$(\lambda_{1},\ldots,\lambda_{d})$
.
Remark 2.4 [Reference Fulton and HarrisFH91, I.6, Exc. 6.4]. For each
$\lambda=(\lambda_{1},\ldots,\lambda_{d})\vdash m$
,

where (i,j) are the coordinates of the cells in the Young diagram with shape
$\lambda$
.
Given
$\lambda,\mu\in\Lambda$
, the irreducible subrepresentations of
$\rho_{\lambda}\otimes\rho_{\mu}$
are determined by the Littlewood–Richardson rule as follows.
Theorem 2.5 (Littlewood–Richardson rule; see, e.g., [Reference Fulton and HarrisFH91, I.6, Equation (6.7)]). Let
$\lambda,\mu\in\Lambda$
and suppose that
$\lambda_{d},\mu_{d}\geq0$
. Let
$N_{\lambda\mu\nu}$
be the coefficients from Theorem 2.2. Then,

Remark 2.6. For
$\lambda,\mu\in\Lambda$
, set
$\widetilde{\lambda}:=\lambda-(\lambda_{d},\ldots,\lambda_{d})$
and
$\widetilde{\mu}:=\mu-(\mu_{d},\ldots,\mu_{d})$
. Then
$\rho_{\lambda}=\chi_{\det}^{\lambda_{d}}\cdot\rho_{\widetilde{\lambda}}$
and
$\rho_{\mu}=\chi_{\det}^{\mu_{d}}\cdot\rho_{\widetilde{\mu}}$
, and hence by Theorem 2.5, one has

2.1.3 Averaging characters over cosets
Lemma 2.7. Let G be a finite group, let
$(\pi,V)$
be an irreducible representation of G, let
$H\leq G$
be a subgroup, and let
$\lambda$
be any one-dimensional character of H. Then, for every
$g\in G$
,

In particular, if
$\langle \chi_{\pi}|_{H},\lambda\rangle_{H}=0$
, then
$\sum_{h\in H}\lambda^{-1}(h)\chi_{\pi}(gh)=0$
.
Proof. Write
$\pi|_{H}=\bigoplus_{i=1}^{\widetilde{N}}\pi_{i}$
with each
$(\pi_{i},V_{i})$
an irreducible representation of H. For each i and
$h'\in H$
,

By Schur’s lemma,
$\sum_{h\in H}\lambda^{-1}(h)\pi_{i}(h)$
is a scalar matrix
$\alpha\cdot I_{V_{i}}$
, for some
$\alpha\in\mathbb{C}$
. Hence,

Let
$L:=\{ v\in V:\pi(h)v=\lambda(h)\cdot v,\ \forall h\in H\} $
be the subspace of
$(H,\lambda)$
-equivariant vectors in V and let
$L^{\bot}$
be an H-invariant subspace of V with
$V=L\oplus L^{\bot}$
. By (2.5), the map
$A:=\sum_{h\in H}\lambda^{-1}(h)\pi(h)\in\mathrm{End}(V)$
satisfies
$A|_{L^{\bot}}=0$
and
$A|_{L}=|H|\cdot I_{L}$
. Take an orthonormal basis
$v_{1},\ldots,v_{N}$
for V with
$L=\langle v_{1},\ldots,v_{M}\rangle$
,
$L^{\bot}=\langle v_{M+1},\ldots,v_{N}\rangle$
. Then,

and the lemma follows.
The following lemma gives a different estimate on the average of a character over a coset, and this estimate is sharper when the double coset HgH is large. We will not need these alternative estimates, but we thought it could be useful to state them.
Lemma 2.8. Let G be a finite group, and let
$H\leq G$
be a subgroup. Then, for each
$\chi\in\mathrm{Irr}(G)$
and each
$g\in G$
,

Proof. Let G be a finite group. For each
$\chi\in\mathrm{Irr}(G)$
, we denote by
$(\pi_{\chi},V_{\chi})$
the representation corresponding to
$\chi$
. The non-commutative Fourier transform (see, e.g., [Reference ApplebaumApp14, Section 2.3]) is the map
${\mathcal F}:\mathbb{C}[G]\rightarrow\bigoplus_{\chi\in\mathrm{Irr}(G)}\mathrm{End}(V_{\chi})$
defined by
$f\mapsto\widehat{f}:=(\widehat{f}(\chi))_{\chi\in\mathrm{Irr}(G)}$
, where
$\widehat{f}(\chi)=(\frac{1}{|G|})\sum_{g'\in G}f(g')\pi_{\chi}(g'^{-1})$
. We denote by
$\Vert f\Vert_{2}:=\big((\frac{1}{|G|})\sum_{g'\in G}|f(g')|^{2}\big)^{{1}/{2}}$
. Similarly, for a collection of endomorphisms
$(A_{\chi})_{\chi\in\mathrm{Irr}(G)}\in\bigoplus_{\chi\in\mathrm{Irr}(G)}\,\mathrm{End}(V_{\chi})$
, with
$A_{\chi}\in\mathrm{End}(V_{\chi})$
, we define

where
$\Vert A_{\chi}\Vert_{\mathrm{HS}}:=\mathrm{tr}(A_{\chi}\cdot A_{\chi}^{*})^{{1}/{2}}$
is the Hilbert–Schmidt norm on
$\mathrm{End}(V_{\chi})$
. The Plancherel theorem (see, e.g., [Reference ApplebaumApp14, Theorem 2.3.1(2)]), states that

Let
$\psi_{HgH}:=(\frac{1}{|HgH|})1_{HgH}$
. For each
$\chi\in\mathrm{Irr}(G)$
, one has

The square of the
$L^{2}$
-norm of
$\psi_{HgH}$
is given by

Let
$v_{1},\ldots,v_{M}$
be an orthonormal basis of
$V_{\chi}^{H}:=\{ v\in V_{\chi}:\pi_{\chi}(h)\cdot v=v,\ \forall h\in H\}$
with respect to some G-invariant inner product
$\langle\,,\,\rangle$
on
$V_{\chi}$
, with
$M=\langle\chi,1\rangle_{H}$
. Let
$\big(V_{\chi}^{H}\big)^{\perp}$
be the orthogonal complement to
$V_{\chi}^{H}$
in
$V_{\chi}$
. In the proof of Lemma 2.7, in the case that
$\lambda=1$
, we have seen that

In particular, we have

Hence,

By (2.6), (2.7) is equal to (2.9), hence,

where the first equality follows from (2.8), and the first inequality follows from Cauchy–Schwarz inequality.
2.2 Weingarten calculus
In §§ 2.1.1 and 2.1.2 we stated that each partition
$\lambda\vdash m$
with
$\ell(\lambda)\leq d$
induces two different representations,
$\rho_{\lambda}\in\mathrm{Irr}(\mathrm{U}_{d})$
and
$\chi_{\lambda}\in\mathrm{Irr}(S_{m})$
. There is a deeper connection between
$\rho_{\lambda}$
and
$\chi_{\lambda}$
coming from the Schur–Weyl duality: the space
$(\mathbb{C}^{d})^{\otimes m}$
carries a natural action of
$\mathrm{U}_{d}\times S_{m}$
, where
$A\in\mathrm{U}_{d}$
acts diagonally
$A\cdot(v_{1}\otimes\cdots\otimes v_{m})=Av_{1}\otimes\cdots\otimes Av_{m}$
, and
$\sigma\in S_{m}$
acts by
$\sigma\cdot (v_{1}\otimes\cdots\otimes v_{m})=v_{\sigma(1)}\otimes\cdots\otimes v_{\sigma(m)}$
. The Schur–Weyl duality can be phrased as follows.
Theorem 2.9 (Schur–Weyl duality [Reference WeylWey39]). The space
$(\mathbb{C}^{d})^{\otimes m}$
is a multiplicity-free representation of
$\mathrm{U}_{d}\times S_{m}$
. The decomposition of
$(\mathbb{C}^{d})^{\otimes m}$
into irreducible components is given by

There are two special functions on
$S_{m}$
which come from (2.10). First, writing
$\ell(\sigma)$
for the number of disjoint cycles in
$\sigma\in S_{m}$
, the character of
$(\mathbb{C}^{d})^{\otimes m}$
as a representation of
$S_{m}$
is the function
$\sigma\mapsto d^{\ell(\sigma)}$
.
Recall we have an isomorphism of algebras
$\mathbb{C}[S_{m}]\simeq\bigoplus_{\lambda\vdash m}\mathrm{End}(V_{\chi_{\lambda}})$
, where the multiplication in
$\mathbb{C}[S_{m}]$
is the convolution operation
$f_{1}*f_{2}(y):=\sum_{x\in S_{m}}f(x)g(x^{-1}y)$
. We denote by
$\mathbb{C}_{d}[S_{m}]$
the subalgebra corresponding to
$\bigoplus_{\lambda\vdash m,\ell(\lambda)\leq d}\mathrm{End}(V_{\chi_{\lambda}})$
.
Definition 2.10 [Reference Collins and ŚniadyCS06, Proposition 2.3]. Let
$d\in\mathbb{N}$
. The Weingarten function
$\mathrm{Wg}_{d}:S_{m}\rightarrow\mathbb{C}$
is the inverse of the function
$d^{\ell(\sigma)}$
in the ring
$\mathbb{C}_{d}[S_{m}]$
. It has the following Fourier expansion:

Remark 2.11. Since in this paper we only consider
$\mathrm{Wg}_{d'}(\sigma)$
for
$d'=d$
, we write Wg instead of
$\mathrm{Wg}_{d}$
.
The Weingarten calculus, developed in [Reference WeingartenWei78, Reference CollinsCol03, Reference Collins and ŚniadyCS06], utilizes the Schur–Weyl duality to express integrals on unitary groups as finite sums of Weingarten functions on symmetric groups. One formulation is the following theorem by Collins and Śniady.
Theorem 2.12 [Reference Collins and ŚniadyCS06, Corollary 2.4]. Let
$(i_{1},\ldots,i_{m})$
,
$(j_{1},\ldots,j_{m})$
,
$(i'_{1},\ldots,i'_{m})$
, and
$(j'_{1},\ldots,j'_{m})$
be tuples of integers in [d]. Then,

We will use a coordinate-free version of Theorem 2.12 which we proceed to state.
Definition 2.13. Let
$\Omega$
be a set.
-
(1) A symmetric partition
$\Phi$ of
$\Omega$ is a partition
$\Omega=\bigsqcup_{i=1}^{r}A_{i}\sqcup\bigsqcup_{i=1}^{r}B_{i}$ , where
$|A_{i}|=|B_{i}|$ .
-
(2) Given a symmetric partition
$\Phi=(A_{1},\ldots,A_{r},B_{1},\ldots,B_{r})$ , let
\[S_{\Phi}=\{ \Sigma\in\mathrm{Sym}(\Omega):\Sigma(A_{i})=B_{i},\Sigma(B_{i})=A_{i}\} .\]
-
(3) If
$\Sigma\in S_{\Phi}$ , then
$\Sigma^{2}(A_{i})=A_{i}$ and we define
$\widetilde{\mathrm{Wg}}(\Sigma^{2})=\prod_{i=1}^{r}\mathrm{Wg}(\Sigma^{2}|_{A_{i}})$ .
Proposition 2.14. Let
$\Phi=(A,B)$
be a symmetric partition of
$\Omega$
and let
$F,H:\Omega\rightarrow[d]$
. Then

Proof. Identify
$A\cong\{ 1,\ldots,m\} $
and
$B\cong\{ -1,\ldots,-m\}$
and let
$\overrightarrow{\!i},\overrightarrow{\!j},\overrightarrow{\!i}',\overrightarrow{\!j}'\in[d]^{m}$
be

Then, by Theorem 2.12,

For
$\sigma,\tau\in S_{m}$
, let
$\Sigma_{(\sigma,\tau)}\in\mathrm{Sym}(A\sqcup B)\cong\mathrm{Sym}(\{-m,\ldots,-1,1,\ldots,m\} )$
be the permutation

The map
$(\sigma,\tau)\mapsto\Sigma_{(\sigma,\tau)}$
is a bijection
$S_{m}^{2}\cong S_{\Phi}$
and the condition
$\delta_{i_{1},i'_{\sigma(1)}}\cdots\delta_{i_{m},i'_{\sigma(m)}}\cdot\delta_{j_{1},j'_{\tau(1)}}\cdots\delta_{j_{m},j'_{\tau(m)}}=1$
is equivalent to
$H=F\circ\Sigma_{(\sigma,\tau)}$
. Finally, the permutation
$(\Sigma_{(\sigma,\tau)})^{2}$
acts on A as
$\sigma^{-1}\tau$
, and the result follows.
Corollary 2.15. Let
$\Phi=(A_{1},\ldots,A_{r},B_{1},\ldots,B_{r})$
be a symmetric partition of
$\Omega$
and let
$F,H:\Omega\rightarrow[d]$
. Then

3. The Engel word as a model case
‘Those who run to long words are mainly the unskillful and tasteless; they confuse pomposity with dignity, flaccidity with ease, and bulk with force.’ [Reference FowlerFow65, p. 342]
In this section we prove the following simplified version of Theorem 1.3 for the Engel word. We chose the Engel word since it is short enough to make the proof easier to digest, while at same time complicated enough so that the proof contains most of the key ideas in the paper.
Theorem 3.1. Let X,Y be independent random variables with respect to the normalized Haar measure on
$\mathrm{U}_{d}$
. For every
$d\geq2m$
, one has

Let
$w=[[x,y],y]=xyx^{-1}yxy^{-1}x^{-1}y^{-1}$
be the Engel word. We would like to compute
$\mathbb{E}\big(\mathrm{tr}\bigwedge\nolimits^{\!m}w(X,Y)\big)$
. Denote
${\mathcal I}_{m,d}:=\{a_{1}<\cdots<a_{m}:a_{i}\in[d]\}$
, and note that

We have

The group
$S_{m}$
acts on
$[d]^{m}$
by
$\sigma(\overrightarrow{\!v})_{i}=\overrightarrow{\!v}_{\sigma^{-1}(i)}$
for any
$\sigma\in S_{m}$
and
$\overrightarrow{\!v}\in[d]^{m}$
. Similarly, given
$\overrightarrow{\!v},\overrightarrow{w}\in[d]^{m}$
and
$\tau\in S_{2m}$
, we denote by
$(\overrightarrow{\!v},\overrightarrow{w})$
the element in
$[d]^{2m}$
given by
$(\overrightarrow{\!v},\overrightarrow{w})_{i}=\small{\begin{cases}\overrightarrow{\!v}_{i} & \text{if }i\leq m\\\overrightarrow{w}_{i-m} & \text{if }m<i\leq2m\end{cases}}$
, and denote by
$\tau(\overrightarrow{\!v},\overrightarrow{w})_{i}=(\overrightarrow{\!v},\overrightarrow{w})_{\tau^{-1}(i)}$
. In particular, writing
$X_{\overrightarrow{\!v},\overrightarrow{\!u}}:=\prod_{i=1}^{m}X_{v_{i},u_{i}}$
for
$\overrightarrow{\!v},\overrightarrow{\!u}\in[d]^{m}$
, we have

We now rewrite the expected value of (3.3) using Weingarten calculus. For this, define

and

Lemma 3.2 We have

Proof. Using Weingarten calculus, i.e. Theorem 2.12, and (3.3),

Applying the change of coordinate
$\sigma_{2}:=(\mathrm{\pi}\times\mathrm{Id})\circ\widetilde{\sigma}_{2}$
, and observing that
$\widetilde{\sigma}_{2}^{-1}(\pi^{-1}(\overrightarrow{\!a}),\overrightarrow{\!{\kern-.5pt}d})=\sigma_{2}^{-1}(\overrightarrow{\!a},\overrightarrow{\!{\kern-.5pt}d})$
, (3.6) becomes

In order to bound (3.5), we consider a natural action of
$S_{m}^{7}$
on Z, and find a suitable change of coordinates such that the average of the product of the Weingarten functions in (3.5) over any
$S_{m}^{7}$
-orbit is equal to a product of averages of individual Weingarten functions over cosets (see (3.8)). We then use Lemma 2.7 to estimate the contribution in (3.5) of each
$S_{m}^{7}$
-orbit. To conclude the estimates of (3.5), we will further provide estimates for
$|Z|$
.
We first describe the action of
$S_{m}^{7}$
. The element
$(\pi_{b},\pi_{c},\ldots,\pi_{D})\in S_{m}^{7}$
acts on
$(\overrightarrow{\!a},\ldots,\overrightarrow{\!{\kern-.5pt}D})$
by
$(\overrightarrow{\!a},\pi_{b}(\overrightarrow{\!b}),\pi_{c}(\overrightarrow{\!c}),\ldots,\pi_{D}(\overrightarrow{\!{\kern-.5pt}D}))$
and it acts on
$(\sigma_{1},\sigma_{2},\tau_{1},\tau_{2})$
by




This gives rise to an action of
$S_{m}^{7}$
on Z. The action on the input of the Weingarten functions becomes

where we used the conjugacy invariance of Wg to move permutations from right to left. Consider the bijection
$\psi:S_{m}^{8}\rightarrow S_{m}^{8}$
, defined by
$(x_{1},\ldots,x_{8})\mapsto(x_{1},x_{1}x_{2},\ldots,x_{1}x_{2},\ldots, x_{8})$
. Under the change of coordinates
$(\theta_{D},\theta_{c},\theta_{A},\theta_{b},\theta_{C},\theta_{d},\theta_{B},\theta):=\psi^{-1}(\pi_{D},\pi_{c},\pi_{A},\pi_{b},\pi_{C},\pi_{d},\pi_{B},\pi^{-1})$
, the summation of (3.5) over an
$S_{m}^{7}$
-orbit splits into a product of two separate sums:

We can now use the Fourier expansion of Wg (2.11) and the estimates in § 2.1.3 to bound the contribution of an
$S_{m}^{7}$
-orbit in Z to (3.5).
Proposition 3.3. Let
$\widetilde{v}:=(\widetilde{\overrightarrow{\!a}},\ldots,\,\widetilde{\overrightarrow{\!{\kern-.5pt}D}},\widetilde{\sigma}_{1},\widetilde{\sigma}_{2},\widetilde{\tau}_{1},\widetilde{\tau}_{2})\in Z$
and let
${\mathcal O}_{\widetilde{v}}:=S_{m}^{7}\widetilde{v}$
be its
$S_{m}^{7}$
-orbit. Then,

Proof. By the orbit-stabilizer theorem, the left-hand side of (3.9) is the same as summing over all
$(\pi_{D},\ldots,\pi_{B})\in S_{m}^{7}$
and dividing by
$m!^{7}$
. By (3.8), the left-hand side of (3.9) is equal to

Note that
$(S_{2m},S_{m}\times S_{m})$
is a sgn-twisted Gelfand pair, that is, the representation
$\mathrm{Ind}_{S_{m}^{2}}^{S_{2m}}\,\mathrm{sgn}$
is multiplicity-free. By Frobenius reciprocity, each irreducible subrepresentation
$(V_{\lambda},\pi_{\lambda})$
of
$\mathrm{Ind}_{S_{m}^{2}}^{S_{2m}}\mathrm{sgn}$
has a unique
$(S_{m}^{2},\mathrm{sgn})$
-invariant unit vector, so
$\langle \chi_{\lambda}|_{S_{m}^{2}},\mathrm{sgn}\rangle_{S_{m}^{2}}=1$
. By Lemma 2.7, for each
$\sigma\in S_{2m}$
, we have

By Lemma 2.3 it follows that
$\pi_{\lambda}\hookrightarrow\mathrm{Ind}_{S_{m}^{2}}^{S_{2m}}\mathrm{sgn}$
if and only if the Young diagram of
$\lambda\vdash2m$
has at most two columns. Combining with (3.10), we have

By (2.11), (3.11), (2.3), and by our assumption that
$d\geq2m$
, we have

This concludes the proposition.
We now turn to the last ingredient in the proof of Theorem 3.1.
Definition 3.4. Let
$f:S\rightarrow[d]$
be a function on a set S. We define the shape
$\nu_{f}:[d]\rightarrow\mathbb{N}$
of f as

and denote
$\nu_{f}!:=\prod_{u=1}^{d}\nu_{f,u}$
.
Proposition 3.5. Let Z be as in (3.4). Then,

Proof. We need to count all the possible tuples
$(\overrightarrow{\!a},\ldots,\overrightarrow{\!{\kern-.5pt}D},\sigma_{1},\sigma_{2},\tau_{1},\tau_{2})$
in Z. Suppose we have already fixed
$\overrightarrow{\!a}$
and the shapes
of
$\overrightarrow{\!b},\overrightarrow{\!c}$
and
$\overrightarrow{\!{\kern-.5pt}d}$
, where
$\overrightarrow{\!b},\overrightarrow{\!c},\overrightarrow{\!{\kern-.5pt}d}$
are considered as a functions
$[m]\rightarrow[d]$
. Given these data, we have the following.
-
(1) There are
$\frac{m!^{3}}{\nu_{\overrightarrow{\!b}}!\nu_{\overrightarrow{\!c}}!\nu_{\overrightarrow{\!{\kern-.5pt}d}}!}$ options for
$\overrightarrow{\!b},\overrightarrow{\!c},\overrightarrow{\!{\kern-.5pt}d}$ with the above shapes.
-
(2) There are
$(2m)!^{2}$ options for
$\sigma_{2}$ and
$\tau_{2}$ .
-
(3) There are at most
$\binom{2m}{m}^{2}$ options for choosing
$\tau_{1}^{-1}([m])$ and
$\sigma_{1}([m])$ , as subsets of [2m]. Note that we count both valid and invalid options.
-
(4) After fixing the subsets
$\tau_{1}^{-1}([m])$ and
$\sigma_{1}([m])$ , there are at most
$\nu_{\overrightarrow{\!c}}!\nu_{\overrightarrow{\!{\kern-.5pt}d}}!$ options for
$\tau_{1}$ and
$\nu_{\overrightarrow{\!b}}!$ options for
$\sigma_{1}$ .
Summarizing the above items, we get there are at most
$(\frac{m!^{3}(2m)!^{2}\nu_{\overrightarrow{\!b}}!\nu_{\overrightarrow{\!c}}!\nu_{\overrightarrow{\!{\kern-.5pt}d}}!}{\nu_{\overrightarrow{\!b}}!\nu_{\overrightarrow{\!c}}!\nu_{\overrightarrow{\!{\kern-.5pt}d}}!})\binom{2m}{m}^{\!2}=m!^{7}\binom{2m}{m}^{\!4}$
options for
$(\overrightarrow{\!a},\ldots,\overrightarrow{\!{\kern-.5pt}D},\sigma_{1},\sigma_{2},\tau_{1},\tau_{2})\in Z$
with the initial data
. Note that there are
$\binom{d}{m}$
possible options for
$\overrightarrow{\!a}$
, and
$\binom{d+m-1}{m}^{\!3}$
options for
. This gives the desired upper bound.
We can now finish the proof of Theorem 3.1.
Proof of Theorem 3.1. Note that for every
$k\geq1$
and
$n\geq k$
we have

where the rightmost inequality follows from Stirling’s approximation. By Lemma 3.2 and by Propositions 3.3 and 3.5,

By (3.13) and (3.12), by the inequality
$\binom{2m}{m}\leq2^{2m}$
, and by our assumption that
$d\geq2m$
,

Remark 3.6. The current proof of Proposition 3.5 depends on the special structure of the Engel word. One can give a slightly more complicated argument, which can be easily generalized for every word w (this is done in § 6). Here are the main ideas of this alternative argument.
We encode the expression

from (3.2), graphically, by the
$4\times4$
matrix

which is constructed as follows. The rows and columns are indexed by
$x,y,x^{-1},y^{-1}$
. We order the rows by
$x<y<y^{-1}<x^{-1}$
and order the columns by
$x^{-1}<y^{-1}<y<x$
. To find the
$(x,y^{-1})$
-entry of this matrix (i.e. the (1,2)-entry), we look for the subword
$XY^{-1}$
in (3.14) and record the letter of the common index, which is C. All other entries are determined in similar fashion. Note that we do not have elements in the main diagonal since w is cyclically reduced.
We denote
$\eta_{1}=\tau_{1}$
,
$\eta_{2}=\tau_{2}$
,
$\eta_{3}=\sigma_{2}^{-1}$
and
$\eta_{4}=\sigma_{1}^{-1}$
. Note that
$\eta_{i}$
sends the ith row of (3.15) into a permuted copy of its ith column. The alternative counting argument in Proposition 3.5 goes as follows. We fix the upper triangular part, i.e.
$\overrightarrow{\!{\kern.5pt}C},\overrightarrow{\!{\kern-.5pt}D},\overrightarrow{\!a},\overrightarrow{\!b}$
(instead of
$\overrightarrow{\!a},\overrightarrow{\!b},\overrightarrow{\!c},\overrightarrow{\!{\kern-.5pt}d}$
in the proof above). We then choose
$\eta_{1}$
(with
$2m!$
options), which gives us
$\overrightarrow{\!c},\overrightarrow{\!{\kern-.5pt}d}$
and, in particular, reveals the second row. Next, we choose all possible
$\eta_{2}:(\overrightarrow{\!b},\overrightarrow{\!c})\rightarrow(\overrightarrow{\!{\kern-1pt}B},\overrightarrow{\!{\kern.5pt}C})$
, taking into consideration the fact that
$\overrightarrow{\!{\kern.5pt}C}$
is already known. We then proceed to the next row and guess
$\eta_{3}$
, taking into consideration that we already know
$\overrightarrow{\!{\kern-.5pt}D}$
. At this point, the vectors
$\overrightarrow{\!a},\overrightarrow{\!b},\ldots,\overrightarrow{\!{\kern.5pt}C},\overrightarrow{\!{\kern-.5pt}D}$
and the permutations
$\eta_{1},\eta_{2},\eta_{3}$
are known, and the number of options for
$\eta_{4}$
is determined by the shapes of
$\overrightarrow{\!a},\overrightarrow{\!b}$
. This argument will be generalized in § 6 for arbitrary words, where, instead of a
$4\times4$
matrix, we will have a
$2r\times2r$
matrix and, each time we choose
$\eta_{1},\ldots,\eta_{k}$
, the
$k+1$
th row is revealed, allowing us to proceed by induction.
4 Rewriting Theorem 1.3 using Weingarten calculus
In this section, we rewrite the expression
$\mathbb{E}\big(|\mathrm{tr}\bigwedge\nolimits^{\!m}w(X_{1},\ldots,X_{r})|^{2}\big)$
of Theorem 1.3 as a finite sum of Weingarten functions.
Let
$\ell,m,d,w$
be as in Theorem 1.3. We may assume that w is cyclically reduced, i.e. it does not contain a subword of the form
$x_{j}x_{j}^{-1}$
and the first and last letters of w are not inverse of each other. For
$u\in[\ell]$
, let

If we denote
$x_{-a}=x_{a}^{-1}$
, then
$w=\prod_{u}x_{w(u)}$
. We write
$w^{-1}$
for the inverse word,

We start by noting that

Define
$\widetilde{T}\in\mathrm{Sym}([\ell]\times[m])$
by

Recall that
${\mathcal I}_{m,d}=\{a_{1}<\cdots<a_{m}:a_{i}\in[d]\}$
. We have

where in the last equality we use the natural embedding
$\mathrm{Sym}(\{ \ell\}\times[m])\hookrightarrow\mathrm{Sym}([\ell]\times[m])$
obtained by acting trivially on
$[\ell-1]\times[m]$
. Applying this to
$w^{-1}$
, we get

Set
$\Omega=[2]\times[\ell]\times[m]$
,
$\Omega_{s,u}=\{ s\} \times\{ u\} \times[m]$
, and for
$\gamma\in\Omega$
, define

Define
$T\in\mathrm{Sym}(\Omega)$
by

By combining (4.4) and (4.5), we get

The map
$\pi\mapsto T\pi T^{-1}$
is an isomorphism
$\mathrm{Sym}(\Omega_{s,\ell})\overset{\simeq}{\rightarrow}\mathrm{Sym}(\Omega_{s,1})$
, for
$s\in[2]$
. Hence,

where, in the last equality, we replaced F by
$F\circ\big(\pi'\pi\big)^{-1}$
.
Let
$\Phi=(A_{1},\ldots,A_{r},B_{1},\ldots,B_{r})$
be the partition given by

For each
$(\pi,\pi')\in\mathrm{Sym}(\Omega_{1,1})\times\mathrm{Sym}(\Omega_{2,1})$
, set

The sets
$Z_{\pi,\pi'}$
are disjoint. We use the notation

Remark 4.1. Note that we have a map
$Z\rightarrow\mathrm{Sym}(\Omega_{1,1})\times\mathrm{Sym}(\Omega_{2,1})$
sending
$(F,\Sigma)$
to the unique pair
$(\pi_{F},\pi'_{F})$
such that
$(F,\Sigma)\in Z_{\pi_{F},\pi'_{F}}$
.
Rewriting (4.8) using Weingarten calculus (Corollary 2.15), we have the following result.
Proposition 4.2. Let
$w\in F_{r}$
be a cyclically reduced word. Then,

5. Estimating the contribution of a single orbit in Z
In this section we introduce an action of
$H:=\prod_{(s,u)\in[2]\times[\ell]}\mathrm{Sym}(\Omega_{s,u})$
on Z, and estimate (4.12) restricted to each H-orbit. The action can be described as follows.
For every
$(s,u)\in[2]\times[\ell]$
, the group
$\mathrm{Sym}(\Omega_{s,u})$
acts on Z in the following way: if
$u\neq1$
, the action of
$\pi_{s,u}\in\mathrm{Sym}(\Omega_{s,u})$
is

If
$s\in[2]$
and
$\pi_{s,1}\in\mathrm{Sym}(\Omega_{s,1})$
, then

The above group actions commute, which gives rise to an action of H. Note that
$(\pi_{1,1},\pi_{2,1})\cdot Z_{\pi,\pi'}=Z_{\pi_{1,1}\pi,\pi_{2,1}\pi'}$
. If
$u\neq1$
, then
$\pi_{s,u}\cdot(Z_{\pi,\pi'})=Z_{\pi,\pi'}$
.
Definition 5.1. For each
$u,v\in[\ell]$
, we define
$*:\mathrm{Sym}(\Omega_{s,u})\times\mathrm{Sym}(\Omega_{s,v})\rightarrow\mathrm{Sym}(\Omega_{s,v})$
by

Note that
$*$
is associative.
Let
$h:=\prod_{(s,u)}\pi_{s,u}\in H$
and denote
$\overline{h}:=\prod_{(s,u)\neq(1,1),(2,1)}\pi_{s,u}$
. Then
$h\cdot\Sigma=\overline{h}\circ\Sigma\circ T^{-1}h^{-1}T$
. Since
$\widetilde{\mathrm{Wg}}$
is invariant under conjugation in H,

where
$\Psi_{h}=T^{-1}h^{-1}T\overline{h}\in H$
. On each
$\Omega_{s,u}$
,
$\Psi_{h}$
has the following form.
Lemma 5.2. We have

Corollary 5.3. Let
$(\widehat{F},\widehat{\Sigma})$
be a representative of an H-orbit
${\mathcal O}_{(\widehat{F},\widehat{\Sigma})}$
, with
$(\pi_{\widehat{F}},\pi'_{\widehat{F}})=(\mathrm{Id},\mathrm{Id})$
. Then,

Proof. For each
$h=\prod_{(s,u)}\pi_{s,u}\in H$
, write
$\nu(h):=(-1)^{\pi_{1,1}\pi_{2,1}}$
. Consider the bijection
$\psi:H\rightarrow H$
,
$\psi\big(\prod_{(s,u)}\pi_{s,u}\big)=\prod_{(s,u)}\theta_{s,u}$
, where, for
$s=1,2$
,

and observe that
$\nu(\psi(h))=(-1)^{h}=(-1)^{T^{-1}h^{-1}T}$
. Further note that

and hence
$\Psi_{\psi(h)}=\prod_{(s,u)}T^{-1}\pi_{s,u}^{-1}T$
. Changing variables using
$\psi$
, the left-hand side of (5.4) is

where, in each line above,
$h=\prod_{(s,u)}\pi_{s,u}$
.
Corollary 5.4. Set
$\ell_{i}:=\frac{|A_{i}|}{m}$
for each
$i\in[r]$
. Then the following holds:

Proof. By Proposition 4.2, Corollary 5.3, (2.11), Lemma 2.7, and by (2.3),

Note that the irreducible characters
$\chi_{\lambda}$
in
$\mathrm{Ind}_{\mathrm{S}_{m}^{\ell_{i}}}^{S_{m\ell_{i}}}(\mathrm{sgn})$
correspond to Young diagrams
$\lambda\vdash m\ell_{i}$
with at most
$\ell_{i}$
columns. If the columns of
$\lambda$
are of lengths
$j_{1}\geq\cdots\geq j_{\ell_{i}}$
, then

6. Estimates on
$|Z|$
In this section we give upper bounds on
$|Z|$
, defined in (4.11). We first set some notation. For each
$0\neq i,j\in[-r,r]$
, set



Following Remark 3.6, it is helpful to picture a
$2r\times2r$
matrix, whose (i,j)th entry is the set
$V_{ij}$
, with
$R_{-r},\ldots,R_{r}$
correspond to rows, and
$C_{-r},\ldots,C_{r}$
correspond to columns. Denote

Observe that
$\ell_{i}=\ell_{-i}$
,
$\ell_{i,j}=\ell_{j,i}$
and note that
$\ell_{i}=\frac{|A_{i}|}{m}$
if
$i>0$
, so that (6.1) extends the definition of
$\ell_{i}$
in Corollary 5.4. For each
$0\neq j\in[-r,r]$
set

For each
$i\in[r]$
and each
$\Sigma\in S_{\Phi}$
, denote
$\eta_{i}:=T\circ(\Sigma^{-1})|_{B_{i}}$
and
$\eta_{-i}:=T\circ(\Sigma^{-1})|_{A_{i}}$
. Note that
$\eta_{i}(C_{i})=R_{i}$
for all i. Define the following sets:

and

Proposition 6.1. We have

Proof. The map
$(F,\Sigma)\mapsto(F\circ\pi_{F}\pi'_{F},\Sigma\circ T{}^{-1}\pi_{F}\pi'_{F}T)$
is a bijection between Z and W, giving the first equality. Clearly,
$|W|\leq|W'|$
.
In order to prove the last inequality, we use the map
$\Phi_{+}:W'\rightarrow\{f:C_{+}\rightarrow[d]\}$
, sending
$(F,\Sigma)\in W'$
to
$F|_{C_{+}}$
. We estimate
$|W'|$
by analyzing the fibers of
$\Phi_{+}$
. Let
$f\in\Phi_{+}(W')$
and suppose it has a shape
$\nu_{+}$
(see Definition 3.4). We write
$\nu_{j,+}$
for the shapes of
$f|_{C_{j}^{+}}$
. We reveal
$(F,\Sigma)\in\Phi_{+}^{-1}(f)$
row by row, starting with the
$-r$
th row
$R_{-r}$
and making sure that, in each step,
$F\circ T|_{T^{-1}(R_{k})}=F\circ\Sigma|_{T^{-1}(R_{k})}$
, or, equivalently,
$F|_{R_{k}}=F\circ\eta_{k}^{-1}|_{R_{k}}$
.
-
1. There are at most
$(m\ell_{-r})!$ options for
$\eta_{-r}$ . Note that
$R_{-r\subseteq C_+}$ and, hence, by (6.2), the choice of
$\eta_{-r}$ determines
$F|_{C_{-r}}$ . At this point,
$F|_{R_{-r+1}}$ is determined as well.
-
2. Note that
$C_{-r+1}^{+}=V_{-r,-r+1}$ . There are at most:
-
(a)
$\binom{m\ell_{-r+1}}{m\ell_{-r,-r+1}}$ options for the sets
$\eta_{-r+1}(C_{-r+1}^{+})$ and
$\eta_{-r+1}(C_{-r+1}\backslash C_{-r+1}^{+})$ ;
-
(b)
$(m(\ell_{-r+1}-\ell_{-r,-r+1}))!$ options for
$\eta_{-r+1}|_{C_{-r+1}\backslash C_{-r+1}^{+}}:C_{-r+1}\backslash C_{-r+1}^{+}\rightarrow\eta_{-r+1}(C_{-r+1}\backslash C_{-r+1}^{+})$ ;
-
(c)
$(\nu_{-r+1,+})!$ options for
$\eta_{-r+1}:C_{-r+1}^{+}\rightarrow\eta_{-r+1}(C_{-r+1}^{+})$ .
-
-
3. More generally, assume, by induction, that we have fixed
$(\eta_{i})_{i<k}$ , and, thus, we have already determined
$F|_{R_{i}}$ for
$i\leq k$ ,
$F|_{C_{i}}$ for
$i<k$ , and
$F|_{C_{+}}$ . Then there are at most:
-
(a)
$\binom{m\ell_{k}}{\sum_{i<k}m\ell_{i,k}}$ options for the sets
$\eta_{k}(C_{k}^{+})$ and
$\eta_{k}(C_{k}\backslash C_{k}^{+})$ ;
-
(b)
$\big(m(\ell_{k}-\sum_{i<k}\ell_{i,k})\big)!$ options for
$\eta_{k}|_{C_{k}\backslash C_{k}^{+}}:C_{k}\backslash C_{k}^{+}\rightarrow\eta_{k}(C_{k}\backslash C_{k}^{+})$ ;
-
(c)
$(\nu_{k,+})!$ options for
$\eta_{k}|_{C_{k}^{+}}:C_{k}^{+}\rightarrow\eta_{k}(C_{k}^{+})$ .
After choosing
$\eta_{-r},\ldots,\eta_{r}$
, we have determined F. Furthermore, since
$\sum_{0\neq k=-r}^{r}\nu_{k,+}=\nu_{+}$
, we have
$\prod_{0\neq k=-r}^{r}(\nu_{k,+})!\leq\nu_{+}!$
. Hence,

Since
$|C_{+}|=m\ell$
, we have

and there are at most
$\binom{d+m\ell}{m\ell}$
possible shapes
$\nu_{+}$
. Combining (6.3) and (6.4) we conclude

7. Proof of Theorems 1.1 and 1.3
In this section we use the results of §§ 4, 5 and 6 to prove Theorems 1.1 and 1.3. We end the section with the proof of Theorem 1.6.
Proof of Theorem 1.3. Assume that
$d=am$
for
$a\geq\ell\geq2$
. By (3.12), we have


We remind the reader the definition of
$\ell_{i}$
and
$\ell_{i,j}$
in (6.1). Concretely, for each
$0\neq i,j\in[-r,r]$
,
$\ell_{i}$
is the combined number of appearances of the letter
$x_{i}$
(with the convention that
$x_{-i}=x_{i}^{-1}$
) in w and
$w^{-1}$
, and
$\ell_{i,j}$
is the combined number of appearances of the string ‘
$x_{i}x_{j}^{-1}$
’ in w and in
$w^{-1}$
. In particular, we have
$\sum_{i=1}^{r}\ell_{i}=\ell$
,
$\ell_{i,i}=0$
and
$\sum_{0\neq i\in[-r,r]}\ell_{i,k}=\ell_{k}$
and, therefore,

By Corollary 5.4, Proposition 6.1 and by (7.3), (7.1) and (7.2), we obtain

Finally, note that if
$d\geq(22\ell)^{\ell}m$
, then
$(22\ell)^{m\ell}\leq(\frac{d}{m})^{m}\leq\binom{d}{m}$
.
We now turn to the proof of Theorem 1.1. We first deal with the case when the rank is bounded (and prove Conjecture 1.7 in this case) and then prove Theorem 1.1 in the unbounded case.
Definition 7.1. Given
$w_{1}\in F_{r_{1}}$
and
$w_{2}\in F_{r_{2}}$
, we denote by
$w_{1}*w_{2}\in F_{r_{1}+r_{2}}$
their concatenation. For example, if
$w=[x,y]$
, then
$w*w=[x,y]\cdot[z,w]$
.
We remind the reader that for a compact group G, and a word
$w\in F_{r}$
, we denote by
$\tau_{w,G}:=(w_{G})_{*}(\mu_{G}^{r})$
the word measure associated to w and G, and the Fourier coefficient of
$\tau_{w,G}$
at
$\rho\in\mathrm{Irr}(G)$
is
$a_{w,G,\rho}:=\int_{G^{r}}\rho(w(x_{1},\ldots,x_{r}))\mu_{G}^{r}=\int_{G}\rho(y)\tau_{w,G}$
. If G is a compact connected semisimple Lie group, by [Reference BorelBor83], the map
$w_{G}:G^{r}\rightarrow G$
is a submersion outside a proper subvariety in
$G^{r}$
. It follows that in this case, or e.g. when G is a finite group,
$\tau_{w,G}$
is absolutely continuous with respect to
$\mu_{G}$
, and we can write
$\tau_{w,G}=f_{w,G}\cdot \mu_{G}$
, with
$f_{w,G}\in L^{1}(G)$
. Since
$\tau_{w,G}$
is conjugate invariant,
$f_{w,G}$
is a class function, and it can be written as a linear combination of characters
$f_{w,G}=\sum_{\rho\in\mathrm{Irr}(G)}\overline{a_{w,G,\rho}}\cdot\rho$
.
By Definition 7.1, we see that
$\tau_{w_{1}*w_{2},G}=\tau_{w_{1},G}*\tau_{w_{2},G}$
for every
$w_{1}\in F_{r_{1}}$
and
$w_{2}\in F_{r_{2}}$
. Since
$\rho_{1}*\rho_{2}=\frac{\delta_{\rho_{1},\rho_{2}}}{\rho_{1}(1)}\cdot\rho_{1}$
for every
$\rho_{1},\rho_{2}\in\mathrm{Irr}(G)$
, we have

Proposition 7.2. For every
$1\neq w\in F_{r}$
and
$d\in\mathbb{N}$
, there exists
$\epsilon(d,w)>0$
such that:
-
(1) for every compact connected semisimple Lie group G of rank d and every
$\rho\in\mathrm{Irr}(G)$ , we have
$|a_{w,G,\rho}|\leq\rho(1)^{1-\epsilon(d,w)}$ ;
-
(2) in particular, for every
$1\leq m\leq d$ ,
\[\mathbb{E}_{\mathrm{U}_{d}}\Big(\Big|\mathrm{tr}\Big(\!\bigwedge\nolimits^{\!\!m}w(X_{1},\ldots,X_{r})\Big)\!\Big|^{2}\Big)=\mathbb{E}_{\mathrm{SU}_{d}}\Big(\Big|\mathrm{tr}\Big(\!\bigwedge\nolimits^{\!\!m}w(X_{1},\ldots,X_{r})\Big)\!\Big|^{2}\Big)\leq\binom{d}{m}^{\!\!2(1-\epsilon(d,w))}.\]
Proof. We first prove item (1). Fix
$w\in F_{r}$
and a compact connected semisimple Lie group G. Let
$\tau_{w,G}=f_{w,G}\mu_{G}$
be the word measure. By (7.4), and since
$a_{w^{-1},G,\rho}=\overline{a_{w,G,\rho}}$
for each
$\rho\in\mathrm{Irr}(G)$
, we have

Replacing w by
$w*w^{-1}$
, we may assume that all Fourier coefficients
$a_{w,G,\rho}$
are in
$\mathbb{R}_{\geq0}$
.
It follows from [Reference Glazer, Hendel and SodinGHS24, Theorem 1.1] that
$f_{w,G}\in L^{1+\epsilon'}(G)$
for some
$\epsilon'=\epsilon'(G,w)>0$
. By Young’s convolution inequality, it follows that
$f_{w,G}^{*t}\in L^{\infty}(G)$
for all
$t\geq t_{0}(G,w):=\lceil\frac{1+\epsilon'(G,w)}{\epsilon'(G,w)}\rceil $
(see, e.g., [Reference Glazer, Hendel and SodinGHS24, Section 1.1, end of p.3]). In particular, by (7.4), we deduce that

Since
$a_{w,G,\rho}\geq0$
, we deduce that
$a_{w,G,\rho}<\rho(1)^{1-\frac{2}{t_{0}(G,w)}}$
for all but finitely many
$\rho\in\mathrm{Irr}(G)$
. To deal with the remaining finitely many (non-trivial) representations of G, we simply use the bound
$a_{w,G,\rho}<\rho(1)$
, which follows e.g. by the Itô–Kawada equidistribution theorem [Reference Kawada and ItôKI40] (see also [Reference ApplebaumApp14, Theorem 4.6.3]), since
$\mathrm{Supp}(\tau_{w,G})$
generates G. Since there are only finitely many compact semisimple connected Lie groups of rank d, this implies item (1).
Note that the character
$\rho_{(1^{m})}\otimes\rho_{(1^{m})}^{\vee}$
of the representation
$\bigwedge\nolimits^{\!m}\mathbb{C}^{d}\otimes\big(\bigwedge\nolimits^{\!m}\mathbb{C}^{d}\big)^{\vee}$
of
$\mathrm{SU}_{d}$
is given by
$\big|\mathrm{tr}\big(\bigwedge\nolimits^{\!m}(A)\big)\big|^{2}$
. Since
$\rho_{(1^{m})}\otimes\rho_{(1^{m})}^{\vee}$
is a sum of irreducible characters, by applying the Itô–Kawada equidistribution theorem to each irreducible character, for each
$1\leq m\leq d$
, we have

Since there are only finitely many such m’s, this implies item (2).
Theorem 1.1 now follows from Proposition 7.2 and the following theorem.
Theorem 7.3. For every
$\ell\in\mathbb{N}$
, there exist
$\epsilon(\ell),C(\ell)>0$
such that, for every
$d\geq C(\ell)$
, every
$1\leq m\leq d$
, and every word
$w\in F_{r}$
of length
$\ell$
, one has

In order to prove Theorem 7.3, we need the following technical lemma.
Lemma 7.4. Let
$H(x)=-x\log(x)-(1-x)\log(1-x)$
be the binary entropy function. Then:
-
(1) for every
$d\in\mathbb{N}$ and every
$0<x<1$ such that
$dx\in\mathbb{N}$ , we have
$({2^{dH(x)}}/{\sqrt{8dx(1-x)}})\leq\binom{d}{xd}\leq({2^{dH(x)}}/{\sqrt{\pi dx(1-x)}})\leq2^{dH(x)}$ ;
-
(2) let
$0<\delta\leq\frac{1}{2}$ , then for every
$b\in[\delta,\frac{1}{2}]$ ,
$a\in[\delta,b]$ , and
$d>({1}/{\delta^{4}})$ such that bd,ad,d are integers, one has
\[\binom{d}{(b-a)d}\leq\binom{d}{bd}^{\!\!1-\delta^{2}}.\]
Proof. Item (1) follows, e.g., from [Reference Cover and ThomasCT06, Lemma 17.5.1]. The Taylor series of H(x) around
$1/2$
is

Since
$H'(x)=\log(\frac{1-x}{x})$
, H(x) is monotone increasing in
$(0,1/2)$
and, therefore,

Since
$d>\frac{1}{\delta^{4}}\geq16$
, we have
$\frac{\log(d)}{d}\leq\frac{1}{\sqrt{d}}\leq\delta^{2}$
. Combining with item (1), we have

Proof of Theorem 7.3. Since
$\bigwedge\nolimits^{\!m}V\simeq\big(\bigwedge\nolimits^{\!d-m}V\big)^{\vee}\otimes\chi_{\mathrm{det}}$
, we may assume that
$2m\leq d$
. Let
$\delta(\ell):=(25\ell)^{-\ell}$
, let
$C(\ell)=\delta(\ell)^{-7}$
, and suppose that
$d\geq C(\ell)$
. By Theorem 1.3, we may assume that
$d\leq\delta(\ell)^{-1}m$
, and, in particular,
$m\geq\delta(\ell)^{-6}$
. As in the proof of Proposition 7.2, by replacing w by
$w*w^{-1}$
, we may assume that
$a_{w,\mathrm{U}_{d},\rho}\in\mathbb{R}_{\geq0}$
for all
$\rho\in\mathrm{Irr}(\mathrm{U}_{d})$
. By Theorem 2.5, we have for all
$c\leq \frac{d}{2}$
:

where
$\lambda_{(j)}=(1,\ldots,1,0,\ldots,0,-1,\ldots,-1)$
, with
$-1$
and 1 appearing j times. Moreover,
$V_{\lambda_{(c)}}$
is the largest irreducible subrepresentation of
$\bigwedge\nolimits^{\!c}V\otimes\bigwedge\nolimits^{\!c}V^{\vee}$
, and we have
$\rho_{\lambda_{(c)}}(1)\geq (\frac{1}{c+1})\binom{d}{c}^{\!2}\geq\binom{d}{c}^{\!3/2}$
. By Theorem 1.3, and since all
$a_{w,\mathrm{U}_{d},\rho}$
are non-negative, if
$c\leq\lceil\delta(\ell)d\rceil\leq(22\ell)^{-\ell}d$
, then

Applying the last inequality for
$w^{*9}$
, recalling that
$a_{w^{*t},\mathrm{U}_{d},\rho}=\frac{a_{w,\mathrm{U}_{d},\rho}^{t}}{\rho(1)^{t-1}}$
for all
$\rho\in\mathrm{Irr}(\mathrm{U}_{d})$
, we get

Note that, for each
$\delta(\ell)d\leq m\leq\frac{d}{2}$
,
$\bigwedge\nolimits^{\!m}V$
is a subrepresentation of
$\bigwedge\nolimits^{\lceil\delta(\ell)d\rceil}V\otimes\bigwedge\nolimits^{\!m-\lceil\delta(\ell)d\rceil}V$
, so

Finally, by the positivity of the Fourier coefficients of w, by (7.7), by Lemma 7.4 (note that
$m\geq\lceil\delta(\ell)d\rceil$
) and by (3.12) (note that
$\delta(\ell)^{2}m\geq1$
),

By (3.12),
$m+1\leq2^{2\sqrt{m}}\leq\binom{d}{m}^{\!2/\sqrt{m}}$
for each
$m\leq\frac{d}{2}$
. Hence,

Consequently, we get

and, thus,
$\mathbb{E}\big(\rho_{\lambda_{(m)}}\circ w\big)\leq\rho_{\lambda_{(m)}}(1)^{1-\frac{\delta(\ell)^{2}}{36}}$
. Taking
$\epsilon(\ell):=\frac{\delta(\ell)^{2}}{72}$
, and using
$m+1\leq\binom{d}{m}^{\!2\delta(\ell)^{3}}$
, we get

We end the section with a proof of Theorem 1.6.
Proof of Theorem 1.6. Let
$w\in F_{r}$
. Denote
$\widetilde{w}:=w*w^{-1}$
. Recall that
$\tau_{\widetilde{w},G}=f_{\widetilde{w},G}\mu_{G}$
and note that for every
$t\in\mathbb{N}$
,

Applying [Reference Larsen, Shalev and TiepLST19, Theorem 4], there are
$C',M(w)\in\mathbb{N}$
such that, for
$N(w):=C'\ell(w)^{4}$
and for every finite simple group G of size
$>M(w)$
, one has

where the first equality follows from (7.4). Since
$a_{\widetilde{w},G,\rho}=\frac{|a_{w,G,\rho}|^{2}}{\rho(1)}\geq0$
, we deduce that for each
$1\neq\rho\in\mathrm{Irr}(G)$

from which the theorem follows for
$\epsilon=\frac{1}{N(w)}=\frac{1}{C'\ell(w)^{4}}$
.
8. Fourier coefficients of symmetric powers
In this section, we prove Theorem 1.4. Denote
${\mathcal J}_{m,d}=\{c_{1}\leq\cdots\leq c_{m}:c_{i}\in[d]\}$
. We first claim that, for each
$A\in\mathrm{End}(\mathbb{C}^{d})$
and
$m\geq1$
,

Indeed, for each
$\overrightarrow{\!c}\in{\mathcal J}_{m,d}$
, let
$\nu_{\overrightarrow{\!c}}$
be the shape of
$\overrightarrow{\!c}$
(see Definition 3.4) and set

Then
$\{ v_{\overrightarrow{\!c}}\}_{\overrightarrow{\!c}\in{\mathcal J}_{m,d}}$
is an orthonormal basis for
$\mathrm{Sym}^{m}(\mathbb{C}^{d})$
. Given
$A\in\mathrm{End}(\mathbb{C}^{d})$
, we have

where the last equality follows since
$\sum_{\pi\in S_{m}}A_{c_{1}c_{\pi(1)}}\cdots A_{c_{m}c_{\pi(m)}}$
is invariant under permuting
$c_{1},\ldots,c_{m}$
, and since there are
$\frac{m!}{\nu_{\overrightarrow{\!c}}!}$
vectors
$\overrightarrow{\!a}\in[d]^{m}$
of a shape
$\nu_{\overrightarrow{\!c}}$
. In particular, for any word w,

Proposition 8.1. Let
$w\in F_{r}$
be a cyclically reduced word. With
$\Phi,T,\Omega,\Omega_{s,u}$
as in § 4, we have

where

Proof. Similarly to (4.4), we have

Consequently, as in (4.8), we have

The proposition now follows from Corollary 2.15.
We next define an action of
$H:=\prod_{(s,u)\in[2]\times[\ell]}\mathrm{Sym}(\Omega_{s,u})$
on
$\widetilde{Z}$
in the same way as in § 5. For
$(s,u)\in[2]\times([\ell]\backslash\{1\})$
and
$\pi_{s,u}\in\mathrm{Sym}(\Omega_{s,u})$
,

and if
$(\pi_{1,1},\pi_{2,1})\in\mathrm{Sym}(\Omega_{1,1})\times\mathrm{Sym}(\Omega_{2,1})$
,

Proof of Theorem 1.4. The proof is similar to the proof of Theorem 1.3. The only difference is that now, summing over the H-orbit kills all representations that do not appear in
$\mathrm{Ind}_{S_{m}^{\ell_{i}}}^{S_{m\ell_{i}}}(1)$
, rather than the representations not in
$\mathrm{Ind}_{S_{m}^{\ell_{i}}}^{S_{m\ell_{i}}}(\mathrm{sgn})$
. By Lemma 2.3, the irreducible subrepresentations
$\chi_{\lambda}$
of
$\mathrm{Ind}_{S_{m}^{\ell_{i}}}^{S_{m\ell_{i}}}(1)$
correspond to partitions
$\lambda=(\lambda_{1},\ldots,\lambda_{\ell_{i}})$
with at most
$\ell_{i}$
rows, and, therefore,
$\prod_{(a,b)\in\lambda}(d+b-a)\geq(d-\ell)^{m\ell_{i}}$
. As in Corollary 5.3 and (5.6), the average of
$\widetilde{\mathrm{Wg}}(\Sigma^{2})$
over an H-orbit
$H\cdot(\widehat{\pi},\widehat{\pi'},\widehat{F},\widehat{\Sigma})$
is bounded by

Denote
$\widetilde{Z}_{\pi,\pi'}:=\{ (F,\Sigma):(\pi,\pi',F,\Sigma)\in\widetilde{Z}\}$
. Since
$\widetilde{Z}_{\mathrm{Id},\mathrm{Id}}=W'$
, Proposition 6.1 implies that

As in the proof of Theorem 1.3, if
$d\geq m\ell$
, then

Appendix A. Fourier coefficients of the power word and a Diaconis–Shahshahani-type result
In this appendix, we formulate two results. The first is a computation of the Fourier coefficients of the power word
$w=x^{l}$
for representations
$\rho_{\lambda}\in\mathrm{Irr}\big(\mathrm{U}_{d}\big)$
, where
$\widetilde{\lambda}$
(see Remark 2.6) has at most
$\frac{d}{2l}$
boxes. The second is a Diaconis–Shahshahani-type result for the mth coefficient of the characteristic polynomial of a word w in random unitary matrices. Both statements are consequences of known results.
Proposition A.1. Let
$w=x^{l}$
be the lth power word. Then, for every
$m\in\mathbb{N}$
and every
$d\geq2ml$
:
-
(1) we have
\[\mathbb{E}(|\rho_{\lambda}\circ w|^{2})=\frac{1}{m!}\sum_{\sigma\in S_{m}}l^{\ell(\sigma)}|\chi_{\lambda}(\sigma)|^{2},\]
$\lambda\vdash m$ ; in particular,
$\mathbb{E}(|\rho_{\lambda}\circ w|^{2})\leq l^{m}$ ;
-
(2) we have
\[\mathbb{E}\Big(\Big|\mathrm{tr}\Big(\!\bigwedge\nolimits^{\!\!m}w\Big)\!\Big|^{2}\Big)=\mathbb{E}(|\mathrm{tr}(\mathrm{Sym}^{m}w)|^{2})=\binom{l+m-1}{m}.\]
Proof. For every matrix
$A\in\mathrm{U}_{d}$
and every
$\mu\vdash m$
, set

where
$\mu=(1^{a_{1}}\cdots m^{a_{m}})$
is the partition
$m=\underset{a_{1}\text{times}}{\underbrace{(1+\cdots+1)}}+\cdots+\underset{a_{m}\text{times}}{\underbrace{(m+\cdots+m)}}$
. The functions
$\mathrm{tr}_{\mu}$
correspond to the power-sum symmetric functions
$p_{\mu}$
. Given
$\lambda\vdash m$
, the character
$\rho_{\lambda}(A)$
is a Schur polynomial in the eigenvalues of A, and, hence, it can be expressed in terms of
$\mathrm{tr}_{\mu}(A)$
via the following formula (see, e.g., [Reference MacdonaldMac95, I.7, p. 114]),

where
$\chi_{\lambda}(\mu)$
is the value of the character
$\chi_{\lambda}\in\mathrm{Irr}(S_{m})$
on the elements with cycle type
$\mu$
. In addition, by (1.2), for every pair of partitions
$\mu=(1^{a_{1}}\cdots m^{a_{m}})$
and
$\mu'=(1^{b_{1}}\cdots m^{b_{m}})$
of m, we have

Combining (A.2) and (A.3), and using the fact that the number of permutations
$\sigma\in S_{m}$
of cycle type
$\mu=(1^{a_{1}}\cdots m^{a_{m}})$
is
$\frac{m!}{\prod_{j=1}^{m}a_{j}!j^{a_{j}}}$
, we obtain

The second claim of item (1) follows from Schur orthogonality and the inequality
$l^{\ell(\sigma)}\leq l^{m}$
.
For item (2), note that
$\mathrm{tr}\big(\!\bigwedge\nolimits^{\!m}w\big)=\rho_{(1^{m})}\circ w$
and
$\mathrm{tr}(\mathrm{Sym}^{m}w)=\rho_{(m^{1})}\circ w$
. The corresponding characters of
$S_{m}$
are the sign and the trivial characters. Thus, (A.4) becomes

where
$\left[\begin{smallmatrix} m\\ k\end{smallmatrix}\right]$
is the number of permutations of m elements with exactly k disjoint cycles, also known as the unsigned Stirling number of the first kind. The last equality follows, for example, from [Reference Graham, Knuth and PatashnikGKP94, Equation (6.11)]. This concludes item (2).
We next prove a Diaconis–Shahshahani-type result. We first recall the following proposition, which is a consequence of [Reference Mingo, ś niady and SpeicherMSS07, Theorem 2] and [R, Theorem 4.1] (see also [Reference Magee and PuderMP19, Corollary 1.13]).
Proposition A.2. Let
$w\in F_{r}$
, and let
$\mu=(1^{a_{1}}\cdots m^{a_{m}})$
,
$\mu'=(1^{b_{1}}\cdots m^{b_{m}})$
be partitions of m. Let
$p(w)\in\mathbb{N}$
be such that
$w=u^{p(w)}$
with
$u\in F_{r}$
a non-power. Then,

Since the joint moments of
$\mathrm{tr}(w^{1}),\ldots,\mathrm{tr}(w^{m})$
converge, as
$d\rightarrow\infty$
, to the joint moments of independent complex normal random variables, an application of the moment method (as was done in [Reference Diaconis and ShahshahaniDS94] for
$w=x$
, and later in [Reference RădulescuRăd06, Reference Mingo, ś niady and SpeicherMSS07] for a general word) implies the following.
Corollary A.3 (see [Reference RădulescuRăd06, Theorem 4.1] and [Reference Mingo, ś niady and SpeicherMSS07, Theorem 2]). The random variables
$\mathrm{tr}(w^{1}),\ldots,\mathrm{tr}(w^{m})$
converge in distribution to
$\sqrt{p(w)}Z_{1},\ldots,\sqrt{mp(w)}Z_{m}$
, as
$d\rightarrow\infty$
, where
$Z_{1},\ldots,Z_{m}$
are independent complex normal variables.
In [Reference Diaconis and GamburdDG06], Diaconis and Gamburd combined Corollary A.3 for
$w=x$
(namely [Reference Diaconis and ShahshahaniDS94]), together with Newton’s identities relating elementary and power-sum symmetric functions to give a formula for the limit behavior of the random variables
$\mathrm{tr}\bigwedge\nolimits^{\!m}X$
with X is a random unitary matrix in
$\mathrm{U}_{d}$
. Repeating the argument for a general word w yields the following description of
$\underset{d\rightarrow\infty}{\lim}\mathrm{tr}_{\mathrm{U}_{d}}\bigwedge\nolimits^{\!m}w$
.
Corollary A.4 (cf. [Reference Diaconis and GamburdDG06, Proposition 4]). Let
$w\in F_{r}$
be a word and let
$m\in\mathbb{N}$
. Then the sequence of random variables
$\mathrm{tr}_{\mathrm{U}_{d}}\bigwedge\nolimits^{\!m}w$
converges in distribution, as
$d\rightarrow\infty$
, to the polynomial in the normal variables
$Z_{1},\ldots,Z_{m}$
given by

Example A.5. Let
$m=3$
. Then for every Borel set
$A\subseteq\mathbb{C}$
,

where
$Z_{1},Z_{2},Z_{3}$
are independent and identically distributed normal variables, and

Acknowledgements
We thank Rami Aizenbud, Yotam Hendel, Michael Larsen, Michael Magee, Doron Puder, Yotam Shomroni, Ofer Zeitouni and Steve Zelditch for useful conversations. We thank the referees for their useful comments and for improving the readability of the paper.
Conflicts of interest
None.
Financial support
NA was supported by NSF grant DMS–1902041, IG was supported by AMS–Simons travel grant, and both of us were supported by BSF grant 2018201.
Journal information
Compositio Mathematica is owned by the Foundation Compositio Mathematica and published by the London Mathematical Society in partnership with Cambridge University Press. All surplus income from the publication of Compositio Mathematica is returned to mathematics and higher education through the charitable activities of the Foundation, the London Mathematical Society and Cambridge University Press.