The Steinberg Representation

In this post I want to describe a remarkable representation associated to finite groups of Lie type. For this, let \(G\) be a connected reductive group over a finite field \(k\) with \(q = p^r\) elements, and let \(U\) be the unipotent radical of some Borel \(k\)-subgroup \(B\) of \(G\). Steinberg constructed an irreducible representation \(\mathrm{St}\) of \(G(k)\) of dimension \(q^{\dim U}\). For convenience, we will assume that \(G\) is split, although the experts can rest assured that everything goes through if one replaces absolute root systems by relative root systems. If \(T\) is a (split) maximal \(k\)-torus of \(B\) and \(\Delta\) is the system of simple roots corresponding to the pair \((B, T)\), then we define

\[ \mathrm{St} = \sum_{I \subset \Delta} (-1)^{|I|} \mathrm{ind}_{P_I(k)}^{G(k)} 1_{P_I(k)}, \]

where \(P_I\) is the parabolic subgroup of \(G\) containing \(B\) corresponding to \(I\). From this definition, it is unclear that this is a character of \(G(k)\) (rather than a generalized character), let alone irreducible, and it is not clear what its dimension should be.
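As a sanity check on the definition (not needed for what follows), one can expand the sum when \(G = \mathrm{GL}_3\), so that \(\Delta = \{\alpha, \beta\}\) has two elements, \(P_\emptyset = B\) and \(P_\Delta = G\). The only inputs assumed here are that \(\mathrm{ind}_{P_I(k)}^{G(k)} 1_{P_I(k)}\) has dimension \([G(k) : P_I(k)]\), and the standard point counts over \(k\): there are \(q^2 + q + 1\) lines in \(k^3\) (and as many planes), and \((q^2 + q + 1)(q + 1)\) full flags.

\[ \mathrm{St} = \mathrm{ind}_{B(k)}^{G(k)} 1_{B(k)} - \mathrm{ind}_{P_{\{\alpha\}}(k)}^{G(k)} 1_{P_{\{\alpha\}}(k)} - \mathrm{ind}_{P_{\{\beta\}}(k)}^{G(k)} 1_{P_{\{\beta\}}(k)} + 1_{G(k)}, \]

\[ \mathrm{St}(1) = (q^2 + q + 1)(q + 1) - 2(q^2 + q + 1) + 1 = q^3 = q^{\dim U}. \]

So in this one case the alternating sum of indices does collapse to \(q^{\dim U}\), although this computation says nothing about irreducibility.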

Example. Let \(G = \mathrm{SL}_2\), so that \(\Delta\) consists of a single element and \(\mathrm{St} = \mathrm{ind}_{B(k)}^{G(k)} 1_{B(k)} - 1_{G(k)}\). By definition, \(\mathrm{ind}_{B(k)}^{G(k)} 1_{B(k)}\) is the space of \(\mathbf{C}\)-valued functions on \(G(k)/B(k) = \mathbf{P}^1(k)\), while \(1_{G(k)}\) may be understood as the space of constant functions on the same space. Thus one can regard \(\mathrm{St}\) as the (\(q\)-dimensional) space of complex-valued functions on \(\mathbf{P}^1(k)\) of average value \(0\). The fact that \(\mathrm{St}\) is irreducible is still not obvious, but it follows in this case from arguments with Frobenius reciprocity and Mackey theory. (If anyone can think of a simple argument, I would be happy to hear it!)
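For small primes one can at least check the irreducibility claim numerically. The following Python sketch is only a brute-force verification, not an argument: it assumes \(q = p\) is prime, enumerates \(\mathrm{SL}_2(\mathbf{F}_p)\), computes \(\mathrm{St}(g)\) as the number of fixed points of \(g\) on \(\mathbf{P}^1(\mathbf{F}_p)\) minus one, and returns the sum of the squares \(\mathrm{St}(g)^2\). The function name `steinberg_check_sl2` is made up for this illustration.

```python
from itertools import product

def steinberg_check_sl2(p):
    """For a prime p, return (dim St, |SL_2(F_p)|, sum of St(g)^2 over all g)."""
    # Points of P^1(F_p), normalized as (x, 1) for x in F_p together with (1, 0).
    points = [(x, 1) for x in range(p)] + [(1, 0)]

    def normalize(x, y):
        # Scale a nonzero vector so that it lands in the list `points`.
        if y != 0:
            return ((x * pow(y, -1, p)) % p, 1)  # pow(y, -1, p) needs Python 3.8+
        return (1, 0)

    group = [m for m in product(range(p), repeat=4)
             if (m[0] * m[3] - m[1] * m[2]) % p == 1]

    sum_of_squares = 0
    for a, b, c, d in group:
        fixed = sum(1 for (x, y) in points
                    if normalize((a * x + b * y) % p, (c * x + d * y) % p) == (x, y))
        st_value = fixed - 1  # permutation character minus the trivial character
        sum_of_squares += st_value ** 2
    return len(points) - 1, len(group), sum_of_squares

for p in (2, 3, 5):
    print(p, steinberg_check_sl2(p))
```

Since \(\mathrm{St}\) is real-valued, \((\mathrm{St}, \mathrm{St}) = 1\) exactly when the last two returned numbers agree, which is what irreducibility predicts.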

Example. The Steinberg representation is not intrinsic to the abstract group \(G(k)\), as it can happen that the same finite group can be given the structure of a group of Lie type in more than one way. For example, \(\mathrm{PSL}_2(\mathbf{F}_7) \cong \mathrm{SL}_3(\mathbf{F}_2)\), and this is the unique simple group of order 168. (Here \(\mathrm{PSL}_2(\mathbf{F}_7)\) denotes the quotient of \(\mathrm{SL}_2(\mathbf{F}_7)\) by its center; it does not denote \(\mathrm{PGL}_2(\mathbf{F}_7)\), which has order \(336\) and is not simple.) Thus there can be more than one Steinberg representation associated to a given group. In this case there is one Steinberg representation of dimension \(7 = 7^1\), coming from \(\mathrm{PSL}_2(\mathbf{F}_7)\), and another of dimension \(8 = 2^3\), coming from \(\mathrm{SL}_3(\mathbf{F}_2)\). (The fact that the Steinberg representation of \(\mathrm{SL}_2(\mathbf{F}_7)\) descends to \(\mathrm{PSL}_2(\mathbf{F}_7)\) can be deduced from the definition using the Bruhat decomposition and an inclusion-exclusion argument.)

The proof that \(\mathrm{St}\) has properties as described works essentially as follows:

  1. Show that if \(\epsilon\) is the alternating character of the Weyl group \(W\) of \(G\), and if \(W_I\) is the subgroup of \(W\) generated by the simple reflections along roots in \(I \subset \Delta\), then \(\epsilon = \sum_{I \subset \Delta} (-1)^{|I|} \mathrm{ind}_{W_I}^{W} 1_{W_I}\). This is proved by introducing a simplicial complex \(\Sigma\) attached canonically to \((W, S)\), showing that (the geometric realization of) \(\Sigma\) is a sphere, recognizing \(\epsilon\) as the representation of \(W\) on the top homology of \(\Sigma\), and then using the Hopf trace formula.
  2. Use point 1 and the Bruhat decomposition (+ adjacent theory) to deduce that \(\mathrm{St}\) or its negative is irreducible.
  3. Compute \(\mathrm{St}(1) = q^{\dim U}\) and conclude from point 2 that \(\mathrm{St}\) itself (rather than its negative) is irreducible, since its value at the identity is positive. This is proved by introducing a simplicial complex \(\mathcal{B}\) attached canonically to \(G(k)\), showing that (the geometric realization of) \(\mathcal{B}\) is homotopy equivalent to a wedge of \(q^{\dim U}\)-many spheres, and then using the Hopf trace formula to recognize \(\mathrm{St}\) as the representation of \(G(k)\) on the top homology of \(\mathcal{B}\).

In this post I will only prove points 1 and 2, leaving point 3 for a later post in which I will discuss buildings more fully. (One can show point 3 using elementary methods avoiding buildings, but the combinatorics are somewhat intricate and it seems to me that there is more intuition to be gained from the topological argument. For details, see the proof of Theorem 2(b) in Curtis, The Steinberg Character of a Finite Group with a \((B, N)\)-pair.)

A motivating point to keep in mind is that the Weyl group \(W\) is often thought of as an analogue of \(G(k)\) where \(k\) is replaced by “the field with one element”. For us this only has the heuristic value that many true statements about \(G(k)\) admit analogous statements for \(W\), and the latter are often viewed as “degenerate” forms of the former. For example, there are universal polynomials in \(q\) which compute the order of \(G(k)\), and specializing \(q = 1\) leads to formulas for the order of \(W\). See the introduction of this arXiv document, for example. In the situation above, we view the Coxeter complex of the Weyl group \(W\) as being a degenerate form of the spherical building of \(G(k)\) and the alternating character of \(W\) as being a degenerate form of the Steinberg character of \(G(k)\). As we will see in a later post, the spherical building \(\mathcal{B}\) of \(G(k)\) is built up from subcomplexes called apartments, each of which is canonically isomorphic to the Coxeter complex of \(W\). (Note also that \(\mathrm{St}\) has dimension \(q^{\dim U}\), and specializing \(q = 1\) gives the dimension \(1\) of the alternating character \(\epsilon\).) In this way, the basic facts about the Steinberg representation are deduced by comparing the degenerate and nondegenerate situations.
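To make the specialization \(q = 1\) concrete, here is the standard example \(G = \mathrm{GL}_n\) (chosen for illustration; it is not taken from the linked document). The order of \(G(k)\) factors as

\[ |\mathrm{GL}_n(k)| = q^{n(n-1)/2} \, (q - 1)^n \prod_{i=1}^{n} (1 + q + \cdots + q^{i-1}), \]

where \(q^{n(n-1)/2} = |U(k)|\), \((q - 1)^n = |T(k)|\), and the remaining product is \(|G(k)/B(k)|\). Setting \(q = 1\) in that last factor gives

\[ \prod_{i=1}^{n} i = n! = |S_n| = |W|. \]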

Before beginning to prove points 1 and 2 from above, I want to remark that the Steinberg representation shows up in many unexpected places in algebraic geometry and the theory of algebraic groups. I intend to write a post explaining this at a later date, but for the moment let it suffice to say that it can be shown that the reduction of \(\mathrm{St}\) modulo \(p\) comes from a representation of the algebraic group \(G\), and it holds a very special place among these representations. In particular, the fact that it can be defined concretely as a representation of a finite group means that its dimension can be determined, while in general the dimensions of the simple representations of \(G\) are very mysterious. As an example application, the Steinberg representation is the crucial ingredient in the proof that quotients of normal affine varieties by reductive groups (exist as schemes and) remain affine in characteristic \(p\); the same kinds of arguments (plus a lot of technique in reductive group schemes over general bases) lead to a very interesting characterization of reductivity over general bases, see Alper, Adequate moduli spaces and geometrically reductive group schemes, Cor. 9.7.7. The Steinberg representation is also an ingredient in the proof of Kempf’s vanishing theorem, a statement about the vanishing of the higher cohomology of certain line bundles on flag varieties.

Coxeter complexes

As mentioned above, we will need to introduce the Coxeter complex of a Coxeter system \((W, S)\), a certain simplicial complex attached to \((W, S)\). Before doing so, let us recall several facts about Coxeter systems. There will be very few proofs; we refer (vaguely) to [B, Chap. V, Sec. 1] for all results below. First, recall that by definition, \((W, S)\) is a Coxeter system when \(W\) is a group, \(S\) is a subset of \(W\), and \(W\) has a presentation of the form

\[ W = \langle s \in S: (st)^{m_{s, t}} = 1 \rangle \]

where \(m_{s, t} \in \mathbf{Z}_{> 0} \cup \{\infty\}\), \(m_{s, s} = 1\), and \(m_{s, t} \geq 2\) whenever \(s \neq t\). Note that distinct elements \(s\) and \(t\) commute if and only if \(m_{s, t} = 2\). We will deal only with the case that \(W\) is finite, though there are important cases (appearing in Bruhat-Tits theory, for example) in which infinite \(W\) are important. Coxeter groups are to be thought of as groups generated by reflections, and indeed as long as \(S\) is finite there is a canonical faithful \(|S|\)-dimensional real representation \(V\) of \(W\) with the properties that \(W\) is a discrete subgroup of \(\mathrm{GL}(V)\) and each element of \(S\) is a reflection on \(V\). This \(V\) is called the geometric representation of \((W, S)\). (Using this, it is easy to see that \(W\) is finite if and only if this representation can be equipped with an inner product so that \(W\) is a subgroup of the associated orthogonal group.)

Example. Consider the dihedral group \(D_n = \langle s, t: s^2 = t^2 = (st)^n = 1 \rangle\) for \(2 \leq n < \infty\). Clearly the pair \((D_n, \{s, t\})\) is a Coxeter system, and it can be realized concretely as the subgroup of \(\mathrm{GL}_2(\mathbf{R})\) generated by the orthogonal reflections along the \(x\)-axis and along the line \(\pi/n\) radians counterclockwise from the \(x\)-axis. If \(n = \infty\) then there is also a dihedral group \(D_\infty = \langle s, t: s^2 = t^2 = 1 \rangle\), but it is slightly more complicated to describe the representation for \(D_\infty\). (In this case \(s\) and \(t\) do not act by orthogonal reflections.)

Let \(W\) be the Weyl group of \(G\), i.e., the quotient \(N/T\), where \(N = N_G(T)\) is the normalizer of \(T\) in \(G\). This is a constant \(k\)-group, which we will often identify with its underlying group of \(k\)-points. A fundamental fact (and the reason this section is here) is that if \(W\) is the Weyl group of the reductive group \(G\) and \(S\) is the set of orthogonal reflections (with respect to a \(W\)-invariant inner product on \(X(T) \otimes \mathbf{R}\)) along roots in \(\Delta\), then the pair \((W, S)\) is a Coxeter system. This follows from the general formalism of Tits systems, see [B, Chap. V, Sec. 2].

Example. Taking \(G = \mathrm{SL}_n\) (or \(\mathrm{GL}_n\)), we can deduce that \((S_n, S)\) is a Coxeter system, where \(S_n\) is the symmetric group on \(n\) letters and \(S\) is the set of adjacent transpositions \(s_i = (i \,\, i+1)\) for \(1 \leq i \leq n-1\). In fact, \(S_n\) is generated by the \(s_i\) with the following relations:

  • \(s_i^2 = 1\) for all \(i\),
  • \((s_i s_{i+1})^3 = 1\) for \(1 \leq i \leq n-2\), and
  • \((s_i s_j)^2 = 1\) whenever \(j \not \in \{i-1, i, i+1\}\).
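The relations in this list are easy to confirm by machine for small \(n\). The Python sketch below (helper names are made up here) represents permutations as tuples and computes the order of each product \(s_i s_j\), i.e. the Coxeter matrix \((m_{s_i, s_j})\). It only checks that the relations hold with exactly these orders; it does not show that they give a presentation of \(S_n\).

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p); permutations are tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def order(p):
    """Return the order of the permutation p."""
    identity = tuple(range(len(p)))
    k, power = 1, p
    while power != identity:
        power = compose(power, p)
        k += 1
    return k

def adjacent_transpositions(n):
    """Return the list [s_1, ..., s_{n-1}] acting on {0, ..., n-1}."""
    gens = []
    for i in range(n - 1):
        perm = list(range(n))
        perm[i], perm[i + 1] = perm[i + 1], perm[i]
        gens.append(tuple(perm))
    return gens

def coxeter_matrix(n):
    """Entry (i, j) is the order of s_i s_j in S_n."""
    gens = adjacent_transpositions(n)
    return [[order(compose(s, t)) for t in gens] for s in gens]

for row in coxeter_matrix(4):
    print(row)
```

The diagonal entries are \(1\), entries for adjacent indices are \(3\), and all other entries are \(2\).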

If \(w \in W\) then we may write \(w = s_1 \cdots s_n\) for some \(s_1, \dots, s_n \in S\). If \(n\) is minimal among all such decompositions, then we will call \((s_1, \dots, s_n)\) a reduced decomposition of \(w\) and we will define the length \(\ell(w)\) of \(w\) to be equal to \(n\). For any decomposition \((s_1, \dots, s_N)\) of \(w\), there is some subsequence \(1 \leq i_1 < \cdots < i_m \leq N\) such that \((s_{i_1}, \dots, s_{i_m})\) is a reduced decomposition of \(w\). Thus in particular every decomposition of \(w\) admitting no proper subsequence that is again a decomposition of \(w\) is reduced, i.e., has length \(\ell(w)\). While there may be several different reduced decompositions of a given element, the set \(\{s_1, \dots, s_n\}\) of generators appearing in one does not depend on this choice.

Example. In \(D_4\) we have \(stst = tsts\), and these are both reduced decompositions. (This generalizes entirely to \(D_n\) when \(n < \infty\).)
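This example can be checked directly. In the Python sketch below, \(D_4\) is realized (as an assumption made for illustration) as the symmetries of a square with vertices \(0, 1, 2, 3\) in cyclic order, with \(s\) the reflection fixing vertices \(0\) and \(2\) and \(t\) the reflection swapping \(0 \leftrightarrow 1\) and \(2 \leftrightarrow 3\). Lengths are computed by breadth-first search on the Cayley graph.

```python
from collections import deque

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p); permutations are tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def word(*letters):
    """Multiply the given permutations from left to right."""
    result = tuple(range(len(letters[0])))
    for letter in letters:
        result = compose(result, letter)
    return result

def lengths(gens):
    """Map each element of the group generated by gens to its word length."""
    identity = tuple(range(len(gens[0])))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        w = queue.popleft()
        for g in gens:
            wg = compose(w, g)
            if wg not in dist:
                dist[wg] = dist[w] + 1
                queue.append(wg)
    return dist

s = (0, 3, 2, 1)  # reflection fixing vertices 0 and 2
t = (1, 0, 3, 2)  # reflection swapping 0<->1 and 2<->3

dist = lengths([s, t])
print(len(dist), word(s, t, s, t) == word(t, s, t, s), dist[word(s, t, s, t)])
```

The group has \(8\) elements, the two words agree, and their common value has length \(4\), so neither word can be shortened.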

For every subset \(I \subset S\), there is a subgroup \(W_I\) of \(W\) generated by all of the elements of \(I\). This is itself a Coxeter group, and we have \(W_I \cap S = I\), i.e., any element of \(S\) which can be written as a product of elements of \(I\) is itself an element of \(I\). In particular, the subgroup \(W_I\) determines the subset \(I\). The discussion above on minimal decompositions shows also that \(W_I \cap W_J = W_{I \cap J}\).

Example. For any Coxeter system \((W, S)\) and any \(s \in S\) we have \(W_{\{s\}} \cong \mathbf{Z}/2\). If \(s, t \in S\) are distinct then \(W_{\{s, t\}} \cong \langle s, t: s^2 = t^2 = (st)^{m_{s, t}} = 1 \rangle\) is a dihedral group (of finite or infinite order according to whether \(m_{s, t} < \infty\) or not).

Example. If \(W = S_n\) as in a previous example and \(I = \{s_i: i \neq m\}\) for some \(1 \leq m \leq n-1\), then \(W_I \cong S_m \times S_{n-m}\).
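Both the order of \(W_I\) in this example and the identity \(W_I \cap W_J = W_{I \cap J}\) can be confirmed by brute force in a small case. The following Python sketch takes \(W = S_4\) (an arbitrary choice) and generates each \(W_I\) as a set of permutations; the helper names are made up for this illustration.

```python
from itertools import combinations
from math import factorial

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p); permutations are tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def adjacent_transpositions(n):
    gens = []
    for i in range(n - 1):
        perm = list(range(n))
        perm[i], perm[i + 1] = perm[i + 1], perm[i]
        gens.append(tuple(perm))
    return gens

def generate(gens, n):
    """Return the subgroup of S_n generated by gens, as a set of tuples."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        w = frontier.pop()
        for g in gens:
            wg = compose(w, g)
            if wg not in elements:
                elements.add(wg)
                frontier.append(wg)
    return elements

n = 4
gens = adjacent_transpositions(n)  # gens[i - 1] plays the role of s_i

# |W_I| for I = {s_i : i != m} should be m! (n - m)!.
orders = {m: len(generate([g for i, g in enumerate(gens, start=1) if i != m], n))
          for m in range(1, n)}
print(orders)

# W_I ∩ W_J = W_{I ∩ J} for all subsets I, J of S.
index_subsets = [set(c) for r in range(n) for c in combinations(range(n - 1), r)]
intersections_ok = all(
    generate([gens[i] for i in I], n) & generate([gens[j] for j in J], n)
    == generate([gens[i] for i in I & J], n)
    for I in index_subsets for J in index_subsets
)
print(intersections_ok)
```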

We are now ready to define the Coxeter complex \(\Sigma = \Sigma(W, S)\). It is defined as follows: the vertices of \(\Sigma\) are precisely the cosets \(wW_{S - \{s\}}\), where as above \(W_{S - \{s\}}\) is the subgroup of \(W\) generated by \(S - \{s\}\). The facets of \(\Sigma\) are then defined to be those sets \(\{v_0, \dots, v_m\}\) of vertices such that \(v_0 \cap \cdots \cap v_m \neq \emptyset\). We will denote such a facet by \([v_0, \dots, v_m]\). If we have chosen an ordering \(s_0, \dots, s_n\) of \(S\), then we will let \(e_i = W_{S - \{s_i\}}\). If \(\alpha = \{i_1, \dots, i_m\}\) is a subset of \(\{0, \dots, n\}\), then we will say that the facet \([we_{i_1}, \dots, we_{i_m}]\) has type \(\alpha\). Note that every facet is of this form for some \(w \in W\) and some subset \(\{i_1, \dots, i_m\}\) of \(\{0, \dots, n\}\).

Example. If \(W = D_2 \cong (\mathbf{Z}/2)^2\) as above, then \(\Sigma\) is a square: order \(S\) by \(s < t\), so that \(s_0 = s\) and \(s_1 = t\). We have \(e_0 = W_{\{t\}} = \{1, t\}\) and \(e_1 = W_{\{s\}} = \{1, s\}\), and the facets are precisely the vertices \(e_0, se_0, e_1, te_1\) and the chambers (i.e., the maximal facets) \([e_0, e_1], [e_0, te_1], [se_0, e_1],\) and \([se_0, te_1]\). It is easily checked that this is a square. (Draw a picture!) In general, the Coxeter complex of \(D_n\) is a \(2n\)-gon.
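The claim about \(D_n\) can be tested by building the complex from its definition. In the Python sketch below, \(D_n\) is modelled (an assumption made for convenience) as pairs \((k, e)\) with \(k \in \mathbf{Z}/n\) and \(e \in \{0, 1\}\), with \(s = (0, 1)\) and \(t = (1, 1)\), so that \(st\) has order \(n\). Vertices are the left cosets of \(W_{\{t\}}\) and of \(W_{\{s\}}\), and two distinct vertices span an edge when they intersect.

```python
def dihedral(n):
    """Return (elements, mul, s, t) for the dihedral group of order 2n."""
    elements = [(k, e) for k in range(n) for e in (0, 1)]

    def mul(a, b):
        (k1, e1), (k2, e2) = a, b
        return ((k1 + (-1) ** e1 * k2) % n, e1 ^ e2)

    return elements, mul, (0, 1), (1, 1)

def coxeter_complex(n):
    """Return (vertices, edges) of the Coxeter complex of D_n."""
    elements, mul, s, t = dihedral(n)
    vertices = set()
    for w in elements:
        vertices.add(frozenset({w, mul(w, t)}))  # the coset w W_{t}
        vertices.add(frozenset({w, mul(w, s)}))  # the coset w W_{s}
    edges = {frozenset({u, v}) for u in vertices for v in vertices
             if u != v and u & v}
    return vertices, edges

def is_cycle(vertices, edges):
    """Check that the graph is connected and every vertex has degree 2."""
    neighbours = {v: set() for v in vertices}
    for edge in edges:
        u, v = tuple(edge)
        neighbours[u].add(v)
        neighbours[v].add(u)
    if any(len(nb) != 2 for nb in neighbours.values()):
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for nb in neighbours[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == len(vertices)

for n in (2, 3, 4, 5):
    vertices, edges = coxeter_complex(n)
    print(n, len(vertices), len(edges), is_cycle(vertices, edges))
```

A connected graph with \(2n\) vertices, all of degree \(2\), is a \(2n\)-gon, so this is the picture one is asked to draw, for each \(n\) tried.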

Notice that \(W\) acts on \(\Sigma\) by simplicial automorphisms, via \(w' \cdot wW_{S - \{s\}} = (w'w)W_{S - \{s\}}\). We will often not distinguish between \(\Sigma\) and its geometric realization. One of the main results of [B, Chap. V] is that \(\Sigma\) is a triangulation of the \((|S| - 1)\)-dimensional Euclidean sphere. (See the footnote at the end of this post.) In particular, if \(|S| = 1\) (so \(W = \mathbf{Z}/2\)) then its (integral) homology in degree \(0\) is \(\mathbf{Z}^2\) and \(0\) in all other degrees; if \(|S| > 1\) then its homology is \(\mathbf{Z}\) in degrees \(0\) and \(|S| - 1\), and it is \(0\) in all other degrees. The main theorem in this section relies on a computation of the character of the representation of \(W\) on \(H_{|S| - 1}(\Sigma, \mathbf{Z})\).

Theorem: If \(\epsilon\) is the alternating character of \((W, S)\), the homomorphism \(W \to \mathbf{C}^*\) determined by the condition \(\epsilon(s) = -1\) for all \(s \in S\), then

\[ \epsilon = \sum_{I \subset S} (-1)^{|I|} \mathrm{ind}_{W_I}^{W} 1_{W_I}. \]

Proof. Let \(n = |S| - 1\). If \(n = 0\) then the result is obvious, so we will assume \(n > 0\). Choose an ordering \(S = \{s_0, \dots, s_n\}\). For each \(m\), \(0 \leq m \leq n\), let \(C_m\) denote the free abelian group generated by the \(m\)-facets of \(\Sigma\). For each subset \(\alpha = \{i_0, \dots, i_m\}\) of \(\{0, \dots, n\}\), let \(L_\alpha\) denote the set of \(m\)-facets of \(\Sigma\) of type \(\alpha\). Let \(\kappa_m\) be the character of the permutation representation of \(W\) on \(C_m\); for each \(\alpha\) as above, let \(\lambda_\alpha\) denote the character of the permutation representation of \(W\) on \(L_\alpha\); and let \(\theta_m\) denote the character of the representation of \(W\) on \(H_m(\Sigma)\). Trivially, \(\kappa_m = \sum_{|\alpha| = m+1} \lambda_\alpha\). By the Hopf trace formula, we have

\[ \sum_{m=0}^{n} (-1)^m \kappa_m = \theta_0 + (-1)^n \theta_n. \]

Evidently \(\theta_0\) is the trivial representation since \(\Sigma\) is connected. Moreover, \(\theta_n = \epsilon\), as we can see by considering the fundamental \(n\)-cycle \(\sum_{w \in W} (-1)^{\ell(w)} [we_0, \dots, we_n]\). Then we have

\[ -\sum_{\emptyset \neq \alpha \subset \{0, \dots, n\}} (-1)^{|\alpha|} \lambda_\alpha = 1 + (-1)^n \epsilon. \]

Now if \(\alpha\), \(\pi\) are complementary subsets of \(\{0, \dots, n\}\) then \(W_\pi\) (the subgroup generated by the \(s_i\) with \(i \in \pi\)) is the subgroup of \(W\) fixing each \(e_i\), \(i \in \alpha\). So \(\lambda_\alpha = \phi_\pi := \mathrm{ind}_{W_\pi}^{W} 1_{W_\pi}\). As \(|\alpha| + |\pi| = n + 1\), we have \((-1)^{|\alpha|} = (-1)^{n+1}(-1)^{|\pi|}\), and therefore

\[ (-1)^n \sum_{\pi \neq \{0, \dots, n\}} (-1)^{|\pi|} \phi_\pi = 1 + (-1)^n \epsilon \]

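and a simple rearrangement (multiply by \((-1)^n\) and add the term for \(\pi = \{0, \dots, n\}\), namely \((-1)^{n+1} \phi_{\{0, \dots, n\}} = (-1)^{n+1}\), to both sides) gives

\[ \sum_\pi (-1)^{|\pi|} \phi_\pi = \epsilon \]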

as desired.
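The theorem can be verified numerically for small symmetric groups. The Python sketch below (a check under the assumption \(W = S_n\) with \(S\) the adjacent transpositions, not a substitute for the proof) evaluates \(\mathrm{ind}_{W_I}^{W} 1_{W_I}\) at \(w\) as the number of left cosets \(gW_I\) fixed by \(w\), forms the alternating sum over \(I \subset S\), and compares it with the sign of \(w\).

```python
from itertools import combinations, permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p); permutations are tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens, n):
    """Return the subgroup of S_n generated by gens, as a set of tuples."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        w = frontier.pop()
        for g in gens:
            wg = compose(w, g)
            if wg not in elements:
                elements.add(wg)
                frontier.append(wg)
    return elements

def sign(p):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return -1 if inversions % 2 else 1

def alternating_sum(n):
    """Map w to the value at w of sum over I of (-1)^|I| ind_{W_I}^W 1."""
    group = list(permutations(range(n)))
    gens = []
    for i in range(n - 1):
        perm = list(range(n))
        perm[i], perm[i + 1] = perm[i + 1], perm[i]
        gens.append(tuple(perm))
    values = {w: 0 for w in group}
    for r in range(len(gens) + 1):
        for subset in combinations(gens, r):
            subgroup = generate(subset, n)
            cosets = {frozenset(compose(g, h) for h in subgroup) for g in group}
            for w in group:
                fixed = sum(1 for c in cosets
                            if frozenset(compose(w, x) for x in c) == c)
                values[w] += (-1) ** r * fixed
    return values

for n in (2, 3, 4):
    values = alternating_sum(n)
    print(n, all(values[w] == sign(w) for w in values))
```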

Comparing characters of \(G(k)\) and \(W\)

In this section we will prove point 2 using fairly formal methods in finite group theory along with some comparisons between the subgroup structures of \(G(k)\) and \(W\). First, if we identify the sets \(\Delta\) of simple roots and \(S\) of orthogonal reflections along simple roots, then we have \(P_I(k) = B(k)W_IB(k)\) for all subsets \(I \subset S\). (Although it is not sensible to multiply elements of \(W\) with elements of \(G(k)\), these double cosets are still sensible objects because \(W\) normalizes \(T(k)\) and \(T\) is contained in \(B\).) In particular, the Bruhat decomposition states that \(G(k) = B(k)WB(k)\). It follows from the same formalism that if \(J, K \subset S\) are two subsets then the number of \((W_J, W_K)\)-double cosets is equal to the number of \((P_J(k), P_K(k))\)-double cosets. To give a flavor of the methods involved (and because this will be needed), I give a proof of this statement below.

Lemma: The map \(W_J\backslash W/W_K \to P_J(k)\backslash G(k)/P_K(k)\), \(W_JwW_K \mapsto P_J(k)wP_K(k)\), is a bijection.

Proof. First, the map is well-defined: for example, if \(s \in J\) then we have

\begin{align*} P_J(k)sw = B(k)W_JB(k)sw &\subset B(k)W_JB(k)w \cup B(k)W_JsB(k)w \\ &= P_J(k)w \end{align*}

by one of the axioms of a Tits system. The Bruhat decomposition shows that this map is surjective, so we need only show that for every \(w \in W\), \(P_J(k)wP_K(k) = B(k)W_JwW_KB(k)\). For this note that

\[ P_J(k)wP_K(k) = B(k)W_JB(k)wB(k)W_KB(k). \]

We claim \(B(k)W_JB(k)wB(k) = B(k)W_JwB(k)\). Since one inclusion is obvious, it suffices to show the inclusion \(\subset\). Let \(w' \in W_J\), and write \(w' = s_1 \cdots s_n\) for some \(s_i \in J\) (as we can do by definition of \(W_J\)). By [B, Chap. V, Sec. 2, Lem. 1], we have

\[ B(k)w'B(k)wB(k) \subset \bigcup_{1 \leq i_1 < \cdots < i_m \leq n} B(k)s_{i_1} \cdots s_{i_m} wB(k). \]

The right hand side is clearly contained in \(B(k)W_JwB(k)\), so we are done. It suffices now to show \(W_JwB(k)W_K \subset B(k)W_JwW_KB(k)\), but indeed this follows from precisely the same argument as above. So the first displayed equation is true and the Lemma has been proved.
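In the smallest interesting case the Lemma can be confirmed by exhaustive enumeration. The Python sketch below assumes \(G = \mathrm{GL}_3\) over \(\mathbf{F}_2\) with \(B\) the upper triangular matrices, so that \(W = S_3\) and the \(P_J\) are the block upper triangular subgroups; all function names are made up for this illustration. It counts \((P_J(k), P_K(k))\)-double cosets in \(G(k)\) and \((W_J, W_K)\)-double cosets in \(W\) for every pair \(J, K\).

```python
from itertools import combinations, permutations, product

def mat_mul(a, b):
    """Multiply 3x3 matrices over F_2 (tuples of row tuples)."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) % 2
                       for j in range(3)) for i in range(3))

def det_mod2(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % 2

def gl3_f2():
    matrices = []
    for bits in product((0, 1), repeat=9):
        m = (bits[0:3], bits[3:6], bits[6:9])
        if det_mod2(m) == 1:
            matrices.append(m)
    return matrices

def parabolic(J, G):
    """P_J: a below-diagonal entry may be nonzero only inside a block given by J."""
    return [m for m in G
            if (m[1][0] == 0 or 1 in J)
            and (m[2][1] == 0 or 2 in J)
            and (m[2][0] == 0 or (1 in J and 2 in J))]

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def weyl_subgroup(J):
    """W_J inside S_3, generated by s_1 = (1 2) and s_2 = (2 3) for indices in J."""
    simple = {1: (1, 0, 2), 2: (0, 2, 1)}
    elements = {(0, 1, 2)}
    frontier = [(0, 1, 2)]
    while frontier:
        w = frontier.pop()
        for j in J:
            ws = compose(w, simple[j])
            if ws not in elements:
                elements.add(ws)
                frontier.append(ws)
    return list(elements)

def count_double_cosets(group, H, K, mul):
    seen, count = set(), 0
    for g in group:
        if g not in seen:
            count += 1
            seen.update(mul(mul(h, g), k) for h in H for k in K)
    return count

G = gl3_f2()
W = list(permutations(range(3)))
subsets = [frozenset(c) for r in range(3) for c in combinations((1, 2), r)]
table = {(J, K): (count_double_cosets(W, weyl_subgroup(J), weyl_subgroup(K), compose),
                  count_double_cosets(G, parabolic(J, G), parabolic(K, G), mat_mul))
         for J in subsets for K in subsets}
for (J, K), counts in table.items():
    print(sorted(J), sorted(K), counts)
```

For \(J = K = \emptyset\) the common value is \(|W| = 6\), which is the Bruhat decomposition in this case.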

We are now ready to compare \(\epsilon\) and \(\mathrm{St}\). For each subset \(J\) of \(S\) (\(= \Delta\)), we let \(\psi_J = \mathrm{ind}_{W_J}^{W} 1_{W_J}\) and \(\chi_J = \mathrm{ind}_{P_J(k)}^{G(k)} 1_{P_J(k)}\).

Theorem: The mapping \(\theta: \sum_J a_J \psi_J \mapsto \sum_J a_J \chi_J\) is an isometry (with respect to the usual inner product) from the complex vector space generated by the characters \(\psi_J\) of \(W\) to the complex vector space generated by the characters \(\chi_J\) of \(G(k)\). If \(\psi = \sum_J n_J \psi_J\) is an irreducible character of \(W\) for integers \(n_J\), then \(\chi = \sum_J n_J \chi_J\) or its negative is an irreducible character of \(G(k)\). In particular, \(\mathrm{St}\) or its negative is an irreducible character of \(G(k)\).

Proof. The second statement follows from the first: namely, write \(\chi\) as an (integral) linear combination of irreducible characters of \(G(k)\) and note that the isometry statement implies that exactly one of the coefficients in this linear combination is nonzero, and this nonzero coefficient is either equal to 1 or -1. The final statement follows from the theorem in the previous section.

It is now enough to show that \((\psi_J, \psi_K)_W = (\chi_J, \chi_K)_{G(k)}\) for all subsets \(J, K\) of \(S\). In fact, we will show that \((\psi_J, \psi_K)_W\) is equal to the number of \((W_J, W_K)\)-double cosets in \(W\). The same method will show that \((\chi_J, \chi_K)_{G(k)}\) is equal to the number of \((P_J(k), P_K(k))\)-double cosets in \(G(k)\). As the number of \((W_J, W_K)\)-double cosets is equal to the number of \((P_J(k), P_K(k))\)-double cosets (as shown in the Lemma above), the result follows.

First, note that if \(H\) is a finite group acting transitively on two sets \(X\) and \(Y\), and if \(x_0 \in X\), \(y_0 \in Y\), then the number of orbits of \(H\) acting on \(X \times Y\) is equal to the number of \((H_{x_0}, H_{y_0})\)-double cosets in \(H\), where \(H_{x_0}\) and \(H_{y_0}\) are the stabilizers of \(x_0\) and \(y_0\) in \(H\), respectively. If \(\psi_X\) and \(\psi_Y\) are the characters of the permutation representations of \(H\) on \(X\) and \(Y\), then we note that \(\psi_X(h)\psi_Y(h)\) is the number of fixed points of \(h\) acting on \(X \times Y\). By Burnside’s lemma, it follows that the number of orbits of \(H\) acting on \(X \times Y\) is equal to

\[ |H|^{-1}\sum_{h \in H} \psi_X(h)\psi_Y(h) = (\psi_X, \psi_Y)_H. \]

Apply these observations to \(H = W\), \(X = \{wW_J\}_{w \in W}\), \(Y = \{wW_K\}_{w \in W}\), \(x_0 = W_J\), \(y_0 = W_K\), and similarly for \(G(k)\), to conclude.
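The double-coset formula for \((\psi_J, \psi_K)_W\) is easy to test on the Weyl group side. The Python sketch below assumes \(W = S_n\) with the adjacent transpositions as \(S\) (helper names are illustrative), computes \(\sum_{w} \psi_J(w)\psi_K(w)\) from fixed-point counts on cosets, and compares it with \(|W|\) times the number of \((W_J, W_K)\)-double cosets.

```python
from itertools import combinations, permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p); permutations are tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens, n):
    """Return the subgroup of S_n generated by gens, as a set of tuples."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        w = frontier.pop()
        for g in gens:
            wg = compose(w, g)
            if wg not in elements:
                elements.add(wg)
                frontier.append(wg)
    return elements

def permutation_character(subgroup, group):
    """Map w to the number of left cosets g*subgroup fixed by w."""
    cosets = {frozenset(compose(g, h) for h in subgroup) for g in group}
    return {w: sum(1 for c in cosets
                   if frozenset(compose(w, x) for x in c) == c)
            for w in group}

def count_double_cosets(group, H, K):
    seen, count = set(), 0
    for g in group:
        if g not in seen:
            count += 1
            seen.update(compose(compose(h, g), k) for h in H for k in K)
    return count

def compare(n):
    """Map (J, K) to (sum over w of psi_J(w) psi_K(w), |W| * number of double cosets)."""
    group = list(permutations(range(n)))
    gens = []
    for i in range(n - 1):
        perm = list(range(n))
        perm[i], perm[i + 1] = perm[i + 1], perm[i]
        gens.append(tuple(perm))
    subsets = [c for r in range(n) for c in combinations(range(n - 1), r)]
    subgroup = {J: generate([gens[i] for i in J], n) for J in subsets}
    psi = {J: permutation_character(subgroup[J], group) for J in subsets}
    return {(J, K): (sum(psi[J][w] * psi[K][w] for w in group),
                     len(group) * count_double_cosets(group, subgroup[J], subgroup[K]))
            for J in subsets for K in subsets}

results = compare(3)
print(all(a == b for a, b in results.values()))
```

The same comparison for \(G(k)\) only requires swapping in the group and subgroups, since the argument above uses nothing beyond Burnside's lemma.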

Footnote. It is not entirely trivial to extract from [B, Chap. V] the fact that \(\Sigma(W, S)\) is a triangulation of an \((|S| - 1)\)-dimensional sphere. The main point to show is that (the geometric realization of) \(\Sigma(W, S)\) can alternatively be described in the following way: equip the geometric representation \(V\) of \((W, S)\) with a \(W\)-invariant inner product, so that \(W\) is generated as a subgroup of \(\mathrm{GL}(V)\) by orthogonal reflections. There is a system of hyperplanes in \(V\) consisting of those hyperplanes fixed by some reflection in \(W\), and this system satisfies the axioms outlined in the beginning of [B, Chap. V, Sec. 3]. Equip the unit sphere \(K \subset V\) (with respect to the given inner product) with the triangulation coming from this system of hyperplanes. Using the results of [B, Chap. V, nos. 3.2, 3.3], one can show that in fact \(\Sigma(W, S)\) is isomorphic to this triangulation of \(K\).

References. [B] = Bourbaki, Groupes et algèbres de Lie, Chaps. IV, V, and VI
