
The Steinberg Representation

In this post I want to describe a remarkable representation associated to finite groups of Lie type. For this, let G be a connected reductive group over a finite field k with q = p^r elements, and let U be the unipotent radical of some Borel k-subgroup B of G. Steinberg constructed an irreducible representation \mathrm{St} of G(k) of dimension q^{\dim U}. For convenience, we will assume that G is split, although the experts can rest assured that everything goes through if one replaces absolute root systems by relative root systems. If T is a (split) maximal k-torus of B and \Delta is the system of simple roots corresponding to the pair (B, T), then we define

    \[ \mathrm{St} = \sum_{I \subset \Delta} (-1)^{|I|} \mathrm{ind}_{P_I(k)}^{G(k)} 1_{P_I(k)}, \]

where P_I is the parabolic subgroup of G containing B corresponding to I. From this definition, it is unclear that this is a character of G(k) (rather than a generalized character), let alone irreducible, and it is not clear what its dimension should be.
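As a sanity check on the dimension claim, here is a minimal Python sketch that evaluates the alternating sum of the degrees [G(k) : P_I(k)] for G = \mathrm{GL}_n. It assumes the standard formula for the index of a block upper triangular parabolic as a Gaussian multinomial coefficient, which is not derived in this post, and the function names are purely illustrative. It only confirms St(1) = q^{n(n-1)/2} numerically for small n and q; it says nothing about irreducibility.

```python
from itertools import combinations

def q_int(i, q):
    """q-analogue [i]_q = 1 + q + ... + q^(i-1)."""
    return sum(q**j for j in range(i))

def q_factorial(n, q):
    """[n]_q! = [1]_q [2]_q ... [n]_q, the number of k-points of GL_n / B."""
    out = 1
    for i in range(1, n + 1):
        out *= q_int(i, q)
    return out

def parabolic_index(n, I, q):
    """[GL_n(k) : P_I(k)] for I a subset of the simple roots {1, ..., n-1}.

    Assumed convention: simple root i lies in the Levi factor of P_I
    exactly when i is in I, so the diagonal blocks are cut at the i not in I.
    """
    cuts = [0] + [i for i in range(1, n) if i not in I] + [n]
    blocks = [b - a for a, b in zip(cuts, cuts[1:])]
    denom = 1
    for m in blocks:
        denom *= q_factorial(m, q)
    return q_factorial(n, q) // denom

def steinberg_degree(n, q):
    """Alternating sum over I of the degree of ind_{P_I(k)}^{G(k)} 1."""
    roots = range(1, n)
    return sum((-1) ** r * parabolic_index(n, set(I), q)
               for r in range(n) for I in combinations(roots, r))
```

For n = 3 the sum is (1+q)(1+q+q^2) - 2(1+q+q^2) + 1 = q^3, in agreement with q^{\dim U}.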

Example. Let G = \mathrm{SL}_2, so that \Delta consists of a single element and \mathrm{St} = \mathrm{ind}_{B(k)}^{G(k)} 1_{B(k)} - 1_{G(k)}. By definition, \mathrm{ind}_{B(k)}^{G(k)} 1_{B(k)} is the space of \mathbf{C}-valued functions on G(k)/B(k) = \mathbf{P}^1(k), while 1_{G(k)} may be understood as the space of constant functions on the same space. Thus one can regard \mathrm{St} as the (q-dimensional) space of complex-valued functions on \mathbf{P}^1(k) of average value 0. The fact that \mathrm{St} is irreducible is still not obvious, but it follows in this case from arguments with Frobenius reciprocity and Mackey theory. (If anyone can think of a simple argument, I would be happy to hear it!)
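One possible short route to irreducibility here: if \pi denotes the permutation character of G(k) on \mathbf{P}^1(k), then (\pi, \pi) is the number of orbits of G(k) on \mathbf{P}^1(k) \times \mathbf{P}^1(k), and if that number is 2 then (\mathrm{St}, \mathrm{St}) = (\pi, \pi) - 2(\pi, 1) + (1, 1) = 2 - 2 + 1 = 1. The Python sketch below checks (\pi, \pi) = 2 by brute force for \mathrm{SL}_2(\mathbf{F}_p) with p a small prime. It is a numerical check for prime fields only, not a proof, and all names in it are illustrative.

```python
from itertools import product

def sl2(p):
    """All 2x2 matrices (a, b, c, d) over F_p (p prime) with determinant 1."""
    return [(a, b, c, d) for a, b, c, d in product(range(p), repeat=4)
            if (a * d - b * c) % p == 1]

def proj_line(p):
    """Points of P^1(F_p) as normalized vectors: (1, y) and (0, 1)."""
    return [(1, y) for y in range(p)] + [(0, 1)]

def normalize(v, p):
    """Scale a nonzero vector to its normalized representative."""
    x, y = v
    if x % p != 0:
        inv = pow(x, -1, p)  # modular inverse, Python 3.8+
        return (1, (y * inv) % p)
    return (0, 1)

def act(g, v, p):
    """Action of the matrix g on the point v of P^1(F_p)."""
    a, b, c, d = g
    x, y = v
    return normalize(((a * x + b * y) % p, (c * x + d * y) % p), p)

def permutation_character_norm(p):
    """(pi, pi) for the permutation character pi of SL_2(F_p) on P^1(F_p)."""
    G, X = sl2(p), proj_line(p)
    total = 0
    for g in G:
        fixed = sum(1 for v in X if act(g, v, p) == v)
        total += fixed * fixed
    return total // len(G)
```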

Example. The Steinberg representation is not intrinsic to the abstract group G(k), as it can happen that a finite group can be given the structure of a group of Lie type in more than one way. For example, \mathrm{PSL}_2(\mathbf{F}_7) \cong \mathrm{SL}_3(\mathbf{F}_2), and this is the unique simple group of order 168. (Here \mathrm{PSL}_2(\mathbf{F}_7) denotes the quotient of \mathrm{SL}_2(\mathbf{F}_7) by its center; it does not denote \mathrm{PGL}_2(\mathbf{F}_7), which has order 336 and is not simple.) Thus there can be more than one Steinberg representation associated to a given group. In this case, there is one Steinberg representation of dimension 7 and another of dimension 8. (The fact that the Steinberg representation of \mathrm{SL}_2(\mathbf{F}_7) descends to \mathrm{PSL}_2(\mathbf{F}_7) can be deduced from the definition using the Bruhat decomposition and an inclusion-exclusion argument.)
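The orders and dimensions in this example can be checked with a few lines of Python. The sketch assumes the standard order formulas for \mathrm{SL}_n(\mathbf{F}_q) and \mathrm{PSL}_n(\mathbf{F}_q), which are not proved in this post; the function names are illustrative.

```python
from math import gcd

def order_sl(n, q):
    """|SL_n(F_q)| = q^(n(n-1)/2) * prod_{i=2}^{n} (q^i - 1)."""
    out = q ** (n * (n - 1) // 2)
    for i in range(2, n + 1):
        out *= q**i - 1
    return out

def order_psl(n, q):
    """|PSL_n(F_q)| = |SL_n(F_q)| / gcd(n, q - 1)."""
    return order_sl(n, q) // gcd(n, q - 1)

def steinberg_dim(n, q):
    """q^(dim U) for SL_n, where dim U = n(n-1)/2."""
    return q ** (n * (n - 1) // 2)

print(order_psl(2, 7), order_sl(3, 2))          # both groups have order 168
print(steinberg_dim(2, 7), steinberg_dim(3, 2))  # dimensions 7 and 8
```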

The proof that \mathrm{St} has properties as described works essentially as follows:

  1. Show that if \epsilon is the alternating character of the Weyl group W of G, and if W_I is the subgroup of W generated by the simple reflections along roots in I \subset \Delta, then \epsilon = \sum_{I \subset \Delta} (-1)^{|I|} \mathrm{ind}_{W_I}^{W} 1_{W_I}. This is proved by introducing a simplicial complex \Sigma attached canonically to (W, S), showing that (the geometric realization of) \Sigma is a sphere, recognizing \epsilon as the representation of W on the top homology of \Sigma, and then using the Hopf trace formula.
  2. Use point 1 and the Bruhat decomposition (+ adjacent theory) to deduce that \mathrm{St} or its negative is irreducible.
  3. Compute \mathrm{St}(1) = q^{\dim U} and conclude in particular from point 2 that \mathrm{St} is irreducible. This is proved by introducing a simplicial complex \mathcal{B} attached canonically to G(k), showing that (the geometric realization of) \mathcal{B} is homotopy equivalent to a wedge of q^{\dim U}-many spheres, and then using the Hopf trace formula to recognize \mathrm{St} as the representation of G(k) on the top homology of \mathcal{B}.

In this post I will only prove points 1 and 2, leaving point 3 for a later post in which I will discuss buildings more fully. (One can show point 3 using elementary methods avoiding buildings, but the combinatorics are somewhat intricate and it seems to me that there is more intuition to be gained from the topological argument. For details, see the proof of Theorem 2(b) in Curtis, The Steinberg Character of a Finite Group with a (B, N)-pair.)

A motivating point to keep in mind is that the Weyl group W is often thought of as an analogue of G(k) where k is replaced by “the field with one element”. For us this only has the heuristic value that many true statements about G(k) admit analogous statements for W, and the latter are often viewed as “degenerate” forms of the former. For example, there are universal polynomials in q which compute the order of G(k), and specializing q = 1 leads to formulas for the order of W. See the introduction of this arXiv document, for example. In the situation above, we view the Coxeter complex of the Weyl group W as being a degenerate form of the spherical building of G(k) and the alternating character of W as being a degenerate form of the Steinberg character of G(k). As we will see in a later post, the spherical building \mathcal{B} of G(k) is built up from subcomplexes called apartments, each of which is canonically isomorphic to the Coxeter complex of W. (Note also that \mathrm{St} has dimension q^{\dim U}, and specializing q = 1 gives the dimension of the alternating character \epsilon.) In this way, the basic facts about the Steinberg representation are deduced by comparing the degenerate and nondegenerate situations.
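To make the q = 1 heuristic concrete in one case, here is a small Python sketch for G = \mathrm{GL}_n. It rests on one common reading of the specialization, which is an assumption on my part: one first divides |G(k)| by |B(k)| = (q-1)^n q^{n(n-1)/2}, and it is the resulting polynomial [n]_q! that specializes to |S_n| = n! at q = 1. The function names are illustrative.

```python
from math import factorial

def q_int(i, q):
    """q-analogue [i]_q = 1 + q + ... + q^(i-1)."""
    return sum(q**j for j in range(i))

def flag_count(n, q):
    """[n]_q!; for a prime power q this equals |GL_n(F_q)| / |B(F_q)|."""
    out = 1
    for i in range(1, n + 1):
        out *= q_int(i, q)
    return out

def order_gl(n, q):
    """|GL_n(F_q)| = prod_{i=0}^{n-1} (q^n - q^i)."""
    out = 1
    for i in range(n):
        out *= q**n - q**i
    return out

def order_borel(n, q):
    """|B(F_q)| = (q-1)^n q^(n(n-1)/2) for the upper triangular Borel."""
    return (q - 1) ** n * q ** (n * (n - 1) // 2)

for n in range(1, 6):
    # The polynomial identity holds at prime powers q ...
    assert all(order_gl(n, q) == order_borel(n, q) * flag_count(n, q)
               for q in (2, 3, 4, 5))
    # ... and evaluating the same polynomial at q = 1 gives |S_n|.
    assert flag_count(n, 1) == factorial(n)
```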

Before beginning to prove points 1 and 2 from above, I want to remark that the Steinberg representation shows up in many unexpected places in algebraic geometry and the theory of algebraic groups. I intend to write a post explaining this at a later date, but for the moment let it suffice to say that it can be shown that the reduction of \mathrm{St} modulo p comes from a representation of the algebraic group G, and it holds a very special place among these representations. In particular, the fact that it can be defined concretely as a representation of a finite group means that its dimension can be determined, while in general the dimensions of the simple representations of G are very mysterious. As an example application, the Steinberg representation is the crucial ingredient in the proof that quotients of normal affine varieties by reductive groups (exist as schemes and) remain affine in characteristic p; the same kinds of arguments (plus a lot of technique in reductive group schemes over general bases) lead to a very interesting characterization of reductivity over general bases, see Alper, Adequate moduli spaces and geometrically reductive group schemes, Cor. 9.7.7. The Steinberg representation is also an ingredient in the proof of Kempf’s vanishing theorem, a statement about the vanishing of the higher cohomology of certain line bundles on flag varieties.

Coxeter complexes

As mentioned above, we will need to introduce the Coxeter complex of a Coxeter system (W, S), a certain simplicial complex attached to (W, S). Before doing so, let us recall several facts about Coxeter systems. There will be very few proofs; we refer (vaguely) to [B, Chap. V, Sec. 1] for all results below. First, recall that by definition, (W, S) is a Coxeter system when W is a group, S is a subset of W, and W has a presentation of the form

    \[ W = \langle s \in S: (st)^{m_{s, t}} = 1 \rangle \]

where m_{s, t} \in \mathbf{Z}_{> 0} \cup \{\infty\}, m_{s, s} = 1, and m_{s, t} \geq 2 whenever s \neq t. Note that distinct elements s and t commute if and only if m_{s, t} = 2. We will deal only with the case that W is finite, though there are important cases (appearing in Bruhat-Tits theory, for example) in which infinite W are important. Coxeter groups are to be thought of as groups generated by reflections, and indeed as long as S is finite there is a canonical faithful (finite-dimensional) real representation V of W with the properties that W is a discrete subgroup of \mathrm{GL}(V) and each element of S is a reflection on V. (Using this, it is easy to see that W is finite if and only if this representation can be equipped with an inner product so that W is a subgroup of the associated orthogonal group.)

Example. Consider the dihedral group D_n = \langle s, t: s^2 = t^2 = (st)^n = 1 \rangle for 2 \leq n < \infty. Clearly the pair (D_n, \{s, t\}) is a Coxeter system, and it can be realized concretely as the subgroup of \mathrm{GL}_2(\mathbf{R}) generated by the orthogonal reflections along the x-axis and along the line \pi/n radians counterclockwise from the x-axis. If n = \infty then there is also a dihedral group D_\infty = \langle s, t: s^2 = t^2 = 1 \rangle, but it is slightly more complicated to describe the representation for D_\infty. (In this case s and t do not act by orthogonal reflections.)

Let W be the Weyl group of G, i.e., the quotient N/T, where N = N_G(T) is the normalizer of T in G. This is a constant k-group, which we will often identify with its underlying group of k-points. A fundamental fact (and the reason this section is here) is that if W is the Weyl group of the reductive group G and S is the set of orthogonal reflections (with respect to a W-invariant inner product on X(T) \otimes \mathbf{R}) along roots in \Delta, then the pair (W, S) is a Coxeter system. This follows from the general formalism of Tits systems, see [B, Chap. V, Sec. 2].

Example. Taking G = \mathrm{SL}_n (or \mathrm{GL}_n), we can deduce that (S_n, S) is a Coxeter system, where S_n is the symmetric group on n letters and S is the set of adjacent transpositions s_i = (i \,\, i+1) for 1 \leq i \leq n-1. In fact, S_n is generated by the s_i with the following relations:

  • s_i^2 = 1 for all i,
  • (s_i s_{i+1})^3 = 1 for all i, and
  • (s_i s_j)^2 = 1 whenever j \not \in \{i-1, i, i+1\}.
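The relations above are easy to confirm by machine. The following Python sketch computes the orders m_{s_i, s_j} of the products s_i s_j in S_n; it only checks that the relations hold, not that they suffice to present S_n. Permutations are stored as tuples and the helper names are illustrative.

```python
def transposition(i, n):
    """Adjacent transposition s_i = (i i+1) on {1, ..., n}; p[j-1] is the image of j."""
    p = list(range(1, n + 1))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)

def compose(a, b):
    """Composition of permutations: (a b)(j) = a(b(j))."""
    return tuple(a[b[j] - 1] for j in range(len(b)))

def order(p):
    """Order of the permutation p."""
    identity = tuple(range(1, len(p) + 1))
    k, x = 1, p
    while x != identity:
        x = compose(x, p)
        k += 1
    return k

def coxeter_matrix(n):
    """Dictionary (i, j) -> order of s_i s_j in S_n, for 1 <= i, j <= n-1."""
    s = {i: transposition(i, n) for i in range(1, n)}
    return {(i, j): order(compose(s[i], s[j])) for i in s for j in s}
```

For n = 5 this returns 1 on the diagonal, 3 when |i - j| = 1, and 2 otherwise.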

If w \in W then we may write w = s_1 \cdots s_n for some s_1, \dots, s_n \in S. If n is minimal among all such decompositions, then we will call (s_1, \dots, s_n) a reduced decomposition of w and we will define the length \ell(w) of w to be equal to n. For any decomposition (s_1, \dots, s_N) of w, there is some subsequence 1 \leq i_1 < \cdots < i_m \leq N such that (s_{i_1}, \dots, s_{i_m}) is a reduced decomposition of w. Thus in particular every decomposition for which there exists no proper such subsequence is reduced, i.e., of length \ell(w). While there may be several different reduced decompositions of a given element, the set \{s_1, \dots, s_n\} does not depend on this choice.

Example. In D_4 we have stst = tsts, and these are both reduced decompositions. (This generalizes entirely to D_n when n < \infty.)
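Reduced decompositions in a small group can be enumerated by breadth-first search from the identity. The Python sketch below does this for D_n with n \geq 3, realized (as an assumption made for convenience) by the two reflections i \mapsto -i and i \mapsto 1 - i of \mathbf{Z}/n; this realization is not faithful for n = 2. The names are illustrative.

```python
from collections import deque

def dihedral_generators(n):
    """Two reflections of Z/n generating D_n (n >= 3): s(i) = -i and t(i) = 1 - i."""
    s = tuple((-i) % n for i in range(n))
    t = tuple((1 - i) % n for i in range(n))
    return {"s": s, "t": t}

def compose(a, b):
    """Composition of maps on {0, ..., n-1}: (a b)(i) = a(b(i))."""
    return tuple(a[b[i]] for i in range(len(b)))

def reduced_words(gens):
    """Breadth-first search; returns {element: list of all shortest words}."""
    n = len(next(iter(gens.values())))
    e = tuple(range(n))
    words = {e: [""]}
    frontier = deque([e])
    while frontier:
        w = frontier.popleft()
        for name, g in gens.items():
            v = compose(w, g)  # right-multiply by a generator
            new = [word + name for word in words[w]]
            if v not in words:
                words[v] = new
                frontier.append(v)
            elif len(words[v][0]) == len(new[0]):
                words[v].extend(new)  # another word of the same minimal length
    return words

words = reduced_words(dihedral_generators(4))
longest = max(words.values(), key=lambda ws: len(ws[0]))
print(sorted(longest))  # the two reduced words of the longest element of D_4
```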

For every subset I \subset S, there is a subgroup W_I of W generated by all of the elements of I. This is itself a Coxeter group, and we have W_I \cap S = I, i.e., any element of S which can be written as a product of elements of I is itself an element of I. In particular, the subgroup W_I determines the subset I. The discussion above on minimal decompositions shows also that W_I \cap W_J = W_{I \cap J}.

Example. For any Coxeter system (W, S) and any s \in S we have W_{\{s\}} \cong \mathbf{Z}/2. If s, t \in S are distinct then W_{\{s, t\}} \cong \langle s, t: s^2 = t^2 = (st)^{m_{s, t}} = 1 \rangle is a dihedral group (of finite or infinite order according to whether m_{s, t} < \infty or not).

Example. If W = S_n as in a previous example and I = \{s_i: i \neq m\} for some 1 \leq m \leq n-1, then W_I \cong S_m \times S_{n-m}.
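A quick computational check of this example: the Python sketch below generates the subgroup of S_n spanned by the s_i with i \neq m and confirms that it has m!(n-m)! elements, each of which preserves \{1, \dots, m\}. The helper names are illustrative.

```python
from math import factorial

def transposition(i, n):
    """Adjacent transposition s_i = (i i+1) on {1, ..., n}; p[j-1] is the image of j."""
    p = list(range(1, n + 1))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)

def compose(a, b):
    """Composition of permutations: (a b)(j) = a(b(j))."""
    return tuple(a[b[j] - 1] for j in range(len(b)))

def parabolic_subgroup(n, m):
    """The subgroup W_I of S_n generated by all s_i with i != m."""
    gens = [transposition(i, n) for i in range(1, n) if i != m]
    identity = tuple(range(1, n + 1))
    seen, stack = {identity}, [identity]
    while stack:
        w = stack.pop()
        for g in gens:
            v = compose(w, g)
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

n = 5
for m in range(1, n):
    H = parabolic_subgroup(n, m)
    assert len(H) == factorial(m) * factorial(n - m)
    # every element permutes {1, ..., m} among themselves
    assert all(set(w[:m]) == set(range(1, m + 1)) for w in H)
```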

We are now ready to define the Coxeter complex \Sigma = \Sigma(W, S). It is defined as follows: the vertices of \Sigma are precisely the cosets wW_{S - \{s\}}, where as above W_{S - \{s\}} is the subgroup of W generated by S - \{s\}. The facets of \Sigma are then defined to be those sets \{v_0, \dots, v_m\} of vertices such that v_0 \cap \cdots \cap v_m \neq \emptyset. We will denote such a facet by [v_0, \dots, v_m]. If we have chosen an ordering s_0, \dots, s_n of S, then we will let e_i = W_{S - \{s_i\}}. If \alpha = \{i_1, \dots, i_m\} is a subset of \{0, \dots, n\}, then we will say that the facet [we_{i_1}, \dots, we_{i_m}] has type \alpha. Note that every facet is of this form for some w \in W and some subset \{i_1, \dots, i_m\} of \{0, \dots, n\}.

Example. If W = D_2 \cong (\mathbf{Z}/2)^2 as above, then \Sigma is a square: order S by s < t, so that s_0 = s and s_1 = t. We have e_0 = W_{\{t\}} = \{1, t\} and e_1 = W_{\{s\}} = \{1, s\}, and the facets are precisely the vertices e_0, se_0, e_1, te_1 and the chambers [e_0, e_1], [se_0, e_1], [e_0, te_1], and [se_0, te_1]. It is easily checked that this is a square. (Draw a picture!) In general, the Coxeter complex of D_n is a 2n-gon.
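In place of a picture, the Coxeter complex of D_n can be built directly from the definition. The Python sketch below models D_n (for n \geq 2) as pairs (r, f) with the semidirect product multiplication, takes the cosets of \langle t \rangle and \langle s \rangle as vertices, joins two vertices when they intersect, and checks that the result is a connected graph in which every vertex has exactly two neighbours. The encoding of D_n and the helper names are my own choices for illustration.

```python
from itertools import combinations

def dihedral(n):
    """D_n as pairs (r, f), r in Z/n, f in {0, 1}, with generators s and t."""
    elems = [(r, f) for r in range(n) for f in (0, 1)]
    def mul(a, b):
        (r1, f1), (r2, f2) = a, b
        return ((r1 + (-1) ** f1 * r2) % n, f1 ^ f2)
    return elems, mul, (0, 1), (1, 1)  # s = (0, 1), t = (1, 1), st of order n

def coxeter_complex(n):
    """Vertices (cosets of <t> and <s>) and edges (intersecting pairs)."""
    elems, mul, s, t = dihedral(n)
    one = (0, 0)
    e0 = [one, t]  # W_{S - {s}} = <t>
    e1 = [one, s]  # W_{S - {t}} = <s>
    vertices = {frozenset(mul(w, h) for h in H) for w in elems for H in (e0, e1)}
    edges = {frozenset((u, v)) for u, v in combinations(vertices, 2) if u & v}
    return vertices, edges

def is_polygon(vertices, edges):
    """True if the graph is connected and every vertex lies on exactly two edges."""
    nbrs = {v: set() for v in vertices}
    for e in edges:
        u, v = tuple(e)
        nbrs[u].add(v)
        nbrs[v].add(u)
    if any(len(ns) != 2 for ns in nbrs.values()):
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for v in nbrs[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(vertices)
```

For n = 2 this produces 4 vertices and 4 edges, i.e., a square.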

Notice that W acts on \Sigma by simplicial automorphisms, via w' \cdot wW_{S - \{s\}} = (w'w)W_{S - \{s\}}. We will often not distinguish between \Sigma and its geometric realization. One of the main results of [B, Chap. V] is that \Sigma is a triangulation of the (|S| - 1)-dimensional Euclidean sphere. (See the footnote at the end of this post.) In particular, if |S| = 1 (so W = \mathbf{Z}/2) then its (integral) homology in degree 0 is \mathbf{Z}^2 and 0 in all other degrees; if |S| > 1 then its homology is \mathbf{Z} in degrees 0 and |S| - 1, and it is 0 in all other degrees. The main theorem in this section relies on a computation of the character of the representation of W on H_{|S| - 1}(\Sigma, \mathbf{Z}).

Theorem: If \epsilon is the alternating character of (W, S), the homomorphism W \to \mathbf{C}^* determined by the condition \epsilon(s) = -1 for all s \in S, then

    \[ \epsilon = \sum_{I \subset S} (-1)^{|I|} \mathrm{ind}_{W_I}^{W} 1_{W_I}. \]

Proof. Let n = |S| - 1. If n = 0 then the result is obvious, so we will assume n > 0. Choose an ordering S = \{s_0, \dots, s_n\}. For each m, 0 \leq m \leq n, let C_m denote the free abelian group generated by the m-facets of \Sigma. For each subset \alpha = \{i_0, \dots, i_m\} of \{0, \dots, n\}, let L_\alpha denote the set of m-facets of \Sigma of type \alpha. Let \kappa_m be the character of the permutation representation of W on C_m; for each \alpha as above, let \lambda_\alpha denote the character of the permutation representation of W on L_\alpha; and let \theta_m denote the character of the representation of W on H_m(\Sigma). Trivially, \kappa_m = \sum_{|\alpha| = m+1} \lambda_\alpha. By the Hopf trace formula, we have

    \[ \sum_{m=0}^{n} (-1)^m \kappa_m = \theta_0 + (-1)^n \theta_n. \]

Evidently \theta_0 is the trivial representation since \Sigma is connected. Moreover, \theta_n = \epsilon, as we can see by considering the fundamental n-cycle \sum_{w \in W} (-1)^{\ell(w)} [we_0, \dots, we_n]. Then we have

    \[ -\sum_{\emptyset \neq \alpha \subset \{0, \dots, n\}} (-1)^{|\alpha|} \lambda_\alpha = 1 + (-1)^n \epsilon. \]

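Now if \alpha, \pi are complementary subsets of \{0, \dots, n\} then W_\pi is the subgroup of W fixing each e_i, i \in \alpha. So \lambda_\alpha = \phi_\pi := \mathrm{ind}_{W_\pi}^{W} 1_{W_\pi}. As |\alpha| + |\pi| = n + 1, we have (-1)^{|\alpha|} = (-1)^{n+1}(-1)^{|\pi|}, and therefore

    \[ -\sum_{\pi \neq \{0, \dots, n\}} (-1)^{n+1}(-1)^{|\pi|} \phi_\pi = 1 + (-1)^n \epsilon. \]

Multiplying by (-1)^n gives \sum_{\pi \neq \{0, \dots, n\}} (-1)^{|\pi|} \phi_\pi = (-1)^n + \epsilon. The missing term \pi = \{0, \dots, n\} has \phi_\pi = 1 and sign (-1)^{n+1}, so adding it to both sides and rearranging gives

    \[ \sum_\pi (-1)^{|\pi|} \phi_\pi = \epsilon \]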

as desired.
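The identity just proved can be tested by brute force for W = S_n with small n. The Python sketch below uses the fact that the value of \mathrm{ind}_{H}^{W} 1_H at w is the number of cosets xH fixed by w, that is, the number of x \in W with x^{-1}wx \in H divided by |H|. It is a numerical check in type A only, with illustrative helper names.

```python
from itertools import combinations, permutations

def compose(a, b):
    """Composition of permutations of {0, ..., n-1}: (a b)(j) = a(b(j))."""
    return tuple(a[b[j]] for j in range(len(b)))

def inverse(a):
    inv = [0] * len(a)
    for j, x in enumerate(a):
        inv[x] = j
    return tuple(inv)

def sign(a):
    """epsilon(a) = (-1)^(number of inversions of a)."""
    n = len(a)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])
    return (-1) ** inversions

def simple_reflection(i, n):
    """Adjacent transposition swapping i and i+1 (0-indexed)."""
    p = list(range(n))
    p[i], p[i + 1] = p[i + 1], p[i]
    return tuple(p)

def generated_subgroup(gens, n):
    identity = tuple(range(n))
    seen, stack = {identity}, [identity]
    while stack:
        w = stack.pop()
        for g in gens:
            v = compose(w, g)
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def induced_trivial(w, H, W):
    """Value at w of ind_H^W 1_H: the number of cosets xH fixed by w."""
    hits = sum(1 for x in W if compose(compose(inverse(x), w), x) in H)
    return hits // len(H)

def alternating_sum(w, n, W):
    """Sum over I of (-1)^{|I|} ind_{W_I}^W 1 evaluated at w, for W = S_n."""
    total = 0
    for r in range(n):
        for I in combinations(range(n - 1), r):
            H = generated_subgroup([simple_reflection(i, n) for i in I], n)
            total += (-1) ** r * induced_trivial(w, H, W)
    return total

for n in (2, 3, 4):
    W = list(permutations(range(n)))
    assert all(alternating_sum(w, n, W) == sign(w) for w in W)
```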

Comparing characters of G(k) and W

In this section we will prove point 2 using fairly formal methods in finite group theory along with some comparisons between the subgroup structures of G(k) and W. First, if we identify the sets \Delta of simple roots and S of orthogonal reflections along simple roots, then we have P_I(k) = B(k)W_IB(k) for all subsets I \subset S. (Although it is not sensible to multiply elements of W with elements of G(k), these double cosets are still sensible objects because W normalizes T(k) and T is contained in B.) In particular, the Bruhat decomposition states that G(k) = B(k)WB(k). It follows from the same formalism that if J, K \subset S are two subsets then the number of (W_J, W_K)-double cosets is equal to the number of (P_J(k), P_K(k))-double cosets. To give a flavor of the methods involved (and because this will be needed), I give a proof of this statement below.

Lemma: The map W_J\backslash W/W_K \to P_J(k)\backslash G(k)/P_K(k), W_JwW_K \mapsto P_J(k)wP_K(k), is a bijection.

Proof. First, the map is well-defined: for example, if s \in J then we have

    \begin{align*} P_J(k)sw = B(k)W_JB(k)sw &\subset B(k)W_JB(k)w \cup B(k)W_JsB(k)w \\ &= P_J(k)w \end{align*}

by one of the axioms of a Tits system. The Bruhat decomposition shows that this map is surjective, so we need only show that for every w \in W, P_J(k)wP_K(k) = B(k)W_JwW_KB(k). For this note that

    \[ P_J(k)wP_K(k) = B(k)W_JB(k)wB(k)W_KB(k). \]

We claim B(k)W_JB(k)wB(k) = B(k)W_JwB(k). Since one inclusion is obvious, it suffices to show the inclusion \subset. Let w' \in W_J, and write w' = s_1 \cdots s_n for some s_i \in J (as we can do by definition of W_J). By [B, Chap. V, Sec. 2, Lem. 1], we have

    \[ B(k)w'B(k)wB(k) \subset \bigcup_{1 \leq i_1 < \cdots < i_m \leq n} B(k)s_{i_1} \cdots s_{i_m} wB(k). \]

The right hand side is clearly contained in B(k)W_JwB(k), so we are done. It suffices now to show W_JwB(k)W_K \subset B(k)W_JwW_KB(k), but indeed this follows from precisely the same argument as above. So the first displayed equation is true and the Lemma has been proved.

We are now ready to compare \epsilon and \mathrm{St}. For each subset J of S (= \Delta), we let \psi_J = \mathrm{ind}_{W_J}^{W} 1_{W_J} and \chi_J = \mathrm{ind}_{P_J(k)}^{G(k)} 1_{P_J(k)}.

Theorem: The mapping \theta: \sum_J a_J \psi_J \mapsto \sum_J a_J \chi_J is an isometry (with respect to the usual inner product) from the complex vector space generated by the characters \psi_J of W to the complex vector space generated by the characters \chi_J of G(k). If \psi = \sum_J n_J \psi_J is an irreducible character of W for integers n_J, then \chi = \sum_J n_J \chi_J or its negative is an irreducible character of G(k). In particular, \mathrm{St} or its negative is an irreducible character of G(k).

Proof. The second statement follows from the first: namely, write \chi as an (integral) linear combination of irreducible characters of G(k) and note that the isometry statement implies that exactly one of the coefficients in this linear combination is nonzero, and this nonzero coefficient is either equal to 1 or -1. The final statement follows from the theorem in the previous section.

It is now enough to show that (\psi_J, \psi_K)_W = (\chi_J, \chi_K)_{G(k)} for all subsets J, K of S. In fact, we will show that (\psi_J, \psi_K)_W is equal to the number of (W_J, W_K)-double cosets in W. The same method will show an analogous result for G(k). As the number of (W_J, W_K)-double cosets is equal to the number of (P_J(k), P_K(k))-double cosets (as shown in the Lemma above), the result follows.

First, note that if H is a finite group acting transitively on two sets X and Y, and if x_0 \in X, y_0 \in Y, then the number of orbits of H acting on X \times Y is equal to the number of (H_{x_0}, H_{y_0})-double cosets in H, where H_{x_0} and H_{y_0} are the stabilizers of x_0 and y_0 in H, respectively. If \psi_X and \psi_Y are the characters of the permutation representations of H on X and Y, then we note that \psi_X(h)\psi_Y(h) is the number of fixed points of h acting on X \times Y. By Burnside’s lemma, it follows that the number of orbits of H acting on X \times Y is equal to

    \[ |H|^{-1}\sum_{h \in H} \psi_X(h)\psi_Y(h) = (\psi_X, \psi_Y)_H. \]

Apply these observations to H = W, X = \{wW_J\}_{w \in W}, Y = \{wW_K\}_{w \in W}, x_0 = W_J, y_0 = W_K, and similarly for G(k), to conclude.
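The isometry can be observed numerically in the smallest interesting case. The Python sketch below computes the matrix of inner products (\psi_J, \psi_K)_W for W = S_3 and the matrix (\chi_J, \chi_K)_{G(k)} for G(k) = \mathrm{GL}_3(\mathbf{F}_2), both by averaging products of fixed-point counts as in the argument above. It assumes the standard upper triangular Borel and block upper triangular parabolics, and all names are illustrative. This is a check in one example, not a substitute for the proof.

```python
from itertools import permutations, product

def gram(G, mul, inv, subgroups):
    """Matrix of (ind 1_{H_i}, ind 1_{H_j})_G, subgroups given as membership tests."""
    sizes = [sum(1 for g in G if H(g)) for H in subgroups]
    k = len(subgroups)
    totals = [[0] * k for _ in range(k)]
    for g in G:
        fixed = [0] * k
        for x in G:
            c = mul(mul(inv(x), g), x)  # x^{-1} g x
            for j, H in enumerate(subgroups):
                fixed[j] += H(c)
        chars = [fixed[j] // sizes[j] for j in range(k)]  # fixed cosets of H_j
        for i in range(k):
            for j in range(k):
                totals[i][j] += chars[i] * chars[j]
    return [[t // len(G) for t in row] for row in totals]

def weyl_side():
    """Gram matrix of psi_J for W = S_3, J in the order {}, {s_1}, {s_2}, S."""
    W = list(permutations(range(3)))
    mul = lambda a, b: tuple(a[b[j]] for j in range(3))
    inv = lambda a: tuple(sorted(range(3), key=lambda j: a[j]))
    subgroups = [
        lambda w: w == (0, 1, 2),  # trivial subgroup
        lambda w: w[2] == 2,       # generated by s_1 = (0 1)
        lambda w: w[0] == 0,       # generated by s_2 = (1 2)
        lambda w: True,            # all of W
    ]
    return gram(W, mul, inv, subgroups)

def group_side():
    """Gram matrix of chi_J for GL_3(F_2), with the same ordering of J."""
    I3 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    def mul(a, b):
        return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) % 2
                           for j in range(3)) for i in range(3))
    def det(m):
        (a, b, c), (d, e, f), (g, h, i) = m
        return (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % 2
    mats = [tuple(bits[3 * i:3 * i + 3] for i in range(3))
            for bits in product((0, 1), repeat=9)]
    G = [m for m in mats if det(m) == 1]
    inverse = {g: h for g in G for h in G if mul(g, h) == I3}
    subgroups = [
        lambda m: m[1][0] == m[2][0] == m[2][1] == 0,  # B
        lambda m: m[2][0] == m[2][1] == 0,             # blocks of size (2, 1)
        lambda m: m[1][0] == m[2][0] == 0,             # blocks of size (1, 2)
        lambda m: True,                                # G
    ]
    return gram(G, mul, inverse.__getitem__, subgroups)
```

Both functions return the same 4 x 4 matrix, and pairing it with the coefficient vector (1, -1, -1, 1) on both sides gives (\mathrm{St}, \mathrm{St}) = 1 for \mathrm{GL}_3(\mathbf{F}_2).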

Footnote. It is not entirely trivial to extract from [B, Chap. V] the fact that \Sigma(W, S) is a triangulation of an (|S| - 1)-dimensional sphere. The main point to show is that (the geometric realization of) \Sigma(W, S) can alternatively be described in the following way: equip V = X(T) \otimes \mathbf{R} with a W-invariant inner product, so that W is generated by orthogonal reflections. There is a system of hyperplanes in V consisting of those hyperplanes fixed by some reflection in W, and this system satisfies the axioms outlined in the beginning of [B, Chap. V, Sec. 3]. Equip the unit sphere K \subset V (with respect to the given inner product) with the triangulation coming from this system of hyperplanes. Using the results of [B, Chap. V, nos. 3.2, 3.3], one can show that in fact \Sigma(W, S) is isomorphic to this triangulation of K.

References. [B] = Bourbaki, Groupes et algèbres de Lie, Chaps. IV, V, and VI
