Simple Proof of Tokuyama’s Formula

Tokuyama’s Formula is a combinatorial result that expresses the product of the deformed Weyl denominator and the Schur polynomial as a sum over strict Gelfand-Tsetlin patterns. This result implies Gelfand’s parametrization of the Schur polynomial, Weyl’s Character Formula, and Stanley’s formula on Hall-Littlewood polynomials — all for \(\text{GL}_n\); also, the formula is related to alternating sign matrices. Moreover, Tokuyama’s formula gives a combinatorial evaluation of certain matrix coefficients of representations of \(p\)-adic groups.

A Gelfand-Tsetlin pattern (or GT-pattern for short) with the top row \((a_{11},a_{12},\dots,a_{1n})\) is a triangular array of numbers

\[\left\{\begin{matrix} a_{11} &  & a_{12} &  & \dots & &  a_{1n}\\ & a_{22} & & \dots & & a_{2n} &\\ & & \ddots & & \udots & & \\ & & & a_{nn} & & & \end{matrix}\right\}\]

such that the betweenness condition \(a_{i,j} \geq a_{i+1,j+1} \geq a_{i,j+1}\) is satisfied for all \(i,j \in [n-1]\). In other words, each entry in the pattern lies between two entries above it. A Gelfand-Tsetlin pattern is called strict if rows are strictly decreasing, that is, \(a_{i,j} < a_{i,j+1}\) for all \(i,j \in [n-1]\). Denote by \(\text{GT}(\lambda)\) and \(\text{SGT}(\lambda)\) the sets of GT patterns and strict GT patterns with the top row \(\lambda\), respectively.

We call entry \(a_{i,j}\) left-leaning if \(a_{i,j} = a_{i-1,j-1}\). In other words, the entry equals to the entry above on the left. Similarly, \(a_{i,j}\) is right-leaning if \(a_{i,j} = a_{i-1,j}\). We call \(a_{i,j}\) generic if it’s not left- or right-leaning.

Let \(l(T)\) and \(r(T)\) be the number of left- and right-leaning entries in GT-pattern \(T\), respectively. Let \(g(T)\) be the number of generic entries. Also, let \(d_i(T)\) be the \(i\)-th row sum, and \(m_i(T) = d_i(T)-d_{i+1}(T)\), the difference of sums of consequent rows. Now we can state

Theorem. (Tokuyama, 1988). Let \(\lambda\) be a partition of length \(\leq n\). We have

\[\sum_{T \in \text{SGT}(\lambda+\rho)}(t+1)^{g(T)}t^{l(T)}\prod_{i=1}^{n}z_i^{m_i(T)} = \prod_{i < j}(z_i+t z_j)s_{\lambda}(z_1,\dots,z_n),\]

where \(\rho = (n-1,n-2,\dots,1,0)\), and \(s_\lambda(z_1,\dots,z_n)\) is the Schur polynomial.

The formula expresses a weighted sum over strict Gelfand-Tsetlin patterns in the closed form as the product of the deformed Weyl denominator and the Schur polynomial. Note that the usual Weyl denominator \(\prod_{i<j}(z_i - z_j)\) for \(\text{GL}_n\) is obtained by setting \(t = -1\).

Example. Let’s see Tokuyama’s formula in action. Let \(\lambda = (1, 0, 0)\). Then \(\lambda + \rho = (1, 0, 0) + (2, 1, 0) = (3, 1, 0)\). In the picture above you can see all the patterns \(G(\lambda+\rho)\) with the corresponding contribution to the sum. Yellow entries are left-leaning, blue entries are generic.

Example of Tokuyama's Formula

Now, if we sum all the contributions, we get

\begin{multline*} t^3 z_2 z_3^3+t^2 (t+1) z_1 z_2 z_3^2+t^2 z_1 z_3^3+t^2 z_2^3 z_3+t^2 (t+1) z_2^2 z_3^2+ \\ +t z_1^3 z_3+ (t+1) z_1^2 z_2^2+ + (t+1)^2 z_1^2 z_2 z_3+t (t+1) z_1^2 z_3^2+ \\ + t z_1 z_2^3+t (t+1) z_1 z_2^2 z_3+t (t+1) z_1 z_2^2 z_3+t (t+1) z_1 z_2 z_3^2+z_1^3 z_2 = \\ (z_1+tz_2)(z_1+tz_3)(z_2+tz_3)(z_1+z_2+z_3) = \prod_{i<j}(z_i+tz_j)s_\lambda(z_1,z_2,z_3). \end{multline*}

Beautiful. Before going to the proof of Tokuyama’s formula, let’s have a look at its multiple corollaries.

Corollary. (Gelfand’s parametrization) Let \(\lambda\) be a partition of length \(\leq n\). Then

\[\prod_{i=1}^{n}z_i^{n-i} s_\lambda(z_1,\dots,z_n) = \sum_{T \in G(\lambda)}z^{m(T)}.\]

Sketch of the proof. Set \(t=0\), then non-zero terms in the sum are exactly strict Gelfand-Tsetlin patterns with the top row \(\lambda + \rho\) with no right-leaning terms. They are in bijection with all Gelfand-Tsetlin patterns with the top row \(\lambda\). QED.

Corollary. (Weyl’s Character Formula). Let \(\lambda\) be a partition of length \(\leq n\). Let \(a_\lambda\) be functions from the bialternant formula of Jacobi. Then

\[a_{\lambda+\delta}(z_1,\dots,z_n)\prod_{i < j}(z_i - z_j) = s_\lambda(z_1,\dots,z_n).\]

Sketch of the proof. Set \(t = -1\), then non-zero terms in the sum are exactly strict Gelfand-Tsetlin patterns with the top row \(\lambda+\rho\) with no generic terms. They are in bijection with the permutation group \(S_n\). The sum becomes the deformed Vandermonde determinant since \(l(T)\) equals to the number of inversions in \(w\). QED.

In a similar fashion, by setting \(t = 1\), one can prove Stanley’s formula for expressing Hall-Littlewood polynomial as a sum over strict Gelfand-Tsetlin patterns. See the details in Section 3.3. Another connection is with alternating sign matrices: the sum in Tokuyama’s formula can be interpreted as a sum over alternating sign matrices which gives a generalization of the Weyl’s Character Formula where the sum is taken just over permutations (which are a subset of the alternating sign matrices). See Section 3.4. for details.

The original proof in terms of characters of tensor products of highest weight representations by Tokuyama is a little hard to read. A better proof can be given in terms of solvable lattice models, but I will show the following short proof by Daniel Bump which is expressed in a more familiar language.

Denote \(\rho_n = (n-1,n-2,\dots,1,0)\). Note that in a strict GT pattern each row is strictly decreasing, and so it can be written as \(\lambda+\rho_k\) for some partition \(\lambda\) of length \(k\). Recall that partition\(\mu\) of length \(n-1\) interleaves partition \(\lambda\) of length \(n\) if

\[\lambda_1 \geq \mu_1 \geq \lambda_2 \geq \mu_2 \geq \dots \geq \mu_{1} \geq \lambda_1.\]

Note that each row in a GT pattern interleaves the next row. In a strict GT pattern every row \(\mu + \rho_k\) interleaves \(\lambda + \rho_{k+1}\) above it.

Idea of the proof. Denote the left and right sides of Tokuyama’s Formula by

\[F(\lambda; t; z_1,\dots,z_n) = \sum_{T \in \text{SGT}(\lambda+\rho)}(t+1)^{g(T)}t^{l(T)}\prod_{i=1}^{n}z_i^{m_i(T)},\]

\[G(\lambda; t; z_1,\dots,z_n) = \prod_{i < j}(z_i+t z_j)s_{\lambda}(z_1,\dots,z_n).\]

We will show the following branching rules for \(F\) and \(G\):

\[\begin{theorem}\label{BranchingF}F(\lambda; t; z_1,\dots,z_n) = \sum_{\nu}(t+1)^{g(\lambda,\nu)}t^{l(\lambda,\nu)}z_1^{m_1(\lambda,\nu)}F(\nu;t;z_2,\dots,z_n), \end{theorem}\]

\[G(\lambda; t; z_1,\dots,z_n) = \sum_{\nu}\left(\sum_{\mu}t^{|\nu \setminus \mu|}\right)z_1^{m_1(\lambda,\nu)}G(\nu;t;z_2,\dots,z_n),\]

where partitions \(\nu\) are such that \(\nu + \rho_{n-1}\) interleaves with \(\lambda + \rho_n\), and \(\mu\) interleaves with \(\lambda\); also, \(\nu \setminus \mu\) is a vertical strip. Then we show that \((t+1)^{g(\lambda,\nu)}t^{l(\lambda,\nu)} = \sum_{\mu}t^{|\nu \setminus \mu|}\), and by induction we are done.

Proof. We start with \(F(\lambda; t; z_1,\dots,z_n)\). We consider the contribution of patterns \(T\) with top row \(\lambda+\rho_n\) and second row \(\nu+\rho_{n-1}\). Since \(\nu + \rho_{n-1}\) interleaves \(\lambda + \rho_n\) we have \(\lambda_i+n-i \geq \nu_i + (n-1) - i \geq \lambda_{i+1} + n - (i+1)\), or \(\lambda_i + 1 \geq \nu_i \geq \lambda_{i+1}\). Removing the top row from \(T\) produces a pattern with the top row \(\nu+\rho_{n-1}\) and collecting these produces \(F(\nu; t; z_2,\dots,z_n)\) with a coefficient equal to \((1+t)^{g(\lambda,\nu)}t^{l(\lambda,\nu)}z_1^{m_1(\lambda,\nu)}\). Thus, we proved the branching rules for \(F\).

Next, consider \(G(\lambda; t; z_1,\dots,z_n)\). Recall the classical branching rules for the Schur polynomial:

\[s_\lambda(z_1,z_2,\dots,z_n) = \sum_{\mu}z_1^{|\lambda \setminus \mu|}s_\mu(z_2,z_2,\dots,z_n),\]

where partitions \(\mu\) of length \(n-1\) interleaves with \(\lambda\). We also have

\[\prod_{1<j}(z_1+tz_j)=\sum_{k=0}^{n-1}t^k z^{-k} e_k(z_2,\dots,z_n),\]

where \(e_k\) is the \(k\)-th elementary symmetric polynomial. Then, replace \(s_\lambda\) and \(\prod_{1<j}\) in \(G\) to get that \(G\) equals

\[\left(\sum_{k=0}^{n-1}t^k z_1^{-k} e_k(z_2,\dots,z_n)\right)\prod_{1<i<j}(z_i+tz_j)\left(\sum_{\mu}z_1^{|\lambda \setminus \mu|}s_\mu(z_2,\dots,z_n)\right)=\]

\[= \sum_{k,\mu}t^kz_1^{|\lambda \setminus \mu|-k}\prod_{1<i<j}(z_i+tz_j)e_k(z_2,\dots,z_n)s_\mu(z_2,\dots,z_n).\]

Recall Pieri’s formula in the form \(e_ks_\mu = \sum_{\nu}s_\nu\), where each \(\nu\) is such that \(\nu \setminus \mu\) is a \(k\)-vertical strip, that is, \(\nu_i = \mu_i\) or \(\nu_i = \mu_i + 1\), and \(\nu_i = \mu_i + 1\) exactly \(k\) times. Collecting everything together, we get that \(G\) equals to

\[\sum_{k,\mu,\nu}t^kz_1^{|\lambda \setminus \mu|-k}\prod_{1<i<j}(z_i+tz_j)s_{\nu}(z_2,\dots,z_n)=\]

\[=\sum_{k,\mu,\nu}t^kz_1^{|\lambda \setminus \mu|-k} G(\nu;t;z_2,\dots,z_n),\]

with \(k, \mu, \nu\) as above. Next, we notice that \(k = |\nu \setminus \mu|\) and \(|\lambda \setminus \mu| - |\nu \setminus \mu| = |\lambda \setminus \nu| = m(\lambda,\nu)\), and that \(\nu + \rho_{n-1}\) interleaves \(\lambda + \rho_n\) since \(\mu\) interleaves \(\lambda\) and \(\nu \setminus \mu\) is a vertical strip. Hence, we can write the branching rules for \(G\) as follows:

\[G(\lambda; t; z_1,\dots,z_n) = \sum_{\nu}\left(t^{|\nu \setminus \mu|}\right)z_1^{|\lambda \setminus \nu|}G(\nu; t; z_2,\dots,z_n).\]

Finally, we have to prove that

 (2)\begin{equation*} (1+t)^{g(\lambda,\nu)}t^{l(\lambda,\nu)} = \sum_{\mu}t^{|\nu \setminus \mu|} \end{equation*}

where \(\mu\) interleaves \(\lambda\) and \(\nu \setminus \mu\) is a vertical strip. The conditions on \(\mu\) can be written as \(\lambda_i \geq \mu_i \geq \lambda_{i+1}\) and \(\mu_i = \nu_i\) or \(\mu_i = \nu_i - 1\). Hence, we can write the right side in terms of individual \(\mu_i\) as follows:

\[\sum_{\mu}t^{|\nu \setminus \mu|} = \prod_{i=1}^{n-1}\left(\sum_{\mu_i}t^{\nu_i - \mu_i}\right).\]

Now, if \(\nu_i = \lambda_{i+1}\), then a pattern has a left-leaning entry in the \(i\)-th position; if \(\nu_i = \lambda_i\), a pattern has a right-leaning entry; and in the remaining cases the entry is generic. In the left-leaning case, \(\mu_i\) must be \(\nu_i-1\), and in the right-leaning case, \(\mu_i\) must be \(\nu_i\). In the generic case, \(\mu_i\) can be either \(\nu_i-1\) or \(\nu_i\). Thus,

\[\sum_{\mu_i}t^{\nu_i-\mu_i} = \begin{cases} t, &\text{in the left-leaning case}\\ 1, &\text{in the right-leaning case}\\ 1+t, &\text{in the generic case} \end{cases}.\]

It proves (2), and we are done. QED.

Tokuyama’s formula is related to the representation theory of \(p\)-adic groups as follows. Let \(G = \text{GL}_n(F)\), where \(F\) is a local non-archimedean field with the uniformizer \(\varpi\) and the cardinality of the residue field \(q\). By Casselman-Shalika formula, the value of the spherical Whittaker function on the diagonal element \(\text{diag}(\varpi^{\lambda_1},\dots,\varpi^{\lambda_n})\) is given by


which up to a constant equals the right side of Tokuyama’s formula when \(t = -q^{-1}\). In other words, Tokuyama’s formula can be interpreted as a combinatorial evaluation of the spherical Whittaker function. Now, the Casselman-Shalika formula holds for other groups beside \(\text{GL}_n(F)\) evaluating the spherical Whittaker function as the product of the deformed Weyl denominator and the corresponding character.

\[\prod_{\alpha \in \Phi^+}(1-q^{-1}z^\alpha)\chi_\lambda(z_1,\dots,z_n).\]

For example, for the symplectic group \(\text{Sp}(2n)\), the formula, up to a constant, takes the form


A natural question is whether the formula has a combinatorial evaluation in terms of GT-patterns for the symplectic (and other) groups. For the symplectic group, the answer is affirmative and is achieved by Dmitriy Ivanov using the solvable lattice models. No short proof like the one presented above is known.

For the orthogonal groups, the answer is still unknown. In other words, it is unknown how to represent

\[\prod_{i=1}^{n}(1+t z_i)\prod_{i<j}(1+tz_iz_j)(1+tz_iz_j^{-1})\chi_\lambda^{B}(z_1,\dots,z_n)\]

as a weighted sum over orthogonal GT patterns.

Note that the idea of the short proof above doesn’t work (at least I tried and failed!) for symplectic or orthogonal groups: both the branching rules and Pieri’s rules become not multiplicity-free within the type.

Notify of
Inline Feedbacks
View all comments