Post 19: Schur Functions and Irreducible Characters of Symmetric Groups

08 Jun 2025

In this post, I will discuss Schur functions and their application to the representation theory of symmetric groups.

Introduction

In my last post, I discussed symmetric functions, along with the family of elementary symmetric functions. In this post, I will discuss another family of symmetric functions, the Schur functions. Then, I will discuss their relationship to the irreducible representations of $S_n$ over the complex numbers.

Unlike my previous post, I’m not sure that this post will be accessible to a reader equipped only with a basic knowledge of groups and rings. The main reason for this is the usage of representation theory. There will be an appendix discussing some relevant representation theory things at the end of this post.

Partitions

Unlike the elementary symmetric functions, which are indexed by positive integers, the Schur functions are indexed by partitions of non-negative integers. So, I will first discuss partitions.

The typical view on partitions is that they are sum decompositions, such as $5=3+1+1$. We take something slightly more flexible. We define a partition $\lambda$ as an infinite sequence $(\lambda_1,\lambda_2,\dots)$ of non-negative integers such that $\lambda_1\ge\lambda_2\ge \cdots$ and such that only finitely many terms are non-zero. The non-zero elements of a partition are called its parts. The number of parts is called the length, denoted $\ell(\lambda)$. The sum of the parts is called the weight or size, denoted $|\lambda|$. Sometimes we will “truncate” a partition $\lambda$ to be a finite vector. In particular, when $\ell(\lambda)\le n$, we may truncate $\lambda$ to $(\lambda_1,\dots,\lambda_n)$. Throughout the post, $\lambda$ will always denote a partition.

Schur Functions

Now we arrive at the crux of the post. Something I neglected to mention in my last post is that, when defining specific symmetric functions, we often start by taking a finite number of variables. We do that now, say we work with $n$ variables. We also take our coefficient ring to be $\Z$.

Firstly, a piece of notation. We let $\epsilon:S_n\to{\pm1}$ denote the sign representation. It is a homomorphism uniquely defined by sending transpositions to $-1$.

Now, let $\alpha=(\alpha_1,\dots,\alpha_n)$ be a vector of non-negative integers. Let $x^\alpha$ denote the monomial $x_1^{\alpha_1}\cdots x_n^{\alpha_n}$. We then define the polynomial $a_\alpha$ by “antisymmetrizing” $x^\alpha$:

\[a_\alpha=\sum_{w\in S_n}\epsilon(w)w(x^\alpha).\]

This is an antisymmetric polynomial, meaning that $w(a_\alpha)=\epsilon(w)a_\alpha$ for any $w\in S_n$. Now, assume that $\alpha_1>\cdots>\alpha_n$. Then $\alpha$ can be express as the sum $\lambda+\delta$, where $\lambda=(\lambda_1,\dots,\lambda_n)$ is a (truncated) partition, and $\delta = (n-1,n-2,\dots,0)$. The polynomial $a_\delta$ is very special; it is called the Vandermonde determinant. It is the product of the binomials $x_i-x_j$ as $i,j$ range through $1\le i< j\le n$. What makes $a_\delta$ special for us is that it perfectly divides into $a_\alpha=a_{\lambda+\delta}$, meaning that the quotient $a_{\lambda+\delta}/a_\delta$ is a polynomial. Since we divided two antisymmetric polynomials, the result is a symmetric polynomial, denoted $s_\lambda$. This is the Schur polynomial associated to the partition $\lambda$. Note that, while we started with $\alpha$, we could have started with a partition $\lambda$. The only constraint on this partition is that its length must be less than or equal to the number of variables we are working with. This polynomial is homogeneous of degree $|\lambda|$.

It is a standard fact that the $s_\lambda$ satisfy the “consistency” property that lets them define symmetric functions, which I will also denote by $s_\lambda$. Surprisingly, both editions of the book “Symmetric Functions and Hall Polynomials” by Macdonald, which is the standard reference for this material, have incorrect proofs of the consistency of the $s_\lambda$. A correct proof isn’t that complicated, but I will not give it here.

Once we work with symmetric functions instead of symmetric polynomials, we don’t need to worry about the condition $\ell(\lambda)\le n$, for $n$ the number of variables; any partition will give rise to a Schur function. Two important facts about these functions are the following:

The $s_\lambda$ form a $\Z$-basis of $\Lambda$.
The $s_\lambda$ for $|\lambda|=k$ form a $\Z$-basis of $\Lambda^{(k)}$.

Unlike the $e_i$, the $s_\lambda$ are much harder to write down explicitly. One nice example is when $\lambda=(1,1,\dots,1,0,\dots)$, say with $n=|\lambda|$. In this case, $s_\lambda=e_n$. There is in fact a general formula that one can use to relate $s_\lambda$ to the $e_i$, but I will not give it here. Some examples are $s_{(2)}=e_1^2-e_2$ and $s_{(2,1)}=e_2e_1-e_3$.

Irreducible Characters of $S_n$

I will now discuss the relationship between Schur functions and the irreducible characters of $S_n$. If you do not know any representation theory, see the appendix at the bottom of this post for a crash course. All representations will be finite dimensional and over $\C$.

It is a classic fact that the conjugacy classes of $S_n$ can be indexed by partitions of $n$ via the “cycle type” of a permutation. In particular, if $w\in S_n$, then $w$ factors uniquely into a product of disjoint cycles of orders $\rho_1\ge\rho_2\ge\cdots$ where the sum of the $\rho_i$ is $n$. Let $\rho(w)=(\rho_1,\rho_2,\dots)$, the cycle type of $w$. For a given partition $\lambda$ of $n$, the elements of $S_n$ with cycle type $\lambda$ form a conjugacy class, and conversely, every conjugacy class consists of the elements with the same cycle type.

Since the number of irreducible representations of a finite group equals the number of conjugacy classes, we then find that the number of irreducible representations of $S_n$ is the number of partitions of $n$. In general, there is no way to naturally associate an irreducible representation to a given conjugacy class. However, since the combinatorics here is so rich, we will be able to associate an irreducible representation to a given partition.

Let $R^{(n)}$ be the group of virtual characters of $S_n$, i.e. the free abelian group generated by the irreducible characters of $S_n$. The elements of $R^{(n)}$ are (class) functions on $S_n$.

Now, for each $n\ge1$, let $p_n$ be the symmetric function given by summing the $n$th powers of variables. Also, let $p_0=1$. For a partition $\lambda$, let $p_\lambda=p_{\lambda_1}p_{\lambda_2}\cdots$, which is well defined since all but finitely many terms in the product will be $p_0=1$. We note that $p_\lambda\in\Lambda^{(n)}$ if $n=|\lambda|$. Now, let $ch:R^{(n)}\to\Lambda_\C^{(n)}=\Lambda^{(n)}\otimes\C$ be given by

\[ch^{(n)}(f)=\frac1{n!}\sum_{w\in S_n}f(w)p_{\rho(w)}.\]

The main theorem is the following:

$ch$ is a group isomorphism. The inverse image of $s_\lambda$ is an irreducible character $\chi_\lambda$ of $S_n$ for $n=|\lambda|$.

Hence, if we wanted to compute the $\chi_\lambda$, we would need to express $s_\lambda$ as a linear combination of $p_\rho$. The $p_\rho$ form a basis for symmetric functions over $\C$, so this is well-defined. Let me give an example of this computation.

Example Computation

I will compute $\chi_{(2,1)}$. It will suffice to deal with three variables $x,y,z$. Recall that I said $s_{(2,1)}=e_2e_1-e_3$. In order to write $s$ in terms of $p$, we need to write the $e$ in terms of $p$. We have $e_1=p_1$. I mentioned last time that $p_2=e_1^2-2e_2$, so $e_2=\frac12(p_1^2-p_2)$. The classic identity $x^3+y^3+z^3-3xyz=(x^2+y^2+z^2-xy-xz-yz)(x+y+z)$ then shows that $p_3-3e_3=(p_2-e_2)p_1=p_2p_1-\frac12(p_1^3-p_2p_1)=\frac32p_2p_1-\frac12p_1^3$. Then $e_3 = \frac13(p_3-\frac32p_2p_1+\frac12p_1^3)=\frac13p_3-\frac12p_2p_1+\frac16p_1^3$. Putting this all together, we get

\[s_{(2,1)}= \frac13p_{(1,1,1)}-\frac13p_{(3)}.\]

Next, we need to look at the conjugacy classes of $S_3$. The cycle type $(1,1,1)$ corresponds to the identity element. The cycle type $(2,1)$ corresponds to transpositions, of which there are three. The cycle type $(3)$ corresponds to $3$-cycles, of which there are two. Thus, writing $\chi_\lambda(\rho)$ for the value of $\chi_\lambda$ on an element with cycle type $\rho$, we have

\[ch(\chi_\lambda)=\frac16\left (\chi_\lambda((1,1,1))p_{(1,1,1)}+3\chi_\lambda((2,1))p_{(2,1)}+2\chi_\lambda((3))p_{(3)}\right ).\]

Since the main theorem tells us $ch(\chi_{(2,1)})=s_{(2,1)}$, we can compare coefficients to find

$\chi_{(2,1)}((1,1,1))=2,$ $\chi_{(2,1)}((2,1))=0,$ $\chi_{(2,1)}((3))=-1.$

Admittedly, this is far from a good way to compute the characters of such a small group. However, the methods are completely combinatorial and work for any $n$, so it is not so difficult to make a computer program do the work for you.

To end the post, let me briefly fire off some related results.

It is possible to assemble the $R^{(n)}$ into a graded ring $R$ so that $ch$ becomes a ring isomorphism. The multiplication on $R$ has to do with taking induced representations.
You can even define bilinear pairings on both $R$ and $\Lambda$ so that $ch$ is isometric with respect to these pairings. The pairing on $R$ is related to the usual pairing of virtual characters, whereas the pairing on $\Lambda$ is a bit ad hoc.
In the computation, we had to write $s_\lambda$ in terms of $p_\rho$. If we look the other way around, we get $p_{\rho}=\sum_\lambda\chi_\lambda(\rho)s_\lambda$, where the sum is over partitions with the same size as $\rho$.
As a consequence of the above formula, one can say that $\chi_\lambda(\rho)$ is the coefficient of $x^{\lambda+\delta}$ in $a_\delta p_\rho$, where we specialize to $n\ge \ell(\lambda)$ variables and $\delta = (n-1,\dots,0)$ as before. This is called the Frobenius character formula.

Appendix: Representations of Finite Groups

Basic Definitions

I will restrict myself to talking about finite dimensional representations of a finite group $G$ over the complex numbers $\C$. These assumptions are critical for various results that I will state, but I will not mention which results rely on which assumptions.

A representation of $G$ is a homomorphism $\rho:G\to GL_n(\C)$, where $GL_n(\C)$ is the group of invertible $n$ by $n$ matrices with entries in $\C$. In other words, we are “representing” each element of $G$ by a matrix, in such a way that matrix multiplication corresponds to the group multiplication.

While this notion is fairly explicit, it is often more convenient to think in a “coordinate-free” way. That is to say, a representation of $G$ is a finite dimensional $\C$-vector space $V$ together with a homomorphism $\rho:G\to GL(V)$, where $GL(V)$ is the group of invertible linear maps $V\to V$. By choosing a basis for $V$, we can recover the previous definition. We often just say $V$ is a representation of $G$ and leave the homomorphism implicit. Furthermore, we often write $gv$ instead of $\rho(g)(v)$, where $g\in G$ and $v\in V$.

A subrepresentation of a representation $V$ is a vector subspace $W\subseteq V$ such that $gw\in W$ for all $g\in G$ and $w\in W$. For instance, $0$ (shorthand for ${0}$) and $V$ are always subrepresentations of $V$. We say a representation is irreducible if these are the only subrepresentations. You can think of an irreducible representation as being similar to a prime number, or an element on the periodic table, for instance.

To take the analogy further, we state some decomposition results for representations.

Any representation $V$ decomposes into a finite direct sum $W_1\oplus \cdots\oplus W_m$ of irreducible subrepresentations.
There are finitely many irreducible representations up to isomorphism. In particular, the number of irreducible representations equals the number of conjugacy classes of $G$.

These results use some notions I haven’t mentioned yet, so let me comment on those.

The direct sum of representations $U,V$ is the direct sum $W=U\oplus V$ of vector spaces together with the action $gw=g(u+v)=gu+gv$, where $w=u+v$ is a decomposition of $w\in W$ into its $U$ and $V$ components. For a representation $V$ to split into a direct sum of subrepresentations $W_1,\dots, W_m$, it must be the case that there is a basis for $V$ in which all the matrices representing the elements of $G$ are block diagonal with the same block sizes across different elements.

A morphism of representations $f:V\to W$, also called a $G$-map or an intertwiner, is a linear map such that $f(gv)=gf(v)$ for all $g\in G$ and $v\in V$. An isomorphism of representations is then just a bijective morphism.

Characters

Now we get to the most powerful aspect of the representation theory of finite groups over $\C$: the characters.

The character of a representation $V$ is the function $\chi_V:G\to \C$ sending $g$ to its trace, after being considered as a linear map. A virtual character is any function $G\to\C$ that is obtained by taking $\Z$-linear combinations of characters of representations.

There are many things I could say about characters, but I will restrict myself to only the things you need to know in order to understand my post.

A character $\chi$ is a class function, meaning $\chi(g)=\chi(h)$ if $g,h$ are in the same conjugacy class.
$\chi_{V\oplus W}=\chi_V+\chi_W$, so every character is a linear combination of irreducible characters, i.e. the characters corresponding to irreducible characters.
A representation is determined (up to isomorphism) by its character, i.e. $\chi_V = \chi_W$ if and only if $V\cong W$.

Therefore, to classify the irreducible representations, we just need to determine the irreducible characters, and particularly their values on the various conjugacy classes of the group. The theory of characters is incredibly rich, and I recommend looking at the book on representation theory by Fulton and Harris to learn more about it.