Post 17: A Taste of Enumerative Geometry

22 Jul 2024


In this post, I’ll talk about enumerative geometry. In particular, I’ll talk about the standard questions of interest and how one goes about solving them.


Introduction

Enumerative geometry is the branch of mathematics concerned with enumerating, i.e. counting, possible geometric configurations satisfying some conditions. The most (in)famous problem in this area is the following:

For instance, if I drew five circles on a plane, how many curves that are either hyperbolas, parabolas, or ellipses could you find that are in that plane and are tangent to each of the five circles?

The story behind this problem is that in 1848, Steiner gave a false answer of $6^5=7776$. He incorrectly assumed that the tangency conditions were independent. About 11 years later, the correct answer, $3264$, was found by Ernest de Jonquières. However, it was not until 1978 – over 100 years later – that the problem was solved rigorously. Indeed, there are many subtle problems with even stating a question. Perhaps the biggest issue is addressing the subtle notion that the given conics need to be in “general position”. This roughly means that the configuration of conics avoids edge cases, such as two conics overlapping. Defining this condition precisely is difficult, perhaps impossible, so I will not do it. There are other subtleties at play, such as having to work in the complex projective plane, not just the affine real plane.

One of the surprising things to me is that problems such as the one above can even be answered. In the past few months, I spent a lot of time learning how to approach enumerative geometry, so I want to relay the big picture here.

More Sample Questions

Before getting into details, let me give some more questions at the heart of enumerative geometry. All given data is assumed in “general position”. Don’t worry if some adjectives are unfamiliar to you.

Getting Started – Complex Projective Space

Two basic modifications must be made before starting with enumerative geometry. The first is that we must deal with complex numbers, so that polynomials always have roots, and they have the “correct” number of roots. For instance, if I draw a circle $(x-a)^2+(y-b)^2=r^2$ in the real plane, and you draw a line $y=cx+d$, then they may not visually intersect, but they will have complex intersection points, since it comes down to finding the roots of $(x-a)^2 + (cx+d-b)^2=r^2$.

The other modification we need is to consider projective space. One of the ways I motivate the introduction of projective space into enumerative geometry is so that intersection counts are consistent with “wiggling”. For example, the lines $y=x$ and $y=1.1x+1$ intersect, but if I “wiggle” the second line to $y=x+1$, they no longer intersect in the affine plane. However, they would intersect in the projective plane. The geometric explanation is projective space adds “points at infinity”.

However, this may seem confusing at an algebra level: how can $y=x+1$ and $y=x$ share a solution? We get $x=x+1$, which is impossible! Well, since we are enlarging our space, we no longer have just $x$ and $y$. Instead, we have three variables, $X,Y,Z$, which are defined up to a non-zero constant; this means that $X=a,Y=b,Z=c$ represents the same point as $X=ad,Y=bd,Z=cd$, if $d$ is non-zero. Furthermore, the three variables $X,Y,Z$ are not allowed to all be zero at the same time. We use $[a:b:c]$ to represent the point where $X=a,Y=b, Z=c$ (up to a constant). If $c$ is not zero, then $[a:b:c]=[\frac ac:\frac bc:1]$ recovers the usual point $(\frac ac, \frac bc)$ in the affine plane.

Now, polynomials have to be “homogenized” so that all terms have the same degree. For instance, $y=x$ becomes $Y=X$, while $y=x+1$ becomes $Y=X+Z$. The original equations can be recovered by setting $Z=1$. On the other hand, the “points at infinity” correspond to setting $Z=0$. Indeed, if we return to our equations $Y=X$ and $Y=X+Z$, we see there is no solution unless $Z=0$, in which case we get $Y=X$. Now, remember, $X,Y,Z$ are only defined up to a constant. In particular, $Z=0$ and $Y=X$ represents a single point, which we may express as $[1:1:0]$. We always homogenize to the largest degree appearing in the original polynomial. For instance, $y=x^2-x+2$ becomes $YZ=X^2 - XZ + 2Z^2$, and $y^3 = x^2 - x + 2$ becomes $Y^3 = X^2 Z - X Z^2 + 2Z^3$.

Everything I’ve just said was only for the projective plane, but the same idea works for any dimension. If we start with affine $n$-space, e.g. $\R^n$ or $\C^n$, then the corresponding projective $n$-space is denoted $\R\bb P^n$ or $\C\bb P^n$. From now on, when I refer to affine or projective space, I mean over $\C$. When I say “the plane”, I mean $\C\bb P^2$, but you can (and I often do) visualize $\R^2$.

The Powerhouse: Chow Ring

The fundamental battleground for enumerative geometry computations is the Chow ring $A(X)$ of a relevant space $X$. In some ways, the Chow ring is similar to the (singular) cohomology ring; they are both graded2 rings with (some) functoriality, and they agree3 for affine and projective space. However, this isn’t the best way to think about what the Chow ring does, since they encode different data.

The Chow ring $A(X)$ is generated by elements that are represented by certain subspaces of $X$, in such a way that we are allowed to “wiggle around” these subspaces without changing the element. The precise definition of “wiggle around” is beyond the scope of this blog post. Addition in this ring is free abelian. Multiplication in this ring is the so-called intersection product, which takes a decent amount of work to set up. However, if you take the details of the construction for granted, the actual computations are pleasant.

The typical situation is that, if $Y$ and $Z$ are subspaces and $[Y]$ and $[Z]$ are the corresponding elements, then $[Y]\cdot[Z]=m[Y\cap Z]$ for a positive integer $m$ called the intersection multiplicity. Remember, we are allowed to “wiggle” the spaces $Y$ and $Z$ around so that they intersect in a “generic” way.

The additive identity $0$ in $A(X)$ can be thought of as the class of the empty set. This should make sense, as $\varnothing\cap Y = \varnothing$ translates to $0\cdot [Y]=0$. The multiplicative identity $1$ in $A(X)$ is the class of $X$. This should also make sense, as $X\cap Y = Y$ translates to $1\cdot[Y] = [Y]$.

For example, in $\C\bb P^2$, if $Y$ and $Z$ are two lines, then $[Y]=[Z]$, and $[Y]\cdot[Z]$ is the class of a point, which is also independent on the choice of point. This calculation reflects the idea that two “generic” lines in $\C\bb P^2$ should intersect in exactly one point. Similarly, if we consider a conic section $Y$, i.e. a parabola defined by $xz-y^2=0$, and a line $Z$, then $[Y]\cdot [Z]$ is $2$ times the class of a point. This calculation reflects the idea that a “generic” parabola and a “generic line” should intersect in two points.

The Grading in $A(X)$

Let me now back up and discuss the graded structure on $A(X)$. First and foremost, a graded ring is a ring4 $R$ together with additive subgroups $R_0,R_1,\dots$ such that $R=\bigoplus R_i$, i.e. any element of $R$ is uniquely a finite sum of elements chosen from the $R_i$, and $R_i\cdot R_j\subset R_{i+j}$. Note that $R_0\cdot R_0\subset R_0$, i.e. $R_0$ is a subring, and thus $1\in R_0$. For instance, any polynomial ring $R[x_1,\dots,x_n]$ is naturally graded by degree; the degree $0$ subgroup is $R$, the degree $1$ subgroup is $\bigoplus Rx_i$, and so on.

Now, the spaces we consider in enumerative geometry usually have a well-defined notion of dimension. For instance, $\C\bb P^n$ is $n$-dimensional5. For a space $X$ of dimension $n$, its Chow ring $A(X)$ has graded pieces $A^i(X)$ for $i=0,\dots,n$; if $i>n$, then $A^i(X)=0$. If $Y\subset X$ has dimension $m$, then $[Y]\in A^{n-m}(X)$. For instance, the identity $1=[X]$ lives in the degree $0$ component $A^0(X)$, as it should, and the class of a point lives in the highest degree component, $A^n(X)$. This implies, for instance, that if $Y$ is a subspace of dimension less than $n$, then $[Y]\cdot [\text{point}]=0$; this reflects the fact that we can “wiggle” the point so that it does not intersect $Y$.

Let us return to examples to illustrate the grading.

The Chow ring of projective space is incredibly nice to work with. As a graded ring, $A(\C\bb P^n)\cong \Z[\zeta]/(\zeta^{n+1})$, where $\zeta\in A^1(\C\bb P^n)$ is the class of any hyperplane, e.g. the subspace defined by $X_0+\cdots+X_n=0$. On the other end, $\zeta^n$ is the class of a point. If you take a degree $d$ homogeneous irreducible polynomial $F$ and consider the subspace $Y$ defined by $F=0$, you get $[Y]=d\zeta$. For instance, in $A(\C\bb P^2)\cong \Z[\zeta]/(\zeta^3)$, the class of the parabola $xz-y^2=0$ is $2\zeta$. If $Y$ and $Z$ are subspaces of dimensions $k$ and $n-k$, then $[Y]\cdot[Z]=m\zeta^n$ for some integer $m$. The fact that we get an integer is the reason why the Chow ring is fundamental to its importance in solving enumerative problems.

On the other hand, the Chow ring of affine space is also nice to work with, but is in some since “so good it is bad”. In particular, as a graded ring, it is isomorphic to $\Z$; it is concentrated in degree $0$. This means that if $Y\subset \C^n$ is a subspace of dimension less than $n$, then $[Y]=0$.

Given a dimension $n$ space $X$, there is a canonical degree map $A^n(X)\to\Z$ given (more or less) by mapping non-zero clases of points to $1\in\Z$. In other words, given a typical element $\sum n_P[P]\in A^n(X)$, where each class $[P]$ is not zero, its degree is $\sum n_P$. We can therefore interpret elements of $A^n(X)$ as integers, and we often do so implicitly. Note that, as in the case of affine space, $A^n(X)$ need not be isomorphic to $\Z$.

Some Sample Solutions

Let’s see how the Chow ring machinery can be used to solve some enumerative problems. Unfortunately, I’ll have to use some special geometric facts and definitions, which may add on to the unsatisfying ad-hoc nature of this blog post.

By my conventions, “$n$-space” means $\C\bb P^n$. “General position” refers to our ability to move things around to get minimal intersections. Now, a hypersurface of degree $d$ means a subspace defined by a homogeneous irreducible polynomial of degree $d$. By what I said above, the class of such a subspace in $A(\C\bb P^n)\cong \Z[\zeta]/(\zeta^{n+1})$ is $d\zeta$. To find out about their intersection, we want to compute the product of the elements $d_1\zeta,\dots,d_n\zeta$, which is just $d_1\cdots d_n \zeta^n$. Since $\zeta^n$ is the class of a point, we can interpret this computation by saying that the intersection consists of $d_1\cdots d_n$ points.

The above result is known as Bézout’s theorem. As a concrete example, two (generic) conic sections in the plane intersect in $4$ points.

In this problem, we can exploit the duality of projective space. This states that the space of hyperplanes in $\C\bb P^n$ is naturally identified with $\C\bb P^n$. Explicitly, the point $[a_0 : \cdots : a_n]\in\C\bb P^n$ is identified with the hyperplane $a_0X_0 + \cdots + a_n X_n = 0$. Now, we use the following fact: if $S$ is a smooth degree $d$ hypersurface in $\C\bb P^n$, then the set of tangent hyperplanes to $S$ forms a degree $d(d-1)^{n-1}$ hypersurface in the dual projective space. In particular, the dual of our cubic surface is a degree $3\cdot 2^2=12$ surface. In $3$-space, the dual of a line is a line. Furthermore, in $3$-space, the condition that a plane contains a line is dual to the condition that a point is contained in a line. All of this means that we can translate our problem to the following:

This is now easily answered by the Chow ring: $12\zeta\cdot\zeta^2 = 12\zeta^3$ represents $12$ points.

We have $[S]\cdot[T]=[L]+[C]$. Furthermore, $[S]=s\zeta, [T]=t\zeta,$ and $[L]=\zeta^2$. Thus, we have $[C]=(st-1)\zeta^2$. It remains to relate the degree and genus of $C$ to its class in $A(\C\bb P^3)$. We will have to use some geometric definitions facts that I have not mentioned before. The degree of $C$ is, more or less by definition, the number of points in the intersection of $C$ with a generic hyperplane. We can calculate this in the Chow ring: $[C]\cdot\zeta = (st-1)\zeta^3$ implies that the degree of $C$ is $st-1$.

The computation of the genus is a bit more contrived, so bear with me. The Chow ring of $S$ contains an element $\sigma$ which is the class of a (generic) intersection of a plane with $S$. The class $\sigma$ is essentially the restriction of $\zeta\in A(\C\bb P^3)$ to $A(S)$. For instance, $\deg(\sigma^2)=s$, since $\zeta^2$ is the class of a line, and a line hits $S$ in $s$ points. Similarly, since $[T]=t\zeta$ in $A(\C\bb P^3)$, we have $[S\cap T]_S = t\sigma$ in $A(S)$. I have used the subscript $S$ to denote that we are taking the class of $S\cap T$ in $S$, i.e. we are only allowed to “wiggle” $S\cap T$ within $S$. We therefore have $[C]_S + [L]_S = t\sigma$. Next, we use a genus formula for smooth curves in degree $s$ surfaces in $\C\P^3$:

This formula applies to both $C$ and $L$. The genus of a line is $0$. Furthermore, $\deg(\sigma\cdot [L]_S)=1$, since $\deg(\zeta\cdot [L])=1$. It then follows from $[C]_S + [L]_S = t\sigma$ and $\deg(\sigma^2)=s$ that $\deg(\sigma\cdot [C]_S)=st-1$. Writing $g$ for the genus of $C$, we get the following equations:

The third equation is obtained by multiplying $[C]_S + [L]_S = t\sigma$ by $[C]_S$ and $[L]_S$ separately. We see that we can use $\deg([L]_S^2)=\deg([C]_S^2)-t(st-2)$ in the second equation to solve for $\deg([C]_S^2)$, and then use it in the first equation to solve for $g$. Indeed, we get:

\[0 = \deg([C]_S^2) - t(st-2) + s-4 + 2,\]

so

\[\deg([C]_S^2) = t(st-2)+2-s,\]

so

\[g = t(st-2)/2 + 1 -s/2 + (s-4)(st-1)/2 + 1\] \[= t(st-2)/2 - (s-4)/2 + (s-4)(st-1)/2\] \[= (s+t-4)(st-2)/2.\]

Conclusion

Hopefully these sample solutions give you some feel for the power of the Chow ring; we can use basic geometric reasoning and simple algebra to obtain rigorous solutions to enumerative geometry problems.


Footnotes:

  1. The given conics must be in general position. I’ll also implicitly assume smoothness. 

  2. If you don’t know what this means, fret not; I’ll explain it in a bit. 

  3. Technically they are different as graded rings, but they agree as non-graded rings. 

  4. I will take rings to be commutative with a multiplicative identity, as is standard in algebraic geometry. 

  5. Warning: we take the dimension over the base field $\C$, not over $\R$.