The rank of \(y^2 = x^3 - 2\) via Mazur-Tate methods

When I was a young kid, I heard the mathematical fact that the only (positive) integer that is one more than a square and one less than a cube is \(26\). Said differently, the only integer solutions \((x,y)\) to \(y^2 = x^3 - 2\) are given by \((3,\pm 5)\). There are elementary methods to prove this, using the fact that the ring of integers of \(\mathbf{Q}(\sqrt{-2})\) is a unique factorization domain. However, what if we now ask for rational solutions? The equation

\[E : y^2 = x^3 - 2\]

is of course an elliptic curve over \(\mathbf{Q}\). Therefore, this more difficult question is equivalent to asking for the structure of the Mordell-Weil group of \(E\).
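As a quick sanity check on the integer-point statement above, here is a minimal brute-force sketch. The function name `integer_points` and the search bound are illustrative choices, and a finite search is of course no substitute for the unique-factorization proof.

```python
from math import isqrt

def integer_points(x_max):
    """Return all integer (x, y) with y^2 = x^3 - 2 and 2 <= x <= x_max."""
    points = []
    # For x <= 1 the right-hand side x^3 - 2 is negative, so there is nothing to check.
    for x in range(2, x_max + 1):
        rhs = x**3 - 2
        y = isqrt(rhs)  # exact integer square root, no floating point involved
        if y * y == rhs:
            points.append((x, y))
            if y != 0:
                points.append((x, -y))
    return points

print(integer_points(10_000))  # [(3, 5), (3, -5)]
```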

Let us first analyze the torsion subgroup \(E(\mathbf{Q})_{\text{tors}}\). The discriminant of \(E\) is \(\Delta = -4^3\cdot 3^3\) and so \(E\) has good reduction outside of \(2\) and \(3\). One checks that \(\#E(\mathbf{F}_5) = 6\) and \(\#E(\mathbf{F}_7) = 7\). Since reduction modulo an odd prime of good reduction is injective on torsion, \(\#E(\mathbf{Q})_{\text{tors}}\) divides both \(6\) and \(7\), and thus \(E(\mathbf{Q})_{\text{tors}} = 0\). It now remains to compute the rank of \(E\). In Silverman’s Arithmetic of Elliptic Curves, one finds an algorithm to compute the rank of an elliptic curve over \(\mathbf{Q}\) in the case that the \(2\)-torsion is defined over \(\mathbf{Q}\). However, the \(2\)-torsion of \(E\) is not defined over \(\mathbf{Q}\), so the method in Silverman cannot be applied directly. In this article, we prove the following theorem:

Theorem 1: The rank of \(E\) is equal to \(1\).
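The two point counts used in the torsion computation can be reproduced by brute force. The following is a small sketch; the helper name `count_points` is an illustrative choice, and the function is only meant for small primes \(p\) not dividing \(6\).

```python
def count_points(p):
    """Count the points of y^2 = x^3 - 2 over F_p, including the point at infinity."""
    # square_roots[r] = number of y in F_p with y^2 = r
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    affine = sum(square_roots[(x**3 - 2) % p] for x in range(p))
    return affine + 1  # add the point at infinity

print(count_points(5), count_points(7))  # 6 7
```

Since \(\gcd(6,7) = 1\), these two counts already force the torsion subgroup to be trivial.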

We remark that since \(E(\mathbf{Q})_{\text{tors}} = 0\), the point \(P = (3,5)\) is non-torsion, so it is enough to show that the rank of \(E\) is bounded by \(1\). Of course, computer programs such as MAGMA can calculate the rank of \(E\). Nonetheless, we believe that there is value in bounding the rank of \(E\) “by hand”: this approach requires non-trivial input about the arithmetic of a certain \(S_3\)-extension of \(\mathbf{Q}\). In addition, the calculations turn out to be very explicit (computing invariants in terms of generators and relations), which is always fun!
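To see the point \(P = (3,5)\) producing new rational points, one can double it with exact rational arithmetic. This is a minimal sketch using the standard tangent-line formula for a curve of the form \(y^2 = x^3 + b\); the names `double` and `on_curve` are illustrative.

```python
from fractions import Fraction

def on_curve(point):
    """Check that an affine point satisfies y^2 = x^3 - 2."""
    x, y = point
    return y * y == x**3 - 2

def double(point):
    """Double an affine point on y^2 = x^3 - 2 (assumes y != 0)."""
    x, y = point
    slope = 3 * x * x / (2 * y)   # slope of the tangent line at the point
    x3 = slope * slope - 2 * x
    y3 = slope * (x - x3) - y
    return (x3, y3)

P = (Fraction(3), Fraction(5))
Q = double(P)
print(f"2P = ({Q[0]}, {Q[1]})")  # 2P = (129/100, -383/1000)
print(on_curve(Q))               # True
```

The coordinates of \(2P\) are not integers, which is consistent with \(P\) having infinite order (by the Nagell-Lutz theorem, torsion points on this model have integral coordinates).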

The method of 2-descent

We will use the method of \(2\)-descent to compute the rank of \(E\). As in the proof of the weak Mordell-Weil theorem, we want to compute the dimension of the \(\mathbf{F}_2\)-vector space \(E(\mathbf{Q})/2E(\mathbf{Q})\), and in fact for the purposes of Theorem 1, we want to show that \(\dim_{\mathbf{F}_2}E(\mathbf{Q})/2E(\mathbf{Q}) \leq 1.\) To this end, consider the short exact sequence of étale sheaves on \(\spec \mathbf{Q}\)

\[0 \to E[2] \to E \stackrel{2\cdot}{\to} E \to 0.\]

This gives an injection

\[ \frac{E(\mathbf{Q})}{2E(\mathbf{Q})} \hookrightarrow H^1(G_\mathbf{Q}, E[2](\qbar)).\]

Now we run into the following problem: The Galois cohomology group \(H^1(G_\mathbf{Q}, E[2](\qbar))\) is often infinite-dimensional. To illustrate this, let us suppose for the moment that the \(2\)-torsion of \(E\) is defined over \(\mathbf{Q}\). Then \(E[2](\qbar) \simeq \mu_2^{\oplus 2}\), and therefore by the Kummer sequence \(H^1(G_\mathbf{Q}, E[2](\qbar) )\simeq (\mathbf{Q}^\times/\mathbf{Q}^{\times 2})^{\oplus 2}\) which is very infinite-dimensional.

The get-out-of-jail-free card, as introduced in the Mazur-Tate article Points of order \(13\) on elliptic curves, is to work integrally. This helps because for a number field \(K\), the group \(K^\times/K^{\times 2}\) of nonzero elements modulo squares is infinite-dimensional, but \(\mathcal{O}_K^\times/\mathcal{O}_K^{\times 2}\) is finite-dimensional, thanks to Dirichlet’s unit theorem. In view of this, we modify our approach above as follows. First, observe that the elliptic curve \(E\) has bad reduction at \(2\) and \(3\) and nowhere else. Therefore, we may extend \(E\) to an elliptic scheme \(\mathcal{E}\) over \(\spec \mathbf{Z}[1/6]\). In simple terms, \(\mathcal{E}\) is the vanishing locus of the (homogenized) equation for \(E\) in \(\mathbf{P}^2_{\mathbf{Z}[1/6]}\), since \(E\) is already defined integrally. The more advanced reader may note that \(\mathcal{E}\) is also the Néron model of \(E\) over \(\spec \mathbf{Z}[1/6]\), since any abelian scheme over a Dedekind base is the Néron model of its generic fiber.

Now by the valuative criterion of properness,

\[ \frac{E(\mathbf{Q})}{2 E(\mathbf{Q})} = \frac{ \mathcal{E}(\mathbf{Z}[1/6])}{2\mathcal{E}(\mathbf{Z}[1/6])}\]

and therefore it suffices to show that \(\dim_{\mathbf{F}_2} { \mathcal{E}(\mathbf{Z}[1/6])}/{2\mathcal{E}(\mathbf{Z}[1/6])} \leq 1.\) Furthermore, by considering the exact sequence arising from multiplication by \(2\) on \(\mathcal{E}\), we obtain (as in the case for \(E\)) an injection

\begin{equation*} \frac{\mathcal{E}(\mathbf{Z}[1/6])}{2\mathcal{E}(\mathbf{Z}[1/6]) }\hookrightarrow H^1(\mathbf{Z}[1/6], \mathcal{E}[2]). \tag{8}\end{equation*}

The upshot of replacing \(E\) with \(\mathcal{E}\)? The group \(H^1(\mathbf{Z}[1/6], \mathcal{E}[2])\) is finite! In fact, its dimension as an \(\mathbf{F}_2\)-vector space is bounded by \(1\). This is what we will show next.

Computing with Galois cohomology


Let \(K\) denote the splitting field of \(x^3 - 2\). It is a basic exercise in Galois theory that \(K = \mathbf{Q}(\sqrt[3]{2}, \omega)\), where \(\omega\) is a primitive third root of unity. By definition of the group law on an elliptic curve, the non-trivial \(2\)-torsion points of \(E\) are precisely the points \((x,0)\) with \(x^3 - 2 = 0\). Labelling these points by their \(x\)-coordinates, we therefore have

\[ E[2] \otimes_\mathbf{Q} K \simeq \{\infty, \sqrt[3]{2}, \sqrt[3]{2}\omega, \sqrt[3]{2}\omega^2\} \simeq ( \mathbf{Z}/2\mathbf{Z})^{\oplus 2}\]

where the last isomorphism is non-canonical, i.e. depends on a choice of basis.

In addition, we make the following important observation. Let \(S\) be the set of primes in \(\mathcal{O}_K\) lying over \(2\) and \(3\). It is known (e.g. by Keith Conrad’s article here) that \(S = \{(\sqrt[3]{2}), (\eta)\}\) where \(\eta = \sqrt{-3}/(1 + \sqrt[3]{2})\). Consequently \(\spec \mathcal{O}_{K,S} \to \spec \mathbf{Z}[1/6]\) is finite étale, where

\[ \mathcal{O}_{K,S} \coloneqq \{x \in K : \text{\(\nu_{\mathfrak{p}}(x) \geq 0\) for all \(\mathfrak{p} \notin S\)}\}.\]

Furthermore, the Galois group \(G \coloneqq \text{Gal}(K/\mathbf{Q})\) acts on the ring \(\mathcal{O}_{K,S}\), and we claim that in fact \(G\) is equal to the full automorphism group of \(\mathcal{O}_{K,S}\) over \(\mathbf{Z}[1/6]\), i.e. \(\spec \mathcal{O}_{K,S} \to \spec \mathbf{Z}[1/6]\) is a Galois cover with Galois group \(G\). Indeed, this follows from the fact that any automorphism of the generic fiber \(K\) preserves \(\mathcal{O}_{K,S}\), since \(G(S) \subseteq S\).

We leave it as an exercise for the reader to verify that \(\mathcal{E}[2]\) splits over \(\mathcal{O}_{K,S}\).

Kummer Theory

Since \(\mathcal{E}[2]\) is isomorphic to \((\mathbf{Z}/2\mathbf{Z})^{\oplus 2}\) over \(\spec \mathcal{O}_{K,S}\), it seems only natural to pass to this extension. Even better, we now have the Kummer sequence at our disposal, which as we will see makes everything very explicit. To this end, recall that \(\spec \mathcal{O}_{K,S} \to \spec \mathbf{Z}[1/6]\) is Galois with Galois group \(G\). Therefore, we have a Hochschild-Serre spectral sequence

\[H^i(G, H^j(\mathcal{O}_{K,S}, \mathcal{E}[2])) \implies H^{i+j}(\mathbf{Z}[1/6], \mathcal{E}[2]).\]

The low degree terms of this spectral sequence give rise to an exact sequence

\begin{equation*} \begin{tikzcd}  0 \ar{r} & H^1( G, H^0(\mathcal{O}_{K,S}, \mathcal{E}[2])) \ar{r} &  H^{1}(\mathbf{Z}[1/6], \mathcal{E}[2]) \ar{d} & {} \\  {} & {} & H^1(\mathcal{O}_{K,S}, \mathcal{E}[2])^G \ar{r} &  H^2( G, H^0(\mathcal{O}_{K,S}, \mathcal{E}[2])). \end{tikzcd} \tag{9} \end{equation*}

We claim:

Lemma 1: \(H^1( G, H^0(\mathcal{O}_{K,S} ,\mathcal{E}[2])) = 0.\)

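Proof: Apply inflation-restriction to the normal subgroup \(H \coloneqq \text{Gal}(K/\mathbf{Q}(\omega))\), which has order \(3\). The group \(H^1(H, H^0(\mathcal{O}_{K,S}, \mathcal{E}[2]))\) vanishes because the order of \(H\) is prime to \(2\), and the invariants \(H^0(\mathcal{O}_{K,S}, \mathcal{E}[2])^H\) vanish because \(H\) permutes the three non-trivial \(2\)-torsion points cyclically.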

Now recall that our goal is to show that the dimension of \(H^1(\mathbf{Z}[1/6], \mathcal{E}[2])\) (as an \(\mathbf{F}_2\)-vector space) is bounded by \(1\). By (9) and Lemma 1, we have an injection

\begin{equation*}  H^1(\mathbf{Z}[1/6] , \mathcal{E}[2]) \hookrightarrow H^1(\mathcal{O}_{K,S}, \mathcal{E}[2])^G, \tag{10}\end{equation*}

and hence it suffices to show that the same is true of the right side of (10).

Let us spell out the right side of (10) without the Galois invariants. We know that the \(2\)-torsion \(\mathcal{E}[2]\) is abstractly isomorphic to \(\mu_2^{\oplus 2 }\). Therefore, the calculation of \(H^1(\mathcal{O}_{K,S}, \mathcal{E}[2]) \simeq H^1(\mathcal{O}_{K,S}, \mu_2)^{\oplus 2}\) reduces to one using the Kummer sequence

\[0 \to \mu_2 \to \mathbf{G}_m \to \mathbf{G}_m \to 0.\]

Note this is exact on the étale site of \(\spec \mathcal{O}_{K,S}\), precisely because \(2\) is an \(S\)-unit.

Now pass to the long exact sequence in cohomology. We get a short exact sequence

\[0 \to (\units)^{\oplus 2} \to H^1(\mathcal{O}_{K,S}, \mu_2)^{\oplus 2} \to \operatorname{Pic}(\mathcal{O}_{K,S})[2]^{\oplus 2} \to 0.\]

But \(\operatorname{Pic}(\mathcal{O}_{K,S}) = 0\) because it receives a surjection from \(\operatorname{Pic}(\mathcal{O}_K)\) which is also zero since \(K\) has class number \(1\). In summary, we have shown that as abstract abelian groups,

\begin{equation*} H^1(\mathcal{O}_{K,S}, \mathcal{E}[2])\simeq H^1(\mathcal{O}_{K,S}, \mu_2)^{\oplus 2} \simeq (\units)^{\oplus 2}. \tag{11} \end{equation*}

Great, so we now want to take the Galois invariants of \((\units)^{\oplus 2}\). But how do we do this precisely? It is tempting to think that the Galois action on \((\units)^{\oplus 2}\) is coordinate-wise given by the usual Galois action on \(K\). However, this is false because the Galois action on \((\units)^{\oplus 2}\) is “twisted,” in the sense that it comes from the Galois action on \(\mathcal{E}[2]\).

The Galois action on units mod squares

Recall that \(K = \mathbf{Q}(\sqrt[3]{2}, \omega)\) where \(\omega\) is a primitive third root of unity. The Galois group of \(K\) (always denoted \(G\)) is abstractly isomorphic to \(S_3\), with explicit generators given by

\[ \begin{array}{rrc} \sigma : \sqrt[3]{2} & \mapsto& \sqrt[3]{2} \omega \\ \omega& \mapsto&  \omega ,\end{array} \hspace{10mm}  \begin{array}{rrc} \tau : \omega & \mapsto& \omega^2  \\ \sqrt[3]{2}& \mapsto&  \sqrt[3]{2} \end{array}  \]

satisfying the relations \(\sigma^3 = \tau^2 = 1\), \(\sigma\tau = \tau \sigma^2\). For any \((a,b) \in (\units)^{\oplus 2}\), we want to give an explicit description of \(\sigma \cdot (a,b)\) and \(\tau \cdot (a,b)\) via the isomorphism (11). To write down such an action explicitly, it makes sense intuitively that we must also compute the (usual) Galois action on the finite-dimensional \(\mathbf{F}_2\)-vector space \(\units\), where by “usual” we mean the one coming from the Galois action on \(K\).

This is summarized in the two propositions below:

Proposition 1: The group of units mod squares \(\units\) is a finite-dimensional \(\mathbf{F}_2\)-vector space with basis given by

\[\{-1, \varepsilon, \overline{\varepsilon}, \eta, \sqrt[3]{2}\}.\]

Here \(\varepsilon\) satisfies \(\varepsilon^2 = -u\varepsilon - u\) and \(u= 1 + \sqrt[3]{2} + \sqrt[3]{4}\) is the fundamental unit of \(\mathbf{Q}(\sqrt[3]{2})\). Furthermore, the restriction of the Galois action on \(K\) to \(\units\) is given in terms of this basis as follows: For \(\sigma\), we have

\begin{equation*} \begin{aligned} \sigma(-1) &= -1  \\  \sigma(\varepsilon) &= \varepsilon \overline{\varepsilon} \\ \sigma(\overline{\varepsilon}) &= \varepsilon \\  \sigma(\eta) &= \varepsilon \eta \\ \sigma(\sqrt[3]{2}) &= \sqrt[3]{2}. \end{aligned} \tag{12} \end{equation*}

For \(\tau\), we have

\begin{equation*} \begin{aligned} \tau(-1) &= -1  \\  \tau(\varepsilon) &= \overline{\varepsilon} \\ \tau(\overline{\varepsilon}) &= \varepsilon \\  \tau(\eta) &= - \eta \\ \tau(\sqrt[3]{2}) &= \sqrt[3]{2} . \end{aligned} \tag{13} \end{equation*}

Proposition 2: For \((a,b) \in (\units)^{\oplus 2}\), we have

\begin{eqnarray*} \tau \cdot (a,b) &=& (\tau(a),\tau(ab)) \\ \sigma \cdot (a,b) &=& (\sigma(b), \sigma(ab))\end{eqnarray*}

where by \(\tau(a), \sigma(b)\), etc we mean the \(G\)-action on \(\units\) given by Proposition 1.

The reader may refer to the expanded version of this article here for proofs of these propositions.
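As a consistency check on the formulas in Propositions 1 and 2 (not a proof of them), one can verify by machine that they satisfy the relations \(\sigma^3 = \tau^2 = 1\) and \(\sigma\tau = \tau\sigma^2\). In the sketch below, a class in \(\units\) is encoded as its exponent vector modulo \(2\) in the basis \((-1, \varepsilon, \overline{\varepsilon}, \eta, \sqrt[3]{2})\); this encoding and all function names are illustrative assumptions.

```python
from itertools import product

# Row i is the image of the i-th basis element (-1, eps, eps_bar, eta, cbrt2),
# written as an exponent vector mod 2, as read off from Proposition 1.
SIGMA_IMAGES = [(1,0,0,0,0), (0,1,1,0,0), (0,1,0,0,0), (0,1,0,1,0), (0,0,0,0,1)]
TAU_IMAGES   = [(1,0,0,0,0), (0,0,1,0,0), (0,1,0,0,0), (1,0,0,1,0), (0,0,0,0,1)]

def add(v, w):
    """Multiply two classes, i.e. add their exponent vectors mod 2."""
    return tuple((x + y) % 2 for x, y in zip(v, w))

def apply(images, v):
    """Apply the F_2-linear map determined by the given basis images."""
    result = (0, 0, 0, 0, 0)
    for coeff, image in zip(v, images):
        if coeff:
            result = add(result, image)
    return result

def sigma(v):
    return apply(SIGMA_IMAGES, v)

def tau(v):
    return apply(TAU_IMAGES, v)

def sigma_pair(pair):
    """Twisted action of sigma on a pair (a, b), as in Proposition 2."""
    a, b = pair
    return (sigma(b), sigma(add(a, b)))

def tau_pair(pair):
    """Twisted action of tau on a pair (a, b), as in Proposition 2."""
    a, b = pair
    return (tau(a), tau(add(a, b)))

def relations_hold(s, t, elements):
    """Check s^3 = t^2 = 1 and s t = t s^2 on every given element."""
    return all(
        s(s(s(x))) == x and t(t(x)) == x and s(t(x)) == t(s(s(x)))
        for x in elements
    )

VECTORS = list(product((0, 1), repeat=5))
PAIRS = list(product(VECTORS, repeat=2))

print(relations_hold(sigma, tau, VECTORS))           # True
print(relations_hold(sigma_pair, tau_pair, PAIRS))   # True
```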

Proof of Theorem 1

By (8) and (10), we have injections

\[\frac{\mathcal{E}(\mathbf{Z}[1/6]) }{2\mathcal{E}(\mathbf{Z}[1/6])} \hookrightarrow H^1(\mathbf{Z}[1/6], \mathcal{E}[2])   \hookrightarrow  H^1(\mathcal{O}_{K,S}, \mathcal{E}[2])^G. \]

Furthermore, the right-most term is isomorphic to \(( (\units)^{\oplus 2})^G\) with the \(G\)-action given by Propositions 1 and 2. To this end, for \((a,b) \in (\units)^{\oplus 2}\), let us write

\begin{eqnarray*} a &=& (-1)^{m_1} \varepsilon^{m_2} \overline{\varepsilon}^{m_3} \eta^{m_4} \sqrt[3]{2}^{m_5} \\ b &=& (-1)^{n_1} \varepsilon^{n_2} \overline{\varepsilon}^{n_3} \eta^{n_4} \sqrt[3]{2}^{n_5} \end{eqnarray*}

for some \(\vec{m} := (m_1,\ldots, m_5)\) and \(\vec{n} := ( n_1, \ldots, n_5)\) in \(\mathbf{F}_2^{\oplus 5}\).

Now suppose that \((a,b) \in (\units)^{\oplus 2}\) is invariant under \(\tau\) and \(\sigma\). Then the following relations must hold:

\begin{eqnarray*}  \tau(a) &=& a \label{eq:tau_a} \\ \sigma(b) &=& a \label{eq:sig_a} \\ \tau(ab) &=& b \label{eq:tau_b}\\ \sigma(ab) &=& b.\label{eq:sig_b} \end{eqnarray*}

The first relation \(\tau(a) = a\) says

\[ (-1)^{m_1 + m_4} \overline{\varepsilon}^{m_2} \varepsilon^{m_3} \eta^{m_4} \sqrt[3]{2}^{m_5} =  (-1)^{m_1} \varepsilon^{m_2} \overline{\varepsilon}^{m_3} \eta^{m_4} \sqrt[3]{2}^{m_5},\]

which implies that

\begin{eqnarray*} m_4 &=& 0\\ m_2 &=& m_3. \end{eqnarray*}

In other words,

\begin{equation*} a = (-1)^{m_1}\varepsilon^{m_2} \overline{\varepsilon}^{m_2} \sqrt[3]{2}^{m_5}. \tag{14} \end{equation*}

Now consider the second relation \(\sigma(b) = a\). Using (14), this reads

\[(-1)^{n_1} \varepsilon^{n_2+n_3 + n_4}  \overline{\varepsilon}^{n_2} \eta^{n_4} \sqrt[3]{2}^{n_5} = (-1)^{m_1} \varepsilon^{m_2} \overline{\varepsilon}^{m_2} \sqrt[3]{2}^{m_5}.\]

Comparing exponents, we obtain

\begin{eqnarray*} m_1 &=& n_1 \\ m_2 &=& n_2 + n_3 + n_4 \\ m_2 &=& n_2 \\ n_4 &=& 0\\ n_5 &=& m_5, \end{eqnarray*}

and therefore \(n_3 = 0\) as well. Now we summarize what we have deduced about \(\vec{m}\) and \(\vec{n}\) so far:

\begin{eqnarray*} \vec{m} &=& (m_1, m_2, m_2, 0 , m_5) \\ \vec{n} &=& (m_1, m_2, 0,0,m_5).\end{eqnarray*}

We’re nearly there. Consider the third relation \(\tau(ab) = b\). The product \(ab\) corresponds to adding the vectors \(\vec{m}\) and \(\vec{n}\). But \(\vec{m} + \vec{n} = (0,0,m_2,0,0)\) and therefore

\[\tau(ab) = \tau(0,0,m_2,0,0) = (0,m_2,0,0,0).\]

This must equal \(b\), which in terms of \(\vec{n}\), says

\[ (0,m_2,0,0,0) = (m_1, m_2, 0,0,m_5).\]

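In other words, \(m_1 =m_5 = 0\). The final relation does not yield any extra information since \(\sigma(ab) = \tau(ab)\). In summary, we have proven that \(\vec{m} = (0,m_2,m_2,0,0)\) and \(\vec{n} = (0,m_2,0,0,0)\), i.e. \(((\units)^{\oplus 2})^G\) is spanned by \((\varepsilon\overline{\varepsilon}, \varepsilon)\) and hence has dimension at most \(1\). Since \(E(\mathbf{Q})_{\text{tors}} = 0\), the dimension of \(E(\mathbf{Q})/2E(\mathbf{Q})\) equals the rank of \(E\), so the rank is at most \(1\). This completes the proof of Theorem 1.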
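The linear algebra in this proof is small enough to check exhaustively: there are only \(2^{10}\) pairs \((a,b)\). The sketch below enumerates all of them and keeps those satisfying the four invariance relations. As an assumption of the sketch, classes are encoded as exponent vectors modulo \(2\) in the basis \((-1, \varepsilon, \overline{\varepsilon}, \eta, \sqrt[3]{2})\), and the names `SIGMA`, `TAU`, `act` and `is_invariant` are illustrative.

```python
from itertools import product

# Row i is the image of the i-th basis element (-1, eps, eps_bar, eta, cbrt2)
# under the automorphism, written as an exponent vector mod 2 (Proposition 1).
SIGMA = [(1,0,0,0,0), (0,1,1,0,0), (0,1,0,0,0), (0,1,0,1,0), (0,0,0,0,1)]
TAU   = [(1,0,0,0,0), (0,0,1,0,0), (0,1,0,0,0), (1,0,0,1,0), (0,0,0,0,1)]

def add(v, w):
    """Multiply two classes, i.e. add their exponent vectors mod 2."""
    return tuple((x + y) % 2 for x, y in zip(v, w))

def act(rows, v):
    """Apply the F_2-linear map whose basis images are `rows` to the vector v."""
    out = (0, 0, 0, 0, 0)
    for coeff, row in zip(v, rows):
        if coeff:
            out = add(out, row)
    return out

def is_invariant(a, b):
    """Check tau(a) = a, sigma(b) = a, tau(ab) = b and sigma(ab) = b."""
    ab = add(a, b)
    return (act(TAU, a) == a and act(SIGMA, b) == a
            and act(TAU, ab) == b and act(SIGMA, ab) == b)

vectors = list(product((0, 1), repeat=5))
invariants = [(a, b) for a in vectors for b in vectors if is_invariant(a, b)]
for a, b in invariants:
    print(a, b)
# (0, 0, 0, 0, 0) (0, 0, 0, 0, 0)
# (0, 1, 1, 0, 0) (0, 1, 0, 0, 0)
```

The only non-trivial invariant pair has \(a = \varepsilon\overline{\varepsilon}\) and \(b = \varepsilon\), so the space of invariants is one-dimensional over \(\mathbf{F}_2\).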
