Skip to content

Commit

Permalink
more updates to <= chapter 3.
Browse files Browse the repository at this point in the history
  • Loading branch information
kbuzzard committed Apr 4, 2024
1 parent 3e8d931 commit 10ec4e2
Show file tree
Hide file tree
Showing 7 changed files with 142 additions and 87 deletions.
29 changes: 22 additions & 7 deletions blog.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,20 +16,35 @@ Kevin Buzzard discusses the project to prove Fermat's Last Theorem in Lean.
<!-- TEASER_END -->
# Introduction

Fermat's Last Theorem (FLT) is the claim that some equation or other has no solutions in positive integers. The equation has no applications, either theoretical or practical. So why did its solution by Andrew Wiles in 1993 make the front page of the New York Times?
Fermat's Last Theorem (FLT) is the claim that some equation or other has no solutions in positive integers. The equation has no applications, either theoretical or practical. So why did the announcement of its solution by Andrew Wiles in 1993 make the front page of the New York Times?

One aspect of FLT is that the question is ridiculously simple to *state*, and yet it seems incredibly hard to *prove* (indeed, it took the mathematical community over 350 years). Thus it serves as a great reminder of how profound mathematics is. But what really justifies the interest in the problem is that over the past few centuries a huge amount of mathematics has been developed and discovered in an attempt to resolve it, and this mathematics has had applications from cryptography to physics. Indeed Wiles has said that, unsurprisingly, his breakthrough work in the Langlands Program was motivated by FLT, and historically many other developments in algebraic number theory (for example the theory of factorization in cyclotomic fields) have been motivated by the same desire.
One aspect of FLT is that the theorem is ridiculously simple to *state* ($x^n+y^n=z^n$ has no solutions in positive integers if n>=3) (*TODO* maths mode?), and yet it was incredibly hard to *prove* (indeed, it took the mathematical community over 350 years). Thus it serves as a great reminder of how profound mathematics is. But what really justifies the interest in the problem is that over the past few centuries a huge amount of mathematical theory has been developed and invented/discovered in an attempt to resolve the problem, and this mathematics has had applications from cryptography to physics. Indeed Wiles' breakthrough work in the Langlands Program which ultimately resolved FLT was of course motivated by FLT, and historically many other developments in algebraic number theory (for example the theory of factorization in number fields, and the arithmetic of cyclotomic fields) have been at least partly motivated by the desire to shed more light on the problem.

The work of Wiles and Taylor-Wiles built upon a gigantic edifice of 20th century mathematics, and furthermore the fundamental technique which it introduced (a "modularity lifting theorem") has been simplified and hugely generalised in the 30 years following the publication of the original papers; the area continues to be a very active one today. [Frank Calegari's 2022 ICM paper](https://arxiv.org/abs/2109.14145) is a survey of what has happened since.
The work of Wiles, completed by Taylor--Wiles, was built upon a gigantic edifice of 20th century mathematics, and furthermore the fundamental technique which Wiles introduced (a "modularity lifting theorem") has been conceptually simplified and hugely generalised in the 30 years following the publication of the original papers; the area continues to be a very active one today. [Frank Calegari's 2022 ICM paper](https://arxiv.org/abs/2109.14145) is a survey of what has happened since, and what questions remain open. The fact that the area is still active is part of my motivation to "formalise" the proof. But what is formalisation?

# Formalisation of mathematics

Formalisation of mathematics is the art of translating paper mathematics into a computer programming language rich enough to understand the concepts of theorems and proofs. Over the last few years, the area seems to have captured the attention of parts of the mathematics community. We have seen several examples of modern mathematics being formalised in real time, the most recent example of which was Tao et al's formalisation of his work with Gowers, Green and Manners resolving the polynomial Freiman-Ruzsa conjecture. Bystanders might be forgiven for thinking that interactive theorem provers like Lean are now ready to deal with all modern mathematics. However the truth is far more sobering. In *some* areas of mathematics, for example additive combinatorics, we now have extensive evidence that modern breakhroughs can be formalised in real time. However, I attend the London Number Theory Seminar week in week out, and essentially every week I note that Lean does not know enough modern mathematical definitions to even *state* the results announced in the seminar, let alone to check the proofs.
Formalisation of mathematics is the art of translating paper mathematics into a computer programming language rich enough to understand the concepts of theorems and proofs. These programming languages, also called interactive theorem provers (ITPs), have been around for decades. Over the last few years however, the area seems to have captured the attention of parts of the mathematics community. We have now seen several examples of modern mathematics being formalised (by mathematicians, rather than computer scientists) in real time, the most recent of which (at the time of writing) is Tao et al's formalisation of his work with Gowers, Green and Manners resolving the polynomial Freiman--Ruzsa conjecture. Success stories like these might lead casual observers to believe that ITPs such as Lean are now ready to formalise all modern mathematics. However the truth is far more sobering. In *some* areas of mathematics, for example additive combinatorics, we now have extensive evidence that modern breakthroughs can be formalised in real time. However, I attend the London Number Theory Seminar on a regular basis, and most weeks I note that Lean does not know enough modern mathematical definitions to even *state* the results announced in the seminar, let alone to check their proofs.

The fact that number theory is still "behind" in this regard is one of my main motivations to embark on a formalisation of a modern proof of FLT. Once the project is finished, Lean will understand the concepts of automorphic forms and representations, L-functions, Galois representations, potential automorphy, modularity lifting theorems, Shimura varieties and many other fancy concepts used in modern algebraic number theory; we will be closer to my preliminary goal of being able to formalise *statements* of what is happening in my own area of expertise. And why might one want to do this? Well, if we are to believe some computer scientists, the exponential growth in AI should one day translate to computers being able to help mathematicians do their work. But what chance does this have of coming true if computers cannot even *understand what we are doing*?
The fact that number theory is still "behind" in this regard is one of my main motivations to embark on a formalisation of a modern proof of FLT. Once the project is finished, Lean will understand the concepts of automorphic forms and representations, L-functions, Galois representations, potential automorphy, modularity lifting theorems, the arithmetic of varieties, class field theory, arithmetic duality theorems, Shimura varieties and many other concepts used in modern algebraic number theory; formalising the *statements* of what is happening in my own area of expertise will no longer be science fiction. And why might one want to do this? Well, if we are to believe some computer scientists, the exponential growth in AI should one day translate into computers being able to help mathematicians do their work. But what chance does this have of coming true if computers cannot even *understand what we are doing*?

# The FLT project

The Fermat's Last Theorem formalisation project is [now open](https://github.com/ImperialCollegeLondon/FLT). Perhaps more interesting than a link to a github repository is the [blueprint graph](https://legendary-couscous-yrrvjlg.pages.github.io/blueprint/dep_graph_document.html) **TODO** couscous giving an indication of the current progress of the proof. If you are a mathematician you might also be interested in the [blueprint](https://legendary-couscous-yrrvjlg.pages.github.io/blueprint/index.html) itself, which contains an in-progress LaTeX write-up of the route we shall be taking. Experts interested in the precise details of the route we shall be taking can refer to chapter 4 of the blueprint for the details. Note that whilst our main applications are in the 2-dimensional case, we will work in the n-dimensional case whenever we can.
The Fermat's Last Theorem formalisation project is [now open](https://github.com/ImperialCollegeLondon/FLT). Perhaps more interesting than a link to a github repository is the [blueprint graph](https://legendary-couscous-yrrvjlg.pages.github.io/blueprint/dep_graph_document.html) (**TODO** couscous in link) giving an indication of the current progress of the proof. If you are a mathematician you might also be interested in the [blueprint](https://legendary-couscous-yrrvjlg.pages.github.io/blueprint/index.html) itself, which contains an in-progress LaTeX write-up of the route we shall be taking. Experts interested in the precise details of the route we shall be taking can refer to Chapter 4 of the blueprint for the details.

I am being funded for the first five years of the project by EPSRC grant [blah] **TODO**. During this time, my first goal is to *reduce* FLT to claims which were known to mathematicians by the end of the 1980s. I am quietly confident that we, the community, will succeed in this aim.
I am being funded for the first five years of the project by EPSRC grant blah (**TODO**). During this time, my first goal is to *reduce* FLT to claims which were known to mathematicians by the end of the 1980s. I am quietly confident that we will succeed in this aim. But who is "we"?

# Crowd-sourcing mathematics

I cannot formalise FLT alone. Indeed there are several parts of the arguments where I understand the basic principles but have never checked the details carefully, and on top of this there are two inputs (Langlands' work on cyclic base change for GL_2, and the work of Jacquet-Langlands relating automorphic representations for GL_2 with automorphic representations for inner forms of GL_2) which I have only the most superficial understanding of. However this is where one begins to see the benefits of a formalisation project. I will be able to explicitly *state* the results I need in Lean, and pass them on to others. The blueprint software we're using was developed by Patrick Massot to enable collaboration and was first used by him to formalise a modern proof of Smales' [sphere eversion theorem](link). The blueprint framework, combined with the Lean Zulip chat, a high-powered research forum where mathematicians and computer scientists can collaborate in real time, posting code and mathematics, have proved to be decisive here. People interested in watching progress of the project can occasionally check the blueprint graph (**TODO** link); people interested in contributing themselves should make themselves known on the FLT stream on the Lean Zulip.

***********

The blueprint software was developed by Patrick Massot, motivated by his own work formalising

*********
Algebraic number theory is certainly not the only area of mathematics where we cannot state the theorems which are currently being proved in mathematics departments today. I would urge people working in other areas of mathematics to come up with challenges and consider writing blueprints which can be used as roadmaps to formalise these challenges.

***********

There seem to be two viable approaches to teaching computers what is happening in mathematics today. The first is to feed a bunch of modern papers into a large language model and to hope that enough of the ideas sink in for these systems to go beyond the "stochastic parrot" mode which we see so often nowadays when they attempt to do mathematics. The second is to teach an ITP some modern results; this is the route I have decided to take. Perhaps in the future some combination of the two approaches is what will be decisive.
2 changes: 1 addition & 1 deletion blueprint/src/FLT.bib
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
@article {mazur,
@article {mazur-torsion,
AUTHOR = {Mazur, B.},
TITLE = {Modular curves and the {E}isenstein ideal},
NOTE = {With an appendix by Mazur and M. Rapoport},
Expand Down
4 changes: 2 additions & 2 deletions blueprint/src/chapter/ch01introduction.tex
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,6 @@ \chapter{Introduction}

Fermat's Last Theorem was explicitly raised by Fermat in 1637, and was proved by Wiles (with the proof completed in joint work with Taylor) in 1994. There are now several proofs but all of them go broadly in the same direction, using elliptic curves and modular forms.

Explaining a proof of Fermat's Last Theorem to Lean is in some sense like explaining the proof to Diophantus; for example, the proof starts by observing that before we go any further it's convenient to first invent/discover zero and negative numbers, and one can point explicitly at places in Lean's source code \href{https://github.com/leanprover/lean4/blob/260eaebf4e804c9ac1319532970544a4e157c336/src/Init/Prelude.lean#L1049}{here} and \href{https://github.com/leanprover/lean4/blob/260eaebf4e804c9ac1319532970544a4e157c336/src/Init/Data/Int/Basic.lean#L45}{here} where these things happen. However we will adopt a more efficient approach: we will assume all of the theorems both in Lean and in its mathematics library \href{https://github.com/leanprover-community/mathlib4}{\tt mathlib}, and proceed from there. To give some idea of what this entails: {\tt mathlib} at the time of writing contains most of an undergraduate mathematics degree and parts of several relevant Masters level courses (for example, the definitions and basic properties of elliptic curves and modular forms are in {\tt mathlib}). Thus our task is like teaching a graduate level course on Fermat's Last Theorem to a computer.
Explaining a proof of Fermat's Last Theorem to Lean is in some sense like explaining the proof to Diophantus; for example, the proof starts by observing that before we go any further it's convenient to first invent/discover zero and negative numbers, and one can point explicitly at places in Lean's source code \href{https://github.com/leanprover/lean4/blob/260eaebf4e804c9ac1319532970544a4e157c336/src/Init/Prelude.lean#L1049}{here} and \href{https://github.com/leanprover/lean4/blob/260eaebf4e804c9ac1319532970544a4e157c336/src/Init/Data/Int/Basic.lean#L45}{here} where these things happen. However we will adopt a more efficient approach: we will assume all of the theorems both in Lean and in its mathematics library \href{https://github.com/leanprover-community/mathlib4}{\tt mathlib}, and proceed from there. To give some idea of what this entails: {\tt mathlib} at the time of writing contains most of an undergraduate mathematics degree and parts of several relevant Masters level courses (for example, the definitions and basic properties of \href{https://leanprover-community.github.io/mathlib4_docs/Mathlib/AlgebraicGeometry/EllipticCurve/Weierstrass.html}{elliptic curves} and \href{https://leanprover-community.github.io/mathlib4_docs/Mathlib/NumberTheory/ModularForms/Basic.html}{modular forms} are in {\tt mathlib}). Thus our task can be likened to teaching a graduate level course on Fermat's Last Theorem to a computer.

The proof explained in these notes was constructed by Taylor, taking into account Buzzard's comments on what would be easy or hard to do in Lean. The proof uses refinements of the original Taylor-Wiles method by Diamond/Fujiwara, Khare-Wintenberger, Skinner-Wiles, Taylor and others. We shall explain more about the exact path we're taking in Chapter~\ref{ch_overview}. But before we go into those technical details, we can enjoy some of the more basic arguments at the start of the proof. And the proof starts, as every known proof does, with some basic reductions and the introduction of a certain elliptic curve. We explain this in the next chapter.
The proof explained in these notes was constructed by Taylor, taking into account Buzzard's comments on what would be easy or hard to do in Lean. The proof uses refinements of the original Taylor-Wiles method by Diamond/Fujiwara, Khare-Wintenberger, Skinner-Wiles, Kisin, Taylor and others -- one could call it a 21st century proof of the theorem. We shall explain more about the exact path we're taking in Chapter~\ref{ch_overview}. But before we go into those technical details, we can enjoy some of the more basic arguments at the start of the proof. And the proof starts, as every known proof does, with some basic reductions and the introduction of a certain elliptic curve. We explain this in the next chapter.
4 changes: 2 additions & 2 deletions blueprint/src/chapter/ch02reductions.tex
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ \section{The Frey curve}

\begin{definition}[mod $p$ Galois representation attached to a Frey package]\label{Frey_mod_p_Galois_representation}\lean{FLT.FreyCurve.mod_p_Galois_representation}\uses{Frey_curve} Given a Frey package $(a,b,c,p)$ with corresponding Frey curve $E$, the mod $p$ Galois representation associated to this package is the representation of $\GQ$ on $E(\Qbar)[p].$\end{definition}

Frey's observation is that this mod $p$ Galois representation has some very surprising properties. We will come back to this in the next chapter.
Frey's observation is that this mod $p$ Galois representation has some very surprising properties. We will make this remark more explicit in the next chapter. Here we shall show how these properties can be used to finish the job.

\section{Reduction to two big theorems.}

Expand All @@ -99,4 +99,4 @@ \section{Reduction to two big theorems.}
Indeed, if Fermat's Last Theorem is false then there is a Frey package $(a,b,c,p)$ by~\ref{Frey_package_of_FLT_counterex}, contradicting Corollary~\ref{no_Frey_package}.
\end{proof}

The structure of the rest of this document is as follows. In Chapter 3 we develop some of the basic theory of elliptic curves and the Galois representations attached to their $p$-torsion subgroups. We then apply this theory to the Frey curve, deducing in particular how Mazur's result on torsion subgroups of elliptic curves implies Theorem~\ref{Mazur_on_Frey_curve}, the assertion that $\rho$ cannot be reducible. In Chapter~\ref{ch_overview} we give a high-level overview of our strategy to prove that $\rho$ cannot be irreducible, which diverges from the original approach taken by Wiles; one key difference is that we work with the $p$-torsion directly rather than switching to the 3-torsion. Here we flag some more results from the 1980s and before which are necessary ingredients in our proof. We then begin the journey towards the proof of the modularity lifting theorem which provides the key ingredient in our approach.
The structure of the rest of this document is as follows. In Chapter 3 we develop some of the basic theory of elliptic curves and the Galois representations attached to their $p$-torsion subgroups. We then apply this theory to the Frey curve, deducing in particular how Mazur's result on torsion subgroups of elliptic curves implies Theorem~\ref{Mazur_on_Frey_curve}, the assertion that $\rho$ cannot be reducible. In Chapter~\ref{ch_overview} we give a high-level overview of our strategy to prove that $\rho$ cannot be irreducible, which diverges from the original approach taken by Wiles; one key difference is that we work with the $p$-torsion directly rather than switching to the 3-torsion. In this chapter we also state several more results from the 1980s and before which are necessary ingredients in our proof. We then begin the journey towards the proof of the modularity lifting theorem which provides the key ingredient in our approach.
Loading

0 comments on commit 10ec4e2

Please sign in to comment.