Translation project: Dedekind’s second definition of finite set

[tl;dr please help improve the translation if you can!]

Over the local long weekend I played around with using Google Lens to translate a scan of a text from Dedekind’s Nachlass, in the collected works edited by Robert Fricke, Emmy Noether and Øystein Ore and published in 1932. It concerns a different definition given by Dedekind of what it means to be a finite set without referencing the natural numbers, one which turns out to be equivalent to the definition of finiteness that does use the natural numbers, even in the absence of Choice (unlike Dedekind’s much more famous definition). [EDITED the following, as I was too hasty and used the wrong quantifier. Mea culpa!]

There exists a map \phi\colon S\to S such that if A\subseteq S satisfies \phi(A) \subseteq A, then A is either \emptyset or S.
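
For the formally inclined, here is a sketch of the definition in Lean 4/Mathlib-style notation (untested, and the name `DedekindFinite2` is my own invention, not anything in Mathlib):

```lean
import Mathlib.Data.Set.Image

/-- Dedekind's second definition of finiteness (a sketch): `S` is finite iff
some self-map `φ` has no `φ`-invariant subsets other than `∅` and all of `S`. -/
def DedekindFinite2 (S : Type*) : Prop :=
  ∃ φ : S → S, ∀ A : Set S, φ '' A ⊆ A → A = ∅ ∨ A = Set.univ
```

For a finite S one can take φ to be a cyclic permutation; the content of the equivalence is the converse direction.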

This note has barely been cited, but the definition was given by Dedekind in the foreword to the second edition of his famous “What are numbers and what should they be?”, and he hints that the work to establish the equivalence of this new definition and the usual definition of “finite” is nontrivial. But Dedekind had actually done (most of) the work! He just didn’t in the end edit “What are numbers…” in the second or third edition to include this work (and I don’t blame him, this is all very fiddly, low-level stuff).

One paper that I know of that engages with this unpublished note of Dedekind is cited by Noether in the collected works; it is by the logician-philosopher Jean Cavaillès, her co-editor (of the Cantor–Dedekind correspondence). He apparently filled in a part of the proofs that Dedekind left out, but I haven’t yet read this paper. The zbMath review of Cavaillès’ paper tells us what he proved about this second definition of Dedekind: that a subset of a finite set is finite, that the successor of a finite set is finite, that induction holds for this definition of finite, and that the powerset of a finite set is finite.

And this paper of Cavaillès has likewise almost never been cited. Compare this to Tarski’s definition of what it means to be a finite set (the powerset is classically well-founded with respect to subset inclusion), which appeared a decade earlier in the same journal (Fundamenta Mathematicae), or Kuratowski’s, also in the same journal, before that. Note that zbMath has no citations for Cavaillès’ paper, and MathSciNet somehow doesn’t have the volumes of Fund. Math. from 1932 indexed. Sierpinski’s book Cardinal and Ordinal Numbers cites Cavaillès (in what context I do not know), but all Google Scholar can give me is 4 citations total: two post-1990 theses in Spanish, Sierpinski’s book, and somehow a review of another paper in JSL from the 60s that I haven’t yet checked out.

The upshot is that I want to get together a good critical edition of Dedekind’s note in English, in order to make it better-known. My efforts so far have just been to format the mathematics, and get a first working draft of the machine-translation, in a TeX file. I have made this a public GitHub repository so that German speakers can give advice on smoothing out and improving the translation, say in GH issues, in pull requests, or by comments here or elsewhere (I will post a version of this announcement around the place).

A next step is to get Dedekind’s definition onto the nLab, and also (once the text is stable) to start thinking about more refined relationships to other definitions of finiteness, particularly in non-classical settings. Since the definition itself singles out the special subsets \emptyset, S\subseteq S, I can imagine that constructive variants might be clunky, but my intuition is not great.

Ctrl-z, 18 years later — “Yang-Mills theory for bundle gerbes” is now retracted

tl;dr – my very first paper, co-authored with Mathai Varghese when I was a PhD student (and he was advising me), had a critically flawed assumption, which I only discovered in the second half of last year.

It was tremendously exciting for me in 2005 to have my rough ideas turned into a real draft of a real paper by Mathai, to work on the paper together a bit more before sending it off to the arXiv and submitting it for publication, and then to see it published in early 2006. That my own idea could be made into a scientific publication was a thrill. At the time my PhD supervision was a bit in flux. Students were required to have at least two supervisors (a primary and a secondary), and my primary supervisor—Michael Murray—had just taken up the role of Head of School, the School in question being newly formed from the merger of three departments. Meetings were often rescheduled or cancelled due to time constraints. My original secondary supervisor could contribute little specific to the topic I was looking at, and it was arranged I would swap Mathai in as secondary. This started a period of about 18 months, going from memory, of mostly working with him, though ultimately I moved on and found a different project for my thesis. Our joint paper was the first thing we did in short order, and I was still very fresh and inexperienced. I had moved to the maths department from the physics department, switching disciplines. My thinking was still very approximate and “physicsy” (how times have changed!). From what I can only guess are my notes going into the meeting that kicked this project off, it looks like I was hoping we could reproduce the derivation of the sourceless vacuum Maxwell equations from the appropriate Lagrangian/action, except now from the 3-form curvature of a bundle gerbe, rather than from the 2-form curvature (the Faraday tensor) of a connection on a U(1)-bundle (i.e. the EM vector potential).

Page from my notebook dated 10th August 2005, and showing a first pass at a variational calculus approach to a bundle gerbe Yang–Mills action.

The details of who did what in the paper will be left vague, but suffice it to say that I did at one point “verify” that the gauge group in the paper did indeed satisfy the conditions of having a group action on our space of bundle gerbe curvings. Looking at my calculations, they are not wrong—they just took the assumption behind the definition for granted. This assumption is that there is a certain 2-form on the Banach Lie group PU(\mathcal{H}) with a reasonable-seeming “primitivity” property (which certainly holds at the level of de Rham cohomology, and even at the linearised level, at the identity element). Our paper cites no source for this assumption. Given such a 2-form, everything works fine, and my calculation in 2005 is sound.

However, as is my wont, I like to read over my past papers to keep the ideas from being completely forgotten, and particularly papers like this one, which has left its original rationale dangling: can we generalise not just electromagnetism to higher-degree forms, but non-abelian Yang–Mills theory? As one can tell from the 2006 paper’s title, this was at the forefront of my mind at the time. The knowledge of non-abelian higher gauge theory was at the time not developed enough for me to attack this problem, though it was the original goal of the PhD project proposal. But in recent years, the kind of nitty-gritty down-to-earth tech has pretty much arrived, and so while I’m not planning on picking up my original goal, I do still care about supplying physicists with concrete examples to illustrate the theory.

In the process of re-reading our paper, I noticed, for the first time, what a bold claim the primitivity assumption really is:

…recall that the line bundle L associated with the central extension … is primitive in the sense that there are canonical isomorphisms …, and there is a connection \nabla, on the line bundle L, called a primitive connection, which is compatible with these isomorphisms.

Mathai and Roberts (2006), section 2.2

In the treatment by Murray and Stevenson of connections on central extensions (see section 3 at the link), one generically does not get such a primitive connection; rather there is a 1-form on the square of the quotient group measuring the failure of the compatibility of the connection with the isomorphisms. Back in 2005–06, this was not a paper I had spent time with, but in the past decade it has become a standard reference for me. And so only now does the unwarranted assumption stand out like a sore thumb. The problem is not so much that the connection fails to be compatible with the isomorphisms, but that, in all cases that I knew of, there is a resulting exact 2-form, the exterior derivative of the 1-form mentioned above. To satisfy the axiom for a group action of our gauge group C^\infty(X,PU(\mathcal{H})) on the affine space of bundle gerbe curvings, this exact 2-form needed to vanish everywhere.

I was somewhat perturbed, and to gloss over the subsequent discussions of the awkward situation, I ended up writing a paper that not only showed that there is no 2-form on PU(\mathcal{H}) as we had assumed, but also classified the more general class of bi-invariant 2-forms on all (reasonable) infinite-dimensional Lie groups in terms of the topological abelianisation of the Lie algebra. In many examples of interest this more general space of 2-forms still turns out to be trivial (i.e. only the identically vanishing 2-form is bi-invariant), so certainly there can be no primitive 2-forms as we assumed in all of these cases. Thankfully, one can wring some mildly interesting examples out of this result, namely that one can classify bi-invariant forms on some infinite-dimensional structured diffeomorphism groups (in the volume-preserving and symplectomorphism cases, for instance), in terms of specific de Rham cohomology spaces of the original compact manifold. Whether this is useful or even interesting isn’t clear to me. But to provide a positive result out of such a negative situation helped ease the disappointment.

Whereas at the point of finding this unjustified assumption I was hoping that perhaps there was something particularly special about the group PU(\mathcal{H}) that I didn’t yet know, I was now in the position where I knew the definition using this assumption was unfixable. Since the resulting analysis of the moduli space of solutions uses this group action in multiple instances, there was no way I could be confident in the result. So I had to (with permission from Mathai) contact the journal. I wrote a note (technically, a “corrigendum”), and submitted it, with the comment that if it had to be a retraction after all, I would be ok with that. Given the amount of time I put into correcting the literature in other instances (eg the two-year process starting with this blog post, and ending with this formal erratum, and the flow-on effects from that process), I couldn’t very well leave our own mistake in the literature to cause problems down the line.

Ultimately, the journal decided that the paper really should be retracted, and now you can find the notice at the journal website. I was thinking last year throughout this process how I might approach this problem afresh with what I know now, and find a similarly concrete description of the moduli space (as opposed to as a higher stack with some universal property). While I had some fruitful ideas, I hadn’t time then to dedicate to them (I was mostly working late at night sitting up in bed in the dark, sketching notes on my tablet in dark mode!). Were I able to fix the problems easily I would have sent them to the journal and had a corrigendum published, rather than retract the paper. Certainly if people are interested, I can share my ideas as they exist so far. As far as my coauthor goes, he was content to authorise me to take charge throughout this whole affair, and I think he is willing to let the matter slide. However, as a matter of personal and professional pride, it would be nice to be able to eventually rectify this error by producing a theorem analogous to the one we had once claimed.

DeMarr’s theorem, part 1

I just learned of a very cool result that shows that the usual assumptions that uniquely characterise the real numbers—for example as a Dedekind complete ordered field—can be relaxed and still arrive at the same conclusion. Without qualification, that’s a bit too vague, but suffice it to say that a) one can just take the ordering to be a partial order, instead of a total order (partial in the sense that, for example, the powerset is only partially ordered by inclusion), and b) one only needs to know that decreasing sequences x_1 \geq x_2 \geq \ldots \geq 0 of elements greater than or equal to 0 have a greatest lower bound. This latter is weaker than Dedekind completeness, which is equivalent to the statement that every bounded-below set has a greatest lower bound. And, finally, c) the multiplication doesn’t need to be assumed to be commutative: it follows from the other axioms.
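
As a toy Python illustration of points (a) and (b)—a genuinely partial order in which decreasing chains nonetheless have greatest lower bounds—take subsets of a finite set under inclusion (this just fixes intuition about the order-theoretic condition; it is of course not a model of DeMarr’s full hypotheses, which concern a ring):

```python
from itertools import combinations

# Subsets of {0,...,4} under inclusion: a partial (not total) order.
# A decreasing chain X1 ⊇ X2 ⊇ ... has the intersection as its
# greatest lower bound.
universe = set(range(5))
dec = [set(range(k)) for k in range(5, 0, -1)]   # {0..4} ⊇ {0..3} ⊇ ... ⊇ {0}
glb = set.intersection(*dec)

def subsets(s):
    s = list(s)
    return (set(c) for r in range(len(s) + 1) for c in combinations(s, r))

# glb is a lower bound of the chain...
assert all(glb <= A for A in dec)
# ...and any other lower bound is contained in it, so it is the greatest one.
assert all(L <= glb for L in subsets(universe) if all(L <= A for A in dec))
print(sorted(glb))  # [0]
```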

This theorem was used in the quantum reconstruction program to prove that the endomorphism ring of the tensor unit object of a dagger category is, under some extra assumptions, in fact either the real numbers or the complex numbers. The point was to avoid a dependence on Solèr’s theorem, which characterises the ring of scalars of what turns out to be an infinite-dimensional separable Hilbert space, so as to be able to one day capture the category of finite-dimensional Hilbert spaces. You can see the talk Bypassing Solèr’s Theorem by Matthew Di Meglio for details on this, but here I just want to talk about the theorem of DeMarr, below the fold.

Continue reading “DeMarr’s theorem, part 1”

A fun “little” example of a crossed module

I’ve been looking back at my very first published paper (for “reasons”), and in it there is the group C^\infty(X,\mathrm{PU}), where X is a compact manifold, and \mathrm{PU} is the Banach Lie group of projective unitary operators on an infinite-dimensional separable Hilbert space (one needs to take the norm topology here). This very large group turns out to be a Lie group (not something that’s in the paper, but it follows from work of other people on infinite-dimensional Lie groups). It’s somewhat implicit, but what is really being considered is the homomorphism C^\infty(X,\mathrm{U}) \to C^\infty(X,\mathrm{PU}) induced by the projection map \mathrm{U}\to \mathrm{PU}. Let us call this map \pi_*\colon \widehat{\mathcal{K}} \to \mathcal{L}. Now the projection map is both a central extension and a principal U(1)-bundle, in particular it’s smoothly locally trivial. I will denote by \mathcal{K} the subgroup of \mathcal{L} consisting of those maps to \mathrm{PU} that lift to \mathrm{U}. It so happens that \mathcal{K} \subset \mathcal{L} is the connected component of the identity element, since a map lifts iff it is null-homotopic, as \mathrm{U} is contractible, X has the homotopy type of a finite CW complex, and \mathrm{U}\to \mathrm{PU}, being a principal bundle, is a Serre fibration (in fact I think \mathrm{PU} is paracompact, which implies it’s a numerable principal bundle, which implies it’s a Hurewicz fibration).

Something that I realised is that we in fact have that \widehat{\mathcal{K}} \to \mathcal{L} is a crossed module of Lie groups (in fact I think Fréchet Lie groups). What is missing is the action of \mathcal{L} on \widehat{\mathcal{K}}, lifting the adjoint action of \mathcal{L} on \mathcal{K}. To get this, fix g\colon X\to \mathrm{PU} and consider a trivialising cover \{W_\alpha\} of X for the U(1)-bundle g^*\mathrm{U}\to X. On each W_\alpha there is a lift g_\alpha\colon W_\alpha \to \mathrm{U} of g\big|_{W_\alpha} and we can form, for any f\colon X\to \mathrm{U}, the map \mathrm{Ad}_{g_\alpha}f\colon W_\alpha \to \mathrm{U}. Since on W_{\alpha\beta} = W_\alpha \cap W_\beta the two lifts g_\alpha,\ g_\beta differ by a multiplicative factor c_{\alpha\beta}\colon W_{\alpha\beta}\to U(1), which is central and so cancels under conjugation, we have \mathrm{Ad}_{g_\alpha}f = \mathrm{Ad}_{g_\beta}f\colon W_{\alpha\beta} \to \mathrm{U}. And thus there is a smooth map which I will denote by \widehat{\mathrm{Ad}}_g f\colon X\to \mathrm{U}. An argument involving common refinements shows that this is indeed an action (of abstract groups) of \mathcal{L} on \widehat{\mathcal{K}}.

One can check that this map satisfies the required conditions (to be a crossed module) of covering the adjoint map, by construction, and also that the composite \widehat{\mathcal{K}} \times \widehat{\mathcal{K}} \to \widehat{\mathcal{K}}\times \mathcal{L} \to \widehat{\mathcal{K}} of the projection \mathrm{id}\times \pi_* followed by the action is just the adjoint action of \widehat{\mathcal{K}} on itself.
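
Spelled out, the two crossed-module conditions being checked here are the following (in my notation, writing the action as \widehat{\mathrm{Ad}}):

```latex
% equivariance: \pi_* intertwines the action with conjugation in \mathcal{L}
\pi_*\big(\widehat{\mathrm{Ad}}_g f\big) = g\,\pi_*(f)\,g^{-1}
  \qquad (g\in\mathcal{L},\ f\in\widehat{\mathcal{K}}),
% Peiffer identity: acting via an element in the image of \pi_* is conjugation
\widehat{\mathrm{Ad}}_{\pi_*(f')}\,f = f'\,f\,(f')^{-1}
  \qquad (f, f'\in\widehat{\mathcal{K}}).
```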

The only missing thing is that the action should be smooth. I claim this is true, and is an exercise for the reader to check using the charts supplied by the Lie algebras C^\infty(X,u) and C^\infty(X,pu) (Banach Lie groups are locally exponential, so the exponential map does indeed furnish a chart around the identity for \mathrm{U} and \mathrm{PU}, and taking mapping groups into a locally exponential Lie group is again locally exponential).

We have the nice property that \mathcal{K} \subset \mathcal{L} is a closed subgroup, and in fact, as noted, it is the connected component of the identity, so that the projection map \mathcal{L} \to \mathrm{coker}(\pi_*) \simeq H^2(X,\mathbb{Z}) is not something pathological (like the quotient map of a non-closed or non-Lie subgroup). However, I don’t know as yet (because I’m juggling too many balls at the moment) if \widehat{\mathcal{K}} \to \mathcal{K} is a locally trivial bundle, which is what one wants for a crossed module of Lie groups. One has that \ker(\pi_*)= C^\infty(X,U(1)), which is a particularly nice Fréchet Lie group (eg it is nuclear), and so the idea is that

1 \to C^\infty(X,U(1)) \to C^\infty(X,\mathrm{U}) \to C^\infty(X,\mathrm{PU})_0\to 1

should be a locally split extension of Fréchet Lie groups. In particular, it should be the case that the extension of Fréchet spaces

0 \to C^\infty(X,i\mathbb{R}) \to C^\infty(X,u) \to C^\infty(X,pu)\to 0

is continuously split. I believe there is a continuous linear section pu\to u of the quotient map, which is enough here. However, I haven’t thought too hard about checking that one indeed gets submersion charts for \pi_*, let alone local trivialisations for the putative bundle.

Now clearly there is nothing particularly special here about this specific choice of central extension of Banach Lie groups. One could have started with any A\to \widehat{G} \to G. Probably requiring Banach here is not necessary, and a central extension of (locally exponential) Fréchet Lie groups might be fine to make the argument go through.

However, what is special is that the crossed module above seems to be very close to a presentation of the stack (of monoidal groupoids) of principal U(1)-bundles on X! The symmetric nature of the tensor product of bundles here is not apparent, though, since the crossed module is decidedly made up of non-abelian groups! However, we do know that the product on the Lie groups is commutative up to homotopy. So it might well be that the differentiable stack that (I claim) is presented by this strictly monoidal Lie groupoid has the same “shape” (in the technical sense of cohesive higher toposes) as the stack of U(1)-bundles. This is something to think about.

No order-10 projective planes via SAT

The proof that there is no finite projective plane of “order 10” (namely, one with 11 points per line) was a gory and large multi-part computation. The last phase, on a CRAY supercomputer (at low priority)

“…started running at IDA in the fall of 1986 and was to await completion in two to three years.”

C.W.H. Lam, The Search for a Finite Projective Plane of Order 10

before ultimately completing in 1988 (not without a final wrinkle, see the paper for more).

But I just learned there was an independent newer proof, this time using a SAT-solver, rather than a patchwork of custom code and by-hand computation!

This is roughly analogous to how the original proof of the Four Colour Theorem was a much more bespoke operation, while the next proof in the 90s was more clever about removing the error-prone by-hand part. But! This is in fact the third independent proof of the non-existence result. There was an intermediate proof by Dominique Roy in 2011, but this was apparently still a matter of custom-written research code. A SAT-solver, on the other hand, is a generic tool, and their theory is intensely studied. Further, the software outputs a “proof certificate” at the end, rather than a boolean yes/no: this is a data object that can be fed into a piece of software that verifies the proof really does do what it says it does (you can get them here!). However, this is not yet a completely formalised proof (in the sense of how the Four Colour Theorem has by now been proved in the proof assistant Coq): there are some unformalised theorems from the 1970s that are relied on. As you might imagine, the proof didn’t need 2 years of supercomputer time! Or rather, it took about 24 months on a standard personal computer, three months faster than Roy, compared to 30 months of compute time for Lam and collaborators in the 1980s, including their access to a CRAY machine.
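
As a toy illustration of what all this combinatorial search is verifying, here is a quick Python sanity check of the projective-plane axioms for the smallest case, the order-2 plane (the Fano plane, with 2 + 1 = 3 points per line, just as order 10 means 11 points per line). The names and the omission of the non-degeneracy conditions are mine:

```python
from itertools import combinations

# Points 0..6 and the 7 lines of the Fano plane, the projective plane
# of order 2: each line has 2 + 1 = 3 points.
points = list(range(7))
lines = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5},
         {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]

def is_projective_plane(points, lines):
    # any two distinct points lie on exactly one common line
    for p, q in combinations(points, 2):
        if sum(1 for l in lines if p in l and q in l) != 1:
            return False
    # any two distinct lines meet in exactly one point
    for l1, l2 in combinations(lines, 2):
        if len(l1 & l2) != 1:
            return False
    return True  # (non-degeneracy conditions omitted for brevity)

print(is_projective_plane(points, lines))  # True
```

The non-existence proofs show that no such configuration exists with 11 points per line, which is an astronomically larger search space than this exhaustive pairwise check suggests.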

Catching myself

I just recently found myself in an interesting position. I was looking at a recently-published paper very close to my own area of expertise. I had missed seeing the preprint, and I found myself thinking “Isn’t this result a trivial consequence of already known facts?” What complicated matters was that this was from a junior researcher (who for this blog post will remain anonymous).

Those who have been around in category theory for a while may recognise this type of statement. Indeed, there was a major public dust-up in the not-so-distant past between topos theorists over just such an issue. But it is the type of thing that I’m sure has happened multiple times. The problem with category theory is that the formalism and the intuition of category theorists are extremely close. So if one is sufficiently experienced, theorems can sometimes immediately suggest themselves, and it’s a matter of setting up the framework so that the theorem holds, then showing that known results or examples are “just” special cases. Instead of proving the theorem holds for objects X, one can be reduced to proving objects X satisfy the conditions of a definition.

However, this is merely often, and not always, the case. Sometimes there are really interesting wrinkles that happen when hypotheses are relaxed. Counterexamples to claims at the boundaries. Or techniques need to be wholly reworked to make them feasibly applicable as the complexity of the constructions grows (proving something is a symmetric monoidal closed bicategory in the weakest possible sense? Have fun with that…). Or perhaps there’s a standing assumption that holds in practice in the applications….or almost all applications…and the “well-known-to-experts theorem” has only ever been proved with that standing assumption.

This is the situation I found myself in.

After looking at the paper, the main definitions, the statement of the main results, I wondered why this paper didn’t cite the results I was familiar with. The paper is well written. It actually follows a philosophy I agree with, supplying details where authors trying to save themselves time and effort in diagram typesetting would brush off the proof as an exercise for the reader. For all the talk of intuition, when matters get complex enough, it is worth actually recording the proof properly, so we don’t find ourselves in the situation of Voevodsky and his early work on higher categories (which turned out to have a fatal flaw). But for some reason I was reacting emotionally.

I had to consciously stop myself. This was a junior researcher. I should not be firing off an email claiming the results are trivial applications of known results. This paper was published for a reason. I had another look.

This time, I noticed a claim that a certain property didn’t always hold, when I thought it did. At least, in the situation I was familiar with. Under a certain standing assumption. So now I was wondering if something had been missed, making everything a lot harder than it needed to be. Surely, this property follows reasonably trivially, and then we can all cite the literature and go home.

But again, this was just my intuition talking. I read through the paper to see if a counterexample was supplied. There was not. But….the author can’t just be pulling this fact from nowhere.

So I slowed down more: pull apart the definition. Prove for myself some helper lemmas that would help me prove this property always holds. Do a bunch of reductions. Find a general statement that, if true, leads to the property. Then, armed with the general statement (which was of a class I knew should be known), go hunting online. Surely someone has written about this already.

And then I found it: the general statement I thought should be true was true … if and only if the standing assumption was true.

But, surely, the setup without the standing assumption was, while non-empty, not as interesting/applicable/important as the extremely wide and well-known entire area with the standing assumption? Actually….not so. A bunch of nontrivial and important special constructions turn out to fall outside the standing assumption. Ones that I myself had never really looked at, as they are from a field of mathematics rather distant to my own particular non-category-theoretic speciality.

So there really was a non-trivial use case in applications that people cared about, and, so far as I can see, the result was indeed new in this case. It took me a day of returning to the problem a few times, giving the author the benefit of the doubt, really checking I’d read things properly, and giving their thoughts the respect they deserved.

I didn’t think I’d become that guy in category theory who claims everything is a special case or a trivial application of existing work, or that it would be a trivial exercise for someone of my expertise. And I’m glad I stopped to check myself, especially in this case. Yes, in principle I could have proved these results. But I didn’t. I didn’t stop to think if there was this little wrinkle that meant the nice property could fail. And so there is now a new piece of mathematics I think is rather nice, that I didn’t know before. And kudos to the author. May they become established as a valuable part of our research community.

A Diophantine puzzle

I’ve had this piece of paper for a couple of years now. The picture shows some solutions to a relatively low-degree Diophantine equation (no more than degree 4, and probably even just degree 3), and the arrows show how a solution can be modified to create a new solution. The coefficients are integers, of course. What is the equation? I can’t remember what it is!

New solutions from old

Joshi’s quest

Kirti Joshi has been doing what seems to me, as an outsider, interesting work in trying to rehabilitate Mochizuki’s ideas captured in Inter-universal Teichmüller Theory (IUT), in terms of existing mathematics. At least, when I skim through his recent papers on arithmetic deformation theory (using “arithmetic Teichmüller spaces”), they read like normal mathematics, rather than something alien and mysterious. Additionally, there are not (at least so far) hundreds of pages of somewhat unmotivated technical work whose sole purpose is, allegedly, the proof of the abc-conjecture. Personally, early on in the saga, I had hoped that Mochizuki’s work on Frobenioids and what-have-you would turn out to have independent interest, and spark investigations into side-topics, but that has not happened. With Joshi’s work, however, it’s mostly applying existing machinery of p-adic arithmetic geometry with some interesting innovations, and—as far as I can tell—trying to recreate what Mochizuki wanted to do in a much more efficient and understandable way. If this work turns out not to achieve the great heights it aspires to, it would be for the ordinary reasons, and not because the formalism obfuscates what’s going on and the author feels everything is obvious, and chooses to explain why critics are wrong using vastly oversimplified metaphors in non-standard language.

In any case, Joshi has just released a document (disclosure: I was given earlier copies to have a read of, and invited to give comments) that details the philosophy of what his programme of arithmetic deformation theory is trying to achieve, and how. I’m not expert enough to check the technical details, but I don’t find this in the same class of writing as Mochizuki’s “explanations” of IUT; I can follow along, and things look familiar and sensible, regardless of any downstream application to hard Diophantine problems like abc. It would be interesting to get expert comments on this high-level document on its own terms, to probe its robustness, as it currently stands, leaving aside the community’s issues with Mochizuki’s work. I appreciate that references to Mochizuki pervade the linked document, but I think it’s possible to just focus on the mathematics at hand, and treat the links to IUT as a kind of set dressing, providing a level of external motivation.

Bessis on “the secret mathematics”; teaching and research

I thought these quotes (by David Bessis, sourced from this review of his book Mathematica by Michael Harris) were wonderful, worth sharing, and worth keeping in mind when trying to convey mathematical ideas to students. At the very least, they resonate with me!

Understanding mathematics is seeing and feeling, it’s traveling along a secret path that brings us back to the mental plasticity we had as children

When a human being is faced with a mathematical text, the aim is not to read from the first line to the last, as a robot would. The aim is to grasp “the thoughts between the lines” … Mathematical texts are written by humans, for humans. Without our ability to give them meaning, without “the thoughts between the lines,” there would be no mathematical texts.

For Thurston and for all mathematicians, mathematics is a sensual, carnal experience situated upstream from language. Logical formalism is at the heart of the apparatus that makes this experience possible. Mathematics books are unreadable but we need them. They are a tool that allows us to share in writing the true mathematics, the only one that really counts: the secret mathematics, the one that is in our head.

One might say that it’s all very well to have this type of philosophical picture about what mathematics “really is”, but how does that help a student get past a blockage in their current understanding? I believe that being mindful that these types of unspoken, and often unconscious, processes are happening in mathematicians’ (eg the lecturer’s) heads can help us remember that mathematical understanding is a complex thing, not always contained in the lecture notes. We have the opportunity to reflect on and convey our own understanding, our own “secret mathematics”, hard-won through years of persistence, to the students.

Not every student is going to end up as a career mathematician, but for those who do, here is the most important thing:

More than publications or official works, the mathematician’s great creation is the intuition that is the accomplishment of an entire life.

Sadly, of course, the practicalities of life, academia etc mean that the former things are also necessary and not sufficient, but aside from having to ensure mouths are fed and bills are paid, it is indeed such an incorporeal thing that is the most valuable creation of a mathematician. People praise this among those mathematicians who are considered great: their feeling of rightness for new mathematics that leads to the most fruitful research programs (eg: Noether, Atiyah, Grothendieck, Erdős, Conway, Thurston, …. all in their own ways), even above their actual results sometimes.

A study in basepoints: guest post by Kirti Joshi

[The text below the divider is a response by Kirti Joshi to some comments at MathOverflow regarding his recent preprint, “Untilts of fundamental groups: construction of labeled isomorphs of fundamental groups — Arithmetic Holomorphic Structures”. Kirti reached out to me about making a response, and I suggested that a blog post would be better than an answer at MathOverflow, since his text is not in the format of an answer to the question; he agreed. Regular readers of the blog will know that I follow developments around IUT with close interest, but I am not an expert in that area. My long-stated hope is that some interesting mathematics comes out of the whole affair, regardless of specifics about the correctness of Mochizuki’s proof or otherwise. –tHG]


I wish to clarify my work in the context of the discussion here. For this purpose suppose that X is a geometrically connected, smooth quasi-projective variety over a p-adic field, which I will take to be {\mathbb{Q}}_p for simplicity. In Mochizuki’s context this X will additionally be required to be a hyperbolic curve.

  1. First of all let me say this clearly: one cannot fix a basepoint for the tempered fundamental group of X in Mochizuki’s Theory [IUT1–IUT4]. The central role that (arbitrary) basepoints play in Mochizuki’s theory is discussed in (print version) [IUT1, Page 24], and notably the key operations of the theory, namely the log-link and theta-link, change or require arbitrary basepoints on either side of these operations [IUT2, Page 324] (and similar discussion in [IUT3]).
  2. This means one cannot naturally identify the tempered fundamental groups arising from distinct basepoints. [The groups arising from different basepoints are of course abstractly (and non-canonically) isomorphic. Mochizuki does not explicitly track basepoints while requiring them, and this makes his approach extremely complicated.]
  3. In the context of tempered fundamental groups, a basepoint for the tempered fundamental group of X is a morphism of Berkovich spaces \mathcal{M}(K)\to X^{an}_{{\mathbb{Q}}_p}, where K is an algebraically closed complete valued field containing an isometrically embedded {\mathbb{Q}}_p. [Such fields are perfectoid.] As I have detailed in my paper, arbitrary basepoints require arbitrary perfectoid fields K containing an isometrically embedded {\mathbb{Q}}_p. [For experts on Scholze’s Theory of Diamonds, let me say that the datum (X^{an}_{{\mathbb{Q}}_p}, \mathcal{M}(K)\to X^{an}_{{\mathbb{Q}}_p}) required to define the tempered fundamental group with basepoint \mathcal{M}(K)\to X^{an}_{{\mathbb{Q}}_p} is related (by Huber’s work) to a similar datum for the diamond (X^{ad})^\diamond associated to the adic space for X/{\mathbb{Q}}_p.]
  4. In my approach I track basepoints explicitly (because of (1) above) and I demonstrate how basepoints are affected by the key operations of the theory. [This is claimed in Mochizuki’s papers, but I find his proofs of this quite difficult to discern.]
  5. Because basepoints have to be tracked, and tempered fundamental groups arising from distinct basepoints cannot all be naturally identified, assertions which involve arbitrarily identifying fundamental groups arising from distinct basepoints cannot be used to arrive at any conclusion about Mochizuki’s Theory.
  6. In arithmetic geometry one typically works with isomorphism classes of Riemann surfaces i.e. with moduli of Riemann surfaces. Teichmuller space requires a different notion of equivalence and it is possible for distinct points of the classical Teichmuller space to have isomorphic moduli. This is also what happens in my p-adic theory.
  7. There is no linguistic trickery in my paper. I have developed my approach independently of Mochizuki’s group theoretic approach, and my approach is geometric and completely parallels classical Teichmuller Theory. Nevertheless, in its group theoretic aspect, my theory proceeds exactly as described in [IUT1–IUT3] and arrives at all the principal landmarks with added precision, because I bring to bear on the issues the formidable machinery of modern p-adic Hodge Theory due to Fargues–Fontaine, Kedlaya, Scholze and others. This precision allows me to give clear, transparent and geometric proofs of many of the principal assertions claimed in [IUT1–IUT3] without using Mochizuki’s machinery. [Notably my view is that Mochizuki’s Corollary 3.12 should be viewed as consequent to the existence of Arithmetic Teichmuller Spaces (at all primes) as detailed in my papers. One version of Corollary 3.12 is detailed (with proof) in my Constructions II paper. The cited version works with the standard complete Fargues–Fontaine curve {\mathscr{X}}_{{\mathbb{C}_p^\flat},{\mathbb{Q}}_p}, but there is a full version, better adapted for Diophantine applications, which works with the adic Fargues–Fontaine curve {\mathscr{Y}}_{{\mathbb{C}_p^\flat},{\mathbb{Q}}_p} and its finite étale covers, and which exhibits the tensor packet structure of [IUT3, Section 3]; this will appear in Constructions III (the two proofs are similar).] No claims are presently being made about the main result of [IUT4]. That is a work in progress.
  8. Apart from its intrinsic value, my original and independent work provides new evidence regarding Mochizuki’s work and I urge the mathematical community to reexamine his work using the emerging mathematical evidence.
  9. I am happy to talk to any mathematician who is interested in my work and I have written my papers as transparently as possible. [If there are any concrete mathematical objections to my papers, I will be happy to address them.]
  10. I thank David M. Roberts for a number of suggestions which have helped me improve this text.