If you wanted to try to pinpoint the foundations of computing, you could do worse than turn your attention to two papers of 1936: Alonzo Church's "An Unsolvable Problem of Elementary Number Theory" and Alan Turing's "On Computable Numbers, With an Application to the Entscheidungsproblem". Their vastly different approaches to defining the notion of "computability" broke new ground – and somehow turned out to be equivalent.

But Church's and Turing's methods did not arise in a vacuum. Around the turn of the century mathematicians had begun to ask the kind of fundamental, introspective questions about mathematics itself that had been largely glossed over until then; about what could be proved, and whether – given inconvenient truths like Russell's Paradox – it was even consistent. The papers that brought to the attention of the world the untyped lambda calculus and the Turing machine had in fact built towards the same conclusion: that a certain David Hilbert's "decision problem" (Entscheidungsproblem), which he deemed to go "to the essence of mathematical thought", could not be solved. Hilbert had provided tools to enable mathematics itself be the subject of study: without them, in the words of Saunders Mac Lane, there would be "no Gödel's theorem, no definition of computability, no Turing machine, and hence no computers." It's worth considering how his ideas – notably relating to formalism, metamathematics and logic – influenced the work of Gödel, Church and Turing; and how inward-looking, theoretical concerns about the foundations of mathematics helped usher in the modern era in all its practicality.

*

Hilbert's early work was on so-called invariant theory: from Descartes' use of coordinates to describe geometry, a path had been traced to the study of algebraic invariants, properties which remained constant under certain transformations.

He held that every mathematical theory went through easily identifiable stages of development: the naive, the formal and the critical. Ex-lawyers Cayley and Sylvester (who had also coined the related terms covariant and contravariant used today in computing) represented the naive period of invariant theory, with its "immediate joy of first discovery". Clebsch and Gordan represented the formal, as "inventors and perfecters of the symbolic calculation". Hilbert himself stood for the critical, and eschewed their techniques in the spirit of "following Riemann's postulate that proofs should be given through ideas and not voluminous computations."

Instead he answered Gordan's conjecture, the outstanding problem of the day – that all invariants could be generated by a finite number of them, that formed a basis – with a radically different approach: not by actually constructing such a basis, but by proving that it necessarily existed. Gordan saw this not as mathematics but theology (eventually admitting that "theology also has its merits"). By contrast Felix Klein had found it "wholly simple and, therefore, logically compelling." For Hilbert there were sound reasons to use such methods: "The value of pure existence proofs consists precisely in that the individual construction is eliminated by them, and that many different constructions are subsumed under one fundamental idea so that only what is essential to the proof stands out clearly; brevity and economy of thought are the raison d'être of existence proofs."

Although the wind went out of the sails of invariant theory, the concept remained important: you might say that since the nineteenth century mathematics has often emphasized properties that stay the same (invariants), and structures that are the same despite appearances (isomorphic ones). Constancy against a backdrop of change suggests fundamental characteristics. Klein's Erlangen Program characterized geometrical figures by the groups of transformations under which they were invariant. In physics invariance soon played a major role, since as Klein remarked, the special theory of relativity could be thought of as the invariant theory of the group of Lorentz transformations. Einstein's conjecture was that "the general laws of nature are to be expressed by equations which hold good for all systems of coordinates, that is are covariant with respect to any substitution whatever (generally covariant)". In mathematics itself Hilbert's work had a wider reach: as Constance Reid puts it, "the ideas in his work on Gordan's Problem had extended far beyond algebraic invariants in method and significance", over time "[flowering] into the general theories of abstract fields, rings and modules – in short, modern algebra."

Wondering if it might not be simpler

Hilbert arrived in Goettingen, once the university of Gauss and Riemann, in 1895, having been invited by Klein despite his impression of Hilbert as "the most difficult person of all". Hardly in terms of demeanor: he cuts a modern figure, on a warm day arriving at "the lecture hall in a short-sleeved open shirt, an attire inconceivably inappropriate for a professor in that day". Having taken up cycling at age forty-five, he "still preferred to work outdoors... He would work for a while at the big blackboard which hung from his neighbor's wall. Then suddenly he would stop, jump on the bicycle, do a figure-eight around the two circular rose beds, or some other trick. After a few moments of riding, he would throw the bicycle to the ground and return to the blackboard." He had an easy rapport with students and tried to make sure they followed his explanations, repeating himself as often as needed, even from one lesson to the next.

But he was difficult in the sense that he questioned any and all mathematical authority, and insisted on working things out in his own way from first principles. He once remarked: "That I have been able to accomplish anything in mathematics is really due to the fact that I have always found it so difficult. When I read, or when I am told about something, it nearly always seems so difficult, and practically impossible to understand, and then I cannot help wondering if it might not be simpler. And on several occasions it has turned out that it really was more simple!" According to Richard Courant, "Scientifically, he did not grasp complicated things at a flash and absorb them. This kind of talent he did not have. He had to go to the bottom of things." In this need to get to the essentials he resembled the inventors of category theory, Samuel Eilenberg and Saunders Mac Lane. Indeed Mac Lane, who once attended his lectures on philosophy, was described by Colin McLarty as "the last mathematician from Hilbert's Goettingen".

Tables, chairs and beer mugs

Leaving behind his work on invariants and then algebraic number fields, Hilbert revisited a two-thousand-year-old system of thought to view it in a new light. Euclid's Elements had, among other things, taken an axiomatic approach to geometry: all results were presented as logical deductions from a relatively small number of assumptions known as postulates. Hilbert's interest was not so much in geometry per se as in the deductive system that underpinned it. But where Euclid had viewed the postulates of geometry as evident, and representative of physical space, Hilbert made no attempt to relate his axioms to anything, nor did he see the need to as a matter of general principle. In 1891 he had remarked: "One must be able to say at all times instead of points, straight lines, and planes – tables, chairs, and beer mugs." What counted was simply what the rules of logic allowed you to deduce from uninterpreted axioms.

This wasn't the first time mathematicians had cast a fresh look at Euclidean geometry. Gauss had observed, although he had kept the idea to himself so as not to court controversy, that Euclid's parallel postulate could be altered and still yield a consistent geometry; and Lobachevsky had later exhibited such a geometry, where through a single point there was more than one line parallel to a given line in a plane. Others (Pasch, then Peano) had emphasized rigorous deduction from axioms, but in a somewhat idiosyncratic manner. It was Hilbert's succinct exposition and use of familiar Euclidean terms and imagery, in his 1899 book "Foundations of Geometry", that brought home a central fact: the axiomatic method didn't depend on whether the axioms were "true" in any sense. Poincaré's AMS review of the book put it this way: "The logical point of view alone appears to interest Professor Hilbert. Given a sequence of propositions, he finds that all follow logically from the first. With the foundation of this first proposition, with its psychological origin, he does not concern himself... The axioms are postulated; we do not know from whence they come; it is then as easy to postulate A as C." This he saw as a great step forward for the philosophy of mathematics (or if you prefer the original translation, as a "long step in advance").

Just as importantly, Hilbert had taken a step back and considered the axiomatic foundations of geometry as a whole, in a more abstract way. Were the axioms consistent? Were they independent of each other? In the words of Hermann Weyl, "It is one thing to build up geometry on sure foundations, another to inquire into the logical structure of the edifice thus erected. If I am not mistaken, Hilbert is the first who moves freely on this higher 'metageometric' level". These were the beginnings of "metamathematics", the use of mathematical methods to study... mathematics itself.

*

Twenty-three questions

At the turn of the century Hilbert was asked to address the second International Congress of Mathematicians, as Poincaré had during its inauguration. Given free rein he decided (with the encouragement of his close friend Hermann Minkowski, who correctly predicted that mathematicians would be "talking about his lecture decades later") to look not to past or current developments but to the future. Aware of the "deep significance of certain problems for the advance of mathematical science", he chose twenty-three of them as a kind of guide to the journey ahead. By and large they took decades to prove, and some are still unresolved. They included the outstanding open problem of the day, and still of ours: the Riemann Hypothesis, which characterizes the distribution of the prime numbers (through the zeroes of the complex-valued zeta function). Most others he had devised himself; on the whole specific questions of the sort mathematicians were largely accustomed to.

A few though stood apart in their view of mathematics as an entity in its own right, both as an edifice and as a system of thought. Here were the questions which already pointed the way toward computing: the second problem, which asked whether the axioms of arithmetic were consistent, later addressed by Gödel's incompleteness theorem – itself an important stepping stone toward the Turing Machine – and especially the tenth problem, which was not solved until 1970, through the combined efforts of Yuri Matiyasevich, Julia Robinson, Martin Davis and Hilary Putnam. This asked, given a Diophantine equation (a polynomial for which only integer solutions are sought), to "devise a process according to which it can be determined in a finite number of operations whether the equation is solvable". It requests of the mathematician not the solution to a specific equation but a generalized procedure: what we would call an algorithm. It's as much a "decision problem" as the "Entscheidungsproblem" that Church and Turing used as their guide in 1936 (though more related to a register machine than a Turing Machine).

Hilbert's address at the Sorbonne in Paris is worth reading in full. Its observations about mathematics are often insights into other disciplines, computing in particular. On what constitutes a good mathematical problem: "An old French mathematician said: 'A mathematical theory is not to be considered complete until you have made it so clear that you can explain it to the first man whom you meet on the street.' This clarity and ease of comprehension, here insisted on for a mathematical theory, I should still more demand for a mathematical problem if it is to be perfect; for what is clear and easily comprehended attracts, the complicated repels us. Moreover a mathematical problem should be difficult in order to entice us, yet not completely inaccessible, lest it mock our efforts."

On the oft-debated subject of rigor: "It is an error to believe that rigor in the proof is the enemy of simplicity. On the contrary, we find it confirmed by numerous examples that the rigorous method is at the same time the simpler and the more easily comprehended. The very effort for rigor forces us to discover simpler methods of proof. It also frequently leads the way to methods which are more capable of development than the old methods of less rigor." But he argues against too narrow a conception of rigor: it should include geometric as well as arithmetic reasoning, citing "the picture of three points following one another on a straight line" as the faithful representation of "a > b > c", in other words of "the idea 'between'". There's not a huge leap between his view that "geometrical figures are drawn formulas" and the commutative diagrams of category theory.

His remarks on overcoming mathematical difficulties surely apply to other fields in equal measure. Touching upon what Mac Lane would later call the right generality: "If we do not succeed in solving a mathematical problem, the reason frequently consists in our failure to recognize the more general standpoint from which the problem before us appears only as a single link in a chain of related problems. After finding this standpoint, not only is this problem frequently more accessible to our investigation, but at the same time we come into possession of a method which is applicable also to related problems... This way for finding general methods is certainly the most practical and the most certain; for he who seeks for methods without having a definite problem in mind seeks for the most part in vain."

In a similar vein: "In dealing with mathematical problems, specialization plays, as I believe, a still more important part than generalization. Perhaps in most cases where we unsuccessfully seek the answer to a question, the cause of the failure lies in the fact that problems simpler and easier than the one in hand have been either incompletely solved, or not solved at all. Everything depends then, on finding those easier problems and on solving them by means of devices as perfect as possible and of concepts capable of generalization. This rule is one of the most important levers for overcoming mathematical difficulties; and it seems to me that it is used almost always, though perhaps unconsciously."

Near the end of his speech is an observation which if anything applies more to programming than to mathematics: "But, we ask, with the extension of mathematical knowledge will it not finally become impossible for the single investigator to embrace all departments of this knowledge? In answer let me point out how thoroughly it is ingrained in mathematical science that every real advance goes hand in hand with the invention of sharper tools and simpler methods which at the same time assist in understanding earlier theories and cast aside older, more complicated developments. It is therefore possible for the individual investigator, when he masters these sharper tools and simpler methods, to find his way more easily in the various branches of mathematics than is possible in any other science."

There is, finally, a glimpse of why the Diophantine decision problem made the list: the "conviction (which every mathematician shares, but which no one has as yet supported by a proof) that every definite mathematical problem must necessarily be susceptible of an exact settlement... that [its] solution must follow by a finite number of purely logical processes". "Seek its solution. You can find it by pure reason, for in mathematics there is no ignoramibus." (Even if his audience wasn't familiar with the Latin for "we shall not know", they probably could have worked it out from ignoramus, "we do not know".) Hilbert's conviction that logical deduction could settle every mathematical question would hold sway for some time yet.

It wasn't long before Bertrand Russell put the cat amongst the pigeons. But to appreciate the significance of this, we need at least an overview of developments in two fundamental – foundational, even – areas of mathematics: set theory and logic. They're not unrelated.

*

Turning in a circle

No overview however brief of the mathematics underlying computing can reasonably omit the field of logic. Gödel, Church, and Turing were all logicians. And Hilbert was instrumental in the development of what logic became in the twentieth century: mathematical logic. The field is notoriously hard to define, but let's go with: the study of reasoning. While on the face of it logic has always been mathematical in nature, it has had a surprisingly uneven relationship with the rest of mathematics. Perhaps this has something to do with the fact that for centuries, philosophers lay as much claim to it as mathematicians, and its interpretation had to accomodate the subtleties and sometimes ambivalence of natural language. Some of the terminology of logic is grammatical, such as "predicate" – deriving from the division of a sentence into subject and predicate. To read any of Russell's early writings on logic is to be struck by their wordiness and emphasis on language. There would even be a discussion of what the meaning of the word "is" is (he distinguishes the relation of subject and predicate from the expression of identity, since you ask). Russell was a logicist, enthusing in 1903 that "the fact that all Mathematics is Symbolic Logic is one of the greatest discoveries of our age". This rather strange claim to modern ears rested on the idea that set theory was in fact part of logic.

It would take time to more clearly delineate set theory from logic, since it wasn't particularly obvious where one ended and the other began. As described by Weyl, Brouwer's reading of history was that "classical logic was abstracted from the mathematics of finite sets and their subsets". To say that a statement is true is usually to say that a property is true of something, which is also a way of saying that that something is a member of a set (namely everything with the property). From this angle, logical implication relates to subsets, conjunction to set intersection, and so on. Sets themselves can be seen as essentially equivalent to predicates. Church would later say as much, in the terminology of his time: "no purpose is served by maintaining a distinction between classes and singulary propositional functions". Hilbert noted of the intertwining of set theory, logic and arithmetic: "If we observe attentively... we realize that in the traditional exposition of the laws of logic certain fundamental arithmetical notions are already used; for example, the notion of set and, to some extent, also that of number. Thus we find ourselves turning in a circle, and that is why a partly simultaneous development of the laws of logic and of arithmetic is required..."

There was more to classical logic than sets and the interpretation of natural language: Aristotle's syllogisms were in effect "formal" logic, where emphasis is on an argument's form, or shape. The use of variable letters to represent logical inferences emphasized this. Since this observation, though, logic had appeared to be treading water: in the preface to "Critique of Pure Reason", Kant opined that "since Aristotle... logic has not been able to advance a single step, and is thus to all appearance a closed and completed body of doctrine". (Logic to Kant meant "the formal rules of all thought".) It was not until the mid-19th century that Boole renewed visible progress by providing an "algebra of logic".

But as Russell recognized, the major development in logic was due to Frege, who among other things had put his finger on the central importance of quantifiers "for all" and "there exists" (now written ∀ and ∃). Jean van Heijenoort calls his first work "one of the sharpest breaks that ever occurred in the development of a science". As he puts it: "Modern logic began in 1879, the year in which Gottlob Frege... published his Begriffsschrift. In less than ninety pages this booklet presented a number of discoveries that changed the face of logic. The central achievement of the work is the theory of quantification; but this could not be obtained till the traditional decomposition of the proposition into subject and predicate had been replaced by its analysis into function and argument(s)." The view of a logical predicate as a function remains the modern one. He continues: "Of cardinal importance was the realization that, if circularity is to be avoided, logical derivations are to be formal, that is, have to proceed according to rules that are devoid of any intuitive logical force but simply refer to the typographical form of the expressions; thus the notion of formal system made its appearance."

Let's pause to note that even when logic became fully mathematical, shedding the baggage of natural language, and was more clearly separate from set theory, its relationship to mathematics "proper" remained somewhat uneasy. On the face of it this is surprising: logic is highly rigorous, and mathematics is often thought of as the most rigorous of the sciences. For a traditional mathematician, what's not to like? The difference lies in the type of rigor: the impression was that logic tended to use a pernickety, fastidious type of rigor where no step, however small or obvious, could be omitted. Formal verifications of proofs are generally far more involved than what most mathematicians consider to be rigorous proofs. René Thom's definition of a rigorous proof was one which "of a reader who is sufficiently educated and prepared, elicits an impression of obviousness that leads to agreement". If mathematicians are indeed sticklers for rigor, they'd still prefer to focus on what is conceptually interesting and important over minutiae. Otherwise there's the risk of not seeing the forest for the trees, of the meaning of a proof disappearing in a morass of detail. For Bourbaki in 1950, it was clear that "every mathematician knows that a proof has not been truly 'understood' if one has done nothing more than verify step by step the correctness of the deductions of which it is composed and has not tried to gain a clear insight into the ideas which have led to the construction of this particular chain of deductions in preference to every other." (Bourbaki's "scribe" Dieudonné was notoriously dismissive of logic.) This echoed Poincaré's earlier criticism: "When the logician has dissected each proof into a multitude of elementary operations, all of them correct, he won't yet possess the full reality of it: that je ne sais quoi that makes the unity of the proof will elude him entirely... What point is there in admiring the work of the mason if we can't understand the architect's plan? Pure logic can't give us a bird's-eye view, for this we need intuition." While at Goettingen, Mac Lane had written a dissertation which sought to allow formal logic to express something closer to a mathematician's idea of a proof: "A proof is not just a series of individual steps, but a group of steps, brought together according to a definite plan or purpose. So we affirm that every mathematical proof has a leading idea, which determines all the individual steps, and can be given as a plan of the proof." Of course fastidiousness and attention to mind-numbing detail was not always a drawback: it suited a machine just fine. Tarski summarized it this way: "the task of logic is to mechanize thinking".

*

Along the diagonal

At the same time set theory was being developed by Georg Cantor. The most striking, sometimes counterintuitive results concerned infinite sets, and were often obtained by a technique that would become widely used in logic: the diagonal argument.

Infinity is a slippery concept. While most ordinary mathematical objects seem static and available for inspection, infinities race away. The only way to get one to "stand still" long enough to say anything about it is in a relative sense: to race along with it. The ideas behind calculus were only expressible rigorously when Cauchy and Weierstrass provided a (ε, δ)-definition of a limit, with one vanishingly small quantity shadowing another. For infinite sets, there was one tried and true method of keeping pace: a bijection, i.e. a one-to-one correspondence. Here was the first central idea: just as two finite sets had the same number of elements if there was a bijection between them, two infinite sets were of the same size, i.e. had the same "cardinal", if you could establish a bijection. Conversely if you could prove that no bijection could exist between two infinite sets, they necessarily were of different sizes. What no one until then had realized – or at least been able to show – was that some infinities are larger than others.

As early as 1874 Cantor proved a result which would become as important for Turing as it was for Gödel: there are more real numbers ($$ℝ$$) than there are natural numbers ($$ℕ$$). This meant that real numbers were not enumerable: there was no way to establish a bijection between $$ℝ$$ and the bargain-basement, lowest rung of infinities, namely $$ℕ$$. Because this has become associated with diagonalization, it's worth pointing out that the initial proof had nothing to do with it. Charles Petzold gives a good summary of the proof in "The Annotated Turing" (an engrossing account of the backdrop to, and meaning of, Turing's 1936 paper): essentially Cantor supposes that the real numbers can be enumerated, so forming a subscripted (but unordered) list $$ω_0, ω_1, ..., ω_n, ...$$ He then picks arbitrary real numbers $$α$$ and $$β$$, and goes along the omega list until the first two real numbers between $$α$$ and $$β$$ are found, with the lesser of the two called $$α′$$ and the other $$β′$$. Repeating the process, i.e. continuing along the list until $$α″$$ and $$β″$$ are found between $$α′$$ and $$β′$$, and so on, can be carried out indefinitely (otherwise for instance the midpoint between the current lower and upper bounds has been missed). The process tends toward limits: $$α^∞$$ from the lower side, and $$β^∞$$ from the upper side, which are never reached but necessarily equal (for a similar reason). Thus a real number $$η$$ (equal to $$α^∞$$ and $$β^∞$$) has been exhibited which could not have been on the original list; and so the list can't exist.

Even if you can believe that there are (many, many) more reals than natural numbers, intuition is not always a good guide when it comes to infinities. Already Galileo had noted the surprising fact that "there are as many squares as there are numbers", in that there is a one-to-one correspondence ("every number is the root of some square"). Perhaps less intuitive still is that the rational numbers ($$ℚ$$) are enumerable. After all between any two numbers, no matter how close together, there are an infinity of rationals. But if you view rationals as essentially pairs of integers, so $$ℚ$$ as equivalent to the cartesian product $$ℤ^2$$ (technically $$ℤ \times ℤ^*$$ since a rational is the quotient of an integer by a nonzero integer), then rationals are points on a grid; and you can have a path that winds around that grid in roughly the shape of a spiral, eventually going through every point as the radius increases. The same holds for higher dimensions, i.e. integer tuples of any length: $$ℤ^n$$ and hence $$ℕ^n$$ are countable for arbitrary $$n$$ (you can picture, at least for $$n=3$$, the linear progress associated with counting in terms of the radius of an expanding ball instead of a spiral). None of this seemed obviously true, even once you'd proved it: in a letter to Dedekind Cantor wrote: "Je le vois, mais je ne le crois pas" – seeing wasn't necessarily believing.

What then is the diagonal argument? On one level it's about the difference between $$n$$ and $$2^n$$ – another sign of set theory and logic being intertwined with basic arithmetic. $$2^n$$ is always greater than $$n$$, with an explosive exponential growth neatly illustrated by the tale of the chess player who modestly requests payment in grains of rice, starting with a single grain and doubling it for each square to end up with a mere $$2^{64}$$ or so grains. At its simplest the diagonal argument asks us to consider a square table of zeroes and ones, and produces a new row (or column, it makes no difference) which is, by construction, different from the existing rows. This shouldn't be too difficult: if each row has $$n$$ entries, there are $$2^n$$ possible distinct rows, so at least $$2^n - n$$ not already contained in the table. The "anti-diagonal" is simply the most obvious construction: by flipping the values along the $$\backslash$$ diagonal, you ensure that the new row differs from the first row in the first position, from the second row in the second position, and so on. (In a $$3 \times 3$$ table with rows $$001$$, $$110$$, and $$101$$, your new row would be $$100$$.) For a finite table you could just as easily choose, for instance, to use the opposite diagonal. But the advantage of the original diagonal is that it could be used even when there were an infinite number of rows and columns.

This yielded a more visual proof of the uncountability of $$ℝ$$. You assumed that all real numbers between $$0$$ and $$1$$ could be enumerated (this was basically a proxy for positive real numbers, through their inverses), and wrote down the list of their "decimal" expansion in zeroes and ones. This would yield a "square" table in the sense that each row was the (enumerably) infinite expansion of a real number in terms of binary digits, and there was assumed to be an (enumerable) infinity of rows. You constructed the anti-diagonal in just the same way as for a finite table, except that it went on indefinitely. Again, by construction, this new real number differed from the first on the list by its first digit, from the second by its second digit, and on and on. Since it couldn't be on the list, no exhaustive list of the reals existed, and they were uncountable.

The more familiar base 10 decimal expansion of a real number would have worked just as well, but binary digits for their part can be conveniently reinterpreted: as true or false, in or out. Essentially the same argument shows that for any set $$S$$, the power set $$2^S$$ – the set of subsets of $$S$$ – has a larger cardinal than $$S$$. Let's assume to start off with that the set $$S$$ is countable. The same tables of zeroes and ones can be used, but now the row labels and column labels are important: they correspond to the elements of $$S$$, listed symmetrically, although not in any order inherent to that set (as exists in $$ℕ$$). Rows now correspond to the values of a subset's "indicator function": $$1$$ if the element corresponding to the given column is in the subset, $$0$$ if it isn't. (For instance in the $$3 \times 3$$ case you might have $$a$$, $$b$$ and $$c$$ as both column and row labels. Rows $$001$$, $$110$$, and $$101$$ would correspond to subsets $$\{c\}$$, $$\{a, b\}$$ and $$\{a, c\}$$, with the anti-diagonal $$100$$ indicating a new subset $$\{a\}$$.) The elements of $$S$$ used as row labels "index" the subsets represented by the rows: the table thus visualizes a function $$H: S \rightarrow P(S)$$, where $$P(S)$$ is the power set of $$S$$. The diagonal argument shows that $$H$$ cannot be a bijection: the anti-diagonal subset $$S′ = \{x∈S | x ∉H(x)\}$$, made up of elements that are not contained in the subset they index, is an unindexed element of $$P(S)$$. (If it were indexed, we would have, for some $$x'$$: $$H(x′) = S′$$. But then $$x′ ∈ S′$$ would imply $$x′ ∉ S′$$, and $$x′ ∉ S′$$ would imply $$x′ ∈ S′$$, yielding a contradiction.) The pithy expression of the anti-diagonal is in fact one that can be used for any set, countable or not. For once our intuition about infinite sets doesn't let us down: $$2^ℕ$$ is astronomically larger than $$ℕ$$, just as $$2^n$$ is exponentially larger than $$n$$. And there is clearly no largest cardinal.

Diagonalization would become a most effective tool that led to, as Jaroslav Peregrin put it, "a battery of very nontrivial results constituting, as it were, the central nervous system of modern logic". There is though a sense in which the term "diagonal argument" is a bit unfortunate: in that it seems to suggest that the diagonal has characteristics which are somehow unique. In fact in its usual role in proving that one infinite set is immeasurably larger than another, it's much more a "diagonal of convenience". The natural numbers are a negligeable sliver of the reals; once you have exhibited one real number by construction that can't be on your hypothetical list, you can exhibit countless others, for example by altering its leading digits in a way that guarantees it still won't be on the list, producing a "pseudo-diagonal".

This contrasts with what we might think of as a "true diagonal" which alone describes a particular property. For instance, if you kept the square $$ℕ \times ℕ$$ table and viewed it as a quadrant of a grid, with horizontal values as $$x$$ and vertical values as $$y$$, you could fill the table with the values for $$1/(y-x)$$. Except for the diagonal, that is, since for $$y=x$$ that fraction is undefined. If you viewed this quadrant as representing pairs of natural numbers $$〈x,y〉$$, the diagonal represents an infinity, certainly, but an increasingly sparse one compared to all pairs. (In exactly the same way, squares are equinumerous with natural numbers, but less frequent: surely when an infinity is a strict subset of another, it makes sense to talk of "density" - and indeed the notion of "natural" or "asymptotic" density can make this precise.) Viewed as canonical rational numbers – that is, irreducible fractions – the diagonal is basically a singularity: the number $$1$$. This is quite a different kettle of fish.

*

Barbers in need of a shave

Bertrand Russell's highly inconvenient observation to Frege – Russell's Paradox – can be viewed as a version of the diagonal argument. For something which caused as much fallout in mathematics as it did, it's surprisingly easy to state: consider the set of all sets which do not contain themselves. In a similar notation to before, this is $$S = \{x | x ∉ x\}$$. And a contradiction arises in the same way when you look at the case of $$S$$ itself: if you suppose $$S$$ is a member of itself, then by definition it isn't; and if you suppose it isn't, then the definition tells you it is. So there can't be any set that meets this definition – but the problem was, at the turn of the century, that nothing stopped you from defining a set in such a way.

Russell had a finite version of this, the barber paradox. There's a village barber who shaves all those who don't shave themselves; who shaves the barber? We can look at this again using a square table, and yet another interpretation of those zeroes and ones. They now describe whether there is a certain relation between the element listed as a row label, and the element listed as a column label: specifically, whether one shaves the other. Let's denote the relation by the letter $$R$$. If the village inhabitants are (unimaginatively) called $$a$$, $$b$$ and $$c$$, then rows $$001$$, $$110$$ and $$101$$ translate to: $$aRc$$ ($$a$$ shaves $$c$$), $$bRa$$, $$bRb$$, $$cRa$$ and $$cRc$$. The link with the power set table is that this exhaustive list of "who shaves who" also defines subsets of "shavees": for $$a$$, this is $$\{c\}$$, for $$b$$ it's $$\{a,b\}$$ and for $$c$$, $$\{a, c\}$$. The anti-diagonal yields a new subset of shavees: $$\{a\}$$. So far there is no contradiction, you've just picked one of the $$2^n-n$$ possibilities for a different subset. The contradiction comes when you unwisely assert that that subset was on the original list. This occurs through that "lazy" definition of the shaving relation: for a barber $$z$$, you are asserting the equivalence, for any $$x$$: $$zRx \equiv ¬xRx$$. If $$x$$ is allowed to take on the value $$z$$, then $$zRz \equiv ¬zRz$$: the barber shaves himself if and only if he doesn't.

Russell's Paradox asserts a similarly unsatisfiable equivalence: defining $$S$$ as the set $$\{x | x ∉ x\}$$ means that for all $$x$$, $$x∈S \equiv x ∉ x$$. If $$x$$ is allowed to take on the value $$S$$, this becomes $$S∈S \equiv S∉S$$. In Russell's own words: "Let w be the class of all those classes which are not members of themselves. Then, whatever class x may be, 'x is a w' is equivalent to 'x is not an x.' Hence, giving to x the value w, 'w is a w' is equivalent to 'w is not a w.'" One way of viewing a "paradox" is as a roundabout way of defining an unsolvable equation or equivalence. Perhaps the oldest of its kind is the liar's paradox, which in its simplest form states "I am lying". Neither truth value can be assigned to this statement, because it implies the opposite. If $$a$$ is its posited truth value, the unsolvable equation is essentially $$a = ¬a$$. You could imagine using Boolean values, where $$a$$ could be either $$1$$ (true) or $$0$$ (false), and $$a = 1 - a$$ could only be solved by a disallowed "half-truth". The temptation is to view this as a singularity, a kind of division by zero which is simply undefined. But if you allow statements about the truth of other statements, contradictions can be produced by cycles: for instance $$A$$ stating that $$B$$ is true, and $$B$$ stating that $$A$$ is false. You could still view it as a kind of division by zero, but where the zero results from a sum of values in the cycle. You might even be inclined to interpret a paradox as drawing attention to an unsolvable equation among a family of otherwise solvable ones: given two linear equations of the form $$ax + by + c = 0$$, the solution for $$x$$ and $$y$$ is generally the intersection of two lines; but for certain coefficients the lines are parallel and no such solution exists.

*

From falsehood, anything follows

It may seem odd that much attention was paid to what has some of the flavor of a mathematical parlor game, and yet the paradoxes, Russell's Paradox in particular, were the impetus behind major changes in mathematics. The specter they raised was that of shaky mathematical foundations. As systems went, logic was not particularly robust: it was either correct through-and-through, or it was inconsistent, and all bets were off. "From falsehood, anything follows" – once you could prove something like '$$0=1$$', you could prove anything at all, according to the rules of logic.

Russell hadn't been the first to notice so-called antinomies, since Burali-Forti's paradox concerning ordinals dated from 1897, and Zermelo may have told Goettingen colleagues about his version of Russell's Paradox a couple of years later. Now though it was clear that Frege's "Foundations of Arithmetic" had come unstuck, at least for the time being. Interestingly set theory itself deftly sidestepped the issue: by restricting the "comprehension principle", which allowed the construction of a set from any property, to a "separation axiom" which let properties characterize only subsets. The old way of doing things was now known as "naive set theory", and the world moved on. Russell himself confronted the issue head on, leading to the first version of type theory – today so important in computing – in a 1908 paper entitled "Mathematical Logic as based on the Theory of Types", whose opening sentence describes it being motivated "in the first instance by its ability to solve certain contradictions".

Naturally Hilbert, who had inquired into the consistency and solidity of the axiomatic foundations of geometry before this came to light, took the issue to heart. For him Russell's Paradox was having, by 1904, a "downright catastrophic effect" in mathematics. Frege and Dedekind had given up work on set theory, and Weyl would remain troubled by the "inner unreliability of the foundations" of their science. At the third ICM congress that same year, Hilbert argued for the investigation of proof itself, something Poincaré – who had noted dryly that "logic is no longer sterile, it engenders contradictions" – could not see the need for. This was a "crisis of foundations"; not all mathematicians paid heed to it, but it would stay with Hilbert for years to come. In 1925 he remarked: "the present state of affairs where we run up against the paradoxes is intolerable. Just think, the definitions and deductive methods which everyone learns, teaches, and uses in mathematics, the paragon of truth and certitude, lead to absurdities! If mathematical thinking is defective, where are we to find truth and certitude?"

It's striking that some of his ideas of greatest significance for computing – the Diophantine decision problem, the consistency of axiomatic systems and the foundations of mathematics in general – were formulated around the turn of the century, but did not fully resurface until the 1920s. Hilbert though had a habit of focusing on different areas of inquiry by turn. During the first decade of the new century much of his attention turned to integral equations. What he called "spectral theory" – now known as Hilbert spaces – extended the notion of Euclidean space to an infinite number of dimensions; in it geometric concepts could be seen as applying to functions, with functions themselves as vectors, certain integrals as inner products, and notions such as perpendicularity and projection providing intuition that a wholly analytic approach did not. Courant would speak of the "ordering and clarifying effect" of such a "general function theory" – a description which might apply to both category theory and Church's lambda calculus. (There's a hint of a geometrical interpretation of functions in category theory, with its commutative diagrams and arrows – and Mac Lane had after all noted that "at Goettingen a vector was an arrow".) From 1910 to 1922 Hilbert's interests changed even more markedly: he was essentially a physicist (albeit with an axiomatic approach).

When his interest in the foundations of mathematics was renewed (in full public view at least, since he had quietly continued to lecture on the topic), the publication by Russell and Whitehead of Principia Mathematica had been a factor, by providing a thorough – nay, exhaustive – description of the tools in logic that would probably be required. (Perhaps too exhaustive, since Russell later commented: "my intellect never quite recovered from the strain. I have been ever since definitely less capable of dealing with difficult abstractions than I was before. This is part, though by no means the whole, of the reason tor the change in the nature of my work.") But so too had been a disagreement on the philosophy of mathematics with the founder of "intuitionism", L. E. J. Brouwer.

No middle ground

Brouwer was every bit as dismissive of mathematical authority as Hilbert. He placed great importance on the idea of intuition in mathematics, to the point where he saw it as the final arbiter of whether something was true or not. Even though Poincaré broadly shared this emphasis, he fell foul of Brouwer in his "confusion between the act of constructing mathematics and the language of mathematics". For Brouwer, mathematics was a "languageless activity": "Language cannot play a creative role in mathematics; there are no mathematical truths that can be arrived at by linguistic means (such as logic) that could not, at least in principle, have been arrived at in acts of languageless mathematical construction."

His most celebrated disagreement with Hilbert was over his refusal to accept the "law of the excluded middle". This asserts that a proposition is true or it is false; either the proposition or its negation must be true, and there can be no other possibility. Brouwer's stance wasn't so much a desire to revisit the basic notion of truth as it was a principled argument that infinity was different: of a statement that a property held for at least one element of a finite set, intuitionists agreed that in principle you could say whether it was true or false by checking each element; but for an infinite set, you could only ever say for sure that it was true, if you found an element with that property.

Again it is shown that even rigorous mathematicians can be put off by some types of rigor – either the granular argumentation of the logician, or the asceticism of the intuitionist. For Hilbert outlawing proofs that used the law of the excluded middle was tantamount to prohibiting a boxer from "the use of his fists". When Brouwer defended his point of view at Goettingen, Hilbert countered that "with your methods, most of the results of modern mathematics would have to be abandoned, and to me the important thing is not to get fewer results but to get more results." Some of his colleagues were of a similar practical bent: Hand Lewy thought that "it seems there are some mathematicians who lack a sense of humor or have an over-swollen conscience... If we have to go through so much trouble as Brouwer says, then nobody will want to be a mathematician any more." Not all of them found fault with Brouwer's arguments though. For Weyl, Brouwer had correctly identified the origins of logic as "the mathematics of finite sets and their subsets", and that to be "forgetful of this limited origin" in applying it to infinite sets was "justly punished by the antinomies".

Today Brouwer's objections don't seem quite so impractical: they gave rise to so-called intuitionistic logic (perhaps more appropriately known as constructive logic, since the link with intuition has faded) which avoids the use of the law of the excluded middle. It's the type of logic most clearly associated with the Curry-Howard isomorphism, the expression of the deep structural equivalence between computer programs and mathematical proofs.

*

Who needs meaning?

The beginnings of what would become known as the Hilbert Program could be heard in a 1917 speech in Zurich on "Axiomatic Thought", in which he hopes for the completion of the "broad Russellian enterprise of axiomatizing logic". With such an undertaking, as had been the case in his much earlier work on Euclidean geometry, "the examination of consistency is an unavoidable task". At Goettingen he had begun to collaborate with Paul Bernays, in whom he had found a mathematician with similar interests in foundations. The aims that they had set themselves were announced in Zurich, but not the means, since as Bernays later put it, "the theory was not to rely on the current mathematical methods." By 1921 the means had begun to take shape; and Hilbert had now rejected the idea that "number theory as well as set theory were just part of logic".

Weyl though troubled the waters in a paper entitled "On the New Foundational Crisis in Mathematics", in which he advocated joining Brouwer in a "revolution". He had come to the conclusion that "at the center of the superficially glittering and smooth activity" of mathematics lay unstable foundations. But the proposed solution to the antinomies did not agree in the least with Hilbert, who in 1922 replied that "high-ranking and meritorious mathematicians, Weyl and Brouwer, seek the solution of the problem through false ways... They seek to save mathematics by throwing overboard all that which is troublesome... They would chop up and mangle the science." Instead Hilbert, by now sixty, gave what Weyl would call "a completely new turn to the question of foundations", circumventing the concerns of the intuitionists about what had "real meaning" by seemingly... ignoring meaning altogether. This was the formalist approach: the use of systems of axioms and rules of deduction expressed in a well-defined language, such that deriving theorems becomes a rule-based manipulation of the language according to a "calculus".

Undoubtedly this approach now had the hallmarks of a game. What mattered was not what was "true", but what was provable, according to the rules of the game. Syntax, not semantics. In the words of Frank Ramsey, "mathematics proper is thus regarded as a sort of game, played with meaningless marks on paper rather like noughts and crosses". This made it an easy target for criticism from those who, like Brouwer, placed great emphasis on intuition and meaning. Poincaré had not warmed to formal languages in general; Peano's use of symbols he had called "pasigraphy... the art of writing a mathematical treatise without using a single word of ordinary language." For him it was "difficult to admit that the word 'if' acquires, when written ↄ, a virtue it did not possess when written if". Now Hilbert had upped the ante, with a game based on such a formal language. But he was hardly playing "noughts and crosses", as Hardy remarked: "It is impossible to suppose that Hilbert denies the significance and reality of mathematical concepts, and we have the best of reasons for refusing to believe it: 'The axioms and demonstrable theorems,' he says himself, 'which arise in our formalistic game, are the images of the ideas which form the subject-matter of ordinary mathematics.'" The game only really made sense when it was the faithful representation of "proper" mathematics.

This is not to say that there couldn't be a grey area... where formal systems represented ordinary mathematics, sort of. Those who carried out a formalistic approach after Hilbert, including a few of the pioneers of theoretical computing, sometimes played a little fast and loose with the idea of producing faithful images. Haskell Curry (after whom the language Haskell is named) was at Goettingen in 1929, where Hilbert was his thesis advisor, and Bernays a frequent contact; in developing combinatory logic (following Schönfinkel), his idea of formalism became more absolutist than Hilbert's, with little concern for what it was supposed to be the image of. Then there was the lambda calculus: Alonzo Church, who had also studied at Goettingen, developed it as part of more general theory of functions meant to provide his own version of the foundations of mathematics. You can learn about its workings at university and still be none the wiser about this question: what exactly is the lambda calculus? (And for that matter, what's a calculus?) Church answers with a directness often missing from later accounts: "Underlying the formal calculi which we shall develop is the concept of a function... The study of the general properties of functions... is the original motivation for the calculi – but they are so formulated that it is possible to abstract from the intended meaning and regard them merely as formal systems."

A game worth playing

Hilbert presented his formalist approach more clearly in a 1927 speech where he took the opportunity to "round out and develop my thoughts on the foundations of mathematics, which I expounded here one day five years ago and which since then have constantly kept me most actively occupied." The fundamental idea of what he called proof theory was that "all the propositions that constitute mathematics are converted into formulas, so that mathematics proper becomes an inventory of formulas. These differ from the ordinary formulas of mathematics only in that, besides the ordinary signs, the logical signs... also occur in them." These were, in more modern notation, "implies" ($$\rightarrow$$), "and" (&), "or" ($$∨$$), "not" ($$¬$$), "for all" ($$∀$$) and "there exists" ($$∃$$). (The notation at the time used an overline $$‾$$ for negation, $$(x)$$ for all, and ($$Ex)$$ for existence.) There followed an overview of the foundations of mathematical logic, soon to be expounded in "Principles Of Mathematical Logic", in which he and Ackermann made digestible and extended the ideas in Principia Mathematica - resulting in what Jose Ferreiros calls the "first really modern treatise of formal logic". But the last section of the speech was a response to the intuitionists.

In part, this was a defence of the law of the excluded middle – which was not to blame for the paradoxes of set theory, "due merely to the introduction of inadmissible and meaningless notions which are automatically excluded" from proof theory. More significantly, it described why formalism was not the "meaningless game" the intuitionists held it to be. "For this formula game is carried out according to certain definite rules, in which the technique of our thinking is expressed. These rules form a closed system that can be discovered and definitively stated. The fundamental idea of my proof theory is none other than to describe the activity of our understanding, to make a protocol of the rules according to which our thinking actually proceeds. Thinking, it so happens, parallels speaking and writing: we form statements and place them one behind another. If any totality of observations and phenomena deserves to be made the object of a serious and thorough investigation, it is this one – since, after all, it is a part of the task of science to liberate us from arbitrariness, sentiment and habit..."

The advantages of studying a formal system, in which the vagueness and ambiguities of natural language had been replaced by the rules-based manipulation of a limited number of symbols, were becoming clearer. No consideration would need to be given to what the meaning of the word "is" is. Instead the "closed system" would aim to express the mechanics of thinking – and naturally point towards its mechanization. Such a system could be kicked and prodded to see whether the foundations were indeed solid. Chief among concerns was – as ever – whether the system was consistent, so that no contradictory theorems could be derived (Hilbert broadly agreed with Poincaré, that the very notion of mathematical existence rested solely on freedom from contradiction). You would also want to check whether the axioms were independent – none could be expressed in terms of the others – and whether the system was complete: that for any statement A, either A or its negation ¬A was a theorem. This was now possible because the precise formulation of the game in terms of symbols and rules meant that you could reason outside of the system as to the limits of its behavior: this was metamathematics. As Kleene put it, "To Hilbert is due now, first, the emphasis that strict formalization of a theory involves the total abstraction from the meaning, the result being called a formal system or formalism... and second, his method of making the formal system as a whole the object of a mathematical study called metamathematics or proof theory."

Completeness – also known as negation completeness – could be expressed in terms of decidability: a statement A was undecidable if neither A nor ¬A was a theorem; a system was complete if no statement was undecidable, and incomplete otherwise. Hilbert's emphasis on decidability was longstanding, as evidenced by the tenth problem of 1900: to devise a process which in a finite number of steps could determine whether an equation was solvable – or a formula was provable. Not to solve the equation, or to prove the formula, but to assert whether one could do so. As Church and Turing would show, decidability was inherently related to computability. Here was, in a sense, mathematical computing laid bare: given a question represented by a series of symbols within a closed system, to correctly compute a zero, or a one.

This circumscription of the problem, an introspective question into the workings of mathematics, appeared ever more related to representing thought as a mechanical process. Hilbert's assistant Behmann, whose 1921 thesis was on the decision problem, spoke of the requirement that "the path of calculation as a whole should be specified by rules, in other words, an elimination of thinking in favor of mechanical calculation." More tellingly: "One could, if one wanted to, speak of mechanical or machinelike thought (Perhaps one could later let the procedure be carried out by a machine)." For Hilbert decidability was, among the properties of formal systems, the "best-known and the most discussed; for it goes to the essence of mathematical thought." In "Principles Of Mathematical Logic" (1928) the decision problem (Entscheidungsproblem) as familiar to Church and Turing was formulated. Hilbert and Ackermann note that "it may be said that only now is the concept of a system of axioms formulated with precision; for, a complete axiomatization should include not only the setting up of the axioms themselves, but also the exact statement of the logical means which enable us to derive new theorems from the axioms. We will examine... the question of whether every statement which would intuitively be regarded as a consequence of the axioms can be obtained from them by means of the formal method of derivation." As examples of axiomatic systems, they cite the natural numbers (Peano), set theory (Zermelo), and the simpler, first-order systems of geometry (Hilbert) and group theory. "The following question now arises as a fundamental problem: Is it possible to determine whether or not a given statement pertaining to a field of knowledge is a consequence of the axioms?" There follows a chapter devoted to the "the two equivalent problems of universal validity and satisfiability [referred to] by the common name of the decision problem of the restricted predicate calculus... we are justified in calling it the main problem of mathematical logic." As if to emphasize the link with digital computing, discussions about formula satisfiability are made clearer by considering "a domain of two individuals... 0 and 1".

Hilbert made no bones about his insistence on the significance of this particular problem, aware that his influence in mathematics was not limited to his own work. Weyl points out that unlike his predecessors at Goettingen, Gauss and Riemann, who had shown a "taste for solitude", Hilbert was gregarious and sought the "exchange of ideas" with others. He managed, in the words of one contemporary, to persuade "us to consider important those problems which you considered important." The way to solve this crucial problem seemed sure to be metamathematics - the field that according to Tarski he had created "as an independent being", having then "fought for its right to existence, backing it with his whole authority as a great mathematician."

*

Incompleteness

Paying close attention to these developments was logician Kurt Gödel, whose 1929 thesis showed that first-order logic, of the kind now clearly enunciated in Hilbert and Ackermann's textbook, was "complete" in a different sense: that all "universally valid" formulas can be derived from it. But it was a paper he submitted a couple of years later which truly lay the groundwork for Church and Turing, entitled "On Formally Undecidable Propositions of Principia Mathematica and Related Systems". In it he proved his so-called "incompleteness theorems" – as much of a body blow to the Hilbert Program as Russell's Paradox had been to Frege's treatise on arithmetic. To add insult to injury, he had combined a novel use of Hilbert's metamathematics with a dash of, again, the diagonal argument. Far from continuing the seemingly straightforward progress towards putting mathematics on unassailable, solid ground, he had shown that the formal system of arithmetic was fundamentally incomplete; and worse still, that its consistency could not be proved within the system itself.

Gödel's incompleteness theorems may well be the most debated theorems of the twentieth century. The discussions about what they do and don't say, and may or may not imply for the soundness of mathematics as a whole, or philosophical considerations of knowability and truth, show few signs of drawing to a close. Of more particular relevance to computing though are the tools Gödel brings to bear on the problem - and today's practitioners might just be surprised by the familiarity of techniques nearly a century old.

Bits of string

A programmer can be forgiven for imagining that the manipulation of strings belongs to the age of the microprocessor. And yet not only is it a central notion in the earliest symbolic logic, it plays an equally central role in Gödel's proof. This begins with lowly string concatenation (which even on its own leads to a group-like algebraic structure, the so-called "free monoid"). But a more fundamental string transformation in symbolic logic is substitution. It's not confined to logic either: for instance Leibniz's integral notation $$\int_a^b f(x)$$ involved substituting a and b for x. But what mathematicians did more or less subconciously, logicians made explicit. Peano's 1889 axiomatization of the natural numbers included notation such as $$\alpha [x]x'$$ to signify, for a term $$\alpha$$ containing $$x$$, the result of replacing $$x$$ with $$x'$$. Substitution plays a key role in Russell's 1908 expounding of the theory of types, where the notation $$p/a^;x$$ meant "the proposition which results from substituting x for a wherever a occurs in p". It could be extended to $$p/(a,b)^;(x,y)$$: substituting $$x$$ for $$a$$, then $$y$$ for $$b$$. An interest in the meaning of substitution led Haskell Curry towards combinatory logic, the precursor to the lambda calculus that E.J. Hogan described as being "concerned with certain basic notions of the foundations of mathematics which are usually used in an intuitive and unanalysed way." Before setting off to join Hilbert and Bernays in Goettingen, Curry wrote "An analysis of logical substitution".

Still, it can come as somewhat of a surprise that one of the key building blocks of Gödel's proof is the function Subst $$a(^v_b)$$, which he describes as "the formula derived from $$a$$, when we replace $$v$$ in it, wherever it is free, by $$b$$". Gödel may be talking about a "series of basic signs" but he means a "string" in the modern sense. A similar function is present in virtually every modern programming language, known for instance as replace in Python and even subst in the GNU make utility. Of course as a mathematician Gödel finds Subst a tad long and replaces it with the notation $$Sb (x^v_y)$$, noting that $$Sb [Sb (x^v_y)^w_z]$$ can be written $$Sb (x^{v w}_{y z})$$, not a million miles away from Russell's double substitution notation. He wants the notation to be concise, because it is central to his argument and occurs frequently.

How does string replacement come to play such a pivotal role in determining whether arithmetic is consistent or not? Hilbert's "formula game" had reduced the workings of arithmetic to rules-based string manipulation, and this formal system could be analyzed from the outside, in a metamathematical way. In their well-known explanatory book "Gödel's Proof", Ernst Nagel and James Newman compare metamathematics to "meta-chess": a means by which certain statements about chess can be deduced from the rules of the game – that end-game mates can't be forced by two knights alone, for instance. This was along the lines of what Hilbert might have hoped for, to show that the mechanics of manipulating strings to form arithmetic proofs could not possibly result in contradictory statements.

Having done more than anyone else to allow deductive systems to be fully mirrored by a calculus of string transformations, he might not have been overly surprised by Gödel's mapping of these strings and transformations to natural numbers and arithmetic statements. This wasn't that dissimilar from Descartes' mapping of geometric figures to coordinates and algebraic equations. The two worlds coexisted separately, at a respectful distance, with insights (and theorems) from one translating to the other. But Hilbert may not have anticipated Gödel's next move: to bring these worlds together. It was a bit as if the number 10 representing a geometric coordinate was reinterpreted as a line segment and an oval, and incorporated into an existing shape.

If this sounds a bit "random", there's a sense in which it is: Gödel's mapping of strings to natural numbers – the first step by which he achieves the "arithmetization" of formalized arithmetic, and then of metamathematics – forms an essential part of his overall construction, but is not particularly meaningful in and of itself. His invention of a kind of "mathematician's ASCII" was later used by both Church and Turing to map symbols and more generally series of strings to numbers; but there are countless other mappings he could have chosen, the main property of which is that they are injective, so that different strings map to different numbers. (He could for instance presumably have used an ASCII-like valuation of symbols to simply interpret a string as a base-128 number; but this would have called for changes elsewhere in his proof.)

As it was the so-called "Gödel number" of a string had a prime factorization that the string could effectively be read from. If the strings were say "$$x = 0$$" and "$$y > 0$$", then the pseudo-expression $$2^{2^{"x"} \times 3^{"="} \times 5^{"0"}} \times 3^{2^{"y"} \times 3^{">"} \times 5^{"0"}}$$ was the corresponding Gödel number, assuming "$$x$$", "$$=$$", "$$0$$" and "$$>$$" were replaced with the numeric values of those symbols. Gödel kept the list of individual "constant" symbols to a minimum, each mapped to a prime number, with a fair amount of overlap with Hilbert's logical symbols: "~" (not), "$$\vee$$" (or), "$$\Pi$$" (for all), "$$0$$" (zero), "$$s$$" and brackets "$$($$" and "$$)$$". The letter "$$s$$" (or "$$f$$" in the original) meant "the successor of" and referred to the Peano axioms, where natural numbers were defined by multiple applications of the successor function to zero: $$ssss0$$ meant four. As he summarizes it, the "formal system P" under consideration is "essentially the system obtained by superimposing on the Peano axioms the logic of PM" (Russell and Whitehead's Principia Mathematica) – noting however that the addition of the Peano axioms "serves only to simplify the proof and can in principle be dispensed with". The Peano notation of numbers, and nested exponentials, meant that Gödel numbers were not only without numeric significance but also enormous – their only truly relevant property was whether they were equal to another Gödel number or not.

Gödel's mapping of typographical relationships between strings to arithmetic relationships between natural numbers is what R. B. Braithwaite calls the invention of "co-ordinate metamathematics", differing from geometrical cartesian coordinates by being one-dimensional, and limited to whole numbers. Strings and numbers become essentially interchangeable, and he will barely bother to distinguish between the two: but this is wholly justified by a structural equivalence, or isomorphism, that he describes clearly: the mapping "provides an isomorphic image of the system PM in the domain of arithmetic, and all metamathematical arguments can equally well be conducted in the isomorphic image. This occurs in the following outline proof, i.e. 'formula', 'proposition', 'variable', etc. are always to be understood as the corresponding objects of the isomorphic image." In other words he will continue to talk of relationships between strings, and by extension of metamathematical properties such as provability, while presenting arithmetic relationships which are their equivalent – the only reminder will be the use of italics, such as "$$x$$ is the proof of the formula $$y$$" recalling that this is an arithmetic relationship mirroring the "provability" of the formula with Gödel number $$y$$ by the series of formulae with Gödel number $$x$$.

A functional approach

Gödel's setup for proving the incompleteness theorem – the year is 1931 – amounts to a functional computer program. He starts by defining the numeric values for the constant symbols mentioned earlier: "$$0$$", $$"s"$$, "~", "$$v$$", "$$\Pi$$", "$$($$" and "$$)$$" are respectively mapped to 1, 3, 5, 7, 9, 11 and 13. Then "variables of type $$n$$" are assigned numbers of the form $$p^n$$, where $$p$$ is a prime number greater than 13. The three types of variables he has defined are essentially natural numbers, sets of numbers and sets of sets (he uses the term "class" as was customary at the time). For the purposes of his proof, the variables $$x$$ and $$y$$, mapped to 17 and 19 and representing "individual" numbers, will suffice. He notes that a "natural number is thereby assigned in one-to-one correspondence, not only to every basic sign, but also to every finite series of such signs", i.e. to every string. This (Gödel) number he writes $$\phi(a)$$ for a given string $$a$$. If you then consider a relation between strings, or set of strings, $$R (a_1, a_2, ..., a_n)$$, this is mirrored by a relation or set of natural numbers $$R' (x_1, x_2, ..., x_n)$$ such that $$x_i = \phi(a_i)$$. "For all considerations as to content (more especially also of a metamathematical kind)", he will be using "Hilbertian symbolism", specifically that found in Hilbert and Ackermann's 1928 tome on mathematical logic.

This allows a focus on arithmetic functions of numbers rather than typographical functions of strings. For reasons that we'll see shortly, Gödel wants these functions to be "recursive", or what is now known as "primitive recursive". This is the kind of "tame" recursion that never produces an infinite loop, and always yields a value – allowing well-behaved functions to be constructed from basic building blocks. There follows a list of 46 functions, all of which, save the last one, are primitive recursive. They start with a few purely arithmetic "utilities" (properties of divisibility, primeness, being the $$n^{th}$$ prime; definition of a factorial), and move on to the arithmetic images of metamathematical concepts, up to the main property of provability, which alone is not primitive recursive.

The "programming language" may be different, but the general approach Gödel takes in building more complex functions from simple ones will be familiar to any programmer. Function 1 with the comment "$$x$$ is divisible by $$y$$" is defined by $$x/y \equiv (E z) [z ≤ x \; \& \; x = y.z]$$, where today "there exists a $$z$$ (such that)" would be written $$∃ z$$ rather than $$(E z)$$. As throughout his function definitions, the quantifier "there exists" is restricted to apply to a finite number of values, ensuring primitive recursiveness: here the restriction is the condition $$z ≤ x$$. Function 4 (no comment deemed necessary) defines a factorial recursively by $$0! \equiv 1$$ and $$(n+1)! \equiv (n+1).n!$$.

He moves on to the gist of the program: arithmetic functions which describe string functions. These start with fairly simple ones (string length, string concatenation), where the trick is to map a string concept via his particular numbering system, which relies on prime factorizations. For instance function 7 defines the length of string $$x$$ as $$l(x) \equiv \epsilon \, y \, [ y ≤ x \; \& \; y \, Pr \, x > 0 \; \& \; (y+1) \, Pr \, x = 0]$$. The symbol $$\epsilon$$ has been defined such that $$\epsilon \, x \, F(x)$$ is "the smallest number $$x$$ such that $$F(x)$$ holds and $$0$$ if there is no such number." The expression $$n \, Pr \, x$$ has been defined (via function 3) as "the $$n$$-th (in order of magnitude) prime number contained in $$x$$". So the length of a string such as "$$ssss0$$" is derived from the prime factorization of its Gödel number, $$2^3 \times 3^3 \times 5^3 \times 7^3 \times 11^1$$, where there is a fifth prime but not a sixth, yielding $$l(x)$$ = 5.

Similar arithmetic functions that pick apart prime factorizations are used to mirror string concatenation, written $$x * y$$, and the selection of the $$n^{th}$$ string in a series (read: array), which he writes $$n \, Gl \, x$$. (There's even some array bounds checking thrown in: "for $$n > 0$$ and $$n$$ not greater than the length of this series.") The assembly of building blocks continues, each producing a new primitive recursive arithmetic function (or relation) that is the faithful image of a string manipulation function, including the aforementioned $$Sb (x^v_y)$$ substitution. The last such primitive recursive relation is number 45, $$x \, B \, y \equiv Bw (x) \, \& \, [l(x)] \, Gl \, x = y$$. Here $$Bw$$ (for "Beweis") means that $$x$$ is a proof: an arithmetic function has been constructed that reflects the verification of $$x$$ as "a finite series of formulae, of which each is either an axiom or an immediate consequence of two previous ones." The mechanical nature of deduction from axioms means that the entire process can be checked by the arithmetic function $$Bw$$. There remains one step to check that $$x$$ is a proof of $$y$$ in particular (i.e. $$x \, B \, y$$): that the last formula in series $$x$$ is in fact $$y$$. The formula $$[l(x)] \, Gl \, x$$ expresses "last" as "the $$n^{th}$$ in a series of length $$n$$", another familiar programming technique. Finally function 46 expresses what he's after: $$Bew (x) = (E \, y) \, y \, B \, x$$. The comment: "$$x$$ is a provable formula".

The intricate construction of relation $$B$$ (as functional programs go, it's quite dense but not overly long) has provided the equivalent of a boolean function $$Dem (x, y)$$ of Gödel numbers $$x$$ and $$y$$, or alternatively the strings they correspond to, that returns true if the first demonstrates the second. Primitive recursiveness means a value is always produced. The same cannot be said of the last function, $$Bew$$, which is more speculative: although at first glance it seems to address the decision problem, the existential quantifier is for the first time unconstrained, and so is the search space. If your criterion was "provability in under a page" for an arbitrary formula, then you would now have a – very inefficient – way of determining this, by sizing up all strings under a certain length as potential proofs. But you couldn't talk about decidability in absolute terms.

Contrasting with the intricate mechanics of mapping metamathematical concepts to arithmetic ones through the properties of prime factorizations – these relate to the definition of Gödel numbers, which could have been different – is the relatively straightforward main thrust of the proof. Gödel draws attention to its resemblance with well-known antinomies: Richard's paradox and the liar's paradox. There's nothing unique about these though: "every epistemological antimony can likewise be used for a similar undecidability proof." Really the only requirement is some flavor of the diagonal argument. Let's have a look at the basic technique, first as summarized by Nagel and Newman, which gives a good overview; then as described by Braithwaite, in a way that's closer to what Gödel actually did.

*

The first hacker

The general idea of the liar's paradox was that if it was allowed that statements could make assertions about the truth of other statements, then at some point a chain of statements could become a cycle, looping back to the first and "biting its tail". If all statements asserted that the others were true, there was no unique "solution": they could all be true, or all false. If statement $$A$$ asserted that $$B$$ was true, and $$B$$ that $$A$$ was false, then there was no solution at all. In the case of a "cycle" with one statement, if $$A$$ asserted that $$A$$ was true, then nothing had been said. If $$A$$ asserted that $$A$$ was false you had the liar's paradox. Gödel wants a version of this, which he knows can play havoc with Hilbert's carefully laid plans: "a proposition which asserts its own unprovability".

The summary by Nagel and Newman starts with their version of $$Bew (z)$$ (i.e. $$z$$ is demonstrable): the formula '$$(∃ x) Dem (x, z)$$'. To negate this formula and thus state that $$z$$ in unprovable requires just a tilde prefix: '$$\sim (∃ x) Dem (x, z)$$': there is no $$x$$ such that $$x$$ is a proof of $$z$$. They state that "a special case of this formula is not formally demonstrable", and show how it's constructed in two steps (we'll keep the original Gödel number for variable $$y$$, i.e. 19):
$$\;\;$$ • $$\sim (∃ x) Dem (x, Sub (y, 19, y))$$: a formula with Gödel number $$n$$
$$\;\;$$ • $$\sim (∃ x) Dem (x, Sub (n, 19, n))$$: the formula $$G$$, which can't be proven

What is the technique being employed here? First it involves the use of substitution (whose $$Sub$$ notation here draws out the parallels a bit more with programming), but of a particular kind: the "rather circular-seeming idea of substituting a string's own Gödel number into the string itself (and then taking the Gödel number of the result)". This is where Gödel, instead of maintaining two parallel, isomorphic worlds at a distance, mirroring the structures and behavior of one in the other, unexpectedly brings them together. This requires "crossover" function 17, defined as $$Z(n) \equiv n \, N \, [R(1)]$$ and described as the "number-sign for the number $$n$$" – alone among metamathematical functions it contains no reference to a variable $$x$$ or $$y$$. There is now a "round trip" from strings to numbers (through the function $$\phi$$) and back to strings again (through the Peano representation $$Z(n)$$). Gödel is essentially looking for a fixed point: in general for a function $$f$$, this is a value of $$x$$ such that $$f(x) = x$$. Here, if we simplify the function "is unprovable" to $$U$$, and write $$g()$$ for a Gödel number produced by a function $$g$$, then we want the string '$$U(g())$$' to have the Gödel number $$g()$$. If we look at the simplified formulas:
$$\;\;$$ • $$U ( Sub (y, 19, y))$$: a formula with Gödel number $$n$$
$$\;\;$$ • $$U(Sub (n, 19, n))$$: the formula $$G$$
The formula $$Sub (n, 19, n)$$ represents a function that returns a string, by starting with the string of Gödel number $$n$$, and replacing within it the symbol '$$y$$' by the numeral for $$n$$. The string with Gödel number $$n$$ is the first formula, $$U ( Sub (y, 19, y))$$, and replacing '$$y$$' by $$n$$ yields $$U(Sub (n, 19, n))$$, which is in fact $$G$$. So $$G$$ states that $$U(G)$$, i.e. $$G$$ is unprovable.

For the purposes of this part of the proof, the prime factorization and other numeric properties of Gödel number $$n$$ are irrelevant: in programming terms, it functions essentially as a pointer to a string. Its only relevant properties are being able to be dereferenced, yielding a string, and to be represented as a (different) string. You can think of the formulas as looking something like
$$\;\;$$ • $$U ( Sub (y, 19, y))$$: a string at address 0x7fff56b39a7c
$$\;\;$$ • $$U(Sub ($$0x7fff56b39a7c$$, 19,$$ 0x7fff56b39a7c$$))$$: the formula $$G$$
But even this doesn't fully convey the strangeness of what Gödel is doing: instead of a pointer represented by an address, the string twice pasted into $$G$$ is made up of the character "$$s$$" as far as the eye can see, terminated by a zero. Of course he has a very good reason for using such a technique – he needs to engineer an application of the diagonal argument. Let's be honest though, it's the kind of programming that would be flagged by the most cursory of code reviews. Gödel may or may not have been the world's first functional programmer; but there's a strong case to be made that he was the world's first... hacker.

*

It only looks circular

This implementation of the diagonal argument need not, in fact, depend on the use of a numbering system. You can view $$G$$ as
$$\;\;$$ • $$U(Sub (U ( Sub (y, 19, y)", 19, U ( Sub (y, 19, y))"))$$
If you then subsitute, within the string $$U ( Sub (y, 19, y))$$, the string $$U ( Sub (y, 19, y))"$$ for $$y$$, you obtain $$U(Sub (U ( Sub (y, 19, y)", 19, U ( Sub (y, 19, y))"))$$, in other words $$G$$. You get a sense of why Gregory Chaitin thought that "the paper begs to be rewritten in LISP". Torkel Franzén noted that after Gödel this "provable fixpoint" technique became widely used in logic, quoting a method devised by Quine:

"yields a sentence with property $$P$$ when appended to its own quotation." yields a sentence with property $$P$$ when appended to its own quotation.
and known, appropriately enough, as "quining".

All this however added to an overall impression of circularity and self-reference which was not always persuasive. In his 1908 paper on the theory of types, Russell had identified the common characteristic of the antinomies as "self-reference or reflexiveness". His identification of the root cause of paradox as "impredicativity" – a term he invented to describe a statement that couldn't be neatly separated into subject and predicate – was contested; but it did lead him to the first type theory, where a function "of one variable which is of the order next above that of its argument" was known as predicative. The possibility of circularity was avoided by a "hierarchy of functions", where progressing to higher-order types in a sense meant a circle would become an upwards spiral.

The same paper contains an interesting parallel with Gödel's technique. Russell first defines the word "type" as "the collection of arguments for which [a propositional] function has values." His motivation as mentioned is that "the division of objects into types is necessitated by the reflexive fallacies which otherwise arise." There was a standard way to obtain a higher type, namely "the process of generalization, i.e. the substitution of a variable for one of the terms of the proposition, and the assertion of the resulting function for all possible values of the variable." He goes on to say that in practice functions are more convenient than propositions, and "may be obtained from propositions of various orders by the method of substitution." This was his $$p/a^;x$$ notation, essentially the equivalent of Gödel's Subst (also $$Sb$$). There's a similarity with Gödel's two-step technique in producing unprovable statement $$G$$: the two steps in question are generalization and substitution. Gödel's function 15 was $$x \, Gen \, y \equiv R(x) * R(9) * E(y)$$, with the comment "$$x \, Gen \, y$$ is the generalization of $$y$$ by means of the variable $$x$$". This produced the string "$$x \Pi (y)$$", or today's "$$\forall x, y$$". At a key point in the proof Gödel notes that "the operations $$Gen$$ and $$Sb$$ are always commutative, whenever they refer to different variables." For Braithwaite being able to arrive at the same result by substitution then generalization, then the other way around, was the "crux of the argument".

Sound though the technique may be, Gödel is aware that "a proposition which asserts its own unprovability" sets alarm bells ringing: "In spite of appearances, there is nothing circular about such a proposition, since it begins by asserting the unprovability of a wholly determinate formula... and only subsequently (and in some way by accident) does it emerge that this formula is precisely that by which the proposition itself is expressed."

It's just as well that he points this out, because there is an unmistakable impression of circularity, even infinite strings, division by zero... you name it. It's sometimes necessary to remind yourself that the proof has been checked countless times, and even machine-verified. Does not the phrase "There's no proof of this" translate to – really – the infinite string "There's no proof of "There's no proof of "There's ..."? And would it really matter that such an infinite statement is unprovable, even if it's "true"? The (simplified) formula $$Sub (y, 19, y)$$ tends to make programmers wary, particularly any among them who ever tried to replace a substring by the string itself, and had a runtime error that is the string replacement equivalent of division by zero. But in Gödel's carefully constructed world of primitive recursion, there are no infinite loops or strings. Whether or not it gave the appearance of involving a sleight of hand, the first incompleteness theorem was definite proof that certain statements of arithmetic were undecidable ‐ and even if these did not appear to be "common", there was still an infinite number of them, no matter what additional axioms you added. The second incompleteness theorem took it one step further: in the formal system $$P$$, the undecidable formula $$G$$ could in fact be shown to be equivalent to the statement "$$P$$ is consistent". However much Hilbert was banking on it, the consistency of arithmetic could not be proved within arithmetic itself.

*

Just because you saw that Gödel's proof was correct didn't mean you had to sing its praises. Of course, you could admire its construction: Freeman Dyson called it "a soaring piece of architecture, as unique and as lovely as Chartres Cathedral. Gödel took Hilbert's formalized axioms of mathematics as his building blocks and built out of them a lofty structure of ideas into which he could finally insert his undecidable arithmetical statement as the keystone of the arch." But to say that it has divided opinion among mathematicians (and philosophers, and many others) is putting it mildly.

The initial reaction of some mathematicians was an unwillingness to admit defeat. The usually placid Church wrote a 1934 article where he hints at the existence of "a particular set of postulates for symbolic logic, whose freedom from contradiction can be proved", thereby escaping the "unpleasant theorem of Kurt Gödel to the effect that, in the case of any system of symbolic logic which has a claim to adequacy, it is impossible to prove its freedom from contradiction in the way projected in the Hilbert program". He is clearly unsettled by the possibility that "if there is no exact description of logic, then there is no sound basis for supposing that there is such a thing as logic", leading logicians "to despair of the currently accepted search for mathematical rigor." Hence the need for "a way out of this condition of nihilism".

Gödel himself seems almost apologetic about upsetting the apple cart. Towards the end of his paper he states that "it must be expressly noted that" his theorems "represent no contradiction of the formalistic standpoint of Hilbert. For this standpoint presupposes only the existence of a consistency proof effected by finite means, and there might conceivably be finite proofs which cannot be stated in P". For Braithwaite this was a "pious hope", and clarification of Hilbert's notion of a proof "effected by finite means" settled the question. Hilbert's assistant Gentzen may have provided a proof in 1936 of the consistency of arithmetic, but one that relied on the non-constructive method of "transfinite induction" – establishing a hierarchy of proofs in terms of complexity, perhaps vaguely reminiscent of Russell's avoiding circularity by a hierarchy of types. Whether you accepted this technique was, for Kleene, "a matter for individual judgment". Braithwaite emphasized that Gödel's paper "showed the essential limitations imposed upon constructivist formal systems", but also "displayed the power of constructivist methods for establishing metamathematical truths." It constituted "one of the greatest and most surprising of the intellectual achievements" of the twentieth century.

You might think that, over time, a consensus view would have emerged. But not really. At least it became arguable that the theorem was not inextricably linked with self-reference and circularity: Franzén points out that alternative arguments based on computability theory go back at least as far as the MRDP theorem, the long-awaited answer to Hilbert's tenth problem of 1900. This had shown that there was an infinite number of undecidable Diophantine equations: "It is indeed important to emphasize that undecidable arithmetical sentences need not be formalizations of odd self-referential statements, but this point is fully illustrated by the undecidability of statements about Diophantine equations." He pours cold water on the idea that problems of general interest to mathematicians are likely to be unprovable: finding the twin prime conjecture (that there unlimited numbers of pairs of primes differing by 2) to be undecidable would be "a mathematical sensation comparable to the discovery of an advanced underground civilization on the planet Mars."

A more matter-of-fact view was that of Yuri Manin: "Baffling discoveries such as Gödel's incompleteness of arithmetic lose some of their mystery once one comes to understand their content as a statement that a certain algebraic structure simply is not finitely generated with respect to the allowed composition laws." And for a frankly unimpressed William Lawvere, the theorems were "consequences of some very simple algebra in the Cartesian-closed setting. It was always hard for many to comprehend how Cantor's mathematical theorem could be re-christened as a 'paradox' by Russell and how Gödel's theorem could be so often declared to be the most significant result of the 20th century."

But Gödel's proof represents a different – and more interesting – diagonal argument from that which shows the uncountability of $$ℝ$$. Instead of illustrating that $$ℝ$$ is astronomically larger than $$ℕ$$, it represents a "hole" within the natural numbers. It's like removing a thread from an infinite piece of fabric. The missing numbers, those that can be thought of as "not finitely generated" by the composition laws of the formalism, resemble perfect squares: infinite in number, but increasingly less "dense". A simple example of a countable set not being fully generated would be to start with all pairs of natural numbers, i.e. $$ℕ^2$$, map them to sets of two numbers, then again to pairs: along the way the pairs of the form $$〈n,n〉$$ would have disappeared. There is a dual nature to this missing thread: both as a countable infinity, of the same cardinal as any other, and in relative terms, compared to the overall fabric, as a quasi-singularity. True arithmetic statements are countable, and so are all proofs generated by a formal system; by definition there is a bijection between the two, but it's not quite the one that was anticipated by "naive proof theory". In a sense any bijection needs to transit through the wilderness of $$2^ℕ$$, where things don't necessarily go according to plan. For those who were wondering "why?" Gödel summarized it in this way: "the true reason for the incompleteness inherent in all formal systems of mathematics is that the formation of ever higher types can be continued into the transfinite (see Hilbert 1926, page 184), while in any formal system at most denumerably many of them are available."

*

So how did Hilbert take the halt to a decades-long quest to put mathematics on firm foundations, as Gödel twisted his painstakingly constructed formal system into a Viennese pretzel? As it happens, not that well. According to Reid, "Classical mathematics might be consistent and, in fact, probably was; but its consistency could never be established by mathematical proof, as he had hoped and believed it could be. The boundless confidence in the power of human thought which had led him inexorably to this last great work of his career now made it almost impossible for him to accept Gödel's result emotionally... the framework of formalism was not strong enough for the burden he wanted it to carry. At first he was only angry and frustrated, but then he began to try to deal constructively with the problem. Bernays found himself impressed that even now, at the very end of his career, Hilbert was able to make great changes in his program."

It's not uncommon today for mathematicians to think that Hilbert should have seen the signs, and realized that proving consistency from within arithmetic was too much like trying to lift yourself up by your own bootstraps. Perhaps. There are however a number of formal systems that are complete, including the arithmetic of real numbers (this encompasses the arithmetic of natural numbers, but natural numbers themselves are not definable within the system) and so-called "Presburger arithmetic", which allows addition but not multiplication. There's also the view that Hilbert's program was ill-conceived from the start: Dyson wrote that Hilbert had "walked into a blind alley of reductionism" (which drew a stinging rebuke from Mac Lane). Jean-Yves Girard saw formalism as an "extreme form of scientism", and thought it "outrageous" to see mathematics as a "formal discipline". Naturally mathematicians are touchy about the idea that their work could be entirely mechanized – Voevodsky, in presenting a talk on the "controversial" idea that mathematics might actually be inconsistent, was quick to reassure his audience that intuition would always be necessary – but it seems odd to pin that notion on Hilbert. He once explained that a mathematician who had become a writer "did not have enough imagination for mathematics, but he had enough for novels."

As it was, the Hilbert program, begun over a century ago, had now ended in "failure". But it had one heck of a silver lining. As Chaitin puts it, "formal systems did not succeed for reasoning, but they succeeded wonderfully for computation... here we are with these complete formalizations which are computer programming languages, they're everywhere!... So it worked! In another sense, it worked tremendously." There was one more stage in proving that "complete formalization" was a mathematical dead end, but a start of sorts for computing: the proofs by Church and Turing that the decision problem couldn't be solved.

*