Complexity and Prediction Part V: The crisis of mathematical paradoxes, Gödel, Turing and the basis of computing

Before the referendum I started a series of blogs and notes exploring the themes of complexity and prediction. This was part of a project with two main aims: first, to sketch a new approach to education and training in general but particularly for those who go on to make important decisions in political institutions and, second, to suggest a new approach to political priorities in which progress with education and science becomes a central focus for the British state. The two are entangled: progress with each will hopefully encourage progress with the other.

I was working on this paper when I suddenly got sidetracked by the referendum and have just looked at it again for the first time in about two years.

The paper concerns a fascinating episode in the history of ideas that saw the most esoteric and impractical of fields, mathematical logic, spawn a revolutionary technology, the modern computer. NB. a great lesson to science funders: it is a mistake to cut funding on theory and assume that you'll get more bang for your buck from 'applications'.

Apart from its inherent fascination, knowing something of the history is helpful for anybody interested in the state-of-the-art in predicting complex systems, which involves the intersection of several fields: maths, computer science, economics, cognitive science, and artificial intelligence. The books on it are either technical, and therefore inaccessible to ~100% of the population, or non-chronological, so it is impossible for someone like me to get a clear picture of how the story unfolded.

Further, there are few if any very deep ideas in maths or science that are so misunderstood and abused as Gödel's results. As Alan Sokal, author of the brilliant hoax exposing post-modernist academics, said, 'Gödel's theorem is an inexhaustible source of intellectual abuses.' I have tried to make clear some of these using the best book available, Torkel Franzén's 'Gödel's Theorem: An Incomplete Guide to Its Use and Abuse', which explains why almost everything you read about it is wrong. If even Stephen Hawking can cock it up, the rest of us should be particularly careful.

I sketched these notes as I tried to pull together the story from many different books. I hope they are useful particularly for some 15-25 year-olds who like chronological accounts of ideas. I tried to put the notes together in the way I wish I had been able to read them at that age. I tried hard to eliminate errors but they are inevitable given how far I am from being competent to write about such things. I wish someone who is competent would do it properly. It would take time I don't now have to go through and finish it the way I originally intended, so I will just post it as it was 2 years ago when I got calls saying 'about this referendum…'

The only change I think I have made since May 2015 is to shove in some notes from a great essay later that year by the man who wrote the textbook on quantum computers, Michael Nielsen, which would be useful to read as an introduction or instead, HERE.

As always on this blog there is not a single original thought and any value comes from the time I have spent condensing the work of others to save you the time. Please leave corrections in comments.

The PDF of the paper is HERE (amended since first publication to correct an error, see Comments).


‘Gödel’s achievement in modern logic is singular and monumental – indeed it is more than a monument, it is a landmark which will remain visible far in space and time.’ John von Neumann.

‘Einstein had often told me that in the late years of his life he has continually sought Gödel’s company in order to have discussions with him. Once he said to me that his own work no longer meant much, that he came to the Institute merely in order to have the privilege of walking home with Gödel.’ Oskar Morgenstern (co-author with von Neumann of the first major work on Game Theory).

‘The world is rational’, Kurt Gödel.

Complexity, ‘fog and moonlight’, prediction, and politics I

‘What can be avoided

Whose end is purposed by the mighty gods? 

Yet Caesar shall go forth, for these predictions 

Are to the world in general as to Caesar.’ 

Julius Caesar, II.2.

‘Ideas thus made up of several simple ones put together, I call Complex; such as are Beauty, Gratitude, a Man, an Army, the Universe.’ Locke.

‘I can calculate the motion of heavenly bodies but not the madness of people.’ Newton, after the South Sea Bubble ‘Ponzi scheme’. 

‘Everything in war is very simple, but the simplest thing is difficult. The difficulties accumulate and end by producing a kind of friction that is inconceivable unless one has experienced war… Countless minor incidents – the kind you can never really foresee – combine to lower the general level of performance, so that one always falls short of the intended goal.  Iron will-power can overcome this friction … but of course it wears down the machine as well… Friction is the only concept that … corresponds to the factors that distinguish real war from war on paper.  The … army and everything else related to it is basically very simple and therefore seems easy to manage. But … each part is composed of individuals, every one of whom retains his potential of friction… This tremendous friction … is everywhere in contact with chance, and brings about effects that cannot be measured… Friction … is the force that makes the apparently easy so difficult… Finally … all action takes place … in a kind of twilight, which like fog or moonlight, often tends to make things seem grotesque and larger than they really are.  Whatever is hidden from full view in this feeble light has to be guessed at by talent, or simply left to chance.’ Clausewitz.

‘It is a wonderful feeling to recognise the unity of complex phenomena that to direct observation appear to be quite separate things.’ Einstein to Grossman, 1901.

‘All stable processes we shall predict. All unstable processes we shall control.’  Von Neumann.

‘Imagine how much harder physics would be if electrons had feelings.’ Richard Feynman.

At the beginning of From Russia With Love (the movie not the book), Kronsteen, a Russian chess master and SPECTRE strategist, is summoned to Blofeld’s lair to discuss the plot to steal the super-secret ‘Lektor Decoder’ and kill Bond. Kronsteen outlines to Blofeld his plan to trick Bond into stealing the machine for SPECTRE.

Blofeld: Kronsteen, you are sure this plan is foolproof?

Kronsteen: Yes it is, because I have anticipated every possible variation of counter-move.

Political analysis is full of chess metaphors, reflecting an old tradition of seeing games as models of physical and social reality. (‘Time is a child moving counters in a game; the royal power is a child’s’, Heraclitus.) A game which has ten different possible moves at each turn and runs for two turns has 10² possible ways of being played; if it runs for fifty turns it has 10⁵⁰ possible ways of being played, ‘a number which substantially exceeds the number of atoms in the whole of our planet earth’ (Holland); if it runs for ninety turns it has 10⁹⁰ possible ways of being played, which is about the estimated number of atoms in the Universe. Chess is merely 32 pieces on an 8×8 grid with a few simple rules but the number of possible games is much greater than 10⁹⁰.
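The combinatorics here are trivial to verify; a minimal sketch in Python (the branching factor of ten is the text's illustrative figure, not chess's actual average):

```python
def game_count(branching: int, turns: int) -> int:
    """Number of distinct ways a game can unfold when `branching`
    moves are available at each of `turns` turns: branching ** turns."""
    return branching ** turns

print(game_count(10, 2))             # 100 games after just two turns
print(len(str(game_count(10, 50))))  # 51 digits: already ~10**50
```

Each extra turn multiplies the count by the branching factor, which is why no strategist can literally anticipate 'every possible variation of counter-move'.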

Many practical problems (e.g. logistics, designing new drugs) are equivalent to the Travelling Salesman Problem (TSP). For any TSP involving travelling to n cities, the number of possible tours when starting with a specific city is (n-1)!/2. For 33 cities, the total number of possible journeys is:

32!/2 = 131,565,418,466,846,765,083,609,006,080,000,000

The IBM Roadrunner, the fastest supercomputer in the world in 2009, could perform 1,457 trillion operations per second. If we could arrange the tours such that examining each one would take only one arithmetical operation, then it would take it ~2.9 trillion years to examine all possible routes between 33 cities, roughly 200 times the estimated age of the Universe. As n grows linearly (add one city, add another, etc.), the number of possible routes grows exponentially. The way in which the number of possible options scales up exponentially as the number of agents scales up linearly, and the difficulty of finding solutions quickly in vast search landscapes, connects to one of the most important questions in maths and computer science, the famous $1 million ‘P=NP?’ Clay Millennium Prize.
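The figures above can be checked directly; a minimal sketch (one arithmetical operation per tour is the text's simplifying assumption, and the operations-per-second figure is Roadrunner's peak rate):

```python
import math

def tsp_tours(n: int) -> int:
    """Distinct TSP tours of n cities from a fixed starting city,
    counting a route and its reverse as one tour: (n - 1)! / 2."""
    return math.factorial(n - 1) // 2

tours = tsp_tours(33)                   # 32!/2
ops_per_second = 1_457_000_000_000_000  # IBM Roadrunner, ~1.457 petaflops
years = tours / ops_per_second / (365.25 * 24 * 3600)
print(f"{tours:,} tours, ~{years:.1e} years to enumerate")
```

Adding a 34th city multiplies the count by 33, so no plausible growth in hardware keeps up with linear growth in n.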

Kronsteen’s confidence, often seen in politics, is therefore misplaced even in chess. It is far beyond our ability to anticipate ‘every possible variation of counter-move’ yet chess is simple compared to the systems that scientists or politicians have to try to understand and predict in order to try to control. These themes of uncertainty, nonlinearity, complexity and prediction have been ubiquitous motifs of art, philosophy, and politics. We see them in Homer, where the gift of an apple causes the Trojan War; in Athenian tragedy, where a chance meeting at a crossroads settles the fate of Oedipus; in Othello’s dropped handkerchief; and in War and Peace with Nikolai Rostov, playing cards with Dolohov, praying that one little card will turn out differently, save him from ruin, and allow him to go happily home to Natasha.

 ‘I know that men are persuaded to go to war in one frame of mind and act when the time comes in another, and that their resolutions change with the changes of fortune…  The movement of events is often as wayward and incomprehensible as the course of human thought; and this is why we ascribe to chance whatever belies our calculation.’ Pericles to the Athenians.

Maths and models

Because of the ‘unreasonable effectiveness of mathematics’ in providing the ‘language of nature’ and foundations for a scientific civilization, we understand some systems very well and can make very precise predictions based on accurate quantitative models. Sometimes a mathematical model predicts phenomena that are only later observed (e.g. General Relativity’s field equations); sometimes an experiment reveals a phenomenon that awaits an effective mathematical model (e.g. the delay between the discovery of superconductivity and a quantum theory of it). The work of mathematicians on ‘pure’ problems has often yielded ideas that have waited to be rediscovered by physicists. The work of Euclid, Apollonius and Archimedes on ellipses would be used nearly two millennia later by Kepler for his theory of planetary motion. The work of Riemann on non-Euclidean geometry was (thanks to Grossmann) used by Einstein for General Relativity. The work of various people since the 16th Century on complex numbers would be used by Heisenberg et al for quantum mechanics in the 1920s.

The work of Cantor, Gödel, and Turing (c. 1874-1936) on the logical foundations of mathematics, perhaps the most abstract and esoteric subject, gave birth to computers. The work of Galois on ‘groups’ (motivated by problems with polynomial equations) would be used post-1945 to build the ‘Standard Model’ of particle physics using ‘symmetry groups’. In a serendipitous 1972 meeting in the Institute for Advanced Study cafeteria, it was discovered that the distribution of prime numbers has a still-mysterious connection with the energy levels of heavy atomic nuclei. G.H. Hardy famously wrote, in ‘A Mathematician’s Apology’ which influenced many future mathematicians, that the field of number theory was happily ‘useless’ and did not contribute to ‘any warlike purpose’; even as he wrote the words, it was secretly being applied to cryptography and it now forms the basis of secure electronic communications among other things. Perhaps another example will be the ‘Langlands Program’ in pure mathematics, developed in the 1960s; work on it is now funded by DARPA (the famous military technology developer) in the hope of practical applications.

Mathematicians invent (or discover?) concepts by abstraction and then discover connections between concepts.* Nature operates with universal laws and displays symmetry and regularity as well as irregularity and randomness.

‘What do we mean by “understanding” something? We can imagine that this complicated array of moving things which constitutes “the world” is something like a great chess game being played by the gods, and we are observers of the game. We do not know what the rules of the game are; all we are allowed to do is to watch the playing. Of course, if we watch long enough, we may eventually catch on to a few of the rules. The rules of the game are what we mean by fundamental physics. Even if we knew every rule, however, we might not be able to understand why a particular move is made in the game, merely because it is too complicated and our minds are limited. If you play chess you must know that it is easy to learn all the rules, and yet it is often very hard to select the best move or to understand why a player moves as he does. So it is in nature, only much more so; but we may be able at least to find all the rules. Actually, we do not have all the rules now. (Every once in a while something like castling is going on that we still do not understand.) Aside from not knowing all of the rules, what we really can explain in terms of those rules is very limited, because almost all situations are so enormously complicated that we cannot follow the plays of the game using the rules, much less tell what is going to happen next. We must, therefore, limit ourselves to the more basic question of the rules of the game. If we know the rules, we consider that we “understand” the world.’ Richard Feynman.

These physical laws, or rules, use mathematicians’ abstractions.**

‘It is an extraordinary feature of science that the most diverse, seemingly unrelated, phenomena can be described with the same mathematical tools. The same quadratic equation with which the ancients drew right angles to build their temples can be used today by a banker to calculate the yield to maturity of a new, two-year bond. The same techniques of calculus developed by Newton and Leibniz three centuries ago to study the orbits of Mars and Mercury can be used today by a civil engineer to calculate the maximum stress on a new bridge… But the variety of natural phenomena is boundless while, despite all appearances to the contrary, the number of really distinct mathematical concepts and tools at our disposal is surprisingly small… When we explore the vast realm of natural and human behavior, we find the most useful tools of measurement and calculation are based on surprisingly few basic ideas.’ Mandelbrot

There is an amazing connection between mathematicians’ aesthetic sense of ‘beauty’ and their success in finding solutions:

‘It is efficient to look for beautiful solutions first and settle for ugly ones only as a last resort… It is a good rule of thumb that the more beautiful the guess, the more likely it is to survive.’ Timothy Gowers.

‘[S]ciences do not try to explain, they hardly even try to interpret, they mainly make models. By a model is meant a mathematical construct which, with the addition of certain verbal interpretations, describes observed phenomena. The justification of such a mathematical construct is solely and precisely that it is expected to work – that is, correctly to describe phenomena from a reasonably wide area. Furthermore, it must satisfy certain aesthetic criteria – that is, in relation to how much it describes, it must be rather simple… If only relatively little has been explained, one will absolutely insist that it should at least be done by very simple and direct means.’ Von Neumann.

Some of these models allow relatively precise predictions about a particular physical system: for example, Newton’s equations for classical mechanics or the equations for ‘quantum electrodynamics’. Sometimes they are statistical predictions that do not say how a specific event will turn out but what can be expected over a large number of trials and with what degree of confidence: ‘the epistemological value of probability theory is based on the fact that chance phenomena, considered collectively and on a grand scale, create a non-random regularity’ (Kolmogorov). The use of statistical models has touched many fields: ‘Moneyball’ in baseball (the replacement of scouts’ hunches by statistical prediction), predicting wine vintages and ticket sales, dating, procurement decisions, legal judgements, parole decisions and so on.

For example, many natural (e.g. height, IQ) and social (e.g. polling) phenomena are governed by a statistical theorem called the Central Limit Theorem (CLT) and produce a ‘normal distribution’, or ‘bell curve’. Fields Medallist Terry Tao describes it:

‘Roughly speaking, this theorem asserts that if one takes a statistic that is a combination of many independent and randomly fluctuating components, with no one component having a decisive influence on the whole, then that statistic will be approximately distributed according to a law called the normal distribution (or Gaussian distribution), and more popularly known as the bell curve.

‘The law is universal because it holds regardless of exactly how the individual components fluctuate, or how many components there are (although the accuracy of the law improves when the number of components increases); it can be seen in a staggeringly diverse range of statistics, from the incidence rate of accidents, to the variation of height, weight, or other vital statistics amongst a species, to the financial gains or losses caused by chance, to the velocities of the component particles of a physical system. The size, width, location, and even the units of measurement of the distribution varies from statistic to statistic, but the bell curve shape can be discerned in all cases.

‘This convergence arises not because of any “low-level” or “microscopic” connection between such diverse phenomena as car crashes, human height, trading profits, or stellar velocities, but because in all of these cases the “high-level” or “macroscopic” structure is the same, namely a compound statistic formed from a combination of the small influences of many independent factors.  This is the essence of universality: the macroscopic behaviour of a large, complex system can be almost totally independent of its microscopic structure.

‘The universal nature of the central limit theorem is tremendously useful in many industries, allowing them to manage what would otherwise be an intractably complex and chaotic system.  With this theorem, insurers can manage the risk of, say, their car insurance policies, without having to know all the complicated details of how car crashes actually occur; astronomers can measure the size and location of distant galaxies, without having to solve the complicated equations of celestial mechanics; electrical engineers can predict the effect of noise and interference on electronic communications,  without having to know exactly how this noise was generated; and so forth.’
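Tao's description is easy to demonstrate numerically; a minimal sketch (sums of uniform random components stand in for any statistic built from many small independent influences):

```python
import random
import statistics

random.seed(0)  # deterministic run

# Each statistic is the sum of 50 independent uniform(0,1) components;
# the CLT says these sums should be approximately normally distributed.
n_components, n_samples = 50, 20_000
sums = [sum(random.random() for _ in range(n_components))
        for _ in range(n_samples)]

mu = statistics.mean(sums)      # theory: n/2 = 25
sigma = statistics.stdev(sums)  # theory: sqrt(n/12) ≈ 2.04

# A normal distribution puts ~68.3% of its mass within one sigma of the mean.
within_1sd = sum(abs(s - mu) <= sigma for s in sums) / n_samples
print(round(mu, 2), round(sigma, 2), round(within_1sd, 3))
```

The fraction within one standard deviation comes out close to the normal law's 68%, even though nothing about a single uniform component is bell-shaped.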

Many other phenomena (e.g. terrorist attacks, earthquakes, stock market panics) produce a ‘power law’, and trusting a CLT model of a phenomenon when it actually follows a power law causes trouble, as with the recent financial crisis. When examining phase transitions of materials (e.g. the transition from water to ice), the patterns formed by atoms are almost always fractals, which appear everywhere from charts of our heartbeats to stock prices to Bach. (Recent work (here) has made breakthroughs in understanding the statistics of phase transitions.)
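The difference shows up in the tails; a hedged sketch comparing the two laws (the Pareto exponent of 2 is illustrative, not fitted to any real market data):

```python
import math

def normal_tail(k: float) -> float:
    """P(X > k) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(k / math.sqrt(2))

def pareto_tail(x: float, alpha: float = 2.0, x_min: float = 1.0) -> float:
    """P(X > x) for a Pareto (power-law) distribution: (x_min / x) ** alpha."""
    return (x_min / x) ** alpha

# A '10-sigma' move is essentially impossible under the normal law,
# but routine under a power law on a comparable scale.
print(f"normal: {normal_tail(10):.1e}   power law: {pareto_tail(10):.1e}")
```

A risk model built on the bell curve therefore treats events that a power-law world produces regularly as once-in-the-age-of-the-universe impossibilities.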

However, even our best understood mathematical models can quickly become practically overwhelming. Laplace voiced a famous expression of the post-Newton Enlightenment faith in science’s potential to predict:

‘We may regard the present state of the universe as the effect of its past and the cause of its future.  An intellect which at a certain moment would know all the forces that animate nature, and all positions of the beings that compose it, if this intellect were vast enough to submit the data to analysis, would condense in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes… Present events are connected with preceding ones by a tie based upon the evident principle that a thing cannot occur without a cause that produces it… All events, even those which on account of their insignificance do not seem to follow the great laws of nature, are a result of it just as necessarily as the revolutions of the sun.’ Laplace

Newton himself had warned of the potential complexity of calculating more than two interacting bodies.

‘The orbit of any one planet depends on the combined motions of all the planets, not to mention the action of all these on each other. But to consider simultaneously all these causes of motion and to define these motions by exact laws allowing of convenient calculation exceeds, unless I am mistaken, the force of the human intellect.’

It turned out that Newton’s famous gravitational equation cannot be extended to just three bodies without producing ‘deterministic chaos’, so although ‘cosmologists can use universal laws of fluid mechanics to describe the motion of entire galaxies, the motion of a single satellite under the influence of just three gravitational bodies can be far more complicated’ (Tao). Deterministic chaos, a system which is ‘sensitive to initial conditions’, was first articulated by Poincaré as he struggled to solve the ‘three-body problem’, and broke Laplace’s dream of perfect understanding and prediction:

‘If one seeks to visualize the pattern formed by these two [solution] curves and their infinite number of intersections, . . .[their] intersections form a kind of lattice-work, a weave, a chain-link network of infinitely fine mesh; … One will be struck by the complexity of this figure, which I am not even attempting to draw. Nothing can give us a better idea of the intricacy of the three-body problem, and of all the problems of dynamics in general…

‘A very small cause which escapes our notice determines a considerable effect that we cannot fail to see, and then we say that that effect is due to chance. If we knew exactly the laws of nature and the situation of the universe at the initial moment, we could predict exactly the situation of that same universe at a succeeding moment.  But even if it were the case that the natural laws had no longer any secret for us, we could still only know the initial situation approximately.  If that enabled us to predict the succeeding situation with the same approximation, that is all we require, and we should say that the phenomenon had been predicted, that it is governed by laws. But it is not always so; it may happen that small differences in the initial conditions produce very great ones in the final phenomena. A small error in the former will produce an enormous error in the latter.  Prediction becomes impossible, and we have the fortuitous phenomenon.’ (Poincaré, Science and Method, 1913)
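Poincaré's point is easy to exhibit with a toy system; a minimal sketch using the logistic map (a standard textbook example of deterministic chaos, not his three-body problem):

```python
def logistic_orbit(x0: float, r: float = 4.0, steps: int = 50) -> list:
    """Iterate the deterministic but chaotic logistic map x -> r*x*(1-x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

# Two initial conditions differing by one part in a billion...
a = logistic_orbit(0.400000000)
b = logistic_orbit(0.400000001)

# ...track each other at first, then separate completely.
print(abs(a[5] - b[5]))                       # still tiny
print(max(abs(x - y) for x, y in zip(a, b)))  # order-one divergence
```

The rule is perfectly known and perfectly deterministic; only the initial condition is slightly uncertain, and that alone destroys long-range prediction – exactly the situation Poincaré describes.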

Even with systems displaying chaos because of sensitivity to initial conditions, short-term predictions are not hopeless. The best example is weather – the study of which was actually the prompt for Lorenz’s re-discovery of ‘chaos’. Weather forecasts have improved greatly over the past fifty years. For example, 25 years ago forecasts of where a hurricane would hit land in three days’ time missed by an average of 350 miles; now they miss by about 100 miles. We have bought ourselves an extra 48 hours to evacuate. Is a weather forecast better than it would be by simply a) looking at historical data (climatology), or b) assuming tomorrow will be similar to today (persistence)? Our forecasts are significantly better until about day 9, when they become no better than looking at historical data.

However, chaos means that beyond the short-term, forecasts rapidly break down, and greater and greater resources are needed to extend them even just a little further; for example, there has been a huge increase in computer processing applied to weather forecasts since the 1950s, just to squeeze an accurate forecast out to Day 9. (Cf. Nate Silver’s ‘The signal and the noise‘ for more details.)

‘Even when universal laws do exist, it may still be practically impossible to use them to predict what happens next.  For instance, we have universal laws for the motion of fluids, such as the Navier-Stokes equations, and these are certainly used all the time in such tasks as weather prediction, but these equations are so complex and unstable that even with the most powerful computers, we are still unable to accurately predict the weather more than a week or two into the future.’ (Tao)

Between the precision of Newtonian mechanics (with a small number of interacting agents) and the statistics of multi-agent systems (such as thermodynamics and statistical mechanics) ‘there is a substantial middle ground of systems that are too complex for fundamental analysis, but too simple to be universal. Plenty of room, in short, for all the complexities of life as we know it’ (Tao).


In England, fewer than 10 percent of each year’s school-leavers have formal training in basics such as ‘normal distributions’ and conditional probability. Less than one percent are well educated in the basics of how the ‘unreasonable effectiveness of mathematics’ provides the language of nature and a foundation for our scientific civilisation. Only a small subset of that <1% then study trans-disciplinary issues concerning complex systems. This number has approximately zero overlap with powerful decision-makers.

Generally, they are badly (or narrowly) educated and trained. Even elite universities offer courses such as PPE that are thought to prepare future political decision-makers but are clearly inadequate and in some ways damaging, giving people like Cameron and Balls false confidence in 1) the value of their acquired bluffing skills and 2) the scientific basis of modern economics’ forecasts. Powerful decision-makers also usually operate in institutions that have vastly more ambitious formal goals than the dysfunctional management could possibly achieve, and which generally select for the worst aspects of chimp politics and against those skills seen in rare successful organisations (e.g. the ability to simplify, focus, and admit errors). Most politicians, officials, and advisers operate with fragments of philosophy, little knowledge of maths or science (few MPs can answer even simple probability questions yet most are confident in their judgement), and little experience in well-managed complex organisations. The skills, and approach to problems, of our best mathematicians, scientists, and entrepreneurs are almost totally shut out of vital decisions.

These issues are connected to the failure of political elites to get big decisions right since the 1860s, as I discussed in The Hollow Men. In Part II next week, I will discuss some of the issues about how Whitehall works that cause so many problems and what can be done to improve this situation. In Part II of this blog, I will explore some more of the science of prediction. But I’d prefer you to look at my essay, from which most of this is taken…

*  This happens in social sciences too. E.g. Brouwer’s fixed-point theorem in topology was first applied to ‘equilibrium’ in economics by von Neumann (1930s), and this approach was copied by Arrow and Debreu in their 1954 paper that laid the foundation for modern ‘general equilibrium theory’ in economics.

** Einstein asked, ‘How is it possible that mathematics, a product of human thought that is independent of experience, fits so excellently the objects of physical reality?’ ‘Is mathematics invented or discovered?’, Tim Gowers (Polkinghorne, 2011). Hilbert, Cantor and Einstein thought it is invented (formalism). Gödel thought it is discovered (Platonism). For a non-specialist summary of many issues concerning maths and prediction, cf. a talk by Fields Medallist Terry Tao. Wigner answered Einstein in a famous paper, ‘The Unreasonable Effectiveness of Mathematics in the Natural Sciences’ (1960).