Specialist maths schools – some facts

The news reports that the Government will try to promote more ‘specialist maths schools’ similar to the King’s College and Exeter schools.

The idea for these schools came when I read about Perelman, the Russian mathematician who in 2003 suddenly posted on arXiv a solution to the Poincaré Conjecture, one of the most important open problems in mathematics. Perelman went to one of the famous Russian specialist maths schools that were set up by one of the most important mathematicians of the 20th Century, Kolmogorov.

I thought – a) given the fall in standards in maths and physics because of the corruption of the curriculum and exams started by the Tories and continued by Blair, b) the way in which proper teaching of advanced maths and physics is increasingly limited to a tiny number of schools many of which are private, and c) the huge gains for our civilisation from the proper education of the unusual small fraction of children who are very gifted in maths and physics, why not try to set up something similar.

Gove’s team therefore pushed the idea through the DfE. Dean Acheson, US Secretary of State, said, ‘I have long been the advocate of the heretical view that, whatever political scientists might say, policy in this country is made, as often as not, by the necessity of finding something to say for an important figure committed to speak without a prearranged subject.’ This is quite true (it also explains a lot about how Monnet created the ECSC and EEC). Many things that the Gove team did relied on this. We prepared the maths school idea and waited our chance. Sure enough, the word came through from Downing Street – ‘the Chancellor needs an announcement for the Budget, something on science’. We gave them this, he announced it, and bureaucratic resistance was largely broken.

If you are interested in the details, pages 75ff of my 2013 essay have useful links. Other countries have successfully pursued similar ideas, including France for a couple of centuries and Singapore recently.

One of the interesting aspects of trying to get them going was the way in which a) the official ‘education world’ loathed not just the idea but also the idea about the idea – they hated thinking about ‘very high ability’ and specialist teaching; b) when I visited maths departments they all knew about these schools because university departments in the West employ a large number of people who were educated in these schools but they all said ‘we can’t help you with this even though it’s a good idea because we’d be killed politically for supporting “elitism” [fingers doing quote marks in the air], good luck I hope you succeed but we’ll probably attack you on the record.’ They mostly did.

The only reason the King’s project happened is that Alison Wolf made it a personal crusade to defeat all the entropic forces that elsewhere killed the idea (with the exception of Exeter). Without her it would have had no chance. I found few equivalents elsewhere and where I did they were smashed by their VCs.

A few points…

1) Kolmogorov-type schools are a particular thing. They undoubtedly work. But they are aimed at a small fraction of the population. Given what the products of these schools go on to contribute to human civilisation they are extraordinarily cheap. They are also often a refuge for children who have a terrible time in normal schools. If these children were as different from normal kids in a negative sense as they are in a positive sense then there would be no argument about whether they have ‘special needs’.

2) Don’t believe the rubbish in things like Gladwell’s book about maths and IQ. There is now very good data on this particularly in the form of the unprecedented SMPY multi-decade study. Even a short crude test at 11-13 gives very good predictions of who is likely to be very good at maths/physics. Further there is a strong correlation between performance at the top 1% / 1:1,000 / 1:10,000 level and many outcomes in later life such as getting a doctorate, a patent, writing a paper in Science and Nature, high income, health etc. The education world has been ~100% committed to rejecting the science of this subject though this resistance is cracking.

This chart shows the SMPY results (maths ability at 13) for the top 1% of maths ability broken down into quartiles 1-4: the top quartile of the top 1% clearly outperforms on tenure, publication and patent rates.

[Image: screenshot-2017-01-23-11-53-01]

3) The arguments for Kolmogorov schools do not translate to arguments for selection in general – i.e. they are specific to the subject. It is the structure of maths and the nature of the brain that allows very young people to make rapid progress. These features are not there for English, history and so on. I am not wading into the grammar school argument on either side – I am just pointing out the fact that the arguments for such maths schools are clear but should not be confused with the wider arguments over selection that involve complicated trade-offs. People on both sides of the grammar debate should, if rational, be able to support this policy.

4) These schools are not ‘maths hot houses’. Kolmogorov took the children to see Shakespeare plays, to hear music and so on. It is important to note that the teaching of English and other subjects is normal, except that you are obviously dealing with unusually bright children. If these children are not in specialist schools, then the solution is a) specialist maths teaching (including help from university-level mathematicians) and b) keeping other aspects of their education normal. Arguably the greatest mathematician in the world, Terry Tao, had wise parents and enjoyed this combination. So it is of course possible to educate such children without specialist schools but the risks are higher that either parents or teachers cock it up.

5) Extended wisely across Britain they could have big benefits not just for those children and elite universities but they could also play an important role in raising standards generally in their area by being a focus for high quality empirical training. One of the worst aspects of the education world is the combination of low quality training and resistance to experiments. This has improved since the Gove reforms but the world of education research continues to be dominated by what Feynman called ‘cargo cult science’.

6) We also worked with a physicist at Cambridge, Professor Mark Warner, to set up a project to improve the quality of 6th form physics. This project has been a great success thanks to his extraordinary efforts and the enthusiasm of young Cambridge physicists. Thousands of questions have been answered on their online platform from many schools. This project gives kids the chance to learn proper problem solving – that is the core skill that the corruption of the exam system has devalued and increasingly pushed into a ghetto of private education. Needless to say the education world was also hostile to this project. Anything that suggests that we can do much much better is generally hated by all elements of the bureaucracy, including even elements such as the Institute of Physics that supposedly exist to support exactly this. A handful of officials helped us push through projects like this and of course most of them have since left Whitehall in disgust. Thus does the system protect itself against improvement while promoting the worst people.

7) This idea connects to a broader idea. Kids anywhere in the state system should be able to apply some form of voucher to buy high quality advanced teaching from outside their school for a wide range of serious subjects from music to physics.

8) One of the few projects that the Gove team tried and failed to get going was to break the grip of GCSEs on state schools (Cameron sided with Clegg and although we cheated a huge amount through the system we hit a wall on this project). It is extremely wasteful for the system and boring for many children for them to be focused on existing exams that do not develop serious skills. Maths already has the STEP paper. There should be equivalents in other subjects at age 16. There is nothing that the bureaucracy will fight harder than this and it will probably only happen if excellent private schools decide to do it themselves and political pressure then forces the Government to allow state schools to do them.

Any journalists who want to speak to people about this should try to speak to Dan Abramson (the head of the King’s school), Alison Wolf, or Alexander Borovik (a mathematician at Manchester University who attended one of these schools in Russia).

It is hopeful that No10 is backing this idea but of course they will face determined resistance. It will only happen if at least one special adviser in the DfE makes it a priority and has the support of No10 so officials know they might as well fight about other things…


This is the most interesting comment probably ever left on this blog and it is much more interesting than the blog itself so I have copied it below. It is made by Borovik, mentioned above, who attended one of these schools in Russia and knows many who attended similar…

‘There is one more aspect of (high level) selective specialist mathematics education that is unknown outside the professional community of mathematicians.

I am not an expert on “gifted and talented” education. On the other hand, I spent my life surrounded by people who got exclusive academically selective education in mathematics and physics, whether it was in the Lavrentiev School in Siberia, or Lycée Louis-le-Grand in Paris, or Fazekas in Budapest, or Galatasaray Lisesi (aka Lycée de Galatasaray) in Istanbul — the list can be continued.

The schools have nothing in common, with the exception of being unique, each one in its own way.

I had research collaborators and co-authors from each of the schools that I listed above. Why was it so easy for us to find a common language?

Well, the explanation can be found in the words of Stanislas Dehaene, the leading researcher of neurophysiology of mathematical thinking:

“We have to do mathematics using the brain which evolved 30 000 years ago for survival in the African savanna.”

In humans, the speed of totally controlled mental operations is at most 16 bits per second. Standard school maths education trains children to work at that speed.

The visual processing module in the brain crunches 10,000,000,000 bits per second.

I offer a simple thought experiment to the readers who have some knowledge of school level geometry.

Imagine that you are given a triangle; mentally rotate it about the longest side. What is the resulting solid of revolution? Describe it. And then try to reflect: where the answer came from?

The best kept secret of mathematics: it is done by subconsciousness.

Mathematics is a language for communication with subconsciousness.

There are four conversants in a conversation between two mathematicians: two people and two their “inner”, “intuitive” brains.

When mathematicians talk about mathematics face-to-face, they frequently use language which
* is very fluid and informal;
* is improvised on the spot;
* includes pauses (to a lay observer, very strange and awkwardly timed) for absorption of thought;
* has almost nothing in common with standardised mathematics “in print”.

Mathematician is trying to convey a message from his “intuitive brain” directly to his colleagues’ “intuitive brain”.

Alumni of high level specialist mathematics schools are “birds of feather” because they have been initiated into this mode of communication at the most susceptible age, as teenagers, at the peak of intensity of their socialisation / shaping group identity stream of self-actualisation.

In that aspect, mathematics is not much different from arts. Part of the skills that children get in music schools, acting schools, dancing school, and art schools is the ability to talk about music, acting, dancing, art with intuitive, subconscious parts of their minds — and with their peers, in a secret language which is not recognised (and perhaps not even registered) by uninitiated.

However, specialist mathematics schools form a continuous spectrum from just ordinary, with standard syllabus, but good schools with good maths teachers to the likes of Louis-le-Grand and Fazekas. My comments apply mostly to the top end of the spectrum. I have a feeling that the Green Paper is less ambitious and does not call for setting up mathematics boarding schools using Chetham’s School of Music as a model. However, middle tier maths school could also be very useful — if they are set up with realistic expectations, properly supported, and have strong connections with universities.’

A Borovik


Unrecognised simplicities of effective action #1: expertise and a quadrillion dollar business

‘The combination of physics and politics could render the surface of the earth uninhabitable.’ John von Neumann.

Introduction

This series of blogs considers:

  • the difference between fields with genuine expertise, such as fighting and physics, and fields dominated by bogus expertise, such as politics and economic forecasting;
  • the big big problem we face – the world is ‘undersized and underorganised’ because of a collision between four forces: 1) our technological civilisation is inherently fragile and vulnerable to shocks, 2) the knowledge it generates is inherently dangerous, 3) our evolved instincts predispose us to aggression and misunderstanding, and 4) there is a profound mismatch between the scale and speed of destruction our knowledge can cause and the quality of individual and institutional decision-making in ‘mission critical’ institutions – our institutions are similar to those that failed so spectacularly in summer 1914 yet they face crises moving at least ~10³ times faster and involving ~10⁶ times more destructive power able to kill ~10¹⁰ people;
  • what classic texts and case studies suggest about the unrecognised simplicities of effective action to improve the selection, education, training, and management of vital decision-makers to improve dramatically, reliably, and quantifiably the quality of individual and institutional decisions (particularly a) the ability to make accurate predictions and b) the quality of feedback);
  • how we can change incentives to aim a much bigger fraction of the most able people at the most important problems;
  • what tools and technologies can help decision-makers cope with complexity.

[I’ve tweaked a couple of things in response to this blog by physicist Steve Hsu.]

*

Summary of the big big problem

The investor Peter Thiel (founder of PayPal and Palantir, early investor in Facebook) asks people in job interviews: what billion (10⁹) dollar business is nobody building? The most successful investor in world history, Warren Buffett, illustrated what a quadrillion (10¹⁵) dollar business might look like in his 50th anniversary letter to Berkshire Hathaway investors.

‘There is, however, one clear, present and enduring danger to Berkshire against which Charlie and I are powerless. That threat to Berkshire is also the major threat our citizenry faces: a “successful” … cyber, biological, nuclear or chemical attack on the United States… The probability of such mass destruction in any given year is likely very small… Nevertheless, what’s a small probability in a short period approaches certainty in the longer run. (If there is only one chance in thirty of an event occurring in a given year, the likelihood of it occurring at least once in a century is 96.6%.) The added bad news is that there will forever be people and organizations and perhaps even nations that would like to inflict maximum damage on our country. Their means of doing so have increased exponentially during my lifetime. “Innovation” has its dark side.

‘There is no way for American corporations or their investors to shed this risk. If an event occurs in the U.S. that leads to mass devastation, the value of all equity investments will almost certainly be decimated.

‘No one knows what “the day after” will look like. I think, however, that Einstein’s 1949 appraisal remains apt: “I know not with what weapons World War III will be fought, but World War IV will be fought with sticks and stones.”’
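Buffett's parenthetical figure can be checked in a few lines. This is a minimal sketch that assumes the annual probability is constant and each year is independent, which is the simplification his illustration implies; the function name is just for this example.

```python
def prob_at_least_once(annual_prob: float, years: int) -> float:
    """Chance that an event occurs at least once in `years` years,
    assuming the same independent probability every year."""
    return 1.0 - (1.0 - annual_prob) ** years


# Buffett's illustration: one chance in thirty per year, over a century.
p_century = prob_at_least_once(1 / 30, 100)
print(f"1-in-30 per year over 100 years: {p_century:.1%}")

# Hypothetical smaller annual risks (not from the letter), for comparison.
for one_in in (100, 1000):
    p = prob_at_least_once(1 / one_in, 100)
    print(f"1-in-{one_in} per year over 100 years: {p:.1%}")
```

The first line reproduces the 96.6% in the letter. Even a hypothetical 1-in-100 annual risk comes out at roughly 63% over a century, which is the point about small probabilities compounding.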

Politics is profoundly nonlinear. (I have written a series of blogs about complexity and prediction HERE which are useful background for those interested.) Changing the course of European history via the referendum only involved about 10 crucial people controlling ~£10⁷ while its effects over ten years could be on the scale of ~10⁸ – 10⁹ people and ~£10¹²: like many episodes in history the resources put into it are extremely nonlinear in relation to the potential branching histories it creates. Errors dealing with Germany in 1914 and 1939 were costly on the scale of ~100,000,000 (10⁸) lives. If we carry on with normal human history – that is, international relations defined as out-groups competing violently – and combine this with modern technology then it is extremely likely that we will have a disaster on the scale of billions (10⁹) or even all humans (~10¹⁰). The ultimate disaster would kill about 100 times more people than our failure with Germany. Our destructive power is already much more than 100 times greater than it was then: nuclear weapons increased destructiveness by roughly a factor of a million.

Even if we dodge this particular bullet there are many others lurking. New genetic engineering techniques such as CRISPR allow radical possibilities for re-engineering organisms including humans in ways thought of as science fiction only a decade ago. We will soon be able to remake human nature itself. CRISPR-enabled ‘gene drives’ enable us to make changes to the germ-line of organisms permanent such that changes spread through the entire wild population, including making species extinct on demand. Unlike nuclear weapons such technologies are not complex, expensive, and able to be kept secret for a long time. The world’s leading experts predict that people will be making them cheaply at home soon – perhaps they already are. These developments have been driven by exponential progress much faster than Moore’s Law reducing the cost of DNA sequencing per genome from ~$10⁸ to ~$10³ in roughly 15 years.

[Image: screenshot-2017-01-16-12-24-13]

It is already practically possible to deploy a cheap, autonomous, and anonymous drone with facial-recognition software and a one gram shaped-charge to identify a relevant face and blow it up. Military logic is driving autonomy. For example, 1) the explosion in the volume of drone surveillance video (from 71 hours in 2004 to 300,000 hours in 2011 to millions of hours now) requires automated analysis, and 2) jamming and spoofing of drones strongly incentivise a push for autonomy. It is unlikely that promises to ‘keep humans in the loop’ will be kept. It is likely that state and non-state actors will deploy low-cost drone swarms using machine learning to automate the ‘find-fix-finish’ cycle now controlled by humans. (See HERE for a video just released for one such program and imagine the capability when they carry their own communication and logistics network with them.)

In the medium-term, many billions are being spent on finding the secrets of general intelligence. We know this secret is encoded somewhere in the roughly 125 million ‘bits’ of information that is the rough difference between the genome that produces the human brain and the genome that produces the chimp brain. This search space is remarkably small – the equivalent of just 25 million English words or 30 copies of the King James Bible. There is no fundamental barrier to decoding this information and it is possible that the ultimate secret could be described relatively simply (cf. this great essay by physicist Michael Nielsen). One of the world’s leading experts has told me they think a large proportion of this problem could be solved in about a decade with a few tens of billions and something like an Apollo programme level of determination.
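The size comparisons in this paragraph can be sanity-checked with simple arithmetic. This is a rough sketch: the 125 million bits and 25 million words come from the text, while the King James Bible word count (about 783,000) is an assumption based on commonly cited figures, and the variable names are only for this example.

```python
GENOME_DIFF_BITS = 125_000_000   # from the text: human vs chimp difference
ENGLISH_WORDS = 25_000_000       # from the text: equivalent in English words
KJV_WORDS = 783_000              # assumption: commonly cited approximate word count

# Bits per English word implied by the text's own comparison.
bits_per_word = GENOME_DIFF_BITS / ENGLISH_WORDS

# How many Bibles' worth of words that is, under the assumed word count.
bible_copies = ENGLISH_WORDS / KJV_WORDS

# The same information expressed as storage, at 8 bits per byte.
megabytes = GENOME_DIFF_BITS / 8 / 1_000_000

print(f"implied bits per word: {bits_per_word:.1f}")
print(f"King James Bible copies: {bible_copies:.0f}")
print(f"storage: {megabytes:.1f} MB")
```

Under these assumptions the comparison implies about 5 bits per word and a little over 30 Bibles, consistent with the figures above, and the whole search space would fit in under 16 MB.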

Not only is our destructive and disruptive power still getting bigger quickly – it is also getting cheaper and faster every year. The change in speed adds another dimension to the problem. In the period between the Archduke’s murder and the outbreak of World War I a month later it is striking how general failures of individuals and institutions were compounded by the way in which events moved much faster than the ‘mission critical’ institutions could cope with such that soon everyone was behind the pace, telegrams were read in the wrong order and so on. The crisis leading to World War I was about 30 days from the assassination to the start of general war – about 700 hours. The timescale for deciding what to do between receiving a warning of nuclear missile launch and deciding to launch yourself is less than half an hour and the President’s decision time is less than this, maybe just minutes. This is a speedup factor of at least 10³.
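The speedup factor follows directly from the two timescales. This is a minimal sketch using the 30 days and half an hour from the text; the ten-minute presidential decision window is a hypothetical value chosen only to illustrate ‘maybe just minutes’.

```python
import math

HOURS_PER_DAY = 24

crisis_1914_hours = 30 * HOURS_PER_DAY   # about a month, assassination to general war
nuclear_decision_hours = 0.5             # upper bound given in the text

speedup = crisis_1914_hours / nuclear_decision_hours
print(f"speedup vs half an hour: {speedup:.0f}x (about 10^{math.log10(speedup):.1f})")

# Hypothetical: a presidential decision window of ten minutes.
president_hours = 10 / 60
speedup_president = crisis_1914_hours / president_hours
print(f"speedup vs ten minutes: {speedup_president:.0f}x")
```

With half an hour as the upper bound the ratio is already above a thousand, so ‘at least 10³’ is, if anything, conservative.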

Economic crises already occur far faster than human brains can cope with. The financial system has made a transition from people shouting at each other to a system dominated by high frequency ‘algorithmic trading’ (HFT), i.e. machine intelligence applied to robot trading with vast volumes traded on a global spatial scale and a microsecond (10⁻⁶) temporal scale far beyond the monitoring, understanding, or control of regulators and politicians. There is even competition for computer trading bases in specific locations based on calculations of Special Relativity as the speed of light becomes a factor in minimising trade delays (cf. Relativistic statistical arbitrage, Wissner-Gross). ‘The Flash Crash’ of 6 May 2010 saw the Dow lose hundreds of points in minutes. Mini ‘flash crashes’ now blow up and die out faster than humans can notice. Given our institutions cannot cope with economic decisions made at ‘human speed’, a fortiori they cannot cope with decisions made at ‘robot speed’. There is scope for worse disasters than 2008 which would further damage the moral credibility of decentralised markets and provide huge chances for extremist political entrepreneurs to exploit. (* See endnote.)

What about the individuals and institutions that are supposed to cope with all this?

Our brains have not evolved much in thousands of years and are subject to all sorts of constraints including evolved heuristics that lead to misunderstanding, delusion, and violence particularly under pressure. There is a terrible mismatch between the sort of people that routinely dominate mission critical political institutions and the sort of people we need: high-ish IQ (we need more people >145 (+3SD) while almost everybody important is between 115-130 (+1 or 2SD)), a robust toolkit for not fooling yourself including quantitative problem-solving (almost totally absent at the apex of relevant institutions), determination, management skills, relevant experience, and ethics. While our ancestor chiefs at least had some intuitive feel for important variables like agriculture and cavalry our contemporary chiefs (and those in the media responsible for scrutiny of decisions) generally do not understand their equivalents, and are often less experienced in managing complex organisations than their predecessors.
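To give a sense of how rare the +3SD group is relative to the +1 to +2SD band, the tail fractions can be computed directly. This is a sketch that assumes IQ is normally distributed with mean 100 and standard deviation 15 (the convention implied by 145 = +3SD); the function name is just for this example.

```python
import math


def fraction_above(sd: float) -> float:
    """Upper-tail fraction of a normal distribution beyond `sd` standard deviations."""
    return 0.5 * math.erfc(sd / math.sqrt(2))


above_145 = fraction_above(3.0)                               # IQ above 145
between_115_130 = fraction_above(1.0) - fraction_above(2.0)   # IQ between 115 and 130

print(f"above +3SD: {above_145:.3%} (about 1 in {1 / above_145:.0f})")
print(f"between +1SD and +2SD: {between_115_130:.1%}")
```

Under that assumption the +3SD group is roughly 1 in 740 of the population, while the +1 to +2SD band is roughly 1 in 7, so the band is about a hundred times more common.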

The national institutions we have to deal with such crises are pretty similar to those that failed so spectacularly in summer 1914 yet they face crises moving at least ~10³ times faster and involving ~10⁶ times more destructive power able to kill ~10¹⁰ people. The international institutions developed post-1945 (UN, EU etc) contribute little to solving the biggest problems and in many ways make them worse. These institutions fail constantly and do not – cannot – learn much.

If we keep having crises like we have experienced over the past century then this combination of problems pushes the probability of catastrophe towards ‘overwhelmingly likely’.

*

What Is To Be Done? There’s plenty of room at the top

‘In a knowledge-rich world, progress does not lie in the direction of reading information faster, writing it faster, and storing more of it. Progress lies in the direction of extracting and exploiting the patterns of the world… And that progress will depend on … our ability to devise better and more powerful thinking programs for man and machine.’ Herbert Simon, Designing Organizations for an Information-rich World, 1969.

‘Fascinating that the same problems recur time after time, in almost every program, and that the management of the program, whether it happened to be government or industry, continues to avoid reality.’ George Mueller, pioneer of ‘systems engineering’ and ‘systems management’ and the man most responsible for the success of the 1969 moon landing.

Somehow the world has to make a series of extremely traumatic and dangerous transitions over the next 20 years. The main transition needed is:

Embed reliably the unrecognised simplicities of high performance teams (HPTs), including personnel selection and training, in ‘mission critical’ institutions while simultaneously developing a focused project that radically improves the prospects for international cooperation and new forms of political organisation beyond competing nation states.

Big progress on this problem would automatically and for free bring big progress on other big problems. It could improve (even save) billions of lives and save a quadrillion dollars (~$10¹⁵). If we avoid disasters then the error-correcting institutions of markets and science will, patchily, spread peace, prosperity, and learning. We will make big improvements with public services and other aspects of ‘normal’ government. We will have a healthier political culture in which representative institutions, markets serving the public (not looters), and international cooperation are stronger.

Can a big jump in performance – ‘better and more powerful thinking programs for man and machine’ – somehow be systematised?

Feynman once gave a talk titled ‘There’s plenty of room at the bottom’ about the huge performance improvements possible if we could learn to do engineering at the atomic scale – what is now called nanotechnology. There is also ‘plenty of room at the top’ of political structures for huge improvements in performance. As I explained recently, the victory of the Leave campaign owed more to the fundamental dysfunction of the British Establishment than it did to any brilliance from Vote Leave. Despite having the support of practically every force with power and money in the world (including the main broadcasters) and controlling the timing and legal regulation of the referendum, they blew it. This was good if you support Leave but just how easily the whole system could be taken down should be frightening for everybody.

Creating high performance teams is obviously hard but in what ways is it really hard? It is not hard in the same sense that some things are hard like discovering profound new mathematical knowledge. HPTs do not require profound new knowledge. We have been able to read the basic lessons in classics for over two thousand years. We can see relevant examples all around us of individuals and teams showing huge gains in effectiveness.

The real obstacle is not financial. The financial resources needed are remarkably low and the return on small investments could be incalculably vast. We could significantly improve the decisions of the most powerful 100 people in the UK or the world for less than a million dollars (~£10⁶) and a decade-long project on a scale of just ~£10⁷ could have dramatic effects.

The real obstacle is not a huge task of public persuasion – quite the opposite. A government that tried in a disciplined way to do this would attract huge public support. (I’ve polled some ideas and am confident about this.) Political parties are locked in a game that in trying to win in conventional ways leads to the public despising them. Ironically if a party (established or new) forgets this game and makes the public the target of extreme intelligent focus then it would not only make the world better but would trounce their opponents.

The real obstacle is not a need for breakthrough technologies though technology could help. As Colonel Boyd used to shout, ‘People, ideas, machines – in that order!’

The real obstacle is that although we can all learn and study HPTs it is extremely hard to put this learning to practical use and sustain it against all the forces of entropy that constantly operate to degrade high performance once the original people have gone. HPTs are episodic. They seem to come out of nowhere, shock people, then vanish with the rare individuals. People write about them and many talk about learning from them but in fact almost nobody ever learns from them – apart, perhaps, from those very rare people who did not need to learn – and nobody has found a method to embed this learning reliably and systematically in institutions that can maintain it. The Prussian General Staff remained operationally brilliant but in other ways went badly wrong after the death of the elder Moltke. When George Mueller left NASA it reverted to what it had been before he arrived – management chaos. All the best companies quickly go downhill after the departure of people like Bill Gates – even when such very able people have tried very very hard to avoid exactly this problem.

Charlie Munger, half of the most successful investment team in world history, has a great phrase he uses to explain their success that gets to the heart of this problem:

‘There isn’t one novel thought in all of how Berkshire [Hathaway] is run. It’s all about … exploiting unrecognized simplicities… It’s a community of like-minded people, and that makes most decisions into no-brainers. Warren [Buffett] and I aren’t prodigies. We can’t play chess blindfolded or be concert pianists. But the results are prodigious, because we have a temperamental advantage that more than compensates for a lack of IQ points.’

The simplicities that bring high performance in general, not just in investing, are largely unrecognised because they conflict with many evolved instincts and are therefore psychologically very hard to implement. The principles of the Buffett-Munger success are clear – they have even gone to great pains to explain them and what the rest of us should do – and the results are clear yet still almost nobody really listens to them and above average intelligence people instead constantly put their money into active fund management that is proved to destroy wealth every year!

Most people think they are already implementing these lessons and usually strongly reject the idea that they are not. This means that just explaining things is very unlikely to work:

‘I’d say the history that Charlie [Munger] and I have had of persuading decent, intelligent people who we thought were doing unintelligent things to change their course of action has been poor.’ Buffett.

Even more worrying, it is extremely hard to take over organisations that are not run right and make them excellent.

‘We really don’t believe in buying into organisations to change them.’ Buffett.

If people won’t listen to the world’s most successful investor in history on his own subject, and even he finds it too hard to take over failing businesses and turn them around, how likely is it that politicians and officials incentivised to keep things as they are will listen to ideas about how to do things better? How likely is it that a team can take over broken government institutions and make them dramatically better in a way that outlasts the people who do it? Bureaucracies are extraordinarily resistant to learning. Even after the debacles of 9/11 and the Iraq War, costing many lives and trillions of dollars, and even after the 2008 Crash, the security and financial bureaucracies in America and Europe are essentially the same and operate on the same principles.

Buffett’s success is partly due to his discipline in sticking within what he and Munger call their ‘circle of competence’. Within this circle they have proved the wisdom of avoiding trying to persuade people to change their minds and avoiding trying to fix broken institutions.

This option is not available in politics. The Enlightenment and the scientific revolution give us no choice but to try to persuade people and try to fix or replace broken institutions. In general ‘it is better to undertake revolution than undergo it’. How might we go about it? What can people who do not have any significant power inside the system do? What international projects are most likely to spark the sort of big changes in attitude we urgently need?

This is the first of a series. I will keep it separate from the series on the EU referendum though it is connected in the sense that I spent a year on the referendum in the belief that winning it was a necessary though not sufficient condition for Britain to play a part in improving the quality of government dramatically and improving the probability of avoiding the disasters that will happen if politics follows a normal path. I intended to implement some of these ideas in Downing Street if the Boris-Gove team had not blown up. The more I study this issue the more confident I am that dramatic improvements are possible and the more pessimistic I am that they will happen soon enough.

Please leave comments and corrections…

* A new transatlantic cable recently opened for financial trading. Its cost? £300 million. Its advantage? It shaves 2.6 milliseconds off the latency of financial trades. Innovative groups are discussing the application of military laser technology, unmanned drones circling the earth acting as routers, and even the use of neutrino communication (because neutrinos can go straight through the earth just as zillions pass through your body every second without colliding with its atoms) – cf. this recent survey in Nature.

Bureaucratic cancer and the sabotage of A Level reform

‘Bureaucracy is cancerous in head and limbs; only its belly is sound and the laws it excretes are the most straightforward shit in the world… With this bureaucracy including the judges on the bench we can have press laws written by angels and they cannot lift us from the swamp. With bad laws and good civil servants one can still govern, with bad civil servants the best laws cannot help.’ Otto von Bismarck, 1850.

‘I had the agreement in principle of my colleagues; I had the agreement in principle of the entire Landtag; and yet, although minister-president, I found myself absolutely unable to bring the matter one step further along. Agreement does not help me at all when passive resistance – from what direction in this complicated machine is impossible to learn – is conducted with such success that I am scarcely in a position after two to three years to answer even the most basic questions.’ Otto von Bismarck, 1878.

If the most effective political operator of the modern world frequently complained about the difficulty of enforcing policy against a hostile bureaucracy, we should not be surprised if similar problems recur over and over again.

Here is an interesting example of how education policy is made and how Whitehall works.

In 2012, we announced that the DfE would step back from controlling A Levels and give universities control. (Allegra Stratton ran the original story on Newsnight.) The main mechanism was ALCAB. It was a nightmare to set up, partly because, although subject experts very much wanted to be involved, the administrators who control universities wanted to stay out of the controversy and said to us in the DfE ‘we don’t want to have to say publicly that A Level papers are bad’.

We forced ALCAB to be created. MG and I spent a lot of time in awful meetings forcing it through. Its main role was supposed to be an annual review of specific A Level papers so that professors XYZ could say ‘hopeless question in the Edexcel physics paper, it gets the definition of entropy wrong again, it fails to test XXX’ etc.

The DfE has closed this committee down. It emerged via this Times Higher Education story.

I pointed a few hacks to it. They have called the DfE press office and spads. Both of those entities were given a line from officials saying ‘ALCAB’s work is done, no story here’. (Cf. Forsyth’s blog here.)

This is a lie. The main role was an annual review process. This should have been conducted this year and 2016 in preparation for new A Levels in 2017. It was envisaged as a permanent role. Interestingly, the letters completely elide this main role out of existence and present ALCAB as having only a temporary role.

Now this annual review won’t happen.

This is almost a Jedi-level operation from DfE officials. The DfE hated giving away control, obviously, and hated ALCAB. The very point of the process – a sword of Damocles in the form of eminent professors saying ‘crap questions’ each year – was supposed to force the DfE, exam boards, and Ofqual to raise their game. You can imagine how popular this was. Now the situation will revert to the status quo – the DfE firmly in charge and those pesky professors who point out things like – specific papers do not test the maths skills in the specifications – are happily excluded, with no ‘unhelpful’ public scrutiny of standards.

I very much doubt that Nicky Morgan [*see end] realises what she has done. It was probably a letter buried deep in her box weeks ago that she had no reason to suspect meant she was being used to subvert reform and entrench Whitehall’s power. It is impossible for a new minister to spot all such things – you don’t know what you don’t know. We can also safely bet that No10 has not the faintest idea about what ALCAB is or what the annual review process was supposed to do.

This is how Whitehall closes down threats to its power. Although it is systemically incompetent at policy and implementation, its real focus is on its own power, jobs, and money. To these, it pays careful attention and deploys its real skills.

It is possible that the hard struggle to improve A Levels and remove politicians’ and Whitehall’s grip of them is now substantially lost, without the MPs having a clue as to why and the details lost in a miasma of untraceable decisions and discussions.

Nicky Morgan and her spads should ask Rose (head of private office) and Wormald (Perm Sec) not just ‘how did this happen?’, but also ‘why were we and the press office given lies to tell the media?’ They would also be well advised to make clear that a repetition of this fancy footwork will mean someone fired. But of course this will have little effect. The officials are lining up their holidays and their own plans for the future, safe in the happy knowledge that whoever ‘wins’ the election, they will remain in charge. The MPs of all parties are largely content for this situation to continue. In the focus groups, swing voters will continue to say ‘they’re all the same’ with much more accuracy than they realise, but few in Westminster are really listening and even fewer know what is to be done…

I will blog a few reflections on No10’s ‘schools week’ tomorrow. NB. notice how, just as I wrote in The Hollow Men, this No10 ‘schools week’ is like all the others – two days of rubbish gimmicks, a self-inflicted cockup (‘real terms cuts to the budget’), followed by silence such that by Friday the 8 people who knew it was ‘schools week’ have themselves forgotten? Plus ça change…

Ps. If you want details on the devaluation of exams since 1988, and therefore why the annual review process was so important, read THIS.


 

UPDATE. Some have asked ‘how much confidence did you have in ALCAB doing a good job?’ Answer? Initially not much. They are all under huge pressure to say everything is fine. Initially for example, despite physics departments across the country complaining about the removal of calculus from Physics A Level (complaints that practically none of them will repeat publicly because of fear of their VC office), it did not look like ALCAB would be much use and they rejected calls from various professors I know on this subject. There is massive political pressure to focus exclusively on the numbers taking an A Level rather than the quality of the A Level.

But my hope was that by creating something that would be seen as the ‘voice of the university subject experts’, they would have to listen and adapt in order to maintain credibility and avoid embarrassing challenges. There are more and more enraged academics fed up of VC offices lying to the media and misrepresenting academics’ opinions. I thought that creating something would push the debate in increasingly sensible directions where the emphasis would be on the skills needed on arrival at university. Now, everything to do with A Levels is dominated by political not educational concerns about the numbers doing them and ‘access’. This has helped corrupt the exam system. If we had professors of physics, French, music etc every year publicly humiliating exam boards for errors, this would soon improve things from a low base and make it much harder for MPs and Whitehall to keep corrupting public exams.

[* I wrote ‘poor Nicky Morgan’ with the feeling – poor her, I know what it’s like to be pottering around in the DfE dealing with all sorts of problems before the horror of Question Time then someone walks in with a new bigger problem… But a few people emailed to say it sounds patronising, which was not deliberate, hence deletion…]

Standards In English Schools Part I: The introduction of the National Curriculum and GCSEs

The Introduction to this series of blogs, HERE, sets out the background and goals.

There are many different senses in which people discuss ‘standards’. Sometimes they mean an overall judgement on the performance of the system as judged by an international test like PISA. Sometimes they mean judgements based on performance in official exams such as KS2 SATs (at 11) or GCSEs. Sometimes they mean the number of schools above or below a DfE ‘floor target’. Sometimes they mean the number of schools and/or pupils in Ofsted-defined categories. Sometimes people talk about ‘the quality of teachers’. Sometimes they mean ‘the standards required of pupils when they take certain exams’. Today, the media is asking ‘have Academies raised standards?’ because of the Select Committee Report (which, after a brief flick through, seems to have ignored most of the most interesting academic studies done on a randomised/pseudo-randomised basis).

This blog in the series is concerned mainly with the questions of – what has happened to the standards required of pupils when they take GCSEs and A Levels as a result of changes since the mid-1980s, and how do universities and learned societies judge the preparation of pupils for further studies. Have the exams got easier? Do universities and learned societies think pupils are well-prepared for further studies?

I will give a very short potted history of the introduction of GCSEs and the National Curriculum before examining the evidence of their effects. If you are not interested in the history, please skip to Section B on the evidence. If you just want to see my Conclusions, scroll to the end for a short section.

I stress that my goal is not to argue for a return to the pre-1988 system of O Levels and A Levels. While it had some advantages over the existing system, it also had profound problems. I think that an unknown fraction of the cohort could experience far larger improvements in learning than we see now if they were introduced to different materials in different ways, rather than either contemporary exams or their predecessors, but I will come to this argument, and why I have this belief, in a later blog.

I have used the word ‘Department’ to represent the DES of the 1980s, the DfE of post-2010, and its different manifestations in between.

This is just a rough first stab at collecting things I’ve shoved in boxes, emails etc over the past few years. Please leave corrections and additions in Comments.

A. A very potted history

Joseph introduces GCSEs – ‘a right old mess’

The debate over the whole of education policy, and particularly the curriculum and exams, changed a lot after Callaghan’s Ruskin speech in 1976 and the Department’s Yellow Book. Before then, the main argument was simply about providing school places and the furore over selection. After 1976 the emphasis shifted to ‘standards’ and there was growing momentum behind a National Curriculum (NC) of some sort and reforms to the exam system.

Between 1979 and 1985, the Department chivvied LAs on the curriculum but had little power and nothing significant changed. Joseph was too much of a free marketeer to support a NC so its proponents could not make progress.

Joseph was persuaded to replace O Levels with GCSEs. He thought that the outcome would be higher standards for all but he later complained that he had been hoodwinked by the bureaucratic process involving The Schools Examination Committee (SEC). In his words:

‘I should have fought against flabbiness in general more than I did… I thought I did, but how do you reach into such a producer-oriented world? … “Stretching” was my favourite word; I judged that if you leant on that much else would follow. That’s what my officials encouraged me to imagine I was achieving… I said I’d only agree to unify the two examinations provided we established differentiation [which he defined as ‘you’re stretching the academic and you’re stretching the non-academic in appropriate ways’], and now I find that unconsciously I have allowed teacher assessment, to a greater extent than I assumed. My fault … my fault… it’s the job of ministers to see deeply… and therefore it’s flabby… You don’t find me defending either myself or the Conservative Party, but I reckon that we’ve all together made a right old mess of it. And it’s hurt most those who are most vulnerable.’ (Interview with Ball.)

I have not come across any other ministers or officials from this period so open about their errors.

The O Level survived under a different name as an international exam provided by Cambridge Assessment. It is still used abroad including in Singapore which regularly comes in the top three in all international tests. Cambridge Assessment also offers an ‘international GCSE’ that is, they say, tougher than the ‘old’ GCSE (i.e. the one in use now before it changes in 2015) but not as tough as the O Level. This international GCSE was used in some private schools pre-2010 along with ‘international GCSEs’ from other exam boards. From 2010, state schools could use iGCSEs. In 2014, the DfE announced that it would stop this again. I blogged on this decision HERE.

Entangled interests – Baker and the National Curriculum

In 1986, Thatcher replaced Joseph with Baker hoping, she admitted, that he would make up ‘in presentational flair what ever he lacked in attention to detail’. He did not. Nigel Lawson wrote of Baker that ‘not even his greatest friends would describe him as a profound thinker or a man with mastery of detail’. Baker’s own PPS said that at the morning meeting ‘the main issue was media handling’. Jenny Bacon, the official responsible for The National Curriculum 5-16 (1987), said that Baker liked memos ‘in “ball points” … some snappy things with headings. It wasn’t glorious continuous prose…[Ulrich, a powerful DES official] was appalled but Baker said “That’s just the kind of brief I want”.’

Between 1976 and 1986, concern had grown in Whitehall about the large number of awful schools and widespread bad teaching. Various intellectual arguments, ideology, political interests (personal and party), and bureaucratic interests aligned to create a National Curriculum. Thatcherites thought it would undermine what they thought of as the ‘loony left’, then much in the news. Baker thought it would bring him glory. The Department and HMI rightly thought it would increase their power. After foolishly announcing CTCs at Party Conference, thus poisoning their brand with politics from the start, Baker announced he would create a NC and a testing system at 7, 11, and 14.

The different centres of power disagreed on what form the NC would take. HMI lobbied against subjects and wanted a NC based on ‘areas of expertise’, not traditional subjects. Thatcher wanted a very limited core curriculum based on English, maths, and science. The Department wanted a NC that stretched across the whole curriculum. Baker agreed with the Department and dismissed Thatcher’s limited option as ‘Gradgrind’.

In order to con Thatcher into agreeing his scheme, Baker worked with officials to invent a fake distinction between ‘core’ and ‘foundation’ subjects. As Baker’s Permanent Secretary Hancock said, ‘We devised the notion of the core and the foundation subjects but if you examine the Act you will see that there is no difference between the two. This was a totally cynical and deliberate manoeuvre on Kenneth Baker’s part.’

The 1988 Act established two quangos to be what Baker called ‘the twin guardians of the curriculum’ – The National Curriculum Council (NCC), focused on the NC, and The Schools Examinations and Assessment Council (SEAC), focused on tests. Once the Act was passed, Baker’s junior minister Rumbold said that ‘Ken went out to lunch.’ Like many ministers, he did not understand the importance of the policy detail and the intricate issues of implementation. He allowed officials to control appointments to the two vital committees and various curriculum working groups. Even Baker’s own spad later said that Baker was conned into appointing ‘the very ones responsible for the failures we have been trying to put right’. Baker forlornly later admitted that ‘I thought you could produce a curriculum without bloodshed. Then people marched over mathematics. Great armies were assembled’, and he ‘never envisaged it would be as complex as it turned out to be’. Bacon, the official responsible for the NC, said that Baker ‘wasn’t interested in the nitty gritty’. Nicholas Tate (who was at the NCC and later headed the QCA) said that Baker was ‘affable but remote. He didn’t trouble his mind with attainment targets. He was resting on his laurels.’ Hancock, his Permanent Secretary, said that ‘after 1987 he became increasingly arrogant and impatient’. In 1989, Baker was moved to Party Chairman leaving behind chaos for his successor.

According to his colleagues, Baker was obsessed with the media, he did not try to understand (and did not have the training to understand) the policy issues in detail, and he confused the showmanship necessary to get a bill passed with serious management – he described himself as ‘a doer’ but the ‘doing’ in his mind consisted of legislation and spin. He did not even understand that there were strong disputes among teachers, subject bodies, and educationalists about the content of the NC – never mind what to do about these disputes. (Having watched the UTC programme from the DfE, the same traits were much in evidence thirty years later.)

Baker’s legacy 1989 – 1997: Shambles

Baker’s memoirs do not mention the report of The Task Group on Assessment and Testing (TGAT), chaired by Professor Paul Black, commissioned by Baker in 1987 to report on how the NC could be assessed. The plan was very complicated with ten levels of attainment having to be defined for each subject. Thatcher hated it and criticised Baker for accepting it. Meanwhile the Higginson Report had recommended replacing A Levels with some sort of IB type system. Bacon said that ‘the political trade-off was Higginson got ditched … and we got TGAT. In retrospect it may have been the wrong trade off.’

MacGregor could not get a grip of the complexity. He did not even hire a specialist policy adviser because, he said, ‘I didn’t feel I needed one.’ He blamed the chaos on Baker who, he said, ‘hadn’t spent enough time thinking about who was appointed to the bodies. He left it to officials and didn’t think through what he wanted the bodies to do. For the first year I was unable to replace anybody.’ The chairman of NCC described how they used ‘magic words to appease the right’ and get through what they wanted. The officials who controlled SEAC stopped the simplification that Thatcher wanted using the ‘legal advice’ card, claiming that the 1988 Act required testing of all attainment targets. (I had to deal with the same argument 25 years later.) MacGregor was trapped. He had an unworkable system and was under contradictory pressure from Thatcher to simplify everything and from Baker to maintain what he had promised.

Clarke bluffed and bullied his way through 18 months without solving the problems. His Permanent Secretary described the trick of getting Clarke to do what officials wanted: ‘The trick was to never box him into a corner… Show him where there was a door but never look at that door, and never let on you noticed when he walked through.’ Like MacGregor, Clarke blamed Baker for the shambles: ‘[Baker] had set up all these bloody specialist committees to guide the curriculum, he’d set up quango staff who as far as I could see had come out of the Inner London Education Authority the lot of them.’ Clarke solved none of the main problems with the tests, antagonised everybody, and replaced HMI with Ofsted.

After his surprise win, Major told the Tory Conference in 1992, ‘Yes it will mean another colossal row with the education establishment. I look forward to that.’ Patten soon imploded, the unions went for the jugular over the introduction of SATs, and by the end of 1993 Number Ten had backtracked on their bellicose spin and was in full retreat with a review by Dearing (published 1994). Suddenly, the legal advice that had supposedly prevented any simplification was rethought and officials told Dearing that the legal advice did allow simplification after all: ‘our advice is that the primary legislation allows a significant measure of flexibility’. (In my experience, one of the constants of Whitehall is that legal advice tends to shift according to what powerful officials want.) Dearing produced a classic Whitehall fudge that got everybody out of the immediate crisis but did not even try to deal with the fundamental problems, thus pushing the problems into the future.

The historian Robert Skidelsky, helping SEAC, told Patten ‘these tests will not run’ and he should change course but Patten shouted ‘That is defeatist talk.’ Skidelsky decided to work out a radically simpler model than the TGAT system with a small group in SEAC: ‘We pushed the model through committee and through the Council and sent it off to John Patten. We never received a reply. Six months after I resigned Emily Blatch approached me and said she had been looking for my paper on Assessment but no one seems to know where it is.’

Patten was finished. Gillian Shephard was put in to be friendly to the unions and quiet the chaos. Soon she and Major had also fallen out and the cycle of briefing and counter-briefing against Number Ten returned with permanent policy chaos. One of her senior officials, Clive Saville, concluded that ‘There was a great intellectual superficiality about Gillian Shephard and she was as intellectually dishonest as Shirley Williams. She was someone who wanted to be liked but wasn’t up to the job.’

A few thoughts on the process

The Government had introduced a new NC and test system and replaced O Levels with GCSEs. (They also introduced new vocational qualifications (NVQs) described by Professor Alan Smithers as a ‘disaster of epic proportions … utterly lightweight’.) The process was a disastrous bungle from start to finish.

Thatcher deserves considerable blame. She allowed Baker to go ahead with fundamental reforms without any agreed aims or a detailed roadmap. She knew, as did Lawson, that Baker could not cope with details yet appointed him on the basis of ‘presentational flair’ (media obsession is often confused with ‘presentational flair’).

The best book I have read by someone who has worked in Number Ten and seen why the Whitehall architecture is dysfunctional is John Hoskyns’ Just In Time. Extremely unusually for someone in a senior position in No10, Hoskyns both had an intellectual understanding of complex systems and was a successful manager. Inevitably, he was appalled at how the most important decisions were made and left Number Ten after failing to persuade Thatcher to tear up the civil service system. Since then, everybody in Number Ten has been struggling with the same issues. (If she had taken his advice history might have been extremely different – e.g. no ERM debacle.) His conclusion on Thatcher was:

‘The conclusion that I am coming to is that the way in which [Thatcher] herself operates, the way her fire is at present consumed, the lack of a methodical mode of working and the similar lack of orderly discussion and communication on key issues, means that our chance of implementing a carefully worked out strategy – both policy and communications – is very low indeed… Difficult problems are only solved – if they can be solved at all – by people who desperately want to solve them… I am convinced that the people and the organisation are wrong.’ (Emphasis added.)

Arguably the person who knowingly appoints someone like Baker is more to blame for the failings of Baker than Baker is himself. Major and the string of ministers that followed Baker were doomed. They were not unusually bad – they were representative examples of those at the apex of the political process. They did not know how to go about deciding aims, means, and operations. They were obsessed with media management and therefore continually botched the policy and implementation. They could not control their officials. They could not agree a plan and blamed each other. If they were the sort of people who could have got out of the mess, then they were the sort of people who would not have got into the mess in the first place.

Officials over-complicated everything and, like ministers, did not engage seriously with the core issue – what should pupils of different abilities be doing and how can we establish a process where we can collect reliable information. The process was dominated by the same attitude on all sides – how to impose a mentality already fixed.

It was also clearly affected by another element that has contemporary relevance – the constant churn of people. Just between summer 1989 and the end of 1992, there was: a new Permanent Secretary in May 1989, a new SoS in July 1989 (MacGregor), another new SoS in November 1990 (Clarke), a new PM and No10 team (Major), new heads for the NCC and SEAC in July 1991, then another new SoS in spring 1992 (Patten) and another new Permanent Secretary. Everybody blamed problems on predecessors and nobody could establish a consistent path.

Even its own Permanent Secretaries later attacked the DES. James Hamilton (1976-1983) was put into DES in June 1976 from the Cabinet Office to help with the Ruskin agenda and found a place where ‘when something was proposed someone would inevitably say, “Oh we tried that back in whenever and it didn’t work”…’. Geoffrey Holland (1992-3) admitted that, ‘It [DES] simply had no idea of how to get anything off the ground. It was lacking in any understanding or experience of actually making things happen.’

A central irony of the story shows how dysfunctional the system was. Thatcher never wanted a big NC and a complicated testing system but she got one. As some of her ideological opponents in the bureaucracy tried to simplify things when it was clear Baker’s original structure was a disaster, ministers were often fighting with them to preserve a complex system that could not work and which Thatcher had never wanted. This sums up the basic problem – a very disruptive process was embarked upon without the main players agreeing what the goal was.

Although the think tanks were much more influential in this period than they are now, Ferdinand Mount, head of Thatcher’s Policy Unit, made a telling point about their limitations: ‘Enthusiasts for reform at the IEA and the CPS were prodigal with committees and pamphlets but were much less helpful when it came to providing practical options for action. This made it difficult for the Policy Unit’s ideas to overcome the objections put forward by senior officials’. Thirty years later this remains true. Think tanks put out reports but they rarely provide a detailed roadmap that could help people navigate such reforms through the bureaucracy and few people in think tanks really understand how Whitehall works. This greatly limits their real influence. This is connected to a wider point. Few of those who comment prominently on education (or other) policy understand how Whitehall works, hence there is a huge gap between discussions of ideal policy and what is actually possible within a certain timeframe in the existing system, and commentators think that all sorts of things that happen do so because of ministers’ wishes, confusing public debate further.

I won’t go into the post-1997 story. There are various books that tell this whole story in detail. The National Curriculum remained but was altered; the test system remained but gradually narrowed from the original vision; there were some attempts at another major transformation (such as Tomlinson’s attempt to end A Levels, thwarted by Blair) but none took off; money poured into the school system and its accompanying bureaucracy at an unprecedented rate but, other than a large growth in the number and salaries of everybody, it remained unclear what if any progress was being made.

This bureaucracy spent a great deal of taxpayers’ money promoting concepts such as ‘learning styles’ and ‘multiple intelligences’ that have no proper scientific basis but which nevertheless were successfully blended with old ideas from Vygotsky and Piaget to dominate a great deal of teacher training. A lot of people in the education world got paid an awful lot of money (Hargreaves, Waters et al) but what happened to standards?

(The quotes above are taken mainly from Daniel Callaghan’s Conservative Party Education Policies 1976-1997.)

B. The cascading effects of GCSEs and the National Curriculum

Below I consider 1) the data on grade inflation in GCSEs and A Levels, 2) various studies from learned societies and others that throw light on the issue, 3) knock-on effects in universities.

1. Data on grade inflation in GCSEs and A Levels

We do not have an official benchmark against which to compare GCSE results. The picture is therefore necessarily hazy. As Coe has written, ‘we are limited by the fact that in England there has been no systematic, rigorous collection of high-quality data on attainment that could answer the question about systemic changes in standards.’ This is one of the reasons why in 2013 we, supported by Coe and others, pushed through (against considerable opposition including academics at the Institute of Education) a new ‘national reference test’ in English and maths at age 16, which I will return to in a later blog.

However, we can compare the improvement in GCSE results with a) results from international tests and b) consistent domestic tests uncontrolled by Whitehall.

The first two graphs below show the results of this comparison.

Chart 1: Comparison of English performance in international surveys versus GCSE scores 1995-2012 (Coe)

Chart 2: GCSE grades achieved by candidates with same maths & vocab scores each year 1996-2012 (Coe)

Professor Coe writes of Chart 1:

‘When GCSE was introduced in 1987 [I think he must mean 1988 as that was the first year of GCSEs or else he means ‘the year before GCSEs were first taken’], 26.4% of the cohort achieved five grade Cs or better. By 2012 the proportion had risen to 81.1%. This increase is equivalent to a standardised effect size of 1.63, or 163 points on the PISA scale… If we limit the period to 1995 – 2011 [as in Chart 1 above] the rise (from 44% to 80% 5A*-C) is equivalent to 99 points on the PISA scale [as superimposed on Chart 1]… [T]he two sets of data [international and GCSEs] tell stories that are not remotely compatible. Even half the improvement that is entailed in the rise in GCSE performance would have lifted England from being an average performing OECD country to being comfortably the best in the world. To have doubled that rise in 16 years is just not believable.

‘The question, therefore, is not whether there has been grade inflation, but how much…’ [Emphasis added.] (Professor Robert Coe, ‘Improving education: a triumph of hope over experience’, 18 June 2013, p. vi.)
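For readers who want to see where a figure like ‘99 points on the PISA scale’ can come from, here is a minimal sketch. It rests on two assumptions of mine, not on anything Coe spells out in the passage quoted: a) that attainment is roughly normally distributed, so that a change in the share of pupils clearing a fixed threshold can be turned into a shift measured in standard deviations (a ‘probit difference’), and b) that one standard deviation is 100 points, which is how the PISA scale was constructed. The function name is hypothetical.

```python
from statistics import NormalDist

# Assumption: the PISA scale was built with a standard deviation of 100 points.
PISA_SD = 100


def pass_rate_effect_size(p_before: float, p_after: float) -> float:
    """Shift, in standard deviations, of a normal ability distribution implied
    by a change in the share of pupils clearing a fixed threshold.

    This is a probit-difference sketch; it is not claimed to be Coe's method.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.inv_cdf(p_after) - std_normal.inv_cdf(p_before)


# 1995 - 2011: the share getting 5 A*-C rose from 44% to 80%.
effect = pass_rate_effect_size(0.44, 0.80)
pisa_points = effect * PISA_SD
print(f"effect size = {effect:.2f}, PISA points = {pisa_points:.0f}")
# prints: effect size = 0.99, PISA points = 99
```

Under these assumptions the 1995 – 2011 rise comes out at about 0.99 standard deviations, which matches the 99 points Coe quotes for that period. It is only an illustration of the kind of conversion involved.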

Chart 2 plots the improving GCSE grades achieved by pupils scoring the same each year in a test of maths and vocabulary: pupils scoring the same on YELLIS get higher and higher GCSE grades as time passes. Coe concludes that although ‘it is not straightforward to interpret the rise in grades … as grade inflation’, the YELLIS data ‘does suggest that whatever improved grades may indicate, they do not correspond with improved performance in a fixed test of maths and vocabulary’ (Coe, ibid).

This YELLIS comparison suggests that in 2012 pupils received a grade higher in maths, history, and French GCSE, and almost a grade higher in English, than students of the same ability in 1996.

It is important to note that neither Coe’s charts nor his measurements include the effects of either a) the initial switch from O Level to GCSE or b) what changed with GCSEs from 1988 – 1995.

The next two charts show this earlier part of the story (both come from Education: Historical statistics, House of Commons, November 2012). NB. they have different end dates.

Chart 3: Proportion getting 5 O Levels / GCSEs at grade C or higher 1953/4 – 2008/9 

Chart 4: Proportion getting 1+ or 3+ passes at A Level 1953/4 – 1998/9

Chart 3 shows that the period 1988-95 saw an even sharper increase in GCSE scores than post-1995 so a GCSE/YELLIS style comparison that included the years 1988-1995 would make the picture even more dramatic.

Chart 4 shows a dramatic increase in A Level passes after the introduction of GCSEs. One interpretation of this graph, supported by the 1997-2010 Government and teaching unions, is that this increase reflected large real improvements in school standards.

There is GCSE data that those who believe this argument could cite. In 1988, 8% of GCSE entries were awarded an ‘A’. In 2011, 23% were awarded an ‘A’ or ‘A*’. The DfE published data in 2013 which showed that the number of pupils with ten or more A* grades trebled between 2002 and 2012. This implies a very large increase in the numbers of those excelling at GCSE, which is consistent with a picture of a positive knock-on effect on improving A Level results.

However, we have already seen that the claims for GCSEs are ‘not believable’ in Coe’s words. It also seems prima facie very unlikely that a sudden large improvement in A Level results from 1990 could be the result of immediate improvements in learning driven by GCSEs. There is also evidence for A Levels similar to the GCSE/YELLIS comparison.

Chart 5: A level grades of candidates having the same TDA score (1988-2006)

Chart 5 plots A Level grades in different subjects against the international TDA test. As with GCSEs, this shows that pupils scoring the same in a non-government test got increasingly higher grades in A Levels. The change in maths is particularly dramatic from an ‘Unclassified’ mark in 1988 to a B/C in 2006.

What we know about GCSEs combined with this information makes it very hard to believe that the sudden dramatic increase in A Level performance since 1990 is because of real improvements and suggests another interpretation: these dramatic increases in A Level results reflected (mostly or entirely) A Levels being made significantly easier probably in order to compensate for GCSEs being much easier.

However, the data above can only tell part of the story. Logically, it is hard or impossible to distinguish between possible causes just from these sorts of comparisons. For example, perhaps someone might claim that A Level questions remained as challenging as before but grade boundaries moved – i.e. the exam papers were the same but the marking was easier. I think this is prima facie unlikely but the point is that logically the data above cannot distinguish between various possible dynamics.

Below is a collection of studies, reports, and comments from experts that I have accumulated over the past few years that throws light on which interpretation is more reasonable. Please add others in Comments.

(NB. David Spiegelhalter, a Professor of Statistics at Cambridge, has written about problems with PISA’s use of statistics. These arguments are technical. To a non-specialist like me, he seems to make important points that PISA must answer to retain credibility and the fact that it has not (as of the last time I spoke to DS in summer 2014) is a blot on its copybook. However, I do not think they materially affect the discussion above. Other international tests conducted on different bases all tell roughly the same story. I will ask DS if he thinks his arguments do undermine the story above and post his reply if any.)

2. Studies 2007 – now 

NB1. Most of these studies are comparing changes over the past decade or so, not the period since the introduction of the NC and GCSEs in the 1980s.

NB2. I will reserve detailed discussion of the AS/A2/decoupling argument for a later blog as it fits better in the ‘post-2010 reforms’ section.

Learned societies. The Royal Society’s 2011 study of Science GCSEs: ‘the question types used provided insufficient opportunity for more able candidates … to demonstrate the extent of their scientific knowledge, understanding and skills. The question types restricted the range of responses that candidates could provide. There was little or no scope for them to demonstrate various aspects of the Assessment Objectives and grade descriptions… [T]he use of mathematics in science was examined in a very limited way.’ SCORE also published (2012) evidence on science GCSEs which reported ‘a wide variation in the amount of mathematics assessed across awarding organisations and confirmed that the use of mathematics within the context of science was examined in a very limited way. SCORE organisations felt that this was unacceptable.’

The 2012 SCORE report and Nuffield Report showed serious problems with the mathematical content of A Levels. SCORE was very critical:

‘For biology, chemistry and physics, it was felt there were underpinning areas of mathematics missing from the requirements and that their exclusion meant students were not adequately prepared for progression in that subject. For example, for physics many of the respondents highlighted the absence of calculus, differentiation and integration, in chemistry the absence of calculus and in biology, converting between different units… For biology, chemistry and physics, the analysis showed that the mathematical requirements that were assessed concentrated on a small number of areas (e.g. numerical manipulation) while many other areas were assessed in a limited way, or not at all… Survey respondents were asked to identify content areas from the mathematical requirements that should feature highly in assessments. In most cases, the biology, chemistry and physics respondents identified mathematical content areas that were hardly or not at all assessed by the awarding organisations.

‘[T]he inclusion of more in-depth problem solving would allow students to apply their knowledge and understanding in unstructured problems and would increase their fluency in mathematics within a science context.’

‘The current mathematical assessments in science A-levels do not accurately reflect the mathematical requirements of the sciences. The findings show that a large number of mathematical requirements listed in the biology, chemistry and physics specifications are assessed in a limited way or not at all within these papers. The mathematical requirements that are assessed are covered repeatedly and often at a lower level of difficulty than required for progression into higher education and employment. It has also highlighted a disparity between awarding organisations in their assessment of the use of mathematics within biology, chemistry and physics A-level. This is unacceptable and the examination system, regardless of the number of awarding organisations, must ensure the assessments provide an authentic representation of the subject and equip all students with the necessary skills to progress in the sciences.

‘This is likely to have an impact on the way that the subjects are taught and therefore on students’ ability to progress effectively to STEM higher education and employment.’ SCORE, 2012. Emphasis added.

The 2011 Institute of Physics report showed strong criticism from university academics of the state of physics and engineering undergraduates’ mathematical knowledge. Four-fifths of academics said that university courses had changed to deal with a lack of mathematical fluency and 92% said that a lack of mathematical fluency was a major obstacle.

‘The responses focused around mathematical content having to be diluted, or introduced more slowly, which subsequently impacts on both the depth of understanding of students, and the amount of material/topics that can be covered throughout the course…

‘Academics perceived a lack of crossover between mathematics and physics at A-level, which was felt to not only leave students unprepared for the amount of mathematics in physics, but also led to them not applying their mathematical knowledge to their learning of physics and engineering.’ IOP, 2011.

The 2011 Centre for Bioscience report criticised Biology and Chemistry A Levels and preparation of pupils for bioscience degrees: ‘very many lack even the basics… [M]any students do not begin to attempt quantitative problems and this applies equally to those with A level maths as it does to those with C at GCSE. A lack of mathematics content in A level Biology means that students do not expect to encounter maths at undergraduate level. There needs to be a more significant mathematical component in A level biology and chemistry.’ The Royal Society of Chemistry report, The five decade challenge (2008), said there had been ‘catastrophic slippage in school science standards’ and that Government claims about improving GCSE scores were ‘an illusion’. (The Department said of the RSC report, ‘Standards in science have improved year on year thanks to 10 years of sustained investment and improvement in teaching and the education system – this is something we should celebrate, not criticise. Times have changed.’)

Ofqual, 2012. Ofqual’s Standards Review in 2012 found grade inflation in both GCSE and A-levels between 2001-03 and 2008-10: ‘Many of these reviews raise concerns about the maintenance of standards… In the GCSEs we reviewed (biology, chemistry and mathematics) we found that changes to the structure of the assessments, rather than changes to the content, reduced the demand of some qualifications.’

On A-levels, ‘In general we found that changes to the way the content was assessed had an impact on demand, in many cases reducing it. In two of the reviews (biology and chemistry) the specifications were the same for both years. We found that the demand in 2008 was lower than in 2003, usually because the structure of the assessments had changed. Often there were more short answer, structured questions’ (Ofqual, Standards Reviews – A Summary, 1 May 2012).

Chief Executive of Ofqual, Glenys Stacey, has said: ‘If you look at the history, we have seen persistent grade inflation for these key qualifications for at least a decade… The grade inflation we have seen is virtually impossible to justify and it has done more than anything, in my view, to undermine confidence in the value of those qualifications’ (Sunday Telegraph, 28 April 2012).

The OECD’s International Survey of Adult Skills (October 2013). This assessed numeracy, literacy and computing skills of 16-24-year-olds. The tests were done over 2011/2012. England was 22nd out of 24 for literacy, 21st out of 24 for numeracy, and 16th out of 20 for ‘problem solving in a technology-rich environment’.

PISA 2012. The normal school PISA tests taken in 2012 (reported 2013) showed no significant change between 2009 and 2012. England was 21st for science, 23rd for reading, and 26th for mathematics. A 2011 OECD report concluded: ‘Official test scores and grades in England show systematically and significantly better performance than international and independent tests… [Official results] show significant increases in quality over time, while the measures based on cognitive tests not used for grading show declines or minimal improvements’ (OECD Economic Surveys: United Kingdom, 16 March 2011, pp. 88-89). Chart 6 shows that in the PISA maths test the children of English professionals perform the same as children of Singapore cleaners (Do parents’ occupations have an impact on student performance?, PISA 2014).

Chart 6: Comparing pupil maths scores by parent occupation, UK (left) and Singapore (right) maths skills (PISA 2012)

TIMSS/PIRLS. The TIMSS/PIRLS tests (taken summer 2011, reported December 2012) told a similar story to PISA. England’s score in reading at age 10 increased since 2006 by a statistically significant amount. England’s score in science at age 10 decreased since 2007 by a statistically significant amount. England’s scores in science at age 14 and mathematics at ages 10 and 14 showed no statistically significant changes since 2007. (According to experts, the PISA maths test relies more on language comprehension than TIMSS, which is supposedly why Finland scores higher in the former than the latter.)

National Numeracy (February 2012). Research showed that in 2011 only a fifth of the adult population had mathematical skills equivalent to a ‘C’ in GCSE, down a few percent from the last survey in 2003. About half of 16-65 year olds have at best the mathematical skills of an 11 year-old. A fifth of adults will struggle with understanding price labels on food and half ‘may not be able to check the pay and deductions on a wage slip.’

King’s College, 2009. A major study by academics from King’s College London and Durham University found that basic skills in maths have declined since the 1970s. In 2008, less than a fifth of 14 year-olds could write 11/10 as a decimal. In the early 1980s, only 22 per cent of pupils obtained a GCE O-level grade C or above in maths. In 2008, over 55 per cent gained a GCSE grade C or above in the subject (King’s College London/University of Durham, ‘Secondary students’ understanding of mathematics 30 years on’, 5 September 2009).

Chart 7: Performance on ICCAMS / CSMS Maths tests showing declines over time

Shayer et al (2007) found that performance in a test of basic scientific concepts fell significantly between 1976 and 2003. ‘[A]lthough both boys and girls have shown great drops in performance, the relative drop is greater for boys… It makes it difficult to believe in the validity of the year on year improvements reported nationally on Key Stage 3 NCTs in science and mathematics: if children are entering secondary from primary school less and less equipped with the necessary mental conditions for processing science and mathematics concepts it seems unlikely that the next 2.5 years KS3 teaching will have improved so much as more than to compensate for what students of today lack in comparison with 1976.’

Chart 8: Performance on tests of scientific concepts, 1976 – 2003 (Shayer)

Tymms (2007) reviewed assessment evidence in mathematics from children at the end of primary school between 1978 and 2004 and in reading between 1948 and 2004. The conclusion was that standards in both subjects ‘have remained fairly constant’.

Warner (2013) on physics. Professor Mark Warner (Cambridge University) produced a fascinating report (2013) on problems with GCSE and A Level Physics and compared the papers to old O Levels, A Levels, ‘S’ Level papers, Oxbridge entry exams, international exams and so on. After reading it, there is no room for doubt. The standards demanded in GCSEs and A Levels have fallen very significantly.

‘[In modern papers] small steps are spelt out so that not more than one thing needs to be addressed before the candidate is set firmly on the right path again. Nearly all effort is spent injecting numbers into formulae that at most require GCSE-level rearrangements… All diagrams are provided… 1986 O-level … [is] certainly more difficult than the AS sample… 1988 A-level … [is] harder than most Cambridge entrance questions currently… 1983 Common Entrance [is] remarkably demanding for this age group, approaching the challenge of current AS… There is a staggering difference in the demands put on candidates… Exams [from the 1980s] much lower down the school system are in effect more difficult than exams given now in the penultimate years [i.e. AS].’

For example, the mechanics problems in GCSE Physics are substantially shallower than those in 1980s O Level, which examined concepts now in A Level. The removal of calculus from A Level physics badly undermined it. Calculus is tested in A Level Maths’ Mechanics I paper and Mechanics II and III test deeper material than Physics A Level. This is one of the reasons why Cambridge Physics department stopped requiring Physics A Level for entry and made clear that Further Maths A Level is acceptable instead (many say it is better preparation for university than physics A Level is).

Warner also makes the point that making Physics GCSE and A Level much easier did not even increase the number taking physics degrees, which has declined sharply since the mid-1980s. He concludes: ‘one could again aim for a school system to get a sizable fraction of pupils to manage exams of these [older] standards. Children are not intrinsically unable to attack such problems.’ (NB. The version of this report on the web is not the full version – I would urge those interested to email Professor Warner.)

Gowers (2012) on maths. Tim Gowers, Cambridge professor and Fields Medallist, described some problems with Maths A Level and concluded:

‘The general point here is of course that A-levels have got easier [emphasis added] and schools have a natural tendency to teach to the test. If just one of those were true, it would be far less of a problem. I would have nothing against an easy A-level if people who were clever enough were given a much deeper understanding than the exam strictly required (though as I’ve argued above, for many people teaching to the test is misguided even on its own terms, since they will do a lot better on the exam if they have not been confined to what’s on the test), and I would not be too against teaching to the test if the test was hard enough…

‘[S]ome exams, such as GCSE maths, are very very easy for some people, such as anybody who ends up reading mathematics at Cambridge (but not just those people by any means). I therefore think that the way to teach people in top sets at schools is not to work towards those exams but just to teach them maths at the pace they can manage.’

Durham University analysis gives data to quantify this conclusion. Pupils who would have received a U (unclassified) in Maths A-Level in 1988 received a B/C in 2006 – see above for Chart 5 showing this (CEM Centre Durham University, Changes in standards at GCSE and A-Level: Evidence from ALIS and YELLIS, April 2007). Further Maths A Level is supposedly the toughest A Level and probably it is but a) it is not the same as its 1980s ancestor and b) it now introduces pupils to material such as matrices that used to be taught in good prep schools.

I spent a lot of time 2007-14 talking to maths dons, including heads of departments, across England. The reason I quote Gowers is that I never heard anybody dispute his conclusion but he was almost the only one who would say it publicly. I heard essentially the same litany about A Level maths from everybody I spoke to: although there were differences of emphasis, nobody disputed these basic propositions. 1) The questions became much more structured so pupils are led up a scaffolding with less requirement for independent problem-solving. 2) The emphasis moved to memorising some basic techniques the choice of which is clearly signalled in the question. 3) The modular system a) encouraged a ‘memorise, regurgitate, forget’ mentality and b) undermined learning about how different topics connect across maths, both of which are bad preparation for further studies. (There are also some advantages to a modular system that I will return to.) 4) Many undergraduates, including even those in the top 5% at such prestigious universities as Imperial, therefore now struggle in their first year as they are not well-prepared by A Level for the sort of problems they are given in undergraduate study. (The maths department at Imperial became so sick of A Level’s failings that they recently sought and got approval to buy Oxford’s entrance exam for use in their admission system.)

I will not go into arguments about vocational qualifications here but note the conclusion of Alison Wolf whose 2011 report on this was not disputed by any of the three main parties:

‘The staple offer for between a quarter and a third of the post-16 cohort is a diet of low-level vocational qualifications, most of which have little to no labour market value.’

3. Knock-on effects in universities

Serious lack of maths skills

There are many serious problems with maths skills. Part of the reason is that many universities do not even demand A Level maths. The result? As of about 2010-12, about 20% of Engineering undergraduates, about 40% of Chemistry and Economics undergraduates, and about 60-70% of Biology and Computer Science undergraduates did not have A Level Maths. Fewer than 10% of undergraduate bioscience degree courses demand A Level Maths; as a result, ‘problems with basic numeracy are evident and this reflects the fact that many students have grades less than A at GCSE Maths. These students are unlikely to be able to carry out many of the basic mathematical approaches, for example unable to manipulate scientific notation with negative powers so commonly used in biology’ (2011 Biosciences report). (I think that history undergraduates should be able to manipulate scientific notation with negative powers – this is one of the many things that should be standard for reasonably able people.)

The Royal Society estimated (Mathematical Needs, 2012) that about 300,000 per year need a post-GCSE Maths course but only ~100,000 do one. (This may change thanks to Core Maths starting in 2015, see later blog.) This House of Lords report (2012) on Higher Education in STEM subjects concluded: ‘We are concerned that … the level at which the subject [maths] is taught does not meet the requirements needed to study STEM subjects at undergraduate level… [W]e urge HEIs to introduce more demanding maths requirement for admissions into STEM courses as the lack, or low level, of maths requirements at entry acts as a disincentive for pupils to study maths and high level maths at A level.’ House of Lords Select Committee on Science and Technology, Higher Education in STEM subjects, 2012.

Further, though this subject is beyond the scope of this blog, it is also important that the maths PhD pipeline ‘which was already badly malfunctioning has been seriously damaged by EPSRC decisions’, including withdrawal of funding from non-statistics subjects which drew the ire of UK Fields Medallists, cf. Submission by the Council for the Mathematical Sciences to the House of Lords, 2011. The weaknesses in biology also feed into the bioscience pipeline: only six percent of bioscience academics think their graduates are well prepared for a masters in the fast-growing field of Computational Biology (p.8 of report).

Closing of language departments, decline of language skills

I have not found official stats for this but according to research done for the Guardian (with FOIs):

‘The number of universities offering degrees in the worst affected subject, German, has halved over the past 15 years. There are 40% fewer institutions where it is possible to study French on its own or with another language, while Italian is down 23% and Spanish is down 22%.’

As Katrin Kohl, professor of German at Jesus College (Oxford) has said, ‘The UK has in recent years been systematically squandering its already poor linguistic resources.’ Dawn Marley, senior lecturer in French at the University of Surrey, summarised problems across languages:

‘We regularly see high-achieving A-level students who have only a minimal knowledge of the country or countries where the language of study is spoken, or who have limited understanding of how the language works. Students often have little knowledge of key elements in a country’s history – such as the French Revolution, or the fact that France is a republic. They also continue to struggle with grammatical accuracy, and use English structures when writing in the language they are studying… The proposals for the revival of A-level are directly in line with what most, if not all, academics in language departments would see as essential.’ (Emphasis added.)

The same picture applies to classical languages. Already by 1994 the Oxford Classics department was removing texts such as Thucydides as compulsory elements in ‘Greats’ because they were deemed ‘too hard’. These changes continued and have made Classics a very different subject than it was before 1990. At Oxford, they introduced whole new courses (Mods B then Mods C) that do not require any prior study of the ancient languages themselves. The first year of Greats now involves remedial language courses.

I quote at length from a paper by John Davie, a Lecturer in Classics at Trinity College, Oxford, as his comments summarise the views of other senior classicists in Oxbridge and elsewhere who have been reluctant to speak out (In Pursuit of Excellence, Davie, 2013). Inevitably, the problems described are damaging the pipeline for masters, PhDs, and future scholarship.

‘Classics as an academic subject has lost much of its intellectual force in recent years. This is true not only of schools but also, inevitably, of universities, which are increasingly required to adapt to the lowering of standards…

‘In modernist courses…, there is (deliberately) no systematic learning of grammar or syntax, and emphasis is laid on fast reading of a dramatic continuous story in made-up Latin which gives scope for looking at aspects of ancient life. The principle of osmosis underlying this approach, whereby children will learn linguistic forms by constant exposure to them, aroused scepticism among many teachers and has been thoroughly discredited by experts in linguistics. Grammar and syntax learned in this piecemeal fashion give pupils no sense of structure and, crucially, deny them practice in logical analysis, a fundamental skill provided by Classics…

‘[W]e have, in GCSE, an exam that insults the intelligence… Recent changes to this exam have by general consent among teachers made the papers even easier.

‘In the AS exam currently taken at the end of the first year of A-level … students study two small passages of literature, which represent barely a third of an original text. They are asked questions so straightforward as to verge on the banal and the emphasis is on following a prescribed technique of answering, as at GCSE. Imagination and independent thought are simply squeezed out of this process as teachers practise exam-answering technique in accordance with the narrow criteria imposed on examiners.

‘The level of difficulty [in AS] is not substantially higher than that of GCSE, and yet this is the exam whose grades and marks are consulted by the universities when they are trying to determine the ability of candidates… Having learned the translation of these bite-sized chunks of literature with little awareness of their context or the wider picture (as at GCSE, it is increasingly the case that pupils are incapable of working out the Latin/Greek text for themselves, and so lean heavily on a supplied translation), they approach the university interview with little or no ability to think “outside the box”. Dons at Oxford and Cambridge regularly encounter a lack of independent thought and a tendency to fall back on generalisations that betray insufficient background reading or even basic curiosity about the subject. This need not be the case and is clearly the product of setting the bar too low for these young people at school…

‘At A2 … students read less than a third of a literary text they would formerly have read in its entirety.

‘There is the added problem that young teachers entering the profession are themselves products of the modernist approach and so not wholly in command of the classical languages themselves. As a result they welcome the fact that they are not required by the present system to give their pupils a thorough grounding in the language, embracing the less rigorous approach of modern course-books with some relief.

‘In the majority of British universities Classics in its traditional form has either disappeared altogether or has been replaced by a course which presents the literature, history and philosophy mainly (or entirely) in translation, i.e. less a degree course in Classics than in Classical Civilisation.

‘This situation has been forced upon university departments of Classics by the impoverished language skills of young people coming up from schools… It is not only the classical languages but English itself which has suffered in this way in the last few decades. Every university teacher of the classical languages knows that he cannot assume familiarity with the grammar and syntax of English itself, and that he will have to teach from scratch such concepts as an indirect object, punctuation or how a participle differs from a gerund…

‘Even at Oxford cuts have been made to the number of texts students are required to read and, in those texts that remain, not as many lines are prescribed for reading in the original Latin or Greek.

‘In the last ten years of teaching for Mods [at Oxford] I have been struck by how the first-year students who come my way at the start of the summer term appear to know less about the classical languages each year, an experience I know to be shared by dons at other colleges…

‘GCSE should be replaced by a modern version of the O-level that stretches pupils… This would make the present AS exam completely unsuitable, and either a more challenging set of papers should be devised, if the universities wish to continue with pre A-level interviewing, or there should be a return to an unexamined year of wide reading before the specialisation of the last year.

‘Although the present exam, A2, has more to recommend it than AS, it also would no longer be fit for purpose and would need strengthening. As part of both final years there should be regular practice in the writing of essays, a skill that has been largely lost in recent years because of the exam system and is (rightly) much missed by dons.’

This combination of problems explains why we funded a project with Professor Pelling, Regius Professor of Greek at Oxford, to provide teacher training and language enrichment courses for schools.

I will not go into other humanities subjects. I read Ancient & Modern History and have thoughts about it but I do not know of any good evidence similar to the reports quoted above by the likes of the Royal Society. I have spoken to many university teachers. Some, such as Professor Richard Evans (Cambridge), told me they think the standard of those who arrive as undergraduates is roughly the same as twenty years ago. Others at Oxbridge and elsewhere told me they think that essay writing skills have deteriorated because of changes to A Level (disputed by Evans and others) and that language skills among historians have deteriorated (undisputed by anyone I spoke to).

For example, the Cambridge Professor of Mediterranean History, David Abulafia, has contradicted Evans and, like classicists, pointed out the spread of remedial classes at Cambridge:

‘It’s a pity, then, that the director of admissions at Cambridge has proclaimed that the old system [pre-Gove reforms] is good and that AS-levels – a disaster in so many ways – are a good thing because somehow they promote access. I don’t know for whom he is speaking, but not for me as a professor in the same university…

‘[Gove] was quite right about the abolition of the time-wasting, badly devised and all too often incompetently marked AS Levels; these dreary exams have increasingly been used as the key to admissions to Cambridge, to the detriment of intellectually lively, quirky, candidates full of fizz and sparkle who actually have something to say for themselves…

‘Bogus educational theories have done so much to damage education in this country… The effects are visible even in a great university such as Cambridge, with a steady decline in standards of literacy, and with, in consequence, the provision in one college after another of ‘skills teaching’, so that students who no longer arrive knowing how to structure an essay or even read a book can receive appropriate ‘training’… Even students from top ranked schools seem to find it very difficult … to write essays coherently… In the sort of exams I am thinking of, essay writing comes much more to the fore and examiners would be making more subjective judgements about scripts. In an ideal world there would be double marking of scripts.’ Emphasis added.

Judging essay skills is a more nebulous task than judging the quality of mechanics questions. Also, there is less agreement among historians about the sort of things they want to see in school exams compared to mathematicians and physicists who largely (in my experience, I stress, which is limited) agree about the sorts of problems they want undergraduates to be able to solve and the skills they want them to have.

I will quote a Professor of English at Exeter University, Colin MacCabe, whose view of the decline of essay skills is representative of many comments I have heard, but I cannot say confidently that this view represents a consensus, despite his claim:

‘Nobody who teaches A-level or has anything to do with teaching first-year university students has any doubt that A Levels have been dumbed down… The writing of the essay has been the key intellectual form in undergraduate education for more than a century; excelling at A-level meant excelling in this form. All that went by the board when … David Blunkett, brought in AS-levels… A-levels … became two years of continuous assessment with students often taking their first module within three months of entering the sixth form. This huge increase in testing went together with a drastic change in assessment. Candidates were not now marked in relation to an overall view of their ability to mount and develop arguments, but in relation to their ability to demonstrate achievement against tightly defined assessment objectives… A-levels, once a test of general intellectual ability in relation to a particular subject, are now a tightly supervised procession through a series of targets. Assessment doesn’t come at the end of the course – it is the course… In English, students read many fewer books… Students now arrive at university without the knowledge or skills considered automatic in our day… One of the results of the changes at A-level is that the undergraduate degree is itself a much more targeted affair. Students lack of a general education mean that special subjects, dissertations etc are added to general courses which are themselves much more limited in their approach… One result of this is a grade inflation much more dramatic even than A-levels… [T]here is little place within a modern English university for students to develop the kind of intellectual independence and judgment, which has historically been the aim of the undergraduate degree.’ Observer, 22 August, 2004. (Emphasis added.)

If anybody knows of studies on history and other humanities please link in Comments below.

Oxbridge entrance

As political arguments increasingly focused on ‘participation’ and ‘access’, Oxford and Cambridge largely abandoned their own entrance exams in the 1990s. There were some oddities. Cambridge University dropped its maths test and was so worried by the results that it immediately asked for, and was given, special dispensation to reintroduce it, and it has used one since (now known as the STEP paper, used by a few other universities). Other Cambridge departments that wanted to do the same were refused permission and some of them (including the physics department) now use interviews to test material they would like to test in a written exam. Oxford changed its mind and gradually reintroduced admission tests in some subjects. (E.g. it does not use STEP in maths but uses its own test which has more ‘applied’ maths.) Cambridge now uses AS Levels. Oxford does not (but does not like to explain why).

A Levels are largely useless for distinguishing between candidates in the top 2% of ability (i.e. roughly two standard deviations above average). Oxbridge entry now involves a complex and incoherent set of procedures. Some departments use interviews to test skills that i) are either wholly or largely untested by A Levels and ii) are not explicitly set out anywhere. For example, if you go to an interview for physics at Cambridge, they will ask you questions like ‘how many photons hit your eye per second from Alpha Centauri?’ – i.e. questions that you cannot cram for but from which much information can be gained by tutors watching how students grapple with the problem.
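For readers who want to check the two numbers in this paragraph, here is a rough sketch in Python. It is an illustration under stated assumptions, not a model answer to the interview question. The first function assumes that ‘ability’ is normally distributed, which is what makes ‘two standard deviations above average’ correspond to roughly the top 2%. The second is a back-of-envelope estimate of the Alpha Centauri question; every input (the star's luminosity, the pupil size, the share of light in the visible band, treating all photons as green, ignoring the atmosphere) is an assumed round figure, and all the names are invented for the example.

```python
import math
from statistics import NormalDist


def share_above(sd_above_mean: float) -> float:
    """Fraction of a normally distributed population above the cut-off."""
    return 1.0 - NormalDist().cdf(sd_above_mean)


# Assumed round inputs for an order-of-magnitude estimate only.
L_SUN_W = 3.8e26              # solar luminosity in watts
STAR_LUMINOSITY_SUNS = 1.5    # assumed: Alpha Centauri A alone, about 1.5 Suns
DISTANCE_LY = 4.4             # assumed distance in light years
METRES_PER_LY = 9.46e15
PUPIL_DIAMETER_M = 7e-3       # assumed dark-adapted pupil, about 7 mm
VISIBLE_FRACTION = 0.4        # assumed share of output in the visible band
WAVELENGTH_M = 550e-9         # treat every visible photon as green light
PLANCK_J_S = 6.626e-34
LIGHT_SPEED_M_S = 3.0e8


def photons_per_second() -> float:
    """Rough count of visible photons entering one eye each second."""
    distance_m = DISTANCE_LY * METRES_PER_LY
    # Light spreads over a sphere of radius distance_m.
    flux_w_per_m2 = (L_SUN_W * STAR_LUMINOSITY_SUNS) / (4 * math.pi * distance_m ** 2)
    pupil_area_m2 = math.pi * (PUPIL_DIAMETER_M / 2) ** 2
    visible_power_w = flux_w_per_m2 * pupil_area_m2 * VISIBLE_FRACTION
    photon_energy_j = PLANCK_J_S * LIGHT_SPEED_M_S / WAVELENGTH_M
    return visible_power_w / photon_energy_j


if __name__ == "__main__":
    print(f"share above +2 SD: {share_above(2.0):.2%}")
    print(f"photons per second: about 10^{round(math.log10(photons_per_second()))}")
```

The point of such a question is the chain of reasoning (inverse-square law, collecting area, energy per photon) rather than the final figure, which moves by a factor of a few if any of the assumed inputs change.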

The fact that the real skills they want to test are asked about in interviews rather than in public exams is, in my opinion, not only bad for ‘standards’ but is also unfair. Rich schools with long connections to Oxbridge colleges have teachers who understand these interviews and know how to prepare pupils for them. They still teach the material tested in old exams and other materials such as Russian textbooks created decades ago. A comprehensive in east Durham that has never sent anybody to Oxbridge is very unlikely to have the same sort of expertise and is much more likely to operate on the very mistaken assumption that getting a pupil to three As is sufficient preparation for Oxbridge selection. Testing skills in open exams that everybody can see would be fairer.

I will return to this issue in a later blog but it is important to consider the oddities of this situation. Decades ago, open public standardised tests were seen as a way to overcome prejudice. For example, Ivy League universities like Harvard infamously biased their admissions system against Jews because a fair open process based on intellectual abilities, and ignoring things like lacrosse skills, would have put more Jews into Harvard than Harvard wanted. Similar bias is widespread now in order to keep the number of East Asians low. It is no coincidence that Caltech’s admissions policy is unusually based on academic ability and it has a far higher proportion of East Asians than the likes of Harvard.

Similar problems apply to Oxbridge. A consequence of making exams easier and removing Oxbridge admissions tests was to make the process more opaque and therefore biased against poorer families. The fascinating journey made by the intellectual Left on the issue of standardised tests is described in Steven Pinker’s recent influential essay on university admissions. I agree with him that a big part of the reason for the ‘madness’ is that the intelligentsia ‘has lost the ability to think straight about objective tests’. Half a century ago, the Left fought for standardised tests to overcome prejudice; now many on the Left oppose tests and argue for criteria that give the well-connected middle classes unfair advantages.

This combination of problems is one of the reasons why the Cambridge pure maths department and physics department worked with me to develop projects to redo 16-18 curricula, teacher training, and testing systems. Cambridge is even experimenting with a ‘correspondence Free School’ idea proposed by the mathematician Alexander Borovik (who attended one of the famous Russian maths schools). Powerful forces tried to stop these projects happening because they are, obviously, implicit condemnations of the existing system – condemnations that many would prefer had never seen the light of day. Similar projects in other departments at other universities were kiboshed for the same reason, as were other proposals for specialist maths schools as per the King’s project (which also would never have happened but for the determination of Alison Wolf and a handful of heroic officials in the DfE). I will return to this too.

C. Conclusions

Here are some tentative conclusions.

  1. The political and bureaucratic process for the introduction of the GCSE and National Curriculum was a shambles. Those involved did not go through basic processes to agree aims. Implementation was awful. All elements of the system failed children. There are important lessons for those who want to reform the current system.
  2. Given the weight of evidence above, it is hard to avoid the conclusion that GCSEs were made easier than O Levels and became easier still over time. This means that at least the top fifth are, from the age of 14, aimed at lower standards than they would have been previously (not that O Levels were at all optimal). Many of them spend two years on low grade material and repeating boring drills, in order that the school can maximise its league table position, instead of delving deeper into subjects. Inflation seems to have stopped in the last two years, perhaps temporarily, but by the use of an Ofqual system known as ‘comparable outcomes’ which is barely understood by anybody in the school system or DfE.
  3. A Levels, at least in maths, sciences, and languages, were quickly made easier after 1988 and not just by enough to keep pass marks stable but by enough to lead to large increases. Even A Level students are aimed at mundane tasks like ‘design a poster’ that are suitable for small children – not near-adults. (As I type this I am looking at an Edexcel textbook for Further Maths A Level which, for some reason, Edexcel has chosen to decorate with the picture of a child in a ‘Robin’ masked outfit.)
  4. The old ‘S’ level papers, designed to stretch the best A Level students, were abandoned, which contributed to a decline of standards aimed for among the top 5%.
  5. University degrees in some subjects therefore also had to become easier (e.g. classics) or longer (natural sciences) in order to avoid increases in failure rates. This happened in some subjects even in elite universities. Remedial courses spread, even in elite universities, to teach/improve skills that were previously expected on arrival (including Classics at Oxford and History at Cambridge). Not all of the problems are because of failures in schools or easier exams. Some are because universities themselves for political reasons will not make certain requirements of applicants. Even if the exam system were fixed, this would remain a big problem. On the other hand, while publicly speaking out for AS Levels, admissions officers also, very quietly, have been gradually introducing new, non-Government/Ofqual regulated, tests for admissions purposes. On this, it is more useful to watch what universities do than what they say.
  6. These problems have cascaded right through the system and now affect the pipeline into senior university research positions in maths, sciences, and languages. For example, the lack of maths skills among biologists is hampering the development of synthetic biology and computational biology. It is very common now to have (private) discussions with scientists deploring the decline in English research universities. Just in the past few weeks I have had emails from an English physicist now at Harvard and a prominent English neuroscientist giving me details of these developments and how we are falling further behind American universities. As they say, however, nobody wants to speak out.
  7. It is much easier to see what has happened at the top end of the ability curve, where effects show up in universities, than it is for median pupils. The media also focuses on issues at the top end of the ability curve, A Levels, and the Russell Group.
  8. Because politicians took control of the system and used results to justify their own policies, and because they control funding, debate over standards became thoroughly dishonest, starting with the Conservative government in the 1980s and continuing to now when academics are pressured not to speak out by administrators for fear of politicians’ responses. When governments are in control of the metrics according to which they are judged, there is likely to be dishonesty. If people – including unions, teachers, and officials – claim they deserve more money on the basis of metrics that are controlled by a small group of people operating an opaque process and controlling the regulator themselves, there is likely to be dishonesty.

An important caveat. It is possible that simultaneously a) 1-8 is true and b) the school system has improved in various ways. What do I mean?

This is a coherent (not necessarily right) conclusion from the story told above…

GCSEs are significantly easier than O Levels. Nevertheless, the switch to GCSEs also involved many comprehensives and secondary moderns dropping the old idea that maybe only a fifth of the cohort are ‘academic’ – the idea from Plato’s Republic of gold, silver, and bronze children, that influenced the 1944 Act. Instead, more schools began to focus more pupils on academic subjects. Even though the standards demanded were easier than in the pre-1988 exams, this new focus (combined with other things) at least led between 1988 and now to a) a reduction in the number of truly awful schools and b) more useful knowledge and skills at least for the bottom fifth of the cohort (in ability terms), and perhaps for more. Perhaps the education of median ability pupils stayed roughly the same (declining a bit in maths) hence the consistent picture in international tests, the King’s results comparing maths in 1978/2008, Shayer’s results and so on (above). Meanwhile the standards demanded by post-1988 A Levels clearly fell (at least in some vital subjects), as the changes in universities testify, and S Level papers vanished, so the top fifth of the cohort (and particularly the +2 standard deviation population, i.e. the top 2%) leave school in some subjects considerably worse educated than in the 1980s. (Given most scientific and technological breakthroughs come from among this top 2% this has a big knock-on effect.) Private schools felt incentivised to perform better than state schools on easier GCSEs and A Levels rather than pursue separate qualifications with all the accompanying problems. There remains no good scientific data on what children at different points on the ability curve are capable of achieving given excellent teaching so the discussion of ‘standards’ remains circular. Easier GCSEs and A Levels are consistent with some improvements for the bottom fifth, roughly stability for the median, significant decline for the top fifth, and fewer awful schools.

This is coherent. It fits the evidence sketched above.

But is it right?

In the next blog in this series I will consider issues of ‘ability’ and the circularity of the current debate on ‘standards’.

Questions?

If people accept the conclusions about GCSEs and A Levels (at least in maths, sciences, and languages, I stress again) how should this evidence be weighed against the very strong desire of many in the education system (and Parliament and Whitehall) to maintain a situation in which the vast majority of the cohort are aimed at GCSEs (or international equivalents that are not hugely different) and, for those deemed ‘academic’, A Levels?

Do the gains from this approach outweigh the losses for an unknown fraction of the ‘more able’?

Is there a way to improve gains for all points on the ability distribution?

I have been told that there is no grade inflation in music exams. Is this true? If YES, is this partly because they are not regulated by the state? Are there other factors? Has A Level Music got easier? If not why not?

What sort of approaches should be experimented with instead of the standard approaches seen in O Levels, GCSEs, and A Levels?

What can be learned from non-Government regulated tests such as Force Concepts Tests (physics), university admissions tests, STEP, IQ tests and so on?

What are the best sources on ‘S’ Level papers and what happened with Oxbridge entrance exams?

What other evidence is there? Where are analyses similar to Warner’s on physics for other subjects?

What evidence is there for university grade inflation which many tell me is now worse than GCSEs and A Levels?

 

Standards In English Schools Part 0: Introduction

‘I think the educational and psychological studies I mentioned are examples of what I would like to call Cargo Cult Science. In the South Seas there is a Cargo Cult of people. During the war they saw airplanes land with lots of good materials, and they want the same thing to happen now. So they’ve arranged to make things like runways, to put fires along the sides of the runways, to make a wooden hut for a man to sit in, with two wooden pieces on his head like headphones and bars of bamboo sticking out like antennas – he’s the controller – and they wait for the airplanes to land. They’re doing everything right. The form is perfect. It looks exactly the way it looked before. But it doesn’t work. No airplanes land. So I call these things Cargo Cult Science, because they follow all the apparent precepts and forms of scientific investigation, but they’re missing something essential, because the planes don’t land.’ Richard Feynman’s Caltech commencement address on Education and Cargo Cult Science (1974). 

‘Let’s put behind us once and for all the old sterile debate about dumbing down. I want to end young people being told that the GCSE or A-level grades they are proud of aren’t worth what they used to be.’ Ed Balls to the Labour Party Conference, 2007.

‘It is undeniable that the last Labour government dramatically improved school standards in secondary education.’ Tristram Hunt, 26 January 2015.

‘Despite the apparently plausible and widespread belief to the contrary, the evidence that levels of attainment in schools in England have systematically improved over the last 30 years is unconvincing. Much of what is claimed as school improvement is illusory… standards have not risen; teaching has not improved… The question, therefore, is not whether there has been grade inflation, but how much…’ (Professor Robert Coe, 2013, here.)

Summary

This series of blogs will discuss: 1) what we know about standards in English schools including the effect of the introduction of the National Curriculum and GCSEs; 2) how ‘ability’ and ‘standards’ should be defined; 3) what can be learned from the 2010-15 reforms and what incentives now dominate the system; 4) what research and policy agenda is needed; 5) what materials are there for those interested in standards beyond those of the National Curriculum and state controlled exams.

Introduction

The debate about ‘standards in English schools’ is obviously of great importance but it suffers from many fundamental problems. Ironically for a debate that often involves the word ‘rigour’, the debate is itself unrigorous.

The main concepts are not properly defined. Politicians, policy people, officials, and journalists speak and write daily using phrases such as ‘we must drive up standards so that [X% of schools or pupils] hit the standard of [Y]’ when Y has no objective definition. Most obviously there has been enormous debate about grades in GCSEs and A Levels but these grades themselves are arbitrarily created according to criteria that would not impress physical scientists. The ‘standards’ are circular. Exams are regulated by the DfE and Ofqual in order that there is a very high chance that at least X% ‘pass’, then people say ‘more than X% should pass’, or ‘X% is too tough’. But the X% is just based in the first place on where the system happens to be which is historically contingent – it is not based on any scientific judgement about what children of different abilities (rigorously defined) are capable of doing given certain teaching.
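The circularity can be made concrete with a toy simulation. This is a minimal sketch under loud assumptions: it is not how the DfE or Ofqual actually set grade boundaries, the function names and the 60% target are invented for illustration, and marks are simply drawn from a normal distribution. It shows only the logical point made above: if the pass mark is chosen so that a target share of the cohort passes, the pass rate tells you nothing about what pupils can actually do.

```python
import random


def pass_mark_for_target(scores: list[float], target_pass_rate: float) -> float:
    """Lowest mark that lets roughly `target_pass_rate` of the cohort pass."""
    ranked = sorted(scores, reverse=True)
    n_passing = max(1, round(target_pass_rate * len(ranked)))
    return ranked[n_passing - 1]


def pass_rate(scores: list[float], pass_mark: float) -> float:
    """Share of the cohort scoring at or above the pass mark."""
    return sum(s >= pass_mark for s in scores) / len(scores)


def simulated_cohort(mean_score: float, size: int = 10_000, seed: int = 0) -> list[float]:
    """Hypothetical cohort: marks drawn from a normal distribution (assumed)."""
    rng = random.Random(seed)
    return [rng.gauss(mean_score, 15.0) for _ in range(size)]


if __name__ == "__main__":
    target = 0.60  # assumed political target: about 60% should 'pass'
    for mean in (50.0, 40.0):  # a stronger and a weaker cohort
        cohort = simulated_cohort(mean)
        mark = pass_mark_for_target(cohort, target)
        print(f"mean {mean:.0f}: pass mark {mark:.1f}, "
              f"pass rate {pass_rate(cohort, mark):.1%}")
```

In this toy setup the weaker cohort gets a lower pass mark and the same pass rate, so a headline ‘X% passed’ cannot distinguish the two cohorts. Breaking the circle would require an external, objective measure of what pupils can do, which is the research agenda argued for below.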

In the recent debate over reforming GCSEs, when we tried to drop their use in the accountability system in 2012, Nick Clegg insisted, and Cameron agreed, that the entire reform process be based on the principles that i) about 95% of the cohort should do the same exams at 16 and ii) not many more would fail to pass than now (2012). Definitions of a ‘pass’ were therefore set in order to fit with an a priori desire for a certain percentage to pass – a political desire of one party rather than an educational judgement. (The other two parties have had the same approach over the past thirty years – my point is not that the LibDems are particularly bad.)

Despite having a circular process for defining standards, it has been a central feature of education debates for politicians to set targets for what proportion of pupils every school must get to ‘pass’ – targets that have high stakes for school management and teachers. One can understand the motivation, given the bad effects for individuals of being in really bad schools, but the process as a whole does not make sense. Further, Ofqual imposes a system (‘comparable outcomes’) which is intended to combat grade inflation but which also seems to operate deliberately against the goal of significant rises in the proportion passing GCSEs. Further, Ofsted’s reports add noise, not signal, given, as Professor Coe has said, ‘its judgements have little scientific credibility’ (some argue this is too generous).

Similarly, people in the education world use the word ‘ability’ but they almost never define or have an objective measure for ‘ability’. The work of scientists on this subject has been almost entirely ignored and has had practically no effect on policy in England. Many teacher training colleges promote ‘cargo cult’ science on the subject of ‘ability’ to thousands of teachers who are therefore confident in views that are the opposite of what the science says.

As far as I am aware, there is no serious research agenda in English schools attempting to a) discover what pupils of different ability, using objective measures, are capable of achieving given certain teaching and b) use this knowledge to shape the curriculum, tests, and objective measures of school performance in an iterative feedback loop that can improve its accuracy over time.

The main point of these blogs is to help make the case for such a research programme (see below). Since first becoming involved in education debates in 2007, I have had many discussions about this. I have said to many people, including in the Royal Society, the home of British science, that we need a scientific approach to the issue of standards and ability. I wrote about it in my essay that became public in 2013. I argued for it in the DfE, with subject associations, with those responsible for teacher training (‘the most bankrupt institution I know’, said Hattie), and with many people who talk about ‘research’ and ‘evidence’.

Few have wanted to engage in this subject because it is so politically fraught. Even fewer have done so publicly and I have personal experience of severe pressure put on many academics by university administrators not to tell the truth. However, a very positive development in English education is the growth of support for thinking seriously about evidence. In the DfE, there was a long battle on this issue that ended suddenly when the new Permanent Secretary arrived and immediately agreed with the appointment of Ben Goldacre to do a review of the Department’s handling of evidence, research, and data, which was published in 2013. (I have written many critical things about officials, such as HERE, so it is worth noting that Wormald, and other officials particularly younger ones, took this enlightened view.) There is no doubt that the culture inside the Department changed as a result though there is a very long way to go in this area and it is reasonable to be doubtful about any of the three parties’ commitment to this approach and about civil service commitment. Tom Bennett’s efforts with ResearchEd have been fantastic and are one of the most hopeful things I’ve seen since 2007. There is also now a discussion about a possible College of Teachers – an institution that will only be credible if it has high standards on the subject of cargo cult research. Unsurprisingly, therefore, more people are starting to ask: what do we know about standards? (E.g. Sam Freedman recently blogged on it.)

I therefore thought I would jot down in a series of blogs various bits of evidence, history, thoughts, discussions and so on that I have accumulated since 2007.

Five broad areas

This series of blogs will consider inter alia these questions grouped in five rough areas (which may change as I go along).

A. What is the evidence concerning ‘standards in English schools’? What was the effect of the introduction of 1) GCSEs and 2) the National Curriculum with its connected testing regime? What were the cascading effects on A Levels and higher education? What do comparisons with international tests and other academic studies tell us? What do subject associations and organisations such as the Royal Society say? What do universities and subject experts say?

B. How should ‘ability’ and ‘standards’ be defined? What undermines sensible discussion about this?

C. What was Gove’s team trying to do 2010-14? How effective were reforms concerning the curriculum, exams, and accountability (including the role of Ofsted)? What lessons might be learned from the period 2010-15? What incentives dominate the system now?

D. What should come next? What can we reasonably infer from the period since 1985 about what is very unlikely to work? What should the parties not put in their manifestos? What are the main reasons why political and policy discussion of this subject has been so controversial? How does the transformation of the technological landscape since the mid-1980s change arguments? How could a focus on evidence and empiricism help improve the system?

E. What materials are there that can be used by schools that are focused more on education and learning than the official accountability system?

The goal

The goal of these blogs is not to ‘defend the Gove reforms’. When I get onto them, I will try to explain as clearly as I can why we tried to do certain things and what went wrong. GCSE reform (along with the disaster of Ofsted) is arguably the biggest failure of our team and therefore particularly needs analysis. The goal is not to affect party manifestos – it is possible but unlikely that someone reading this may be able to nudge things off a party or bureaucratic agenda. It is reasonable to assume that whatever the parties promise their plans will crumble on contact with reality. My main hope is that people outside SW1 at the coalface of education take matters into their own hands and develop their own approaches to scientific experimentation with the curriculum, exams, and training.

In my opinion, the only real hope for large improvements in learning is if 1) a critical mass of people become convinced of the need for an empirical approach and the rejection of ‘cargo cult science’ that has dominated education, and 2) an empirical programme emerges that iteratively a) tests what children of different abilities can learn and b) uses this information to alter curricula, tests, and teacher training. We need experiments and Grand Prizes in education of the kind that have brought dramatic breakthroughs in other areas, such as DARPA’s Grand Challenge, which led to breakthroughs in basic science and then to driverless cars. Imagine what well-defined Grand Challenges could bring to English schools.

Improvements in education do not need to be justified as goals with reference to other things such as economic growth. Learning and education are fundamental aspects of being human. However, it is obvious that humans will have to grapple with profound challenges over the next thirty years. The population will grow by another few billion, mainly in cities and connected to the mobile internet and ‘the internet of things’. Energy and other resource demands will put the global system under huge pressure. We face old security threats like nuclear weapons and new threats such as the use of genetic engineering techniques empowering garage bio-hackers, for good and evil. For example, the revolutionary genome ‘cut and paste’ engineering tool, CRISPR, may soon be used to ‘de-extinct’ species and eradicate diseases but the same techniques could be used destructively. Much progress in machine intelligence and robotics is being driven by research controlled by militaries and intelligence agencies but little research is done on the profound dangers.

If we are to cope with these things, we will need new technologies, new institutions, and new ideas. Improving our education system is therefore obviously central. I have proposed that it ought to become the central organising principle for the British state, as an answer to Dean Acheson’s famous quip that Britain had failed to find a post-imperial role.

Hopefully the discussion of standards in English schools will be useful regardless of whether you agree with this broader argument or not.

Please leave comments, corrections, research reports, complaints etc below. I will add things people leave as I go along and at the end try to produce something short and rigorous…

International GCSEs and the DfE ban – was there a better path?

It is reported that the DfE has decided not to allow international GCSEs to be used in league tables.

This is not a surprise though I think it is bad policy. I will explain some background, my involvement in discussions about this before I left in January 2014, and why I think it is a mistake.

None of the boards’ iGCSEs counted in league tables pre-2010. We thought this was a mistake. Some of the best private schools used the Cambridge International iGCSE. Some great state schools told us in Opposition that they wanted to do the same. It seemed reasonable to have more diversity in the system and let state schools do what private schools were doing particularly given the huge problems with standard GCSEs and the difficulty with reforming them.

During 2013, as standard GCSEs were being reformed, the issue arose of what to do about iGCSEs.

The issue was complicated by differences among the boards. I have never heard anyone claim that the Cambridge International iGCSE is easier than the standard GCSE. However, there were persistent arguments that other boards’ iGCSEs were not as hard as GCSEs and private schools and others were being conned.

There was a piece of research circulating (from people very well known in the education world who are taken very seriously) that plotted the Cambridge iGCSE against the standard GCSE and another board’s iGCSE, with the former harder than GCSEs and the latter easier than GCSEs. The research was not published because the people concerned were frightened of legal action (itself a telling detail about the epidemic dishonesty in the English debate on exams and standards, a dishonesty that, I think, most who discuss education policy greatly underestimate).

In 2013, officials and Ofqual argued that the reformed GCSEs starting from 2015 should completely replace iGCSEs which should not be allowed in the league tables. MG and I were not keen on this idea. I spoke to people about their concerns. I suggested the following path through (the below is not a quote from my memo, which I will dig out, but includes the main ideas)…

The DfE should announce that anybody who wants their exams, including iGCSEs, to be included in league tables will have to produce clear, overwhelming, and independent evidence that they are significantly more challenging than standard GCSEs. If they produce such evidence, we will include them; if not, not. This means that we avoid banning things that are obviously better than GCSEs but we also cull those exam boards who are abusing the system. We will also learn from the evidence presented and that exercise will be a useful thing – even if none of the boards submit anything we will learn something valuable. We’re trying to move all of these debates towards people discussing evidence rather than hunches and this is a very good candidate for this approach.

The responses boiled down to two things.

1) Nobody had an argument against the idea as policy.

2) Nobody wanted to do it. The bureaucratic arguments amounted to: a) ‘It’s messy.’ b) ‘We’ll get sued by the exam boards who are always cheating things and will hire fancy lawyers.’ c) ‘Ofqual doesn’t want to.’ Why? ‘It’s messy.’ d) Unstated but hanging over the discussion – ‘it’s a lot of hard work on a marginal issue and nobody is going to attack us for being elitist if we ban them’.

Given this response, MG and spads thought we should try it.

Some time in autumn 2013, I can’t remember when, I was tipped off that ‘David Laws hates your idea, he just wants to ban them, there’s a meeting shortly’. I went to the meeting. I was not optimistic and assumed I would have to torpedo him with ‘SoS agrees with me’, given the state of relationships by then following Clegg’s appalling behaviour. My assumption was wrong.

The issues were explained. I gave him my argument. He had not heard it. I said that the bureaucratic arguments were not relevant – particularly absurd fears about legal challenges (an argument deployed daily) – and the best thing educationally was to give it a whirl regardless of some complexity and irrelevant media noise. Laws listened and asked questions. He was reasonable. He asked officials if anybody had a policy argument against the idea. Nobody did; the main argument was ‘Ofqual really wants us to ban them’. It was also clear, though, that the issue was not closed and officials knew I was soon leaving.

It is no surprise to see the news today. The bureaucracy now has its clarity, but is it a good decision? Were Nicky Morgan / her spads given an alternative (it would not surprise me if the option was never presented to them)? What will the DfE say when a state school says ‘Eton does the Cambridge iGCSE in X because they think it’s better – why can’t we offer our pupils the same thing, as you promised before the 2010 election?’? Can anybody see a downside to trying the other path, given one could always have reverted to banning everything if it proved unworkable?

It is of course possible that detailed work was done after I left and this decision was taken for reasons that are not public but are sound. If so, I am sure the new evidence-based DfE will make the technical arguments public.

Please leave comments, corrections etc.

UPDATE 1. This story strengthens my view that one of the most important things for the improvement of education in English state schools is the development of new exams that are outside the regulatory structure of the DfE and Ofqual – exams that are aimed purely at encouraging deep skills in mathematical modelling, extended writing and so on. It is not a coincidence that perhaps the most challenging exam taken in English schools – maths STEP – a) is not created by the domestic exam boards (Cambridge Assessment, not OCR), b) has zero input from DfE, c) is not regulated by Ofqual, d) has a clear educational purpose of encouraging deep skills needed for a serious undergraduate degree, and e) the people who use it as a tool would be horrified at the idea of the DfE, Ofqual, or ‘education policy people’ proposing Whitehall should have anything to do with it. I will blog soon on how I think a new ‘post-GCSE & A Level’ system could evolve.

UPDATE 2. As some emails winging from the DfE say, not all officials wanted the ban and some agreed with the course suggested above. True. (I tend to assume readers of this blog will assume the DfE is not monolithic.) But it was also clear which way Whitehall’s gravity was pulling.

UPDATE 3. An email arrives from inside the DfE – a senior official who was involved in these decisions… He points to this research on iGCSEs. The C/D borderline figures are the most interesting.

I want to stress – I am not saying that international GCSEs are ‘the answer’ to the problems with the exam system. I do not think they are. I think the problems are much more fundamental and require much deeper changes. My point is that it would have been much better policy to ask the boards for hard evidence about the exams in order that the policy world can examine the issues on the basis of data rather than hunches and just ‘officials say they’re easier’ / ‘well why does Eton do them then?’ etc, which is the level of debate over the past decade. If the DfE made the decision on the basis largely of the evidence in the link above, then it should explain this publicly so people can judge whether their thought process was reasonable, otherwise inevitably many will assume the decision was made for bureaucratic – not educational – reasons.

Times op-ed: What Is To Be Done? An answer to Dean Acheson’s famous quip

On Tuesday 2 December, the Times ran an op-ed by me, which you can see HERE. It got cut slightly for space. Below is the original version that makes a few other points.

I will use this as a start of a new series on what can be done to improve the system including policy, institutions, and management.

NB1. The article is not about the election or party politics. My suggested answer to Acheson is, I think, powerful partly because it is something that could be agreed upon, in various dimensions, across the political spectrum. I left the DfE in January partly because I wanted to have nothing to do with the election and this piece should not be seen as advocating ‘something Tories should say for the election’. I do not think any of the three leaders are interested in or could usefully pursue this goal – I am suggesting something for the future when they are all gone, and they could quite easily all be gone by summer 2016.

NB2. My view is not – ‘public bad, private good’. As I explained in The Hollow Men II, a much more accurate and interesting distinction is between a) large elements of state bureaucracies, dreadful NGOs like the CBI, and many large companies (that have many of the same HR and incentive problems as bureaucracies), where very similar types rise to power because the incentives encourage political skills rather than problem-solving skills, and b) start-ups, where entrepreneurs and technically trained problem-solvers can create organisations that operate extremely differently, move extremely fast, create huge value, and so on.

(For a great insight into start-up world I recommend two books. 1. Peter Thiel’s new book ‘Zero To One’. 2. An older book telling the story of a mid-90s start-up that was embroiled in the Netscape/Microsoft battle and ended up selling itself to the much better organised Bill Gates – ‘High Stakes, No Prisoners’ by Charles Ferguson. This blog, Creators and Rulers, by physicist Steve Hsu also summarises some crucial issues excellently.)

Some parts of government can work like start-ups but the rest of the system tries to smother them. For example, DARPA (originally ARPA) was set up as part of the US panic about Sputnik. It operates on very different principles from the rest of the Pentagon’s R&D system. Because it is organised differently, it has repeatedly produced revolutionary breakthroughs (e.g. the internet) despite a relatively tiny budget. But also note – DARPA has been around for decades and its operating principles are clear but nobody else has managed to create an equivalent (openly at least). Also note that despite its track record, D.C. vultures constantly circle trying to make it conform to the normal rules or otherwise clip its wings. (Another interesting case study would be the alternative paths taken by a) the US government developing computers with one genius mathematician, von Neumann, post-1945 (a lot of ‘start-up’ culture) and b) the UK government’s awful decisions in the same field with another genius mathematician, Turing, post-1945.)

When I talk about new and different institutions below, this is one of the things I mean. I will write a separate blog just on DARPA but I think there are two clear action points:

1. We should create a civilian version of DARPA aimed at high-risk/high-impact breakthroughs in areas like energy science and other fundamental areas such as quantum information and computing that clearly have world-changing potential. For it to work, it would have to operate outside all existing Whitehall HR rules, EU procurement rules and so on – otherwise it would be as dysfunctional as the rest of the system (defence procurement is in a much worse state than the DfE, hence, for example, billions spent on aircraft carriers that in classified war-games cannot be deployed to warzones). We could easily afford this if we could prioritise – UK politicians spend far more than DARPA’s budget on gimmicks every year – and it would provide huge value with cascading effects through universities and businesses.

2. The lessons of why and how it works – such as incentivising goals, not micromanaging methods – have general applications that are useful when we think generally about Whitehall reform.

Finally, government institutions also operate to exclude from power scientists, mathematicians, and people from the start-up world – the Creators, in Hsu’s term. We need to think very hard about how to use their very rare and valuable skills as a counterweight to the inevitable psychological type that politics will always tend to promote.

Please leave comments, corrections etc below.

DC

What Is to Be Done?

There is growing and justified contempt for Westminster. Number Ten has become a tragi-comic press office with the prime minister acting as Über Pundit. Cameron, Miliband, and Clegg see only the news’s flickering shadows on their cave wall – they cannot see the real world behind them. As they watch floundering MPs, officials know they will stay in charge regardless of an election that won’t significantly change Britain’s trajectory.

Our institutions failed pre-1914, pre-1939, and with Europe. They are now failing to deal with a combination of debts, bad public services, security threats, and profound transitions in geopolitics, economics, and technology. They fail in crises because they are programmed to fail. The public knows we need to reorient national policy and reform these institutions. How?

First, we need a new goal. In 1962, Dean Acheson quipped that Britain had failed to find a post-imperial role. The romantic pursuit of ‘the special relationship’ and the deluded pursuit of a leading EU role have failed. This role should focus on making Britain the best country for education and science. Pericles described Athens as ‘the school of Greece’: we could be the school of the world because this role depends on thought and organisation, not size.

This would give us a central role in tackling humanity’s biggest problems and shaping the new institutions, displacing the EU and UN, that will emerge as the world makes painful transitions in coming decades. It would provide a focus for financial priorities and Whitehall’s urgent organisational surgery. It’s a goal that could mobilise very large efforts across political divisions as the pursuit of knowledge is an extremely powerful motive.

Second, we must train aspirant leaders very differently so they have basic quantitative skills and experience of managing complex projects. We should stop selecting leaders from a subset of Oxbridge egomaniacs with a humanities degree and a spell as spin doctor.

In 2012, Fields Medallist Tim Gowers sketched a ‘maths for presidents’ course to teach 16-18 year-olds crucial maths skills, including probability and statistics, that can help solve real problems. It starts next year. [NB. The DfE funded MEI to turn this blog into a real course.] A version should be developed for MPs and officials. (A similar ‘Physics for Presidents’ course has been a smash hit at Berkeley.) Similarly, pioneering work by Philip Tetlock on ‘The Good Judgement Project’ has shown that training can reduce common cognitive errors and can sharply improve the quality of political predictions, hitherto characterised by great self-confidence and constant failure.

New interdisciplinary degrees such as ‘World history and maths for presidents’ would improve on PPE but theory isn’t enough. If we want leaders to make good decisions amid huge complexity, and learn how to build great teams, then we should send them to learn from people who’ve proved they can do it. Instead of long summer holidays, embed aspirant leaders with Larry Page or James Dyson so they can experience successful leadership.

Third, because better training can only do so much, we must open political institutions to people and ideas from outside SW1.

A few people prove able repeatedly to solve hard problems in theoretical and practical fields, creating important new ideas and huge value. Whitehall and Westminster operate to exclude them from influence. Instead, they tend to promote hacks and apparatchiks and incentivise psychopathic narcissism and bureaucratic infighting skills – not the pursuit of the public interest.

How to open up the system? First, a Prime Minister should be able to appoint Secretaries of State from outside Parliament. [How? A quick and dirty solution would be: a) shove them in the Lords, b) give Lords ministers ‘rights of audience’ in the Commons, c) strengthen the Select Committee system.]

Second, the 150-year experiment with a permanent civil service should end and Whitehall must open to outsiders. The role of Permanent Secretary should go and ministers should appoint departmental chief executives so they are really responsible for policy and implementation. Expertise should be brought in as needed with no restrictions from the destructive civil service ‘human resources’ system that programmes government to fail. Mass collaborations are revolutionising science [cf. Michael Nielsen’s brilliant book]; they could revolutionise policy. Real openness would bring urgent focus to Whitehall’s disastrous lack of skills in basic functions such as budgeting, contracts, procurement, legal advice, and project management.

Third, Whitehall’s functions should be amputated. The Department for Education improved as Gove shrank it. Other departments would benefit from extreme focus, simplification, and firing thousands of overpaid people. If the bureaucracy ceases to be ‘permanent’, it can adapt quickly. Instead of obsessing on process, distorting targets, and micromanaging methods, it could shift to incentivising goals and decentralising methods.

Fourth, existing legal relationships with the EU and ECHR must change. They are incompatible with democratic and effective government.

Fifth, Number Ten must be reoriented from ‘government by punditry’ to a focus on the operational planning and project management needed to convert priorities to reality over months and years.

Technological changes such as genetic engineering and machine intelligence are bringing revolution. It would be better to undertake it than undergo it.