Two tales of deception
We may be misled by false signs of authority, but being fooled by our deepest beliefs is a bigger concern
As the leaves in Stockholm (and elsewhere in the Northern hemisphere) begin to colour, and some early ones already let go of the twigs they’ve been holding on to since the spring, it’s the season of the Nobel prizes. And pretty much every year someone somewhere will refer to a study linking the number of Nobel prize winners a country has had with its national chocolate consumption per capita. Surely this must be fake?
The study in question (ungated here) was conducted by Dr Franz Messerli, a Swiss cardiologist working in the US since 1976, and published in 2012 in the New England Journal of Medicine. It looked at 23 countries and found a strong correlation (r = 0.791) between the amount of chocolate consumed by a country’s population and the number of Nobel laureates it has produced. (The correlation coefficient r is 0 for totally random, uncorrelated data, and 1 for a perfect correlation.) If the outlier Sweden was discarded, the correlation rose to even loftier heights (r = 0.862). To produce one additional Nobel prize winner, every citizen of a country would need to eat another 400 g of chocolate per year — a small sacrifice for some additional national pride, no?
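For the curious, the correlation coefficient is simple to compute yourself. Here is a minimal Python sketch using made-up illustrative numbers (not Messerli’s actual data) to show how Pearson’s r behaves:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient: the covariance of x and y
    divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical numbers for illustration only: kg of chocolate per
# capita per year vs Nobel laureates per 10 million inhabitants.
chocolate = [2.0, 4.5, 6.0, 8.5, 9.0, 11.0, 12.0]
laureates = [1.5, 5.0, 9.0, 12.0, 18.0, 25.0, 31.0]

r = pearson_r(chocolate, laureates)
print(f"r = {r:.3f}")
```

On these toy numbers r comes out above 0.9 — a “strong” correlation in the same sense as the study’s r = 0.791. Of course, as the rest of the story shows, a high r says nothing about *why* two quantities move together.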
Fooled by seriousness
It still looks like one of the Spurious Correlations, like the one between the cheese consumption of the US population and the number of people who perished by becoming tangled in their bedsheets. The fact that the author is Swiss (Switzerland is a large producer of quality chocolate, and a nation of consumers — they eat more of it than anyone else, about 12 kg per person per year) adds to the suspicion that it is really a joke article.
That is also what Daniël Lakens, a psychologist at the Eindhoven University of Technology in the Netherlands, and a respected expert in statistics, thought. On 22 October he tweeted a link to the study, indicating it was a humorous illustration of why correlation is not causation. But less than 24 hours later, he tweeted “Oh holy crap I was wrong — this article actually WAS serious”, citing two colleagues who had pointed out the error of his ways. The study does actually look quite serious. It hypothesizes a causal mechanism: chocolate has been found to improve cognitive function; it also suggests a possible reverse causation (smart people know that chocolate is good for the brain, so they eat more chocolate).
One of the colleagues had, just a week before, given the paper to her students as a serious study, and the other had responded at the time with earnest criticism of it. “It seemed unlikely that a journal with such exacting standards would give space to a lightweight piece on chocolate,” he wrote. And yet…
A few hours after tweeting he thought it was serious, Lakens conceded that “it probably WAS a joke after all”, citing a contemporary Reuters article revealing that the ‘study’ was indeed not serious. Messerli had come up with the idea after reading research linking flavonoids, antioxidant substances present in cocoa and wine, to better results on cognitive tests. A little later, with nothing else to do in a hotel room in Kathmandu, he wrote it all up.
This amusing anecdote produces a couple of interesting insights. First is the observation that not only does correlation not mean causation, it can also mean nothing at all. The p-value — the probability of observing a result at least this extreme if chance alone were at work — for Messerli’s study was below 0.0001, or less than one in 10,000. Given that a p-value of less than 0.05 is treated as statistically significant, it makes you wonder how confident we can be in scientific results in general.
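For readers who want an intuition for what that p-value means, here is a sketch of a permutation test in Python (again on hypothetical toy numbers, not the study’s data): shuffle one variable many times and count how often chance alone produces a correlation at least as strong as the one observed.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def permutation_p(x, y, trials=10_000, seed=42):
    """Approximate two-sided p-value: the fraction of random
    shufflings of y whose |r| matches or beats the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    hits = 0
    y_shuffled = list(y)
    for _ in range(trials):
        rng.shuffle(y_shuffled)
        if abs(pearson_r(x, y_shuffled)) >= observed:
            hits += 1
    return hits / trials

# Hypothetical, strongly correlated toy data:
chocolate = [2.0, 4.5, 6.0, 8.5, 9.0, 11.0, 12.0]
laureates = [1.5, 5.0, 9.0, 12.0, 18.0, 25.0, 31.0]

p = permutation_p(chocolate, laureates)
print(f"p is approximately {p:.4f}")
```

Even on seven data points the shuffled correlations almost never match the observed one, so the estimated p-value is tiny — which is exactly why a tiny p-value for a meaningless correlation, like chocolate and Nobel prizes, is so instructive: it tells you the pattern is unlikely to be pure chance, not that the pattern means anything.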
But it also highlights the heuristics we use, the rules of thumb to determine whether something is reliable or not. We attach a lot of importance to authority. The NEJM is a highly respected journal, not known for frivolous articles, and submissions need to meet very high standards to be accepted. Dr Messerli is an authority in the field of hypertension with over 800 publications to his name. The article quotes other relevant studies. If this is our starting point, then neither the improbable results and conclusions, nor even the disclosure statement at the end of the article (“Dr. Messerli reports regular daily chocolate consumption, mostly but not exclusively in the form of Lindt’s dark varieties”) may be enough to make us doubt the seriousness of the claims.
Still, the worst that can happen if you mistake a satirical article for the real thing is that you make a bit of a fool of yourself, and no serious harm is done. But that is not always the case.
A fake prison (experiment)
One of the most famous psychological experiments is the Stanford Prison Experiment (SPE), conducted in 1971 by researchers led by psychologist Philip Zimbardo at, you guessed it, the prestigious Stanford University. 24 volunteer students were randomly divided into two groups, twelve ‘prisoners’ and twelve ‘guards’. It was set up to be highly realistic, with mock prison cells in the basement of a university building, buckets for sanitation, and ill-fitting smocks for the ‘prisoners’ to wear. The students playing the role of guards were likewise given suitable attributes (khaki shirts and trousers, batons, etc.). Shortly after the start of the experiment, the ‘guards’ apparently spontaneously began embracing their role, becoming increasingly abusive towards the ‘inmates’, and even subjecting them to mental torture and physical violence.
According to Zimbardo, the situation rather than the individual disposition (i.e. the inherent character traits) of the ‘guards’ was the cause of their brutal behaviour. As he would later write in an article for Stanford Magazine, “normal people could behave in pathological ways even without the external pressure of an experimenter-authority.” Normal young men (many were avowed pacifists) became hostile and sadistic, verbally and physically abusing others — simply by playing the role of guards. The “simulation had to be ended after six days because the inhumanity of the ‘evil situation’ had totally dominated the humanity of the ‘good’ participants.”
Although there were questions around the experiment at the time, both regarding its ethics and its rigour, it received widespread media attention, and was included in many psychology textbooks. Zimbardo became a bit of a celebrity and as a result acquired considerable authority.
But all was not what it seemed. It turned out that Zimbardo, contrary to his claim, had played an active role (as the ‘prison superintendent’) and that his assistants had actually encouraged the mock guards’ authoritarian behaviour. There were also doubts about selection bias among the participants, and about the accuracy of the screening for psychological disorders. Apparently, only three of the guards were excessively abusive.
Nevertheless, the study remains prominently cited in textbooks to this day — mostly without reference to any of the criticisms, as a 2015 paper by Jared Bartels, a psychologist, describes. Thibault Le Texier, a researcher at the University of Nice in France, trawled not just through the academic history around the experiment (publications both by Zimbardo and by his critics over the years), but also through the experiment’s archival material. He also interviewed 15 of the original participants. His comprehensive study leaves little standing of the original experiment’s conclusions, and exposes it as a fake and a fraud.
The lure of authority
How come, given all that was known (and could have been known), the SPE has enjoyed (and still enjoys) a status it does not deserve? One factor is probably, as with the fake study on chocolate and Nobel prizes, the assumption of authority. The journal in which this study was published has a strong reputation, and so do introductory textbooks in the eyes of undergraduate psychology students. Such publications carry an implicit seal of approval. And as we have seen, that assumption may be mistaken.
But there is another likely reason, which is altogether more worrisome.
The thing is, the Stanford Prison Experiment ‘proves’ something many people would like to believe. In an interview with Julia Galef for the Rationally Speaking podcast, Thibault Le Texier cites a New York Times article from 2018. It argues that debunking studies that “became classics — and world-famous outside psychology” is different from challenging more obscure experiments, because “they dramatized something people recognized in themselves.” They are “metaphors, explanations for aspects of our behaviour that we sense are true.”
We may be surprised (if not shocked) that this view — never mind the science, we just feel it’s true — is propagated by a quality newspaper. But perhaps the piece just voices what is really a very widespread sentiment in its subtitle: “Many famous studies of human behavior cannot be reproduced. Even so, they revealed aspects of our inner lives that feel true.” Sometimes, scientific evidence is no match for what we feel to be true.
Ultimately, we should be less worried that we occasionally misjudge the authority of people, publications or institutions. By far the bigger worry is how easily we bow to the authority of our own, unevidenced beliefs.
Originally published at http://koenfucius.wordpress.com on November 1, 2019.
Thanks for reading this article — I hope you enjoyed it. Please do share it far and wide — there are handy Twitter and Facebook buttons nearby, and you can click here to share it via LinkedIn, or simply copy and paste this link. See all my other articles featuring observations of human behaviour (I publish one every Friday) here. Thank you!