If You Torture the Data Long Enough, It Will Confess

Ronald Coase? Irving John Good? Charles D. Hendrix? Robert W. Flower? Bulent Gultekin? Anonymous?

Dear Quote Investigator: Collecting and interpreting data is a delicate process that is subject to conscious and unconscious biases. The selective choice of inputs and statistical tests can yield results that are misleading. Here are two versions of a comical metaphorical adage:

  • If you torture the data long enough, it will confess.
  • If you torture the data enough, nature will always confess.

Strictly speaking these statements are ambiguous. Each interpretation hinges on whether the information in the coerced confession is correct or erroneous. The usual interpretations presume that the information extracted under duress is incorrect. Thus, torturing the data is counterproductive and not revelatory.

Both of these sayings have been attributed to Nobel Prize-winning economist Ronald Coase. Would you please explore this topic?

Quote Investigator: The earliest match located by QI appeared in an address delivered on April 22, 1971 by British mathematician I. J. Good (Irving John Good) at a meeting of the Institute of Mathematical Statistics. Good’s lecture was printed in “The American Statistician” in June 1972. Boldface added to excerpts by QI: 1

As Ronald Coase says “If you torture the data long enough, it will confess.” When data is tortured, it is useful when possible to reserve some of the sample for testing a hypothesis after it is formulated because there is not yet any satisfactory logic for using the whole of the sample.

Interestingly, Coase stated that he employed a different phrasing for the saying as shown in the citations presented further below dated August 1977 and 1982.

Here are additional selected citations in chronological order.

The figurative framework of torturing data to yield misleading evidence has appeared in print for decades, but early instances did not extend the metaphor to include confession. For example, a 1933 article in “The Elementary School Journal” contained the following passage: 2

The evidence submitted by the committee from its own questionnaire warrants no such conclusion. To torture the data given in Table I into evidence supporting a twelve-hour minimum of professional training is indeed a statistical feat, but one which the committee accomplishes to its own satisfaction.

In April 1971 Irving John Good attributed the saying under examination to Ronald Coase as mentioned previously.

In October 1972 the “Sunday Gazette-Mail” of Charleston, West Virginia announced a forthcoming lecture by Charles D. Hendrix with a title based on the saying. QI conjectures that Hendrix was repeating a clever remark he had heard circulating amongst statisticians: 3

“If You Torture the Data Long enough It Will Confess” will be the topic of Charles D. Hendrix’s talk before the American Society for Quality Control at 6:30 p.m. Wednesday at the Kim Tiki Restaurant.

In June 1976 the UPI News Service reported a remark by Robert W. Flower of the Johns Hopkins Applied Physics Laboratory. Flower employed the metaphorical framework, but he did not formulate his statement as an adage: 4

“We could torture the data and make it confess so that it appears that we have something of tremendous clinical significance. Instead we have to collect a large amount of data over a fairly long period of time. At that point we can say with certainty what the clinical significance of the work will be,” he said.

In June 1977 Professor of Management Dennis E. Logue attributed the saying to Professor of Finance Bulent Gultekin: 5

As my friend and colleague, Bulent Gultekin, has often remarked, “if you torture the data long enough, they confess—even to crimes that were never committed.” The history of empirical economic research is rife with instances when illegitimate “theories” were proven, and this is attested to by myriad tests of security selection models based on absolutely absurd correlations, such as the negative relationship between the length of women’s skirts and stock prices.

In August 1977 Professor of Economics T. Dudley Wallace wrote that he had a 1975 letter in which Ronald Coase took credit for coining the variant statement containing the word “nature”: 6

Good attributed a remark to R. H. Coase, “If you torture the data long enough, it will confess” (p. 14). Coase, in a letter to the author dated 17 October 1975, wrote, “The correct quotation is: If you torture the data long enough nature will confess.”

In 1979 Thomas Mayer, who was the President of the Western Economic Association, delivered an address that included the following: 7

. . . journals should publish papers that find statistically insignificant results. This would not only remove the great pressure to stomp on the data until they give in and yield a t value of 2 or more (as a saying has it: “if you just torture the data long enough, they will confess”), but it would also prevent others from wasting time replicating an unsuccessful project.

In November 1981 Ronald Coase delivered a lecture at the American Enterprise Institute in Washington, D.C., which was published in 1982. He stated that he employed the “nature” variant of the saying during a lecture he delivered in the early 1960s: 8

I remarked earlier on the tendency of economists to get the result their theory tells them to expect. In a talk I gave at the University of Virginia in the early 1960s, at which Warren Nutter was, I think, present, I said that if you torture the data enough, nature will always confess, a saying which, in a somewhat altered form, has taken its place in the statistical literature.

In March 1983 Professor of Economics Edward E. Leamer credited Coase within a piece published in “The American Economic Review”: 9

The econometrician’s shabby art is humorously and disparagingly labelled “data mining,” “fishing,” “grubbing,” “number crunching.” A joke evokes the Inquisition: “If you torture the data long enough, Nature will confess” (Coase).

In June 1983 within the pages of the journal “Philosophy of Science” I. J. Good continued to credit Coase: 10

As the economist Ronald Coase said “If you torture the data long enough it will confess”. (Perhaps he should have said that they will confess.)

In 1987 Professor of Statistics Stephen M. Stigler employed an extended instance of the metaphor: 11

Beware of the problem of testing too many hypotheses; the more you torture the data, the more likely they are to confess, but confessions obtained under duress may not be admissible in the court of scientific opinion.

In 1991 “The Macmillan Book of Social Science Quotations” included an entry crediting Stigler with the saying based on the citation above. 12

In conclusion, QI believes that Ronald Coase coined the saying with the word “nature” and used it during the 1960s. Irving John Good heard the remark directly or indirectly from Coase. However, QI hypothesizes that Good’s memory was inexact, and he credited Coase with a version omitting the word “nature”. Economists, scientists, mathematicians, and others subsequently repeated instances of the witticism.

