If You Torture the Data Long Enough, It Will Confess

Ronald Coase? Irving John Good? Charles D. Hendrix? Robert W. Flower? Bulent Gultekin? Anonymous?

Dear Quote Investigator: Collecting and interpreting data is a delicate process that is subject to conscious and unconscious biases. The selective choice of inputs and statistical tests can yield results that are misleading. Here are two versions of a comical metaphorical adage:

  • If you torture the data long enough, it will confess.
  • If you torture the data enough, nature will always confess.

Strictly speaking these statements are ambiguous. Each interpretation hinges on whether the information in the coerced confession is correct or erroneous. The usual interpretations presume that the information extracted under duress is incorrect. Thus, torturing the data is counterproductive and not revelatory.

Both of these sayings have been attributed to Nobel Prize-winning economist Ronald Coase. Would you please explore this topic?

Quote Investigator: The earliest match located by QI appeared in an address delivered on April 22, 1971 by British mathematician I. J. Good (Irving John Good) at a meeting of the Institute of Mathematical Statistics. Good’s lecture was printed in “The American Statistician” in June 1972. Boldface added to excerpts by QI:[ref] 1972 June, The American Statistician, Volume 26, Number 3, Statistics and Today’s Problems by I. J. Good, (Invited lecture at the 129th Meeting of the Institute of Mathematical Statistics on April 22, 1971), Start Page 11, Quote Page 14, Taylor & Francis, Abingdon, Oxfordshire, England. (JSTOR) link [/ref]

As Ronald Coase says “If you torture the data long enough, it will confess.” When data is tortured, it is useful when possible to reserve some of the sample for testing a hypothesis after it is formulated because there is not yet any satisfactory logic for using the whole of the sample.

Interestingly, Coase stated that he used a different phrasing of the saying, as shown in the citations further below dated August 1977 and 1982.

Here are additional selected citations in chronological order.

The figurative framework of torturing data to yield misleading evidence has appeared in print for decades, but early instances did not extend the metaphor to include confession. For example, a 1933 article in “The Elementary School Journal” contained the following passage:[ref] 1933 March, The Elementary School Journal, Volume 33, Number 7, Educational News and Editorial Comment, Start Page 481, Quote Page 488, The University of Chicago Press, Chicago, Illinois. (JSTOR) link [/ref]

The evidence submitted by the committee from its own questionnaire warrants no such conclusion. To torture the data given in Table I into evidence supporting a twelve-hour minimum of professional training is indeed a statistical feat, but one which the committee accomplishes to its own satisfaction.

In April 1971 Irving John Good attributed the saying under examination to Ronald Coase as mentioned previously.

In October 1972 the “Sunday Gazette-Mail” of Charleston, West Virginia announced a forthcoming lecture by Charles D. Hendrix with a title based on the saying. QI conjectures that Hendrix was repeating a clever remark he had heard circulating amongst statisticians:[ref] 1972 October 15, Sunday Gazette-Mail, Data, Quote Page 2A, Column 4, Charleston, West Virginia. (Newspapers_com) [/ref]

“If You Torture the Data Long enough It Will Confess” will be the topic of Charles D. Hendrix’s talk before the American Society for Quality Control at 6:30 p.m. Wednesday at the Kim Tiki Restaurant.

In June 1976 the UPI News Service reported a remark by Robert W. Flower of the Johns Hopkins Applied Physics Laboratory. Flower employed the metaphorical framework, but he did not formulate his statement as an adage:[ref] 1976 June 13, Cumberland Sunday Times, New Eye Insight Gained by UPI News Service, Quote Page 56, Column 3 and 4, Cumberland, Maryland. (NewspaperArchive) [/ref]

“We could torture the data and make it confess so that it appears that we have something of tremendous clinical significance. Instead we have to collect a large amount of data over a fairly long period of time. At that point we can say with certainty what the clinical significance of the work will be,” he said.

In June 1977 Professor of Management Dennis E. Logue attributed the saying to Professor of Finance Bulent Gultekin:[ref] 1977 June, The Journal of Finance, Volume 32, Number 3, Book Review by Dennis E. Logue (Amos Tuck School of Business Administration, Dartmouth College), Note: The book reviewed by Logue was titled “Foundations of Finance”, Start Page 961, Quote Page 964, Wiley, New York. (JSTOR) link [/ref]

As my friend and colleague, Bulent Gultekin, has often remarked, “if you torture the data long enough, they confess—even to crimes that were never committed.” The history of empirical economic research is rife with instances when illegitimate “theories” were proven, and this is attested to by myriad tests of security selection models based on absolutely absurd correlations, such as the negative relationship between the length of women’s skirts and stock prices.

In August 1977 Professor of Economics T. Dudley Wallace wrote that he had a 1975 letter in which Ronald Coase took credit for coining the variant statement containing the word “nature”:[ref] 1977 August, American Journal of Agricultural Economics, Volume 59, Number 3, Pretest Estimation in Regression: A Survey by T. Dudley Wallace (Professor of Economics at Duke University), Footnote 1, Start Page 431, Quote Page 431, Oxford University Press, Oxford, England. (JSTOR) link [/ref]

Good attributed a remark to R. H. Coase, “If you torture the data long enough, it will confess” (p. 14). Coase, in a letter to the author dated 17 October 1975, wrote, “The correct quotation is: If you torture the data long enough nature will confess.”

In 1979 Thomas Mayer, who was the President of the Western Economic Association, delivered an address which included the following:[ref] 1980 April, Economic Inquiry, Volume 18, Issue, Economics as a Hard Science: Realistic Goal or Wishful Thinking? by Thomas Mayer (University of California, Davis), (Presidential address delivered at the 54th Annual Conference of the Western Economic Association, Las Vegas, 1979), Start Page 165, Quote Page 175, A Journal of the Western Economic Association International, John Wiley & Sons. (ProQuest) [/ref]

. . . journals should publish papers that find statistically insignificant results. This would not only remove the great pressure to stomp on the data until they give in and yield a t value of 2 or more (as a saying has it: “if you just torture the data long enough, they will confess”), but it would also prevent others from wasting time replicating an unsuccessful project.

In November 1981 Ronald Coase delivered a lecture at the American Enterprise Institute in Washington D.C. which was published in 1982. He stated that he employed the “nature” variant of the saying during a lecture he delivered in the early 1960s:[ref] 1982, How Should Economists Choose? by R. H. Coase, The Third G. Warren Nutter Lecture in Political Economy delivered at the American Enterprise Institute for Public Policy Research, in Washington, D.C. on November 18, 1981, Quote Page 16, Published by American Enterprise Institute, Washington, D.C. (Verified with scans; accessed via aei.org on January 12, 2021) link [/ref]

I remarked earlier on the tendency of economists to get the result their theory tells them to expect. In a talk I gave at the University of Virginia in the early 1960s, at which Warren Nutter was, I think, present, I said that if you torture the data enough, nature will always confess, a saying which, in a somewhat altered form, has taken its place in the statistical literature.

In March 1983 Professor of Economics Edward E. Leamer credited Coase within a piece published in “The American Economic Review”:[ref] 1983 March, The American Economic Review, Volume 73, Number 1, Let’s Take the Con Out of Econometrics by Edward E. Leamer (Professor of Economics at University of California, Los Angeles), Start Page 31, Quote Page 37, Published by American Economic Association. (JSTOR) link [/ref]

The econometrician’s shabby art is humorously and disparagingly labelled “data mining,” “fishing,” “grubbing,” “number crunching.” A joke evokes the Inquisition: “If you torture the data long enough, Nature will confess” (Coase).

In June 1983 within the pages of the journal “Philosophy of Science” I. J. Good continued to credit Coase:[ref] 1983 June, Philosophy of Science, Volume 50, Number 2, The Philosophy of Exploratory Data Analysis by I. J. Good (Virginia Polytechnic Institute and State University), Start Page 283, Quote Page 289, The University of Chicago Press, Chicago, Illinois. (JSTOR) link [/ref]

As the economist Ronald Coase said “If you torture the data long enough it will confess”. (Perhaps he should have said that they will confess.)

In 1987 Professor of Statistics Stephen M. Stigler employed an extended instance of the metaphor:[ref] 1987, Neutral Models in Biology, Edited by Matthew H. Nitecki and Antoni Hoffman, Part 3: Paleontological Models, Chapter 8: Testing Hypotheses or Fitting Models? Another Look at Mass Extinctions by Stephen M. Stigler, Start Page 147, Quote Page 148, Oxford University Press, Oxford, England. (Verified with Amazon Look Inside) [/ref]

Beware of the problem of testing too many hypotheses; the more you torture the data, the more likely they are to confess, but confessions obtained under duress may not be admissible in the court of scientific opinion.
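Stigler's warning about testing too many hypotheses can be illustrated with a small calculation. The sketch below is not drawn from any of the cited sources. It assumes that every tested hypothesis is actually false (all nulls are true) and that the tests are independent, and the function names are hypothetical. Under those assumptions it computes the chance that at least one test "confesses" by accident, and it shows how a Bonferroni-style stricter threshold keeps that chance in check.

```python
def family_wise_error_rate(alpha: float, num_tests: int) -> float:
    """Probability of at least one false positive among num_tests tests.

    Assumes (for illustration only) that every null hypothesis is true
    and that the tests are statistically independent.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_alpha(alpha: float, num_tests: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / num_tests


if __name__ == "__main__":
    alpha = 0.05
    for num_tests in (1, 5, 20, 100):
        uncorrected = family_wise_error_rate(alpha, num_tests)
        corrected = family_wise_error_rate(
            bonferroni_alpha(alpha, num_tests), num_tests
        )
        print(
            f"{num_tests:>3} tests: "
            f"uncorrected={uncorrected:.3f}, Bonferroni={corrected:.3f}"
        )
```

With a conventional threshold of 0.05, twenty independent tests on pure noise would produce at least one spurious "confession" roughly 64 percent of the time under these assumptions, while the corrected threshold holds the overall rate just below 5 percent.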

In 1991 “The Macmillan Book of Social Science Quotations” included an entry crediting Stigler with the saying based on the citation above.[ref] 1991 Copyright, The Macmillan Book of Social Science Quotations: Who Said What, When, and Where, Editors David L. Sills and Robert K. Merton, Entry: Stephen M. Stigler (U.S. statistician), Quote Page 223, Macmillan Publishing Company, New York. (Verified with scans) [/ref]

In conclusion, QI believes that Ronald Coase coined the saying with the word “nature” and used it during the 1960s. Irving John Good heard the remark directly or indirectly from Coase. However, QI hypothesizes that Good’s memory was inexact, and he credited Coase with a version omitting the word “nature”. Economists, scientists, mathematicians, and others subsequently repeated instances of the witticism.

Image Notes: Concentric circles of binary data from geralt at Pixabay. Image has been cropped and resized.

(Thanks to U.S. cartoonist Scott McCloud who tweeted an inquiry on this topic, and thanks to m_spiker who notified QI of the tweet. This led QI to formulate this question and perform this exploration. Further thanks to researcher Barry Popik who examined this topic and found evidence beginning in 1977. In addition, thanks to the volunteer editors of Wikiquote for listing the 1981 Coase citation on the webpage dedicated to “Information” quotations.)
