Paul Romer has decided that it is time to air some grievances. In a widely discussed recent article in the AER Papers and Proceedings volume, he calls out some prominent macroeconomists for the alleged crime of “Mathiness.” Several blog commenters have offered their interpretations of the main thrust of Romer’s thesis. I admit that after reading the article I am not entirely sure what Romer means by “mathiness”. Noah Smith interprets mathiness as the result of “using math in a sloppy way to support […] preferred theories.” In his follow-up article in Bloomberg, Noah says much of the blame comes from abusing “mathematical theory by failing to draw a tight link between mathematical elements and the real world.”
Hopefully Romer is talking about something more than just mathematical errors in papers by prominent researchers. If this is his main gripe, let me break the bad news: every paper has errors. Sometimes the errors are innocuous; sometimes they are fatal. Errors are not confined to mathematical papers either. There are plenty of mistakes in purely empirical papers – think of the famous mistake in the Reinhart–Rogoff debt paper, or the contested results in Levitt’s abortion paper, and so on. Mistakes and mess-ups are part of research. Criticizing someone for making lots of mistakes is almost like criticizing them for doing lots of research. If you aren’t making mistakes, you aren’t working on stuff that is sufficiently hard. This isn’t to say that I encourage careless or intellectually dishonest work. All I am saying is that mistakes are an inevitable byproduct of research – particularly research at the cutting edge. Moreover, mistakes will often live on. Mistakes are most likely to be exposed and corrected if the paper leads to follow-up work. Unfortunately, most research doesn’t lead to such subsequent work, and so any mistakes in the original contributions simply linger. This isn’t a big problem, of course, since no one is building on the work.

Focusing on mistakes is also not the best use of our time. We should not be discussing or critiquing papers that we don’t value very much; we should be focused on papers that we do value. This is how we judge academics in general. I don’t care whether people occasionally (or frequently) write bad papers. I care whether they occasionally write good ones. We don’t care about the average paper – we care about the best papers (the order statistics).
If Romer is indeed talking about something other than mistakes, then I suspect that his point is closer to what Noah describes in his recent columns: mathiness is a kind of mathematical theory that lacks a sufficiently tight link to reality. Certainly, having the “tight link” that Noah talks about is advantageous. Such a connection allows researchers to make direct comparisons between theory and data in a way that is much more difficult if the mapping between the model and the data is not explicit. On the other hand, valuable insights can certainly be obtained even if the theorist appeals to mathematical sloppiness / hand-waving / mathiness, whatever you want to call it. In fact, I worry that the pressure on many researchers is often in the opposite direction. Instead of being given the freedom to leave some of their theories somewhat loose / reduced form / partial equilibrium, researchers are implored to solve things out in general equilibrium, to microfound everything, to be as precise and explicit as possible – often at the expense of realism. I would welcome a bit of tolerance for some hand-waviness in economics.

Outside of economics, the famous story of the proof of Fermat’s Last Theorem includes several important instances of what can at least be described as incompleteness, if not outright hand-waving. The initial Taniyama–Shimura conjecture was a guess. The initial statement of Gerhard Frey’s epsilon conjecture was described by Ken Ribet (who ultimately proved the conjecture) as a “plausibility argument.” Even though they were incomplete, these conjectures were leading researchers in important directions. Indeed, these guesses and sketches ultimately led to the modern proof of the theorem by Andrew Wiles. Wiles himself famously described the experience as like stumbling around in a dark room: if the room is very dark and very cluttered, then you will certainly knock things over and stub your toes while searching for the light switch. [Fn1]
In economics, some of my favorite papers have a bit of mathiness that serves them brilliantly. A good example occurs in Mankiw, Romer and Weil’s famous paper “A Contribution to the Empirics of Economic Growth” (QJE 1992). As the title suggests, the paper is an analysis of the sources of differences in economic growth experiences. The paper includes a simple theory section and a simple data section. The theory section studies simple variations of the standard Solow growth model augmented to include human capital (skills, know-how). The model is essentially a one-good economy with exogenous savings rates. In some corners of the profession, using a model with an exogenous savings rate might be viewed as a stoning offense, but it is perfectly fine in this context. The paper is about human capital, not about the determinants of the savings rate. But that’s not the end of the mathiness. Their analysis proceeds by constructing a linear approximation to the growth paths of the model and then applying standard linear regression methods to aggregate data. Naturally, such regressions are typically not identified, but Mankiw, Romer and Weil don’t let that interfere with the paper. They simply assume that the error terms in their regressions are uncorrelated with the savings rates and proceed with OLS. There is a ton of mathiness in this work. And the consequence? Mankiw, Romer and Weil’s 1992 paper is one of the most cited and influential papers in the field of economic growth. Think about how this paper would have changed if some idiot referee had decided that it needed optimizing savings decisions (after all, we can’t allow hand-waving about exogenous savings rates), multiple goods and a separate human capital production function (no hand-waving about an aggregate production function or one-good economies), micro data (no hand-waving about aggregate data), and instruments for the savings rates, population growth rates and the human capital savings rate (no hand-waving about identification). The first three modifications would simply be a form of hazing combined with obfuscation: they would force the authors to jump through hoops for no good reason and would leave the analysis less clear. The last one – insistence on a valid instrument – would probably be the end of the paper, since such instruments probably don’t exist. Thank God this paper didn’t run into a referee like this.
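As an aside, for readers who haven’t seen the paper, the flavor of the MRW exercise can be conveyed with a single equation. The sketch below is my own reconstruction from memory, so treat the notation and the exact form as mine rather than the paper’s: with a production function of the form $Y = K^{\alpha} H^{\beta} (AL)^{1-\alpha-\beta}$, physical and human capital savings rates $s_k$ and $s_h$, population growth $n$, technology growth $g$, and depreciation $\delta$, (log) steady-state income per person is roughly linear in the logs of the savings rates and of $n+g+\delta$:

$$
\ln\!\left(\frac{Y}{L}\right) \;\approx\; \text{const} \;+\; \frac{\alpha}{1-\alpha-\beta}\,\ln s_k \;+\; \frac{\beta}{1-\alpha-\beta}\,\ln s_h \;-\; \frac{\alpha+\beta}{1-\alpha-\beta}\,\ln(n+g+\delta) \;+\; \varepsilon .
$$

The hand-waving is concentrated in the last term: assuming that $\varepsilon$ is uncorrelated with the savings rates and with $n$ is exactly what lets MRW run OLS on cross-country aggregate data and read the estimated coefficients back as statements about $\alpha$ and $\beta$.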
My own opinion is that mathematical sloppiness can be perfectly fine if it deals with a feature that is not a focus of the paper. Hand-waving of this sort likely comes at very little cost and may even have benefits by eliminating a lengthy discussion of issues only tangentially related to the paper. On the other hand, if the hand-waving occurs when analyzing or discussing central features of the paper, then I am much more inclined to ask the researcher to do the analysis right. This type of hand-waving happens sometimes, but it is not clear that it happens more often in macroeconomics, let alone in freshwater macro in particular – ironically, the freshwater guys that Romer criticizes are much more likely to demand a tight specification and precise analysis of the model (whether it is called for or not).
[Fn1] If you are interested in the modern proof of Fermat’s Last Theorem, I highly recommend this documentary.