(Paul) Romer’s Rant

Paul Romer has decided that it is time to air some grievances. In a widely discussed recent article in the AER Papers and Proceedings volume, he calls out some prominent macroeconomists for the alleged crime of “Mathiness.”  Several blog commenters have offered their interpretations of the main thrust of Romer’s thesis. I admit that after reading the article I am not entirely sure what Romer means by “mathiness”. Noah Smith interprets mathiness as the result of “using math in a sloppy way to support […] preferred theories.”  In his follow-up article in Bloomberg, Noah says much of the blame comes from abusing “mathematical theory by failing to draw a tight link between mathematical elements and the real world.”

Hopefully Romer is talking about something more than just mathematical errors in papers by prominent researchers. If this is his main gripe, let me break the bad news to you: every paper has errors. Sometimes the errors are innocuous; sometimes they are fatal. Errors are not confined to mathematical papers either. There are plenty of mistakes in purely empirical papers: the famous mistake in the Reinhart–Rogoff debt paper, the results from Levitt’s abortion paper, and so on. Mistakes and mess-ups are part of research. Criticizing someone for making lots of mistakes is almost like criticizing them for doing lots of research. If you aren’t making mistakes you aren’t working on stuff that is sufficiently hard. This isn’t to say that I encourage careless or intellectually dishonest work. All I am saying is that mistakes are an inevitable byproduct of research – particularly research on cutting-edge stuff. Moreover, mistakes will often live on. Mistakes are most likely to be exposed and corrected if the paper leads to follow-up work. Unfortunately, most research doesn’t lead to such subsequent work and thus any mistakes in the original contributions simply linger. This isn’t a big problem of course since no one is building on the work.

Focusing on mistakes is also not what we should be spending our time on. We should not be discussing or critiquing papers that we don’t value very much. We should be focused on papers that we do value. This is how we judge academics in general. I don’t care about whether people occasionally (or frequently) write bad papers. I care whether they occasionally write good ones. We don’t care about the average paper – we care about the best papers (the order statistics).

If Romer is indeed talking about something other than mistakes then I suspect that his point is closer to what Noah describes in his recent columns: mathiness is a kind of mathematical theory that lacks a sufficiently tight link to reality. Certainly, having the “tight link” that Noah talks about is advantageous. Such a connection allows researchers to make direct comparisons between theory and data, something that is much more difficult if the mapping between the model and the data is not explicit. On the other hand, valuable insights can certainly be obtained even if the theorist appeals to mathematical sloppiness / hand-waving / mathiness, whatever you want to call it.

In fact I worry that the pressure on many researchers is often in the opposite direction. Instead of being given the freedom to leave some of their theories somewhat loose / reduced form / partial equilibrium, researchers are implored to solve things out in general equilibrium, to micro found everything, to be as precise and explicit as possible – often at the expense of realism. I would welcome a bit of tolerance for some hand-waviness in economics.

Outside of economics, the famous story of the proof of Fermat’s Last Theorem includes several important instances of what can at least be described as incompleteness if not outright hand-waving. The initial Taniyama–Shimura conjecture was a guess. The initial statement of Gerhard Frey’s epsilon conjecture was described by Ken Ribet (who ultimately proved the conjecture) as a “plausibility argument”. Even though they were incomplete, these conjectures were leading the researchers in important directions. Indeed, these guesses and sketches ultimately led to the modern proof of the theorem by Andrew Wiles. Wiles himself famously described the experience of doing this work as stumbling around in a dark room. If the room is very dark and very cluttered then you will certainly knock things over and stub your toes searching for a light switch. [Fn1]

In economics, some of my favorite papers have a bit of mathiness that serves the papers brilliantly. A good example occurs in Mankiw, Romer and Weil’s famous paper “A Contribution to the Empirics of Economic Growth” (QJE 1992). As the title suggests, the paper is an analysis of the sources of differences in economic growth experiences. The paper includes a simple theory section and a simple data section. The theory essentially studies simple variations of the standard Solow growth model augmented to include human capital (skills, know-how). The model is essentially a one-good economy with exogenous savings rates. In some corners of the profession, using a model with an exogenous savings rate might be viewed as a stoning offense but it is perfectly fine in this context. The paper is about human capital, not about the determinants of the savings rate. But that’s not the end of the mathiness. Their analysis proceeds by constructing a linear approximation to the growth paths of the model and then using standard linear regression methods together with aggregate data. Naturally such regressions are typically not identified but Mankiw, Romer and Weil don’t let that interfere with the paper. They simply assume that the error terms in their regressions are uncorrelated with the savings rates and proceed with OLS. There is a ton of mathiness in this work. And the consequence? Mankiw, Romer and Weil’s 1992 paper is one of the most cited and influential papers in the field of economic growth.

Think about how this paper would be changed if some idiot referee decided that it needed optimizing savings decisions (after all we can’t allow hand-waving about exogenous savings rates), multiple goods and a separate human capital production function (no hand-waving about an aggregate production function or one-good economies), micro data (no hand-waving about aggregate data), and instruments for savings rates, population growth rates and the human capital savings rate (no hand-waving about identification). The first three modifications suggested by the referee would simply be a form of hazing combined with obfuscation (the modifications make the authors jump through hoops for no good reason and the end product has an analysis that is less clear). The last one – insistence on a valid instrument – would probably be the end of the paper since such instruments probably don’t exist. Thank God this paper didn’t run into a referee like this.
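
To make the identification point concrete, here is a sketch of the kind of steady-state regression the augmented Solow model delivers. The notation is the standard textbook one, not quoted from the paper, and should be checked against the original: $s_k$ and $s_h$ are the physical and human capital savings rates, $n$ is population growth, $g$ is technology growth, $\delta$ is depreciation, and $\alpha$ and $\beta$ are the Cobb–Douglas shares of physical and human capital.

```latex
\ln\!\left(\frac{Y}{L}\right)
  = a
  + \frac{\alpha}{1-\alpha-\beta}\,\ln s_k
  + \frac{\beta}{1-\alpha-\beta}\,\ln s_h
  - \frac{\alpha+\beta}{1-\alpha-\beta}\,\ln(n+g+\delta)
  + \varepsilon
```

Running OLS on this equation gives consistent estimates only if $\varepsilon$ (country-specific differences in technology, endowments and so on) is uncorrelated with $s_k$, $s_h$ and $n$. That is exactly the assumption that is simply asserted rather than defended with instruments.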

My own opinion is that mathematical sloppiness can be perfectly fine if it deals with a feature that is not a focus of the paper. Hand-waving of this sort likely comes at very little cost and may have benefits by eliminating a lengthy discussion of issues only tangentially related to the paper. On the other hand, if the hand-waving occurs when analyzing or discussing central features of the paper then I am much more inclined to ask the researcher to do the analysis right. This type of hand-waving happens sometimes but it is not clear that it happens more often in macroeconomics, or in freshwater macro in particular – ironically, the freshwater guys that Romer criticizes are much more likely to demand a tight specification and precise analysis of the model (whether it is called for or not).

[Fn1] If you are interested in the modern proof of Fermat’s Last Theorem I highly recommend this documentary.

13 thoughts on “(Paul) Romer’s Rant”

  1. Chris,

    Great post! This is somewhat tangential but I wanted to disagree with you on Mankiw, Romer and Weil. To my mind, admittedly as a late-stage undergrad who isn’t very deep into the literature, Mankiw, Romer and Weil is empirically very problematic. It seems to me that Caselli, Esquivel and Lefort convincingly destroyed all of their important results by pointing out the very obvious biases and using Arellano-Bond instead. Though that paper is not without its problems either.

    I’m a bit mystified by why it’s been such an influential paper. It also seems to me like ‘mathiness’ at its very worst – i.e. a paper that looks good because it seems to validate Solow + human capital but really just does so through not having an identification strategy.

    Curious to hear your thoughts.

    • Indeed, the post seems to suggest that research with many citations is inherently good, which is question-begging at best.

      • MSL, I don’t want to claim that number of citations = quality of the paper *but* it certainly tells you something. At a minimum, lots of citations indicate that people are reacting to the paper in one way or another, which is probably a good sign. It’s of course possible that lots of citations could be generated by people pointing out flaws with a paper (“negative citations”). Then again, I would think that this would be a sign of a positive contribution to the field. Levitt’s abortion paper is probably a good example. The paper is one of the most influential papers of the past 20-30 years and it is one of the most cited. Many of the citing papers argue that the original paper is fundamentally flawed and I think ultimately the profession has concluded that the proposed effect highlighted in the paper doesn’t actually exist. Nevertheless, generating a ton of work, asking a deeply provocative question and suggesting a meaningful way of answering the question make the paper valuable even if the result doesn’t ultimately hold up.

    • Hi Joe,

      You are right that there are empirical problems with the MRW paper. Their regressions really aren’t identified. They assume that the endogeneity problems that plague so much empirical work aren’t operative for the regressions they run, but this isn’t satisfying. (Note: the authors are quite open about this problem in the paper so they aren’t hiding from it.) Caselli et al. is a nice paper but they can’t really resolve the problems with the growth regressions. They have to argue that they have found valid instrumental variables for their empirical procedure to work. I suspect that these instruments simply aren’t very persuasive.

      (Incidentally, I don’t usually describe research as “destroying” previous results — I guess I know what you are trying to say but it gives people the impression that research is a competition which really isn’t entirely true.)

      The paper was quite influential because it demonstrated in a very transparent way that the basic Solow growth model could be entirely consistent with the economic growth facts but at the same time be able to predict fairly large persistent differences in per capita income. This is backed up by the data in a very simple sense — there is sufficient observed variation in rates of population growth and savings rates to account for a lot of the measured differences in income (of course they are not econometrically identified so we can’t confidently say that these patterns are entirely causal).

      Chris

  2. Chris,

    Point noted on ‘destroying’, sloppy habit and will refrain in the future.

    My understanding of Caselli et al. is that the combination of differencing out the time-invariant endogeneity and using lags as instruments *only* to account for the time-varying endogeneity is quite sensible. But this is a side issue.

    Anyway, thanks for explaining that for me! MRW’s prominence has always confused me a little and this clears it up.

    • Oops, sorry. This is what comes from being both a bad speller and having too much trust in spell-check software. A while back a commentator busted me for writing tow-the-line (which I was certain was correct). I assume that by now you have realized that spelling and proper grammar are not strong suits. Live and learn …

  3. Interesting post and exchange. Chris, Joe, on the empirical front, you may be interested in my piece with Michael Clemens in AEJ: Macro a few years ago: http://tinyurl.com/po9he2r. We discuss a number of issues concerning identification in the empirical growth literature.

  4. Pingback: More on Mathiness | The Growth Economics Blog

  5. Can I simply say what a comfort to discover somebody that really knows what they’re discussing online. You actually understand how to bring a problem to light and make it important. More people have to look at this and understand this side of the story. I can’t believe you aren’t more popular since you certainly have the gift.

  6. Pingback: The Credibility Problem in Economic Research | Introduction to Economics – C

  7. Chris: Sorry for a belated comment. I have been circling back to reactions to Paul Romer’s mathiness paper in prep for my own belated post on the subject. Your reference to MRW seems right on the money to me, but you might look also at Mankiw’s Brookings conference follow-up, “The Growth of Nations” – not just for Mankiw’s lucid summary of MRW’s modeling decisions (very consistent with what you say), but also for Romer’s somewhat jejune reactions, which include complaints that MRW aren’t mathy enough! And Romer’s complaints about the neoclassical growth model and other constructs hanging around the textbooks because they were all that “median” students could handle – well, they have always struck me as revealing.

    Your point about theories not “destroying” each other (however Romer might wish) is also well taken. It was at almost the same time as Romer’s comments on “Growth of Nations” that Chad Jones published the time series research that, in a Popperian sense, falsified all the R&D growth theories, Romer’s included. Romer might recall that Jones, instead of issuing an obituary for endogenous growth, went on to rehabilitate the theories he had just “refuted”, becoming one of their best advocates. Having been the beneficiary of such a refinement process, Romer seems to think that perfect competition theories won’t so benefit and should just disappear. His expectation of a formal “consensus” in his favor, so reminiscent of his response to Mankiw, seems to me to overshadow the claims around mathiness.
