Homo Economicus vs. Homo Economicus “Straw-icus”

I haven’t written on my blog for a really long time, but I am approaching a point where I might actually have some time to post again somewhat regularly. This particular post is a response to a blog post by Noah Smith, who took issue with Milton Friedman’s famous pool player analogy. Noah concludes that “the pool player analogy is silly” and gives the following reasons for arriving at this conclusion:

  1. If actual pool players never missed their shots, there would be no use for the physics equations as a prediction and analysis tool.
  2. People make mistakes so they don’t always optimize.
  3. People who make bad decisions don’t tend to go away over time.
  4. Unlike in pool, we rarely know what the objective is.

I must confess that the pool player analogy is one of my favorite analogies in economics and I use it often when I am talking to people about the field. The pool player analogy is a common response to typical criticisms of standard microeconomic analysis. Microeconomic conclusions like “consumers should make choices that equalize the marginal utility per dollar across goods” or “firms should make input choices so that the marginal product of capital equals the real (product) rental price of capital” often seem very abstract and technical, and they invite natural objections like “real-life people don’t behave like that” or “in the real world, firms don’t make calculations like this.” Such reactions are natural, and this is exactly where the pool player analogy fits in. In pool, making the best shots would seem to require a host of extremely involved physics calculations that most real-life players couldn’t perform. Making an optimal shot requires accounting for the friction of the felt on the table, angular momentum, transfers of energy, torque, and so on. Even if a pool player knew about all of this, one would guess that the player would need a long time between shots to work through the complex calculations. Nevertheless, the actual shot taken will look a lot like the one implied by such a calculation. That is, the pool player’s actual behavior will closely resemble the behavior implied by the optimal physics calculation.

The central guiding principle of economic analysis is that behavior is guided by self-interest. Consumers and firms make choices that (in their assessment) make them as well-off as possible. Sometimes this principle is summarized by the term homo economicus or “economic man,” which essentially views mankind as being comprised of many self-interested autonomous beings. The pool player is one manifestation of a self-interested individual making choices under a constraint, but so are ordinary people. If you go to lunch at a Chinese restaurant, you will in all likelihood be confronted with a staggering number of choices, options, prices, substitutions, and combinations. One might be tempted to conclude that it is impossible to make an optimal choice from such a menu because there are simply too many possibilities to consider. However, when I actually go to lunch at the local Chinese restaurant, I typically see people ordering relatively quickly, sometimes customizing their choices, and I suspect that most people do not believe they typically make mistakes ordering lunch. Homo economicus does a pretty good job ordering lunch, commuting to work, etc. This is not to say that economic man doesn’t make mistakes or overlook things. That is inevitable and surely happens often in the real world. The question you have to ask yourself is this: do you think people typically make something close to the best choices, or do you think they typically make severe mistakes in their economic decisions? If you believe the former, then you are thinking like a mainstream economist.

Noah seems to think that because people make mistakes, or because we can’t know the true (mathematical) objective functions, this appeal to optimal behavior is misguided. To me it seems that Noah is not criticizing “economic man” so much as he is criticizing an “economic straw man.” In the article he arrives at this startling conclusion, writing:

[If] really good pool players made 100% of their shots, there wouldn’t be pool tournaments. It would be no fun, because whoever went first would always win. But in fact, there are pool tournaments. So expert pool players do, in fact, miss.

(… shocking, I know. How could we have not seen this?).

Economists do not assume that people don’t make mistakes (though in fairness it is not completely obvious how to analytically model mistakes). I can think of exactly zero economists who think that all behavior is optimal from some omniscient or omnipotent point of view. I certainly don’t believe this: I regularly play online chess, and I always overlook moves that are to my advantage or moves my opponent could play that would be really bad for me. This doesn’t in any way suggest that my behavior isn’t guided by my own self-interest or that my move isn’t what I think is my best option given the current position. Moreover, I would think that, in chess, assuming a player chooses one of the better moves available would often provide a good guide to actual game play – this is almost certainly true for professional and semi-professional players.

Imagine, if you will, actually trying to construct a model of pool playing. Specifically, let’s consider 9-ball. In 9-ball the balls are sunk in order (unless the 9 goes in on the break). This greatly reduces the strategic nature of the game: the players generally know which shot they want to hit, and the task is simply to execute it. If I were to approach such a problem (which might actually be interesting from a behavioral economics point of view) I might adopt the following approach. Suppose the best shot could be described by a vector S. S would include spin, speed/power, angle, etc. A perfect player would simply take the shot given by S. We could think of a real-world player as taking a shot that includes an error e: the spin will be a little off, the angle will be off a bit too, and so forth. So any one actual shot could be given by S + e. You might think that the best players typically make small errors while inexperienced players make bigger errors. I would think this would be a pretty good description of actual pool, but note that I would absolutely begin with the idealized shot S (the shot made by homo economicus).
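A toy simulation makes the S + e formulation concrete. To be clear, everything in it is invented for illustration: the components of the shot vector, their units, and the error scales assigned to each skill level are assumptions, not estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Idealized shot vector S: (spin, speed, angle). Values and units are
# hypothetical -- the point is only the structure of the model.
S = np.array([0.3, 5.0, 32.0])

def take_shot(S, skill_sd, rng):
    """A real player's shot is the ideal shot S plus an execution error e."""
    e = rng.normal(0.0, skill_sd, size=S.shape)
    return S + e

# Better players are modeled as having smaller error standard deviations.
expert = np.array([take_shot(S, 0.05, rng) for _ in range(10_000)])
novice = np.array([take_shot(S, 0.50, rng) for _ in range(10_000)])

# Both players' shots are centered on the homo economicus shot S...
assert np.allclose(expert.mean(axis=0), S, atol=0.01)
# ...but the novice's shots are far more dispersed around it.
print(expert.std(axis=0), novice.std(axis=0))
```

The idealized shot S anchors the whole exercise: skill differences live entirely in the dispersion of e, not in the target.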

Noah also makes a big deal about the fact that we don’t know the objective. The fact that we typically don’t know the objective is not necessarily a problem. If an economist looks at a game being played where she isn’t familiar with the rules then she will still tend to treat the moves she observes as though they are guided by some latent objective. (And you might guess that after looking at such behavior for a while, the economist might be able to deduce the objective even if she doesn’t have advanced knowledge of it.)

In short, I have always felt – and I continue to feel – that Friedman’s pool player is an excellent way to convey how economists approach their subject. Only if we were to insist that people really were computerized robots or Vulcans always making perfectly optimal, logical choices with regard to some easily described mathematical objective would this analogy present a problem – in fact it’s only a problem for the economic straw man.  So, until Noah comes up with some more serious objections I’m going to keep this analogy on my go-to list of explanations for basic economics.

In the meantime, let me leave you with this opening quote from The Color of Money.

A player can make eight trick-shots in a row, blow the 9 and lose.

On the other hand, a player can get the 9 in on the break, if the balls are spread right, and win.

Which is to say that luck plays a part in 9-ball.


But for some players, luck itself is an art.

Is Big Data the answer to all our problems?

Noah Smith recently commented on a Malcolm Gladwell talk in which he (Gladwell) expressed doubts about the promise of “Big Data.” When people use the term “Big Data” they are often referring to datasets built from digital data streams like Google searches, the Twitter feed, and so forth, but it can also refer simply to large datasets that were previously too cumbersome for researchers to access easily. In any case, Gladwell says that this newfound data availability is not our salvation; he even claims that this data might be a curse. Noah devotes most of his comment to countering Gladwell’s claims. I’m actually not a huge Malcolm Gladwell fan and, like Noah, I basically disagree with the idea that having more data could be a problem for us. More data must be better. (OK, there is the Snowden issue and government invasions of privacy more generally, but let’s leave those problems aside – I don’t think that kind of intrusive surveillance is what either Gladwell or Noah has in mind.)

There are some aspects of Big Data that I’ve been thinking about for a little while that seem at least somewhat relevant to Gladwell’s argument and Noah’s post. Many researchers – and many economists in particular – see Big Data as a huge benefit to their field. Indeed, some view the arrival of these new datasets as a transformative event in social science. Speaking for myself only, I have some doubts that this new data will be as much of a benefit as many are predicting. In economics, many researchers are being drawn to these datasets without having a direct purpose or plan in mind. To me, this is most concerning with graduate students, who are under lots of pressure and sometimes hold out hope that a huge dataset will be the Holy Grail for an underdeveloped research portfolio. After finally obtaining their data, these students are typically let down: the data doesn’t really address the questions they were interested in, or it requires a tremendous amount of work to be cleaned and arranged into a useable form, or the obvious killer question or killer instrument they hoped the dataset would present fails to materialize.

Another thing that pops up all too frequently is the idea that a bigger dataset is automatically superior simply because it has many observations. This is often clearly not the case, and it’s painful to see this realization fall upon a researcher (sometimes during their presentation). To take a really obvious example, suppose you are interested in whether extensions of unemployment benefits reduce labor supply by causing people to search less intensively while they are still receiving payments. A dataset with state-level unemployment rate data over the period 2005-2014 might actually be able to speak to this question. In contrast, a dataset with 100 million daily individual observations for a single year isn’t going to help you at all if there is no variation in unemployment benefit policy in that year. Sure, it’s impressive that you can get such a dataset, but it isn’t useful for your research question. Sometimes in seminars, the presenter will intentionally advertise the scope of the dataset in a futile effort to impress the audience. It never works. It’s similar to a related unsuccessful tactic of trying to impress the audience by telling them how long it takes your computer to solve a complicated dynamic programming problem.

Stuff like this comes up all the time. I was in a seminar where a researcher was using individual household-level consumption data to test the permanent income hypothesis (PIH). The dataset was quite nice, but the consumption measure combined both durable and nondurable goods, and unfortunately the PIH applies only to nondurable consumption spending. When the researcher was asked why they used the individual data rather than aggregate data (which does break out nondurable consumption), the response was simply that they felt individual data was better than aggregate data (?).

Firm-level data is another pet peeve of mine. I can’t tell you the number of times I’ve heard people say that the reason that we should use firm level data is because this is just what people do these days. Firm-level data is particularly noteworthy because one of the classic issues in economics deals with the nature of a firm itself. The straight neoclassical perspective is that the notion of a firm is not particularly well defined. Two mechanic shops that operate independently would appear as two firms in a typical dataset but if the owner of one of the shops sells it to the other owner, these firms would suddenly become a single observation. This problem reminds me of a quote by Frank Zappa: “The most important thing in art is the frame. … without this humble appliance, you can’t know where the art stops and the real world begins.” A similar thing occurs with firm level data. We have a bunch of underlying behavior and then there are these arbitrary frames placed around groups of activity. We call these arbitrary groupings “firms.” Arbitrary combinations (or breakdowns) like this surely play a large role in dictating the nature of firm-level data. In the end it’s not clear how many real observations we actually have in these datasets.

In the past I’ve been fortunate enough to work with some students who used hand-collected data.[1] Data like this is almost always fairly small in comparison with real-time data or administrative data. Despite this apparent size disadvantage, hand-collected data has some advantages that are worth emphasizing. First, the researcher will necessarily know much more about the way the data was collected. Second, the data can be collected with the explicit aim of addressing a specifically targeted research question. Third, building the data from the ground up invites the researcher to confront particular observations that might be noteworthy for one reason or another. In fact, I often encourage graduate students to look in depth at individual observations to build their understanding of the data. This will likely not happen with enormous datasets.

Again, this is not to say that more data is in any way a disadvantage. However, like any input into the research process, the choice of data should be given some thought. A similar thing came up perhaps 15 years ago when more and more powerful computers allowed us to expand the set of models we could analyze. This was greeted as a moment of liberation by some economists, but soon the moment of bliss gave way to reality. Adding a couple more state variables wasn’t going to change the field; solving the model a bit more accurately and faster wasn’t going to expand our understanding by leaps and bounds. Better? No doubt. A panacea? Not at all.

The real constraint on economics research has always been, and continues to be, a shortage of ideas and creativity. Successfully pushing the boundaries of our understanding requires creative insights coupled with accurate quantitative modelling and good data and empirical work. The kinds of insights I’m talking about won’t be found just lying around in any dataset – no matter how big it is.

[1] One of my favorite examples of such data work is by Ed Knotek who collected a small dataset on prices of goods that were sold in convenience stores but also sold in large supermarkets. See “Convenient Prices and Price Rigidity: Cross-Sectional Evidence,” Review of Economics and Statistics, 2011.

Warren Buffett: Fighting Income Inequality with the EITC

Warren Buffett’s article in the Wall Street Journal reminds me of some posts I wrote a while back on fighting income inequality. His article contains a lot of wisdom. Some excerpts:

The poor are most definitely not poor because the rich are rich. Nor are the rich undeserving. Most of them have contributed brilliant innovations or managerial expertise to America’s well-being. We all live far better because of Henry Ford, Steve Jobs, Sam Walton and the like.

He writes that an increase in the minimum wage to $15 per hour

would almost certainly reduce employment in a major way, crushing many workers possessing only basic skills. Smaller increases, though obviously welcome, will still leave many hardworking Americans mired in poverty. […]  The better answer is a major and carefully crafted expansion of the Earned Income Tax Credit (EITC).

I agree entirely and so would Milton Friedman.

Unlike the minimum wage, which draws money from a diffuse group of individuals (some of whom are not high earners), the EITC draws funds from the broad U.S. tax base, which in turn draws more heavily from upper-income individuals. Unlike the minimum wage, the EITC can be directed at low-income households rather than low-wage individuals, many of whom are simply teenagers working summer jobs. Unlike the minimum wage, which discourages employment, the EITC encourages employment.

Buffett also proposes some common-sense modifications to the EITC which I would welcome. In addition to reducing fraud, …

There should be widespread publicity that workers can receive free and convenient filing help. An annual payment is now the rule; monthly installments would make more sense, since they would discourage people from taking out loans while waiting for their refunds to come through.

The main problems with such an expansion of the EITC are political. Taking a serious swing at reducing income inequality would require a lot of money. Republicans will likely oppose it because it is “socialist” or some such nonsense. I’m sure that many Democrats would be receptive to the kind of aggressive expansion Buffett alludes to, but the cost in political capital might be too great for a politician to pay.


(Paul) Romer’s Rant

Paul Romer has decided that it is time to air some grievances. In a widely discussed recent article in the AER Papers and Proceedings volume, he calls out some prominent macroeconomists for the alleged crime of “Mathiness.”  Several blog commenters have offered their interpretations of the main thrust of Romer’s thesis. I admit that after reading the article I am not entirely sure what Romer means by “mathiness”. Noah Smith interprets mathiness as the result of “using math in a sloppy way to support […] preferred theories.”  In his follow-up article in Bloomberg, Noah says much of the blame comes from abusing “mathematical theory by failing to draw a tight link between mathematical elements and the real world.”

Hopefully Romer is talking about something more than just mathematical errors in papers by prominent researchers. If this is his main gripe, let me break the bad news: every paper has errors. Sometimes the errors are innocuous; sometimes they are fatal. Errors are not confined to mathematical papers either. There are plenty of mistakes in purely empirical papers: the famous spreadsheet mistake in the Reinhart–Rogoff debt paper, the disputed results in Levitt’s abortion paper, and so on. Mistakes and mess-ups are part of research. Criticizing someone for making lots of mistakes is almost like criticizing them for doing lots of research. If you aren’t making mistakes, you aren’t working on stuff that is sufficiently hard. This isn’t to say that I encourage careless or intellectually dishonest work. All I am saying is that mistakes are an inevitable byproduct of research – particularly research on cutting-edge stuff. Moreover, mistakes will often live on. Mistakes are most likely to be exposed and corrected if the paper leads to follow-up work. Unfortunately, most research doesn’t lead to such subsequent work, and thus any mistakes in the original contributions simply linger. This isn’t a big problem, of course, since no one is building on the work. Focusing on mistakes is also not what we should be spending our time on. We should not be discussing or critiquing papers that we don’t value very much; we should be focused on papers that we do value. This is how we judge academics in general. I don’t care whether people occasionally (or frequently) write bad papers. I care whether they occasionally write good ones. We don’t care about the average paper – we care about the best papers (the order statistics).

If Romer is indeed talking about something other than mistakes, then I suspect his point is closer to what Noah describes in his recent columns: mathiness is a kind of mathematical theory that lacks a sufficiently tight link to reality. Certainly, having the “tight link” that Noah talks about is advantageous. Such a connection allows researchers to make direct comparisons between theory and data in a way that is much more difficult if the mapping between the model and the data is not explicit. On the other hand, valuable insights can certainly be obtained even if the theorist appeals to mathematical sloppiness, hand-waving, mathiness, whatever you want to call it. In fact, I worry that the pressure on many researchers is often in the opposite direction. Instead of being given the freedom to leave some of their theories somewhat loose, reduced-form, or partial equilibrium, researchers are implored to solve things out in general equilibrium, to microfound everything, to be as precise and explicit as possible – often at the expense of realism. I would welcome a bit of tolerance for some hand-waviness in economics. Outside of economics, the famous story of the proof of Fermat’s Last Theorem includes several important instances of what can at least be described as incompleteness if not outright hand-waving. The initial Taniyama–Shimura conjecture was a guess. The initial statement of Gerhard Frey’s epsilon conjecture was described by Ken Ribet (who ultimately proved the conjecture) as a “plausibility argument.” Even though they were incomplete, these conjectures were leading researchers in important directions. Indeed, these guesses and sketches ultimately led to the modern proof of the theorem by Andrew Wiles. Wiles himself famously described his work as stumbling around in a dark room: if the room is very dark and very cluttered, then you will certainly knock things over and stub your toes searching for a light switch. [Fn1]

In economics, some of my favorite papers have a bit of mathiness that serves them brilliantly. A good example occurs in Mankiw, Romer and Weil’s famous paper “A Contribution to the Empirics of Economic Growth” (QJE 1992). As the title suggests, the paper is an analysis of the sources of differences in economic growth experiences. The paper includes a simple theory section and a simple data section. The theory studies simple variations of the standard Solow growth model augmented to include human capital (skills, know-how). The model is essentially a one-good economy with exogenous savings rates. In some corners of the profession, using a model with an exogenous savings rate might be viewed as a stoning offense, but it is perfectly fine in this context. The paper is about human capital, not about the determinants of the saving rate. But that’s not the end of the mathiness. Their analysis proceeds by constructing a linear approximation to the growth paths of the model and then using standard linear regression methods together with aggregate data. Naturally, such regressions are typically not identified, but Mankiw, Romer and Weil don’t let that interfere with the paper. They simply assume that the error terms in their regressions are uncorrelated with the savings rates and proceed with OLS. There is a ton of mathiness in this work. And the consequence? Mankiw, Romer and Weil’s 1992 paper is one of the most cited and influential papers in the field of economic growth.
Think about how this paper would be changed if some idiot referee decided that it needed optimizing savings decisions (after all we can’t allow hand-waving about exogenous savings rates), multiple goods and a separate human capital production function (no hand-waving about an aggregate production function or one-good economies), micro data (no hand-waving about aggregate data), and instruments for savings rates, population growth rates and human capital savings rate (no hand-waving about identification).  The first three modifications suggested by the referee would simply be a form of hazing combined with obfuscation (the modifications make the authors jump through hoops for no good reason and the end product has an analysis that is less clear).  The last one – insistence on a valid instrument – would probably be the end of the paper since such instruments probably don’t exist. Thank God this paper didn’t run into a referee like this.
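As a sketch of what this kind of mathiness buys you, here is an MRW-style OLS exercise run on synthetic data. Nothing below comes from the actual paper: the sample size, parameter values, and error structure are all invented, and the point is only that the log-linearized steady-state equation can be estimated with plain least squares.

```python
import numpy as np

rng = np.random.default_rng(1)
n_countries = 120

# Synthetic cross-section of log saving rates and log(n + g + delta).
ln_sk = rng.normal(np.log(0.20), 0.3, n_countries)    # physical-capital saving
ln_sh = rng.normal(np.log(0.10), 0.3, n_countries)    # human-capital "saving"
ln_ngd = rng.normal(np.log(0.075), 0.1, n_countries)  # n + g + delta

alpha, beta = 1 / 3, 1 / 3  # factor shares, assumed for the sketch
b_k = alpha / (1 - alpha - beta)
b_h = beta / (1 - alpha - beta)

# Steady-state log income per worker implied by the augmented Solow model,
# plus an error term assumed (as MRW assume) uncorrelated with the regressors.
ln_y = (5.0 + b_k * ln_sk + b_h * ln_sh - (b_k + b_h) * ln_ngd
        + rng.normal(0, 0.3, n_countries))

# OLS, exactly in the spirit of MRW: no instruments, just least squares.
X = np.column_stack([np.ones(n_countries), ln_sk, ln_sh, ln_ngd])
coef, *_ = np.linalg.lstsq(X, ln_y, rcond=None)
print(coef)  # estimates land roughly near [5, 1, 1, -2]
```

The hand-waving (exogenous saving, aggregate data, assumed exogeneity of the error) is exactly what makes the exercise this short.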

My own opinion is that mathematical sloppiness can be perfectly fine if it deals with a feature that is not a focus of the paper. Hand-waving of this sort likely comes at very little cost and may have benefits by eliminating a lengthy discussion of issues only tangentially related to the paper. On the other hand, if the hand-waving occurs when analyzing or discussing central features of the paper, then I am much more inclined to ask the researcher to do the analysis right. This type of hand-waving happens sometimes, but it is not clear that it happens more often in macroeconomics, let alone in freshwater macro. Ironically, the freshwater economists Romer criticizes are much more likely to demand a tight specification and precise analysis of the model (whether it is called for or not).

[Fn1] If you are interested in the modern proof of Fermat’s Last Theorem I highly recommend this documentary.

In Praise of Linear Models …

Noah Smith and Francis Coppola have recent columns discussing the prevalence of linear methods in economics and in particular in macroeconomics.

Coppola acknowledges that adding financial components to DSGE models is a step in the right direction but that it “does not begin to address the essential non-linearity of a monetary economy. … Until macroeconomists understand this, their models will remain inadequate.” (This is ok but to me it sounds like she’s going into Deepak Chopra mode a bit.)

Noah gives a good description of the properties and benefits of linear models.

“[Linear models are] easy to work with. Lines can only intersect at one point, so […] there’s only one thing that can happen.” In non-linear models “the curves can bend back around and [could] meet in some faraway location. Then you have a second equilibrium – another possible future for the economy. […] if you go with the full, correct [non-linear] versions of your models, you stop being able to make predictions about what’s going to happen to the economy. […] Also, linearized models […] are a heck of a lot easier to work with, mathematically. […] As formal macroeconomic models have become more realistic, they’ve become nastier and less usable. Maybe their days are simply numbered.”

There are elements of truth in both columns but on the whole I don’t share their assessments. In fact, I am firmly of the opinion that linear solution methods are simply superior to virtually any other solution method in macroeconomics (or any other field of economics for that matter). Moreover, I suspect that many younger economists and many entering graduate students are gravitating towards non-linear methods for no good reason and I also suspect they will end up paying a price if they adopt the fancier modelling techniques.

Why am I so convinced that linear methods are a good way to proceed? Let me count the ways:

1. Most empirical work is done in a (log) linear context. In particular, regressions are linear. This is important because when macroeconomists (or any economists) try to compare predictions with the data, the empirical statements are usually stated in terms that are already linear. This comparison between theory and data is easier if the models are making predictions that are couched in a linear framework. In addition, there aren’t many empirical results that speak clearly on non-linearities in the data. Empirical economists have told me several times that a well-known economist once remarked that “either the world is linear, or it’s log-linear, or God is a son-of-a-bitch.”

2. Linear doesn’t mean simple. When I read Francis’ column it seems that her main complaint lies with the economic substance of the theories rather than the solution method or the approximate linearity of the solutions. She talks about lack of rationality, heterogeneity, financial market imperfections, and so forth. None of these things requires a fundamentally non-linear approach. On the contrary, the more mechanisms and features you shove into a theory, the more you will benefit from a linear approach. Linear systems can accommodate many features without much additional cost.

3. Linear DSGE models can be solved quickly and accurately. Noah mentioned this but it bears repeating. One of the main reasons to use linear methods is that they are extremely efficient and extremely powerful. They can calculate accurate (linear) solutions in milliseconds. In comparison, non-linear solutions often require hours or days (or weeks) to converge to a solution. [Fn 1]

4. The instances where non-linear results differ importantly from linear results are few and far between. The premise behind adopting a non-linear approach is that knowing about the slopes (or elasticities) of the demand and supply curves is not sufficient. We have to know about the curvatures of the demand and supply curves too. On top of the fact that we don’t really know much about these curvature terms, the presumption itself is highly suspect. If we are picking teams, I’ll pick the first order terms and let you pick all of the higher order terms you want any day of the week (and I’ll win). In cases in which we can calculate linear vs. non-linear solutions, the differences are typically embarrassingly small and even when there are noticeable differences, they often go away as we improve the non-linear solution. A common exercise is to calculate the growth convergence path for the neoclassical (Ramsey) growth model using discrete dynamic programming techniques and then compare the solution to the one you get from a linearized solution in the neighborhood of the balanced growth path. The dynamic programming approach is non-linear – it allows for an arbitrary reaction on a discrete grid space. When you plot out the two responses, it is clear that the two solutions aren’t the same. However, as we add more and more grid points, the two solutions start to look closer and closer. (The time required for computation of the dynamic programming solution grows of course.)

5. Unlike with non-linear models, approximate analytical results can be recovered from linear systems. This is an underappreciated side benefit of adopting a linear approach. The linear equations can be solved by hand to yield productive insights. There are many famous examples of log-linear relationships that have led to well-known empirical studies based on their predictions: Hall’s log-linear Euler equation, the New Keynesian Phillips Curve, log-linear labor supply curves, etc. Mankiw, Romer and Weil (1992) used a linear approximation to crack open the important relationship between human capital and economic growth.
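To illustrate the first example in point 5: Hall's random-walk result drops out of a log-linearized Euler equation in a few lines. A sketch of the standard derivation, assuming CRRA utility with risk-aversion coefficient $\gamma$, a constant real rate $r$, and discount factor $\beta = e^{-\rho}$:

```latex
% Euler equation with CRRA utility, u'(c) = c^{-\gamma}:
c_t^{-\gamma} = \beta (1+r)\, \mathbb{E}_t\!\left[ c_{t+1}^{-\gamma} \right]
% Take logs, use \ln\beta = -\rho and \ln(1+r) \approx r, and drop
% second-order (variance) terms:
\mathbb{E}_t\!\left[ \Delta \ln c_{t+1} \right] \approx \frac{1}{\gamma}\,(r - \rho)
% So consumption growth is unforecastable beyond a constant drift:
\Delta \ln c_{t+1} = \frac{1}{\gamma}(r - \rho) + \varepsilon_{t+1},
\qquad \mathbb{E}_t[\varepsilon_{t+1}] = 0
```

The log-linear form is what turns an abstract first-order condition into a regression-ready prediction: consumption growth should be unpredictable given time-$t$ information.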
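The grid experiment described in point 4 is easy to reproduce. The sketch below uses a special case with a known closed-form policy (log utility, full depreciation), so the “linear” solution is just the linearization of the exact policy around the steady state; the parameter values, grid bounds, and iteration counts are arbitrary assumptions.

```python
import numpy as np

alpha, beta = 0.3, 0.95  # assumed technology and discounting parameters
# With log utility and full depreciation the exact policy is known:
# k' = alpha*beta*k^alpha, with steady state k* = (alpha*beta)^(1/(1-alpha)).
kstar = (alpha * beta) ** (1 / (1 - alpha))

def vfi_policy(n_grid, n_iter=500):
    """Discrete dynamic programming: value-function iteration on a k-grid."""
    k = np.linspace(0.2 * kstar, 1.8 * kstar, n_grid)
    c = k[:, None] ** alpha - k[None, :]  # consumption for each (k, k') pair
    u = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf)
    V = np.zeros(n_grid)
    for _ in range(n_iter):
        V = (u + beta * V[None, :]).max(axis=1)
    policy = (u + beta * V[None, :]).argmax(axis=1)
    return k, k[policy]

def max_gap(n_grid):
    """Largest gap, near the steady state, between the grid policy and the
    linearized policy k* + alpha*(k - k*); alpha is the exact policy's slope
    at the steady state in this special case."""
    k, kprime = vfi_policy(n_grid)
    lin = kstar + alpha * (k - kstar)
    near = np.abs(k - kstar) < 0.3 * kstar
    return np.abs(kprime - lin)[near].max()

gap_coarse, gap_fine = max_gap(50), max_gap(500)
print(gap_coarse, gap_fine)  # the gap shrinks as the grid is refined
```

As in the text: the coarse-grid dynamic-programming policy visibly differs from the linear one, but refining the grid shrinks the gap, at the cost of more computation.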

Given the huge benefits of linear approaches, the main question I have is not why researchers don’t adopt non-linear, global solution approaches, but rather why linear methods aren’t used even more widely than they already are. One example in particular concerns the field of Industrial Organization (IO). IO researchers are famous for adopting complex non-linear modelling techniques that are intimidating and impressive. They often seem to brag about how it takes even the most powerful computers weeks to solve their models. I’ve never understood this aspect of IO. IO is also known to feature the longest publication and revision lags of any field. It’s possible that some of this is due to the techniques they are using; I’m not sure. I have asked friends in IO why they don’t use linear solutions more often, and the impression I am left with is that it is a combination of an assumption that a linear solution simply won’t work for the kinds of problems they analyze together with a lack of familiarity with linear methods.

There are of course cases in which there is important non-linear behavior that needs to be featured in the results. For these cases it does indeed seem like a linear approach is not appropriate. Brad DeLong argued that accommodating the zero lower bound on interest rates (i.e., the “flat part of the LM curve”) is such a case. I agree. There are cases like the liquidity trap that clearly entail important aggregate non-linearities, and in those instances you are forced to adopt a non-linear approach. For most other cases, however, I’ve already chosen my team …


[Fn 1] This reminds me of a joke from the consulting business. I once asked an economic consultant why he thought he could give valuable advice to people who have been working in an industry their whole lives. He told me that I was overlooking one important fact – the consultants charge a lot of money.

Back to Blogging …

I haven’t written a post in a long, long time.  Research, referee reports, teaching and administrative work have all been preventing me from contributing to the blog.  This summer I’m going to try to get back into writing somewhat regularly (famous last words).

The Lure of Capital Taxation

The taxation of capital income – corporate profits, capital gains, interest, dividends, etc. – has always been attractive to both politicians and to economists.

Politicians are drawn to capital taxation because it presents an opportunity to raise a substantial amount of revenue while remaining politically popular in a way that taxing labor income or eliminating the home mortgage interest deduction is not.

Economists are drawn to capital income taxation not because it is a good idea – many (most?) economists oppose taxing capital income – but because it presents a number of academically interesting features that are worthy of study. In particular, capital income taxation presents several tricky dynamic considerations that many other forms of taxation do not.

Suppose the United States government sharply increases taxes on business income – say by increasing the corporate profit tax rate. What are the short-run and long-run consequences of such a policy?

In the immediate wake of the higher tax, standard theory predicts that essentially nothing happens to production or employment. This probably strikes many of you as puzzling. If I tax business income, shouldn’t I discourage the creation of this income? According to standard economic analysis, such a tax should not. The reason is that real business capital – the equipment, machines, factories, etc. – is inelastically supplied in the short run. The factories are all still here. The grills, ovens and sinks still work, as do the espresso machines. The excavators still run … All of the capital that was functioning before the tax is functioning after the tax. Moreover, businesses cannot escape capital taxes by laying off workers – the firm’s wage bill is already tax deductible. As a result, the firm should behave the same in the face of the capital tax hike. The immediate effect of the capital tax increase is simply a reduction in after-tax capital income for owners, investors and savers.  Put differently, in the short run the capitalists bear the entire burden of the tax increase.

As time passes, however, the burden of the tax hike shifts from capital owners to workers. While the tax increase doesn’t affect the supply of capital in the short run, according to the standard model it discourages capital accumulation over time. The capital stock falls to the point at which the after-tax real rate of return is restored to its long-run equilibrium level – in the standard model, usually the exact same rate of return that the economy began with. At the same time, the reduction in capital reduces the demand for labor, and so wages fall.  So, in the long run, the consequences of an increase in capital taxation are the opposite of those in the short run: the workers bear all of the tax burden and the capitalists bear none of it.
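The short-run versus long-run incidence story can be illustrated with a back-of-the-envelope Cobb-Douglas sketch, Y = K^α L^(1-α) with labor fixed at one; all parameter values here are purely illustrative assumptions:

```python
# Illustrative sketch of tax incidence in a Cobb-Douglas economy
# Y = K**alpha * L**(1-alpha) with L fixed at 1.
alpha = 0.3    # capital share (illustrative)
rho   = 0.05   # after-tax return required by savers in the long run
tau   = 0.20   # new tax rate on gross capital income

# Pre-tax long-run equilibrium: alpha * K**(alpha-1) = rho
K0 = (alpha / rho) ** (1 / (1 - alpha))
w0 = (1 - alpha) * K0 ** alpha            # wage = marginal product of labor

# Short run: K is fixed, so the wage is unchanged and the net return
# simply drops by the tax -- capitalists bear the whole burden.
r_net_short = (1 - tau) * alpha * K0 ** (alpha - 1)

# Long run: K falls until the after-tax return is back at rho:
#   (1 - tau) * alpha * K**(alpha-1) = rho
K1 = ((1 - tau) * alpha / rho) ** (1 / (1 - alpha))
w1 = (1 - alpha) * K1 ** alpha            # lower capital stock -> lower wage

print(f"short run: wage {w0:.3f} (unchanged), net return {r_net_short:.3f} (was {rho:.3f})")
print(f"long run:  wage {w1:.3f} (down {100 * (1 - w1 / w0):.1f}%), net return back at {rho:.3f}")
```

The output shows exactly the reversal described above: on impact the wage is untouched and the net return falls by the full tax, while in the new long-run equilibrium the net return is back at ρ and the entire burden shows up as a lower wage.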

The fact that the economic incidence of a tax is different from its statutory incidence is one of the classic insights of public finance economics. The dynamic effects of a change in capital taxes are a particularly stark example of this distinction. The difference between the short-run and long-run effects of the tax change comes from the difference between the short-run and long-run supply of capital.  In the short run, the capital stock “is what it is.” Over time, however, capital can be accumulated or decumulated – that is, the long-run supply is relatively elastic. This simple insight alone strongly suggests that the optimal tax on capital income should be higher in the short run than in the long run.

The tension between long run and short run optimal taxes also presents a problem for policy makers.  While they will constantly be tempted to tax capital, they will also want to commit to lower taxes in the future.

Among the most famous results in modern public finance is the Chamley-Judd theorem (see C. Chamley 1986 and K. Judd 1985). This result says that if tax rates are chosen optimally then, if the economy attains a steady state, the tax rate on capital income should be zero (!). Many economists (myself included) have interpreted this result as a formal articulation of the intuitions above.
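One standard way to see the intuition behind the result (a textbook heuristic, not a summary of the actual proofs): with a constant net interest rate r, a constant tax τ on capital income compounds into an ever-growing implicit tax on consumption far in the future.

```latex
% Price of date-t consumption relative to date-0 consumption,
% without and with a constant capital income tax \tau:
\frac{p_t}{p_0}\Big|_{\text{no tax}} = (1+r)^{-t},
\qquad
\frac{p_t}{p_0}\Big|_{\text{tax}} = \bigl[1+(1-\tau)r\bigr]^{-t}

% The implied wedge on date-t consumption therefore compounds without bound:
\left[\frac{1+r}{1+(1-\tau)r}\right]^{t} - 1
\;\longrightarrow\; \infty
\quad\text{as } t \to \infty
```

A permanently positive capital tax is thus equivalent to an unboundedly large tax on consumption in the distant future – a distortion an optimal plan avoids, which is why the tax should be driven to zero in the steady state.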

Recently, there may have been an important development in the literature on optimal capital taxation. Ludwig Straub and Ivan Werning (MIT) have circulated a paper in which they argue that the Chamley-Judd result is of more limited importance than is commonly believed.  They argue that under certain circumstances optimal tax paths do not lead to a steady state in the usual sense, and in these cases the tax on capital income can be positive. In other cases, while the Chamley-Judd result technically holds, the steady state is attained only after an extended period of time (centuries in their examples) and is thus of limited practical relevance.

I am not going to go into the details of the Straub and Werning paper in a blog post. The short version of the explanation is that the suppositions required for the Chamley-Judd result are more stringent than one would imagine. For the textbook Chamley-Judd result to hold, the economy must approach an interior steady state with finite Lagrange multipliers for the various constraints in the model.  In many of the solutions in the new paper, the economy converges to a rest point with zero capital (and is thus not interior).  In other solutions, the Lagrange multipliers do not converge.

In fairness, the counter-examples Straub and Werning provide all involve isoelastic preferences, which sometimes have odd features in certain settings.  It is not clear to me whether the authors’ findings extend to more flexible preference specifications, but what is clear is that researchers will have to look much more closely at the meaning of one of the most prominent results in modern public finance theory.

More to come I’m sure …