Homo Economicus vs. Homo Economicus “Straw-icus”

I haven’t written on my blog for a really long time, but I am approaching a point where I might actually have time to post somewhat regularly again. This particular post is a response to a blog post by Noah Smith, who took issue with Milton Friedman’s famous pool player analogy. Noah concludes that “the pool player analogy is silly,” and his reasons for arriving at this conclusion are:

  1. If actual pool players never missed their shots, there would be no use for the physics equations as a prediction and analysis tool.
  2. People make mistakes so they don’t always optimize.
  3. People who make bad decisions don’t tend to go away over time.
  4. Unlike in pool, we rarely know what the objective is.

I must confess that the pool player analogy is one of my favorite analogies in economics and I use it often when I am talking to people about economics.  The pool player analogy is a common response to typical criticisms of standard microeconomic analysis.  Microeconomic conclusions like “consumers should make choices that equalize the marginal utility per dollar across goods” or “firms should make input choices so that the marginal product of capital equals the real (product) rental price of capital” often seem very abstract and technical and they invite natural objections like “real-life people don’t behave like that” or “in the real world, firms don’t make calculations like this.”  Such reactions are natural and this is exactly where the pool player analogy fits in.  In pool, making the best shots would seem to require a host of extremely involved calculations involving physics concepts that most real-life players won’t know. Making an optimal shot requires considerations of the friction of the felt on the table, considerations of angular momentum, transfers of energy, torque and so on.  Even if a pool player knew about all of this stuff one would guess that it would require a long time between shots as the player made a series of complex calculations prior to taking the shot. Nevertheless, the actual shot taken will look a lot like the one implied by such a calculation. That is, the pool player’s actual behavior will closely resemble the behavior implied by the optimal physics calculation.

The central guiding principle of economic analysis is that behavior is guided by self-interest. Consumers and firms make choices that (in their assessment) make them as well-off as possible. Sometimes this principle is summarized by the term homo economicus or “economic man,” which essentially views mankind as composed of many self-interested autonomous beings. The pool player is one manifestation of the behavior of a self-interested individual making choices under a constraint, but so are ordinary people. If you go to lunch at a Chinese restaurant, you will in all likelihood be confronted with a staggering number of choices, options, prices, substitutions, and combinations. One might be tempted to conclude that it is impossible to make an optimal choice from such a menu because there are simply too many possibilities to consider. However, when I actually go to lunch at the local Chinese restaurant I typically see people ordering relatively quickly, sometimes individualizing their choices, and I suspect that most people do not believe they typically make mistakes ordering lunch. Homo economicus does a pretty good job ordering lunch, commuting to work, etc. This is not to say that economic man doesn’t make mistakes or overlook things. That is inevitable and surely happens often in the real world. The question you have to ask yourself is this: do you think people typically make something close to the best choices, or do you think they typically make severe mistakes in their economic decisions? If you agree with the former then you are thinking like a mainstream economist.

Noah seems to think that because people make mistakes, and because we can’t know the true (mathematical) objective functions, this appeal to optimal behavior is misguided. To me it seems like Noah is not criticizing “economic man” so much as he is criticizing “economic straw man.” In the article he arrives at this startling conclusion, writing

[If] really good pool players made 100% of their shots, there wouldn’t be pool tournaments. It would be no fun, because whoever went first would always win. But in fact, there are pool tournaments. So expert pool players do, in fact, miss.

(… shocking, I know. How could we have not seen this?).

Economists do not assume that people don’t make mistakes (though in fairness it is not completely obvious how to model mistakes analytically). I can think of exactly zero economists who think that all behavior is optimal from some omniscient / omnipotent point of view. I certainly don’t believe this – I regularly play online chess and I always overlook moves that are to my advantage or moves my opponent could play that would be really bad for me. This doesn’t in any way suggest that my behavior isn’t guided by my own self-interest or that my move isn’t what I think is my best option given the current position. Moreover, I would think that, in chess, starting from the assumption that players choose one of the better moves available would often provide a good guide to actual game play – this is almost certainly true for professional and semi-professional players.

Imagine, if you will, actually trying to construct a model of pool playing. Specifically, let’s consider 9-ball. In 9-ball the balls are sunk in order (unless the 9 goes in on the break). This greatly reduces the strategic nature of the game: the player typically knows which shot he wants to hit, and the task is simply to execute it. If I were to approach such a problem (which might actually be interesting from a behavioral economics point of view) I might adopt the following approach. Suppose the best shot could be described by a vector S, which would include spin, speed/power, angle, etc. A perfect player would simply take the shot given by S. We could think of a real-world player as taking a shot that includes an error e: the spin will be a little off, the angle will be off a bit too, and so forth. So any one actual shot could be given by S + e. You might think that the best players typically make small errors while inexperienced players make bigger errors. I would think this would be a pretty good description of actual pool, but note that I would absolutely begin with the idealized shot S (the shot made by homo economicus).
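The S + e setup is easy to sketch in code. Everything below is an illustrative assumption on my part (the shot vector, the noise scales, the number of shots); the only point is that both players aim at the same idealized shot, and skill shows up as the size of the error around it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Idealized shot chosen by the homo economicus pool player:
# (spin, speed, angle) -- the numbers are purely illustrative.
S = np.array([0.3, 4.0, 27.5])

def take_shot(skill_sd):
    """An actual shot is the ideal shot S plus a random error e whose
    size shrinks as the player gets better (smaller skill_sd)."""
    e = rng.normal(0.0, skill_sd, size=S.shape)
    return S + e

expert = np.array([take_shot(0.05) for _ in range(1000)])
novice = np.array([take_shot(0.50) for _ in range(1000)])

# Both players' shots are centered on the ideal shot S ...
assert np.allclose(expert.mean(axis=0), S, atol=0.05)
# ... but the expert's shots cluster far more tightly around it.
assert expert.std(axis=0).mean() < novice.std(axis=0).mean()
```

Nothing here requires the player to *compute* S; the model only claims the distribution of actual shots is centered on it.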

Noah also makes a big deal about the fact that we don’t know the objective. The fact that we typically don’t know the objective is not necessarily a problem. If an economist looks at a game being played whose rules she isn’t familiar with, she will still tend to treat the moves she observes as though they are guided by some latent objective. (And you might guess that after watching such behavior for a while, the economist might be able to deduce the objective even if she doesn’t have advance knowledge of it.)

In short, I have always felt – and I continue to feel – that Friedman’s pool player is an excellent way to convey how economists approach their subject. Only if we were to insist that people really were computerized robots or Vulcans always making perfectly optimal, logical choices with regard to some easily described mathematical objective would this analogy present a problem – in fact it’s only a problem for the economic straw man.  So, until Noah comes up with some more serious objections I’m going to keep this analogy on my go-to list of explanations for basic economics.

In the meantime, let me leave you with this opening quote from The Color of Money.

A player can make eight trick-shots in a row, blow the 9 and lose.

On the other hand, a player can get the 9 in on the break, if the balls are spread right, and win.

Which is to say that luck plays a part in 9-ball.

But for some players, luck itself is an art.

Is Big Data the answer to all our problems?

Noah Smith recently commented on a Malcolm Gladwell talk in which he (Gladwell) expressed doubts about the promise of “Big Data.” When people use the term “Big Data” they are often referring to datasets built from digital data streams like Google searches, the Twitter feed, and so forth, but it can also refer simply to large datasets that were previously too cumbersome for researchers to access easily. In any case, Gladwell says that this newfound data availability is not our salvation. He also claims that this data might be a curse. Noah devotes most of his comment to countering Gladwell’s claims. I’m actually not a huge Malcolm Gladwell fan and, like Noah, I basically disagree with the idea that having more data could be a problem for us. More data must be better. (OK, there is the Snowden issue and government invasions of privacy more generally, but let’s leave those problems aside – I don’t think that kind of intrusive surveillance is what either Gladwell or Noah has in mind.)

There are some aspects of Big Data that I’ve been thinking about for a little while that seem at least somewhat relevant to Gladwell’s argument and Noah’s post. Many researchers – and many economists in particular – see Big Data as a huge benefit to their field.  Indeed, some view the arrival of these new datasets as a transformative event in social science.  Speaking for myself only, I have some doubts that this new data will be as much of a benefit as many are predicting. In economics, many researchers are being drawn to these datasets without having a direct purpose or plan in mind. To me, this is most concerning with graduate students who are under lots of pressure and sometimes hold out hope that a huge dataset will be like the Holy Grail for an underdeveloped research portfolio. After waiting to obtain their data, the graduate students typically are let down when they realize that the data doesn’t really address the questions they were interested in, or that the data needs to be cleaned and arranged into a useable form which takes a tremendous amount of work, or that they were hoping that the dataset would present an obvious killer question or killer instrument and the data fails to deliver.

Another thing that pops up all too frequently is the idea that a bigger dataset is automatically superior simply because it has many observations. This is often clearly not the case, and it’s painful to see this realization fall upon a researcher (sometimes during their presentation). To take a really obvious example, suppose you are interested in whether extensions of unemployment benefits reduce labor supply by leading people to search less while they are still receiving payments. A dataset with state-level unemployment rate data over the period 2005-2014 might actually be able to speak to this question. In contrast, a dataset with 100 million daily individual observations for a single year isn’t going to help you at all if there is no variation in unemployment benefit policy in that year. Sure, it’s impressive that you can get such a dataset, but it isn’t useful for your research question. Sometimes in seminars, the presenter will intentionally advertise the scope of the dataset in a futile effort to impress the audience. It never works. It’s similar to the related unsuccessful tactic of trying to impress the audience by telling them how long it takes your computer to solve a complicated dynamic programming problem.
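The point about identifying variation can be made concrete with a toy regression. The numbers below are entirely made up for illustration: a huge dataset with no variation in the policy variable tells you nothing about the policy’s effect, while a far smaller dataset with variation does:

```python
import numpy as np

rng = np.random.default_rng(1)

beta = 2.0  # true (made-up) effect of the policy on the outcome

# Small dataset WITH policy variation: half the units face the policy.
policy_small = rng.integers(0, 2, size=500).astype(float)
y_small = 1.0 + beta * policy_small + rng.normal(0, 1, 500)

# Huge dataset with NO policy variation: everyone faces the same rule.
policy_big = np.ones(1_000_000)
y_big = 1.0 + beta * policy_big + rng.normal(0, 1, 1_000_000)

def ols_slope(x, y):
    """OLS slope = cov(x, y) / var(x); undefined when x never varies."""
    vx = x.var()
    if vx == 0.0:
        return None  # no identifying variation, however many observations
    return ((x - x.mean()) * (y - y.mean())).mean() / vx

print(ols_slope(policy_small, y_small))  # close to the true effect, 2.0
print(ols_slope(policy_big, y_big))      # None: 2000x the data, zero information
```

Sample size buys precision, but only conditional on there being variation to exploit in the first place.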

Stuff like this comes up all the time. I was in a seminar where a researcher was using individual household-level consumption data to test the permanent income hypothesis (PIH). The dataset was quite nice, but the consumption measure combined both durable and nondurable goods, and unfortunately the PIH applies only to nondurable consumption spending. When the researcher was asked why they used the individual data rather than aggregate data (which does break out nondurable consumption), the response was simply that they felt individual data was better than aggregate data (?).

Firm-level data is another pet peeve of mine. I can’t tell you the number of times I’ve heard people say that the reason that we should use firm level data is because this is just what people do these days. Firm-level data is particularly noteworthy because one of the classic issues in economics deals with the nature of a firm itself. The straight neoclassical perspective is that the notion of a firm is not particularly well defined. Two mechanic shops that operate independently would appear as two firms in a typical dataset but if the owner of one of the shops sells it to the other owner, these firms would suddenly become a single observation. This problem reminds me of a quote by Frank Zappa: “The most important thing in art is the frame. … without this humble appliance, you can’t know where the art stops and the real world begins.” A similar thing occurs with firm level data. We have a bunch of underlying behavior and then there are these arbitrary frames placed around groups of activity. We call these arbitrary groupings “firms.” Arbitrary combinations (or breakdowns) like this surely play a large role in dictating the nature of firm-level data. In the end it’s not clear how many real observations we actually have in these datasets.

In the past I’ve been fortunate enough to work with some students who used hand-collected data.[1] Data like this is almost always fairly small in comparison with real-time data or administrative data. Despite this apparent size disadvantage, hand-collected data has some advantages that are worth emphasizing. First, the researcher will necessarily know much more about the way the data was collected. Second, the data can be collected with the explicit aim of addressing a specifically targeted research question. Third, building the data from the ground up invites the researcher to confront particular observations that might be noteworthy for one reason or another. In fact, I often encourage graduate students to look in depth at individual observations to build their understanding of the data. This will likely not happen with enormous datasets.

Again, this is not to say that more data is in any way a disadvantage. However, like any input into the research process, the choice of data should be given some thought. A similar thing came up perhaps 15 years ago when more and more powerful computers allowed us to expand the set of models we could analyze. This was greeted as a moment of liberation by some economists, but soon the moment of bliss gave way to reality. Adding a couple more state variables wasn’t going to change the field; solving the model a bit more accurately and faster wasn’t going to expand our understanding by leaps and bounds. Better? No doubt. A panacea? Not at all.

The real constraint on economics research has always been, and continues to be, a shortage of ideas and creativity. Successfully pushing the boundaries of our understanding requires creative insights coupled with accurate quantitative modelling and good data and empirical work. The kinds of insights I’m talking about won’t be found just lying around in any dataset – no matter how big it is.

[1] One of my favorite examples of such data work is by Ed Knotek who collected a small dataset on prices of goods that were sold in convenience stores but also sold in large supermarkets. See “Convenient Prices and Price Rigidity: Cross-Sectional Evidence,” Review of Economics and Statistics, 2011.

Warren Buffett: Fighting Income Inequality with the EITC

Warren Buffett’s article in the Wall Street Journal reminds me of some posts I wrote a while back on fighting income inequality. His article contains a lot of wisdom. Some excerpts:

The poor are most definitely not poor because the rich are rich. Nor are the rich undeserving. Most of them have contributed brilliant innovations or managerial expertise to America’s well-being. We all live far better because of Henry Ford, Steve Jobs, Sam Walton and the like.

He writes that an expansion of the minimum wage to 15 dollars per hour

would almost certainly reduce employment in a major way, crushing many workers possessing only basic skills. Smaller increases, though obviously welcome, will still leave many hardworking Americans mired in poverty. […]  The better answer is a major and carefully crafted expansion of the Earned Income Tax Credit (EITC).

I agree entirely and so would Milton Friedman.

Unlike the minimum wage, which draws money from an abstract group of individuals, some of whom are not high income earners, the EITC draws funds from the broad U.S. tax base, which in turn draws more heavily from upper income individuals. Unlike the minimum wage, the EITC can be directed at low income households rather than low wage individuals — many of whom are simply teenagers working summer jobs. Unlike the minimum wage, which discourages employment, the EITC encourages employment.

Buffett also proposes some common sense modifications to the EITC which I would welcome. In addition to reducing fraud, …

There should be widespread publicity that workers can receive free and convenient filing help. An annual payment is now the rule; monthly installments would make more sense, since they would discourage people from taking out loans while waiting for their refunds to come through.

The main problems with such an expansion of the EITC are political. Taking a serious swing at reducing income inequality would require a lot of money. Republicans will likely oppose it because it is “socialist” or some such nonsense. I’m sure that many Democrats would be receptive to the kind of aggressive expansion Buffett alludes to, but the cost in political capital might be too great for a politician to pay.


(Paul) Romer’s Rant

Paul Romer has decided that it is time to air some grievances. In a widely discussed recent article in the AER Papers and Proceedings volume, he calls out some prominent macroeconomists for the alleged crime of “Mathiness.”  Several blog commenters have offered their interpretations of the main thrust of Romer’s thesis. I admit that after reading the article I am not entirely sure what Romer means by “mathiness”. Noah Smith interprets mathiness as the result of “using math in a sloppy way to support […] preferred theories.”  In his follow-up article in Bloomberg, Noah says much of the blame comes from abusing “mathematical theory by failing to draw a tight link between mathematical elements and the real world.”

Hopefully Romer is talking about something more than just mathematical errors in papers by prominent researchers. If this is his main gripe, let me break the bad news: every paper has errors. Sometimes the errors are innocuous; sometimes they are fatal. Errors are not confined to mathematical papers either. There are plenty of mistakes in purely empirical papers – the famous mistake in the Reinhart–Rogoff debt paper, the results from Levitt’s abortion paper, and so on. Mistakes and mess-ups are part of research. Criticizing someone for making lots of mistakes is almost like criticizing them for doing lots of research. If you aren’t making mistakes you aren’t working on stuff that is sufficiently hard. This isn’t to say that I encourage careless or intellectually dishonest work. All I am saying is that mistakes are an inevitable byproduct of research – particularly research on cutting-edge stuff. Moreover, mistakes will often live on. Mistakes are most likely to be exposed and corrected if the paper leads to follow-up work. Unfortunately, most research doesn’t lead to such subsequent work, and thus any mistakes in the original contributions simply linger. This isn’t a big problem of course, since no one is building on the work.

Focusing on mistakes is also not what we should be spending our time on. We should not be discussing or critiquing papers that we don’t value very much. We should be focused on papers that we do value. This is how we judge academics in general. I don’t care whether people occasionally (or frequently) write bad papers. I care whether they occasionally write good ones. We don’t care about the average paper – we care about the best papers (the order statistics).

If Romer is indeed talking about something other than mistakes then I suspect that his point is closer to what Noah describes in his recent columns: mathiness is a kind of mathematical theory that lacks a sufficiently tight link to reality. Certainly, having the “tight link” that Noah talks about is advantageous. Such a connection allows researchers to make direct comparisons between theory and data in a way that is much more difficult if the mapping between the model and the data is not explicit. On the other hand, valuable insights can certainly be obtained even if the theorist appeals to mathematical sloppiness / hand-waving / mathiness, whatever you want to call it. In fact I worry that the pressure on many researchers is often in the opposite direction. Instead of being given the freedom to leave some of their theories somewhat loose / reduced form / partial equilibrium, researchers are implored to solve things out in general equilibrium, to micro-found everything, to be as precise and explicit as possible – often at the expense of realism. I would welcome a bit of tolerance for some hand-waviness in economics. Outside of economics, the famous story of the proof of Fermat’s Last Theorem includes several important instances of what can at least be described as incompleteness if not outright hand-waving. The initial Taniyama–Shimura conjecture was a guess. The initial statement of Gerhard Frey’s epsilon conjecture was described by Ken Ribet (who ultimately proved the conjecture) as a “plausibility argument.” Even though they were incomplete, these conjectures were leading researchers in important directions. Indeed, these guesses and sketches ultimately led to the modern proof of the theorem by Andrew Wiles. Wiles himself famously described the experience of his work as stumbling around in a dark room: if the room is very dark and very cluttered then you will certainly knock things over and stub your toes searching for the light switch. [Fn1]

In economics, some of my favorite papers have a bit of mathiness that serves them brilliantly. A good example occurs in Mankiw, Romer and Weil’s famous paper “A Contribution to the Empirics of Economic Growth” (QJE 1992). As the title suggests, the paper is an analysis of the sources of differences in economic growth experiences. The paper includes a simple theory section and a simple data section. The theory essentially studies simple variations of the standard Solow growth model augmented to include human capital (skills, know-how). The model is essentially a one-good economy with exogenous savings rates. In some corners of the profession, using a model with an exogenous savings rate might be viewed as a stoning offense, but it is perfectly fine in this context. The paper is about human capital, not about the determinants of the saving rate. But that’s not the end of the mathiness. Their analysis proceeds by constructing a linear approximation to the growth paths of the model and then using standard linear regression methods together with aggregate data. Naturally such regressions are typically not identified, but Mankiw, Romer and Weil don’t let that interfere with the paper. They simply assume that the error terms in their regressions are uncorrelated with the savings rates and proceed with OLS. There is a ton of mathiness in this work. And the consequence? Mankiw, Romer and Weil’s 1992 paper is one of the most cited and influential papers in the field of economic growth.
Think about how this paper would be changed if some idiot referee decided that it needed optimizing savings decisions (after all we can’t allow hand-waving about exogenous savings rates), multiple goods and a separate human capital production function (no hand-waving about an aggregate production function or one-good economies), micro data (no hand-waving about aggregate data), and instruments for savings rates, population growth rates and human capital savings rate (no hand-waving about identification).  The first three modifications suggested by the referee would simply be a form of hazing combined with obfuscation (the modifications make the authors jump through hoops for no good reason and the end product has an analysis that is less clear).  The last one – insistence on a valid instrument – would probably be the end of the paper since such instruments probably don’t exist. Thank God this paper didn’t run into a referee like this.
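For readers who haven’t seen the paper, the steady-state relationship that Mankiw, Romer and Weil estimate by OLS takes (in its standard textbook form, reproduced here from memory) roughly the following shape, where $s_k$ and $s_h$ are the physical and human capital saving rates, $n$, $g$, and $\delta$ are population growth, technology growth, and depreciation, and $\alpha$ and $\beta$ are the production elasticities of the two capital stocks:

```latex
\ln\left(\frac{Y}{L}\right)
  = \beta_0
  + \frac{\alpha}{1-\alpha-\beta}\,\ln s_k
  + \frac{\beta}{1-\alpha-\beta}\,\ln s_h
  - \frac{\alpha+\beta}{1-\alpha-\beta}\,\ln(n+g+\delta)
  + \varepsilon
```

The entire empirical strategy rests on assuming $\varepsilon$ is uncorrelated with the regressors; that is exactly the hand-waving being defended above.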

My own opinion is that mathematical sloppiness can be perfectly fine if it deals with a feature that is not a focus of the paper. Hand-waving of this sort likely comes at very little cost and may have benefits by eliminating a lengthy discussion of issues only tangentially related to the paper. On the other hand, if the hand-waving occurs when analyzing or discussing central features of the paper then I am much more inclined to ask the researcher to do the analysis right. This type of hand-waving happens sometimes but it is not clear that it happens more often in macroeconomics or in freshwater macro at that – ironically, the Freshwater guys that Romer criticizes are much more likely to demand a tight specification and precise analysis of the model (whether it is called for or not).

[Fn1] If you are interested in the modern proof of Fermat’s Last Theorem I highly recommend this documentary.

In Praise of Linear Models …

Noah Smith and Francis Coppola have recent columns discussing the prevalence of linear methods in economics and in particular in macroeconomics.

Coppola acknowledges that adding financial components to DSGE models is a step in the right direction but that it “does not begin to address the essential non-linearity of a monetary economy. … Until macroeconomists understand this, their models will remain inadequate.” (This is ok but to me it sounds like she’s going into Deepak Chopra mode a bit.)

Noah gives a good description of the properties and benefits of linear models.

“[Linear models are] easy to work with. Lines can only intersect at one point, so […] there’s only one thing that can happen.” In non-linear models “the curves can bend back around and [could] meet in some faraway location. Then you have a second equilibrium – another possible future for the economy. […] if you go with the full, correct [non-linear] versions of your models, you stop being able to make predictions about what’s going to happen to the economy. […] Also, linearized models […] are a heck of a lot easier to work with, mathematically. […] As formal macroeconomic models have become more realistic, they’ve become nastier and less usable. Maybe their days are simply numbered.”

There are elements of truth in both columns but on the whole I don’t share their assessments. In fact, I am firmly of the opinion that linear solution methods are simply superior to virtually any other solution method in macroeconomics (or any other field of economics for that matter). Moreover, I suspect that many younger economists and many entering graduate students are gravitating towards non-linear methods for no good reason and I also suspect they will end up paying a price if they adopt the fancier modelling techniques.

Why am I so convinced that linear methods are a good way to proceed? Let me count the ways:

1. Most empirical work is done in a (log) linear context. In particular, regressions are linear. This matters because when macroeconomists (or any economists) compare predictions with the data, the empirical statements are usually already stated in linear terms. The comparison between theory and data is easier if the model’s predictions are couched in a linear framework. In addition, there aren’t many empirical results that speak clearly to non-linearities in the data. Several empirical economists have told me that a well-known economist once said that “either the world is linear, or it’s log-linear, or God is a son-of-a-bitch.”

2. Linear doesn’t mean simple. When I read Francis’ column it seems that her main complaint lies with the economic substance of the theories rather than the solution method or the approximate linearity of the solutions. She talks about lack of rationality, heterogeneity, financial market imperfections, and so forth. None of these things requires a fundamentally non-linear approach. On the contrary, the more mechanisms and features you shove into a theory, the more you will benefit from a linear approach. Linear systems can accommodate many features without much additional cost.

3. Linear DSGE models can be solved quickly and accurately. Noah mentioned this but it bears repeating. One of the main reasons to use linear methods is that they are extremely efficient and extremely powerful. They can calculate accurate (linear) solutions in milliseconds. In comparison, non-linear solutions often require hours or days (or weeks) to converge to a solution. [Fn 1]

4. The instances where non-linear results differ importantly from linear results are few and far between. The premise behind adopting a non-linear approach is that knowing about the slopes (or elasticities) of the demand and supply curves is not sufficient. We have to know about the curvatures of the demand and supply curves too. On top of the fact that we don’t really know much about these curvature terms, the presumption itself is highly suspect. If we are picking teams, I’ll pick the first order terms and let you pick all of the higher order terms you want any day of the week (and I’ll win). In cases in which we can calculate linear vs. non-linear solutions, the differences are typically embarrassingly small and even when there are noticeable differences, they often go away as we improve the non-linear solution. A common exercise is to calculate the growth convergence path for the neoclassical (Ramsey) growth model using discrete dynamic programming techniques and then compare the solution to the one you get from a linearized solution in the neighborhood of the balanced growth path. The dynamic programming approach is non-linear – it allows for an arbitrary reaction on a discrete grid space. When you plot out the two responses, it is clear that the two solutions aren’t the same. However, as we add more and more grid points, the two solutions start to look closer and closer. (The time required for computation of the dynamic programming solution grows of course.)

5. Unlike non-linear models, approximate analytical results can be recovered from the linear systems. This is an underappreciated side benefit of adopting a linear approach. The linear equations can be solved by hand to yield productive insights. There are many famous examples of log-linear relationships that have led to well-known empirical studies based on their predictions. Hall’s log-linear Euler equation, the New Keynesian Phillips Curve, log-linear labor supply curves, etc. Mankiw, Romer and Weil (1992) used a linear approximation to crack open the important relationship between human capital and economic growth.
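The comparison described in point 4 is easy to reproduce for a toy growth model. The sketch below uses a textbook Solow model with illustrative parameters (standing in for the full Ramsey dynamic programming exercise, which behaves similarly): it iterates the exact non-linear law of motion alongside its log-linearization and reports the gap between the two convergence paths:

```python
import numpy as np

# Textbook Solow model, k' = s*k**alpha + (1-delta)*k, with made-up parameters.
alpha, s, delta = 0.33, 0.2, 0.1
k_star = (s / delta) ** (1 / (1 - alpha))   # steady-state capital stock

T = 200
k_nl = np.empty(T)
k_lin = np.empty(T)
k_nl[0] = k_lin[0] = 0.5 * k_star           # start at half the steady state

# Convergence speed implied by log-linearizing around k_star:
# d ln(k')/d ln(k) evaluated at the steady state equals 1 - (1-alpha)*delta.
lam = 1 - (1 - alpha) * delta

for t in range(T - 1):
    # exact non-linear law of motion
    k_nl[t + 1] = s * k_nl[t] ** alpha + (1 - delta) * k_nl[t]
    # linearized law of motion in logs: ln k' - ln k* = lam * (ln k - ln k*)
    k_lin[t + 1] = k_star * (k_lin[t] / k_star) ** lam

# Largest gap between the two paths, in log points.
max_gap = np.max(np.abs(np.log(k_lin) - np.log(k_nl)))
print(max_gap)   # small relative to the roughly 0.69 log-point initial deviation
assert abs(k_nl[-1] - k_star) / k_star < 1e-5   # both paths reach the steady state
```

Even starting 50 percent below the steady state, the two transition paths stay within a few hundredths of a log point of each other, which is the pattern described above for the dynamic programming comparison.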

Given the huge benefits of linear approaches, the main question I have is not why researchers don’t adopt non-linear, global solution approaches but rather why linear methods aren’t used even more widely than they already are. One example in particular concerns the field of Industrial Organization (IO). IO researchers are famous for adopting complex non-linear modelling techniques that are intimidating and impressive. They often seem to brag about how it takes even the most powerful computers weeks to solve their models. I’ve never understood this aspect of IO. IO is also known to feature the longest publication and revision lags of any field. It’s possible that some of this is due to the techniques they are using; I’m not sure. I have asked friends in IO why they don’t use linear solutions more often, and the impression I am left with is that it is a combination of an assumption that a linear solution simply won’t work for the kinds of problems they analyze together with a lack of familiarity with linear methods.

There are of course cases in which there is important non-linear behavior that needs to be featured in the results, and for these cases a linear approach is indeed not appropriate. Brad DeLong argued that accommodating the zero lower bound on interest rates (i.e., the “flat part of the LM curve”) is such a case. I agree. Cases like the liquidity trap clearly entail important aggregate non-linearities, and in those instances you are forced to adopt a non-linear approach. For most other cases however, I’ve already chosen my team …


[Fn 1] This reminds me of a joke from the consulting business. I once asked an economic consultant why he thought he could give valuable advice to people who have been working in an industry their whole lives. He told me that I was overlooking one important fact – the consultants charge a lot of money.

Back to Blogging …

I haven’t written a post in a long, long time.  Research, referee reports, teaching and administrative work have all been preventing me from contributing to the blog.  This summer I’m going to try to get back into writing somewhat regularly (famous last words).

The Lure of Capital Taxation

The taxation of capital income – corporate profits, capital gains, interest, dividends, etc. – has always been attractive to both politicians and to economists.

Politicians are drawn to capital taxation because it presents an opportunity to raise a substantial amount of revenue while remaining politically popular in a way that labor income taxation or the elimination of the home mortgage interest deduction isn’t.

Economists are drawn to capital income taxation not because it is a good idea – many (most?) economists oppose taxing capital income – but because it presents a number of academically interesting features that are worthy of study. In particular, capital income taxation presents several tricky dynamic considerations that many other forms of taxation do not.

Suppose the United States government sharply increases taxes on business income – say by increasing the corporate profit tax rate. What are the short-run and long-run consequences of such a policy?

In the immediate wake of the higher tax, standard theory predicts that, perhaps surprisingly, basically nothing would happen to production or employment. This probably strikes many of you as puzzling: if I tax business income, shouldn’t I discourage the creation of this income? According to standard economic analysis, such a tax should not – at least not right away. The reason is that real business capital – the equipment, machines, factories, etc. – is inelastically supplied in the short run. The factories are all still here. The grills, ovens and sinks still work, as do the espresso machines. The excavators still run … All of the capital that was functioning before the tax is functioning after the tax. Moreover, businesses cannot escape capital taxes by laying off workers – the firm’s wage bill is already tax deductible. As a result, the firm should behave the same in the face of the capital tax hike. The immediate effect of the capital tax increase is simply a reduction in after-tax capital income for owners, investors and savers.  Put differently, it is the capitalists who bear the entire burden of the tax increase.

As time passes, however, the burden of the tax hike shifts from capital owners to workers. While the tax increase doesn’t influence the supply of capital in the short run, according to the standard model it discourages capital accumulation over time. In particular, it should reduce the capital stock to the point at which the after-tax real rate of return returns to its long-run equilibrium level – in the standard model, usually the exact same rate of return that the economy began with. At the same time, the reduction in capital also reduces the demand for labor and so wages fall.  So, in the long run, the consequences of an increase in capital taxation are the opposite of the short run: the workers bear all of the tax burden and the capitalists bear none of it.

The fact that the economic incidence of a tax is different from its statutory incidence is one of the classic insights of public finance economics. The dynamic effects of a change in capital taxes are a particularly stark example of this distinction. The difference between the short-run and long-run effects of the tax change comes from the difference between the short-run and long-run supply of capital.  In the short run, the capital stock “is what it is.” Over time, however, capital can be accumulated or decumulated – that is, the long-run supply is relatively elastic. This simple insight alone strongly suggests that the optimal tax on capital income should be higher in the short run than in the long run.
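The short-run/long-run flip can be illustrated with a back-of-the-envelope simulation. This is my own toy calibration, not from any paper: Cobb-Douglas production, a fixed required after-tax return to savers, and mechanical partial adjustment of the capital stock toward its new steady state.

```python
alpha, r_req = 0.3, 0.05      # capital share; required after-tax return to savers

def k_star(tau):
    """Capital stock at which (1 - tau) * alpha * k**(alpha - 1) = r_req."""
    return ((1 - tau) * alpha / r_req) ** (1 / (1 - alpha))

tau0, tau1 = 0.0, 0.30        # capital tax raised from 0 to 30 percent
k = k_star(tau0)              # start in the no-tax steady state
w0 = (1 - alpha) * k ** alpha
r0 = (1 - tau0) * alpha * k ** (alpha - 1)

# Impact effect: the capital stock is fixed, so the wage is unchanged and
# the entire burden falls on the after-tax return to capital.
r_impact = (1 - tau1) * alpha * k ** (alpha - 1)
w_impact = (1 - alpha) * k ** alpha

# Transition: mechanical partial adjustment toward the new steady state.
for _ in range(400):
    k += 0.1 * (k_star(tau1) - k)

# Long run: the after-tax return is restored and the wage bears the burden.
r_longrun = (1 - tau1) * alpha * k ** (alpha - 1)
w_longrun = (1 - alpha) * k ** alpha

print(f"impact:   wage {w_impact:.3f} (was {w0:.3f}), after-tax return {r_impact:.3f} (was {r0:.3f})")
print(f"long run: wage {w_longrun:.3f}, after-tax return {r_longrun:.3f}")
```

On impact the wage is untouched and the after-tax return drops; in the long run the after-tax return is back at the required 5 percent and the wage is permanently lower – exactly the reversal of incidence described above.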

The tension between long run and short run optimal taxes also presents a problem for policy makers.  While they will constantly be tempted to tax capital, they will also want to commit to lower taxes in the future.

Among the most famous results in modern public finance is the Chamley-Judd theorem (see C. Chamley 1986 and K. Judd 1985). This result says that if tax rates are chosen optimally and the economy attains a steady state, then the steady-state tax rate on capital income should be zero (!). Many economists (myself included) have interpreted this result as a formal articulation of the intuitions above.
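One standard way to see the intuition – a heuristic sketch, not the theorem’s actual proof – runs through the steady-state Euler equation. If the economy settles down under a constant capital tax τ, the household’s Euler equation pins down the before-tax return:

```latex
\beta\left[1 + (1-\tau)\left(f'(k) - \delta\right)\right] = 1
\quad\Longrightarrow\quad
f'(k) - \delta = \frac{1/\beta - 1}{1-\tau}
```

A permanent τ > 0 thus permanently raises the required before-tax return and lowers the long-run capital stock. Worse, the implied wedge between consumption at date t and date t + s behaves roughly like \(\left(\frac{1+r}{1+(1-\tau)r}\right)^{s}\), which grows without bound as s → ∞: a constant capital tax acts like an ever-increasing tax on consumption in the distant future, and an optimal plan avoids that exploding distortion by setting the long-run capital tax to zero.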

Recently, there may have been an important development in the literature on optimal capital taxation. Ivan Werning and Ludwig Straub (MIT) have circulated a paper in which they argue that the Chamley-Judd result is of more limited importance than is commonly believed.  They argue that under certain circumstances optimal tax paths do not lead to a steady state the way we normally think of it, and in these cases the tax on capital income can be positive. In other cases, while the Chamley-Judd result technically holds, the steady state is attained only after an extended period of time (centuries in their examples) and is thus of limited relevance.

I am not going to go into the details of the Straub and Werning paper in a blog post. The short version of the explanation is that the suppositions required for the Chamley-Judd result are more stringent than one would imagine. For the textbook Chamley-Judd result to hold, the economy must approach an interior steady state with finite Lagrange multipliers for the various constraints in the model.  In many of the solutions in the new paper, the economy converges to a rest point with zero capital (and is thus not interior).  In other solutions, the Lagrange multipliers do not converge.

In fairness, the counter-examples Straub and Werning provide all use isoelastic preferences, which sometimes have weird features in certain settings.  It is not clear to me whether the authors’ findings will extend to more flexible preference specifications, but what is clear is that researchers will have to look much more closely at the meaning of one of the most prominent results in modern public finance theory.

More to come I’m sure …

End of an Era


On August 13 Judit Polgar announced that she was retiring from competitive chess.  This came as a shock to most people who follow chess – it certainly came as a shock to me, though I have heard on several occasions that Judit wanted to put more emphasis on raising her family.  On her website she said that she was going to spend more time with her children and on developing her foundation (the Judit Polgar Chess Foundation promotes pioneering cognitive skills development for school children). 

Judit is often described as the strongest female chess player ever and I have no reason to doubt that assessment.  At the peak of her career, Judit was one of the strongest players of either gender.  Her highest Elo rating was a staggering 2735 and she was ranked #8 overall in the world in 2005.  According to Wikipedia, she has been the #1 rated female chess player in the world since 1989 (!). Judit typically played in the general (open) tournaments – she actually never competed for the women’s world championship. Over her career, she has defeated a slew of famous players including Magnus Carlsen (the current world #1 and reigning world champion), Viswanathan Anand (the previous world champion), Anatoly Karpov, Boris Spassky, … the list goes on and on. [1]

There is a famous anecdote that Garry Kasparov once described Judit as a “circus puppet” and asserted that women chess players should stick to having children. I’m not actually sure where this story comes from – it was relayed by The Guardian in 2002 without further elaboration. The statement is so over the top that I wonder whether it’s actually true.  It might be true – chess players are known for making zany statements like this from time to time (Bobby Fischer comes to mind). Perhaps Kasparov was trying to stir up some controversy … who knows. In any case, Kasparov asked for it and he got it.  In 2002 Polgar beat Kasparov and added to her trophy room.  Kasparov was the world’s #1 ranked player at the time. 

Another fascinating aspect of Judit’s chess life is that her father, Laszlo Polgar, apparently decided to use his children to “prove” that geniuses are “made, not born.” He made a conscious effort to train his children in chess starting when they were each very young. Perhaps Laszlo was right: in addition to Judit, her sisters Susan and Sofia are also established chess grandmasters. 

Speaking for myself, Judit has been my favorite active chess player for a while now.  Her style is somewhat of an anachronism – she is known for a hyper-aggressive, dramatic playing style.  She often sacrifices pieces to gain the initiative and attacking positions. For the most part, men’s chess is actually much tamer – many of the best male players are “positional” players who grind out games looking for small advantages which they eventually convert into a win (Alexey Shirov is a counter-example – see below). Stylistically, Judit reminds me a lot of Mikhail Tal – perhaps the greatest chess tactician of all time.

Below is a video of one of Judit’s most famous games, in which she plays against Alexey Shirov. The game commentary is by Mato Jelic. If you are interested in learning more about chess or about Judit’s games, Mato’s YouTube channel is a great place to start. Among other things, Mato has a great collection of Judit Polgar’s games with commentary. 


[1] Here’s a funny quote from Judit about her sister Susan. “My sister Susan — she was 16 or 17 — said that she never won against a healthy man. After the game, there was always an excuse: ‘I had a headache. I had a stomach ache.’ There is always something.”  

More Thoughts on Agent Based Models

My recent post on Agent Based Models (ABMs) generated a few interesting responses and I thought I would briefly reply to a couple of them in this post.  In particular, two responses came from people who actually have direct experience with ABMs.

Rajiv Sethi posts a response on his own blog.  Some excerpts:

Chris House has managed to misrepresent the methodology so completely that his post is likely to do more harm than good.

[Well that doesn’t sound too good …]

Agents can be as sophisticated and forward-looking in their pursuit of self-interest in an ABM as you care to make them; they can even be set up to make choices based on solutions to dynamic programming problems, provided that these are based on private beliefs about the future that change endogenously over time.

What you cannot have in an ABM is the assumption that, from the outset, individual plans are mutually consistent. That is, you cannot simply assume that the economy is tracing out an equilibrium path. The agent-based approach is at heart a model of disequilibrium dynamics, in which the mutual consistency of plans, if it arises at all, has to do so endogenously through a clearly specified adjustment process. This is the key difference between the ABM and DSGE approaches [.]

In a similar vein, in the comments section of the earlier post, Leigh Tesfatsion offered several thoughts, many of which fit squarely with Rajiv’s opinion.  Professor Tesfatsion uses ABMs in multiple settings, including economics and climate change – I’m quite sure that she has much more experience with such models than I do (I basically don’t know anything beyond a couple of papers I’ve encountered as a referee here and there).  Here are some excerpts from Leigh’s comments:

Agents in ABMs can be as rational (or irrational) as their real-world counterparts…

The core difference between agent modeling in ABMs and agents in DSGE models is that agents in ABMs are required to be “locally constructive,” meaning they must specify and implement their goals, choice environments, and decision making procedures based on their own local information, beliefs, and attributes. Agent-based modeling rules out “top down” (modeler imposed) global coordination devices (e.g., global market clearing conditions) that do not represent the behavior or activities of any agent actually residing within the model. They do this because they are interested in understanding how real-world economies work.

Second, ABM researchers seek to understand how economic systems might (or might not) attain equilibrium states, with equilibrium thus studied as a testable hypothesis (in conjunction with basins of attraction) rather than as an a priori maintained hypothesis.

I was struck by the similarity between Professor Sethi’s and Professor Tesfatsion’s comments. The parts of their comments that really strike me are: (1) the agents in an ABM can have rational rules; (2) in an ABM, there is no global coordination imposed by the modeler – that is, agents’ behaviors don’t have to be mutually consistent; and (3) ABMs are focused on explaining disequilibrium, in contrast to DSGE models which operate under the assumption of equilibrium at all points.

On the first point (1) I agree with Rajiv and Leigh on the basic principle. Agents in an ABM could be endowed with rational behavioral rules – that is, they could have rules which are derived from an individual optimization problem of some sort. The end result of an economic optimization problem is a rule – a contingency plan that specifies what you intend to do and when you intend to do it. This rule is typically a function of some individual state variable (what position are you in?). In an ABM, the modeler specifies the rule as he or she sees fit and then goes from there. If this rule were identical to the contingency plan from a rational economic actor then the two modelling frameworks would be identical along those dimensions. However, in an ABM there is nothing which requires that these rules adhere to rationality. The models could accommodate rational behavior but they don’t have to. To me this still seems like a significant departure from standard economic models that typically place great emphasis on self-interest as a guiding principle. In fact, the first time I read Rajiv’s post, my initial thought was that an ABM with a rational decision rule would be essentially a DSGE model. All actions in DSGE models are based on private beliefs about the system. Both the system and the beliefs can change over time.  I for one would be very interested if there were any ABMs that fit Rajiv’s description that are in use today.

The second point (2) on mutual consistency is interesting. It is true that in most DSGE models, plans are indirectly coordinated through markets.  Each person in a typical economic model is assumed to be in (constant?) contact with a market and each confronts a common price for each good.  As a result of this common connection, the plans of individuals in economic models are assumed to be consistent in a way that they are not in ABMs.  On the other hand, there are economic models that do not have this type of mutual consistency.  Search-based models are the most obvious example.  In many search models, individuals meet one-on-one and make isolated bargains about trades.  There are thus many trades and exchanges occurring in such model environments and the equilibria can feature many different prices at any point in time.  This might mean that search / matching models are a half-way point between pure Walrasian theories on the one hand and ABMs on the other.
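A toy illustration of this kind of decentralized trading – my own construction, not drawn from either author’s work – makes the contrast concrete. Sellers post prices, buyers arrive one at a time with private valuations, and each seller adjusts her posted price using only her own sales experience. No market-clearing condition is imposed by the modeler, yet posted prices coordinate endogenously:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sellers, periods, step = 50, 20_000, 0.01
prices = rng.uniform(0.1, 0.9, n_sellers)       # heterogeneous posted prices

for _ in range(periods):
    # one buyer with a private valuation visits one randomly chosen seller
    valuation = rng.uniform()
    s = rng.integers(n_sellers)
    if valuation >= prices[s]:
        prices[s] = min(prices[s] + step, 1.0)  # sale: try charging more
    else:
        prices[s] = max(prices[s] - step, 0.0)  # no sale: cut the price

# With uniform valuations, a seller posting price p sells with probability
# 1 - p, so the expected price change is zero only at p = 0.5; dispersed
# initial prices drift toward that level with no coordination imposed.
print(f"mean posted price: {prices.mean():.3f}")
```

Each seller here is “locally constructive” in Tesfatsion’s sense: her rule uses only her own price and her own sales history, and whatever consistency of plans emerges does so through the adjustment process itself.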

The last issue (3) that Rajiv and Leigh brought up was the idea that ABMs seek to model “disequilibrium” of some sort. I suspect that this is more an issue of terminology than of substance, but there may be something more to it.  Leigh’s comment in particular suggests that she is reserving the term “equilibrium” for a classical rest point at which the system is unchanging. I mentioned to her that this doesn’t match up with how the term “equilibrium” is used in economics. In economic models (e.g., DSGE models) equilibria can feature erratic dynamic adjustment over time as prices and markets gradually adjust (e.g., the New Keynesian model), as unemployment and vacancies are gradually brought into alignment (e.g., the Mortensen-Pissarides model), or as capital gradually accumulates over time (e.g., the Ramsey model).  Indeed, equilibria can be “stochastic,” directly incorporating random elements over time. There is no supposition that an equilibrium is a rest point in the sense that (I think) she intends.  When I mentioned this she replied:

As for your definition of equilibrium, equating it with any kind of “solution,” I believe this is so broad as to become meaningless. In my work, “equilibrium” is always used to mean some type of unchanging condition that might (or might not) be attained by a system over time. This unchanging condition could be excess supply = 0, or plans are realized (e.g., no unintended inventories), or expectations are consistent with observations (so updating ceases), or some such condition. Solution already means “solution” — why debase the usual scientific meaning of equilibrium (a system at “rest” in some sense) by equating it with solution?

I suspect that in addition to her background in economics, Professor Tesfatsion also has a strong background in the natural sciences and is somewhat unaccustomed to terminology used in economics and prefers to use the term “equilibrium” as it would be used in say physics.[1] In economics, an outcome which is constant and unchanging would be called a “steady state equilibrium” or a “stationary equilibrium.”  As I mentioned above, there are non-stationary equilibria in economic models as well.  Even though quantities and prices are changing over time, the system is still described as being “in equilibrium.”  The reason most economists use this terminology is subtle.  Even though the observable variables are changing, agents’ decision rules are not – the decision rules or contingency plans are at a rest point even though the observables move over time.

Consider this example. Suppose two people are playing chess. The player with the white pieces is accustomed to opening with e4. She correctly anticipates that her opponent will respond with c5 – the Sicilian Defense. White will then play the Smith-Morra Gambit, to which black will respond with the Scheveningen variation. Both players have played each other several times and they are used to the positions they get out of this opening. To an economist, this is an equilibrium.  White plays the Smith-Morra Gambit and black plays the Scheveningen variation. Both correctly anticipate the opening responses of the other and neither wants to deviate in the early stages of the game. Neither strategy changes over time even though the position of the board changes as they play through the first several moves. (In fact this is common in competitive chess – two players who play each other a lot will often rapidly fire off 8-10 moves and arrive at a well-known position.)

In any case, I’m not sure whether this means economists are “debasing” the usual scientific meaning of equilibrium, but that’s how the term is used in the field.

One last point that came up in Rajiv’s post which deserves mention is the following:

A typical (though not universal) feature of agent-based models is an evolutionary process, that allows successful strategies to proliferate over time at the expense of less successful ones.

This is absolutely correct.  I didn’t think to mention this in the earlier post but I clearly should have done so.  Features like this are used often in evolutionary game theory.  In those settings, we gather together many individuals and endow them with different rules of behavior.  Whether a rule survives, dies, proliferates, etc. is governed by how well it succeeds at maximizing an objective.  Rajiv is quite correct that such behavior is common in many ABMs and he is right to point out its similarity with learning in economic models (though it is not exactly the same as learning).
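The selection mechanism Rajiv describes can be sketched with a textbook discrete-time replicator on the Hawk-Dove game (my own toy payoff numbers): each strategy’s population share grows in proportion to its payoff relative to the population average, and the mix settles at the evolutionarily stable state with no individual doing any optimization at all.

```python
# Hawk-Dove payoffs (V=2, C=4, shifted by +2 so all entries are positive,
# which leaves the equilibrium unchanged):
#            vs Hawk  vs Dove
#   Hawk        1        4
#   Dove        2        3
# The mixed equilibrium has a Hawk share of V/C = 0.5.
x = 0.1                                  # initial share of Hawks
for _ in range(500):
    f_hawk = 1 * x + 4 * (1 - x)         # expected payoff to a Hawk
    f_dove = 2 * x + 3 * (1 - x)         # expected payoff to a Dove
    f_avg = x * f_hawk + (1 - x) * f_dove
    x = x * f_hawk / f_avg               # discrete-time replicator step

print(f"long-run Hawk share: {x:.4f}")
```

Starting from a population that is 90 percent Dove, Hawks proliferate while they earn above-average payoffs and the share converges to one half – selection over rules doing the work that individual rationality does in a standard model.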

[1] A reader pointed out that Leigh Tesfatsion’s Ph.D. is in economics, so she is well aware of non-stationary and stochastic equilibria. My original post incorrectly suggested that she might be unaware of economic terminology (sorry Leigh). Leigh prefers to reserve the term “equilibrium” for a constant state, as it is in many other fields. Her choice of terminology is fine as long as she and I are clear about what we are each talking about.