Paul Krugman: Reporting Limits and Inequality
When Census data are collected, incomes in excess of an upper limit (approximately $1 million) are top-coded and reported as $999,999 rather than as the full amount. There's a bit more to it than this, but what's important is that there are reporting limits.
Recently, the effect that top-coding of census data has on income inequality has been under discussion. The issue arose when Alan Reynolds claimed that various statistical issues have created a false impression of rising inequality, a claim that has been thoroughly rebutted here and elsewhere (see below). In response, Paul Krugman emails "a finger exercise on earnings inequality and reporting limits":
Do Reporting Limits Really Affect Measured Income Inequality, by Paul Krugman 1/8/07: For my own edification, I thought I’d make a rough estimate of how much the Census reporting limits affect one dimension of inequality, inequality in earnings. What we learned amid all the nonsense from Alan Reynolds is that the Census data don’t count earned income in excess of approximately $1 million, and other forms of income are subject to even tighter reporting limits. But let’s focus just on earnings.
Now, only a tiny minority of Americans make enough for the reporting limits to matter. According to the Social Security Administration data, http://www.ssa.gov/OACT/COLA/awidevelop.html, in 2005 less than 0.06% of workers had wages and salaries exceeding $1 million – and only the part of their income over $1 million is censored. Can that piece really be big enough to significantly affect overall measures of the level and trend in inequality?
According to an estimate I’ve just done using the SSA data, the answer is yes. This kind of calculation is new to me – I was just having what passes for fun in the sick mind of an economist – but I’d like to hear any comments.
The SSA data give the number of people with wage and salary income in ranges – e.g., $1 million to $1.5 million, $1.5 million to $2 million, and so on. But it’s pretty easy to use those data to estimate the total income excluded by a $1 million reporting limit.
The key is knowing that top incomes tend to follow a Pareto distribution. That is, the number of people with any given (high) income declines as a power of that income:
n = Ky^(-alpha)
We can integrate this to get the number of people with incomes exceeding some level Y:
N = KY^(1-alpha)/(alpha-1)
Or, in logs,
ln(N) = ln(K/(alpha-1)) + (1-alpha) ln(Y)
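For readers who want the intermediate step, the integration can be written out as below. This is only a sketch of the standard calculation; it assumes alpha > 1 (an assumption not stated in the text) so that the integral converges.

```latex
\[
N(Y) = \int_{Y}^{\infty} K y^{-\alpha}\,dy
     = \left[\frac{K y^{1-\alpha}}{1-\alpha}\right]_{Y}^{\infty}
     = \frac{K Y^{1-\alpha}}{\alpha-1},
\qquad
\ln N = \ln\frac{K}{\alpha-1} + (1-\alpha)\ln Y .
\]
```

The log form is a straight line in ln(Y) with slope (1-alpha), which is why a straight-line fit can recover both K and alpha.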
Does this work? You bet! Figure 1 shows the SSA data for 2005, with income measured in millions and actual numbers of people. The Pareto distribution works very well indeed. And the fitted line lets us estimate both K and alpha: K = 154318; alpha = 2.6867.
[Figure 1: Pareto plot of the 2005 SSA data]
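The fitting step can be illustrated with a minimal Python sketch. It is not the calculation actually run for the post: the SSA table is not reproduced here, so the counts below are synthetic, generated from the reported K and alpha, and the thresholds beyond $2 million are hypothetical. The function names `count_above` and `fit_pareto` are made up for this illustration.

```python
import math

# Illustration values: the K and alpha reported in the post.
K_TRUE = 154318.0
ALPHA_TRUE = 2.6867


def count_above(y_millions, k=K_TRUE, alpha=ALPHA_TRUE):
    """People with income above y, using N = K * Y^(1-alpha) / (alpha-1)."""
    return k * y_millions ** (1.0 - alpha) / (alpha - 1.0)


def fit_pareto(thresholds, counts):
    """Ordinary least-squares line through (ln Y, ln N); returns (K, alpha)."""
    xs = [math.log(y) for y in thresholds]
    ys = [math.log(n) for n in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                       # equals 1 - alpha
    intercept = mean_y - slope * mean_x     # equals ln(K / (alpha - 1))
    alpha = 1.0 - slope
    k = math.exp(intercept) * (alpha - 1.0)
    return k, alpha


# Synthetic thresholds in millions of dollars (hypothetical, not the SSA brackets).
thresholds = [1.0, 1.5, 2.0, 2.5, 3.0, 5.0, 10.0, 20.0]
counts = [count_above(y) for y in thresholds]

k_hat, alpha_hat = fit_pareto(thresholds, counts)
print(f"K = {k_hat:.0f}, alpha = {alpha_hat:.4f}")
```

Because the synthetic counts lie exactly on the Pareto line, the fit should return the input values up to rounding. Real bracket counts would scatter around the line.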
A bit more math lets us derive the total income of people with earnings above a given Y: it’s
Y total = KY^(2-alpha)/(alpha-2)
Since we’re measuring income in millions, and looking for the income of people in the million-plus category, this becomes simply K/(alpha-2) = about $225 billion.
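The "bit more math" can be sketched as follows. It assumes alpha > 2, which is not stated in the text but is needed for the integral to converge and holds for the fitted value of 2.6867.

```latex
\[
Y_{\text{total}}(Y) = \int_{Y}^{\infty} y \cdot K y^{-\alpha}\,dy
                    = \left[\frac{K y^{2-\alpha}}{2-\alpha}\right]_{Y}^{\infty}
                    = \frac{K Y^{2-\alpha}}{\alpha-2},
\qquad
Y_{\text{total}}(1) = \frac{K}{\alpha-2}.
\]
```

With K = 154318 and alpha = 2.6867, K/(alpha-2) is about 224,700 in units of millions of dollars, which is the $225 billion figure.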
The SSA also gives us total wage and salary income: $5.374 trillion in 2005. So the 82,000 workers in the million-plus club, less than 0.06 percent of the work force, accounted for 4.2% of wages and salaries.
Some of that total – the part under $1 million – was counted. How much? $82 billion, or $1 million per worker. What’s left, the part that was above the reporting limit, I estimate at 2.7% of total wage and salary income. If I understand correctly, that explains about a third of the difference between the Census and Piketty-Saez estimates of the top 5% share. Bear in mind that the reporting limits on other forms of income also matter, and that there are other sources of bias in the Census numbers, such as a tendency of high-income respondents to understate their incomes.
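The arithmetic in the last few paragraphs can be checked with a short Python sketch. It uses only numbers quoted in the post; the variable names are illustrative and income is measured in millions of dollars.

```python
# Values quoted in the post (income in millions of dollars).
K = 154318.0
ALPHA = 2.6867
TOTAL_WAGES = 5.374e6          # $5.374 trillion of wage and salary income
WORKERS_ABOVE_LIMIT = 82_000   # workers in the million-plus club
LIMIT = 1.0                    # reporting limit of $1 million

# Total income of everyone above the limit: K * Y^(2-alpha) / (alpha-2).
top_income = K * LIMIT ** (2 - ALPHA) / (ALPHA - 2)

# The first $1 million of each such worker's earnings is still counted.
counted = WORKERS_ABOVE_LIMIT * LIMIT
missed = top_income - counted

top_share = top_income / TOTAL_WAGES
missed_share = missed / TOTAL_WAGES

print(f"income above limit group: ${top_income / 1000:.0f} billion")
print(f"share of wages:           {top_share:.1%}")
print(f"counted below the limit:  ${counted / 1000:.0f} billion")
print(f"missed share of wages:    {missed_share:.1%}")
```

The sketch mixes the fitted Pareto total with the 82,000 head count quoted in the text, as the post does.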
More important is how the reporting limits affect trends. Here’s what I did: I went back to 1994, just after the Census changed the reporting limits, and did exactly the same exercise. Figure 2 shows the Pareto plot for 1994: notice that the line is steeper, which says that income among the million-plus club wasn’t quite as unequal in 1994 as it was in 2005.
[Figure 2: Pareto plot of the 1994 SSA data]
When you run through the whole exercise, what you find is that the earnings that would be missed because of reporting limits are much smaller, only 0.7% of total wage and salary income. This isn’t surprising: the reporting limit hasn’t changed, while the structure of wages has shifted right both because of inflation and because of rising average real earnings. Moreover, top incomes have become more unequal, with more income in the far right tail. So there’s a lot more income above the reporting limit, all of which will be captured by income-tax-based estimates of income inequality, but won’t be captured by Census data.
Now, the Census data say that the income share of the top 5% rose only slightly, from 21.2% to 22.2%, between 1994 and 2005. The Piketty-Saez data, which only go up to 2004, show a 3.7 percentage point rise. Our little exercise with earnings data suggests that the missed income due to reporting limits rose by about 2 percentage points over the same period, so that more than half the difference between the Census and Piketty-Saez trends could be the result of reporting limits that caused the Census data to miss a large and growing amount of income at the very top.
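The trend comparison reduces to a few subtractions, sketched below in Python with the figures quoted in the post. Two simplifications are assumptions of this sketch rather than claims from the post: the missed share of wages is treated as directly comparable to a share of total income, and the Piketty-Saez rise (through 2004) is compared with the Census rise through 2005.

```python
# Figures quoted in the post, in percent or percentage points.
census_1994, census_2005 = 21.2, 22.2      # Census top-5% income share
piketty_saez_rise = 3.7                    # Piketty-Saez rise, data through 2004
missed_1994, missed_2005 = 0.7, 2.7        # wages missed above the limit

census_rise = census_2005 - census_1994    # about 1.0 point
missed_rise = missed_2005 - missed_1994    # about 2.0 points
gap = piketty_saez_rise - census_rise      # about 2.7 points

fraction_explained = missed_rise / gap

print(f"Census rise:           {census_rise:.1f} points")
print(f"Gap to Piketty-Saez:   {gap:.1f} points")
print(f"Rise in missed income: {missed_rise:.1f} points")
print(f"Fraction of gap:       {fraction_explained:.0%}")
```

On these rounded inputs the reporting limits account for roughly three quarters of the gap, consistent with "more than half".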
The bottom line: top-coding really, truly does matter – and yes, Virginia, income inequality is still rising.
See also the posts listed here for more background on the discussion leading up to this post, and the posts by Brad DeLong here, here, and here.
Posted by Mark Thoma on Monday, January 8, 2007 at 12:06 PM in Economics, Income Distribution



Real vs. abstract. The reality is what exists, not the abstract. The fact that most are no better off and many are worse off is the reality. All the data in the world is abstract. Sometimes people confuse the two, thinking that reality for the people is abstract and the data real.
Posted by: ken melvin | Link to comment | Jan 08, 2007 at 12:54 PM
Mr. Reynolds is not confused. He knows perfectly well what he knows.
Mr. Krugman is also not confused. He knows what he knows and can prove it, abstractly and in reality.
Mr. Reynolds can't prove what he knows, either abstractly or in reality.
Some people might call Mr. Reynolds' attitude "faith-based", but actually faith is a little bit more empirical than Mr. Reynolds' attitude.
I await Mr. Reynolds' refutation of Mr. Krugman with unbated breath.
Posted by: evagrius | Link to comment | Jan 08, 2007 at 01:52 PM
Based upon some of the arguments Reynolds has been putting up at some of the blogs reporting on this issue, I would say that he is firmly in the "obfuscation based" community.
Posted by: Marcus Aurelius | Link to comment | Jan 08, 2007 at 01:54 PM
With all due respect, how can a supposedly serious discussion about the full scope of Census Bureau top-coding (not just the small portion mentioned by Paul Krugman and Mark Thoma) not include the following top-coding references in any blog main posts? Or not include any Paul Krugman or EV main blog post reference to such documents and facts?
Special Studies in Federal Tax Statistics, 2003, Statistics of Income Division, Internal Revenue Service
Survey of Income and Program Participation (SIPP), Census Bureau
SIPP Users' Guide, 2001, Third Edition, Census Bureau
Appendix B in the 2001 issue of the Census Bureau SIPP Users' Guide outlines the specific top-codes used in the 1996 survey. To suggest that over 60 Census Bureau top-coding caps apply only to the top 1% of income earners or a tiny portion of such individuals in the top 1% is simply a factual distortion if Appendix B is read and understood.
Posted by: Movie Guy | Link to comment | Jan 08, 2007 at 05:09 PM
I don't think that Krugman was arguing that top-coding only applied to the top 1%, only that top-coding is a standard practice for the Census Bureau and therefore results in incomplete information.
It's the same when looking at web sites that map cities and counties for income using Census data. There's no further distinction in areas with incomes more than $200K a year.
Posted by: evagrius | Link to comment | Jan 08, 2007 at 05:42 PM
It should be axiomatic that if you understate the incomes of units receiving a million or more by capping them at a million you are going to get a lower Gini than the real one. In short you are going to understate the degree of inequality in incomes.
Posted by: maria | Link to comment | Jan 08, 2007 at 07:33 PM
" How much? $82 billion - $1 million per worker. What’s left, the part that was above the reporting limit, I estimate at 2.7% of total wage and salary income."
The CPS's problem is overstating earnings, not underreporting earnings. Since the '96 CPS revision that replaced hard topcodes with the mean topcoding amount currently used, the CPS has tended to overshoot on earnings and exceeded NIPA earnings amounts.
For the last several years, I believe that the CPS has earnings well in excess of SSA data and NIPA, according to Schwabish in '06.
I have trouble reconciling the claims that a) the CPS overstates earnings income and b) the CPS massively underreports earnings for the top 5%. This would mean that the CPS is attributing several hundred billion dollars of wages to the middle and lower income individuals. I don't think the CPS is that worthless.
What am I misunderstanding?
Posted by: hederman | Link to comment | Jan 09, 2007 at 06:46 AM
Note that Krugman has brought something important to bear here: the shape of the distribution of income. You get a lot of power (statistically) from bringing distributions into the picture, and even simple distributions that only crudely fit give a lot of leverage.
Statisticians often seem cavalier in choosing underlying distributions. But much of that is due to experience, and knowing that choosing an underlying model that represents most of the distribution issues is half the battle.
AFAIK, this is where Reynolds loses the battle. Pareto distribution of incomes is well-established and goes back more than a century; using the current data plus the underlying distribution gives more than enough power to show inequality increasing over time.
Posted by: Richard | Link to comment | Jan 09, 2007 at 07:31 PM
Don't ya know that inequality is a myth? Investor's Daily says so!
Posted by: jim | Link to comment | Jan 11, 2007 at 01:42 AM
I've made several attempts at various blogs of economists that make the assertion that there is growing inequality of income asking to see if data that shows whatever "inequality" that they are claiming is not based on differences in skill and talent, i.e. that their "inequality", however they might define it, is not merit based on having rare and highly valued talents (i.e. vs what the overall market says on income inequality). To date, I have not gotten any statistical studies from these folks to support their claims. Till they produce such analysis I'm not just going to leap to their conclusion there exists an income inequality problem.
Moreover, the Investor's Daily article someone linked to in one of the comments above indicated that there were statistical studies that show large differences in income based on having a college degree vs not having a college degree. If indeed the variations in Income can be explained by variations in valued skills and talents (in all their varied forms: Problem Solving Skills ~ IQ, Pro Athletic, Musical Talent, Hollywood Movie star sex appeal, etc...) then Income inequality is a reflection of the notion that those with rare and valued skills and talents are naturally expected to have more Income than those who don't have rare and valued skills and talents.
Increasing income concentration may be only a reflection that the economy is becoming more and more efficient at finding the individuals with the rare and currently highly valued skills and talents and placing them into their appropriate economic order in the distribution of income.
Posted by: RX314 | Link to comment | Jan 11, 2007 at 08:50 AM
RX314:
"By nature a philosopher is not in genius and disposition half so different from a street porter as a mastiff is from a greyhound."- Adam Smith
Posted by: john c. halasz | Link to comment | Jan 11, 2007 at 09:36 AM
At the very least RX314 displays the sense and humility (in comparison to Reynolds) to accept the determinism of the math in respect of inequality, even if he cannot fathom why inequality IS an important issue.
In the simplest sense, in free-market libertarian terms that you will understand best, rising inequality is problematic precisely because it will ultimately stultify the competitive benefits of the market and capitalism, since the property rights of one's accrued spoils are protected, more or less indefinitely. Now in a true Darwinian competitive environment, property rights would NOT be protected. This comes at a price, for with property rights, scale and success comes uncompetitive behaviour, the antithesis of capitalism's vigor and fairness. Imagine for a moment the Bull Sea Lion on the beach. He IS, at the moment of his victory, the fittest, and to him go the spoils - harem & all. Except that he must defend his territory, which is so exhausting that he typically is king but for a month. Back in our world, with permanent property rights protecting the spoils and deflecting competition, the Libertarian is faced with a terrible paradox: embrace the market at the expense of property rights or entrench property rights at the expense of competition and the market. In reality the optimal bargain for a large and diversified people and economy lies somewhere in between, with sufficient property rights to make one get up and out the door in the morning, but not so much that all the spoils pool in some 1% eddy, inalienably protected from predation by the power of the state. Whether "deserving" or not, whether properly placed in the economic pecking order of talent or luck, the dramatically rising tide of inequality reflects a society and its economy that is moving away from the optimal point of tradeoff between the benefits of the market and competition and the conferring of protection of property rights.
I am sure there are many other moral, health, reasons for desiring diminished economic inequality as well as other beneficial economic reasons that spring from more recent experimental markets research.
Posted by: Cassandra | Link to comment | Jan 11, 2007 at 09:44 AM
Intriguing, isn't it?
In France at the moment, after the government (in an empty promise, I admit: they will leave in 4 months) announced an opposable right to be housed, there has been a lot of outcry to the tune of "this is a violation of the fundamental property right".
The idea was that with an opposable right, the government could be tempted to temporarily seize some unoccupied houses. And there are a lot of them, kept like that to get the prices to rise: I've been told that it's around 20% of flats in Paris that are unoccupied (remember the California energy crisis anyone?).
OK, so not all 20% are kept unoccupied by design. And, yes, one could argue that it is your right to do so as an owner.
I just found it very odd that property rights would become the most fundamental of rights. That property rights would be brandished in the face of the homeless dying of cold, when for once the government could try to tie its own hands to prevent itself ignoring them. Especially when the shortage is, at least in part, organised.
Posted by: Cyrille | Link to comment | Jan 11, 2007 at 10:16 AM
RX314 wrote: I've made several attempts at various blogs of economists that make the assertion that there is growing inequality of income asking to see if data that shows whatever "inequality" that they are claiming is not based on differences in skill and talent.... Till they produce such analysis I'm not just going to leap to their conclusion there exists an income inequality problem.
That's your right, of course, but I think you misunderstand the argument. Most professional and academic economists who are concerned with income inequality do not make the claim you ask them to defend. In general, I think you'll find that the consensus is that the income distribution trends we're seeing are largely the outcomes of more or less ordinary market mechanisms. There is considerable uncertainty about the relative importance of specific mechanisms (e.g. trade, immigration, skill-biased technical change, and/or the winner-take-all phenomenon), but none of these constitute market failures in the usual sense of the term.
There are also two major "non-market" sources of income inequality that enter into the debate: the reduction in power of labor unions and erosion of norms which used to limit executive compensation. But even these don't (or don't necessarily) represent market failures. The diminished power of unions arguably represents the reduction of a pre-existing market distortion, so, from a pure efficiency perspective, is a "good thing." And it's not clear a priori that the "old" norms for executive compensation were the right ones. Maybe they unduly restricted rewards to top performers.
The case for an inequality problem does not rest on the proposition that growing inequality stems from some sort of market failure. It stems from the proposition -- which is, in part, a value proposition -- that inequality is per se a bad thing, regardless of how it comes about.
Krugman puts it this way: Let me make a shocking declaration given my profession: The essential reason for caring about the disturbing trends in Europe and America is social, rather than strictly economic.
Consider the position of someone in, say, the top fifth of the income distribution in either the United States or Europe--a description that surely applies to most readers of this article. Does the growth in poverty in America or of mass unemployment constitute any direct threat to the living standards of that individual? The answer in the United States is a clear no: There is no strictly economic reason why we cannot continue to have a growing economy even while a substantial fraction of the population is experiencing declining standards of living. Economic theory suggests no particular connection between equity or justice and growth, and no evidence exists that income inequality has any large effects on the rate of economic growth, positive or negative.
...
So where is the crisis? The answer is that it is in society and ultimately in politics. On both sides of the Atlantic, economic forces are more and more tending to split society in two: into those who have good jobs and whose standards of living continue to rise and those who are faced either with falling incomes or the prospect of a more or less permanent life on the dole. Even an economist can see that such a split demoralizes those on the bottom and coarsens those on the top. The ultimate effect of growing economic disparities on our social and political health may be hard to predict, but it is unlikely to be pleasant.
Now, as citizens, we need to decide whether we find this line of argument persuasive or not. But its persuasiveness doesn't depend on evidence that the trends we observe aren't ordinary market outcomes, but on our judgement of whether those outcomes are acceptable.
Posted by: johnchx | Link to comment | Jan 11, 2007 at 10:22 AM
I thank those that have responded to me, finally, on this subject.
Those that make clear each time they claim there is a problem with the distribution of Income that it is not related to a lack of merit in economic outcomes (i.e. that Income does generally follow rare and valued skills and talents) do great service to their cause. In comparison, those who suggest without evidence, perhaps only implicitly or by remaining silent on the subject, that their problems with the Income Distribution are due to some kind of systematic crooked or illegal behavior (i.e. implications along the lines that all CEOs are crooks like those of Enron or Worldcom) do great harm to their case.
The honest admission that it is a personal judgement "of whether those [economic] outcomes are acceptable" goes a long way toward putting the debate in its true context and thus making progress possible.
Thanks.
Posted by: RX314 | Link to comment | Jan 11, 2007 at 11:03 AM
johnchx - (responding to RX314) "In general, I think you'll find that the consensus is that the income distribution trends we're seeing are largely the outcomes of more or less ordinary market mechanisms. There is considerable uncertainty about the relative importance of specific mechanisms (e.g. trade, immigration, skill-biased technical change, and/or the winner-take-all phenomenon), but none of these constitute market failures in the usual sense of the term."
I don't know that the problem should be phrased with regard to market failures. If a nation's economic policies and taxation policies drive a significant portion of growing income inequality, then the issue isn't about market failures but rather about national policy decisions. We could add to that list of policy decisions such matters as environmental policy and a host of other considerations beyond domestic and international trade policies as well as domestic corporate and individual taxation policies.
In the sense that U.S. economic policies drive some portion of market outcomes of the U.S. economy including wage scales, I would say yes - U.S. market failures (as such) are contributing to income inequality.
Regardless of my views, it is encouraging to note that some posters are interested in moving the income inequality discussions forward. It's time to broaden the discussion in my opinion.
Posted by: Movie Guy | Link to comment | Jan 11, 2007 at 11:25 AM
After thinking more about the issue of economic markets, it's clear to me that market or economic policies, laws, regulations, rules and fees form the framework of market arena activity. In light of such considerations, the policy/rule drivers along with the level of participation determine the success or failure of "designed" economic markets.
Naturally, there are plenty of ways to measure market performances and economic outcomes. To suggest that we are not experiencing any primary national market or economic failures is to overlook the continued decline in median incomes as a principal measurement of such economic activity.
Any nation or tribe that designs its domestic and international market participation and access in a fashion that continues to result in declining median incomes vs the general level of product services and goods certainly isn't taking a successful approach for the benefit of all citizens or tribal members, in my judgment.
I could go further, but I'll leave it there for now.
Posted by: Movie Guy | Link to comment | Jan 11, 2007 at 05:46 PM