Via email, I was asked if this is the "stupidest article ever published?":
If not, it's certainly in the running.
Matthew O. Jackson, Stanford University Social and Economic Networks: Background
Daron Acemoglu, MIT Networks: Games over Networks and Peer Effects
Matthew O. Jackson, Stanford University Diffusion, Identification, Network Formation
Daron Acemoglu, MIT Networks: Propagation of Shocks over Economic Networks
Jennifer Castle and David Hendry on data mining
‘Data mining’ with more variables than observations: While ‘fool’s gold’ (iron pyrites) can be found by mining, most mining is a productive activity. Similarly, when properly conducted, so-called ‘data mining’ is no exception – despite many claims to the contrary. Early criticisms, such as the review of Tinbergen (1940) by Friedman (1940) for selecting his equations “because they yield high coefficients of correlation”, and by Lovell (1983) and Denton (1985) of data mining based on choosing ‘best fitting’ regressions, were clearly correct. It is also possible to undertake what Gilbert (1986) called ‘strong data mining’, whereby an investigator tries hundreds of empirical estimations, and reports the one she or he ‘prefers’ – even when such results are contradicted by others that were found. As Leamer (1983) expressed the matter: “The econometric art as it is practiced at the computer terminal involves fitting many, perhaps thousands, of statistical models. One or several that the researcher finds pleasing are selected for reporting purposes”. That an activity can be done badly does not entail that all approaches are bad, as stressed by Hoover and Perez (1999), Campos and Ericsson (1999), and Spanos (2000) – driving with your eyes closed is a bad idea, but most car journeys are safe.
Why is ‘data mining’ needed?
Econometric models need to handle many complexities if they are to have any hope of approximating the real world. There are many potentially relevant variables, dynamics, outliers, shifts, and non-linearities that characterise the data generating process. All of these must be modelled jointly to build a coherent empirical economic model, necessitating some form of data mining – see the approach described in Castle et al. (2011) and extensively analysed in Hendry and Doornik (2014).
Any omitted substantive feature will result in erroneous conclusions, as other aspects of the model attempt to proxy the missing information. At first sight, allowing for all these aspects jointly seems intractable, especially with more candidate variables (denoted N) than observations (T denotes the sample size). But help is at hand with the power of a computer. ...[gives technical details]...
Appropriately conducted, data mining can be a productive activity even with more candidate variables than observations. Omitting substantively relevant effects leads to mis-specified models, distorting inference, which large initial specifications should mitigate. Automatic model selection algorithms like Autometrics offer a viable approach to tackling more candidate variables than observations, controlling spurious significance.
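The N > T idea can be illustrated with a toy sketch. This is not Autometrics (which searches many reduction paths); it is a minimal stand-in of my own, marginal screening followed by backward elimination on t-statistics, applied to simulated data in which only three of 120 candidate variables matter and only 60 observations are available.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 60, 120                       # more candidate variables than observations
X = rng.standard_normal((T, N))
# Only the first three candidates actually matter in this toy DGP.
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 1.0 * X[:, 2] + 0.5 * rng.standard_normal(T)

# Step 1: screen the N candidates down to a tractable subset by
# marginal correlation with y (a crude stand-in for multi-path search).
corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(N)])
keep = [int(j) for j in np.argsort(corr)[-15:]]

# Step 2: general-to-specific -- drop the least significant regressor
# until every remaining |t|-statistic clears roughly the 5% hurdle.
while True:
    Z = X[:, keep]
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    sigma2 = (resid @ resid) / (T - len(keep))
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Z.T @ Z)))
    tstats = np.abs(beta / se)
    worst = int(np.argmin(tstats))
    if tstats[worst] >= 2.0:
        break
    keep.pop(worst)

print("selected variables:", sorted(keep))
```

With strong signals the three relevant variables survive both stages; the point of the serious algorithms is to control how often irrelevant ones slip through.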
...The rather boring truth is that it is entirely predictable that forecasters will miss major recessions, just as it is equally predictable that each time this happens we get hundreds of articles written asking what has gone wrong with macro forecasting. The answer is always the same - nothing. Macroeconomic model based forecasts are always bad, but probably no worse than intelligent guesses.
Alan Blinder and Mark Watson:
Presidents and the U.S. Economy: An Econometric Exploration, by Alan S. Blinder and Mark W. Watson, NBER Working Paper No. 20324 [open link]: The U.S. economy has grown faster—and scored higher on many other macroeconomic metrics—when the President of the United States is a Democrat rather than a Republican. For many measures, including real GDP growth (on which we concentrate), the performance gap is both large and statistically significant, despite the fact that postwar history includes only 16 complete presidential terms. This paper asks why. The answer is not found in technical time series matters (such as differential trends or mean reversion), nor in systematically more expansionary monetary or fiscal policy under Democrats. Rather, it appears that the Democratic edge stems mainly from more benign oil shocks, superior TFP performance, a more favorable international environment, and perhaps more optimistic consumer expectations about the near-term future. Many other potential explanations are examined but fail to explain the partisan growth gap.
Further thoughts on Phillips curves: In a post from a few days ago I looked at some recent evidence on Phillips curves, treating the Great Recession as a test case. I cast the discussion as a debate between rational and adaptive expectations. Neither is likely to be 100% right of course, but I suggested the evidence implied rational expectations were more right than adaptive. In this post I want to relate this to some other people’s work and discussion. (See also this post from Mark Thoma.) ...
The first issue is why look at just half a dozen years, in only a few countries. As I noted in the original post, when looking at CPI inflation there are many short term factors that may mislead. Another reason for excluding European countries which I did not mention is the impact of austerity driven higher VAT rates (and other similar taxes or administered prices), nicely documented by Klitgaard and Peck. Surely all this ‘noise’ is an excellent reason to look over a much longer time horizon?
One answer is given in this recent JEL paper by Mavroeidis, Plagborg-Møller and Stock. As Plagborg-Moller notes in an email to Mark Thoma: “Our meta-analysis finds that essentially any desired parameter estimates can be generated by some reasonable-sounding specification. That is, estimation of the NKPC is subject to enormous specification uncertainty. This is consistent with the range of estimates reported in the literature….traditional aggregate time series analysis is just not very informative about the nature of inflation dynamics.” This had been my reading based on work I’d seen.
This is often going to be the case with time series econometrics, particularly when key variables appear in the form of expectations. Faced with this, what economists often look for is some decisive and hopefully large event, where all the issues involving specification uncertainty can be sidelined or become second order. The Great Recession, for countries that did not suffer a second recession, might be just such an event. In earlier, milder recessions it was also much less clear what the monetary authority’s inflation target was (if it had one at all), and how credible it was. ...
I certainly agree with the claim that a "decisive and hopefully large event" is needed to empirically test econometric models since I've made the same point many times in the past. For example, "...the ability to choose one model over the other is not quite as hopeless as I’ve implied. New data and recent events like the Great Recession push these models into uncharted territory and provide a way to assess which model provides better predictions. However, because of our reliance on historical data this is a slow process – we have to wait for data to accumulate – and there’s no guarantee that once we are finally able to pit one model against the other we will be able to crown a winner. Both models could fail..."
Anyway...he goes on to discuss "How does what I did relate to recent discussions by Paul Krugman?," and concludes with:
My interpretation suggests that the New Keynesian Phillips curve is a more sensible place to start from than the adaptive expectations Friedman/Phelps version. As this is the view implicitly taken by most mainstream academic macroeconomics, but using a methodology that does not ensure congruence with the data, I think it is useful to point out when the mainstream does have empirical support. ...
Via email, a comment on my comments about the difficulty of settling questions about the Phillips curve empirically:
Dear Professor Thoma,
I saw your recent post on the difficulty of empirically testing the Phillips Curve, and I just wanted to alert you to a survey paper on this topic that I wrote with Sophocles Mavroeidis and Jim Stock: "Empirical Evidence on Inflation Expectations in the New Keynesian Phillips Curve". It was published in the Journal of Economic Literature earlier this year (ungated working paper).
In the paper we estimate a vast number of specifications of the New Keynesian Phillips Curve (NKPC) on a common U.S. data set. The specification choices include the data series, inflation lag length, sample period, estimator, and so on. A subset of the specifications amount to traditional backward-looking (adaptive expectation) Phillips Curves. We are particularly interested in two key parameters: the extent to which price expectations are forward-looking, and the slope of the curve (how responsive inflation is to real economic activity).
Our meta-analysis finds that essentially any desired parameter estimates can be generated by some reasonable-sounding specification. That is, estimation of the NKPC is subject to enormous specification uncertainty. This is consistent with the range of estimates reported in the literature. Even if one were to somehow decide on a given specification, the uncertainty surrounding the parameter estimates is typically large. We give theoretical explanations for these empirical findings in the paper. To be clear: Our results do not reject the validity of the NKPC (or more generally, the presence of a short-run inflation/output trade-off), but traditional aggregate time series analysis is just not very informative about the nature of inflation dynamics.
PhD candidate in economics, Harvard University
Why Economists Can’t Always Trust Data, by Mark Thoma, The Fiscal Times: To make progress in economics, it is essential that theoretical models be subjected to empirical tests that determine how well they can explain actual data. The tests that are used must be able to draw a sharp distinction between competing theoretical models, and one of the most important factors is the quality of the data used in the tests. Unfortunately, the quality of the data that economists employ is less than ideal, and this gets in the way of the ability of economists to improve the models they use. There are several reasons for the poor quality of economic data...
At MoneyWatch, what is econometrics?:
Using Econometrics to Figure Out How the World Really Works, by Mark Thoma: Many people believe there has been no progress in economics, but that isn't true. ...
From the Journal of Economic Perspectives' Symposium on Big Data:
"Big Data: New Tricks for Econometrics," by Hal R. Varian: Computers are now involved in many economic transactions and can capture data associated with these transactions, which can then be manipulated and analyzed. Conventional statistical and econometric techniques such as regression often work well, but there are issues unique to big datasets that may require different tools. First, the sheer size of the data involved may require more powerful data manipulation tools. Second, we may have more potential predictors than appropriate for estimation, so we need to do some kind of variable selection. Third, large datasets may allow for more flexible relationships than simple linear models. Machine learning techniques such as decision trees, support vector machines, neural nets, deep learning, and so on may allow for more effective ways to model complex relationships. In this essay, I will describe a few of these tools for manipulating and analyzing big data. I believe that these methods have a lot to offer and should be more widely known and used by economists. Full-Text Access | Supplementary Materials
This may be of interest:
“There will be growth in the spring”: How well do economists predict turning points?, by Hites Ahir and Prakash Loungani: Forecasters have a poor reputation for predicting recessions. This column quantifies their ability to do so, and explores several reasons why both official and private forecasters may fail to call a recession before it happens.
"Past performance is not an indicator of future results":
Pseudo-mathematics and financial charlatanism, EurekAlert: Your financial advisor calls you up to suggest a new investment scheme. Drawing on 20 years of data, he has set his computer to work on this question: If you had invested according to this scheme in the past, which portfolio would have been the best? His computer assembled thousands of such simulated portfolios and calculated for each one an industry-standard measure of return on risk. Out of this gargantuan calculation, your advisor has chosen the optimal portfolio. After briefly reminding you of the oft-repeated slogan that "past performance is not an indicator of future results", the advisor enthusiastically recommends the portfolio, noting that it is based on sound mathematical methods. Should you invest?
The somewhat surprising answer is, probably not. Examining a huge number of sample past portfolios---known as "backtesting"---might seem like a good way to zero in on the best future portfolio. But if the number of portfolios in the backtest is so large as to be out of balance with the number of years of data in the backtest, the portfolios that look best are actually just those that target extremes in the dataset. When an investment strategy "overfits" a backtest in this way, the strategy is not capitalizing on any general financial structure but is simply highlighting vagaries in the data. ...
Unfortunately, the overfitting of backtests is commonplace not only in the offerings of financial advisors but also in research papers in mathematical finance. One way to lessen the problems of backtest overfitting is to test how well the investment strategy performs on data outside of the original dataset on which the strategy is based; this is called "out-of-sample" testing. However, few investment companies and researchers do out-of-sample testing. ...
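The selection mechanism behind backtest overfitting is easy to reproduce. The sketch below is my own illustration, not from the article: thousands of purely random long/short strategies are backtested on pure-noise returns, the best in-sample Sharpe ratio is selected, and that same strategy is then evaluated out of sample.

```python
import numpy as np

rng = np.random.default_rng(42)
n_days, n_strategies = 500, 2000
# Pure-noise returns: by construction, no strategy has any real edge.
market = rng.standard_normal((2, n_days))       # row 0: backtest, row 1: out-of-sample
signals = rng.choice([-1, 1], size=(n_strategies, n_days))  # random daily positions

# Backtest every strategy and pick the best in-sample Sharpe ratio.
rets_in = signals * market[0]
in_sample = rets_in.mean(axis=1) / rets_in.std(axis=1)
best = int(np.argmax(in_sample))

# Evaluate the chosen strategy on data it was not selected on.
rets_out = signals[best] * market[1]
out_sample = rets_out.mean() / rets_out.std()

print(f"best of {n_strategies} strategies, in-sample Sharpe: {in_sample[best]:.3f}")
print(f"same strategy, out-of-sample Sharpe: {out_sample:.3f}")
```

The winner looks impressive in the backtest only because it is the maximum over 2,000 noise draws; its expected out-of-sample Sharpe ratio is zero, which is exactly the "past performance" warning in quantitative form.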
Inequality in Capitalist Systems Is Not Inevitable, by Mark Thoma: Capitalism is the best economic system yet discovered for giving people the goods and services they desire at the lowest possible price, and for producing innovative economic growth. But there is a cost associated with these benefits, the boom and bust cycles inherent in capitalist systems, and those costs hit working class households – who have done nothing to deserve such a fate – very hard. Protecting innocent households from the costs of recessions is an important basis for our social insurance programs.
It is becoming more and more evident that there is another cost of capitalist systems, the inevitable rising inequality documented by Thomas Piketty in “Capital in the Twenty-First Century,” that our social insurance system will need to confront. ...
On R-squared and economic prediction: Recently I've heard a number of otherwise intelligent people assess an economic hypothesis based on the R-squared of an estimated regression. I'd like to point out why that can often be very misleading. ...
Here's what you'd find if you calculated a regression of this month's stock price (pt) on last month's stock price (pt-1). Standard errors of the regression coefficients are in parentheses.
The adjusted R-squared for this relation is 0.997. ... On the other hand, another way you could summarize the same relation is by using the change in the stock price (Δpt = pt - pt-1) as the left-hand variable in the regression:
This is in fact the identical model of stock prices as the first regression. The standard errors of the regression coefficients are identical for the two regressions, and the standard error of the estimate ... is identical for the two regressions because indeed the residuals are identical for every observation. ...
Whatever you do, don't say that the first model is good given its high R-squared and the second model is bad given its low R-squared, because equations (1) and (2) represent the identical model. ...
That's not a bad empirical description of stock prices-- nobody can really predict them. ... This is actually a feature of a broad class of dynamic economic models, which posit that ... the deviation between what actually happens and what the decision-maker intended ... should be impossible to predict if the decision-maker is behaving rationally. For example, if everybody knew that a recession is coming 6 months down the road, the Fed should be more expansionary today... The implication is that when recessions do occur, they should catch the Fed and everyone else by surprise.
It's very helpful to look critically at which magnitudes we can predict and which we can't, and at whether that predictability or lack of predictability is consistent with our economic understanding of what is going on. But if what you think you learned in your statistics class was that you should always judge how good a model is by looking at the R-squared of a regression, then I hope that today you learned something new.
[There's an additional example and more explanation in the original post.]
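Hamilton's two-regressions point is easy to verify numerically. The sketch below uses simulated random-walk prices with drift (my own toy data, not the series from the original post) and plain-numpy OLS to run both regressions and compare them.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 500
# Toy "stock price": a random walk with drift.
p = 100 + 0.5 * np.arange(T + 1) + np.cumsum(rng.standard_normal(T + 1))

def ols(y, x):
    """OLS of y on a constant and x; returns coefficients, residuals, R-squared."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return beta, resid, r2

# Regression (1): the level p_t on p_{t-1} -- R-squared near one.
_, e1, r2_levels = ols(p[1:], p[:-1])
# Regression (2): the change p_t - p_{t-1} on p_{t-1} -- same model, tiny R-squared.
_, e2, r2_changes = ols(np.diff(p), p[:-1])

print(f"R-squared, levels:  {r2_levels:.4f}")
print(f"R-squared, changes: {r2_changes:.4f}")
print("residuals identical:", bool(np.allclose(e1, e2)))
```

The residuals (and hence the standard errors) match exactly because the second regression's coefficients are just (intercept, slope − 1); only the variance of the left-hand variable, and therefore R-squared, changes.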
... Metrika began as a small village - little more than a coach-stop and a mandatory tavern at a junction in the highway running from the ancient data mines in the South, to the great city of Enlightenment, far to the North. In Metrika, the transporters of data of all types would pause overnight on their long journey; seek refreshment at the tavern; and swap tales of their experiences on the road.
To be fair, the data transporters were more than just humble freight carriers. The raw material that they took from the data mines was largely unprocessed. The vast mountains of raw numbers usually contained valuable gems and nuggets of truth, but typically these were buried from sight. The data transporters used the insights that they gained from their raucous, beer-fired discussions and arguments (known locally as "seminars") with the Metrika locals at the tavern to help them to sift through the data and extract the valuable jewels. With their loads considerably lightened, these "data-miners" then continued on their journey to the City of Enlightenment in a much improved frame of mind, hangovers notwithstanding!
Over time, the town of Metrika prospered and grew as the talents of its citizens were increasingly recognized and valued by those in the surrounding districts, and by the data transporters.
Young Joe grew up happily, supported by his family of econometricians, and he soon developed the skills that were expected of his societal class. He honed his computing skills; developed a good nose for "dodgy" data; and studiously broadened and deepened his understanding of the various tools wielded by the artisans in the neighbouring town of Statsbourg.
In short, he was a model child!
But - he was torn! By the time that he reached the tender age of thirteen, he felt the need to make an important, life-determining, decision.
Should he align his talents with the burly crew who frequented the gym near his home - the macroeconometricians - or should he throw in his lot with the physically challenged bunch of empirical economists known locally as the microeconometricians? ...
Full story here.
I tweeted this link, and it's getting far, far more retweets than I would have expected, so I thought I'd note it here:
Econometrics and "Big Data", by Dave Giles: In this age of "big data" there's a whole new language that econometricians need to learn. ... What do you know about such things as:
- Decision trees
- Support vector machines
- Neural nets
- Deep learning
- Classification and regression trees
- Random forests
- Penalized regression (e.g., the lasso, lars, and elastic nets)
- Spike and slab regression?
Probably not enough!
If you want some motivation to rectify things, a recent paper by Hal Varian ... titled, "Big Data: New Tricks for Econometrics" ... provides an extremely readable introduction to several of these topics.
He also offers a valuable piece of advice:
"I believe that these methods have a lot to offer and should be more widely known and used by economists. In fact, my standard advice to graduate students these days is 'go to the computer science department and take a class in machine learning'."
Ten Things for Applied Econometricians to Keep in Mind, by Dave Giles: No "must do" list is ever going to be complete, let alone perfect. This is certainly true when it comes to itemizing essential ground-rules for all of us when we embark on applying our knowledge of econometrics.
That said, here's a list of ten things that I like my students to keep in mind:
- Always, but always, plot your data.
- Remember that data quality is at least as important as data quantity.
- Always ask yourself, "Do these results make economic/common sense"?
- Check whether your "statistically significant" results are also "numerically/economically significant".
- Be sure that you know exactly what assumptions are used/needed to obtain the results relating to the properties of any estimator or test that you use.
- Just because someone else has used a particular approach to analyse a problem that looks like yours, that doesn't mean they were right!
- "Test, test, test"! (David Hendry). But don't forget that "pre-testing" raises some important issues of its own.
- Don't assume that the computer code that someone gives to you is relevant for your application, or that it even produces correct results.
- Keep in mind that published results represent only a fraction of the results that the author obtained; the rest go unreported.
- Don't forget that "peer-reviewed" does NOT mean "correct results", or even "best practices were followed".
I'm sure you can suggest how this list can be extended!
I'll add two that I heard often in grad school:
Don't take econometric techniques in search of questions. Instead, start with the important questions and then develop the econometrics needed to answer them.
Model the process that generates the data.
Any further suggestions?
There will be a big revision of macroeconomic data in July:
Data shift to lift US economy by 3%, by Robin Harding, FT: The US economy will officially become 3 per cent bigger in July as part of a shake-up that will for the first time see government statistics take into account 21st century components such as film royalties and spending on research and development. ...
In an interview with the Financial Times, Brent Moulton, who manages the national accounts at the Bureau of Economic Analysis, said the update is the biggest since computer software was added to the accounts in 1999.
“We are carrying these major changes all the way back in time – which for us means to 1929 – so we are essentially rewriting economic history,” said Mr Moulton.
The changes will affect everything from the measured GDP of different US states to the stability of the inflation measure targeted by the US Federal Reserve. They will force economists to revisit policy debates about everything from corporate profits to the causes of economic growth. ...
The changes are in addition to a comprehensive revision of the national accounts that takes place every five years... Steve Landefeld, the BEA director, said it was hard to predict the overall outcome given the mixture of new methodology and data updates. ... But while the level of GDP may change,... “I wouldn’t be looking for large changes in trends or cycles,” said Mr Landefeld. ...
When working with macroeconomic data, we don't generally assume that there are large measurement errors in the data when assessing the significance of the results. Maybe we should.
The blow-up over the Reinhart-Rogoff results reminds me of a point I’ve been meaning to make about our ability to use empirical methods to make progress in macroeconomics. This isn't about the computational mistakes that Reinhart and Rogoff made, though those are certainly important, especially in small samples; it's about the quantity and quality of the data we use to draw important conclusions in macroeconomics.
Everybody has been highly critical of theoretical macroeconomic models, DSGE models in particular, and for good reason. But the imaginative construction of theoretical models is not the biggest problem in macro – we can build reasonable models to explain just about anything. The biggest problem in macroeconomics is the inability of econometricians of all flavors (classical, Bayesian) to definitively choose one model over another, i.e. to sort between these imaginative constructions. We like to think of ourselves as scientists, but if data can’t settle our theoretical disputes – and it doesn’t appear that it can – then our claim for scientific validity has little or no merit.
There are many reasons for this. For example, the use of historical rather than “all else equal” laboratory/experimental data makes it difficult to figure out if a particular relationship we find in the data reveals an important truth rather than a chance run that mimics a causal relationship. If we could do repeated experiments or compare data across countries (or other jurisdictions) without worrying about the “all else equal” assumption, we could perhaps sort this out. It would be like repeated experiments. But, unfortunately, there are too many institutional differences and common shocks across countries to reliably treat each country as an independent, all else equal experiment. Without repeated experiments – with just one set of historical data for the US to rely upon – it is extraordinarily difficult to tell the difference between a spurious correlation and a true, noteworthy relationship in the data.
Even so, if we had a very, very long time-series for a single country, and if certain regularity conditions persisted over time (e.g. no structural change), we might be able to answer important theoretical and policy questions (if the same policy is tried again and again over time within a country, we can sort out the random and the systematic effects). Unfortunately, the time period covered by a typical data set in macroeconomics is relatively short (so that very few useful policy experiments are contained in the available data, e.g. there are very few data points telling us how the economy reacts to fiscal policy in deep recessions).
There is another problem with using historical as opposed to experimental data, testing theoretical models against data the researcher knows about when the model is built. In this regard, when I was a new assistant professor Milton Friedman presented some work at a conference that impressed me quite a bit. He resurrected a theoretical paper he had written 25 years earlier (it was his plucking model of aggregate fluctuations), and tested it against the data that had accumulated in the time since he had published his work. It’s not really fair to test a theory against historical macroeconomic data, we all know what the data say and it would be foolish to build a model that is inconsistent with the historical data it was built to explain – of course the model will fit the data, who would be impressed by that? But a test against data that the investigator could not have known about when the theory was formulated is a different story – those tests are meaningful (Friedman’s model passed the test using only the newer data).
As a young time-series econometrician struggling with data/degrees of freedom issues I found this encouraging. So what if in 1986 – when I finished graduate school – there were only 28 years of quarterly observations for macro variables (112 total observations; reliable data on money, which I almost always needed, doesn’t begin until 1959). By, say, the end of 2012 there would be almost double that amount (216 versus 112!!!). Asymptotic (plim-type) results here we come! (Switching to monthly data doesn’t help much since it’s the span of the data – the distance between the beginning and the end of the sample – rather than the frequency at which the data are sampled that determines many of the “large-sample results”).
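The span-versus-frequency parenthetical can be checked with a toy Monte Carlo. This is my own illustration, assuming the underlying series is a persistent stationary AR(1) at the monthly frequency; it compares the precision of the sample mean under monthly sampling over 25 years, quarterly sampling over the same 25 years, and monthly sampling over 50 years.

```python
import numpy as np

rng = np.random.default_rng(1)
phi, n_sims = 0.95, 2000   # persistent monthly AR(1), Monte Carlo replications

def mean_sd(months, step):
    """Monte Carlo s.d. of the sample mean, sampling every `step` months."""
    e = rng.standard_normal((n_sims, months))
    x = np.empty_like(e)
    x[:, 0] = e[:, 0] / np.sqrt(1 - phi**2)      # start at the stationary distribution
    for t in range(1, months):
        x[:, t] = phi * x[:, t - 1] + e[:, t]
    return x[:, ::step].mean(axis=1).std()

sd_monthly_25y = mean_sd(300, 1)     # 300 observations over 25 years
sd_quarterly_25y = mean_sd(300, 3)   # 100 observations, same 25-year span
sd_monthly_50y = mean_sd(600, 1)     # 600 observations, double the span

print(f"monthly, 25y:   {sd_monthly_25y:.3f}")
print(f"quarterly, 25y: {sd_quarterly_25y:.3f}")
print(f"monthly, 50y:   {sd_monthly_50y:.3f}")
```

Tripling the number of observations by sampling monthly instead of quarterly leaves the standard deviation of the estimate essentially unchanged, while doubling the span shrinks it by roughly a factor of √2.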
By today, I thought, I would have almost double the data I had back then and that would improve the precision of tests quite a bit. I could also do what Friedman did, take really important older papers that give us results “everyone knows” and see if they hold up when tested against newer data.
It didn’t work out that way. There was a big change in the Fed’s operating procedure in the early 1980s, and because of this structural break today 1984 is a common starting point for empirical investigations (start dates can be anywhere in the 79-84 range though later dates are more common). Data before this time-period are discarded.
So, here we are 25 years or so later and macroeconomists don’t have any more data at our disposal than we did when I was in graduate school. And if the structure of the economy keeps changing – as it will – the same will probably be true 25 years from now. We will either have to model the structural change explicitly (which isn’t easy, and attempts to model structural breaks often induce as much uncertainty as clarity), or continually discard historical data as time goes on (maybe big data, digital technology, theoretical advances, etc. will help?).
The point is that for a variety of reasons – the lack of experimental data, small data sets, and important structural change foremost among them – empirical macroeconomics is not able to definitively say which competing model of the economy best explains the data. There are some questions we’ve been able to address successfully with empirical methods, e.g., there has been a big change in views about the effectiveness of monetary policy over the last few decades driven by empirical work. But for the most part empirical macro has not been able to settle important policy questions. The debate over government spending multipliers is a good example. Theoretically the multiplier can take a range of values from small to large, and even though most theoretical models in use today say that the multiplier is large in deep recessions, ultimately this is an empirical issue. I think the preponderance of the empirical evidence shows that multipliers are, in fact, relatively large in deep recessions – but you can find whatever result you like and none of the results are sufficiently definitive to make this a fully settled issue.
I used to think that the accumulation of data along with ever improving empirical techniques would eventually allow us to answer important theoretical and policy questions. I haven’t completely lost faith, but it’s hard to be satisfied with our progress to date. It’s even more disappointing to see researchers overlooking these well-known, obvious problems – for example the lack of precision and sensitivity to data errors that come with the reliance on just a few observations – to oversell their results.
I should have noted this when I posted the conference schedule. If you want to watch a live feed of the sessions, it's at:
Remember that all times listed are Hong Kong Time (15 hours ahead of PST, 12 hours ahead of EST). Videos of each session will also be posted (same address as the link above).
Is "Intellectual Property" a Misnomer?, by Tim Taylor: The terminology of "intellectual property" goes back to the eighteenth century. But some modern critics of how patent and copyright law have evolved have come to view the term as a tendentious choice. Once you have used the "property" label, after all, you are implicitly making a claim about rights that should be enforced by the broader society. But "intellectual property" is a much squishier subject than more basic applications of property, like whether someone can move into your house or drive away in your car or empty your bank account. ...
Is it really true that using someone else's invention is actually the same thing as stealing their sheep? If I steal your sheep, you don't have them any more. If I use your idea, you still have the idea, but are less able to profit from using it. The two concepts may be cousins, but they are not identical.
Those who believe that patent protection has in some cases gone overboard, and is now in many industries acting more to protect established firms than to encourage new innovators, thus refer to "intellectual property" as a "propaganda term." For a vivid example of these arguments, see "The Case Against Patents," by Michele Boldrin and David K. Levine, in the Winter 2013 issue of my own Journal of Economic Perspectives. (Like all articles in JEP back to the first issue in 1987, it is freely available on-line courtesy of the American Economic Association.)
Mark Lemley offers a more detailed unpacking of the concept of "intellectual property" in a 2005 article he wrote for the Texas Law Review called "Property, Intellectual Property, and Free Riding." Lemley writes: "My worry is that the rhetoric of property has a clear meaning in the minds of courts, lawyers and commentators as “things that are owned by persons,” and that fixed meaning will make it all too tempting to fall into the trap of treating intellectual property just like “other” forms of property. Further, it is all too common to assume that because something is property, only private and not public rights are implicated. Given the fundamental differences in the economics of real property and intellectual property, the use of the property label is simply too likely to mislead."
As Lemley emphasizes, intellectual property is better thought of as a kind of subsidy to encourage innovation--although the subsidy is paid in the form of higher prices by consumers rather than as tax collected from consumers and then spent by the government. A firm with a patent is able to charge more to consumers, because of the lack of competition, and thus earn higher profits. There is reasonably broad agreement among economists that it makes sense for society to subsidize innovation in certain ways, because innovators have a hard time capturing the social benefits they provide in terms of greater economic growth and a higher standard of living, so without some subsidy to innovation, it may well be underprovided.
But even if you buy that argument, there is room for considerable discussion of the most appropriate ways to subsidize innovation. How long should a patent be? Should the length or type of patent protection differ by industry? How fiercely or broadly should it be enforced by courts? In what ways might U.S. patent law be adapted based on experiences and practices in other major innovating nations like Japan or Germany? What is the role of direct government subsidies for innovation in the form of government-sponsored research and development? What about the role of indirect government subsidies for innovation in the form of tax breaks for firms that do research and development, or in the form of support for science, technology, and engineering education? Should trade secret protection be stronger, and patent protection be weaker, or vice versa?
These are all legitimate questions about the specific form and size of the subsidy that we provide to innovation. None of the questions about "intellectual property" can be answered by yelling "it's my property."
The phrase "intellectual property" has been around for a few hundred years, so it clearly has real staying power and widespread usage, and I don't expect the term to disappear. But perhaps we can start referring to intellectual "property" in quotation marks, as a gentle reminder that an overly literal interpretation of the term would be imprudent as a basis for reasoning about economics and public policy.
Dean Baker's blog is called "Beat the Press," but he praised this effort (the original is quite a bit longer, and makes additional points):
The War On Entitlements, by Thomas Edsall, Commentary, NY Times: ...Currently, earned income in excess of $113,700 is entirely exempt from the 6.2 percent payroll tax that funds Social Security benefits... Simply eliminating the payroll tax earnings cap — and thus ending this regressive exemption for the top 5.2 percent of earners — would, according to the Congressional Budget Office, solve the financial crisis facing the Social Security system.
So why don’t we talk about raising or eliminating the cap – a measure that has strong popular, though not elite, support? ... The Washington cognoscenti are more inclined to discuss two main approaches...: means-testing of benefits and raising the age of eligibility for Social Security and Medicare. ... Means-testing and raising the age of eligibility as methods of cutting spending appeal to ideological conservatives for a number of reasons.
First, insofar as benefits for the affluent are reduced or eliminated under means-testing, social insurance programs are no longer universal and are seen, instead, as a form of welfare. Public support would almost certainly decline, encouraging further cuts in the future. Second, the focus on means-testing and raising the age of eligibility diverts attention from a much simpler and more equitable approach: raising the payroll tax to apply to the earnings of the well-to-do, a step strongly opposed by the ideological right. ... Third, and most important in terms of the policy debate, while both means-testing and eliminating the $113,700 cap on earnings subject to the payroll tax hurt the affluent, the latter would inflict twice as much pain. ...
Theda Skocpol ... of ... Harvard and an authority on the history of the American welfare state contended ... that policy elites avoid addressing the sharply regressive nature of social welfare taxes because, “at one level, it’s very, very privileged people wanting to make sure they cut spending on everybody else” while “holding down their own taxes.” ...
Stephen Ziliak, via email:
Does graphing improve prediction and increase understanding of uncertainty? When making economic forecasts, are scatter plots better than t-statistics, p-values, and other commonly required regression output?
A recent paper by Emre Soyer and Robin Hogarth suggests the answers are yes, that in fact we are far better forecasters when staring at plots of data than we are when dishing out – as academic journals normally do – tables of statistical significance. [Here is a downloadable version of the Soyer-Hogarth article.]
“The Illusion of Predictability: How Regression Statistics Mislead Experts” was published by Soyer and Hogarth in a symposium of the International Journal of Forecasting (vol. 28, no. 3, July 2012). The symposium includes published comments by J. Scott Armstrong, Daniel Goldstein, Keith Ord, Nassim Nicholas Taleb, and me, together with a reply from Soyer and Hogarth.
Soyer and Hogarth performed an experiment on the forecasting ability of more than 200 well-published econometricians worldwide to test their ability to predict economic outcomes using conventional outputs of linear regression analysis: standard errors, t-statistics, and R-squared.
The chief finding of the Soyer-Hogarth experiment is that the expert econometricians themselves—our best number crunchers—make better predictions when only graphical information—such as a scatter plot and theoretical linear regression line—is provided to them. Give them t-statistics and fits of R-squared for the same data and regression model and their forecasting ability declines. Give them only t-statistics and fits of R-squared and predictions fall from bad to worse.
It’s a finding that hits you between the eyes, or should. R-squared, the primary indicator of model fit, and the t-statistic, the primary indicator of coefficient fit, are evidently doing more harm than good in the leading journals of economics, such as the AER, QJE, JPE, and RES.
Soyer and Hogarth find that conventional presentation mode actually damages inferences from models. This harms decision-making by reducing the econometrician’s (and profit seeker’s) understanding of the total error of the experiment—or of what might be called the real standard error of the regression, where “real” is defined as the sum (in percentage terms, say) of both systematic and random sources of uncertainty in the whole model. If Soyer and Hogarth are correct, academic journals should allocate more space to visual plots of data and less to tables of statistical significance.
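Anscombe's quartet makes a closely related point in miniature: datasets with nearly identical regression output but radically different scatter plots. A minimal sketch using three of the standard quartet datasets (this is an illustration of the general point, not Soyer and Hogarth's experimental data):

```python
import numpy as np

# Anscombe's quartet: the classic demonstration that regression summaries
# can be nearly identical while the underlying scatter plots differ wildly
x123 = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])
x4 = np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8], dtype=float)
y4 = np.array([6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89])

def ols_stats(x, y):
    """Slope, intercept, and R-squared from a simple OLS fit."""
    slope, intercept = np.polyfit(x, y, 1)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    return slope, intercept, r2

# Each dataset gives slope ~0.5, intercept ~3.0, R^2 ~0.67 -- yet one is
# linear with noise, one is a smooth curve, and one is a single outlier
for x, y in [(x123, y1), (x123, y2), (x4, y4)]:
    print(ols_stats(x, y))
```

Only a plot distinguishes the noisy-linear case from the curved and outlier-driven cases; the table of regression output cannot.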
In the blogosphere the statistician Andrew Gelman, INET’s Robert Johnson, and journalists Justin Fox (Harvard Business Review) and Felix Salmon (Reuters) have commented favorably on Soyer's and Hogarth's striking results.
But historians of economics and statistics, joined by scientists in other fields – engineering and physics, for example – will not be surprised by the power of visualizing uncertainty. As I explain in my published comment, Karl Pearson himself—a founding father of English-language statistics—tried beginning in the 1890s to make “graphing” the foundation of statistical method. Leading economists of the day such as Francis Edgeworth and Alfred Marshall sympathized strongly with the visual approach.
And as Keynes (1937, QJE) observed, in economics “there is often no scientific basis on which to form any calculable probability whatever. We simply do not know.” Examples of variables we do not know well enough to forecast include, he said, “the obsolescence of a new invention”, “the price of copper” and “the rate of interest twenty years hence” (Keynes, p. 214).
That sounds about right - despite currently fashionable claims about the role of statistical significance in finding a Higgs boson. Unfortunately, Soyer and Hogarth did not include time series forecasting in their novel experiment, though in future work I suspect they and others will.
But with extremely powerful, dynamic, and high-dimensional visualization software such as “GGobi” – which works with R and is currently available for free on-line - economists can join engineers and rocket scientists and do a lot more gazing at data than we currently do (http://www.ggobi.org).
At least, that is, if our goal is to improve decisions and to identify relationships that hit us between the eyes.
Stephen T. Ziliak
Professor of Economics
Marcus Nunes, I think properly, concludes that Williamson’s graph is wrong, because Williamson ignores the fact that there was a rising trend of NGDP during the 1970s, while during the Great Moderation, NGDP was stationary... Furthermore, Scott Sumner questions whether the application of the Hodrick-Prescott filter to the entire 1947-2011 period was appropriate, given the collapse of NGDP after 2008, thereby distorting estimates of the trend…
First off, I am very cautious about mixing pre- and post-1985 data because of the impact of the Great Moderation on business cycle dynamics. This applies to Jim Hamilton's reply to my thoughts about the positive impact from housing. Hamilton points out that prior to the Great Moderation, housing would make significant contributions to GDP growth as the economy jumped back to trend. True enough; Hamilton might prove correct. But I would add that large contributions prior to 1985 would typically come in the early stages of the business cycle. I don't think the same kinds of cycles are currently at play, and that it is a little late to be expecting a V-shaped boost from housing.
As to the issue of the HP filter, this was on my radar because St. Louis Federal Reserve President James Bullard likes to rely on this technique to support his claim that the US economy is operating near potential. As he said today:
The housing bubble and the ensuing financial crisis probably did some lasting damage to the economy, suggesting that the output gap in the U.S. is not as large as commonly believed and that the growth rate of potential output is modest. This helps explain why U.S. growth continues to be sluggish, why U.S. inflation has remained close to target instead of dropping precipitously and why U.S. unemployment has fallen over the last year—from a level of 9.1 percent in June 2011 to 8.2 percent in June 2012.
I think there is more wrong than right in these two sentences. I don't see how a slower rate of potential growth necessarily implies lower actual growth in the short run. Clearly we have many instances of both above and below trend growth over the years. The failure of inflation to fall further can easily be explained by nominal wage rigidities. And the drop in the unemployment rate, in itself not impressive, should be taken in context with the stagnation of the labor force participation rate.
Bullard likes to rely on this chart as support:
For some reason, Bullard rejects entirely CBO estimates of potential output, which would reveal a smaller output gap than his linear trend decomposition. My version of this chart:
To deal with the endpoint problem, I used a GDP forecast from an ARIMA(1,1,1) model to extend the data beyond 2012:1. If you don't deal with the endpoint problem, you get this:
I believe most people would consider this result (that output is solidly above potential) to be nonsensical. By itself, the issue of dealing with the endpoint problem should raise red flags about using the HP filter to draw policy conclusions about recent economic dynamics.
Relatedly, notice that the HP filter reveals a period of substantial above trend growth through the middle of 2008. This should be a red flag for Bullard. If he wants to argue that steady inflation now implies that growth is close to potential, he needs to explain why inflation wasn't skyrocketing in 2005. Or 2006. Or 2007. Most importantly, we should have seen the rise in headline inflation confirmed by core-inflation. The record:
Core-inflation remained remarkably well-behaved for an economy operating so far above potential, don't you think?
At issue is the tendency of the HP filter to generate revisionist history. Consider the view of the world using data through 2007:4:
Suddenly, the output gap disappears almost entirely in 2005. And 2006. And 2007. Which is much more consistent with the inflation story during that period.
Bottom Line: Use the HP filter with great caution, especially around large shocks. Such shocks will distort your estimates of the underlying trends, both before and after the shock.
Some of you might be interested in this:
An Overview of VAR Modelling, by Dave Giles: ...my various posts on different aspects of VAR modelling have been quite popular. Many followers of this blog will therefore be interested in a recent working paper by Helmut Lütkepohl. The paper is simply titled, "Vector Autoregressive Models", and it provides an excellent overview by one of the leading figures in the field.
You can download the paper from here.
There was recently some discussion of "genoeconomics":
...the idea that genes had an important role to play in decision-making was largely abandoned in the world of economics. But with the completion of the Human Genome Project in 2000, the first full sequence of a human being’s genetic code, people started wondering if perhaps it would be possible to push past broad heritability estimates ... and figure out what part of a person’s genome influenced what aspect of his behavior.
However, Cornell economist Daniel Benjamin argues that the ability of genetic factors to explain individual variation in economic and political behavior is "likely to be very small" (genetic data "taken as a whole" may have some predictive power, but "molecular genetic data has essentially no predictive power"):
New evidence that many genes of small effect influence economic decisions and political attitudes, EurekAlert: Genetic factors explain some of the variation in a wide range of people's political attitudes and economic decisions – such as preferences toward environmental policy and financial risk taking – but most associations with specific genetic variants are likely to be very small, according to a new study led by Cornell University economics professor Daniel Benjamin.
The research team arrived at the conclusion after studying a sample of about 3,000 subjects with comprehensive genetic data and information on economic and political preferences. The researchers report their findings in "The Genetic Architecture of Economic and Political Preferences," published by the Proceedings of the National Academy of Sciences Online Early Edition, May 7, 2012.
The study showed that unrelated people who happen to be more similar genetically also have more similar attitudes and preferences. This finding suggests that genetic data - taken as a whole – could eventually be moderately predictive of economic and political preferences. The study also found evidence that the effects of individual genetic variants are tiny, and these variants are scattered across the genome. Given what is currently known, the molecular genetic data has essentially no predictive power for the 10 traits studied, which included preferences toward environmental policy, foreign affairs, financial risk and economic fairness.
This conclusion is at odds with dozens of previous papers that have reported large genetic associations with such traits, but the present study included ten times more participants than the previous studies.
"An implication of our findings is that most published associations with political and economic outcomes are probably false positives. These studies are implicitly based on the incorrect assumption that there are common genetic variants with large effects," said Benjamin. "If you want to find genetic variants that account for some of the differences between people in their economic and political behavior, you need samples an order of magnitude larger than those presently used," he added.
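Benjamin's sample-size point can be illustrated with a standard power calculation. A rough sketch, assuming a single-variant test at genome-wide significance (alpha = 5e-8) and the usual normal approximation; the inputs are illustrative assumptions, not numbers from the paper:

```python
from statistics import NormalDist

def required_n(r2, alpha=5e-8, power=0.8):
    """Approximate sample size needed to detect a variant explaining a
    fraction r2 of trait variance, via the normal approximation
    n ~= (z_alpha/2 + z_power)^2 / r2."""
    z = NormalDist()
    z_a = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_b = z.inv_cdf(power)          # power requirement
    return (z_a + z_b) ** 2 / r2

# A variant explaining 0.1% of trait variance needs roughly 40,000 subjects
# at genome-wide significance -- an order of magnitude more than the ~3,000
# in the study described above
print(round(required_n(0.001)))
```

The calculation makes the quoted point concrete: with effect sizes this small, samples of a few thousand are far below the threshold at which true associations can be reliably separated from false positives.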
The research team concluded that it may be more productive in future research to focus on behaviors that are more closely linked to specific biological systems, such as nicotine addiction, obesity, and emotional reactivity, and are therefore likely to have stronger associations with specific genetic variants.