Saturday, May 3, 2014

Why Participation Is Down

There have been many attempts to answer this question: Is the decline in the U.S. labor force participation rate structural or cyclical? Or, more precisely, to what extent is it either one?

And there have been so many attempts because it really is an important question. Think about the economy as a big machine that takes three inputs -- technology, labor, and capital -- and produces output. The drop in the labor force means that the U.S. has forfeited, perhaps permanently, that labor input and whatever marginal output it would have yielded. A simple calculation [1] suggests that the share of output lost is about three percent; more in-depth calculations from Reifschneider, Wascher, and Wilcox (2013) place it at the center of their estimate of a seven-percent drop in potential output. That's a lot. You don't blow three percent of GDP, let alone seven, every day.

Another reason that economists keep coming back to the labor force participation rate is that, ominously, it keeps falling. Not only does that render much of the research overtaken by events, but also the data presents a challenge to reports that see the decline as cyclical and transitory.

I'd also say that the reason that the research continues [2] is that it hasn't settled on a single analytical framework. That's not necessarily a bad thing at all, as disagreement over methods forces researchers to reconcile differences in results rather than herd around a single conclusion. Yet to a certain degree it reflects dissatisfaction with the methods offered so far.

In this post I take an approach that is mostly [3] new to the question of the decline in the labor force participation rate but will be familiar to most labor economists: the Blinder-Oaxaca decomposition of a probit model for the labor force participation decision. I use microdata from the March 2007 and 2013 supplements to the Current Population Survey, downloaded from IPUMS. I conclude that, of the 2.8-percentage-point decline in the labor force participation rate over that six-year period, more than half (1.7 percentage points) can be explained by underlying changes in demography, though a substantial fraction (1.1 percentage points) cannot.

The method

For the majority of my audience that has no idea what a Blinder-Oaxaca decomposition is, here's a quick 101. It's a statistical technique invented by Alan Blinder and Ronald Oaxaca in 1973 that takes the change in a variable and determines how much of it can be explained by a set of other variables in a model and how much can't. (Note: What comes next gets rather mathy, but you can skip down to "My idea..." if math isn't your thing.)

For example, Blinder and Oaxaca both wanted to understand why people differ in their earnings. Let's say that you think pay is determined by a bunch of factors, like your education, work experience, occupation, and so on. Let's put all of those factors into a matrix X, which contains data on lots of people. Let's put all of their earnings into another matrix Y. Then we can estimate the impact of all of those factors by an ordinary least squares regression:

Y = Xβ + Ε

where β is the matrix of coefficients, which reflects the impacts of the factors, and Ε is a matrix of residuals.

Now here's the innovation from Blinder and Oaxaca: If we want to understand a change in Y between two periods, then in the context of our model, there can only be two things going on. Either X or β could have changed -- that is, there could have been an underlying change in the determinants X of pay Y, or there could have been a change in the impacts of factors, reflected in β. We can express that idea as:

ΔY = Ya - Yb = (Xa - Xb)βb + Xa(βa - βb)

where the "a" subscripts are for the "after" period and the "b" subscripts for the "before" period. You can think of the first term as the explained share, changes in the composition of the independent variables. And you can think of the second term as the unexplained share, changes in effects.
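To make the two terms concrete, here is a minimal sketch of the decomposition on synthetic data. Everything in it is an assumption for illustration: the single regressor (think of it as years of education), the sample sizes, and the coefficients are made up, and the helper names `ols`, `simulate`, and `decompose` are hypothetical. The decomposition is evaluated at sample means, which is how the identity above holds exactly for a least-squares fit with an intercept.

```python
import random
import statistics


def ols(x, y):
    """Fit y = intercept + slope * x by least squares; return (intercept, slope)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(n, mean_x, intercept, slope, rng):
    """Draw synthetic (x, y) pairs; every number here is made up."""
    x = [rng.gauss(mean_x, 2.0) for _ in range(n)]
    y = [intercept + slope * xi + rng.gauss(0.0, 1.0) for xi in x]
    return x, y


def decompose(x_before, y_before, x_after, y_after):
    """Split the change in mean y into explained and unexplained parts."""
    a_b, b_b = ols(x_before, y_before)
    a_a, b_a = ols(x_after, y_after)
    mean_xb = statistics.fmean(x_before)
    mean_xa = statistics.fmean(x_after)
    # (Xa - Xb) * beta_b: the constant column is the same in both periods,
    # so only the change in the regressor's mean contributes.
    explained = (mean_xa - mean_xb) * b_b
    # Xa * (beta_a - beta_b): includes the change in the intercept.
    unexplained = (a_a - a_b) + mean_xa * (b_a - b_b)
    total = statistics.fmean(y_after) - statistics.fmean(y_before)
    return total, explained, unexplained


rng = random.Random(0)
x_b, y_b = simulate(5000, mean_x=12.0, intercept=1.0, slope=0.10, rng=rng)
x_a, y_a = simulate(5000, mean_x=13.0, intercept=0.9, slope=0.11, rng=rng)
total, explained, unexplained = decompose(x_b, y_b, x_a, y_a)
print(f"total={total:.3f} explained={explained:.3f} unexplained={unexplained:.3f}")
```

The explained and unexplained pieces add up to the total change in the mean, up to floating-point error. Note that the split depends on which period's coefficients you use to price the change in characteristics; this sketch follows the formula above and uses the "before" coefficients.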

Now, my application of this model is a little bit more complex, because we're trying to explain a binary variable. "Are you in the labor force?" can get an answer of yes or no. So I've used something called a probit model, which allows us to estimate the probability that you answer yes or no to that question, given your characteristics. Our changes in the probabilities can also be divided in just the same way into changes in characteristics and changes in the effects of characteristics.
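For the curious, one common way to write the same split for a probit is below. This is a sketch of the standard nonlinear extension, not a transcription of my .do file, so treat the exact weighting as an assumption. Here Φ is the standard normal CDF, x is a person's vector of characteristics, N is the sample size in each period, and the bars denote average predicted probabilities.

```latex
\bar{P}_t = \frac{1}{N_t}\sum_{i=1}^{N_t} \Phi(x_{it}'\beta_t), \qquad t \in \{a, b\}

\bar{P}_a - \bar{P}_b
  = \underbrace{\left[\frac{1}{N_a}\sum_{i=1}^{N_a}\Phi(x_{ia}'\beta_b)
      - \frac{1}{N_b}\sum_{i=1}^{N_b}\Phi(x_{ib}'\beta_b)\right]}_{\text{explained: characteristics change}}
  + \underbrace{\left[\frac{1}{N_a}\sum_{i=1}^{N_a}\Phi(x_{ia}'\beta_a)
      - \frac{1}{N_a}\sum_{i=1}^{N_a}\Phi(x_{ia}'\beta_b)\right]}_{\text{unexplained: effects change}}
```

The middle terms cancel, so the two brackets sum to the change in the average predicted probability, just as in the linear case.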

These techniques might seem exotic or advanced to newcomers. To economists, though, they're standard practice. So much so that it's surprising that I was not able to find a single piece of research that did what I think should be the first cut at answering this oh-so-important question about the decline in labor force participation. 

My idea, to be sure, was pretty simple. Here, I'll explain it without the math. Create a model that includes everything you think might be relevant to the decision of whether to participate in the labor force or not. Find data on an "after" period (March 2013) and a "before" period (March 2007). Then see what change in the labor force participation rate the model predicts. But, whatever you do, don't tell the model that a recession happened between 2007 and 2013. Include everything you think might explain the labor force participation decision in a structural capacity -- but nothing else.

My dataset is the March 2007 and 2013 supplements to the Current Population Survey. That gives me a sample size of roughly 150,000 people for both years. To predict whether or not each of these people is in the labor force, I had data on lots of different things: their age, sex, race, marital status, health status, disability status, education, whether they are currently enrolled in school, whether they're a war veteran, whether they have young children at home, and whether they're on welfare.

It turns out that all this information is enough to make a good guess at whether you actually are in the labor force or not. On average, the model gets it right 81 percent of the time, assuming that you think of predictions of 50 percent and above as a "yes" and below 50 percent as a "no."
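The accuracy figure is just the share of people for whom the thresholded prediction matches reality. A minimal sketch, with made-up probabilities and outcomes and a hypothetical helper name `accuracy`:

```python
def accuracy(probabilities, outcomes, threshold=0.5):
    """Share of people whose predicted yes/no matches their actual status."""
    hits = sum(
        (p >= threshold) == bool(actual)
        for p, actual in zip(probabilities, outcomes)
    )
    return hits / len(outcomes)


# Hypothetical predicted probabilities and actual labor force status (1 = in).
predicted = [0.92, 0.35, 0.61, 0.50, 0.08]
actual = [1, 0, 0, 1, 0]
print(accuracy(predicted, actual))
```

A prediction of exactly 50 percent counts as a "yes" here, matching the rule described above.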

And I've deliberately gone out of my way to include common narratives about why the labor force participation rate has fallen. The aging and retirement of the Baby Boomers. The rise in worker disability. The rise in college enrollment. Furthermore, the unexplained share of this method will identify the specific areas of unexplained changes -- for instance, if women en masse suddenly have decided to stop working (and it turns out they haven't), this method will point at that issue. So one of the huge advantages to this approach is that it allows us to do a bunch of tests of specific theories one-by-one and say whether they hold water or not. 

The results

The headline result is that 1.7 percentage points of the decline in the labor force participation rate are explained by changes in the demographic composition of the population, and that 1.1 percentage points are left unexplained. The 95-percent confidence intervals on those figures are that between 1.4 and 1.9 percentage points are explained and between 0.8 and 1.4 percentage points are unexplained. 

This is a good place to note that I've made my .do files available here, so that you can go home and replicate this work, as I know you're all dying to do.

What matters to explaining the decline in the labor force participation rate? One thing above all else: aging, which explains 1.3 percentage points of the drop. The next most important: enrollment in school, which explains 0.8 percentage points of the drop. Remember that individual explanations can sum to more than the total, because there are other changes that partially offset. For example, the rise in educational attainment, which comes from this enrollment, explains a 0.6-percentage-point rise in the labor force participation rate, because the well-educated work like crazy.

What matters less? The rise of disability, which explains 0.2 percentage points of the drop. The decline in the birthrate during the recession, which would suggest a 0.1-percentage-point increase, since fewer people are tied down at home with four-year-olds. 

And what just straight up doesn't matter? Changes in the share of people on welfare, disability aside. Changes in health, after accounting for disability and age. Changes in the sex and race composition of the labor force.

It also turns out that there's no single category that absorbs most of the unexplained share. In fact, the model puts almost all of the unexplained share into a constant. Which basically means that the model is saying, "Whatever your background, take what your probability of being in the labor force was in 2007 and mark it down by some amount for your 2013 probability." I found this compelling evidence that what our model says is unexplained really is the business cycle, and not some omitted structural explanation.

Here, also, is maybe a conclusion you wanted: What does the model predict the labor force participation rate is in March 2013 based on these changes in composition? 64.7 percent, as compared to an actual rate of 63.5 percent. Perhaps this makes you view my conclusion differently, if "less than half cyclical" sounded dour. This wouldn't be a trivial amount of recovery, as you can see in this graph. The black dot not on the line indicates the March 2013 counterfactual.


I've been meaning to write a post on this for a long time. It is the analytical challenge of our era for economists. It's taken me so long to put together an estimate because I wanted an approach I could defend.

One valuable side note is that the change in the working-age labor force participation rate is probably a good rule of thumb for the change in the overall structural labor force participation rate. The drop is about the same as predicted. Which makes sense: These are people whose labor force decision should not be sensitive to the business cycle. They're in the working period of their lives.

I should also mention some shortcomings of this analysis. One of them is that I've only used data from two months, the March 2007 and 2013 CPS supplements. This was mainly out of convenience, as that was the data available on IPUMS, the database I linked to earlier.

Another concern is the obvious endogeneity problem with education. That is, if the economy's terrible, that affects your decision of whether to work now or to go back to school. But note that this problem is insoluble without a model of how the economy affects education decisions, something well beyond the scope of my work here. What my work suggests, though, is that this exercise is worthwhile. Since you get a year older every year, there's not a lot of mystery to the aging-working link. But, since we know now that education decisions were actually important to driving down overall labor force participation, maybe we should go back and think about it carefully.

A final concern is that a lot of the prior research I looked at includes what are called "cohort effects," that is, you think about labor force participation evolving differently for different generations of people, based on their pre-recession starting age. I don't do that in this model. If cohorts matter, this approach will miss it.

Part of my hope of writing this post, whether or not you agree with the overall conclusion, is to enlighten people about the explanatory power of all the theories on the table. If you're on the right, and walk away from this post saying, "Gosh, I wasn't convinced that the decline in the labor force participation rate is partly cyclical, but wow, maybe it really isn't all about more people on welfare," I'll take that as a victory. Or, if you're on the left, and think, "Gosh, I wasn't convinced that the decline in the labor force participation rate is more than half structural, but wow, maybe aging is a bigger part of the story than I thought," I'll also take that as a victory. And, for sure, this won't be the last word. There are many other compelling approaches, each with their advantages and disadvantages. But I think this is an important one that needs to be added to the conversation.

If you have questions, I'm happy to answer them in the comments.


1. Assume that GDP is described by a Cobb-Douglas aggregate production function with a labor share of 0.6, consistent with U.S. levels. Then, holding capital and technology constant, you would predict that a 5-percent drop in the labor force participation rate would cause a 3-percent drop in output.

2: You can find a good literature review in Erceg and Levin (2013).

3: There is an exception, Hotchkiss and Rios-Avila (2013). But it does something I think is not good, which is that it includes a measure of labor-market conditions. My approach differs importantly in that I don't include one because I want to see the conclusions of the model without telling it about the recession. I have some other concerns about the particular measure they've chosen and whether we really can include it in the model if it is codetermined with labor force participation.
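The back-of-the-envelope calculation in note 1 can be sketched as follows. The function name `output_change` is hypothetical, and the sketch assumes, as the note does, that capital and technology are held fixed; it also treats a drop in the participation rate as an equal proportional drop in labor input, which ignores changes in population, unemployment, and hours.

```python
def output_change(labor_change, labor_share=0.6):
    """Proportional change in output when only labor input changes.

    Assumes Y = A * K**(1 - labor_share) * L**labor_share with A and K fixed.
    """
    return (1.0 + labor_change) ** labor_share - 1.0


exact = output_change(-0.05)
approx = 0.6 * -0.05  # first-order (log-linear) approximation
print(f"exact: {exact:.4f}, approximation: {approx:.4f}")
```

Both the exact figure and the linear approximation come out at roughly a three-percent drop in output.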

Update: I've made my fully cleaned up .dta file available for direct download here.


Further results:

Alan Reynolds of the Cato Institute asked me to try repeating the decomposition with broader measures of welfare programs -- the one I used originally was narrow, i.e. TANF, and Reynolds wanted SNAP (food stamps), Medicare, and Medicaid.

Following other ideas in the comments, I also included cubic and quartic terms in age, so as to better approximate the curve of the LFPR in age. I found that inclusion of the extra age terms didn't do much.

I found that the increase in the fraction receiving public health insurance was an important explanatory variable for the decline of the labor force participation rate: It explains about 0.6 percentage points. I found the increase from SNAP was rather small: It explains 0.2 percentage points. In the new specification, fully 2.5 percentage points of the 2.8-percentage-point drop in the LFPR is explained by changes in the composition of the workforce.

I would strongly caution Alan, or anyone really, against interpreting this as a causal result. Don't conclude that because Obama expanded Medicaid and food stamps, those new recipients aren't working any more. I imagine that most of this growth was the result of the business cycle. The causal pathway probably goes from unemployment to those programs. I am aware Medicaid expanded permanently, but there is no way to disentangle this.

Tuesday, April 29, 2014

The Fishermen

Stephen Williamson and John Cochrane have raised a radical question: What if we have the sign wrong for monetary policy? What if low interest rates reduce inflation and high interest rates raise it, rather than the other way around?

Their basic argument is this: If the real interest rate is fixed, then when the central bank raises the nominal interest rate, it also raises the inflation rate; when it lowers the nominal interest rate, it also lowers the inflation rate.

You can see that in the Fisher relation, which is:

1 + i = (1+r)(1+π), so that i ≈ r + π

where i is the nominal interest rate, r is the real interest rate, and π is the inflation rate. The time preferences of agents in the economy fix the real interest rate, and the central bank specifies policy in terms of the nominal rate. 
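Here is a small sketch of that arithmetic, using the exact form 1 + i = (1 + r)(1 + π) alongside the approximation. The two-percent real rate and the menu of nominal rates are hypothetical, and `implied_inflation` is just an illustrative name.

```python
def implied_inflation(nominal_rate, real_rate):
    """Inflation implied by the exact Fisher relation 1 + i = (1 + r)(1 + pi)."""
    return (1.0 + nominal_rate) / (1.0 + real_rate) - 1.0


real_rate = 0.02  # hypothetical fixed real rate
for nominal_rate in (0.02, 0.04, 0.06):
    exact = implied_inflation(nominal_rate, real_rate)
    approx = nominal_rate - real_rate
    print(f"i={nominal_rate:.2f}  exact pi={exact:.4f}  approx pi={approx:.4f}")
```

With the real rate held fixed, every increase in the nominal rate maps one-for-one (approximately) into the implied inflation rate. That mechanical link is all the relation itself delivers.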

Noah Smith has a useful summary of the debate; he counts himself among the supporters of the "Neo-Fisherites." I think the term "Fishermen" is catchier. David Beckworth, who prodded me to write up some thoughts on this, argues that economists have conclusive historical evidence against the Fishermen. After World Wars I and II, governments pegged the interest rates on their debt very low; what they got was explosive inflation. Ryan Avent also points out that expectations are key here -- and should be in the Fisher relation I wrote above.

I think there are basically two problems with the debate. The first is that the Fishermen wrap a dubious claim in an identity and a modeling assumption. The second, which follows from the first, is that if this debate is about anything, it's not really about the Fisher relation at all. It's about the wrapped-up claim.

Look back to the Fisher relation, and you'll see that it's true by the construction of the model that when the central bank raises the nominal interest rate, the inflation rate must rise. That's an identity, and we've said that the real interest rate is fixed.

But the problem with this argument, which looks airtight, is that it misconstrues what the Fisher relation is. What it really says is when the central bank raises the nominal interest rate, the inflation rate consistent with a steady-state equilibrium also rises.

Note that what I've done here is add in the words "consistent with a steady state equilibrium." This isn't mere semantics. It matters because the central bank's power in one sense is materially weaker: It no longer picks the current rate of inflation off a menu, but rather only the rate of inflation that can be sustained. Yet it also means that the central bank's power is, in another sense, materially stronger: It can distort the real interest rate in the short run.

Why does this matter? Because what I've shown is that wrapped inside Williamson's and Cochrane's point about the Fisher relation, which is just plain true, is an actual claim, and a dubious one: After the central bank raises the nominal interest rate, and thereby the steady-state rate of inflation, inflation will actually rise to that steady state. Cochrane embeds this assumption in the model if you look carefully; Williamson doesn't seem to discuss it.

So the first point is that the Fishermen aren't wrong. In fact, they can't be wrong. They're wrong about what their conclusion is. The second point is that the Fisher relation is in fact tangential to the whole debate. If Williamson and Cochrane are arguing anything at all, what they're arguing is the point about dynamics -- i.e. that inflation is well-behaved and goes to the new, higher steady state the central bank chooses when it picks the higher nominal interest rate.

Does the inflation rate explode when it starts out above the central bank's choice of equilibrium? Or does it converge nicely to the equilibrium? Do we fall into catastrophic deflation when inflation starts out below the central bank's choice of equilibrium? Or can we actually raise inflation by hiking rates? And, even if the dynamics aren't explosive, the interest-rate peg still means that the price level can float anywhere, depending on the sequence of shocks.

In case it's not totally obvious, I think we're in the second world. That's a view informed by a longstanding theoretical tradition in macroeconomics that the price level is indeterminate when the central bank pegs an interest rate. That's exactly what the second picture shows: If you don't start out at 2, then the dynamics get completely out of control.

It's also a view informed by data. Beckworth beat me to the punch when he looked at historical episodes of interest rate pegs and saw exactly what Sargent and Wallace predicted. I would also point out that we have really strong evidence, also from Sargent, that the way to stop hyperinflation is to hike interest rates hard. That would be literally the worst thing you could do if you cast your line with the Fishermen.

More: Via Tony Yates, I am reading Cochrane (2011), which gets into many of these issues. Also, I just noticed this new working paper from Williamson, which, interestingly, also digs into the indeterminacy problem. If you look on page 12, his argument about the Fisher relation pops up. Update: I got rid of the illustration because Noah found it confusing.

Sunday, April 20, 2014

Yes, the Pay Gap Persists

Mark Perry and Andrew Biggs, economists at the American Enterprise Institute, argued recently that no pay gap exists between men and women after you control for the different choices they make. The story of a woman who is paid less than a man exactly like her, they claim, is false.

I took issue with this argument in two posts. When I actually ran the numbers, I found a persistent pay gap on the order of 4 percent to 10 percent, accounting for a battery of things -- frankly, everything that I could think of, and everything labor economists usually consider -- occupation, work experience, education, race, marital status, children, union membership, geographic location, and weekly hours. And I also wrote that it's probably wrong to take all these things as unaffected by pressure or discrimination.

Perry responded last week in a post I somehow missed. It's worth a follow-up. "What if gender discrimination could be completely eliminated, would there still be a gender wage gap for reasons not related to discrimination against women by employers?" he writes. He argues that the pay gap might persist because of gender differences in risk tolerance. Men might take on higher-risk jobs, such as in oil drilling, and receive what's called a compensating wage differential. Perry also suggested that the gender gap might persist because professional athletes and musicians are paid well and tend to be men.

Sadly, his argument makes no sense. Let me explain why.

1. My regression has "fixed effects" for occupation. This means that it fully accounts for any occupation-level compensating differentials for risk. So everything Perry and Biggs write about men dying in forestry, or what have you -- yeah, my analysis accounts for that. That's what a fixed effect is.

2. My analysis is of workers paid hourly wages. Professional athletes and musicians are not hourly workers. So this can't possibly explain the pay gap.

Look, I understand why Perry and Biggs have to respond to me and Matthew Yglesias, who riffed on my data analysis for Vox, where I am also a contributor. They misrepresented the research consensus on the gender pay gap in a major newspaper, and I called them out on it. 

I would agree with them that the 23-percent number reflects more than discrimination. But if they are going to try to explain away the pay gap, they're going to need to try a bit harder than this. I would know: That was the point of my original post, to see if I could entirely eliminate the pay gap with controls. I couldn't. I easily could make it small, as it surely is, but in the process of spending hours on the data analysis, I grew convinced it was real.

Tuesday, April 15, 2014

Interest, to Deficit, to Debt

What drives the federal government's near-term fiscal outlook is, to a surprising extent, interest payments rather than Medicare and Medicaid. Interest payments are set to grow faster than mandatory or discretionary spending over the next decade, from 1.3 percent of GDP to 2.6 percent of GDP.

The difference between rising debt over the next decade and keeping debt stable as a percentage of GDP is just one percentage point in the average interest rate on the U.S. public debt.

Paul Krugman has argued recently that the Congressional Budget Office's forecast for interest rates is too high -- and inconsistent with the odds that the economy remains somewhat depressed for some time to come. I figured it would be useful to do some math and test some of the assumptions in the latest CBO forecast.

First, I backed out its forecast for the average interest rate on U.S. public debt -- you can do this by dividing interest payments as a percentage of GDP by debt as a percentage of GDP. What you find is that the CBO expects this rate to reach 4.2 percent by 2024.

Is that reasonable? Well, it's hard to know without having a good sense of how the average interest rate on U.S. public debt behaves. It turns out that you can proxy for it very closely with the 10-year Treasury. I added the forecast line in green. The public debt interest rate in blue is estimated by taking the annual federal expenditure on interest payments and dividing it by the stock of total public debt.

That doesn't seem to be an impossible path for the average interest rate on public debt. In fact, we can use the 20-year Treasury and the 10-year Treasury to back out the 10-year, 10-year-forward rate -- the expected interest rate on the 10-year Treasury note in 2024. Given a current 10-year rate of 2.63 percent and a current 20-year rate of 3.35 percent, the 10, 10 forward rate is 4.08 percent.
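The forward-rate arithmetic can be sketched in a few lines. This assumes annual compounding and treats the quoted Treasury yields as if they were zero-coupon yields, which is a simplification; `forward_rate` is an illustrative helper, not a library function.

```python
def forward_rate(short_yield, short_years, long_yield, long_years):
    """Annualized forward rate between short_years and long_years."""
    growth_long = (1.0 + long_yield) ** long_years
    growth_short = (1.0 + short_yield) ** short_years
    span = long_years - short_years
    return (growth_long / growth_short) ** (1.0 / span) - 1.0


rate = forward_rate(0.0263, 10, 0.0335, 20)
print(f"10-year, 10-year-forward rate: {rate:.2%}")
```

The idea is that investing for 20 years at the 20-year rate should pay the same as investing for 10 years at the 10-year rate and then rolling over for another 10 at the forward rate. A quick mental check is twice the 20-year yield minus the 10-year yield, which lands within a basis point or so of the compounded answer.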

But what if you assume that the interest rate will be 3 percent rather than 4.2 percent? How does the debt outlook change? Well, at that rate, debt stays constant for the next decade as a share of GDP, given the CBO's forecast for primary balance (the deficit ex-interest). Here's the graph comparing the two assumptions for the interest rate:

And here's the graph showing how this change in interest rates has a big impact on the near-term outlook for the federal debt:

I'm not making any claim that three percent is the correct assumption for the interest rate. In fact, I've presented some evidence as to why four percent is closer to market expectations. But it's always interesting to see how sensitive debt forecasts are to small changes in parameters like the interest rate. I guess this does illustrate Carmen Reinhart's view that highly-indebted governments almost always turn to financial repression.
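If you want to play with the sensitivity yourself, a minimal sketch of the debt arithmetic is below. The inputs are hypothetical round numbers, not the CBO's forecast: I assume a starting debt of 74 percent of GDP, constant nominal GDP growth of 4 percent, and a constant primary deficit of 1 percent of GDP. It also applies a single average interest rate to the whole debt stock from the first year, which overstates how fast rate changes pass through.

```python
def debt_path(debt0, interest_rate, growth_rate, primary_deficit, years=10):
    """Debt-to-GDP ratio under d[t+1] = d[t] * (1 + i) / (1 + g) + p."""
    path = [debt0]
    for _ in range(years):
        path.append(path[-1] * (1.0 + interest_rate) / (1.0 + growth_rate)
                    + primary_deficit)
    return path


# Hypothetical inputs, not the CBO's: starting debt of 74 percent of GDP,
# 4 percent nominal growth, primary deficit of 1 percent of GDP each year.
low = debt_path(0.74, 0.030, 0.04, 0.01)
high = debt_path(0.74, 0.042, 0.04, 0.01)
print(f"after 10 years: {low[-1]:.3f} at 3.0%, {high[-1]:.3f} at 4.2%")
```

The key term is the ratio (1 + i) / (1 + g): when the interest rate sits below the growth rate, the existing debt shrinks relative to GDP on its own, and when it sits above, the debt compounds.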

Thursday, April 10, 2014

How Big Is the Gender Pay Gap?

“It’s not a myth; it’s math,” President Barack Obama said of the pay gap between men and women in the workplace. “You can look at the numbers. You can look at the pay stubs.”

It's true that the average woman working full-time earns 23 percent less than the average man who works full-time does. Yet this tells us a lot less than it might seem at first glance.

Men and women differ in occupations, work experience, education, hours and so many other qualities that “average” doesn’t get you close to the real concern about pay equity: That women earn less simply because they are women, and that a woman would be paid less than a man who is otherwise exactly like her.

In this post, I'm going to do a cursory analysis of data from the March Current Population Survey from 1990 to 2013. You should check out the classic study on gender pay gaps here, from Francine Blau and Lawrence Kahn; these two essays by Claudia Goldin are also a terrific grounder on what and how economists think about the issue.

There's no doubt that, in raw average terms, women earn less than men on an hourly basis. Here's a kernel-density plot of the distribution of the natural logarithm of the hourly wage in 2013, divided according to gender. The black vertical line denotes the federal minimum wage. You can see the wage gap for yourself: the men's wage distribution, in red, is further to the right than the women's wage distribution, in blue.

But perhaps one might think the right question to ask with respect to gender equity isn't this raw average, but rather the counterfactual story about two people who are identical but for their gender. Econometric analysis can get us a good deal of the way there, and that's what this post is about.

Let's think about the factors that tend to be associated with higher or lower hourly wages. People with more work experience and more education seem likely to earn more, as do people who enter into occupations that are generally well-paid, like law or medicine, or unionized, like manufacturing. And it's well known that there are substantial racial differences in pay. We might also expect that pay varies according to whether you work in an urban area or a rural one and your geographic region of the country. We also know that hourly wages have generally risen over time as a result of inflation and productivity growth. And we might imagine that marriage and the number of children, especially young ones, have some effect on pay.

When we control for all of these influences -- which, let's suppose for the moment, are all independent of the gender pay gap -- how much of the gap persists? Whatever goes away is explained by these factors rather than purely gender.

The technical specification of the regression is that we're constructing a Mincer earnings function with controls for NAICS occupation, age, race, SMSA urban status, US Census region, marital status, union membership, number of children, number of children under age five, the logarithm of reported usual weekly hours, and time fixed effects by gender.
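In symbols, a regression of this kind looks roughly like the following. This is a schematic sketch rather than the exact specification in the .do file: w is the hourly wage, the α terms are year effects, Female is a gender indicator, x collects the controls listed above, and the notation is my shorthand.

```latex
\ln w_{it} = \alpha_t + \delta_t \,\mathrm{Female}_i + x_{it}'\beta + \varepsilon_{it}
```

Letting δ vary by year is what "time fixed effects by gender" buys you: each δ is the adjusted log pay gap in that year, and 100(e^{δ} - 1) converts it to a percentage.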

Using Stata to run the regression on my CPS survey data, which includes a sample of 209,000 people, I get the following table. From left to right, the columns are year, my point estimate for the gender pay gap in that year, the standard error, the t-statistic, the p-value, and then the 95-percent confidence interval around my point estimate.

My estimate of the gender pay gap is that women were paid about 7.7 percent per hour less than men on average in 2013, holding everything else equal. The gap was 14.3 percent in 1990. Depending on my regression specification, I was able to push the gender pay gap coefficients around only somewhat -- I think the range of reasonable estimates of the 2013 adjusted gender pay gap is probably 4 percent to 10 percent.

We're talking about a hypothetical woman working in the same occupation, in the same region of the country, of the same work experience, education, and race, and with the same family and working hours -- that woman is paid significantly less than a man to whom she is alike in all these respects, though the pay gap is smaller than the raw version.

It's not clear, though, that we really should be controlling for all these things. It's fair enough that people with more experience earn more. But it's reasonable to think that things like occupational choice and working hours are all influenced by the same gender discrimination we're seeking to detect. The pay gaps that result from women ending up in lower-paying fields are part of the pay gap insofar as that's true -- they're not something to explain away. More on this soon.

Note: This post is a re-do of an earlier one, which had a technical issue in the regression specification. Hat tip to Justin Wolfers, who spotted the problem, and to John Schmitt for some helpful further comments. You can download the .do file here, and the data from IPUMS here.

Thursday, March 20, 2014

Slacking Off

Update (3/20): Alan Krueger has a highly relevant paper for the Brookings Papers on Economic Activity that concludes the long-term unemployed likely exert minimal influence on labor markets, including wage determination.

We're at the stage of the conversation about labor-market slack, it seems, where everybody is looking for the takeaways and points of meaningful agreement.

Cardiff Garcia has an excellent overview of how the debate on the labor-market slack issue has evolved. For people looking to dive in, Jérémie Cohen-Setton has a shorter look at the evidence and the arguments. Tim Duy says it was crazy that people ever thought that "overshooting" was ever in the cards. And Ryan Avent says no, it wasn't crazy -- and the case for a modest overshooting remains airtight.

Just a few odds and ends to share from me. I looked at the extent to which quits lead wages -- and found the lead time to be something on the order of a year, as you can see in this graph:

Regressing wages on 12-month lagged quits yielded a forecast for year-over-year increases in wages of about 3 percent. That was broadly in line with monthly and quarterly VAR estimates, including and excluding unemployment and consumption in the model. So something on the order of three-percent wage growth next year, and three-to-four-percent the year after that, both seems about right qualitatively and is what some simple modeling would predict.
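The lagged regression works like this in miniature. The two series below are synthetic and made up purely to show the alignment; they are not the quits or wage data, and `lagged_fit` is a hypothetical helper that handles a single regressor only.

```python
import statistics


def lagged_fit(driver, target, lag):
    """Regress target[t] on driver[t - lag]; return (intercept, slope).

    lag must be a positive number of periods.
    """
    x = driver[:-lag]
    y = target[lag:]
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic monthly series, made up for illustration: a quits rate that drifts
# upward with a small wiggle, and wage growth that follows it 12 months later.
quits = [1.5 + 0.01 * t + 0.05 * (t % 6) for t in range(60)]
wages = [2.0] * 12 + [0.5 + 1.2 * q for q in quits[:-12]]

intercept, slope = lagged_fit(quits, wages, lag=12)
forecast = intercept + slope * quits[-1]  # wage growth 12 months ahead
print(f"slope={slope:.2f}, 12-month-ahead forecast={forecast:.2f}")
```

Because wages respond to quits from a year earlier, the latest quits reading already tells you something about wage growth a year out, which is where the forecast comes from.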

That's not much to fear. What's been missed, though, is that the U.S. is not at the beginning of a normal tightening cycle. The federal funds rate is not one or two percent as it was in the 1990s and 2000s bottoms. It is zero. And there are billions of dollars in further easing beyond that.

When I look at the data and historical Fed behavior, it's hard for me to see a monetary policy that does not involve the beginning of an exit now and rate hikes in 2015. Even that incorporates some degree of overshoot -- if not in terms of inflation, as Avent wants, then certainly in terms of the unemployment rate.

It seems that, too, is the conclusion of the FOMC. The downward revision of the unemployment rate forecast now pegs full employment at the end of 2016. And the assessments of appropriate monetary policy suggest the FOMC is attached to the idea of rate hikes in 2015 and proceeding slowly from there.