Demonstrating the Frisch–Waugh–Lovell theorem with Stata

We are interested in plotting a bivariate relationship after removing (partialling out) the effects of other, less interesting variables.

A small, tractable simulation will show that after "taking out the effect of x1 on x2" (i.e., residualizing) we can recover the desired coefficient. The goal is to properly estimate the second slope, beta_2, in

$$ Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \epsilon $$

The true data-generating process is linear, and the hope is to get a coefficient estimate not too far from 5 (the true beta_2).

clear
set seed 10009
set obs 100
gen x1 = rnormal()
* Induce positive correlation between x1 and x2
gen x2 = rnormal() + .2*x1

* TRUE data-generating process
gen y = 1 + x1 + 5*x2 + rnormal()

Regress X2 on X1 and store the residuals:

* Step 1: Residualize x2
reg x2 x1 
predict resid_x2, res

The new variable resid_x2 contains the residuals, which will be used as an input in a new univariate regression; the variation due to X1 has already been taken out. The same residualization is applied to y:

* Step 2: Residualize y
reg y x1
predict resid_y, res
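
As a quick sanity check (my addition, not in the original post), both residual series should be essentially uncorrelated with x1 by construction:

* Sanity check (my addition): residuals are orthogonal to x1
corr x1 resid_x2 resid_y

The correlations with x1 should be zero up to rounding.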

FWL says that regressing the y-residuals on the x2-residuals will yield beta_2:

* THIS IS IT: FWL
* Run the univariate regression
reg resid_y resid_x2, noci cformat(%9.2f) pformat(%5.2f) sformat(%8.2f)

-----------------------------------------------------
     resid_y |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
    resid_x2 |       5.24       0.11    45.96    0.00
       _cons |       0.00       0.10     0.00    1.00
-----------------------------------------------------

A statement of the theorem with orthogonal projection matrices can be found on Wikipedia and elsewhere.
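
In sketch form (standard notation; my summary rather than the post's): let M_1 denote the annihilator ("residual-maker") matrix of X_1, where X_1 includes the constant. Then the coefficient on X_2 from the full regression can be written as a regression of residuals on residuals:

$$ M_1 = I - X_1 (X_1' X_1)^{-1} X_1' $$

$$ \hat{\beta}_2 = (X_2' M_1 X_2)^{-1} X_2' M_1 Y $$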

OK, so the coefficient is 5.2.

This is exactly the same coefficient we would have obtained from regressing Y on X1 and X2:

. reg y x1 x2, noci cformat(%9.2f) pformat(%5.2f) sformat(%8.2f)

-----------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
          x1 |       0.74       0.10     7.29    0.00
          x2 |       5.24       0.11    45.72    0.00
       _cons |       1.09       0.10    10.56    0.00
-----------------------------------------------------

If we had not controlled for the influence of X1, then the effect of X2 would have been overestimated:

* The coef. on x2 is overstated, due to OVB:
reg y x2, noci cformat(%9.2f) pformat(%5.2f) sformat(%8.2f)

-----------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
          x2 |       5.50       0.13    40.86    0.00
       _cons |       1.22       0.13     9.79    0.00
-----------------------------------------------------

The coefficient above is biased.
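
As a cross-check (standard textbook algebra, not spelled out in the post), the omitted-variable bias formula says the short-regression slope converges to

$$ \beta_2 + \beta_1 \frac{\mathrm{Cov}(X_1, X_2)}{\mathrm{Var}(X_2)} = 5 + 1 \cdot \frac{0.2}{1.04} \approx 5.19 $$

under this data-generating process (Var(X_1) = 1 and X_2 = 0.2 X_1 + u with Var(u) = 1); the 5.50 in this particular sample of 100 reflects sampling noise on top of that.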

Plotting the slopes

The FWL / residualized regression approach lets us visualize the relationship between Y and X2 where the influence of the X1 variable has already been filtered out. That relationship is displayed in the right panel in this chart, and can be compared with the unadjusted relationship, shown on the left:

[Figure: two scatterplots; left panel "Correlation between Y and X2 (uncontrolled)", right panel "FWL (controls for X1)" plotting y_residuals against x2_residuals]

The slope is, appropriately, less steep when X1 has been controlled for.

In other words, the slope better approximates the true value of the beta of interest.

The code to produce the chart above:

scatter y x2, ///
    yscale(r(-10 15)) ///
    name(n1, replace) title("Correlation between Y and X2" "(uncontrolled)")

scatter resid_y resid_x2, ///
    yscale(r(-10 15)) ///
    name(n2, replace) title("FWL" "(controls for X1)") ///
    xtitle("x2_residuals") ytitle("y_residuals")

gr combine n1 n2
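
To make the two slopes explicit, one optional variant (my addition, not in the original post) overlays a least-squares fit on each panel with twoway lfit:

* Optional (my addition): overlay fitted lines so the slopes are visible
twoway (scatter y x2) (lfit y x2), ///
    name(n1f, replace) title("Y vs. X2" "(uncontrolled)")
twoway (scatter resid_y resid_x2) (lfit resid_y resid_x2), ///
    name(n2f, replace) title("FWL" "(controls for X1)")
gr combine n1f n2f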

Final note

If we had not used the y-residuals (from regressing Y on X1) on the left-hand side, as the FWL theorem asks us to do, but instead regressed Y itself on the x2-residuals, the coefficient would still be correct; only the standard errors would be larger:

. reg y resid_x2, noci cformat(%9.2f) pformat(%5.2f) sformat(%8.2f)

-----------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|
-------------+---------------------------------------
    resid_x2 |       5.24       0.29    18.20    0.00
       _cons |       0.87       0.25     3.45    0.00
-----------------------------------------------------
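
A one-line reason (standard algebra, not in the original post): because M_1 is symmetric and idempotent, the slope from regressing Y on M_1 X_2 is algebraically identical to the FWL slope,

$$ \big( (M_1 X_2)' (M_1 X_2) \big)^{-1} (M_1 X_2)' Y = (X_2' M_1 X_2)^{-1} X_2' M_1 Y = \hat{\beta}_2 $$

while the standard error grows because the residuals of this regression still contain the beta_1 X_1 variation in Y.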

The code to reproduce what's in this post is on GitHub.
