How to talk about your (descriptive) regression

Filed under Data science and visualization

After running a regression, even if you just want to look at empirical correlations (i.e. you do not claim that the observed associations are causal), you will often need to "verbalize" the output. How can you talk about such tables honestly?

The simplest thing to do seems to be to transparently say: "don't read too much into this, I am just reporting some conditional expectations here".

But if you want to be more polished, Andrew Gelman just posted this cool suggestion:

I train my students to summarize regression fits using descriptive terminology. So, don’t say “if you increase x_1 by 1 with all the other x’s held constant, then E(y) will change by 0.3.” Instead say, “Comparing two people that differ by 1 in x_1 and who are identical in all the other x’s, you’d predict y to differ by 0.3, on average.” [emphasis added]

This is really good!
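To make the comparison reading concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data are simulated, the variable names (`x1`, `x2`, `person_a`, `person_b`) are hypothetical, and the fit uses plain least squares from NumPy rather than any particular regression package.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Simulated (hypothetical) data: x2 is correlated with x1,
# as covariates usually are in observational data.
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 0.3 * x1 - 0.2 * x2 + rng.normal(size=n)

# Fit y ~ 1 + x1 + x2 by ordinary least squares.
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Two hypothetical people who differ by 1 in x1 and are identical in x2.
person_a = np.array([1.0, 0.0, 0.7])  # intercept, x1 = 0, x2 = 0.7
person_b = np.array([1.0, 1.0, 0.7])  # intercept, x1 = 1, x2 = 0.7

# The difference in predictions equals the fitted coefficient on x1.
predicted_difference = person_b @ beta - person_a @ beta

summary = (
    "Comparing two people who differ by 1 in x_1 and who are identical "
    f"in x_2, you'd predict y to differ by {predicted_difference:.2f}, on average."
)
print(summary)
```

Nothing in the code "increases" x_1 for anyone: the coefficient is read off as a difference between predictions for two different people, which is all a descriptive regression supports.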

A more modest goal, of course, would be to just remember never to use language like "x raises y when other covariates are held constant" when the jump to causality is not actually justified...