Using Stata: Bar charts with multiple groups using by() and over()

Let's compare Q1 GDP growth vs. the rest of each year, starting in 2009:

Here is the code to make the above chart:

graph bar ann_growthQ1 ann_growthRest, ///
bargap(5) ///
graphregion(color(white)) ///
over(year, gap(50) label(angle(45))) /// 
ytitle("Real GDP growth (percent)") ///
ylabel(, angle(horizontal)) ///
bar(1, color(red*0.3)) bar(2, color(blue*0.7)) ///
legend(label(1 "First quarter") label(2 "Other quarters") rows(2) ring(0) pos(4) region(lcolor(white))) ///
title("GDP growth in Q1 vs. average of Q2-Q4") ///
nofill
graph export gdp-bar.png, width(1200) replace

That's an improved version of what we'd get with graph bar ann_growthQ1 ann_growthRest, over(year):

How to get the data

freduse GDPC1
gen dd=yq(year(daten), quarter(daten))
tsset dd, quarterly

* Compute annualized quarterly growth:
gen ann_growth = ((GDPC1/l.GDPC1)^4-1)*100

* Generate variables for observation tagging (substr(date,6,2) is the month, so "01" marks Q1):
gen quarter = substr(date,6,2)
gen q1 = (quarter == "01")
gen year = substr(date,1,4)
destring year, replace

* Compute average growth rates in Q1 and the remainder of each year
egen avg = mean(ann_growth), by(year q1)

keep if year>=2009 // restrict the sample to crisis and post-crisis years

collapse (mean) ann_growth, by(year q1)

reshape wide ann_growth, i(year) j(q1)

rename ann_growth1 ann_growthQ1 
rename ann_growth0 ann_growthRest
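
The string parsing above relies on date being text in year-month-day order, as the substr() positions imply. As a hedged alternative, the year and the first-quarter flag can be derived from the numeric date daten that the yq() line already uses. This is a sketch, assuming it replaces the substr() and destring lines and runs before collapse:

```stata
* Alternative sketch: derive year and the Q1 flag from the numeric date
* (assumes daten is the daily date variable used in yq() above)
gen year = year(daten)
gen q1   = (quarter(daten) == 1)

* Sanity check: each complete year should show one Q1 row and three others
tab year q1

* Annualization example: 0.5 percent quarterly growth compounds to about 2.02 percent
display ((1.005)^4 - 1)*100
```

The rest of the pipeline (egen, keep, collapse, reshape) should not need to change, since it only refers to year and q1.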

Other ways to show the data

If you do not reshape the data, you can use over() twice, like this:

graph bar ann_growth if year >=2008, ///
graphregion(color(white)) ///
over(year,label(angle(45) labsize(small))) ///
over(q_other, relabel(1 "Q1" 2 "Average of Q2-Q4")) ///
ytitle("Real GDP growth (percent)") ///
ylabel(, angle(horizontal)) ///
title("Seasonally-adjusted GDP growth" "early vs. late in the year") ///
nofill ///
intensity(*.7) 
graph export gdp-over.png, width(1200) replace

This gets us:
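
Note that q_other is not created in the data step shown earlier. A minimal sketch, assuming it is simply the complement of q1 (0 for first quarters, 1 for the rest) and that the data are still in long form, that is, after collapse but before reshape:

```stata
* Assumption: q_other is the complement of the q1 flag built earlier
gen q_other = 1 - q1      // 0 = first quarter, 1 = other quarters

* Sanity check: the two flags should never take the same value
tab q_other q1
```

With this coding, relabel(1 "Q1" 2 ...) lines up as intended, because relabel() refers to groups by position and the smallest value of q_other (0) comes first. If you reshape wide later, it is probably safest to drop q_other beforehand, since reshape expects every variable other than the stub to be constant within year.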

Combining over() and by() is a bit more complicated because I haven't seen a way to declare labels inside by(), so I labeled the groups before creating the chart:

label define qo 0 "First quarter" 1 "Other quarters"
label values q_other qo

graph bar ann_growth if year >=2008, ///
graphregion(color(white)) ///
over(year,label(angle(45) labsize(small))) /// make the x-axis readable by changing the angle and decreasing font size
by(q_other, cols(2) note("")) /// change to cols(1) if stacked exhibits are preferred; note("") removes the default note on groups
ytitle("Real GDP growth (percent)") ///
ylabel(, angle(horizontal)) ///
nofill ///
intensity(*.7) // stylistic
graph export gdp.png, width(1200) replace

This gives us:
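
With by(), Stata draws one panel per group and assembles them into a single figure, so options meant for the whole figure, such as an overall title or the background color, go inside by() rather than next to it. A hedged sketch, assuming q_other exists and is labeled as above (the title text is borrowed from the earlier over() example):

```stata
* Sketch: whole-figure options placed inside by()
graph bar ann_growth if year >= 2008, ///
    over(year, label(angle(45) labsize(small))) ///
    by(q_other, cols(2) note("") ///
        title("Seasonally-adjusted GDP growth") /// one overall title, not one per panel
        graphregion(color(white))) /// background of the combined figure
    ytitle("Real GDP growth (percent)") ///
    ylabel(, angle(horizontal)) ///
    nofill intensity(*.7)
```

Options left outside by(), like ytitle() here, are applied to each panel.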

Finally, we could show the above with a dot plot, which would need a bit more work:

graph dot ann_growth if year >=2008, ///
over(q_other, relabel(1 "Q1" 2 "Q2-Q4")) ///
over(year, label(angle(45) labsize(small))) ///
ytitle("Real GDP growth (percent)") ///
graphregion(color(white)) nofill
*graph save Graph dot,gph
graph export dot2.png, width(1200) replace
*marker(1,mcolor(purple))

It seems like a better idea to reshape the dataset, rather than using over() twice:

reshape wide ann_growth, i(year) j(q1)
rename ann_growth1 ann_growthQ1 
rename ann_growth0 ann_growthRest
graph dot ann_growthQ1 ann_growthRest if year >=2009, ///
over(year) ///
ytitle("Real GDP growth (percent)") ///
graphregion(color(white)) nofill ///
marker(1,mcolor(purple*.5) msize(*1.5)) ///
marker(2,mcolor(midgreen) msize(*1.5)) ///
legend(label(1 "Q1 growth") label(2 "Q2-Q4 growth"))
graph export dot-plot.png, width(1200) replace
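
Because reshape changes the dataset in memory, the long-format charts above cannot be rerun afterwards without rebuilding the data. One way around this, sketched here under the assumption that the long-format data (year, q1, ann_growth) are in memory, is to wrap the wide-format chart in preserve and restore:

```stata
preserve                         // snapshot the long-format data
capture drop q_other             // only relevant if this helper flag was created
reshape wide ann_growth, i(year) j(q1)
rename ann_growth1 ann_growthQ1
rename ann_growth0 ann_growthRest
graph dot ann_growthQ1 ann_growthRest, over(year) nofill
restore                          // back to long form for the over()/by() charts
```

The styling options from the full dot plot above can be added to the graph dot line unchanged.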