Josh Quan - joshua.quan@tufts.edu
Tisch Library
tinyurl.com/tischstata
Open Stata: Windows -> Data & Statistical Applications -> Stata 14
In Stata: File -> Change Working Directory...
Highly recommended: keep a do-file and a log file of your sessions
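A minimal sketch of logging a session from the command line. The file name session.log is a hypothetical placeholder, not something the workshop specifies.

```stata
* Start a plain-text log; "session.log" is a hypothetical file name
log using session.log, replace text

* Confirm which working directory Stata is using
pwd

* ... commands typed here are recorded in the log ...

* Close the log when the session is finished
log close
```

Typing doedit opens the Do-file Editor, where the same commands can be saved and rerun later.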
How Many Variables?
Six Variables: region, country, popgrowth, lexp, gnppc, safewater
/* this is a comment, does not run */
use lifeexp.dta
/* basic information about data types */
describe
/* basic descriptive statistics about our dataset */
summarize
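One way to check the variable count directly, sketched here on the assumption that lifeexp.dta is in the working directory:

```stata
* Load the data, replacing anything already in memory
use lifeexp.dta, clear

* Compact header: number of observations and variables only
describe, short

* The same counts are available as system values
display "variables: " c(k)
display "observations: " c(N)
```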
browse
Why are values different colors?
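In Stata's default Data Browser colors, string values appear in red, plain numeric values in black, and numeric values with value labels in blue. A short sketch for checking which case applies to a variable, assuming lifeexp.dta is in the working directory:

```stata
use lifeexp.dta, clear

* Storage type (str vs. numeric) and any attached value label
describe region country

* Show the value labels defined in this dataset
label list
```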
Matrix plots
graph matrix lexp safewater gnppc popgrowth
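The matrix plot can be paired with a correlation table for the same variables. A sketch, assuming lifeexp.dta is in the working directory:

```stata
use lifeexp.dta, clear

* Draw only the lower triangle of the scatterplot matrix
graph matrix lexp safewater gnppc popgrowth, half

* Correlation matrix (uses only observations complete on all four variables)
correlate lexp safewater gnppc popgrowth

* Pairwise correlations with significance levels
pwcorr lexp safewater gnppc popgrowth, sig
```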
regress y x   (y = dependent variable, x = independent variable)
regress lexp safewater
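After regress runs, the estimates stay in memory and can be pulled out by name. A minimal sketch; the new variable names lexp_hat and lexp_res are hypothetical:

```stata
use lifeexp.dta, clear
regress lexp safewater

* Stored coefficients and fit statistics
display "slope: " _b[safewater]
display "intercept: " _b[_cons]
display "R-squared: " e(r2)

* Fitted values and residuals (names are hypothetical)
predict lexp_hat
predict lexp_res, residuals
```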
Scatter plot with fit line
twoway scatter lexp safewater || lfit lexp safewater
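The same plot can be written with parentheses instead of ||, and lfitci adds a confidence band around the fitted line. A sketch; the axis titles are assumptions about what the variables measure:

```stata
use lifeexp.dta, clear

* Confidence band first so the points are drawn on top of it
twoway (lfitci lexp safewater) (scatter lexp safewater), ytitle("Life expectancy") xtitle("Access to safe water")
```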
clear
use heights.dta
summarize
Six variables: earn, height, gender, ed, age, race
Is there a significant relationship between height and salary?
regress earn height
Interpret the output on your own or with a neighbor
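When reading the output, the coefficient on height and its p-value answer the question directly. A sketch of retrieving both after the regression, assuming heights.dta is in the working directory:

```stata
use heights.dta, clear
regress earn height

* F test that the height coefficient equals zero
test height

* Estimated change in earn for a one-unit increase in height
display "height coefficient: " _b[height]
```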
Multiple Regression
regress earn height ed
What happens to our coefficients when we add variables to the model?
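One way to compare coefficients across models is to store each set of estimates and tabulate them side by side. A sketch; the names m1 and m2 are hypothetical:

```stata
use heights.dta, clear

* Simple regression
regress earn height
estimates store m1

* Add education as a second predictor
regress earn height ed
estimates store m2

* Coefficients, standard errors, N, and R-squared in one table
estimates table m1 m2, b(%9.2f) se(%9.2f) stats(N r2)
```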
Multiple Regression with a Categorical Variable
regress earn height ed gender
/* this fails: gender is a string variable and must be converted to a numeric dummy first */
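/* one way to create a dummy variable */
gen female = (gender == "female")
/* and another (drop the first version so the name is free again) */
drop female
gen female = 0
replace female = 1 if gender == "female"
/* yet another: xi builds the dummy from the string variable on the fly */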
xi: regress earn i.gender
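Two further options, sketched on the assumption that gender is stored as a string (as the comparisons to "female" imply). The names gender_n and g_ are hypothetical:

```stata
use heights.dta, clear

* Convert the string to a labeled numeric variable
encode gender, generate(gender_n)

* Factor-variable notation then creates the indicator automatically
regress earn height ed i.gender_n

* tabulate can also generate one dummy per category (g_1, g_2, ...)
tabulate gender, generate(g_)
```

encode numbers the categories alphabetically, so it is worth checking with label list which category ends up as the base level.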
regress earn height ed female
outreg2
outreg2 using model.doc
regress earn height
regress earn height ed age
outreg2 using model.doc, append
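outreg2 is a user-written command, so it has to be installed once before first use. A sketch of the full export sequence; the column titles are hypothetical:

```stata
* One-time install from the SSC archive
ssc install outreg2

use heights.dta, clear

* First model: start a fresh file
regress earn height
outreg2 using model.doc, replace ctitle(Model 1)

* Second model: add a column to the same file
regress earn height ed age
outreg2 using model.doc, append ctitle(Model 2)
```

The file model.doc is written to the current working directory.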
help regress
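Related help entries that may be useful after this session:

```stata
* Commands available after fitting a regression (predict, test, ...)
help regress postestimation

* Options for the scatter and fit-line plots
help twoway
```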