Common Stata commands    Explanation
set more off
set virtual on    Turn on virtual memory
di exp(3.567)    di = display
Browse the data at any time
tabmiss x1 x2 (findit tabmiss)    Shows the frequency and proportion of missing values (MV)
brow var1 var2 (if ….)    Looks like the editor window, but cannot edit
listblck in 1/10, repeat(1) (findit listblck)    Like list, but with a more compact layout
repeat(1/n) => the first 1 (n) variable(s) are repeated after row 2
(findit univar)
univar chinese math science, boxplot    univar (= sum), but also reports q25, median, q75
, by(gender) onehdr    Get a table with one header
univar math, by(gender) onehdr boxplot onescale    onescale is needed for the boxplots to be comparable
Summary Statistics & Tables
sum    Summarize all variables (mean, SD, frequency)
We can use if, e.g. (if crime==1)
tab x1, sort miss    tab = tabulate
(sort = order by the distribution; miss = list the distribution of MV as well)
ta x1 x2, chi2 miss    chi2 = Pearson chi-square test of independence
, nof column (no frequency / column percentage)
, row (row percentage)
, all (all available statistics)
, exact (Fisher's exact test)
ta maage_group, plot
tab1 x1 x2 x3 x4    = tab x1 / tab x2 / …
tab2 x1 x2 x3 x4    Tabulate all possible two-way tables
ta paedu, sum(crime)    By levels of paedu, summarize crime
tabstat score, stats(mean sd n max min…) by(subject)    Also median, p10, p25, iqr, q…
iqr = interquartile range = p75 - p25
q = quartiles = the same as specifying p25 p50 p75
table x1 x2, contents(mean y1 median y2)    Also min, max, etc.
Data Management
gen id=_n (then do something else)
sort id    To return to the earlier order
brow var1 var2 (if ….)    Looks like the editor window, but cannot edit
edit var1 var2 var3 (if…)
label variable bw "birth weight"
drop if id==id[_n-1] & birthday==birthday[_n-1]    Or just replace delete=1, so that nothing is actually deleted
format id %9.0f    When there are too many digits to display in full
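The flag-instead-of-drop idea can be tried on a tiny made-up dataset. This is only a sketch for a do-file: the three rows and the variable name delete are illustrative assumptions, not data from these notes.

```stata
* Toy data (hypothetical values): rows 1 and 2 are exact duplicates
clear
input id birthday
1 100
1 100
2 200
end

sort id birthday
* Flag a row when it repeats the id and birthday of the row above it
generate byte delete = (id == id[_n-1] & birthday == birthday[_n-1])
list

* Drop only once the flags have been checked
drop if delete
assert _N == 2
```

Keeping the flag lets you inspect the suspected duplicates before removing anything.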
encode region, gen(region2)    Generates a labeled numeric variable from a string variable
tab region2 (looks the same but…)
tab region2, nolabel (now we see the numeric values)
mvdecode    numeric value => MV
mvencode    MV => numeric value
egen zscore=std(x)    Standardized score (mean=0, variance=1)
egen avg=rmean(Chinese, English, math)    Row mean, ignoring MV
egen sum=rsum(x,y,z)    Row sum, MV treated as 0
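A short do-file sketch of these egen functions, assuming the auto dataset that ships with Stata (the variables are picked only for illustration, so the averages have no substantive meaning). It uses rowmean() and rowtotal(), the names that newer Stata versions document for rmean() and rsum().

```stata
sysuse auto, clear

egen zprice = std(price)               // standardized score
summarize zprice                       // mean close to 0, SD close to 1

egen avg = rowmean(price mpg rep78)    // row mean over the nonmissing values
egen tot = rowtotal(price mpg rep78)   // row sum, missing treated as 0

* rep78 is missing for a few cars: avg ignores it, tot counts it as 0
list make price mpg rep78 avg tot if missing(rep78)
```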
list population region, nolabel    Display the numeric values instead of the labels
(only for labeled numeric variables, not string variables)
[Grouping]
sort var
gen varnew=group(5)    Split into five groups with the same number of cases
egen iicat=cut(ii), at(10, 40, 70, 90)    Splits into three groups: 10, 40, 70. The upper limit (e.g. 90) is not included; values not covered => MV
table iicat, contents(min ii max ii)    => check
egen iicat=cut(ii), at(10, 40, 70, 90) icodes    => becomes three groups coded 0, 1, 2
egen iicat=cut(ii), at(10, 40, 70, 90) label    => same as icodes, but adds labels (10- 40- 70- )
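The three variants of egen cut() can be compared side by side. This is a sketch for a do-file, assuming the auto dataset shipped with Stata. The cut points 10, 20, 30, 50 are arbitrary and chosen to cover the range of mpg. tabstat is used for the check because it works the same way across Stata versions.

```stata
sysuse auto, clear

* Groups are [10,20), [20,30), [30,50); each takes its left endpoint as value
egen mpgcat = cut(mpg), at(10, 20, 30, 50)
tabstat mpg, by(mpgcat) stats(min max n)     // check the group boundaries

* icodes: the groups are coded 0, 1, 2 instead
egen mpgcode = cut(mpg), at(10, 20, 30, 50) icodes
tab mpgcode

* label: same codes as icodes, with value labels attached
egen mpglab = cut(mpg), at(10, 20, 30, 50) label
tab mpglab
tab mpglab, nolabel
```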
local x "st2 st3 " [for later use: type `x']    Define a long string
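A minimal sketch of how such a local macro is reused, assuming the auto dataset shipped with Stata; the variable list is illustrative. Run the lines together in one do-file, since a local macro disappears when the do-file (or the selected block) finishes.

```stata
sysuse auto, clear

* Store a long variable list once
local x "mpg weight length"

* `x' expands to: mpg weight length
summarize `x'
regress price `x'
```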
Importing data from other programs
infile str30 place population x score using test.raw    A string variable must be preceded by str#, where # is the number of characters
Excel => Stata data
(clean the Excel data so that it follows the Stata data format)
(save the Excel file as a .csv file)
insheet using "c:/data/test.csv"
infix
reshape?
collapse?
Compare groups
ttest college, by(male)
Regression
by region3, sort: reg score paedu    Same as: sort region3, then by region3: reg score paedu
reg y x1 x2 x3, beta    Standardized regression
Stepwise regression:
sw reg Y x1 x2 x3 x4 x5….., pr(.05)    pr = p to retain (backward elimination); it removes the nonsignificant Xs on its own
sw reg Y x1 x2 x3 x4 x5….., pe(.05)    pe = p to enter
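In newer Stata versions stepwise is written as a prefix command. The sketch below assumes the auto dataset shipped with Stata, and the choice of predictors is illustrative only.

```stata
sysuse auto, clear

* Backward elimination: start with all Xs, remove terms with p >= .05
stepwise, pr(.05): regress price mpg weight length turn

* Forward selection: start empty, add terms with p < .05
stepwise, pe(.05): regress price mpg weight length turn
```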
After regression…
predict yhat
predict e, resid    Residual
sort e
list v1 v2 v3… in 1/10 (or in -10/l) (l = last, not the digit one)    We can examine where the model fits poorly…
lstat ?    Correct classification rate
listcoef, help (need to search for & install Long's spostado)    Lists the standardized coefficients of X (& Y)
After logistic regression
Likelihood-ratio test:
est store full
quietly logistic y x (nested model)
lrtest full
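The full sequence, including the step that fits the full model first, can be sketched as follows. It assumes the auto dataset shipped with Stata; the outcome and predictors are illustrative.

```stata
sysuse auto, clear

* Full model, stored under the name "full"
logit foreign mpg weight
estimates store full

* Nested model: drops weight
quietly logit foreign mpg

* Compares the stored full model with the model just fitted
lrtest full
```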
logit y x
predict phat    => phat = predicted p = exp(a+bx)/[1+exp(a+bx)]
graph twoway connected phat x, sort
predict q, xb    => xb = log odds = ln(p/(1-p))
predict phat
graph twoway mspline phat x2
adjust, by(var1) exp    => odds when var1=1,2,3..; p/(1-p) = odds (when var1=n); each value = the previous one * exp(b)
adjust, by(var1) pr    => p(y) when var1=1,2,3..
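The relation between xb and the predicted probability can be checked numerically. This do-file sketch assumes the auto dataset shipped with Stata; the variable names phat, xb, and p2 are illustrative.

```stata
sysuse auto, clear
logit foreign mpg

predict double phat         // predicted probability (the default)
predict double xb, xb       // linear prediction a + b*x = log odds

* Rebuild p by hand from the log odds
generate double p2 = exp(xb) / (1 + exp(xb))
assert reldif(phat, p2) < 1e-8

* Predicted probability against x
twoway connected phat mpg, sort
```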
Interpreting an interaction term: B1 (Main) + B2 (Main x dummy interaction)
For the group (dummy=1), the odds ratio of Main is exp(B1) * exp(B2)
logistic y var1 var2 inter
lincom var1+inter    Get the point estimate & CI of a combination of coefficients
lincom [2]lbw+[2]inter10, or (for mlogit) ([2] = model)
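A sketch of the lincom step on real data, assuming the lbw example dataset that webuse downloads from the Stata website (internet access needed). Using age as the main variable and smoke as the dummy is an illustrative choice, not a model from these notes.

```stata
webuse lbw, clear

* Interaction between the main variable (age) and the dummy (smoke)
generate inter = age * smoke
logistic low age smoke inter

* Odds ratio of age for smokers (smoke = 1): exp(B1 + B2), with its CI
lincom age + inter, or
```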
A convenient way to get predicted probabilities
prchange (findit spost)    Changes in predicted probability
prchange, fromto help (help: adds explanatory notes)
prtab
prtab, x(paedu=1 maedu=1) rest(min)
Predicted probability in n*n table
prgen ii, f(30) t(60) gen(ff) x(male=0)    Effect of a continuous variable on y=1
prgen ii, f(30) t(60) gen(mm) x(male=1)    (automatically takes n [default=11] points within the range to compute p)
twoway (connected ffp1 ffx) (connected mmp1 mmx)
When there is an interaction term……
xi3: logit y i.x1*male    => the male effect differs across the categories of x1
postgr3 male, by(x1) table (very useful for obtaining p)
postgr3 ii, by(area) (continuous variables also work)
mlogit
mlogit y x1 x2, rrr nolog ba(2)
(ref group=> y=2)
rrr = relative risk ratio (= OR)
Output
outreg using test.doc, nolabel replace (findit outreg & install)    Then convert the text into a table; when saving, click No and save as a new file
outreg using test.doc, nolabel append    append = model 2 added on to M1
outreg var1 var2 using test.xls, replace 10pct coefastr se    You can specify which coefficients to list
(se = std. error instead of t statistics) (10pct: + p<.1) (coefastr: * added on the coefficients)
log using myfile.smcl, replace (don't use set)
Finally: log2html myfile.smcl, replace (findit log2html first)    => the results can be saved as HTML
Graph
graph dir    List all the graph files
graph use gender_gap
graph save filename    i.e., filename.gph is saved
erase filename.gph
Other
sgmediation var_y, mv(varx1) iv(varx2)
[Sobel-Goodman tests: use findit first]    Tests whether a mediator carries the influence of an IV to a DV.
Time savers
program define shortcut
command 1 … command 2
end
shortcut (runs command 1, 2.. in one go)    shortcut = the program name we set => shortcut itself becomes a command
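A concrete version of this pattern for a do-file, assuming the auto dataset shipped with Stata. The two commands inside the program are placeholders for whatever you run repeatedly.

```stata
* Remove an earlier definition, if any, so the do-file can be rerun
capture program drop shortcut

program define shortcut
    summarize price mpg
    tabulate foreign
end

sysuse auto, clear
shortcut        // runs both commands above
```

The capture program drop line is there because Stata refuses to define a program whose name is already in use.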
Most frequently used
list, gen, recode, replace, rename, sort, drop, keep,
order……
merge, append _merge=1 (from master data), 2=from using
data…