Maximum Likelihood Estimation Example

Updated: 2023-06-16 23:04:13

ML ESTIMATION IN STATA
Many estimation procedures in Stata are based on the principle of Maximum Likelihood (ML). For many common estimators (linear regression with OLS, 2SLS, SUR), knowledge of ML is not strictly necessary, since their matrix derivation gives the same result as the ML outcome. However, there are also techniques that make explicit use of ML. Examples are non-linear discrete choice models (e.g. Logit, Probit, Multinomial Logit), models for time-series data (e.g. ARIMA) and binary choice models for panel data (fixed effects logit model).
The flexibility of Stata is mainly due to the fact that most procedures (estimators, tests, basic statistics, but also graphs) are programmed in so-called ado-files. You can also program your own ado-files if Stata doesn’t support an estimator or test that you need for your research. Of course, this requires knowledge of programming in Stata.
Using ML in Stata also requires that you write your own (small) programs for your particular problem. In short, there are four steps to go through in order to do ML estimation in Stata:
1. Derive the log-likelihood function from your probability model. This is done on paper and of course depends on the assumptions you make about the distribution underlying the data (e.g. normal or logistic).
2. Write a program that calculates the log-likelihood values and, optionally for difficult models, their derivatives in an ado-file. This program is known as a likelihood evaluator.
3. Identify a particular model to fit using your data variables and the ml model statement.
4. Fit the model using ml maximize.
Let’s go through the steps for a well-known non-linear model, the logit model. In a logit model the dependent variable typically has only two values, 0 or 1. Applications include consumer choice behavior, investment decisions, etc. In this example we try to explain why some farmers have chosen organic farming and others have not. A dataset with 473 observations on dairy farmers in the period 1994-1999 is available. It contains the following variables.
Variable    Description and units
biodum      1 if organic farmer, 0 if conventional farmer
age         years
succ        1 if there is a successor, 0 if not
tenure      1 if more than half of the land is rented, 0 if not
clay        1 if major soil type is clay, 0 if not
educ        1 if farmer has higher education, 0 otherwise
sizequo     dairy production quota in 100,000 kg
sizeha      acreage in hectares
animalha    number of animals per ha
prof        profits in 100,000 Euros
1. Derive the log-likelihood function from your probability model.
Pr(y = 1) is defined by the cumulative distribution function of the logistic distribution that underlies the logit model:

$$\Pr(y = 1) = \frac{1}{1 + e^{-Xb}}$$

Since we only observe the values 1 or 0 for the variable biodum, we can also define an expression for Pr(y = 0):

$$\Pr(y = 0) = 1 - \frac{1}{1 + e^{-Xb}} = \frac{1 + e^{-Xb}}{1 + e^{-Xb}} - \frac{1}{1 + e^{-Xb}} = \frac{e^{-Xb}}{1 + e^{-Xb}}$$

This leads to the following definition of the log-likelihood for the j-th observation in the logit model:

$$\ln l_j = \begin{cases} \ln\left(\dfrac{1}{1 + e^{-Xb}}\right) = -\ln\left(1 + e^{-Xb}\right) & \text{if } y_j = 1 \\[2ex] \ln\left(\dfrac{e^{-Xb}}{1 + e^{-Xb}}\right) = \ln e^{-Xb} - \ln\left(1 + e^{-Xb}\right) = -Xb - \ln\left(1 + e^{-Xb}\right) & \text{if } y_j = 0 \end{cases}$$
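The two branches of the log-likelihood can be checked numerically. The sketch below is in Python purely for illustration (it is not Stata code, and the function names `logit_loglik` and `prob_one` are hypothetical). It evaluates both branches and confirms that they equal the logarithm of Pr(y = 1) and Pr(y = 0) computed directly from the logistic formula.

```python
import math

def prob_one(xb):
    """Pr(y = 1) = 1 / (1 + exp(-Xb)) for a given linear index Xb."""
    return 1.0 / (1.0 + math.exp(-xb))

def logit_loglik(xb, y):
    """Log-likelihood contribution of one observation in a logit model."""
    if y == 1:
        return -math.log(1.0 + math.exp(-xb))
    if y == 0:
        return -xb - math.log(1.0 + math.exp(-xb))
    raise ValueError("y must be 0 or 1")

# Compare the closed-form branches with the log of the probabilities
# for a few arbitrary values of the linear index.
for xb in (-2.0, 0.0, 1.5):
    p = prob_one(xb)
    assert abs(logit_loglik(xb, 1) - math.log(p)) < 1e-12
    assert abs(logit_loglik(xb, 0) - math.log(1.0 - p)) < 1e-12
```

Because the two probabilities sum to one, exponentiating the two branches for the same Xb and adding them should also give 1.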
2. Write a program (Stata ado file) that calculates the log-likelihood values
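program define mylogit
    * lnf: variable that receives the log-likelihood of each observation
    * Xb:  variable holding the linear index of each observation
    args lnf Xb
    quietly replace `lnf' = -ln(1+exp(-`Xb')) if $ML_y1==1
    quietly replace `lnf' = -`Xb' - ln(1+exp(-`Xb')) if $ML_y1==0
end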
3. Identify a particular model to fit using your data variables and the ml model statement
ml model lf mylogit  (biodum = age succ tenure clay educ sizeha sizequo animalha prof)
4. Fit the model using ml maximize.
. ml maximize
initial:      log likelihood = -327.85862
alternative:  log likelihood = -284.23841
rescale:      log likelihood = -268.17278
Iteration 0:  log likelihood = -268.17278
Iteration 1:  log likelihood = -185.84208
Iteration 2:  log likelihood = -161.98223
Iteration 3:  log likelihood =  -159.3928
Iteration 4:  log likelihood = -159.37492
Iteration 5:  log likelihood = -159.37492
                                                  Number of obs   =        473
                                                  Wald chi2(9)    =      95.39
Log likelihood = -159.37492                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
      biodum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0271121   .0161332    -1.68   0.093    -.0587326    .0045085
        succ |  -.3254785    .438487    -0.74   0.458    -1.184897    .5339403
      tenure |  -.3761036   .5109148    -0.74   0.462    -1.377478    .6252709
        clay |   .2660545   .3076138     0.86   0.387    -.3368576    .8689665
        educ |    2.66085   .4142379     6.42   0.000     1.848958    3.472741
      sizeha |  -.0195829   .0141403    -1.38   0.166    -.0472974    .0081316
     sizequo |  -.3481513   .1456595    -2.39   0.017    -.6336387    -.062664
    animalha |  -3.351957   .4730748    -7.09   0.000    -4.279167   -2.424748
        prof |   .6485803   .2552814     2.54   0.011      .148238    1.148923
       _cons |   6.221396   1.235341     5.04   0.000     3.800171     8.64262
------------------------------------------------------------------------------
The results are exactly the same as those we would get using the logit command available in Stata.
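To make the iteration log less of a black box, here is a minimal Python sketch of Newton-Raphson maximization of a logit log-likelihood. It is only an illustration under stated assumptions: Stata's default optimizer is a modified Newton-Raphson algorithm with safeguards that this sketch lacks, the model has just an intercept and one regressor so that the 2x2 system can be solved by hand, and the data are a hypothetical toy sample, not the dairy-farm dataset. All function names are hypothetical.

```python
import math

def loglik(b0, b1, xs, ys):
    """Sum of per-observation logit log-likelihoods (same two branches as mylogit)."""
    total = 0.0
    for x, y in zip(xs, ys):
        xb = b0 + b1 * x
        base = -math.log(1.0 + math.exp(-xb))
        total += base if y == 1 else -xb + base
    return total

def score(b0, b1, xs, ys):
    """Gradient of the log-likelihood: sum of (y - p) * (1, x)."""
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        g0 += y - p
        g1 += (y - p) * x
    return g0, g1

def newton_logit(xs, ys, max_iter=25, tol=1e-10):
    """Plain Newton-Raphson for a logit with an intercept and one regressor."""
    b0 = b1 = 0.0
    history = [loglik(b0, b1, xs, ys)]
    for _ in range(max_iter):
        g0, g1 = score(b0, b1, xs, ys)
        # Negative Hessian: sum of p(1-p) * [[1, x], [x, x^2]].
        a11 = a12 = a22 = 0.0
        for x in xs:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            a11 += w
            a12 += w * x
            a22 += w * x * x
        # Solve the 2x2 system (negative Hessian) * step = gradient.
        det = a11 * a22 - a12 * a12
        d0 = (a22 * g0 - a12 * g1) / det
        d1 = (a11 * g1 - a12 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        history.append(loglik(b0, b1, xs, ys))
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1, history

# Hypothetical toy data, chosen so that the outcomes are not perfectly separated.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1, history = newton_logit(xs, ys)
for it, ll in enumerate(history):
    print(f"Iteration {it}: log likelihood = {ll:.5f}")
```

As in the Stata log, the log-likelihood should stop changing once the gradient is numerically zero, which is the stopping condition used here in the form of a negligible step size.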
The results already give the outcome of a Wald test of $H_0: \beta_{age} = \beta_{succ} = \dots = \beta_{prof} = 0$. We could test other hypotheses using the straightforward LR test. E.g., to test whether personal characteristics matter, we can test the null hypothesis $\beta_{age} = \beta_{succ} = \beta_{educ} = 0$. Estimate the model again with the three variables omitted:
Identify the new model to fit using your data variables and the ml model statement:
ml model lf mylogit  (biodum = tenure clay sizeha sizequo animalha prof)
ml maximize
                                                  Number of obs   =        473
                                                  Wald chi2(6)    =      79.64
Log likelihood = -195.21467                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
      biodum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      tenure |   .7022649   .4007155     1.75   0.080    -.0831231    1.487653
        clay |  -.0454245   .2667705    -0.17   0.865    -.5682852    .4774362
      sizeha |   .0052992   .0123222     0.43   0.667    -.0188519    .0294504
     sizequo |  -.3474167     .14042    -2.47   0.013    -.6226347   -.0721986
    animalha |  -2.390138     .39364    -6.07   0.000    -3.161658   -1.618618
        prof |   .3865246   .2174228     1.78   0.075    -.0396164    .8126655
       _cons |   3.871129   .8605505     4.50   0.000     2.184481    5.557777
------------------------------------------------------------------------------
The LR test statistic is $LR = 2\left[-159.37 - (-195.21)\right] = 71.68 > \chi^2_3 = 7.81$, so we firmly reject this null hypothesis.
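The arithmetic of the LR test can be reproduced from the two reported log-likelihood values. The Python sketch below is an illustration only; the function names are hypothetical, and the upper-tail probability uses a closed-form expression that is valid for a chi-squared distribution with exactly 3 degrees of freedom, not a general routine.

```python
import math

def lr_statistic(ll_unrestricted, ll_restricted):
    """Likelihood-ratio statistic: twice the gap between the two log-likelihoods."""
    return 2.0 * (ll_unrestricted - ll_restricted)

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom.

    Closed form for df = 3 only: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

ll_full = -159.37492        # log likelihood of the model with all nine regressors
ll_restricted = -195.21467  # log likelihood with age, succ and educ omitted

lr = lr_statistic(ll_full, ll_restricted)
p_value = chi2_sf_df3(lr)
print(f"LR = {lr:.2f}, p-value = {p_value:.3g}")
```

Three restrictions are tested, so the statistic is compared with the chi-squared distribution with 3 degrees of freedom, whose 5% critical value is the 7.81 quoted in the text.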
We could also have tested this with an LM test. For this we need the cross-products of the residuals $\hat\varepsilon$ with x and z. First, the residuals $\hat\varepsilon$ were created based on the estimated model (for an LM test, this is the model estimated under the null hypothesis). Using the estimated ML model, we made predictions, stored here in biodumf, and subtracted them from the observed values:
gen e=biodum-biodumf
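As an illustration of what such a residual looks like, the Python sketch below computes a predicted probability and the two possible residuals for a single farm. It assumes, as an LM test requires, that the predictions come from the restricted model, and it uses the restricted-model coefficients reported in the output above. The farm's characteristics are hypothetical values chosen only for the example, and the function name is hypothetical as well.

```python
import math

# Coefficients of the restricted model, copied from the reported output.
RESTRICTED_COEF = {
    "tenure": 0.7022649,
    "clay": -0.0454245,
    "sizeha": 0.0052992,
    "sizequo": -0.3474167,
    "animalha": -2.390138,
    "prof": 0.3865246,
    "_cons": 3.871129,
}

def predicted_probability(farm, coef=RESTRICTED_COEF):
    """Pr(biodum = 1) = 1 / (1 + exp(-Xb)) for one farm."""
    xb = coef["_cons"] + sum(coef[name] * value for name, value in farm.items())
    return 1.0 / (1.0 + math.exp(-xb))

# Hypothetical farm: owned land, clay soil, 40 ha, 400,000 kg quota,
# 2 animals per ha, 50,000 Euros profit.
farm = {"tenure": 0, "clay": 1, "sizeha": 40.0, "sizequo": 4.0,
        "animalha": 2.0, "prof": 0.5}

p_hat = predicted_probability(farm)
residual_if_organic = 1 - p_hat       # e = biodum - biodumf when biodum = 1
residual_if_conventional = 0 - p_hat  # e = biodum - biodumf when biodum = 0
print(f"predicted probability = {p_hat:.4f}")
```

The residual is always the observed 0/1 outcome minus a probability, so it lies strictly between -1 and 1.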
Next, the cross-products $\hat\varepsilon_i x_i'$ and $\hat\varepsilon_i z_i'$ were created, where the vector $z_i$ contains the omitted variables age, educ, and succ:
gen e_age=e*age
gen e_succ=e*succ
gen e_tenure=e*tenure
gen e_clay=e*clay
gen e_educ=e*educ
gen e_sizeha=e*sizeha
gen e_sizequo=e*sizequo
gen e_animalh=e*animalh
gen e_prof=e*prof
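We can do the LM test in a straightforward fashion by estimating the following equation without an intercept:

$$1 = \gamma_1\,\hat\varepsilon \cdot age + \gamma_2\,\hat\varepsilon \cdot succ + \dots + \gamma_9\,\hat\varepsilon \cdot prof + u \qquad (1)$$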
So, create a vector of ones:
gen ones=1
reg ones  e_age e_succ e_tenure e_clay e_educ e_sizeha e_sizequo e_animalh e_prof, noconstant
      Source |       SS       df       MS              Number of obs =     473
-------------+------------------------------           F(  9,   464) =    8.96
       Model |  70.0368282     9   7.7818698           Prob > F      =  0.0000
    Residual |  402.963172   464  .868455112           R-squared     =  0.1481
-------------+------------------------------           Adj R-squared =  0.1315
       Total |         473   473           1           Root MSE      =  .93191

------------------------------------------------------------------------------
        ones |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       e_age |  -.0065348    .011938    -0.55   0.584    -.0299941    .0169245
      e_succ |  -.1558425   .3272642    -0.48   0.634     -.798946     .487261
    e_tenure |  -.3956457   .4480105    -0.88   0.378    -1.276027    .4847352
      e_clay |   .4048804   .2566723     1.58   0.115    -.0995037    .9092644
      e_educ |   2.495919   .3029164     8.24   0.000     1.900661    3.091177
    e_sizeha |  -.0006441   .0108471    -0.06   0.953    -.0219597    .0206714
   e_sizequo |  -.3132451   .1293209    -2.42   0.016    -.5673724   -.0591179
   e_animalh |  -.1441578   .3363157    -0.43   0.668    -.8050484    .5167328
      e_prof |   .1131073   .2338385     0.48   0.629    -.3464064    .5726209
------------------------------------------------------------------------------
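Finally, we can obtain the LM test statistic $NR^2 = 473 \times 0.14807 = 70.04 > \chi^2_3 = 7.81$, where $R^2 = 70.0368 / 473$ is the uncentered R-squared from the regression output (reported there as 0.1481 after rounding).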
The LM test statistic is close to the LR test statistic, and it holds that $\xi_{LR} \geq \xi_{LM}$.
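The mechanics of this N times R-squared version of the LM test can be sketched in a few lines of Python. The example is deliberately simplified and rests on assumptions that differ from the application above: the restricted model is a logit with an intercept only (so its ML fitted probability is just the sample mean of y), there is a single omitted variable z, and the data are a hypothetical toy sample. The function name is hypothetical. The last lines redo the arithmetic with the sums of squares reported in the regression output.

```python
def opg_lm_statistic(ys, zs):
    """LM statistic N*R^2 from regressing a column of ones on the score columns.

    Restricted model: intercept-only logit, fitted probability = mean of y.
    zs holds the single omitted variable.
    """
    n = len(ys)
    p_bar = sum(ys) / n
    e = [y - p_bar for y in ys]                # residuals under the null
    s1 = e                                     # cross-product with the constant
    s2 = [ei * zi for ei, zi in zip(e, zs)]    # cross-product with omitted z
    # Normal equations for the regression of ones on (s1, s2), no intercept.
    a11 = sum(v * v for v in s1)
    a12 = sum(u * v for u, v in zip(s1, s2))
    a22 = sum(v * v for v in s2)
    c1, c2 = sum(s1), sum(s2)
    det = a11 * a22 - a12 * a12
    g1 = (a22 * c1 - a12 * c2) / det
    g2 = (a11 * c2 - a12 * c1) / det
    model_ss = g1 * c1 + g2 * c2       # explained sum of squares
    r2_uncentered = model_ss / n       # total sum of squares of ones equals n
    return n * r2_uncentered

# Hypothetical toy data, not the dairy-farm dataset.
ys = [0, 0, 1, 0, 1, 0, 1, 1]
zs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
lm_toy = opg_lm_statistic(ys, zs)

# Arithmetic with the reported regression output: R^2 = Model SS / Total SS.
n_obs = 473
r2_reported = 70.0368282 / 473.0
lm_reported = n_obs * r2_reported
print(f"toy LM = {lm_toy:.4f}, reported LM = {lm_reported:.2f}")
```

Because the dependent variable is a column of ones, the total sum of squares equals N, so N times the uncentered R-squared is simply the model sum of squares. In the toy example only one restriction is tested, so the statistic would be compared with a chi-squared distribution with 1 degree of freedom.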
