Maximum Likelihood Estimation: A Worked Example

Updated: 2023-06-16 23:04:13

ML ESTIMATION IN STATA

Many estimation procedures in Stata are based on the principle of Maximum Likelihood (ML). For many common estimators (linear regression with OLS, 2SLS, SUR) knowledge of ML is not strictly necessary, since their matrix derivation gives the same result as the ML outcome. However, there are also techniques that explicitly make use of ML. Examples are non-linear discrete choice models (e.g. logit, probit, multinomial logit), models for time-series data (e.g. ARIMA) and binary choice models for panel data (fixed effects logit model).

The flexibility of Stata is mainly due to the fact that all procedures (estimators, tests, basic statistics but also graphs) are programmed in so-called ado-files. You can also program your own ado-files if Stata doesn’t support an estimator or test that you need for your research. Of course, this requires knowledge of programming in Stata.

Using ML in Stata also requires that you write your own (small) programs for your particular problem. In short there are a number of steps to go through in order to do ML estimation in Stata:

1. Derive the log-likelihood function from your probability model. This is done on paper and of course depends on the assumptions you make about the distribution underlying the data (e.g. normal or logistic).

2. Write a program that calculates the log-likelihood values and, optionally for difficult models, their derivatives in an ado-file. This program is known as a likelihood evaluator.

3. Identify a particular model to fit using your data variables and the ml model statement.

4. Fit the model using ml maximize.
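The same sequence of steps can be illustrated outside Stata. The following is a minimal Python sketch, not Stata's ml implementation: it fits a logit with an intercept and one regressor by plain Newton-Raphson. The toy data and the function names loglik and fit_logit are assumptions made purely for illustration.

```python
import math

def loglik(b0, b1, xs, ys):
    """Logit log-likelihood: sum of ln Pr(y=1) or ln Pr(y=0) over observations."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def fit_logit(xs, ys, iterations=25):
    """Newton-Raphson for a logit with an intercept (b0) and one slope (b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p              # score with respect to b0
            g1 += (y - p) * x        # score with respect to b1
            w = p * (1.0 - p)        # weight in the negative Hessian X'WX
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve (X'WX) * step = score for the 2x2 case
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical toy data: x is a dummy; 1 of 4 ones when x = 0, 3 of 4 when x = 1.
xs = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [1, 0, 0, 0, 1, 1, 1, 0]

b0, b1 = fit_logit(xs, ys)
print("log likelihood at start:  ", loglik(0.0, 0.0, xs, ys))
print("log likelihood at optimum:", loglik(b0, b1, xs, ys))
print("b0, b1:", b0, b1)
```

Because the only regressor is a dummy, the fitted probabilities must equal the group proportions 0.25 and 0.75, so the estimates can be checked by hand: b0 = ln(0.25/0.75) and b0 + b1 = ln(0.75/0.25). Stata's ml is more careful than this sketch (the output further on shows a rescaling step before the iterations start, for example), so treat the code only as an illustration of the idea.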

Let’s go through the steps for a well-known non-linear model, the logit model. In a logit model the dependent variable typically has only two values, 0 or 1. Applications include consumer choice behavior, investment decisions etc. In this example we try to explain why some farmers have chosen organic farming and others have not. A dataset with 473 observations on dairy farmers in the period 1994-1999 is available. The following variables are available.

Variable    Description and units
biodum      1 if organic farmer, 0 if conventional farmer
age         years
succ        1 if there is a successor, 0 if not
tenure      1 if more than half of the land is rented, 0 if not
clay        1 if major soil type is clay, 0 if not
educ        1 if farmer has higher education, 0 otherwise
sizequo     dairy production quota in 100,000 kg
sizeha      acreage in hectares
animalha    number of animals per ha
prof        profits in 100,000 Euros

1. Derive the log-likelihood function from your probability model.

$\Pr(y = 1)$ is defined by the cumulative distribution function of the logistic distribution that underlies the logit model. So,

$$\Pr(y = 1) = \frac{1}{1 + e^{-Xb}}$$

Since we only observe the values 1 or 0 for the variable biodum, we can also define an expression for $\Pr(y = 0)$:

$$\Pr(y = 0) = 1 - \frac{1}{1 + e^{-Xb}} = \frac{1 + e^{-Xb}}{1 + e^{-Xb}} - \frac{1}{1 + e^{-Xb}} = \frac{e^{-Xb}}{1 + e^{-Xb}}$$

This leads to the following definition of the log-likelihood for the j-th observation in the logit model:

$$\ln l_j = \begin{cases} \ln\dfrac{1}{1 + e^{-Xb}} = \ln 1 - \ln\left(1 + e^{-Xb}\right) = -\ln\left(1 + e^{-Xb}\right) & \text{if } y_j = 1 \\[2ex] \ln\dfrac{e^{-Xb}}{1 + e^{-Xb}} = \ln e^{-Xb} - \ln\left(1 + e^{-Xb}\right) = -Xb - \ln\left(1 + e^{-Xb}\right) & \text{if } y_j = 0 \end{cases}$$

2. Write a program (Stata ado file) that calculates the log-likelihood values

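program define mylogit
    * lnf: variable that receives each observation's log-likelihood; Xb: linear predictor
    args lnf Xb
    * $ML_y1 holds the name of the dependent variable (here biodum)
    quietly replace `lnf' = -ln(1+exp(-`Xb')) if $ML_y1==1
    quietly replace `lnf' = -`Xb' -ln(1+exp(-`Xb')) if $ML_y1==0
end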
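As a quick sanity check on the evaluator, the two branches can be compared numerically with the logarithms of Pr(y = 1) and Pr(y = 0). This is a small Python sketch rather than Stata code; the function names lnf_branches and lnf_direct are made up for the illustration.

```python
import math

def lnf_branches(xb, y):
    """Per-observation log-likelihood, written like the two replace statements."""
    if y == 1:
        return -math.log(1.0 + math.exp(-xb))
    return -xb - math.log(1.0 + math.exp(-xb))

def lnf_direct(xb, y):
    """The same quantity computed from the probabilities themselves."""
    p1 = 1.0 / (1.0 + math.exp(-xb))          # Pr(y = 1)
    return math.log(p1) if y == 1 else math.log(1.0 - p1)

# Compare both versions on a few arbitrary values of the linear predictor.
for xb in (-2.0, 0.0, 0.5, 3.0):
    for y in (0, 1):
        a, b = lnf_branches(xb, y), lnf_direct(xb, y)
        print(f"xb={xb:5.1f}  y={y}  branches={a:.6f}  direct={b:.6f}")
```

Both columns agree up to floating-point rounding, and for any value of Xb the two exponentiated branches add up to one, as probabilities should.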

3. Identify a particular model to fit using your data variables and the ml model statement

ml model lf mylogit (biodum = age succ tenure clay educ sizeha sizequo animalha prof)

4. Fit the model using ml maximize.

. ml maximize

initial: log likelihood = -327.85862

alternative: log likelihood = -284.23841

rescale: log likelihood = -268.17278

Iteration 0: log likelihood = -268.17278

Iteration 1: log likelihood = -185.84208

Iteration 2: log likelihood = -161.98223

Iteration 3: log likelihood = -159.3928

Iteration 4: log likelihood = -159.37492

Iteration 5: log likelihood = -159.37492

                                                  Number of obs   =        473
                                                  Wald chi2(9)    =      95.39
Log likelihood = -159.37492                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
      biodum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0271121   .0161332    -1.68   0.093    -.0587326    .0045085
        succ |  -.3254785    .438487    -0.74   0.458    -1.184897    .5339403
      tenure |  -.3761036   .5109148    -0.74   0.462    -1.377478    .6252709
        clay |   .2660545   .3076138     0.86   0.387    -.3368576    .8689665
        educ |    2.66085   .4142379     6.42   0.000     1.848958    3.472741
      sizeha |  -.0195829   .0141403    -1.38   0.166    -.0472974    .0081316
     sizequo |  -.3481513   .1456595    -2.39   0.017    -.6336387    -.062664
    animalha |  -3.351957   .4730748    -7.09   0.000    -4.279167   -2.424748
        prof |   .6485803   .2552814     2.54   0.011      .148238    1.148923
       _cons |   6.221396   1.235341     5.04   0.000     3.800171     8.64262
------------------------------------------------------------------------------

The results are exactly the same as those we would get using the logit command available in Stata.
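To see how the columns of the output relate to each other, the z statistic, p-value and 95% confidence interval of a single row can be recomputed from its coefficient and standard error. The Python sketch below does this for the educ row, with the two numbers typed in from the output above. It assumes the usual normal approximation and is only a cross-check, not part of the Stata session.

```python
from statistics import NormalDist

# Coefficient and standard error of educ, copied from the ml maximize output.
coef, se = 2.66085, 0.4142379

std_normal = NormalDist()                     # standard normal distribution
z = coef / se                                 # z statistic
p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))  # two-sided p-value
crit = std_normal.inv_cdf(0.975)              # about 1.96
lower, upper = coef - crit * se, coef + crit * se

print(f"z = {z:.2f}, p = {p_value:.3f}")
print(f"95% interval: [{lower:.6f}, {upper:.6f}]")
```

The recomputed values match the z, P>|z| and interval columns reported for educ up to rounding.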

The results already give the outcome of a Wald test of $H_0: \beta_{age} = \beta_{succ} = \dots = \beta_{prof} = 0$. We could test other hypotheses using the straightforward LR test. For example, to test whether personal characteristics matter, we can test the null hypothesis $\beta_{age} = \beta_{succ} = \beta_{educ} = 0$. Estimate the model again with the three variables omitted:

Identify the new model to fit using your data variables and the ml model statement

ml model lf mylogit (biodum = tenure clay sizeha sizequo animalha prof)

ml maximize

                                                  Number of obs   =        473
                                                  Wald chi2(6)    =      79.64
Log likelihood = -195.21467                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
      biodum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      tenure |   .7022649   .4007155     1.75   0.080    -.0831231    1.487653
        clay |  -.0454245   .2667705    -0.17   0.865    -.5682852    .4774362
      sizeha |   .0052992   .0123222     0.43   0.667    -.0188519    .0294504
     sizequo |  -.3474167     .14042    -2.47   0.013    -.6226347   -.0721986
    animalha |  -2.390138     .39364    -6.07   0.000    -3.161658   -1.618618
        prof |   .3865246   .2174228     1.78   0.075    -.0396164    .8126655
       _cons |   3.871129   .8605505     4.50   0.000     2.184481    5.557777
------------------------------------------------------------------------------

The LR test statistic is $LR = 2\,[-159.37 - (-195.21)] = 71.68 > \chi^2_3 = 7.81$ (the 5% critical value with 3 degrees of freedom), so we firmly reject this null hypothesis.
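The arithmetic of the LR test can be reproduced from the two reported log-likelihood values. The Python sketch below is only a cross-check of the numbers above. The helper chi2_sf_3df is a hypothetical name, and it uses the closed-form upper-tail probability that holds for exactly 3 degrees of freedom.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

ll_unrestricted = -159.37492   # model with all nine regressors
ll_restricted = -195.21467     # model without age, succ and educ

lr = 2.0 * (ll_unrestricted - ll_restricted)
p_value = chi2_sf_3df(lr)

print(f"LR = {lr:.2f}")
print(f"p-value = {p_value:.3g}")
print(f"tail probability at 7.815 = {chi2_sf_3df(7.815):.4f}")
```

The last line confirms that 7.81 is, up to rounding, the 5% critical value: the tail probability there is very close to 0.05, while the p-value of the observed statistic is practically zero.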

We could also have tested this with an LM test. For that we need the cross-products of the residuals ε with x and z. First, the residuals ε were created from the estimated restricted model: using the estimated ML model, we made predictions (stored in biodumf) and then took the difference between the observed values and these predictions:

gen e=biodum-biodumf

Next, the cross-products $\hat\varepsilon_i x_i'$ and $\hat\varepsilon_i z_i'$ were created, where the vector $z_i$ contains the omitted variables age, educ and succ:

gen e_age=e*age

gen e_succ=e*succ

gen e_tenure=e*tenure

gen e_clay=e*clay

gen e_educ=e*educ

gen e_sizeha=e*sizeha

gen e_sizequo=e*sizequo

gen e_animalh=e*animalh

gen e_prof=e*prof

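We can do the LM test in a straightforward fashion by estimating the following equation without an intercept:

$$1 = \gamma_1\,\hat\varepsilon \cdot age + \gamma_2\,\hat\varepsilon \cdot succ + \dots + \gamma_9\,\hat\varepsilon \cdot prof + u \qquad (1)$$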

So, create a vector of ones:

gen ones=1

reg ones e_age e_succ e_tenure e_clay e_educ e_sizeha e_sizequo e_animalh e_prof, noconstant

      Source |       SS       df       MS              Number of obs =     473
-------------+------------------------------           F(  9,   464) =    8.96
       Model |  70.0368282     9   7.7818698           Prob > F      =  0.0000
    Residual |  402.963172   464  .868455112           R-squared     =  0.1481
-------------+------------------------------           Adj R-squared =  0.1315
       Total |         473   473           1           Root MSE      =  .93191

------------------------------------------------------------------------------
        ones |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       e_age |  -.0065348    .011938    -0.55   0.584    -.0299941    .0169245
      e_succ |  -.1558425   .3272642    -0.48   0.634     -.798946     .487261
    e_tenure |  -.3956457   .4480105    -0.88   0.378    -1.276027    .4847352
      e_clay |   .4048804   .2566723     1.58   0.115    -.0995037    .9092644
      e_educ |   2.495919   .3029164     8.24   0.000     1.900661    3.091177
    e_sizeha |  -.0006441   .0108471    -0.06   0.953    -.0219597    .0206714
   e_sizequo |  -.3132451   .1293209    -2.42   0.016    -.5673724   -.0591179
   e_animalh |  -.1441578   .3363157    -0.43   0.668    -.8050484    .5167328
      e_prof |   .1131073   .2338385     0.48   0.629    -.3464064    .5726209
------------------------------------------------------------------------------

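Finally, we can obtain the LM test statistic $N R^2 = 473 \times 0.148 \approx 70.04 > \chi^2_3 = 7.81$. Because the dependent variable is a vector of ones and there is no intercept, the total sum of squares equals N, so $N R^2$ is simply the model sum of squares (70.0368) reported in the regression output.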

The LM test statistic is close to the value of the LR test statistic and, in line with the usual ordering of these statistics, it holds that $\xi_{LR} \ge \xi_{LM}$.
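The mechanics of this LM test can be traced by hand on a tiny example. The Python sketch below is an illustration under stated assumptions, not a reproduction of the farm data: it uses eight made-up observations, a restricted model with only an intercept, and one omitted dummy variable z. It regresses a vector of ones on the two cross-products without an intercept, takes N times the uncentered R-squared as the LM statistic, and compares it with the LR statistic for the same hypothesis.

```python
import math

# Hypothetical toy data: z is the omitted dummy, y the binary outcome.
z = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 0, 0, 0, 1, 1, 1, 0]
n = len(y)

# Restricted model (intercept only): the ML fitted probability is the sample mean.
p_r = sum(y) / n
e = [yi - p_r for yi in y]                    # residuals of the restricted model

# Cross-products of the residual with the included (constant) and omitted (z) variable.
g1 = e
g2 = [ei * zi for ei, zi in zip(e, z)]

# Regress ones on (g1, g2) without an intercept via the 2x2 normal equations.
a11 = sum(v * v for v in g1)
a12 = sum(u * v for u, v in zip(g1, g2))
a22 = sum(v * v for v in g2)
c1, c2 = sum(g1), sum(g2)
det = a11 * a22 - a12 * a12
coef1 = (a22 * c1 - a12 * c2) / det
coef2 = (a11 * c2 - a12 * c1) / det

explained_ss = coef1 * c1 + coef2 * c2        # model sum of squares
r2_uncentered = explained_ss / n              # total sum of squares of ones is n
lm = n * r2_uncentered

# LR statistic: with a single dummy the unrestricted fit equals the group proportions.
ll_r = sum(math.log(p_r) if yi == 1 else math.log(1.0 - p_r) for yi in y)
ll_u = 0.0
for group in (0, 1):
    ys_g = [yi for yi, zi in zip(y, z) if zi == group]
    p_g = sum(ys_g) / len(ys_g)
    ll_u += sum(math.log(p_g) if yi == 1 else math.log(1.0 - p_g) for yi in ys_g)
lr = 2.0 * (ll_u - ll_r)

print(f"LM = {lm:.4f}, LR = {lr:.4f}")
```

In this toy case the LM statistic equals the model sum of squares of the auxiliary regression, and the LR statistic is slightly larger, the same pattern as in the farm data.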
