lclogit: a tutorial on the latent-class logit model

Updated: 2023-07-24 11:03:07

The Stata Journal (2013)
13, Number 3, pp. 625–639
lclogit: A Stata command for fitting latent-class conditional logit models via the expectation-maximization algorithm
Daniele Pacifico
Italian Department of the Treasury
Rome, Italy
daniele.pacifico@tesoro.it
Hong il Yoo
Durham University
Durham, UK
@durham.ac.uk
Abstract. In this article, we describe lclogit, a Stata command for fitting a discrete-mixture or latent-class logit model via the expectation-maximization algorithm.
Keywords: st0312, lclogit, lclogitpr, lclogitcov, lclogitml, latent-class model, expectation-maximization algorithm, mixed logit
1 Introduction
Mixed logit or random parameter logit is used in many empirical applications to capture more realistic substitution patterns than traditional conditional logit. The random parameters are usually assumed to follow a normal distribution, and the resulting model is fit through simulated maximum likelihood, as in Hole's (2007) Stata command mixlogit. Several recent studies, however, note potential gains from specifying a discrete instead of normal mixing distribution, including the ability to approximate the true parameter distribution more flexibly at lower computational costs.¹
Pacifico (2012) implements the expectation-maximization (EM) algorithm for fitting a discrete-mixture logit model, also known as a latent-class logit (LCL) model, in Stata. As Bhat (1997) and Train (2008) emphasize, the EM algorithm is an attractive alternative to the usual (quasi-)Newton methods in the present context because it guarantees numerical stability and convergence to a local maximum even when the number of latent classes is large. In contrast, the usual optimization procedures often fail to achieve convergence because inversion of the (approximate) Hessian becomes numerically difficult.
With this contribution, we aim at generalizing Pacifico's (2012) code with a Stata command that introduces a series of important functionalities and provides an improved performance in terms of run time and stability.
1. For example, see Hess et al. (2011), Shen (2009), and Greene and Hensher (2003).
© 2013 StataCorp LP st0312
2 EM algorithm for LCL
This section recapitulates the EM algorithm for fitting an LCL model.² Suppose that each of $N$ agents faces, for notational simplicity, $J$ alternatives in each of $T$ choice scenarios.³ Let $y_{njt}$ denote a binary variable that equals 1 if agent $n$ chooses alternative $j$ in scenario $t$ and equals 0 otherwise. Each alternative is described by alternative-specific characteristics $x_{njt}$ and each agent by agent-specific characteristics, including a constant, $z_n$.
LCL assumes that there are $C$ distinct sets (or classes) of taste parameters, $\beta = (\beta_1, \beta_2, \ldots, \beta_C)$. If agent $n$ is in class $c$, the probability of observing his or her sequence of choices is a product of conditional logit formulas:
$$P_n(\beta_c) = \prod_{t=1}^{T} \prod_{j=1}^{J} \left( \frac{\exp(\beta_c x_{njt})}{\sum_{k=1}^{J} \exp(\beta_c x_{nkt})} \right)^{y_{njt}} \qquad (1)$$
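As a rough numerical illustration of (1), the following Python sketch multiplies the conditional logit probabilities of the chosen alternatives over an agent's scenarios. The function name, the nested-list data layout, and the example numbers are assumptions made for this illustration; they are not part of lclogit.

```python
import math

def sequence_probability(beta_c, x, y):
    """P_n(beta_c) as in equation (1).

    x[t][j] is the attribute vector of alternative j in scenario t;
    y[t][j] is 1 for the chosen alternative and 0 otherwise.
    """
    prob = 1.0
    for x_t, y_t in zip(x, y):
        utilities = [sum(b * v for b, v in zip(beta_c, x_tj)) for x_tj in x_t]
        shift = max(utilities)  # subtract the maximum to avoid overflow in exp
        denom = sum(math.exp(u - shift) for u in utilities)
        for u, chosen in zip(utilities, y_t):
            if chosen:  # the exponent y_njt keeps only the chosen alternative
                prob *= math.exp(u - shift) / denom
    return prob

# Hypothetical agent: T = 2 scenarios, J = 3 alternatives, 2 attributes.
x = [[[1.0, 0.5], [0.0, 1.0], [0.5, 0.0]],
     [[0.2, 0.1], [1.0, 0.3], [0.0, 0.0]]]
y = [[1, 0, 0], [0, 1, 0]]
p_zero = sequence_probability([0.0, 0.0], x, y)    # all tastes zero
p_taste = sequence_probability([1.0, -0.5], x, y)  # hypothetical tastes
```

With all taste parameters set to zero, each alternative is equally likely, so the sequence probability for two scenarios of three alternatives is (1/3)² = 1/9.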
Because the class membership status is unknown, the researcher needs to specify the unconditional likelihood of agent $n$'s choices, which equals the weighted average of (1) over classes. The weight for class $c$, $\pi_{cn}(\theta)$, is the population share of that class and is usually modeled as fractional multinomial logit,
$$\pi_{cn}(\theta) = \frac{\exp(\theta_c z_n)}{1 + \sum_{l=1}^{C-1} \exp(\theta_l z_n)} \qquad (2)$$
where $\theta = (\theta_1, \theta_2, \ldots, \theta_{C-1})$ are class membership model parameters; note that $\theta_C$ has been normalized to 0 for identification.
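The class share formula in (2) can be sketched in a few lines of Python. The normalization of the last class enters as a score of 0, whose exponential supplies the "1 +" term in the denominator. The function name and the example coefficients are hypothetical.

```python
import math

def class_shares(theta, z_n):
    """pi_cn(theta) for c = 1..C, as in equation (2).

    theta holds C-1 coefficient vectors; class C is normalized to theta_C = 0.
    """
    scores = [sum(t * z for t, z in zip(theta_c, z_n)) for theta_c in theta]
    scores.append(0.0)  # normalized class C: exp(0) = 1 in the denominator
    shift = max(scores)  # numerical safeguard; cancels in the ratio
    weights = [math.exp(s - shift) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# z_n = (constant, one hypothetical agent characteristic); C = 3 classes.
theta = [[0.5, -1.0], [-0.2, 0.3]]
shares = class_shares(theta, [1.0, 2.0])
equal_shares = class_shares([[0.0, 0.0], [0.0, 0.0]], [1.0, 2.0])
```

When every coefficient is zero, the three shares are equal, which matches the equal-share case one would expect from the formula.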
The sample log likelihood is then obtained by summing each agent's log unconditional likelihood:
$$\ln L(\beta, \theta) = \sum_{n=1}^{N} \ln \sum_{c=1}^{C} \pi_{cn}(\theta) P_n(\beta_c) \qquad (3)$$
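To make (3) concrete, the sketch below evaluates the sample log likelihood from class shares and class-conditional sequence probabilities that are assumed to have been computed already. The function name and the numbers are illustrative assumptions.

```python
import math

def sample_log_likelihood(shares, seq_probs):
    """ln L as in equation (3).

    shares[n][c] is pi_cn(theta); seq_probs[n][c] is P_n(beta_c).
    """
    return sum(
        math.log(sum(pi * p for pi, p in zip(shares_n, probs_n)))
        for shares_n, probs_n in zip(shares, seq_probs)
    )

# Two hypothetical agents and two classes.
shares = [[0.6, 0.4], [0.6, 0.4]]
seq_probs = [[0.10, 0.30], [0.25, 0.05]]
ll = sample_log_likelihood(shares, seq_probs)
```

Here the unconditional likelihoods are 0.6(0.10) + 0.4(0.30) = 0.18 and 0.6(0.25) + 0.4(0.05) = 0.17, so the log likelihood is ln(0.18) + ln(0.17).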
Bhat (1997) and Train (2008) note numerical difficulties associated with maximizing (3) directly. They show that $\beta$ and $\theta$ can be more conveniently estimated via a well-known EM algorithm for likelihood maximization in the presence of incomplete data, treating each agent's class membership status as the missing information. Let superscript $s$ denote the estimates obtained at the $s$th iteration of this algorithm. Then at iteration $s+1$, the estimates are updated as
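$$\beta^{s+1} = \operatorname{argmax}_{\beta} \sum_{n=1}^{N} \sum_{c=1}^{C} \eta_{cn}(\beta^s, \theta^s) \ln P_n(\beta_c)$$
$$\theta^{s+1} = \operatorname{argmax}_{\theta} \sum_{n=1}^{N} \sum_{c=1}^{C} \eta_{cn}(\beta^s, \theta^s) \ln \pi_{cn}(\theta)$$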
2. Further details are available in Bhat (1997) and Train (2008).
3. lclogit is also applicable when the number of scenarios varies across agents, and the number of alternatives varies both across agents and over scenarios.
where $\eta_{cn}(\beta^s, \theta^s)$ is the posterior probability that agent $n$ is in class $c$ evaluated at the $s$th estimates:
$$\eta_{cn}(\beta^s, \theta^s) = \frac{\pi_{cn}(\theta^s) P_n(\beta^s_c)}{\sum_{l=1}^{C} \pi_{ln}(\theta^s) P_n(\beta^s_l)} \qquad (4)$$
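Equation (4) is an application of Bayes' rule, which a short Python sketch can make explicit. The function name and the inputs are assumptions for illustration; in practice the shares and sequence probabilities would come from the current estimates.

```python
def posterior_probabilities(shares_n, seq_probs_n):
    """eta_cn as in equation (4) for one agent n.

    shares_n[c] is pi_cn(theta^s); seq_probs_n[c] is P_n(beta_c^s).
    """
    joint = [pi * p for pi, p in zip(shares_n, seq_probs_n)]
    total = sum(joint)  # the agent's unconditional likelihood
    return [j / total for j in joint]

# Hypothetical agent whose choices are better explained by class 2.
eta = posterior_probabilities([0.6, 0.4], [0.10, 0.30])
```

Although class 1 has the larger prior share (0.6), the posterior favors class 2: 0.12 / 0.18 = 2/3 against 0.06 / 0.18 = 1/3.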
The updating procedure can be implemented easily in Stata, exploiting clogit and fmlogit routines as follows.⁴ $\beta^{s+1}$ is computed by fitting a conditional logit model (clogit) $C$ times, each time using $\eta_{cn}(\beta^s, \theta^s)$ for a particular $c$ to weight observations on each $n$. $\theta^{s+1}$ is obtained by fitting a fractional multinomial logit model (fmlogit) that takes $\eta_{1n}(\beta^s, \theta^s), \eta_{2n}(\beta^s, \theta^s), \ldots, \eta_{Cn}(\beta^s, \theta^s)$ as dependent variables. When $z_n$ only includes the constant term so that each class share is the same for all agents, that is, when $\pi_{cn}(\theta) = \pi_c(\theta)$, each class share can be directly updated by using the following analytical solution without fitting the fractional multinomial logit model:
$$\pi_c(\theta^{s+1}) = \frac{\sum_{n=1}^{N} \eta_{cn}(\beta^s, \theta^s)}{\sum_{l=1}^{C} \sum_{n=1}^{N} \eta_{ln}(\beta^s, \theta^s)} \qquad (5)$$
With a suitable selection of starting values, the updating procedure can be repeated until changes in the estimates and improvement in the log likelihood between iterations are small enough.
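The repeated updating can be sketched for the constant-only case, where (5) applies. To keep the example short, the sketch below holds the class-conditional probabilities $P_n(\beta_c)$ fixed and iterates only the share update; a full implementation would also refit the taste parameters at every iteration. All names and numbers are hypothetical.

```python
import math

def posteriors(shares, seq_probs):
    """Equation (4) for every agent, with shares common to all agents."""
    result = []
    for probs_n in seq_probs:
        joint = [pi * p for pi, p in zip(shares, probs_n)]
        total = sum(joint)
        result.append([j / total for j in joint])
    return result

def update_shares(eta):
    """Equation (5): eta[n][c] are posterior probabilities."""
    n_classes = len(eta[0])
    totals = [sum(eta_n[c] for eta_n in eta) for c in range(n_classes)]
    grand_total = sum(totals)  # equals the number of agents
    return [t / grand_total for t in totals]

def log_likelihood(shares, seq_probs):
    """Equation (3) with shares common to all agents."""
    return sum(math.log(sum(pi * p for pi, p in zip(shares, probs_n)))
               for probs_n in seq_probs)

# Hypothetical P_n(beta_c) for 4 agents and 2 classes, held fixed here.
seq_probs = [[0.10, 0.30], [0.25, 0.05], [0.20, 0.02], [0.15, 0.01]]
shares = [0.5, 0.5]  # equal starting shares
history = [log_likelihood(shares, seq_probs)]
for _ in range(20):
    shares = update_shares(posteriors(shares, seq_probs))
    history.append(log_likelihood(shares, seq_probs))
```

In this simplified setting, the log likelihood stored in `history` never decreases from one iteration to the next, which is the property that makes EM attractive here.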
An often-highlighted feature of LCL is its ability to accommodate unobserved interpersonal taste variation without restricting the shape of the underlying taste distribution. Hess et al. (2011) have recently emphasized that LCL also provides a convenient means to account for observed interpersonal heterogeneity in correlations among tastes for different attributes. For example, let $\beta_q$ and $\beta_h$ denote taste coefficients on the $q$th and $h$th attributes, respectively. Each coefficient may take one of $C$ distinct values and is a random parameter from the researcher's perspective. Their covariance is given by
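$$\operatorname{cov}_n(\beta_q, \beta_h) = \sum_{c=1}^{C} \pi_{cn}(\theta) \beta_{c,q} \beta_{c,h} - \left( \sum_{c=1}^{C} \pi_{cn}(\theta) \beta_{c,q} \right) \left( \sum_{c=1}^{C} \pi_{cn}(\theta) \beta_{c,h} \right) \qquad (6)$$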
where $\beta_{c,q}$ is the value of $\beta_q$ when agent $n$ is in class $c$, and $\beta_{c,h}$ is defined similarly. As long as $z_n$ in (2) includes a nonconstant variable, this covariance will vary across agents with different observed characteristics through the variation in $\pi_{cn}(\theta)$.
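A small Python sketch of (6) shows how the same class-specific coefficients imply different covariances for agents with different class shares. The function name, the coefficient values, and the two share vectors are assumptions for illustration.

```python
def taste_covariance(shares_n, beta_q, beta_h):
    """cov_n(beta_q, beta_h) as in equation (6).

    shares_n[c] is pi_cn(theta); beta_q[c] and beta_h[c] are the
    class-c coefficients on attributes q and h.
    """
    mean_q = sum(pi * bq for pi, bq in zip(shares_n, beta_q))
    mean_h = sum(pi * bh for pi, bh in zip(shares_n, beta_h))
    cross = sum(pi * bq * bh for pi, bq, bh in zip(shares_n, beta_q, beta_h))
    return cross - mean_q * mean_h

# Two hypothetical classes with coefficients (1, 2) and (-1, 0).
beta_q = [1.0, -1.0]
beta_h = [2.0, 0.0]
cov_a = taste_covariance([0.5, 0.5], beta_q, beta_h)  # agent with equal shares
cov_b = taste_covariance([0.9, 0.1], beta_q, beta_h)  # agent mostly in class 1
```

For the first agent the covariance is 1.0 − (0)(1) = 1.0; for the second it is 1.8 − (0.8)(1.8) = 0.36, so the covariance shrinks as the agent's membership concentrates on one class.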
3 The lclogit command
lclogit is a Stata command that implements the EM iterative scheme outlined in the previous section. This command generalizes Pacifico's (2012) step-by-step procedure and introduces an improved internal loop along with other important functionalities. The overall effect is to make the estimation process more convenient, significantly faster, and more stable numerically. For example, the internal code of lclogit executes fewer algebraic operations per iteration to update the estimates; uses the standard generate command to perform tasks that were previously executed with slightly slower egen functions; and, when possible, works with log probabilities instead of probabilities. All of these changes substantially reduce the estimation run time, especially in the presence of a large number of parameters and observations. If we take the 8-class model fit by Pacifico (2012) as an example, lclogit produces the same results as the step-by-step procedure while taking less than one-half of the run time.
4. fmlogit is a user-written program. See footnote 5 for a further description.
The data setup for lclogit is identical to that required by clogit.
3.1 Syntax
The generic syntax for lclogit is
    lclogit depvar [indepvars] [if] [in], group(varname) id(varname) nclass(#)
        [membership(varlist) convergence(#) iterate(#) seed(#)
        constraints(Class# numlist: Class# numlist: ...) nolog]
3.2 Options
group(varname) specifies a numeric identifier variable for the choice scenarios. group() is required.
id(varname) specifies a numeric identifier variable for the choice makers or agents. With cross-section data, users should specify the same variable for both the group() and the id() options. id() is required.
nclass(#) specifies the number of latent classes used in the estimation. A minimum of two latent classes is required. nclass() is required.
membership(varlist) specifies independent variables to enter the fractional multinomial logit model of class membership, that is, the variables included in the vector $z_n$ of (2). The variables must be constant within the same agent as identified by id().⁵ When this option is not specified, the class shares are updated algebraically following (5).
convergence(#) specifies the tolerance for the log likelihood. When the proportional increase in the log likelihood over the last five iterations is less than the specified criterion, lclogit declares convergence. The default is convergence(0.00001).
5. Pacifico (2012) specified an ml program with the method lf to fit the class membership model. lclogit uses another user-written program from Buis (2008), fmlogit, which performs the same estimation with the significantly faster and more accurate d2 method. lclogit is downloaded with a modified version of the prediction command of fmlogit, fmlogit_pr, because we had to modify this command to obtain double-precision class shares.
iterate(#) specifies the maximum number of iterations. If convergence is not achieved after the selected number of iterations, lclogit stops the recursion and notes this fact before displaying the estimation results. The default is iterate(150).
seed(#) sets the seed for pseudouniform random numbers. The default is the creturn value c(seed).
The starting values for taste parameters are obtained by splitting the sample into nclass() different subsamples and fitting a clogit model for each of them. During this process, a pseudouniform random number is generated for each agent to assign the agent into a particular subsample.⁶ As for the starting values for the class shares, lclogit uses equal shares, that is, 1/nclass().
constraints(Class# numlist: Class# numlist: ...) specifies the constraints that are imposed on the taste parameters of the designated classes, that is, $\beta_c$ in (1). For instance, suppose that x1 and x2 are alternative-specific characteristics included in indepvars for lclogit and that the user wishes to restrict the coefficient on x1 to 0 for Class1 and Class4 and the coefficient on x2 to 2 for Class4. Then the relevant series of commands would look like this:
    constraint 1 x1 = 0
    constraint 2 x2 = 2
    lclogit depvar indepvars, group(varname) id(varname) ///
        nclass(8) constraints(Class1 1: Class4 1 2)
nolog suppresses the display of the iteration log.
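The stopping rule described under convergence() and iterate() can be sketched as follows. This is one plausible reading of "the proportional increase in the log likelihood over the last five iterations", namely the change relative to the value five iterations earlier; the exact formula used inside lclogit is not stated in the text, and the function name and the log-likelihood path are hypothetical.

```python
def has_converged(log_likelihoods, tolerance=0.00001, window=5):
    """Return True when the proportional change in the log likelihood
    over the last `window` iterations is below `tolerance`.

    Assumes the log likelihood is nonzero, which holds for
    discrete-choice data with nondegenerate probabilities.
    """
    if len(log_likelihoods) <= window:
        return False  # not enough iterations to compare
    previous = log_likelihoods[-1 - window]
    current = log_likelihoods[-1]
    return abs((current - previous) / previous) < tolerance

# Hypothetical log-likelihood path, one value per iteration (0-based index).
path = [-1000.0, -900.0, -880.0, -879.0, -878.99, -878.989,
        -878.9889, -878.9889, -878.9889, -878.9889, -878.9889]
max_iterations = 150  # mirrors the documented default of iterate()
stop_at = next(
    (s for s in range(min(len(path), max_iterations))
     if has_converged(path[:s + 1])),
    None,  # None would mean the iteration limit was reached first
)
```

Under this reading, a path that flattens quickly still needs five further iterations before the rule can fire, because the comparison looks back over a five-iteration window.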
4 Postestimation command: lclogitpr
lclogitpr predicts the probabilities of choosing each alternative in a choice situation (choice probabilities hereafter), the class shares or prior probabilities of class membership, and the posterior probabilities of class membership. The predicted probabilities are stored in a variable named stubname#, where # refers to the relevant class number; the only exception is the unconditional choice probability, which is stored in a variable named stubname.
4.1 Syntax
The syntax for lclogitpr is
    lclogitpr stubname [if] [in] [, class(numlist) pr0 pr up cp]
6. More specifically, the unit interval is divided into nclass() equal parts, and if the agent's pseudorandom draw is in the $c$th part, the agent is allocated to the subsample whose clogit results serve as the initial estimates of class $c$'s taste parameters. Note that lclogit is identical to asmprobit in that the current seed, as at the beginning of the command's execution, is restored once all necessary pseudorandom draws have been made.
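The allocation rule in footnote 6 can be illustrated with a short Python sketch. It uses Python's own random number generator, so the draws will not match Stata's; the function name and agent identifiers are hypothetical. Using a private generator leaves the global random state untouched, which loosely mirrors the seed-restoring behavior described above.

```python
import random

def assign_start_subsamples(agent_ids, n_classes, seed):
    """Allocate each agent to one of n_classes starting subsamples.

    The unit interval is split into n_classes equal parts; a uniform
    draw in the c-th part sends the agent to subsample c (1-based).
    """
    rng = random.Random(seed)  # private generator; global state is untouched
    assignment = {}
    for agent in agent_ids:
        draw = rng.random()  # uniform on [0, 1)
        part = min(int(draw * n_classes), n_classes - 1)  # guard against rounding
        assignment[agent] = part + 1
    return assignment

agents = list(range(1, 11))  # ten hypothetical agent identifiers
groups = assign_start_subsamples(agents, n_classes=3, seed=12345)
start_shares = [1.0 / 3] * 3  # equal starting class shares, 1/nclass()
```

Each subsample would then be used to fit a separate conditional logit model whose coefficients serve as that class's starting values.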
