Advanced Econometrics - Part II - Chapter 4: Discrete Choice Analysis: Multinomial Models
Nam T. Hoang
UNE Business School, University of New England
Chapter 4
DISCRETE CHOICE ANALYSIS:
MULTINOMIAL MODELS
We look at settings with multiple, unordered choices.
A key notion here is the “independence of irrelevant alternatives” (IIA) property.
Models for discrete choice with more than two choices: we assume that the $i$-th consumer is
faced with $J$ choices ($j = 1, 2, \dots, J$), and suppose that the utility of choice $j$ is:
$$U_{ij} = X_{ij}\beta + \varepsilon_{ij}$$
If the consumer makes choice $j$ in particular, then we assume that $U_{ij}$ is the maximum among
the $J$ utilities:
$$\Pr(Y_i = j) = \Pr(U_{ij} > U_{ik}) \quad \text{for all } k \neq j$$
This is the probability that individual $i$ makes choice $j$, that is,
$$Y_i = j \quad \text{if } U_{ij} > U_{ik} \text{ for all } k \neq j$$
The model is made operational by a particular choice of distribution for the disturbances.
Let $Y_i$ be a random variable that indicates the choice made. McFadden (1974) has shown that
if and only if the $J$ disturbances are independent and identically distributed with the type I
extreme value distribution:
$$F(\varepsilon_{ij}) = e^{-e^{-\varepsilon_{ij}}} = \exp(-\exp(-\varepsilon_{ij}))$$
then:
$$\Pr(Y_i = j) = \frac{\exp(X_{ij}\beta)}{\sum_{l=1}^{J} \exp(X_{il}\beta)} = \frac{\exp(Z_{ij}\theta)}{\sum_{l=1}^{J} \exp(Z_{il}\theta)}$$
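The link between type I extreme value disturbances and the logit formula can be checked by simulation. The sketch below is only an illustration: the systematic utilities in `v` are hypothetical, and the helper names (`logit_probs`, `gumbel_draw`, `simulate_choice_shares`) are not part of the original notes.

```python
import math
import random

def logit_probs(v):
    """Closed-form logit probabilities exp(v_j) / sum_l exp(v_l)."""
    m = max(v)                                # subtract the max for numerical stability
    e = [math.exp(x - m) for x in v]
    total = sum(e)
    return [x / total for x in e]

def gumbel_draw(rng):
    """Inverse-CDF draw from F(e) = exp(-exp(-e)), i.e. e = -ln(-ln(u))."""
    u = rng.random()
    while u == 0.0:                           # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def simulate_choice_shares(v, n_draws, seed=0):
    """Share of draws in which each alternative has the highest utility."""
    rng = random.Random(seed)
    counts = [0] * len(v)
    for _ in range(n_draws):
        utils = [vj + gumbel_draw(rng) for vj in v]
        counts[utils.index(max(utils))] += 1
    return [c / n_draws for c in counts]

v = [0.5, 0.0, -1.0]          # hypothetical values of X_ij * beta
exact = logit_probs(v)
simulated = simulate_choice_shares(v, n_draws=50_000)
```

With enough draws, `simulated` should be close to `exact`, up to simulation noise.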
Utility depends on $Z_{ij}$, which includes aspects specific to the individual ($i$) as well as to the
choice ($j$). Let $Z_{ij} = [X_{ij}, w_i]$ and $\theta = [\beta, \alpha]$.
• $X_{ij}$ varies across choices ($j$) (and possibly across individuals ($i$) as well).
• $w_i$ contains the characteristics of individual $i$, and is therefore the same for all choices.
$$\Pr(Y_i = j) = \frac{\exp(X_{ij}\beta + w_i\alpha)}{\sum_{l=1}^{J} \exp(X_{il}\beta + w_i\alpha)} = \frac{\exp(X_{ij}\beta)\exp(w_i\alpha)}{\left[\sum_{l=1}^{J} \exp(X_{il}\beta)\right]\exp(w_i\alpha)} = \frac{\exp(X_{ij}\beta)}{\sum_{l=1}^{J} \exp(X_{il}\beta)}$$
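A small numerical check of this cancellation, under assumed values: an individual-specific term with a coefficient common to all choices leaves the choice probabilities unchanged. The numbers and the function name `choice_probs` are illustrative only.

```python
import math

def choice_probs(index_values):
    """Logit probabilities for a list of index values, one per choice."""
    e = [math.exp(v) for v in index_values]
    total = sum(e)
    return [x / total for x in e]

x_beta = [1.2, 0.3, -0.4]     # hypothetical X_ij * beta for three choices
w_alpha = 0.8                 # hypothetical w_i * alpha, identical for every choice

with_w = choice_probs([xb + w_alpha for xb in x_beta])
without_w = choice_probs(x_beta)
```

The two lists agree, which is why individual characteristics can only matter when their coefficients differ across choices, as in the multinomial logit model.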
For example, consider a model of shopping-centre choice by individuals. Utility depends on the
number of stores $S_{ij}$, the distance from the centre of the city $D_{ij}$, and the income of the
individual $I_i$, which varies across individuals but not across choices:
$$Z_{ij} = (S_{ij}, D_{ij}, I_i)$$
I. THE MULTINOMIAL LOGIT MODEL:
Suppose we have only individual-specific characteristics $w_i$, which are the same for all
choices. The model gives the response probabilities as:
$$\Pr(Y_i = j \mid w_i) = P_{ij} = \frac{\exp(w_i\alpha_j)}{1 + \sum_{l=1}^{J} \exp(w_i\alpha_l)}$$
for all choices $j = 1, \dots, J$.
For the first choice $j = 0$, to satisfy $\sum_{j=0}^{J} P_{ij} = 1$:
$$\Pr(Y_i = 0 \mid w_i) = P_{i0} = \frac{1}{1 + \sum_{l=1}^{J} \exp(w_i\alpha_l)}$$
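The log-likelihood:
$$\ln L = \sum_{i=1}^{n} \sum_{j=0}^{J} d_{ij} \ln P_{ij}$$
where $d_{ij} = 1$ if alternative $j$ is chosen by individual $i$, and 0 if not.
$$\frac{\partial \ln L}{\partial \alpha_j} = \sum_{i=1}^{n} (d_{ij} - P_{ij})\, w_i, \qquad j = 1, \dots, J$$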
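The log-likelihood and its derivative can be sketched in a few lines. This is an illustration under assumptions: the data `W`, `y` and the coefficients `alphas` are made up, the base choice 0 has its coefficient normalised to zero, and the function names are not from the notes. A finite-difference check is included to compare against the analytic score.

```python
import math

def mnl_probs(w, alphas):
    """P_i0, ..., P_iJ for one individual; alphas[j-1] is alpha_j and alpha_0 = 0."""
    scores = [0.0] + [sum(wk * ak for wk, ak in zip(w, a)) for a in alphas]
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    total = sum(e)
    return [x / total for x in e]

def log_likelihood(W, y, alphas):
    """ln L = sum_i sum_j d_ij ln P_ij = sum_i ln P_{i, y_i}."""
    return sum(math.log(mnl_probs(w, alphas)[yi]) for w, yi in zip(W, y))

def score(W, y, alphas):
    """d lnL / d alpha_j = sum_i (d_ij - P_ij) w_i for j = 1..J."""
    grad = [[0.0] * len(a) for a in alphas]
    for w, yi in zip(W, y):
        p = mnl_probs(w, alphas)
        for j in range(1, len(alphas) + 1):
            d_ij = 1.0 if yi == j else 0.0
            for k, wk in enumerate(w):
                grad[j - 1][k] += (d_ij - p[j]) * wk
    return grad

# Hypothetical data: a constant plus one characteristic, three choices (0, 1, 2).
W = [[1.0, 0.2], [1.0, -1.0], [1.0, 1.5], [1.0, 0.7]]
y = [0, 2, 1, 1]
alphas = [[0.1, 0.5], [-0.3, 0.2]]

g = score(W, y, alphas)

# Finite-difference approximation of one element of the score.
h = 1e-6
bumped = [row[:] for row in alphas]
bumped[0][1] += h
numeric = (log_likelihood(W, y, bumped) - log_likelihood(W, y, alphas)) / h
```

Here `numeric` should agree with `g[0][1]` up to the finite-difference error.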
The marginal effects of the characteristics on the probabilities:
$$\delta_{ij} = \frac{\partial P_{ij}}{\partial w_{ik}} = P_{ij}\left[\alpha_{jk} - \sum_{e=0}^{J} P_{ie}\alpha_{ek}\right] = P_{ij}\left[\alpha_{jk} - \bar{\alpha}_k\right], \qquad \bar{\alpha}_k = \sum_{e=0}^{J} P_{ie}\alpha_{ek}$$
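The marginal-effect formula can be verified numerically. The sketch below assumes hypothetical values for `w` and `alphas` and the normalisation $\alpha_0 = 0$; the function names are illustrative.

```python
import math

def mnl_probs(w, alphas):
    """P_i0, ..., P_iJ with alpha_0 = 0 for the base choice."""
    scores = [0.0] + [sum(wk * ak for wk, ak in zip(w, a)) for a in alphas]
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    total = sum(e)
    return [x / total for x in e]

def marginal_effects(w, alphas, k):
    """dP_ij / dw_ik = P_ij * (alpha_jk - sum_e P_ie * alpha_ek), for j = 0..J."""
    p = mnl_probs(w, alphas)
    alpha_k = [0.0] + [a[k] for a in alphas]      # alpha_0k = 0 for the base choice
    alpha_bar = sum(pe * ae for pe, ae in zip(p, alpha_k))
    return [pj * (aj - alpha_bar) for pj, aj in zip(p, alpha_k)]

w = [1.0, 0.4]                          # hypothetical characteristics
alphas = [[0.1, 0.5], [-0.3, -0.2]]     # hypothetical alpha_1, alpha_2
effects = marginal_effects(w, alphas, k=1)
```

Because the probabilities sum to one, the effects sum to zero across choices. The effect for choice $j$ depends on all the $\alpha_{ek}$ through $\bar{\alpha}_k$, not only on $\alpha_{jk}$.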
II. CONDITIONAL LOGIT MODEL:
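When the data consist of choice-specific characteristics ($X_{ij}$) instead of individual-specific
characteristics, the model is:
$$\Pr(Y_i = j \mid X_{i1}, X_{i2}, \dots, X_{iJ}) = \Pr(Y_i = j \mid X_i) = P_{ij} = \frac{\exp(X_{ij}\beta)}{\sum_{l=0}^{J} \exp(X_{il}\beta)}$$
Notes:
In the multinomial logit model, $w_i$ is unchanged across choices while $\alpha_j$ varies.
In the conditional logit model, $X_{ij}$ varies across choices while $\beta$ is unchanged.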
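The multinomial logit model can be viewed as a special case of this. Suppose we have a vector
of individual characteristics $X_i$ with dimension $K$ ($K \times 1$). Then define for each choice $j$ the vector
$X_{ij}$ as follows:
$$X_{i1} = \begin{pmatrix} X_i' \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad X_{ij} = \begin{pmatrix} 0 \\ \vdots \\ X_i' \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad X_{iJ} = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ X_i' \end{pmatrix}$$
So $X_{ij}$ varies for each choice. Written as rows, with $X_i$ in the $j$-th block of $K$ elements:
$$X_{i0} = [0 \;\; 0 \;\; \cdots \;\; 0]$$
$$X_{i1} = [X_i \;\; 0 \;\; \cdots \;\; 0]$$
$$\vdots$$
$$X_{ij} = [0 \;\; \cdots \;\; X_i \;\; \cdots \;\; 0]$$
$$\vdots$$
$$X_{iJ} = [0 \;\; \cdots \;\; 0 \;\; X_i]$$
$$\rightarrow \beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_J \end{pmatrix}$$
where each $\beta_j$ is $K \times 1$, so that
$$P_{ij} = \frac{\exp(X_{ij}\beta)}{\sum_{l=0}^{J} \exp(X_{il}\beta)} = \frac{\exp(X_i\beta_j)}{1 + \sum_{l=1}^{J} \exp(X_i\beta_l)}$$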
In this model, the coefficients are not directly tied to the marginal effects:
$$\frac{\partial P_{ij}}{\partial x_{im}} = \left[P_{ij}\left(\mathbf{1}(j = m) - P_{im}\right)\right]\beta$$
where $\mathbf{1}(j = m)$ equals 1 if $j = m$ and 0 if not.
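A minimal sketch of the conditional logit probabilities and of the own and cross marginal effects. The attribute values, the coefficient vector and the function names are assumptions made for illustration only.

```python
import math

def clogit_probs(X, beta):
    """P_ij = exp(X_ij beta) / sum_l exp(X_il beta); X holds one attribute row per alternative."""
    v = [sum(x * b for x, b in zip(row, beta)) for row in X]
    m = max(v)
    e = [math.exp(s - m) for s in v]
    total = sum(e)
    return [x / total for x in e]

def marginal_effect(X, beta, j, m):
    """dP_ij / dx_im = P_ij * (1(j = m) - P_im) * beta, one entry per attribute."""
    p = clogit_probs(X, beta)
    indicator = 1.0 if j == m else 0.0
    return [p[j] * (indicator - p[m]) * b for b in beta]

# Hypothetical attributes (price, quality) for three alternatives.
X = [[1.0, 0.5], [0.8, 0.2], [0.3, 0.1]]
beta = [-1.0, 2.0]

own = marginal_effect(X, beta, j=0, m=0)      # effect of alternative 0's attributes on P_i0
cross = marginal_effect(X, beta, j=0, m=1)    # effect of alternative 1's attributes on P_i0
```

With a negative price coefficient, raising an alternative's own price lowers its probability (`own[0] < 0`), while raising a rival's price raises it (`cross[0] > 0`).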
Log-likelihood:
$$\ln L = \sum_{i=1}^{n} \sum_{j=0}^{J} d_{ij} \ln P_{ij}$$
III. MIXED LOGIT MODEL:
A model that combines the two models:
$$\Pr(Y_i = j) = \frac{\exp(X_{ij}\beta + W_i\alpha_j)}{\sum_{l=1}^{J} \exp(X_{il}\beta + W_i\alpha_l)} \;\rightarrow\; \Pr[Y_i = j] = \frac{\exp(Z_{ij}\theta)}{\sum_{l=1}^{J} \exp(Z_{il}\theta)}$$
where, with the normalisation $\alpha_1 = 0$,
$$Z_{i1} = [X_{i1} \;\; 0 \;\; 0 \;\; \cdots \;\; 0]$$
$$Z_{i2} = [X_{i2} \;\; W_i \;\; 0 \;\; \cdots \;\; 0]$$
$$Z_{ij} = [X_{ij} \;\; 0 \;\; \cdots \;\; W_i \;\; \cdots \;\; 0]$$
$$Z_{iJ} = [X_{iJ} \;\; 0 \;\; \cdots \;\; 0 \;\; W_i]$$
$$\theta = \begin{pmatrix} \beta \\ \alpha_2 \\ \vdots \\ \alpha_j \\ \vdots \\ \alpha_J \end{pmatrix}$$
This model does not share the following advantage of the conditional logit model: in the conditional logit model, if an
additional alternative is added to the choice set, one can predict its probability of
selection, since the parameters of the conditional logit model do not vary across alternatives.
Here the coefficient $\alpha_j$ of a new alternative is unknown.
IV. INDEPENDENCE OF IRRELEVANT ALTERNATIVES:
• The ratio of probabilities of any two alternatives is independent of the introduction of a
third alternative. This is unrealistic in many economic choice models.
• In the multinomial logit and conditional logit models, $P_{ij}/P_{im}$ is independent of the remaining
probabilities. This property is called the Independence of Irrelevant Alternatives (IIA).
• Consider the conditional probability of choosing $j$ given that you choose either $j$ or $l$:
$$\Pr(Y_i = j \mid Y_i \in \{j, l\}) = \frac{\Pr(Y_i = j)}{\Pr(Y_i = j) + \Pr(Y_i = l)} = \frac{\exp(X_{ij}\beta)}{\exp(X_{ij}\beta) + \exp(X_{il}\beta)}$$
• This probability does not depend on the characteristics $X_{im}$ of alternatives $m$ other than $j$
and $l$. The traditional example is McFadden’s famous blue bus/red bus example.
• Suppose there are initially three choices: commuting by car, by red bus or by blue bus.
• People are indifferent between red and blue buses:
$$U_{i,\text{redbus}} = U_{i,\text{bluebus}}$$
With the choice between the blue and red bus being random, suppose:
$$X_{i,\text{redbus}} = X_{i,\text{bluebus}} = X_{i,\text{bus}}$$
Then suppose that the probability of commuting by bus is
$$\Pr(Y_i = \text{bus}) = \Pr(Y_i = \text{redbus or bluebus}) = \frac{\exp(X_{i,\text{bus}}\beta)}{\exp(X_{i,\text{bus}}\beta) + \exp(X_{i,\text{car}}\beta)}$$
and
$$\Pr(Y_i = \text{redbus} \mid Y_i = \text{bus}) = \frac{1}{2}$$
• That would imply that the conditional probability of commuting by car, given that one
commutes by car or red bus, would differ from the same conditional probability if there
is no blue bus. Presumably taking away the blue bus choice would lead all the current blue
bus users to shift to the red bus, not to cars.
• $\dfrac{P_{il}}{P_{ik}} = \exp(X_{il}\beta - X_{ik}\beta)$ does not depend on any alternative other than $l$ and $k$.
• The conditional logit model does not allow for this type of substitution pattern. Again,
consider commuters initially choosing between two modes of transportation, car and red
bus, and suppose that
$$\frac{P_{i,\text{car}}}{P_{i,\text{redbus}}} = \exp(X_{\text{car}}\beta - X_{\text{redbus}}\beta) = \frac{P_{ic}}{P_{irb}} = 1$$
• Now suppose a third choice, blue bus, is added. Assuming bus commuters do not care
about the colour of the bus, consumers will choose between the two buses with equal probability.
The ratio of the probabilities of taking the red bus and the blue bus is 1:
$$\frac{P_{irb}}{P_{ibb}} = 1$$
But IIA implies that $P_{ic}/P_{irb}$ is the same whether or not another alternative (blue
bus) is added, so we have:
$$\frac{P_{irb}}{P_{ibb}} = \frac{P_{ic}}{P_{irb}} = 1 \quad \text{and} \quad P_{ic} + P_{irb} + P_{ibb} = 1, \quad \text{hence} \quad P_{ic} = P_{irb} = P_{ibb} = \frac{1}{3}$$
which are the probabilities that the logit model predicts.
• In real life, however, we would expect the probability of taking a car to remain the same
when a new bus is introduced that is exactly the same as the old bus. We would expect the
original probability of taking the bus to be split between the two buses after the second
one is introduced. That is, we would expect: $P_{ic} = \frac{1}{2}$, $P_{ibb} = \frac{1}{4}$, $P_{irb} = \frac{1}{4}$.
• In this case, the logit model, because of its IIA property, underestimates the probability of
taking a car (1/3 instead of 1/2) and overestimates the probability of taking a bus. The ratio of
the probabilities of car and red bus, $P_{ic}/P_{irb}$, actually changes (from 1 to 2) with the
introduction of the blue bus, rather than remaining constant as required by the logit model.
• The same kind of misprediction arises with logit models when the attributes of another
alternative change.
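The red bus/blue bus arithmetic can be reproduced directly from the logit formula. The sketch assumes equal systematic utilities (set to zero) for all modes, as in the example above; the function name is illustrative.

```python
import math

def logit_probs(v):
    """Logit probabilities for a dict mapping alternative name -> systematic utility."""
    total = sum(math.exp(x) for x in v.values())
    return {name: math.exp(x) / total for name, x in v.items()}

# Equal systematic utilities, so P_car / P_redbus = 1 before the blue bus exists.
before = logit_probs({"car": 0.0, "red bus": 0.0})
after = logit_probs({"car": 0.0, "red bus": 0.0, "blue bus": 0.0})

ratio_before = before["car"] / before["red bus"]
ratio_after = after["car"] / after["red bus"]
```

The car share falls from 1/2 to 1/3 while the car to red bus ratio stays at 1, which is exactly the IIA restriction.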
Suppose individuals have a choice of three restaurants: the Purdue restaurant (P), the
Krannert restaurant (K) and the Chauncey restaurant (C), with prices $P_p = 95$, $P_k = 85$, $P_c = 5$ and
quality $Q_p = 10$, $Q_k = 9$, $Q_c = 2$.
Suppose that the market shares of the three restaurants are $S_p = 0.1$, $S_k = 0.25$ and $S_c = 0.65$,
and that utility follows the conditional logit model
$$U_{ij} = -0.2 P_j + 2 Q_j + \varepsilon_{ij} \quad \rightarrow \quad \frac{P_{ip}}{P_{ic}} = \frac{0.1}{0.65}$$
Suppose that the Krannert restaurant raises its price to 1000 (taking it out of business).
The conditional logit model would predict $P_{ip} = 0.13$ and $P_{ic} = 0.87$, to satisfy
$$\frac{P_{ip}}{P_{ic}} = \frac{0.1}{0.65} = \text{const}$$
This seems implausible: people who were planning to go to Krannert would
appear to be more likely to go to the Purdue restaurant (PMU) than to the Chauncey restaurant, so one would
expect $S_p \approx 0.35$, $S_c \approx 0.65$.
(IIA does not hold in reality, so the conditional logit model is not valid in this case.)
IIA: adding another alternative, or changing the characteristics of a third alternative, does
not affect the ratio between the probabilities of two alternatives.
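The prediction of 0.13 and 0.87 follows from rescaling the remaining shares so that their ratio is unchanged. A short sketch using the market shares from the example (the function name `drop_alternative` is illustrative):

```python
shares = {"Purdue": 0.10, "Krannert": 0.25, "Chauncey": 0.65}   # shares from the example

def drop_alternative(shares, removed):
    """Shares implied by IIA after removing one alternative: the remaining
    shares are rescaled so that all pairwise ratios stay constant."""
    remaining = {k: v for k, v in shares.items() if k != removed}
    total = sum(remaining.values())
    return {k: v / total for k, v in remaining.items()}

iia_prediction = drop_alternative(shares, "Krannert")
```

Under IIA the Krannert customers are spread over Purdue and Chauncey in proportion to their existing shares, rather than moving mostly to the similar restaurant.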
• Test of IIA
Hausman & McFadden offer tests of the IIA assumption based on the observation that, if
the conditional logit model is true, $\beta$ can be consistently estimated by conditional logit
using any subset of the alternatives. Hausman’s test compares the estimate of $\beta$
using all alternatives with the estimate using a subset of the alternatives:
$$(\hat{\beta}_s - \hat{\beta}_f)' \, [\hat{V}_s - \hat{V}_f]^{-1} \, (\hat{\beta}_s - \hat{\beta}_f) \sim \chi^2$$
s: restricted subset, f: full set of alternatives, $H_0: \hat{\beta}_s = \hat{\beta}_f$
• We need IIA to hold in order to apply the conditional logit model.
If $H_0$ is rejected, IIA does not hold and the conditional logit is not a valid model in this case.
• $U_{ij} = X_{ij}\beta + \varepsilon_{ij}$
The IIA assumption needs to hold in reality for the conditional logit model to apply.
The IIA property follows from the initial assumption that the $\varepsilon_{ij}$ are independent with extreme value
distributions.
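The test statistic is a quadratic form in the difference between the two estimates. The sketch below computes it with a small hand-written linear solver; all estimates and covariance matrices are hypothetical numbers chosen for illustration, not results from any data set.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def hausman_statistic(beta_s, beta_f, V_s, V_f):
    """(b_s - b_f)' [V_s - V_f]^{-1} (b_s - b_f)."""
    d = [bs - bf for bs, bf in zip(beta_s, beta_f)]
    V = [[vs - vf for vs, vf in zip(rs, rf)] for rs, rf in zip(V_s, V_f)]
    z = solve(V, d)                      # z = [V_s - V_f]^{-1} d
    return sum(di * zi for di, zi in zip(d, z))

# Hypothetical estimates, for illustration only.
beta_f = [-0.20, 2.00]                               # full choice set
beta_s = [-0.26, 2.30]                               # restricted subset
V_f = [[0.0010, 0.0002], [0.0002, 0.0400]]
V_s = [[0.0019, 0.0005], [0.0005, 0.0850]]

H = hausman_statistic(beta_s, beta_f, V_s, V_f)
```

`H` is then compared with a chi-squared critical value whose degrees of freedom equal the number of coefficients compared (5.99 at the 5% level for two coefficients).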
V. NESTED LOGIT MODEL.
• If the test of IIA fails (reject $H_0: \hat{\beta}_s = \hat{\beta}_f$), then the conditional logit model is not valid.
We need to modify the multinomial logit model.
One way to introduce correlation between the choices is through nesting them. Suppose
the set of choices $\{0, 1, \dots, J\}$ can be partitioned into $S$ sets $B_1, B_2, \dots, B_S$, so that the
full set of choices can be written as:
$$\{0, 1, \dots, J\} = \bigcup_{s=1}^{S} B_s$$
Let $Z_s$ be set-specific characteristics (branch characteristics). McFadden (1981) studied
the following model, adjusted with $\rho_s$:
• Conditional probability:
$$\Pr(Y_i = j \mid X_i, Y_i \in B_s) = \frac{\exp(\rho_s^{-1} X_{ij}\beta)}{\sum_{l \in B_s} \exp(\rho_s^{-1} X_{il}\beta)}$$
• Within the sets, the correlation coefficient for the $\varepsilon_{ij}$ is equal to $1 - \rho_s^2$. Between the sets
the $\varepsilon_{ij}$ are independent. The probabilities are adjusted by $\rho_s$ in each group.
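The probability of a choice in the set $B_s$ is
$$\Pr(Y_i \in B_s \mid X_i) = \frac{\exp(Z_s\alpha)\left[\sum_{l \in B_s} \exp(\rho_s^{-1} X_{il}\beta)\right]^{\rho_s}}{\sum_{t=1}^{S} \exp(Z_t\alpha)\left[\sum_{l \in B_t} \exp(\rho_t^{-1} X_{il}\beta)\right]^{\rho_t}}$$
$$\rightarrow \Pr(Y_i = j \mid X_i) = \Pr(Y_i \in B_s \mid X_i) \times \Pr(Y_i = j \mid X_i, Y_i \in B_s)$$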
If we fix $\rho_s = 1$ for all $s$, then
$$\Pr(Y_i = j \mid X_i) = \frac{\exp(X_{ij}\beta + Z_s\alpha)}{\sum_{t=1}^{S} \sum_{l \in B_t} \exp(X_{il}\beta + Z_t\alpha)}$$
and we are back in the conditional logit model.
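The nested logit probabilities can be sketched as the product of the nest probability and the within-nest probability. The example reuses the car/red bus/blue bus setting with hypothetical equal utilities; the function name and the value $\rho = 0.05$ for the bus nest are assumptions made for illustration.

```python
import math

def nested_logit_probs(xb, z_alpha, rho, nests):
    """Pr(Y = j) = Pr(Y in B_s) * Pr(Y = j | Y in B_s).

    xb[j]      : X_ij * beta for alternative j
    z_alpha[s] : Z_s * alpha for nest s
    rho[s]     : nest parameter rho_s
    nests[s]   : list of the alternatives in B_s
    """
    n = len(nests)
    inner = [sum(math.exp(xb[l] / rho[s]) for l in nests[s]) for s in range(n)]
    weight = [math.exp(z_alpha[s]) * inner[s] ** rho[s] for s in range(n)]
    total = sum(weight)
    probs = {}
    for s, members in enumerate(nests):
        for j in members:
            within = math.exp(xb[j] / rho[s]) / inner[s]
            probs[j] = (weight[s] / total) * within
    return probs

xb = {"car": 0.0, "red bus": 0.0, "blue bus": 0.0}   # hypothetical equal utilities
nests = [["car"], ["red bus", "blue bus"]]
z_alpha = [0.0, 0.0]

iia_case = nested_logit_probs(xb, z_alpha, rho=[1.0, 1.0], nests=nests)
nested_case = nested_logit_probs(xb, z_alpha, rho=[1.0, 0.05], nests=nests)
```

With $\rho_s = 1$ everywhere, each alternative gets probability 1/3, as in the conditional logit. With a small $\rho$ in the bus nest, the car probability moves towards 1/2 and the two buses split the rest.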
In general this model corresponds to individuals choosing the option with the highest
utility, where the utility of choice $j$ in set $B_s$ for individual $i$ is
$$U_{ij} = X_{ij}\beta + Z_s\alpha + \varepsilon_{ij}$$
McFadden supposes that the joint distribution function of the $\varepsilon_{ij}$ is
$$F(\varepsilon_{i0}, \dots, \varepsilon_{iJ}) = \exp\left(-\sum_{s=1}^{S}\left(\sum_{j \in B_s} \exp(-\rho_s^{-1}\varepsilon_{ij})\right)^{\rho_s}\right)$$
From this he derives the results given above.
• How do we estimate these models?
One approach is to construct the log-likelihood and directly maximize it. That is
complicated, especially since the log-likelihood function is not concave (but it is
not impossible).
An easier alternative is to use the nesting structure directly. Within a nest we have a
conditional logit model with coefficients $\rho_s^{-1}\beta$. Hence we can directly estimate $\rho_s^{-1}\beta$
using the concavity of the conditional logit log-likelihood (the Newton-Raphson procedure will
converge to a global maximum). Denote these estimates by $\hat{\lambda}_s = \widehat{\rho_s^{-1}\beta}$.
Then the probability of a particular set $B_s$ can be used to estimate $\rho_s$ and $\alpha$ through:
$$\Pr(Y_i \in B_s \mid X_i) = \frac{\exp(Z_s\alpha)\left(\sum_{l \in B_s} \exp(X_{il}\hat{\lambda}_s)\right)^{\rho_s}}{\sum_{t=1}^{S} \exp(Z_t\alpha)\left(\sum_{l \in B_t} \exp(X_{il}\hat{\lambda}_t)\right)^{\rho_t}} = \frac{\exp(Z_s\alpha + \rho_s\hat{W}_s)}{\sum_{t=1}^{S} \exp(Z_t\alpha + \rho_t\hat{W}_t)}$$
$\hat{W}_s$ is called the “inclusive value”, where:
$$\hat{W}_s = \ln\left(\sum_{l \in B_s} \exp(X_{il}\hat{\lambda}_s)\right)$$
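A short check that the inclusive-value form of the nest probability matches the original form. The first-step indices `x_lambda`, the nest indices `z_alpha` and the parameters `rho` are hypothetical, and the function names are illustrative.

```python
import math

def inclusive_value(x_lambda):
    """W_s = ln( sum_{l in B_s} exp(X_il * lambda_s) )."""
    return math.log(sum(math.exp(v) for v in x_lambda))

def nest_probs(z_alpha, rho, W):
    """Pr(Y in B_s) = exp(Z_s alpha + rho_s W_s) / sum_t exp(Z_t alpha + rho_t W_t)."""
    e = [math.exp(za + r * w) for za, r, w in zip(z_alpha, rho, W)]
    total = sum(e)
    return [x / total for x in e]

# Hypothetical first-step fitted indices X_il * lambda_s for two nests.
x_lambda = [[0.4], [1.0, 0.2, -0.5]]
z_alpha = [0.3, -0.1]
rho = [1.0, 0.6]

W = [inclusive_value(v) for v in x_lambda]
p_nest = nest_probs(z_alpha, rho, W)

# The same probability in the form exp(Z_s alpha) * (sum_l exp(X_il lambda_s))^rho_s.
raw = [math.exp(za) * sum(math.exp(v) for v in xs) ** r
       for za, r, xs in zip(z_alpha, rho, x_lambda)]
raw_total = sum(raw)
direct = [x / raw_total for x in raw]
```

Treating $\hat{W}_s$ as an extra regressor with coefficient $\rho_s$ is what turns the second step into an ordinary conditional logit over nests.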
• We have another conditional logit model, with likelihood function:
$$L = \prod_{i=1}^{n} \prod_{s:\, Y_i \in B_s} \Pr(Y_i \in B_s \mid X_i) = \prod_{i=1}^{n} \prod_{s:\, Y_i \in B_s} \frac{\exp(Z_s\alpha + \rho_s\hat{W}_s)}{\sum_{t=1}^{S} \exp(Z_t\alpha + \rho_t\hat{W}_t)}$$
• These models can be extended to many layers of nests. It should be noted that both the
order of the nests and the elements of each nest are very important.
VI. MULTINOMIAL PROBIT MODEL:
• A natural alternative model that avoids the IIA problem, which arises when the errors are correlated
across choices, is to work with normally distributed errors, $\varepsilon_{ij} \sim N(\cdot)$. We no longer
assume that $\varepsilon_{ij}$ follows the extreme value distribution.
• Note that the extreme value distribution is close to the normal distribution, but the extreme value
distribution is much easier to calculate with.
• The cost of using the normal distribution is the complicated likelihood function.
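$$U_{ij} = X_{ij}\beta + \varepsilon_{ij}, \qquad j = 1, 2, \dots, J$$
$$U_i = \begin{pmatrix} U_{i1} \\ U_{i2} \\ \vdots \\ U_{iJ} \end{pmatrix} = \begin{pmatrix} X_{i1}\beta + \varepsilon_{i1} \\ X_{i2}\beta + \varepsilon_{i2} \\ \vdots \\ X_{iJ}\beta + \varepsilon_{iJ} \end{pmatrix}$$
with:
$$\varepsilon_i = \begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \vdots \\ \varepsilon_{iJ} \end{pmatrix} \Bigg|\; X_i \sim N(0, \Sigma)$$
with unrestricted covariance matrix $\Sigma$.
$$\Pr(Y_i = q) = \Pr[U_{iq} > U_{ij},\; j = 1, \dots, J,\; j \neq q]$$
or
$$\Pr(Y_i = q) = \Pr[(\varepsilon_{i1} - \varepsilon_{iq}) < (X_{iq} - X_{i1})\beta;\; \dots;\; (\varepsilon_{iJ} - \varepsilon_{iq}) < (X_{iq} - X_{iJ})\beta]$$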
• The main obstacle to the implementation of the multinomial probit (MNP) model is the difficulty
in computing the multivariate normal probabilities for any $J > 2$.
• Recent results on accurate simulation of multivariate normal integrals have made estimation of
the MNP model feasible.
• Read: Geweke, Keane and Runkle (1994), Review of Economics and Statistics 76, No. 4, for the method, if you
want to use the MNP model.
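• For $J = 3$:
$$P(y_i = 1) = P(U_{i1} > U_{i2};\; U_{i1} > U_{i3})$$
$$P(y_i = 1) = P\big(u_1 = \varepsilon_{i2} - \varepsilon_{i1} < (X_{i1} - X_{i2})\beta;\;\; u_2 = \varepsilon_{i3} - \varepsilon_{i1} < (X_{i1} - X_{i3})\beta\big)$$
so the probability is the integral of the joint normal density of $(u_1, u_2)$ over the region defined by these two inequalities, with
$$\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \sim N(0, \Sigma^*)$$
where:
$$\Sigma^* = \begin{pmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} \Sigma \begin{pmatrix} -1 & -1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}$$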
• Each element of the likelihood is a double integral and must be evaluated numerically.
• This model does not suffer from the IIA problem.
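A crude frequency simulator illustrates why correlated normal errors escape IIA. This is only a sketch under assumptions: equal systematic utilities, unit variances, a bus-error correlation of 0.999, and hand-written Cholesky factors. It is not the simulation method of Geweke, Keane and Runkle.

```python
import math
import random

def simulate_probit_shares(v, chol, n_draws, seed=0):
    """Frequency simulator for Pr(Y = j) with errors N(0, Sigma), Sigma = chol * chol'."""
    rng = random.Random(seed)
    J = len(v)
    counts = [0] * J
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(J)]
        # eps = chol * z has covariance chol * chol' (chol is lower triangular)
        eps = [sum(chol[row][col] * z[col] for col in range(row + 1)) for row in range(J)]
        utils = [vj + ej for vj, ej in zip(v, eps)]
        counts[utils.index(max(utils))] += 1
    return [c / n_draws for c in counts]

v = [0.0, 0.0, 0.0]    # car, red bus, blue bus: hypothetical equal systematic utilities

# Independent errors: Sigma is the identity matrix.
independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Red and blue bus errors with correlation corr (Cholesky factor written by hand).
corr = 0.999
correlated = [[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, corr, math.sqrt(1.0 - corr * corr)]]

p_indep = simulate_probit_shares(v, independent, n_draws=40_000)
p_corr = simulate_probit_shares(v, correlated, n_draws=40_000)
```

With independent errors the three shares are each close to 1/3. With almost perfectly correlated bus errors, the car share is close to 1/2 and the two buses split the remainder, which is the pattern one expects in the red bus/blue bus example.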
VII. ORDERED LOGIT, ORDERED PROBIT AND SEQUENTIAL MODELS:
1. Ordered Probit:
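$$Y_i^* = X_i\beta + \varepsilon_i$$
$Y_i^*$ is unobservable. We observe:
$$Y_i = \begin{cases} 0 & \text{if } Y_i^* \le 0 \\ 1 & \text{if } 0 < Y_i^* \le \mu_1 \\ 2 & \text{if } \mu_1 < Y_i^* \le \mu_2 \\ \vdots & \\ J & \text{if } \mu_{J-1} < Y_i^* \end{cases}$$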
$\mu_1, \mu_2, \dots, \mu_{J-1}$ are unknown parameters to be estimated together with $\beta$.
Assume that $\varepsilon$ is normally distributed across observations.
Normalize the mean and variance of $\varepsilon$: $\varepsilon \sim N(0, 1)$.
We have:
$$P(y_i = 0 \mid X_i) = \Phi(-X_i\beta)$$
$$P(y_i = 1 \mid X_i) = \Phi(\mu_1 - X_i\beta) - \Phi(-X_i\beta)$$
$$P(y_i = 2 \mid X_i) = \Phi(\mu_2 - X_i\beta) - \Phi(\mu_1 - X_i\beta)$$
$$\vdots$$
$$P(y_i = J \mid X_i) = 1 - \Phi(\mu_{J-1} - X_i\beta)$$
We must have $0 < \mu_1 < \mu_2 < \dots < \mu_{J-1}$ (for all the probabilities to be positive).
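These probabilities are differences of the normal CDF at consecutive cutoffs. A minimal sketch, with a hypothetical index value and hypothetical cutoffs (the function names are illustrative):

```python
import math

def norm_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ordered_probit_probs(xb, mu):
    """P(y = 0), ..., P(y = J) for index xb = X_i * beta and cutoffs mu_1 < ... < mu_{J-1}."""
    cuts = [0.0] + list(mu)                  # the first cutoff is normalised to 0
    cdf = [norm_cdf(c - xb) for c in cuts]   # Phi(cutoff - X_i beta)
    probs = [cdf[0]]
    probs += [cdf[j] - cdf[j - 1] for j in range(1, len(cdf))]
    probs.append(1.0 - cdf[-1])
    return probs

probs = ordered_probit_probs(xb=0.3, mu=[0.8, 1.9])   # hypothetical values, J = 3
```

The list has $J + 1$ entries that sum to one, and every entry is positive as long as the cutoffs are increasing.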
Likelihood function:
$$L = \prod_{\text{all observations}} \Pr(Y_i = j), \qquad j \in \{0, 1, \dots, J\}$$
where $j$ is the category observed for individual $i$.
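Marginal effects:
$$\frac{\partial P(Y_i = 0 \mid X_i)}{\partial x_{ik}} = -\phi(X_i\beta)\,\beta_k$$
$$\frac{\partial P(Y_i = j \mid X_i)}{\partial x_{ik}} = \left[\phi(\mu_{j-1} - X_i\beta) - \phi(\mu_j - X_i\beta)\right]\beta_k, \qquad 1 \le j \le J - 1, \text{ with } \mu_0 = 0$$
$$\frac{\partial P(Y_i = J \mid X_i)}{\partial x_{ik}} = \phi(\mu_{J-1} - X_i\beta)\,\beta_k$$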
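The marginal effects can be computed from the normal density at the cutoffs. The values of the index, cutoffs and coefficient below are hypothetical, and the function names are illustrative.

```python
import math

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def ordered_probit_effects(xb, mu, beta_k):
    """dP(y = j) / dx_k for j = 0..J, with the first cutoff normalised to 0."""
    cuts = [0.0] + list(mu)
    dens = [norm_pdf(c - xb) for c in cuts]       # phi(cutoff - X_i beta)
    effects = [-dens[0] * beta_k]
    effects += [(dens[j - 1] - dens[j]) * beta_k for j in range(1, len(dens))]
    effects.append(dens[-1] * beta_k)
    return effects

effects = ordered_probit_effects(xb=0.3, mu=[0.8, 1.9], beta_k=0.5)   # hypothetical values
```

With a positive $\beta_k$, the probability of the lowest category falls and that of the highest category rises; the signs for the middle categories are ambiguous. The effects sum to zero across categories.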
2. Ordered Logit:
Replacing $\Phi$ with the logistic function
$$F(x) = \frac{\exp(x)}{1 + \exp(x)}, \qquad F(X_i\beta) = \frac{\exp(X_i\beta)}{1 + \exp(X_i\beta)}$$
gives the ordered logit model.
3. Sequential Multinomial Models:
A special case of an ordered variable (where choices have a natural ranking) is a
sequential variable. This occurs when the second event is dependent on the first event, the
third event is dependent on the previous two events, and so on.
Person $i$ being in the $n$-th category means that person $i$ has passed through all $(n-1)$ previous categories:
$$y_i = \begin{cases} 1 & \text{not high school} \\ 2 & \text{high school, not college} \\ 3 & \text{college} \end{cases}$$
$$\Pr[y_i = 2] = \Pr[y_i = 2 \mid y_i \neq 1] \times \Pr[y_i \neq 1] = \Phi(X_{2i}\beta_2)\,\big(1 - \Phi(X_{1i}\beta_1)\big)$$
The parameters $\beta_1$ and $\beta_2$ can be estimated by maximizing the log-likelihood:
$$\ln L = \sum_{i=1}^{n} \sum_{j=1}^{m} y_{ij} \ln p_{ij}$$
where $y_{ij} = 1$ if $y_i = j$ and 0 otherwise, $p_{i1} = \Phi(X_{1i}\beta_1)$, $p_{i2}$ is given in the preceding equation, and $p_{i3} = 1 - p_{i1} - p_{i2}$.
Notes: $P(y_i = 2)$ means $P(y_i = 2 \text{ and } y_i \neq 1)$.
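A minimal sketch of the three category probabilities and the log-likelihood of the sequential model. The index values and outcomes are hypothetical, and the function names are illustrative.

```python
import math

def norm_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sequential_probs(x1b1, x2b2):
    """p1 = Phi(X1 b1); p2 = Phi(X2 b2) * (1 - Phi(X1 b1)); p3 = 1 - p1 - p2."""
    p1 = norm_cdf(x1b1)
    p2 = norm_cdf(x2b2) * (1.0 - p1)
    return [p1, p2, 1.0 - p1 - p2]

def log_likelihood(indices, y):
    """ln L = sum_i sum_j y_ij ln p_ij, where y[i] in {1, 2, 3} is the observed category."""
    return sum(math.log(sequential_probs(a, b)[yi - 1]) for (a, b), yi in zip(indices, y))

# Hypothetical index values (X_1i * beta_1, X_2i * beta_2) and outcomes for four people.
indices = [(-0.5, 0.2), (0.4, -0.1), (-1.0, -0.8), (0.0, 0.6)]
y = [2, 1, 3, 2]
ll = log_likelihood(indices, y)
```

In an estimation routine, `ll` would be maximised over $\beta_1$ and $\beta_2$; here the indices are fixed only to show how each observation contributes.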