Advanced Econometrics - Part II - Chapter 5: Limited - Dependent Variable Models


Nam T. Hoang, UNE Business School, University of New England

Chapter 5
LIMITED - DEPENDENT VARIABLE MODELS: TRUNCATION, CENSORING (TOBIT) AND SAMPLE SELECTION

I. TRUNCATION:

Truncation occurs when the sample data are drawn from a subset of a larger population of interest.

1. Truncated distributions:

A truncated distribution is the part of an untruncated distribution that lies above or below some specified value.

• Density of a truncated random variable: if a continuous random variable $x$ has pdf $f(x)$ and $a$ is a constant, then

$$f(x \mid x > a) = \frac{f(x)}{\operatorname{Prob}(x > a)}$$

If $x \sim N(\mu, \sigma^2)$, then

$$P(x > a) = 1 - \Phi\left(\frac{a - \mu}{\sigma}\right) = 1 - \Phi(\alpha), \qquad \alpha = \frac{a - \mu}{\sigma}$$

$$f(x \mid x > a) = \frac{f(x)}{1 - \Phi(\alpha)} = \frac{(2\pi\sigma^2)^{-1/2}\, e^{-(x - \mu)^2 / (2\sigma^2)}}{1 - \Phi(\alpha)} = \frac{\frac{1}{\sigma}\,\phi\left(\frac{x - \mu}{\sigma}\right)}{1 - \Phi(\alpha)}, \qquad \phi = \Phi'$$

o Truncated standard normal distribution ($\mu = 0$, $\sigma = 1$): $f(x \mid x > a) = \phi(x) / [1 - \Phi(a)]$.

2. Moments of truncated distributions:

$$E[x \mid x > a] = \int_a^{\infty} x\, f(x \mid x > a)\, dx$$

$$\operatorname{Var}[x \mid x > a] = \int_a^{\infty} \left(x - E[x \mid x > a]\right)^2 f(x \mid x > a)\, dx$$

o Truncated mean and truncated variance: if $x \sim N(\mu, \sigma^2)$ and $a$ is a constant, then for truncation from below ($x > a$) or from above ($x < a$)

$$E[x \mid \text{truncation}] = \mu + \sigma\lambda(\alpha)$$

$$\operatorname{Var}[x \mid \text{truncation}] = \sigma^2[1 - \delta(\alpha)]$$

where $\alpha = (a - \mu)/\sigma$, $\phi(\cdot)$ is the standard normal density, and

$$\lambda(\alpha) = \frac{\phi(\alpha)}{1 - \Phi(\alpha)} \quad \text{if the truncation is } x > a$$

$$\lambda(\alpha) = -\frac{\phi(\alpha)}{\Phi(\alpha)} \quad \text{if the truncation is } x < a$$

$$\delta(\alpha) = \lambda(\alpha)[\lambda(\alpha) - \alpha]$$

Since $0 < \delta(\alpha) < 1$ for all values of $\alpha$, truncation always reduces the variance: $\sigma^2_{\text{truncated}} < \sigma^2$.
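The truncated mean and variance formulas for the normal distribution can be checked by simulation. The following is a minimal Python sketch, assuming NumPy and SciPy are available; the function name `truncated_moments` and the parameter values are illustrative choices, not part of the notes.

```python
import numpy as np
from scipy.stats import norm


def truncated_moments(mu, sigma, a):
    """Mean and variance of x ~ N(mu, sigma^2) conditional on x > a."""
    alpha = (a - mu) / sigma
    lam = norm.pdf(alpha) / (1.0 - norm.cdf(alpha))  # lambda(alpha) for x > a
    delta = lam * (lam - alpha)                      # delta(alpha), in (0, 1)
    return mu + sigma * lam, sigma**2 * (1.0 - delta)


# Illustrative (assumed) parameter values.
mu, sigma, a = 1.0, 2.0, 0.5

rng = np.random.default_rng(0)
x = rng.normal(mu, sigma, size=1_000_000)
kept = x[x > a]  # truncated sample: only draws above a survive

mean_formula, var_formula = truncated_moments(mu, sigma, a)
print("mean: formula", mean_formula, "simulated", kept.mean())
print("var:  formula", var_formula, "simulated", kept.var())
```

The simulated moments of the surviving draws should agree with the closed forms up to sampling noise, and the truncated variance should come out below $\sigma^2$.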
3. The truncated regression model:

Assume now that $\mu_i = X_i\beta$, so that

$$Y_i = X_i\beta + \varepsilon_i, \qquad \varepsilon_i \mid X_i \sim N(0, \sigma^2), \qquad Y_i \mid X_i \sim N(X_i\beta, \sigma^2)$$

We are interested in the distribution of $Y_i$ given that $Y_i$ is greater than the truncation point $a$:

$$E[Y_i \mid Y_i > a] = X_i\beta + \sigma\,\frac{\phi[(a - X_i\beta)/\sigma]}{1 - \Phi[(a - X_i\beta)/\sigma]}$$

The marginal effect is

$$\frac{\partial E[Y_i \mid Y_i > a]}{\partial X_i} = \beta + \sigma\,\frac{d\lambda_i}{d\alpha_i}\,\frac{\partial \alpha_i}{\partial X_i} = \beta + \sigma(\lambda_i^2 - \alpha_i\lambda_i)\left(-\frac{\beta}{\sigma}\right) = \beta(1 - \lambda_i^2 + \alpha_i\lambda_i) = \beta(1 - \delta_i)$$

where $\alpha_i = (a - X_i\beta)/\sigma$, $\lambda_i = \lambda(\alpha_i)$ and $\delta_i = \delta(\alpha_i)$.

Because $1 - \delta_i$ lies between zero and one, for every element of $X_i$ the marginal effect is smaller in absolute value than the corresponding coefficient. The conditional variance is

$$\operatorname{Var}[Y_i \mid Y_i > a] = \sigma^2(1 - \delta_i)$$

o Estimation:

$$Y_i \mid Y_i > a = E[Y_i \mid Y_i > a] + u_i = X_i\beta + \sigma\lambda_i + u_i, \qquad \operatorname{Var}[u_i] = \sigma^2(1 - \delta_i)$$

If we use OLS on $(Y_i, X_i)$ we omit $\lambda_i$ ⇒ all the biases that arise from an omitted variable can be expected.

o If $E(X \mid Y)$ in the full population is a linear function of $Y$, then the OLS slope vector $b$ satisfies $\operatorname{plim} b = \tau\beta$ for some proportionality constant $\tau$.

II. CENSORED DATA

• A very common problem in microeconomic data is censoring of the dependent variable.
• When the dependent variable is censored, values in a certain range are all transformed to (or reported as) a single value.

4. The censored normal distribution:

When data are censored, the distribution that applies to the sample data is a mixture of discrete and continuous distributions. Define a new random variable $Y$, transformed from the original one, $Y^*$, by

$$Y = \begin{cases} 0 & \text{if } Y^* \le 0 \\ Y^* & \text{if } Y^* > 0 \end{cases}$$

If $Y^* \sim N(\mu, \sigma^2)$, then

$$\operatorname{Prob}(Y = 0) = \operatorname{Prob}(Y^* \le 0) = \Phi\left(-\frac{\mu}{\sigma}\right) = 1 - \Phi\left(\frac{\mu}{\sigma}\right)$$

If $Y^* > 0$, then $Y$ has the density of $Y^*$. The distribution of $Y$ is therefore a mixture of a discrete part (the mass at zero) and a continuous part.

Moments: if $Y^* \sim N(\mu, \sigma^2)$ and $Y = a$ if $Y^* \le a$, or else $Y = Y^*$, then

$$E[Y] = \Phi\, a + (1 - \Phi)(\mu + \sigma\lambda)$$

$$\operatorname{Var}[Y] = \sigma^2(1 - \Phi)\left[(1 - \delta) + (\alpha - \lambda)^2\,\Phi\right]$$

where

$$\Phi = \Phi\left(\frac{a - \mu}{\sigma}\right) = \Phi(\alpha) = \operatorname{Prob}(Y^* \le a), \qquad \lambda = \frac{\phi}{1 - \Phi}, \qquad \delta = \lambda^2 - \lambda\alpha$$

For $a = 0$:

$$E[Y \mid a = 0] = \Phi\left(\frac{\mu}{\sigma}\right)(\mu + \sigma\lambda), \qquad \lambda = \frac{\phi(\mu/\sigma)}{\Phi(\mu/\sigma)}$$

5. The censored regression model (Tobit model):

a. Model:

$$Y_i^* = X_i\beta + \varepsilon_i$$

$$Y_i = \begin{cases} 0 & \text{if } Y_i^* \le 0 \\ Y_i^* & \text{if } Y_i^* > 0 \end{cases}$$

We only observe $Y_i$.

$$E[Y_i \mid X_i] = \Phi\left(\frac{X_i\beta}{\sigma}\right)(X_i\beta + \sigma\lambda_i)$$

Note that $E[Y_i^* \mid X_i] = X_i\beta = \mu_i$, and

$$\lambda_i = \frac{\phi[(0 - X_i\beta)/\sigma]}{1 - \Phi[(0 - X_i\beta)/\sigma]} = \frac{\phi(X_i\beta/\sigma)}{\Phi(X_i\beta/\sigma)}$$

For the latent variable $Y^*$,

$$\frac{\partial E[Y_i^* \mid X_i]}{\partial X_i} = \beta$$

but $Y^*$ is unobservable.

b. Marginal effects:

$$Y_i^* = X_i\beta + \varepsilon_i$$

$$Y_i = \begin{cases} a & \text{if } Y_i^* \le a \\ Y_i^* & \text{if } a < Y_i^* < b \\ b & \text{if } Y_i^* \ge b \end{cases}$$

Let $f(\varepsilon)$ and $F(\varepsilon)$ denote the density and cdf of $\varepsilon$, and assume $\varepsilon \sim \text{iid}(0, \sigma^2)$ and $f(\varepsilon \mid X) = f(\varepsilon)$. Then

$$\frac{\partial E(Y \mid X)}{\partial X} = \beta \cdot \operatorname{Prob}[a < Y^* < b]$$

This result does not assume that $\varepsilon$ is normally distributed. For the standard case with censoring at zero and normally distributed disturbances, $\varepsilon \sim N(0, \sigma^2)$:

$$\frac{\partial E(Y_i \mid X_i)}{\partial X_i} = \beta\,\Phi\left(\frac{X_i\beta}{\sigma}\right)$$

OLS estimates are usually close to the MLE estimates times the proportion of non-limit observations in the sample.

o A useful decomposition of $\partial E(Y_i \mid X_i)/\partial X_i$:

$$\frac{\partial E(Y_i \mid X_i)}{\partial X_i} = \beta\left\{\Phi_i\left[1 - \lambda_i(\alpha_i + \lambda_i)\right] + \phi_i(\alpha_i + \lambda_i)\right\}$$

where $\alpha_i = X_i\beta/\sigma$, $\Phi_i = \Phi(\alpha_i)$ and $\lambda_i = \phi_i/\Phi_i$.

Taking the two parts separately:

$$\frac{\partial E(Y_i \mid X_i)}{\partial X_i} = \operatorname{Prob}[Y_i > 0]\,\frac{\partial E[Y_i \mid X_i, Y_i > 0]}{\partial X_i} + E[Y_i \mid X_i, Y_i > 0]\,\frac{\partial \operatorname{Prob}[Y_i > 0]}{\partial X_i}$$

Thus a change in $X_i$ has two effects: it affects the conditional mean of $Y_i^*$ in the positive part of the distribution, and it affects the probability that the observation will fall in that part of the distribution.

6. Estimation and Inference with Censored Tobit:

Estimation of the Tobit model and of the truncated regression is similar; both use MLE.
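As a concrete illustration of Tobit estimation by maximum likelihood, the sketch below simulates data censored at zero and maximizes a log-likelihood made of two parts: a normal density term for the non-limit observations and a probability term $\Phi(-X_i\beta/\sigma)$ for the limit observations. It is a minimal sketch under stated assumptions, not the notes' own procedure: the data are simulated, the function name `tobit_negloglik` is hypothetical, and the optimizer works on $(\beta, \ln\sigma)$ with SciPy's general-purpose BFGS routine simply to keep $\sigma$ positive.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def tobit_negloglik(params, y, X):
    """Negative Tobit log-likelihood with censoring from below at zero."""
    beta, sigma = params[:-1], np.exp(params[-1])  # last entry is log(sigma)
    xb = X @ beta
    pos = y > 0
    # Non-limit observations: (1/sigma) * phi((y - X beta) / sigma).
    ll_pos = norm.logpdf((y[pos] - xb[pos]) / sigma) - np.log(sigma)
    # Limit observations: Prob(Y* <= 0) = Phi(-X beta / sigma).
    ll_zero = norm.logcdf(-xb[~pos] / sigma)
    return -(ll_pos.sum() + ll_zero.sum())


# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(1)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, sigma_true = np.array([0.5, 1.0]), 1.5
y_star = X @ beta_true + rng.normal(scale=sigma_true, size=n)
y = np.maximum(y_star, 0.0)  # observed, censored outcome

# OLS on the censored data, used as a starting value and for comparison.
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]

res = minimize(tobit_negloglik, np.append(b_ols, 0.0), args=(y, X), method="BFGS")
beta_hat, sigma_hat = res.x[:-1], np.exp(res.x[-1])

print("OLS  :", b_ols)
print("Tobit:", beta_hat, sigma_hat)
```

In runs of this kind the OLS slope on the censored outcome is attenuated toward zero, while the Tobit estimates sit close to the values used to generate the data.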
The likelihood and log-likelihood for the censored regression model are

$$L = \prod_{y_i = 0}\left[1 - \Phi\left(\frac{X_i\beta}{\sigma}\right)\right] \prod_{y_i > 0} \frac{1}{\sigma}\,\phi\left(\frac{Y_i - X_i\beta}{\sigma}\right)$$

$$\ln L = \sum_{y_i > 0} -\frac{1}{2}\left[\ln(2\pi) + \ln\sigma^2 + \frac{(Y_i - X_i\beta)^2}{\sigma^2}\right] + \sum_{y_i = 0} \ln\left[1 - \Phi\left(\frac{X_i\beta}{\sigma}\right)\right]$$

The two parts correspond to the classical regression for the non-limit observations and the relevant probabilities for the limit observations. Although this likelihood is a mixture of discrete and continuous distributions, maximizing it in the usual way produces an estimator with all the familiar desirable properties of MLEs.

o With $\gamma = \beta/\sigma$ and $\theta = 1/\sigma$:

$$\ln L = \sum_{Y_i > 0} -\frac{1}{2}\left[\ln(2\pi) - \ln\theta^2 + (\theta Y_i - X_i\gamma)^2\right] + \sum_{Y_i = 0} \ln\left[1 - \Phi(X_i\gamma)\right]$$

⇒ The Hessian is always negative definite. The Newton-Raphson method is simple to use and usually converges quickly.

o By contrast, for the truncated model:

$$L = \prod_{i=1}^{n} f(Y_i \mid Y_i > a) = \prod_{i=1}^{n} \frac{\frac{1}{\sigma}\,\phi\left(\frac{Y_i - X_i\beta}{\sigma}\right)}{1 - \Phi\left(\frac{a - X_i\beta}{\sigma}\right)}$$

$$\ln L = \sum_{i=1}^{n}\left\{-\frac{1}{2}\left[\ln(2\pi) + \ln\sigma^2 + \frac{(Y_i - X_i\beta)^2}{\sigma^2}\right] - \ln\left[1 - \Phi\left(\frac{a - X_i\beta}{\sigma}\right)\right]\right\}$$

After convergence of the reparameterized problem, the original parameters can be recovered using $\sigma = 1/\theta$ and $\beta = \gamma/\theta$.

Asymptotic covariance matrix of $(\beta, \sigma)$:

$$A(X_i) = \begin{bmatrix} a_i X_i'X_i & b_i X_i' \\ b_i X_i & c_i \end{bmatrix}$$

where

$$a_i = -\sigma^{-2}\left\{X_i\gamma\,\phi_i - \frac{\phi_i^2}{1 - \Phi_i} - \Phi_i\right\}$$

$$b_i = \sigma^{-3}\left\{(X_i\gamma)^2\phi_i + \phi_i - \frac{(X_i\gamma)\,\phi_i^2}{1 - \Phi_i}\right\} \Big/ 2$$

$$c_i = -\sigma^{-4}\left\{(X_i\gamma)^3\phi_i + (X_i\gamma)\,\phi_i - \frac{(X_i\gamma)^2\phi_i^2}{1 - \Phi_i} - 2\Phi_i\right\} \Big/ 4$$

with $\gamma = \beta/\sigma$, and $\phi_i$ and $\Phi_i$ evaluated at $X_i\gamma$. Then

$$\operatorname{VarCov}(\hat\beta, \hat\sigma) = \left[\sum_{i=1}^{n} A(X_i)\right]^{-1}$$

o Researchers often compute least squares estimates despite their inconsistency.

o Empirical regularity: the MLE estimates can be approximated by dividing the OLS estimates by the proportion of non-limit observations in the sample:

$$\operatorname{Prob}(Y_i^* > 0) = \operatorname{Prob}(X_i\beta + \varepsilon_i > 0) = \operatorname{Prob}(\varepsilon_i > -X_i\beta) = \operatorname{Prob}\left(\frac{\varepsilon_i}{\sigma} > -\frac{X_i\beta}{\sigma}\right) = \Phi\left(\frac{X_i\beta}{\sigma}\right)$$

$$\hat\beta_{OLS} \approx \hat\beta_{MLE}\,\Phi\left(\frac{X_i\beta}{\sigma}\right)$$

where the sample proportion of non-limit observations serves as the estimate of $\Phi$.

o Another strategy is to discard the limit observations, but that just trades the censoring problem for the truncation problem.

III. SOME ISSUES IN SPECIFICATION

Heteroskedasticity and non-normality:

o Both heteroskedasticity and non-normality make the Tobit estimator $\hat\beta$ inconsistent for $\beta$.

o Note that OLS does not need normality: consistency requires only exogeneity, $E(\varepsilon \mid X) = 0$, and asymptotic inference rests on the CLT ⇒ data censoring can be costly.

o The presence of heteroskedasticity or non-normality in the Tobit or truncated model entirely changes the functional forms of $E(Y \mid X, Y > 0)$ and $E(Y \mid X)$.

IV. SAMPLE SELECTION MODEL:

7. Incidental Truncation in a Bivariate Distribution:

o Suppose that $y$ and $Z$ have a bivariate distribution with correlation $\rho$.

o We are interested in the distribution of $y$ given that $Z$ exceeds a particular value ⇒ if $y$ and $Z$ are positively correlated, then the truncation of $Z$ should push the distribution of $y$ to the right.

o The truncated joint density of $y$ and $Z$ is

$$f(y, Z \mid Z > a) = \frac{f(y, Z)}{\operatorname{Prob}(Z > a)}$$

For the bivariate normal distribution:

Theorem: If $y$ and $Z$ have a bivariate normal distribution with means $\mu_y$ and $\mu_Z$, standard deviations $\sigma_y$ and $\sigma_Z$, and correlation $\rho$, then:

$$E(y \mid Z > a) = \mu_y + \rho\sigma_y\lambda(\alpha_Z)$$

$$\operatorname{Var}(y \mid Z > a) = \sigma_y^2\left[1 - \rho^2\delta(\alpha_Z)\right]$$

where

$$\alpha_Z = \frac{a - \mu_Z}{\sigma_Z}, \qquad \lambda(\alpha_Z) = \frac{\phi(\alpha_Z)}{1 - \Phi(\alpha_Z)}, \qquad \delta(\alpha_Z) = \lambda(\alpha_Z)[\lambda(\alpha_Z) - \alpha_Z]$$

If the truncation is $Z < a$, then $\lambda(\alpha_Z) = -\phi(\alpha_Z)/\Phi(\alpha_Z)$.

For the standard bivariate normal, $(y, Z) \sim N(0, 0, 1, 1, \rho)$:

$$E(y \mid Z = a) = \rho a, \qquad V(y \mid Z = a) = 1 - \rho^2$$

$$E(y \mid Z < a) = -\rho\,\frac{\phi(a)}{\Phi(a)}, \qquad E(y \mid Z > a) = \rho\,\frac{\phi(a)}{1 - \Phi(a)}$$

$$\operatorname{Var}(y \mid Z > a) = 1 - \rho^2\delta(a)$$

General case: let $y \sim N(\mu, \Sigma)$ and partition $y$, $\mu$ and $\Sigma$ into

$$y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \qquad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$$

Then the marginal distribution of $y_1$ is $N(\mu_1, \Sigma_{11})$, and $y_2 \sim N(\mu_2, \Sigma_{22})$. The conditional distribution of $y_1$ given $y_2$ is

$$y_1 \mid y_2 \sim N\left[\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(y_2 - \mu_2),\; \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right]$$

8. The Sample Selection Model:

a) Wage equation:

$$Z_i^* = W_i\gamma + u_i$$

$Z_i^*$: the difference between a person's market wage and her reservation wage, the wage rate necessary to make her choose to participate in the labour force.
$Z_i^* > 0$: participate.
$Z_i^* \le 0$: do not participate.
$W_i$: education, age, ...

b) Hours equation:

$$Y_i = X_i\beta + \varepsilon_i$$

$Y_i$: number of hours supplied.
$X_i$: wage, number of children, marital status.

⇒ $Y_i$ is observed only when $Z_i^* > 0$. Suppose $\varepsilon_i$ and $u_i$ have a bivariate normal distribution with zero means and correlation $\rho$. Then

$$E(Y_i \mid Y_i \text{ is observed}) = E(Y_i \mid Z_i^* > 0) = E(Y_i \mid u_i > -W_i\gamma) = X_i\beta + E(\varepsilon_i \mid u_i > -W_i\gamma)$$

$$= X_i\beta + \rho\sigma_\varepsilon\lambda_i(\alpha_u) = X_i\beta + \beta_\lambda\lambda_i(\alpha_u)$$

where

$$\alpha_u = -\frac{W_i\gamma}{\sigma_u}, \qquad \lambda(\alpha_u) = \frac{\phi(W_i\gamma/\sigma_u)}{\Phi(W_i\gamma/\sigma_u)}$$

$$Y_i \mid Z_i^* > 0 = E(Y_i \mid Z_i^* > 0) + v_i = X_i\beta + \beta_\lambda\lambda_i(\alpha_u) + v_i, \qquad \beta_\lambda = \rho\sigma_\varepsilon$$

OLS estimation produces inconsistent estimates of $\beta$ because it omits the relevant variable $\lambda_i(\alpha_u)$. Even if $\lambda_i$ were observed, OLS would be inefficient, because the disturbance $v_i$ is heteroskedastic.
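The incidental truncation theorem can be checked numerically by drawing from a bivariate normal distribution and keeping only the draws with $Z > a$. This is a minimal Python sketch assuming NumPy and SciPy; all parameter values are arbitrary illustrative choices.

```python
import numpy as np
from scipy.stats import norm

# Illustrative (assumed) parameters of the bivariate normal.
mu_y, mu_z = 1.0, 0.0
sd_y, sd_z = 2.0, 1.0
rho, a = 0.6, 0.3

rng = np.random.default_rng(2)
cov = [[sd_y**2, rho * sd_y * sd_z],
       [rho * sd_y * sd_z, sd_z**2]]
y, z = rng.multivariate_normal([mu_y, mu_z], cov, size=1_000_000).T
sel = z > a  # y is kept only when Z exceeds the threshold

# Closed-form moments of y given Z > a.
alpha_z = (a - mu_z) / sd_z
lam = norm.pdf(alpha_z) / (1.0 - norm.cdf(alpha_z))
delta = lam * (lam - alpha_z)
mean_formula = mu_y + rho * sd_y * lam
var_formula = sd_y**2 * (1.0 - rho**2 * delta)

print("mean: formula", mean_formula, "simulated", y[sel].mean())
print("var:  formula", var_formula, "simulated", y[sel].var())
```

With positive $\rho$ the selected mean lies to the right of $\mu_y$, which is the shift that an OLS regression on the selected sample absorbs when $\lambda_i$ is omitted.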
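We reformulate the model as follows:

Binary choice model:

$$Z_i^* = W_i\gamma + u_i, \qquad Z_i = \begin{cases} 1 & \text{if } Z_i^* > 0 \\ 0 & \text{otherwise} \end{cases}$$

$$\operatorname{Prob}(Z_i = 1 \mid W_i) = \Phi(W_i\gamma), \qquad \operatorname{Prob}(Z_i = 0 \mid W_i) = 1 - \Phi(W_i\gamma)$$

Regression model:

$$Y_i = X_i\beta + \varepsilon_i, \quad \text{observed only if } Z_i = 1$$

$$(u_i, \varepsilon_i) \sim \text{bivariate normal}\,[0, 0, 1, \sigma_\varepsilon, \rho]$$

where the entries are $\mu_u$, $\mu_\varepsilon$, $\sigma_u$, $\sigma_\varepsilon$ and $\rho$.

Suppose that, as in many of these studies, $Z_i$ and $W_i$ are observed for a random sample of individuals but $Y_i$ is observed only when $Z_i = 1$. Then

$$E[Y_i \mid Z_i = 1, X_i, W_i] = X_i\beta + \rho\sigma_\varepsilon\lambda_i(W_i\gamma)$$

c) Estimation:

The parameters of the sample selection model can be estimated by maximum likelihood. However, Heckman's (1979) two-step estimation procedure is usually used instead:

1. Estimate the probit equation by MLE to obtain estimates of $\gamma$. For each observation in the selected sample, compute

$$\hat\lambda_i = \frac{\phi(W_i\hat\gamma)}{\Phi(W_i\hat\gamma)}, \qquad \hat\delta_i = \hat\lambda_i(\hat\lambda_i + W_i\hat\gamma)$$

2. Estimate $\beta$ and $\beta_\lambda = \rho\sigma_\varepsilon$ by least-squares regression of $Y$ on $X$ and $\hat\lambda$.

o Asymptotic covariance matrix of $[\hat\beta, \hat\beta_\lambda]$:

$$(Y_i \mid Z_i = 1, X_i, W_i) = X_i\beta + \rho\sigma_\varepsilon\lambda_i + v_i$$

Heteroskedasticity:

$$\operatorname{Var}[v_i \mid Z_i = 1, X_i, W_i] = \sigma_\varepsilon^2(1 - \rho^2\delta_i)$$

Let $\beta^* = [\beta, \beta_\lambda]$ and $X_i^* = [X_i, \lambda_i]$. Then

$$\operatorname{VarCov}(\hat\beta^*) = \sigma_\varepsilon^2\,[X^{*\prime}X^*]^{-1}\,[X^{*\prime}(I - \rho^2\Delta)X^*]\,[X^{*\prime}X^*]^{-1}$$

where $I - \rho^2\Delta$ is a diagonal matrix with $(1 - \rho^2\delta_i)$ on the diagonal, and

$$\hat\sigma_\varepsilon^2 = \frac{1}{n}e'e + \hat{\bar\delta}\,\hat\beta_\lambda^2, \qquad \hat{\bar\delta} = \frac{1}{n}\sum_i \hat\delta_i, \qquad \hat\rho^2 = \frac{\hat\beta_\lambda^2}{\hat\sigma_\varepsilon^2}$$

d) Model:

$$Y_{1i} = \begin{cases} X_{1i}\beta_1 + \varepsilon_{1i} & \text{if } Y_{2i}^* > 0 \\ 0 & \text{if } Y_{2i}^* \le 0 \end{cases}$$

$$Y_{2i}^* = X_{2i}\beta_2 + \varepsilon_{2i}$$

Assume

$$\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix} \sim N(0, \Sigma), \qquad \Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & 1 \end{bmatrix}$$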
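Heckman's two-step procedure for the sample selection model (probit first, then least squares with $\hat\lambda_i$ added as a regressor) can be sketched on simulated data as follows. This is a hedged illustration rather than a definitive implementation: the data-generating values are assumptions, the probit is fitted with a hand-written likelihood (the hypothetical helper `probit_negloglik`) and SciPy's generic optimizer, $W$ is given one variable that is excluded from $X$, and the corrected standard errors are not computed.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def probit_negloglik(gamma, z, W):
    """Negative probit log-likelihood for Prob(Z = 1 | W) = Phi(W gamma)."""
    idx = W @ gamma
    return -(z * norm.logcdf(idx) + (1 - z) * norm.logcdf(-idx)).sum()


# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(3)
n = 20000
W = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
X = W[:, :2]  # the third column of W is excluded from the outcome equation
gamma_true = np.array([0.2, 0.5, 1.0])
beta_true = np.array([1.0, 2.0])
rho, sigma_eps = 0.7, 1.5

u = rng.normal(size=n)  # selection disturbance, variance 1
eps = sigma_eps * (rho * u + np.sqrt(1 - rho**2) * rng.normal(size=n))
z = (W @ gamma_true + u > 0).astype(float)
y = X @ beta_true + eps  # only used where z == 1
sel = z == 1

# Step 1: probit by MLE, then the inverse Mills ratio on the selected sample.
gamma_hat = minimize(probit_negloglik, np.zeros(3), args=(z, W), method="BFGS").x
idx_hat = W[sel] @ gamma_hat
lam_hat = norm.pdf(idx_hat) / norm.cdf(idx_hat)

# Step 2: least squares of Y on X and lambda-hat, selected sample only.
X_star = np.column_stack([X[sel], lam_hat])
coef = np.linalg.lstsq(X_star, y[sel], rcond=None)[0]
beta_hat, beta_lambda_hat = coef[:2], coef[2]

# Naive OLS on the selected sample, omitting lambda-hat.
b_naive = np.linalg.lstsq(X[sel], y[sel], rcond=None)[0]

print("two-step beta:", beta_hat, "beta_lambda:", beta_lambda_hat)
print("naive OLS    :", b_naive)
```

In this setup the coefficient on $\hat\lambda_i$ estimates $\rho\sigma_\varepsilon$, and the naive regression that omits it shows the omitted-variable bias most clearly in the intercept.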
Heckman's two-step estimation:

In the subsample for which $Y_1 \ne 0$ we have

$$E(Y_{1i} \mid Y_{2i}^* > 0) = X_{1i}\beta_1 + E(\varepsilon_{1i} \mid Y_{2i}^* > 0) = X_{1i}\beta_1 + E(\varepsilon_{1i} \mid \varepsilon_{2i} > -X_{2i}\beta_2) = X_{1i}\beta_1 + \rho\sigma_1\lambda_i(\alpha_i)$$

where $\alpha_i = -X_{2i}\beta_2/\sigma_2$ (with $\sigma_2 = 1$ here).

Therefore, in the subsample for which $Y_1 \ne 0$,

$$Y_{1i} \mid Y_{2i}^* > 0 = E(Y_{1i} \mid Y_{2i}^* > 0) + v_i = X_{1i}\beta_1 + \rho\sigma_1\lambda_i(\alpha_i) + v_i = X_{1i}\beta_1 + \beta_\lambda\lambda_i + v_i$$

This is a proper regression equation in the sense that

$$E(v_i \mid X_i, \lambda_i, Y_{2i}^* > 0) = 0$$

Note that $\rho = \sigma_{12}/\sigma_1$. A regression of $Y_1$ on $X_1$ alone is subject to omitted variable bias.

o Heckman's two-step estimation (Heckit) procedure:

1. Estimate the probit equation by MLE to get $\hat\beta_2$. Use these estimates to construct

$$\hat\lambda_i = \frac{\phi(-X_{2i}\hat\beta_2)}{1 - \Phi(-X_{2i}\hat\beta_2)}$$

2. Regress $Y_{1i}$ on $X_{1i}$ and $\hat\lambda_i$.

o Maximum likelihood: there are two data regimes, $Y_2 = 0$ and $Y_2 = 1$. To construct the likelihood function:

Regime 1: $Y_1$ not observed, $Y_2 = 0$. What is known about $\varepsilon$: $\varepsilon_2 < -X_2\beta_2$.
Regime 2: $Y_1$ observed, $Y_2 = 1$. What is known about $\varepsilon$: $\varepsilon_1 = Y_1 - X_1\beta_1$, $\varepsilon_2 > -X_2\beta_2$.

Regime 1 likelihood element:

$$\int_{-\infty}^{-X_2\beta_2} f(\varepsilon_2)\, d\varepsilon_2 = \Phi(-X_2\beta_2)$$

Regime 2 likelihood element:

$$\int_{-X_2\beta_2}^{+\infty} f(Y_1 - X_1\beta_1, \varepsilon_2)\, d\varepsilon_2$$

$$L(\beta_1, \beta_2, \Sigma) = \prod_{\text{regime 1}} \int_{-\infty}^{-X_2\beta_2} f(\varepsilon_2)\, d\varepsilon_2 \; \prod_{\text{regime 2}} \int_{-X_2\beta_2}^{+\infty} f(Y_1 - X_1\beta_1, \varepsilon_2)\, d\varepsilon_2$$

V. THE DOUBLE SELECTION MODEL:

$$Y_1 = \begin{cases} X_1\beta_1 + \varepsilon_1 & \text{if } Y_2^* > 0 \text{ and/or } Y_3^* > 0 \\ 0 & \text{otherwise} \end{cases}$$

$$Y_2^* = X_2\beta_2 + \varepsilon_2, \qquad Y_3^* = X_3\beta_3 + \varepsilon_3$$

VI. REGRESSION ANALYSIS OF TREATMENT EFFECTS:

$$E_i = X_i\beta + \delta C_i + \varepsilon_i$$

Here $E_i$ is earnings (written $Y_i$ in the conditional means that follow) and $C_i$ is a dummy variable indicating whether or not the individual attended college.

Does $\delta$ measure the value of a college education? (Assume the rest of the regression model is correctly specified.)

The answer is no if the typical individual who chooses to go to college would have relatively high earnings whether or not he or she went to college.

⇒ The problem is one of self-selection (sample selection).
⇒ $\delta$ will overestimate the treatment effect.
⇒ The same problem arises in other settings in which the individuals themselves decide whether or not they will receive the treatment.

$$C_i^* = W_i\gamma + u_i, \qquad C_i = \begin{cases} 1 & \text{if } C_i^* > 0 \\ 0 & \text{otherwise} \end{cases}$$

$$E(Y_i \mid C_i = 1, X_i, W_i) = X_i\beta + \delta + E(\varepsilon_i \mid C_i = 1, X_i, W_i) = X_i\beta + \delta + \rho\sigma_\varepsilon\lambda(-W_i\gamma)$$

⇒ Estimate this model using the two-step estimator.

For non-participants:

$$E(Y_i \mid C_i = 0, X_i, W_i) = X_i\beta + \rho\sigma_\varepsilon\left[\frac{-\phi(W_i\gamma)}{1 - \Phi(W_i\gamma)}\right]$$

The difference in expected earnings between participants and non-participants is then

$$E(Y_i \mid C_i = 1, X_i, W_i) - E(Y_i \mid C_i = 0, X_i, W_i) = \delta + \rho\sigma_\varepsilon\left[\frac{\phi_i}{\Phi_i(1 - \Phi_i)}\right]$$

⇒ Since the second term is positive when $\rho > 0$, the least-squares estimate of $\delta$ overestimates the treatment effect.
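The size of the self-selection bias in the earnings gap can be illustrated with a small simulation. The sketch below is a minimal example under simplifying assumptions that are not in the notes: $W_i$ and $X_i$ contain only a constant (`gamma0`, `beta0`), so $\phi_i$ and $\Phi_i$ are the same for everyone, and all numerical values are arbitrary.

```python
import numpy as np
from scipy.stats import norm

# Illustrative (assumed) parameter values.
gamma0, beta0 = 0.3, 10.0      # constant-only selection and earnings equations
delta = 1.0                    # true treatment effect
rho, sigma_eps = 0.5, 2.0      # corr(u, eps) and sd of eps

rng = np.random.default_rng(4)
n = 1_000_000
u = rng.normal(size=n)
eps = sigma_eps * (rho * u + np.sqrt(1 - rho**2) * rng.normal(size=n))

C = gamma0 + u > 0             # treatment chosen by the individual
Y = beta0 + delta * C + eps    # earnings

# Naive comparison of mean earnings: participants minus non-participants.
naive_gap = Y[C].mean() - Y[~C].mean()

# Closed form: delta + rho * sigma_eps * phi / (Phi * (1 - Phi)).
phi, Phi = norm.pdf(gamma0), norm.cdf(gamma0)
gap_formula = delta + rho * sigma_eps * phi / (Phi * (1 - Phi))

print("naive gap:", naive_gap, "formula:", gap_formula, "true delta:", delta)
```

With a constant-only design the least-squares coefficient on $C_i$ equals this difference in means, so the gap between `naive_gap` and `delta` is exactly the selection term.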
