PowerPoint Slides for Undergraduate Econometrics — Lawrence C. Marsh (copyright 1996)


…this model with least squares:

1.  One observation is used up in creating the transformed (lagged) variables, leaving only (T − 1) observations for estimating the model.

2.  The value of ρ is not known.  We must find some way to estimate it.

11.11


Recovering the 1st Observation

Dropping the 1st observation and applying least squares is not the best linear unbiased estimation method.  Efficiency is lost because the variance of the error associated with the 1st observation is not equal to that of the other errors.

This is a special case of the heteroskedasticity problem, except that here all errors are assumed to have equal variance except the 1st error.

11.12

Recovering the 1st Observation

The 1st observation should fit the original model as:

  y_1  =  β_1  +  β_2 x_1  +  e_1

with error variance:   var(e_1)  =  σ_e²  =  σ_ν² / (1 − ρ²).

Note:  The other observations all have error variance σ_ν².

We could include this as the 1st observation for our estimation procedure, but we must first transform it so that it has the same error variance as the other observations.

11.13




  y_1  =  β_1  +  β_2 x_1  +  e_1

with error variance:   var(e_1)  =  σ_e²  =  σ_ν² / (1 − ρ²).

The other observations all have error variance σ_ν².

Given any constant c:    var(c e_1)  =  c² var(e_1).

If c = √(1 − ρ²), then

  var( √(1 − ρ²) e_1 )  =  (1 − ρ²) var(e_1)
                        =  (1 − ρ²) σ_e²
                        =  (1 − ρ²) σ_ν² / (1 − ρ²)
                        =  σ_ν²

The transformation  ν_1 = √(1 − ρ²) e_1  has variance σ_ν².

11.14



  y_1  =  β_1  +  β_2 x_1  +  e_1

The transformed error  ν_1 = √(1 − ρ²) e_1  has variance σ_ν².

Multiply through by √(1 − ρ²) to get:

  √(1 − ρ²) y_1  =  √(1 − ρ²) β_1  +  √(1 − ρ²) β_2 x_1  +  √(1 − ρ²) e_1

This transformed first observation may now be added to the other (T − 1) observations to obtain the fully restored set of T observations.

11.15

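The same transformation is easy to carry out numerically.  Below is a minimal sketch in Python; the function name and the assumption that ρ is already known are mine, not the text's.

import numpy as np

def ar1_transform(y, x, rho):
    """Generalized-difference transform for y_t = beta1 + beta2*x_t + e_t with
    AR(1) errors e_t = rho*e_{t-1} + v_t.  Returns the transformed dependent
    variable y* and design matrix X* = [intercept*, x*], whose errors are the
    homoskedastic v_t, keeping all T observations."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    c = np.sqrt(1.0 - rho**2)                      # scale factor for the 1st observation

    # Observations 2..T: quasi-differences y_t - rho*y_{t-1}, x_t - rho*x_{t-1}
    y_star = y[1:] - rho * y[:-1]
    x_star = x[1:] - rho * x[:-1]
    const_star = np.full(len(y_star), 1.0 - rho)   # intercept column becomes (1 - rho)

    # Recover the 1st observation by scaling it with sqrt(1 - rho^2)
    y_star = np.concatenate(([c * y[0]], y_star))
    x_star = np.concatenate(([c * x[0]], x_star))
    const_star = np.concatenate(([c], const_star))

    X_star = np.column_stack([const_star, x_star])  # T rows, not T - 1
    return y_star, X_star

# The transformed data can then be fed to ordinary least squares, e.g.
# b, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)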

Estimating Unknown ρ Value

First, use least squares to estimate the model:

  y_t  =  β_1  +  β_2 x_t  +  e_t

The residuals from this estimation are:

  ê_t  =  y_t  −  b_1  −  b_2 x_t

If we had values for the e_t's, we could estimate:

  e_t  =  ρ e_{t−1}  +  ν_t

11.16


  ê_t  =  y_t  −  b_1  −  b_2 x_t

Next, estimate the following by least squares, using the residuals in place of the unobserved errors:

  ê_t  =  ρ ê_{t−1}  +  ν̂_t

The least squares solution is:

  ρ̂  =  Σ_{t=2}^{T} ê_t ê_{t−1}  /  Σ_{t=2}^{T} ê_{t−1}²

11.17

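A sketch of this two-step procedure in Python (variable names are illustrative):

import numpy as np

def estimate_rho(y, x):
    """Estimate the AR(1) parameter rho from the OLS residuals of y on (1, x)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones_like(x), x])

    # Step 1: OLS of y on (1, x) to get residuals e_hat
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e_hat = y - X @ b

    # Step 2: least squares regression (no intercept) of e_hat_t on e_hat_{t-1}
    rho_hat = np.sum(e_hat[1:] * e_hat[:-1]) / np.sum(e_hat[:-1] ** 2)
    return rho_hat, e_hat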


Durbin-Watson Test

  H_0:  ρ = 0      vs.      H_1:  ρ ≠ 0,  ρ > 0,  or  ρ < 0

The Durbin-Watson test statistic, d, is:

  d  =  Σ_{t=2}^{T} (ê_t − ê_{t−1})²  /  Σ_{t=1}^{T} ê_t²

11.18

Testing for Autocorrelation

The test statistic, d, is approximately related to ρ̂ as:

  d  ≈  2(1 − ρ̂)

When ρ̂ = 0, the Durbin-Watson statistic is d ≈ 2.

When ρ̂ = 1, the Durbin-Watson statistic is d ≈ 0.

Tables of critical values for d are not always readily available, so it is easier to use the p-value that most computer programs provide for d.  Reject H_0 if p-value < α, the significance level.

11.19

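A minimal sketch of computing d from a vector of residuals.  The commented example and its AR(1) parameter are my own illustration; statsmodels also ships a durbin_watson helper in statsmodels.stats.stattools.

import numpy as np

def durbin_watson(e_hat):
    """d = sum_{t=2}^T (e_t - e_{t-1})^2 / sum_{t=1}^T e_t^2.
    Values near 2 suggest rho ~ 0; values near 0 suggest strong positive autocorrelation."""
    e_hat = np.asarray(e_hat, dtype=float)
    return np.sum(np.diff(e_hat) ** 2) / np.sum(e_hat ** 2)

# Quick check of the d ~ 2(1 - rho) approximation:
# rng = np.random.default_rng(0)
# e = np.zeros(200)
# for t in range(1, 200):
#     e[t] = 0.7 * e[t - 1] + rng.normal()
# print(durbin_watson(e))   # typically close to 2 * (1 - 0.7) = 0.6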


Prediction with AR(1) Errors

When errors are autocorrelated, the previous period's error may help us predict next period's error.

The best predictor, ŷ_{T+1}, for next period is:

  ŷ_{T+1}  =  β̂_1  +  β̂_2 x_{T+1}  +  ρ̂ ẽ_T

where β̂_1 and β̂_2 are generalized least squares estimates and ẽ_T is given by:

  ẽ_T  =  y_T  −  β̂_1  −  β̂_2 x_T

11.20



For h periods ahead, the best predictor is:

  ŷ_{T+h}  =  β̂_1  +  β̂_2 x_{T+h}  +  ρ̂^h ẽ_T

Assuming |ρ̂| < 1, the influence of ρ̂^h ẽ_T diminishes the further we go into the future (the larger h becomes).

11.21

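A sketch of the h-period-ahead predictor (argument names are illustrative; b1 and b2 stand in for the GLS estimates):

import numpy as np

def predict_ar1(x_future, b1, b2, rho_hat, e_T):
    """h-step-ahead predictions y_hat_{T+h} = b1 + b2*x_{T+h} + rho_hat**h * e_T,
    where e_T is the last-period residual y_T - b1 - b2*x_T."""
    x_future = np.asarray(x_future, dtype=float)
    h = np.arange(1, len(x_future) + 1)          # horizons 1, 2, ..., H
    return b1 + b2 * x_future + (rho_hat ** h) * e_T

# Note the AR(1) correction rho_hat**h * e_T shrinks toward zero as h grows,
# so long-horizon forecasts revert to the regression line b1 + b2*x.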

Chapter 12

Pooling Time-Series and Cross-Sectional Data


12.1


Pooling Time and Cross Sections

  y_it  =  β_1it  +  β_2it x_2it  +  β_3it x_3it  +  e_it

for the i-th firm in the t-th time period.

If left unrestricted, this model requires different equations for each firm in each time period.

12.2

Seemingly Unrelated Regressions

SUR models impose the restrictions:

  β_1it = β_1i        β_2it = β_2i        β_3it = β_3i

so that

  y_it  =  β_1i  +  β_2i x_2it  +  β_3i x_3it  +  e_it

Each firm gets its own coefficients β_1i, β_2i and β_3i, but those coefficients are constant over time.

12.3


Two-Equation SUR Model

The investment expenditures (INV) of General Electric (G) and Westinghouse (W) may be related to their stock market value (V) and actual capital stock (K) as follows:

  INV_Gt  =  β_1G  +  β_2G V_Gt  +  β_3G K_Gt  +  e_Gt

  INV_Wt  =  β_1W  +  β_2W V_Wt  +  β_3W K_Wt  +  e_Wt

  i = G, W        t = 1, . . . , 20

12.4



Estimating Separate Equations

We make the usual error term assumptions:

  E(e_Gt) = 0              E(e_Wt) = 0

  var(e_Gt) = σ_G²         var(e_Wt) = σ_W²

  cov(e_Gt, e_Gs) = 0      cov(e_Wt, e_Ws) = 0

For now, make the assumption of no correlation between the error terms across equations:

  cov(e_Gt, e_Wt) = 0      cov(e_Gt, e_Ws) = 0

12.5


The dummy variable model assumes homoskedasticity across the two firms, σ_G² = σ_W²:

  INV_t  =  β_1G  +  δ_1 D_t  +  β_2G V_t  +  δ_2 D_t V_t  +  β_3G K_t  +  δ_3 D_t K_t  +  e_t

For Westinghouse observations D_t = 1; otherwise D_t = 0.

  β_1W  =  β_1G + δ_1        β_2W  =  β_2G + δ_2        β_3W  =  β_3G + δ_3

12.6

Problem with OLS on Each Equation

The first assumption of the Gauss-Markov Theorem concerns the model specification.  If the model is not fully and correctly specified, the Gauss-Markov properties might not hold.

Any correlation of error terms across equations must be part of the model specification.

12.7


Correlated Error Terms

Any correlation between the dependent variables of two or more equations that is not due to their explanatory variables is, by default, due to correlated error terms.

12.8

Which of the following models would be likely to produce positively correlated errors, and which would produce negatively correlated errors?

1.  Sales of Pepsi vs. sales of Coke.
    (uncontrolled factor: outdoor temperature)

2.  Investments in bonds vs. investments in stocks.
    (uncontrolled factor: computer/appliance sales)

3.  Movie admissions vs. golf course admissions.
    (uncontrolled factor: weather conditions)

4.  Sales of butter vs. sales of bread.
    (uncontrolled factor: bagels and cream cheese)

12.9


Joint Estimation of the Equations

  INV_Gt  =  β_1G  +  β_2G V_Gt  +  β_3G K_Gt  +  e_Gt

  INV_Wt  =  β_1W  +  β_2W V_Wt  +  β_3W K_Wt  +  e_Wt

  cov(e_Gt, e_Wt)  =  σ_GW

12.10




Seemingly Unrelated Regressions

When the error terms of two or more equations are correlated, efficient estimation requires the use of a Seemingly Unrelated Regressions (SUR) type estimator to take the correlation into account.

Be sure to use the Seemingly Unrelated Regressions (SUR) procedure in your regression software program to estimate any equations that you believe might have correlated errors.

12.11



Separate vs. Joint Estimation

SUR will give exactly the same results as estimating each equation separately with OLS if either or both of the following two conditions are true:

1.  Every equation has exactly the same set of explanatory variables with exactly the same values.

2.  There is no correlation between the error terms of any of the equations.

12.12

Test for Correlation

Test the null hypothesis of zero correlation:

  H_0:  σ_GW  =  0

  r²_GW  =  σ̂_GW²  /  (σ̂_G² σ̂_W²)

  λ  =  T r²_GW          λ  ~  χ²(1)   asymptotically

12.13


Start with the residuals ê_Gt and ê_Wt from each equation estimated separately, and compute:

  σ̂_GW  =  (1/T) Σ ê_Gt ê_Wt

  σ̂_G²  =  (1/T) Σ ê_Gt²

  σ̂_W²  =  (1/T) Σ ê_Wt²

Then

  r²_GW  =  σ̂_GW²  /  (σ̂_G² σ̂_W²)

  λ  =  T r²_GW          λ  ~  χ²(1)   asymptotically

12.14

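A sketch of the test computation, assuming the two residual series are already available from separate OLS fits (names are illustrative):

import numpy as np

def sur_correlation_test(e_G, e_W):
    """Test for correlated errors across two equations: lambda = T * r_GW**2,
    asymptotically chi-square(1) under H0: sigma_GW = 0."""
    e_G = np.asarray(e_G, dtype=float)
    e_W = np.asarray(e_W, dtype=float)
    T = len(e_G)

    sig_GW = np.sum(e_G * e_W) / T      # cross-equation covariance
    sig_G2 = np.sum(e_G ** 2) / T       # equation variances
    sig_W2 = np.sum(e_W ** 2) / T

    r2_GW = sig_GW ** 2 / (sig_G2 * sig_W2)
    lam = T * r2_GW                     # compare with the chi-square(1) critical value, e.g. 3.84 at 5%
    return lam, r2_GW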


Fixed Effects Model

For each i-th cross section in the t-th time period:

  y_it  =  β_1it  +  β_2it x_2it  +  β_3it x_3it  +  e_it

Fixed effects models impose the restrictions:

  β_1it = β_1i        β_2it = β_2        β_3it = β_3

so that

  y_it  =  β_1i  +  β_2 x_2it  +  β_3 x_3it  +  e_it

Each i-th cross-section has its own constant intercept β_1i.

12.15


The Fixed Effects Model is conveniently represented using dummy variables:

  y_it  =  β_11 D_1i  +  β_12 D_2i  +  β_13 D_3i  +  β_14 D_4i  +  β_2 x_2it  +  β_3 x_3it  +  e_it

  D_1i = 1 if North,   D_1i = 0 if not North
  D_2i = 1 if East,    D_2i = 0 if not East
  D_3i = 1 if South,   D_3i = 0 if not South
  D_4i = 1 if West,    D_4i = 0 if not West

  y_it   =  millions of bushels of corn produced
  x_2it  =  price of corn in dollars per bushel
  x_3it  =  price of soybeans in dollars per bushel

Each cross-sectional unit gets its own intercept, but each cross-sectional intercept is constant over time.

12.16

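A sketch of how the dummy-variable design matrix might be set up for this example (the region labels match the slide, but the simulated prices and variable names are my own illustration):

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical panel: one row per (region i, year t) observation, N = 4 units, T = 10 periods
regions = np.array(["North", "East", "South", "West"] * 10)
x2 = rng.uniform(2.0, 4.0, size=40)     # corn price
x3 = rng.uniform(5.0, 8.0, size=40)     # soybean price

# One intercept dummy per cross-sectional unit (no common constant column)
D = np.column_stack([(regions == r).astype(float)
                     for r in ["North", "East", "South", "West"]])

X_unrestricted = np.column_stack([D, x2, x3])   # K = 6 columns: 4 intercepts + 2 slopes
# y would then be regressed on X_unrestricted by ordinary least squares, e.g.
# b, *_ = np.linalg.lstsq(X_unrestricted, y, rcond=None)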


Test for Equality of Fixed Effects

  H_0:  β_11  =  β_12  =  β_13  =  β_14

  H_1:  H_0 not true

The H_0 joint null hypothesis may be tested with the F-statistic:

  F  =  [ (SSE_R − SSE_U) / J ]  /  [ SSE_U / (NT − K) ]        ~    F_{J, NT−K}

SSE_R is the restricted error sum of squares (one intercept).
SSE_U is the unrestricted error sum of squares (four intercepts).
N is the number of cross-sectional units (N = 4).
K is the number of parameters in the model (K = 6).
J is the number of restrictions being tested (J = N − 1 = 3).
T is the number of time periods.

12.17

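One way the F statistic might be computed, given restricted and unrestricted design matrices built as in the previous sketch (scipy is assumed available for the p-value; names are illustrative):

import numpy as np
from scipy import stats

def fixed_effects_F_test(y, X_restricted, X_unrestricted, J):
    """F = [(SSE_R - SSE_U)/J] / [SSE_U/(NT - K)], where NT is the total number of
    observations and K the number of parameters in the unrestricted model."""
    def sse(X):
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ b
        return resid @ resid

    NT, K = X_unrestricted.shape
    SSE_R, SSE_U = sse(X_restricted), sse(X_unrestricted)
    F = ((SSE_R - SSE_U) / J) / (SSE_U / (NT - K))
    p_value = stats.f.sf(F, J, NT - K)     # reject equal intercepts if p_value < alpha
    return F, p_value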

Random Effects Model

  y_it  =  β_1i  +  β_2 x_2it  +  β_3 x_3it  +  e_it

  β_1i  =  β_1  +  μ_i

β_1 is the population mean intercept.  μ_i is an unobservable random error that accounts for the cross-sectional differences.

12.18


Random Intercept Term

  β_1i  =  β_1  +  μ_i        where  i = 1, . . . , N

The μ_i are independent of one another and of e_it, with

  E(μ_i) = 0        var(μ_i) = σ_μ²

Consequently,

  E(β_1i) = β_1        var(β_1i) = σ_μ²

12.19

Random Effects Model

  y_it  =  β_1i  +  β_2 x_2it  +  β_3 x_3it  +  e_it

  y_it  =  (β_1 + μ_i)  +  β_2 x_2it  +  β_3 x_3it  +  e_it

  y_it  =  β_1  +  β_2 x_2it  +  β_3 x_3it  +  (μ_i + e_it)

  y_it  =  β_1  +  β_2 x_2it  +  β_3 x_3it  +  ν_it

12.20


  ν_it  =  (μ_i + e_it)

  y_it  =  β_1  +  β_2 x_2it  +  β_3 x_3it  +  ν_it

ν_it has zero mean:           E(ν_it)  =  0

ν_it is homoskedastic:        var(ν_it)  =  σ_μ²  +  σ_e²

The errors from the same firm in different time periods are correlated:

  cov(ν_it, ν_is)  =  σ_μ²        for  t ≠ s

The errors from different firms are always uncorrelated:

  cov(ν_it, ν_js)  =  0        for  i ≠ j

12.21

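A small sketch of the error covariance matrix this structure implies for one cross-sectional unit (function name is illustrative):

import numpy as np

def random_effects_cov(sigma_mu2, sigma_e2, T):
    """Covariance matrix of (v_i1, ..., v_iT) for one unit under the random effects model:
    sigma_mu^2 + sigma_e^2 on the diagonal, sigma_mu^2 off the diagonal,
    and zero between different units."""
    return sigma_mu2 * np.ones((T, T)) + sigma_e2 * np.eye(T)

# Example: with sigma_mu^2 = 2, sigma_e^2 = 1, and T = 4 time periods
# print(random_effects_cov(2.0, 1.0, 4))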

Chapter 13

Simultaneous Equations Models


13.1



Keynesian Macro Model

Assumptions of the Simple Keynesian Model:

1.  Consumption, c, is a function of income, y.

2.  Total expenditures = consumption + investment.

3.  Investment is assumed independent of income.

13.2

The Structural Equations

Consumption is a function of income:

  c  =  β_1  +  β_2 y

Income is either consumed or invested:

  y  =  c  +  i

13.3

The Statistical Model

The consumption equation:

  c_t  =  β_1  +  β_2 y_t  +  e_t

The income identity:

  y_t  =  c_t  +  i_t

13.4


The Simultaneous Nature of Simultaneous Equations

  c_t  =  β_1  +  β_2 y_t  +  e_t

  y_t  =  c_t  +  i_t

Since y_t contains e_t, they are correlated.

(The slide diagrams the feedback loop: the error e_t enters c_t, c_t enters y_t through the income identity, and y_t in turn feeds back into c_t.)

13.5

The Failure of Least Squares

The least squares estimators of the parameters in a structural simultaneous equation are biased and inconsistent because of the correlation between the random error and the endogenous variables on the right-hand side of the equation.

13.6

Single vs. Simultaneous Equations

Single equation:  y_t and e_t determine c_t, with no feedback.

Simultaneous equations:  i_t and e_t feed into the system, and c_t and y_t are jointly determined.

13.7




Deriving the Reduced Form

  c_t  =  β_1  +  β_2 y_t  +  e_t

  y_t  =  c_t  +  i_t

Substitute the income identity into the consumption equation:

  c_t  =  β_1  +  β_2 (c_t + i_t)  +  e_t

  (1 − β_2) c_t  =  β_1  +  β_2 i_t  +  e_t

13.8


Deriving the Reduced Form

  (1 − β_2) c_t  =  β_1  +  β_2 i_t  +  e_t

  c_t  =  β_1/(1 − β_2)  +  [β_2/(1 − β_2)] i_t  +  [1/(1 − β_2)] e_t

The reduced form equation:

  c_t  =  π_11  +  π_21 i_t  +  ν_t

13.9

Reduced Form Equation

  c_t  =  π_11  +  π_21 i_t  +  ν_t

where

  π_11  =  β_1/(1 − β_2)        π_21  =  β_2/(1 − β_2)

and

  ν_t  =  [1/(1 − β_2)] e_t

13.10

  y_t  =  c_t  +  i_t        where   c_t  =  π_11  +  π_21 i_t  +  ν_t

  y_t  =  π_11  +  (1 + π_21) i_t  +  ν_t

It is sometimes useful to give this equation its own reduced form parameters as follows:

  y_t  =  π_12  +  π_22 i_t  +  ν_t

13.11

  c_t  =  π_11  +  π_21 i_t  +  ν_t

  y_t  =  π_12  +  π_22 i_t  +  ν_t

Since c_t and y_t are related through the identity y_t = c_t + i_t, the error term ν_t of these two equations is the same, and it is easy to show that:

  π_12  =  π_11  =  β_1/(1 − β_2)

  π_22  =  (1 + π_21)  =  1/(1 − β_2)

13.12


Identification

The structural parameters are β_1 and β_2.  The reduced form parameters are π_11 and π_21.

Once the reduced form parameters are estimated, the identification problem is to determine whether the original structural parameters can be expressed uniquely in terms of the reduced form parameters.  Here they can:

  β̂_1  =  π̂_11 / (1 + π̂_21)        β̂_2  =  π̂_21 / (1 + π̂_21)

13.13

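A sketch of this indirect least squares idea on simulated data (the true parameter values, sample size, and variable names below are my own illustration, not from the text):

import numpy as np

rng = np.random.default_rng(42)
beta1, beta2 = 20.0, 0.6
T = 200
i = rng.uniform(10.0, 30.0, size=T)             # investment (exogenous)
e = rng.normal(0.0, 2.0, size=T)
c = (beta1 + beta2 * i + e) / (1.0 - beta2)     # reduced form for consumption
y = c + i                                       # income identity

# Estimate the reduced form  c_t = pi11 + pi21*i_t + v_t  by OLS
X = np.column_stack([np.ones(T), i])
pi11_hat, pi21_hat = np.linalg.lstsq(X, c, rcond=None)[0]

# Recover the structural parameters from the reduced-form estimates
beta2_hat = pi21_hat / (1.0 + pi21_hat)
beta1_hat = pi11_hat / (1.0 + pi21_hat)
print(beta1_hat, beta2_hat)   # close to 20.0 and 0.6 in large samples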

Identification

An equation is exactly identified if its structural (behavioral) parameters can be uniquely expressed in terms of the reduced form parameters.

An equation is over-identified if there is more than one solution for expressing its structural (behavioral) parameters in terms of the reduced form parameters.

An equation is under-identified if its structural (behavioral) parameters cannot be expressed in terms of the reduced form parameters.

13.14

The Identification Problem

A system of M equations containing M endogenous variables must exclude at least M − 1 variables from a given equation in order for the parameters of that equation to be identified and to be able to be consistently estimated.

13.15


Two Stage Least Squares

Problem:  the right-hand endogenous variables y_t2 and y_t1 are correlated with the error terms.

  y_t1  =  β_1  +  β_2 y_t2  +  β_3 x_t1  +  e_t1

  y_t2  =  α_1  +  α_2 y_t1  +  α_3 x_t2  +  e_t2

13.16


Problem:  the right-hand endogenous variables y_t2 and y_t1 are correlated with the error terms.

Solution:  First, derive the reduced form equations.

  y_t1  =  β_1  +  β_2 y_t2  +  β_3 x_t1  +  e_t1

  y_t2  =  α_1  +  α_2 y_t1  +  α_3 x_t2  +  e_t2

Solve these two equations for the two unknowns y_t1, y_t2:

  y_t1  =  π_11  +  π_21 x_t1  +  π_31 x_t2  +  ν_t1

  y_t2  =  π_12  +  π_22 x_t1  +  π_32 x_t2  +  ν_t2

13.17



2SLS:  Stage I

  y_t1  =  π_11  +  π_21 x_t1  +  π_31 x_t2  +  ν_t1

  y_t2  =  π_12  +  π_22 x_t1  +  π_32 x_t2  +  ν_t2

Use least squares to get the fitted values:

  ŷ_t1  =  π̂_11  +  π̂_21 x_t1  +  π̂_31 x_t2        so that   y_t1  =  ŷ_t1  +  ν̂_t1

  ŷ_t2  =  π̂_12  +  π̂_22 x_t1  +  π̂_32 x_t2        so that   y_t2  =  ŷ_t2  +  ν̂_t2

13.18


2SLS:  Stage II

  y_t1  =  ŷ_t1  +  ν̂_t1        and        y_t2  =  ŷ_t2  +  ν̂_t2

Substitute these in for y_t1 and y_t2:

  y_t1  =  β_1  +  β_2 (ŷ_t2 + ν̂_t2)  +  β_3 x_t1  +  e_t1

  y_t2  =  α_1  +  α_2 (ŷ_t1 + ν̂_t1)  +  α_3 x_t2  +  e_t2

13.19


2SLS:  Stage II (continued)

  y_t1  =  β_1  +  β_2 ŷ_t2  +  β_3 x_t1  +  u_t1

  y_t2  =  α_1  +  α_2 ŷ_t1  +  α_3 x_t2  +  u_t2

where   u_t1  =  β_2 ν̂_t2  +  e_t1        and        u_t2  =  α_2 ν̂_t1  +  e_t2

Run least squares on each of the above equations to get the 2SLS estimates:

  β̃_1,  β̃_2,  β̃_3,  α̃_1,  α̃_2  and  α̃_3

13.20

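A sketch of the two stages in Python (function and variable names are illustrative; in practice a canned 2SLS routine would normally be used):

import numpy as np

def two_stage_least_squares(y1, y2, x1, x2):
    """2SLS for the two-equation system on these slides:
        y1 = b1 + b2*y2 + b3*x1 + e1
        y2 = a1 + a2*y1 + a3*x2 + e2"""
    n = len(y1)
    const = np.ones(n)

    # Stage I: regress each right-hand endogenous variable on all exogenous variables
    Z = np.column_stack([const, x1, x2])
    y1_hat = Z @ np.linalg.lstsq(Z, y1, rcond=None)[0]
    y2_hat = Z @ np.linalg.lstsq(Z, y2, rcond=None)[0]

    # Stage II: replace y2 (resp. y1) with its fitted value and run OLS on each equation
    X1 = np.column_stack([const, y2_hat, x1])
    X2 = np.column_stack([const, y1_hat, x2])
    beta_tilde = np.linalg.lstsq(X1, y1, rcond=None)[0]    # (b1, b2, b3)
    alpha_tilde = np.linalg.lstsq(X2, y2, rcond=None)[0]   # (a1, a2, a3)
    return beta_tilde, alpha_tilde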

Chapter 14

Nonlinear Least Squares


14.1


Review of Least Squares Principle

(A.)  “Regression” model with only an intercept term:

  y_t  =  α  +  e_t

  e_t  =  y_t  −  α

  SSE  =  Σ e_t²  =  Σ (y_t − α)²        (minimize the sum of squared errors)

  ∂SSE/∂α  =  −2 Σ (y_t − α̂)  =  0

  Σ y_t  −  Σ α̂  =  0        →        Σ y_t  −  T α̂  =  0

This yields an exact analytical solution:

  α̂  =  (1/T) Σ y_t  =  ȳ

14.2


Review of Least Squares

(B.)  Regression model without an intercept term:

  y_t  =  β x_t  +  e_t

  e_t  =  y_t  −  β x_t

  SSE  =  Σ e_t²  =  Σ (y_t − β x_t)²

  ∂SSE/∂β  =  −2 Σ x_t (y_t − β̂ x_t)  =  0

  Σ x_t y_t  −  β̂ Σ x_t²  =  0        →        β̂ Σ x_t²  =  Σ x_t y_t

This yields an exact analytical solution:

  β̂  =  Σ x_t y_t  /  Σ x_t²

14.3


Review of Least Squares

(C.)  Regression model with both an intercept and a slope:

  y_t  =  α  +  β x_t  +  e_t

  SSE  =  Σ (y_t − α − β x_t)²

  ∂SSE/∂α  =  −2 Σ (y_t − α̂ − β̂ x_t)  =  0

  ∂SSE/∂β  =  −2 Σ x_t (y_t − α̂ − β̂ x_t)  =  0

so that

  ȳ  −  α̂  −  β̂ x̄  =  0

  Σ x_t y_t  −  α̂ Σ x_t  −  β̂ Σ x_t²  =  0

This yields an exact analytical solution:

  α̂  =  ȳ  −  β̂ x̄        β̂  =  Σ (x_t − x̄)(y_t − ȳ)  /  Σ (x_t − x̄)²

14.4

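The closed-form solution is straightforward to compute (a sketch; the function name is illustrative):

import numpy as np

def ols_slope_intercept(x, y):
    """Closed-form least squares for y_t = alpha + beta*x_t + e_t:
    beta_hat = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2),  alpha_hat = ybar - beta_hat*xbar."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xbar, ybar = x.mean(), y.mean()
    beta_hat = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    alpha_hat = ybar - beta_hat * xbar
    return alpha_hat, beta_hat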
Nonlinear Least Squares

(D.)  Nonlinear regression model:

  y_t  =  x_t^β  +  e_t

  SSE  =  Σ (y_t − x_t^β)²

PROBLEM:  An exact analytical solution to this does not exist.

  ∂SSE/∂β  =  −2 Σ x_t^β̂ ln(x_t) (y_t − x_t^β̂)  =  0

  Σ [ x_t^β̂ ln(x_t) y_t ]  −  Σ [ x_t^{2β̂} ln(x_t) ]  =  0

Must use a numerical search algorithm to try to find the value of β̂ that satisfies this.

14.5

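A crude numerical search is easy to sketch (the simulated data and the true β = 1.7 below are my own illustration):

import numpy as np

def sse(beta, x, y):
    """Sum of squared errors for the nonlinear model y_t = x_t**beta + e_t."""
    return np.sum((y - x ** beta) ** 2)

def grid_search_beta(x, y, lo=0.0, hi=3.0, n=3001):
    """Evaluate SSE on a grid of beta values and keep the minimizer.
    A sketch only -- real software uses smarter iterative algorithms (see Gauss-Newton below)."""
    grid = np.linspace(lo, hi, n)
    sses = np.array([sse(b, x, y) for b in grid])
    return grid[np.argmin(sses)]

# Illustration with simulated data:
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 5.0, size=100)
y = x ** 1.7 + rng.normal(0.0, 0.5, size=100)
print(grid_search_beta(x, y))   # should land near 1.7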

Find Minimum of Nonlinear SSE

  SSE  =  Σ (y_t − x_t^β)²

(The slide plots SSE against β; the least squares estimate β̂ is the value of β at the minimum of the curve.)

14.6



Conclusion

The least squares principle is still appropriate when the model is nonlinear, but it is harder to find the solution.

14.7

Nonlinear least squares optimization methods:  The Gauss-Newton Method

(Optional Appendix)

14.8


The Gauss-Newton Algorithm

1.  Apply the Taylor series expansion to the nonlinear model around some initial b(0).

2.  Run Ordinary Least Squares (OLS) on the linear part of the Taylor series to get b(m).

3.  Perform a Taylor series expansion around the new b(m) to get b(m+1).

4.  Relabel b(m+1) as b(m) and rerun steps 2.–4.

5.  Stop when (b(m+1) − b(m)) becomes very small.

14.9

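A sketch of these steps for the single-parameter model y_t = x_t^β + e_t from earlier in the chapter (function name, starting value, and tolerance are my own choices):

import numpy as np

def gauss_newton_power_model(x, y, beta0=1.0, tol=1e-8, max_iter=50):
    """Gauss-Newton for y_t = x_t**beta + e_t.  Each iteration linearizes f around
    the current estimate b(m) and runs OLS on the transformed dependent variable
    y(0) = y - f(x, b(m)) + f'(x, b(m))*b(m)."""
    b = beta0
    for _ in range(max_iter):
        f = x ** b                      # f(X_t, b(m))
        fprime = (x ** b) * np.log(x)   # df/dbeta evaluated at b(m)
        y0 = y - f + fprime * b         # transformed dependent variable
        b_new = np.sum(fprime * y0) / np.sum(fprime ** 2)   # OLS with a single regressor
        if abs(b_new - b) < tol:        # step 5: stop when the change is very small
            return b_new
        b = b_new                       # step 4: relabel and repeat
    return b

# Example with the simulated data from the grid-search sketch above:
# print(gauss_newton_power_model(x, y, beta0=1.0))   # converges near the same minimizer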
The Gauss-Newton Method

  y_t  =  f(X_t, b)  +  ε_t        for  t = 1, . . . , n

Do a Taylor series expansion around the vector b = b(0) as follows:

  f(X_t, b)  =  f(X_t, b(0))  +  f′(X_t, b(0)) (b − b(0))
                  +  (b − b(0))ᵀ f″(X_t, b(0)) (b − b(0))  +  R_t

so that

  y_t  =  f(X_t, b(0))  +  f′(X_t, b(0)) (b − b(0))  +  ε_t*

where   ε_t*  =  (b − b(0))ᵀ f″(X_t, b(0)) (b − b(0))  +  R_t  +  ε_t

14.10


  y_t  =  f(X_t, b(0))  +  f′(X_t, b(0)) (b − b(0))  +  ε_t*

  y_t  −  f(X_t, b(0))  =  f′(X_t, b(0)) b  −  f′(X_t, b(0)) b(0)  +  ε_t*

  y_t  −  f(X_t, b(0))  +  f′(X_t, b(0)) b(0)  =  f′(X_t, b(0)) b  +  ε_t*

  y_t(0)  =  f′(X_t, b(0)) b  +  ε_t*

where   y_t(0)  ≡  y_t  −  f(X_t, b(0))  +  f′(X_t, b(0)) b(0)

This is linear in b.  Gauss-Newton just runs OLS on this transformed, truncated Taylor series.

14.11



  y_t(0)  =  f′(X_t, b(0)) b  +  ε_t*

Gauss-Newton just runs OLS on this transformed, truncated Taylor series, or, in matrix terms for t = 1, . . . , n:

  y(0)  =  f′(X, b(0)) b  +  ε*

  b̂  =  [ f′(X, b(0))ᵀ f′(X, b(0)) ]⁻¹ f′(X, b(0))ᵀ y(0)

This is analogous to linear OLS, where  y = Xb + ε  led to the solution

  b̂  =  (Xᵀ X)⁻¹ Xᵀ y

except that X is replaced with the matrix of first derivatives.

