Re: [R] maximum likelihood convergence reproducing Anderson Blundell 1982 Econometrica R vs Stata

Alex Olssen Mon, 09 May 2011 13:59:08 -0700

Wow that is really interesting,

Sorry I was asleep when you emailed these.


And yes, of course, I had been trying to implement model 18, not 18s,
that was a typo, sorry.

I will have a look at the code you posted.

Thanks,

Alex

On 10 May 2011 02:18, peter dalgaard <pda...@gmail.com> wrote:
> Hmm, I tried replacing the x's in the model with their principal component 
> scores, and suddenly everything converges as a greased lightning:
>
>> Z <- princomp(cbind(x1,x2,x3))$scores
>> Z <- as.data.frame(princomp(cbind(x1,x2,x3))$scores)
>> names(Z)<- letters[1:3]
>> optimx(rep(0,8),lnl, hessian=TRUE, y1=y1, y2=y2,
> + x1=Z$a, x2=Z$b, x3=Z$c, method="BFGS")
>                                                                               
>                  par
> 1 0.59107682, 0.01043568, -0.20639153, 0.25902746, 0.24675162, -0.01426477, 
> 0.18045177, -0.23657327
>    fvalues method fns grs itns conv KKT1 KKT2 xtimes
> 1 -76.73878   BFGS 157  37 NULL    0 TRUE TRUE  0.055
> Warning messages:
> 1: In max(logpar) : no non-missing arguments to max; returning -Inf
> 2: In min(logpar) : no non-missing arguments to min; returning Inf
>
> The likelihood appears to be spot on, but, obviously, the parameter estimates 
> need to be back-transformed to the original x1,x2,x3 system. I'll leave that 
> as the proverbial "exercise for the reader"...
>
> However, the whole thing leads me to a suspicion that maybe our numerical 
> gradients are misbehaving in cases with high local collinearity?
>
> -pd
>
> On May 9, 2011, at 13:40 , Alex Olssen wrote:
>
>> Hi Mike,
>>
>> Mike said
>> "is this it, page 1559?"
>>
>> That is the front page yes, page 15*6*9 has the table, of which the
>> model labelled 18s is the one I replicated.
>>
>> "did you actually cut/paste code anywhere and is your first
>> coefficient -.19 or -.019?
>> Presumably typos would be one possible problem."
>>
>> -0.19 is not a typo, it is pi10 in the paper, and a1 in my Stata
>> estimation - as far as I can tell cutting and pasting is not the
>> problem.
>>
>> "generally it helps if we could at least see the equations to check your
>> code against typos ( note page number ?) in lnl that may fix part of the
>> mystery.  Is full text available
>> on author's site, doesn't come up on citeseer AFAICT,"
>>
>> Unfortunately I do not know of any place to get the full version of
>> this paper that doesn't require access to a database such as JSTOR.
>> The fact that this likelihood function reproduces the published
>> results in Stata makes me confident that it is correct - I also have
>> read a lot on systems estimation and it is a pretty standard
>> likelihood function.
>>
>> "I guess one question would be " what is beta" in lnl supposed to be -
>> it isn't used anywhere but I will also mentioned I'm not that familiar
>> with the R code ( I'm trying to work through this to learn R and the
>> optimizers).
>>
>> maybe some words would help, is sigma supposed to be 2x2 or 8x8 and what are
>> e1 and e2 supposed to be?"
>>
>> The code above is certainly not as elegant as it could be - in this
>> case the beta is rather superfluous.  It is just a vector to hold
>> parameters - but since I called all the parameters theta anyway there
>> is no need for it.  e1 and e2 are the residuals from the first and
>> second equations of the system.  Sigma is a 2x2 matrix which is the
>> outer product of the two vectors of residuals.
>>
>> Kind regards,
>>
>> Alex
>>
>>
>>
>> On 9 May 2011 23:12, Mike Marchywka <marchy...@hotmail.com> wrote:
>>>
>>>
>>>
>>>
>>>
>>>
>>> ----------------------------------------
>>>> Date: Mon, 9 May 2011 22:06:38 +1200
>>>> From: alex.ols...@gmail.com
>>>> To: pda...@gmail.com
>>>> CC: r-help@r-project.org; da...@otter-rsch.com
>>>> Subject: Re: [R] maximum likelihood convergence reproducing Anderson 
>>>> Blundell 1982 Econometrica R vs Stata
>>>>
>>>> Peter said
>>>>
>>>> "Ahem! You might get us interested in your problem, but not to the
>>>> level that we are going to install Stata and Tsp and actually dig out
>>>> and study the scientific paper you are talking about. Please cite the
>>>> results and explain the differences."
>>>>
>>>> Apologies Peter, will do,
>>>>
>>>> The results which I can emulate in Stata but not (yet) in R are reported 
>>>> below.
>>>
>>> did you actually cut/paste code anywhere and is your first coefficient -.19 
>>> or -.019?
>>> Presumably typos would be one possible problem.
>>>
>>>> They come from Econometrica Vol. 50, No. 6 (Nov., 1982), pp. 1569
>>>
>>>
>>> is this it, page 1559?
>>>
>>> http://www.jstor.org/pss/1913396
>>>
>>> generally it helps if we could at least see the equations to check your
>>> code against typos ( note page number ?) in lnl that may fix part of the
>>> mystery.  Is full text available
>>> on author's site, doesn't come up on citeseer AFAICT,
>>>
>>>
>>> http://citeseerx.ist.psu.edu/search?q=blundell+1982&sort=ascdate
>>>
>>> I guess one question would be " what is beta" in lnl supposed to be -
>>> it isn't used anywhere but I will also mentioned I'm not that familiar
>>> with the R code ( I'm trying to work through this to learn R and the 
>>> optimizers).
>>>
>>> maybe some words would help, is sigma supposed to be 2x2 or 8x8 and what are
>>> e1 and e2 supposed to be?
>>>
>>>
>>>
>>>
>>>>
>>>> TABLE II - model 18s
>>>>
>>>> coef std err
>>>> p10 -0.19 0.078
>>>> p11 0.220 0.019
>>>> p12 -0.148 0.021
>>>> p13 -0.072
>>>> p20 0.893 0.072
>>>> p21 -0.148
>>>> p22 0.050 0.035
>>>> p23 0.098
>>>>
>>>> The results which I produced in Stata are reported below.
>>>> I spent the last hour rewriting the code to reproduce this - since I
>>>> am now at home and not at work :(
>>>> My results are "identical" to those published. The estimates are for
>>>> a 3 equation symmetrical singular system.
>>>> I have not bothered to report symmetrical results and have backed out
>>>> an extra estimate using adding up constraints.
>>>> I have also backed out all standard errors using the delta method.
>>>>
>>>> . ereturn display
>>>> ------------------------------------------------------------------------------
>>>> | Coef. Std. Err. z P>|z| [95% Conf. Interval]
>>>> -------------+----------------------------------------------------------------
>>>> a |
>>>> a1 | -.0188115 .0767759 -0.25 0.806 -.1692895 .1316664
>>>> a2 | .8926598 .0704068 12.68 0.000 .7546651 1.030655
>>>> a3 | .1261517 .0590193 2.14 0.033 .010476 .2418275
>>>> -------------+----------------------------------------------------------------
>>>> g |
>>>> g11 | .2199442 .0184075 11.95 0.000 .183866 .2560223
>>>> g12 | -.1476856 .0211982 -6.97 0.000 -.1892334 -.1061378
>>>> g13 | -.0722586 .0145154 -4.98 0.000 -.1007082 -.0438089
>>>> g22 | .0496865 .0348052 1.43 0.153 -.0185305 .1179034
>>>> g23 | .0979991 .0174397 5.62 0.000 .0638179 .1321803
>>>> g33 | -.0257405 .0113869 -2.26 0.024 -.0480584 -.0034226
>>>> ------------------------------------------------------------------------------
>>>>
>>>> In R I cannot get results like this - I think it is probably to do
>>>> with my inability at using the optimisers well.
>>>> Any pointers would be appreciated.
>>>>
>>>> Peter said "Are we maximizing over the same parameter space? You say
>>>> that the estimates from the paper gives a log-likelihood of 54.04, but
>>>> the exact solution clocked in at 76.74, which in my book is rather
>>>> larger."
>>>>
>>>> I meant +54.04 > -76.74. It is quite common to get positive
>>>> log-likelihoods in these system estimation.
>>>>
>>>> Kind regards,
>>>>
>>>> Alex
>>>>
>>>> On 9 May 2011 19:04, peter dalgaard  wrote:
>>>>>
>>>>> On May 9, 2011, at 06:07 , Alex Olssen wrote:
>>>>>
>>>>>> Thank you all for your input.
>>>>>>
>>>>>> Unfortunately my problem is not yet resolved.  Before I respond to
>>>>>> individual comments I make a clarification:
>>>>>>
>>>>>> In Stata, using the same likelihood function as above, I can reproduce
>>>>>> EXACTLY (to 3 decimal places or more, which is exactly considering I
>>>>>> am using different software) the results from model 8 of the paper.
>>>>>>
>>>>>> I take this as an indication that I am using the same likelihood
>>>>>> function as the authors, and that it does indeed work.
>>>>>> The reason I am trying to estimate the model in R is because while
>>>>>> Stata reproduces model 8 perfectly it has convergence
>>>>>> difficulties for some of the other models.
>>>>>>
>>>>>> Peter Dalgaard,
>>>>>>
>>>>>> "Better starting values would help. In this case, almost too good
>>>>>> values are available:
>>>>>>
>>>>>> start <- c(coef(lm(y1~x1+x2+x3)), coef(lm(y2~x1+x2+x3)))
>>>>>>
>>>>>> which appears to be the _exact_ solution."
>>>>>>
>>>>>> Thanks for the suggestion.  Using these starting values produces the
>>>>>> exact estimate that Dave Fournier emailed me.
>>>>>> If these are the exact solution then why did the author publish
>>>>>> different answers which are completely reproducible in
>>>>>> Stata and Tsp?
>>>>>
>>>>>
>>>>> Ahem! You might get us interested in your problem, but not to the level 
>>>>> that we are going to install Stata and Tsp and actually dig out and study 
>>>>> the scientific paper you are talking about. Please cite the results and 
>>>>> explain the differences.
>>>>>
>>>>> Are we maximizing over the same parameter space? You say that the 
>>>>> estimates from the paper gives a log-likelihood of 54.04, but the exact 
>>>>> solution clocked in at 76.74, which in my book is rather larger.
>>>>>
>>>>> Confused....
>>>>>
>>>>> -p
>>>>>
>>>>>
>>>>>>
>>>>>> Ravi,
>>>>>>
>>>>>> Thanks for introducing optimx to me, I am new to R.  I completely
>>>>>> agree that you can get higher log-likelihood values
>>>>>> than what those obtained with optim and the starting values suggested
>>>>>> by Peter.  In fact, in Stata, when I reproduce
>>>>>> the results of model 8 to more than 3 dp I get a log-likelihood of 
>>>>>> 54.039139.
>>>>>>
>>>>>> Furthermore if I estimate model 8 without symmetry imposed on the
>>>>>> system I reproduce the Likelihood Ratio reported
>>>>>> in the paper to 3 decimal places as well, suggesting that the
>>>>>> log-likelihoods I am reporting differ from those in the paper
>>>>>> only due to a constant.
>>>>>>
>>>>>> Thanks for your comments,
>>>>>>
>>>>>> I am still highly interested in knowing why the results of the
>>>>>> optimisation in R are so different to those in Stata?
>>>>>>
>>>>>> I might try making my convergence requirements more stringent.
>>>>>>
>>>>>> Kind regards,
>>>>>>
>>>>>> Alex
>>>>>
>>>>> --
>>>>> Peter Dalgaard
>>>>> Center for Statistics, Copenhagen Business School
>>>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>>>> Phone: (+45)38153501
>>>>> Email: pd....@cbs.dk  Priv: pda...@gmail.com
>>>>>
>>>>>
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide 
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd....@cbs.dk  Priv: pda...@gmail.com
>
>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] maximum likelihood convergence reproducing Anderson Blundell 1982 Econometrica R vs Stata

Reply via email to