Re: [R] R_using non linear regression with constraints

J C Nash Tue, 20 Jun 2017 13:59:33 -0700

I took another look at the problem.

Essentially there isn't enough information in the data to estimate the 
parameters. What there is
doesn't have the right shape for the model.


You can see this by plotting the data -- essentially a straight line.

Get the slope as follows:

   rr <- mean(mydata$y/mydata$x, na.rm=TRUE)

   rr

   mydata$yx <- mydata$y - rr*mydata$x

   with(mydata, plot(x, yx))

Shows that the curve is bending the wrong way for the model.

More data is needed, especially at larger values of x.

JN

On 2017-06-18 06:23 PM, Jeff Newmiller wrote:

I am not as expert as John, but I thought it worth pointing out that the variable substitution technique gives up oneset of constraints for another (b=0 in this case). I also find that plots help me see what is going on, so here is myreproducible example (note inclusion of library calls for completeness). Note that NONE of the optimizers mentioned so farappear to be finding the true best fit. The fact that myfun() yields 0 always if t=0 and that condition is within thedata given seems likely to be part of the problem. I don't know how to resolve this... perhaps John will look at it again.

David: Your thinking makes fine sense if you are using a Monte Carlo or brute force solution, but the fact that itcreates a discontinuity in the objective function will confuse any optimizer that uses analytic or numerically estimatedslopes.


##----------begin
library(minpack.lm)
library(ggplot2)

mydata <- data.frame( x = c( 0, 5, 9, 13, 17, 20 )
                     , y = c( 0, 11, 20, 29, 38, 45 )
                     )

myfun <- function( a, b, r, t ) {
   a * b * ( 1 - exp( -b * r * t ) )
}

objdta <- expand.grid( a = seq( 1000, 3000, by=20 )
                      , b = seq( -0.01, 1, 0.01 )
                      , rowidx = seq.int( nrow( mydata ) )
                      )
objdta[ , c( "y", "t" ) ] <- mydata[ objdta$rowidx
                                    , c( "y", "x" ) ]
objdta$tf <- factor( objdta$t )
objdta$myfun <- with( objdta
                     , myfun( a = a, b = b, r = 2, t = t )
                     )
objdtass <- aggregate( ( objdta$myfun - objdta$y )^2
                      , objdta[ , c( "a", "b" ) ]
                      , FUN = function( x )
                               sum( x, na.rm=TRUE )
                      )
objdtassmin <- objdtass[ which.min( objdtass$x ), ]

myfit <- nlsLM( y ~ myfun( a, b, r=2, t=x )
               , data = mydata
               , start = list( a = 2000
                             , b = 0.05
                             )
               , lower = c( 1000, 0 )
               , upper = c( 3000, 1 )
               )
a <- as.vector( coef( myfit )[ "a" ] )
b <- as.vector( coef( myfit )[ "b" ] )

brks <- c( 500, 1e7, 2e7, 3e7, 4e7 )
ggplot( objdtass, aes( x=a, y=b, z = x, fill=x ) ) +
     geom_tile() +
     geom_contour( breaks= brks ) +
     geom_point( x=a, y=b, colour="red" ) +
     geom_point( x=objdtassmin$a
               , y=objdtassmin$b
               , colour="green" ) +
     scale_fill_continuous( name="SumSq", breaks = brks )
# Green point is brute-force solution
# Red point is optimizer solution for myfun

##############

myfun2 <- function( a, log1ab, r, t ) {
   ab <- 1000 - exp( log1ab )
   ab * ( 1 - exp( -ab/a * r * t ) )
}

objdta$log1ab <- with( objdta, log( 1000 - a * b ) )
objdta$myfun2 <- with( objdta
                      , myfun2( a = a
                              , log1ab = log1ab
                              , r = 2
                              , t = t
                              )
                      )
objdtass2 <- aggregate( ( objdta$myfun2 - objdta$y )^2
                       , objdta[ , c( "a", "b" ) ]
                       , FUN = function( x )
                                if ( all( is.na( x ) ) ) NA
                                else sum( x, na.rm=TRUE )
                       )
objdtass2min <- objdtass2[ which.min( objdtass2$x ), ]

myfit2 <- nlsLM( y ~ myfun2( a, log1ab, r = 2, t = x )
                , data = mydata
                , start = list( a = 2000
                              , log1ab = 4.60517
                              )
                , lower = c( 1000, 0 )
                , upper = c( 3000, 8.0063 )
                )
a2 <- as.vector( coef( myfit2 )[ "a" ] )
b2 <- ( 1000
       - exp( as.vector( coef( myfit2 )[ "log1ab" ] ) )
       ) / a

brks <- c( 500, 1e6, 2e6, 3e6, 4e6 )
ggplot( objdtass2, aes( x=a, y=b, z = x, fill=x ) ) +
     geom_tile() +
     geom_contour( breaks = brks ) +
     geom_point( x=a2, y=b2, colour="red" ) +
     geom_point( x=objdtass2min$a
               , y=objdtass2min$b
               , colour="green" ) +
     scale_fill_continuous( name="SumSq", breaks = brks )
# Green point is brute-force solution
# Red point is optimizer solution for myfun2

##----------end

On Sun, 18 Jun 2017, J C Nash wrote:

I ran the following script. I satisfied the constraint by
making a*b a single parameter, which isn't always possible.
I also ran nlxb() from nlsr package, and this gives singular
values of the Jacobian. In the unconstrained case, the svs are
pretty awful, and I wouldn't trust the results as a model, though
the minimum is probably OK. The constrained result has a much
larger sum of squares.

Notes:
1) nlsr has been flagged with a check error by CRAN (though it
is in the vignette, and also mentions pandoc a couple of times).
I'm working to purge the "bug", and found one on our part, but
not necessarily all the issues.
2) I used nlxb that requires an expression for the model. nlsLM
can use a function because it is using derivative approximations,
while nlxb actually gets a symbolic or automatic derivative if
it can, else squawks.

JN

#  Here's the script #
#
# Manoranjan Muthusamy <ranjanmano...@gmail.com>
#

library(minpack.lm)
mydata=data.frame(x=c(0,5,9,13,17,20),y = c(0,11,20,29,38,45))

myfun=function(a,b,r,t){
 prd=a*b*(1-exp(-b*r*t))
 return(prd)}

# and using nlsLM

myfit=nlsLM(y~myfun(a,b,r=2,t=x),data=mydata,start=list(a=2000,b=0.05),
                 lower = c(1000,0), upper = c(3000,1))
summary(myfit)
library(nlsr)
r <- 2
myfitj=nlxb(y~a*b*(1-exp(-b*r*x)),data=mydata,start=list(a=2000,b=0.05), 
trace=TRUE)
summary(myfitj)
print(myfitj)

myfitj2<-nlxb(y~ab*(1-exp(-b*r*x)),data=mydata,start=list(ab=2000*0.05,b=0.05), 
trace=TRUE)
summary(myfitj2)
print(myfitj2)

myfitj2b<-nlxb(y~ab*(1-exp(-b*r*x)),data=mydata,start=list(ab=2000*0.05,b=0.05),
               trace=TRUE, upper=c(1000, Inf))
summary(myfitj2b)
print(myfitj2b)
# End of script #

On 2017-06-18 01:29 PM, Bert Gunter wrote:

https://cran.r-project.org/web/views/Optimization.html

(Cran's optimization task view -- as always, you should search before posting)

In general, nonlinear optimization with nonlinear constraints is hard,
and the strategy used here (multiplying by a*b < 1000) may not work --
it introduces  a discontinuity into the objective function, so
gradient based methods may in particular be problematic.  As usual, if
either John Nash or Ravi Varadhan comment, heed what they suggest. I'm
pretty ignorant.

Cheers,
Bert






Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Jun 18, 2017 at 9:43 AM, David Winsemius <dwinsem...@comcast.net> wrote:

On Jun 18, 2017, at 6:24 AM, Manoranjan Muthusamy <ranjanmano...@gmail.com> 
wrote:

I am using nlsLM {minpack.lm} to find the values of parameters a and b of
function myfun which give the best fit for the data set, mydata.

mydata=data.frame(x=c(0,5,9,13,17,20),y = c(0,11,20,29,38,45))

myfun=function(a,b,r,t){
  prd=a*b*(1-exp(-b*r*t))
  return(prd)}

and using nlsLM

myfit=nlsLM(y~myfun(a,b,r=2,t=x),data=mydata,start=list(a=2000,b=0.05),
                  lower = c(1000,0), upper = c(3000,1))

It works. But now I would like to introduce a constraint which is a*b<1000.

At the moment your coefficients do satisfy that constraint so that dataset is not suitable for testing. A slightmodification of the objective function to include the logical constraint as an additional factor does not "break"that particular solution.:


myfun2=function(a,b,r,t){
     prd=a*b*(1-exp(-b*r*t))*(a*b<1000)
     return(prd)}


myfit=nlsLM(y~myfun2(a,b,r=2,t=x),data=mydata,start=list(a=2000,b=0.05),
             lower = c(1000,0), upper = c(3000,1))

#------------------
myfit
Nonlinear regression model
   model: y ~ myfun2(a, b, r = 2, t = x)
    data: mydata
         a         b
3.000e+03 2.288e-02
  residual sum-of-squares: 38.02

Number of iterations to convergence: 8
Achieved convergence tolerance: 1.49e-08
#--

prod(coef(myfit))
#[1] 68.64909  Same as original result.

How nlsLM will handle more difficult problems is not something I have experience with, but obviously one would needto keep the starting values within the feasible domain. However, if your goal was to also remove the upper and lowerconstraints on a and b, This problem would not be suitably solved by the a*b product without relaxation of thedefault maxiter:

myfit=nlsLM(y~myfun2(a,b,r=2,t=x),data=mydata,start=list(a=2000,b=0.05),

+             lower = c(0,0), upper = c(9000,1))

prod(coef(myfit))

[1] 110.4382

coef(myfit)

            a            b
9.000000e+03 1.227091e-02

  myfit=nlsLM(y~myfun2(a,b,r=2,t=x),data=mydata,start=list(a=2000,b=0.05),

+              lower = c(0,0), upper = c(10^6,1))
Warning message:
In nls.lm(par = start, fn = FCT, jac = jac, control = control, lower = lower,  :
   lmdif: info = -1. Number of iterations has reached `maxiter' == 50.

#---------
  myfit=nlsLM(y~myfun2(a,b,r=2,t=x),data=mydata,start=list(a=2000,b=0.05),
              lower = c(0,0), upper = c(10^6,1), control=list(maxiter=100))
  prod(coef(myfit))

  coef(myfit)
#============

  prod(coef(myfit))

[1] 780.6732  Significantly different than the solution at default maxiter of 
50.


  coef(myfit)

            a            b
5.319664e+05 1.467524e-03



--
David.

I had a look at the option available in nlsLM to set constraint via
nls.lm.control. But it's not much of help. can somebody help me here or
suggest a different method to to this?

       [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R_using non linear regression with constraints

Reply via email to