Luigi, Your diagnostic steps for estimating Min, Max, and HalfFluoCycle look spot-on. You correctly identified that your linear model gave an unreasonably large slope (SLP = 57,724), which likely caused the nls() convergence failure in mod1.
Makes sense that fixing the slope at 2 gave you a stable fit. That confirms the importance of reasonable starting values, especially for parameters like Slope, which drastically influence the logistic shape. Recommendations for Generalizing This: If you're applying this to other datasets, you don’t necessarily need to "hard-code" Slope = 2, but you can default to 2 and allow refinement if the fit isn't good. Here’s one simple strategy: ># fallback slope guess >initial_slope <- 2 > ># optionally, refine using the steepest point ># Find approximate cycle where derivative (change) is highest >diffs <- diff(df$Fluo) >steepest_idx <- which.max(diffs) ># Estimate slope based on this region >if (steepest_idx > 1 && steepest_idx < nrow(df)-1) { > local_mod <- lm(Fluo ~ Cycle, data = df[(steepest_idx-1):>(steepest_idx+1), > ]) > est_slope <- abs(local_mod$coefficients[2]) > if (est_slope < 1000) initial_slope <- est_slope # sanity check >} Then use: >mod <- nls(Fluo ~ MaxFluo / (1 + exp(-(Cycle - HalfFluoCycle)/Slope)) + >Noise, > data = df, > start = list(MaxFluo = MAX_FLUO, > HalfFluoCycle = HLF_CYC, > Slope = initial_slope, > Noise = S)) Fixing the Slope Can Still Be Valid - in practice, if your datasets have similar kinetics (e.g., same reaction conditions or system), the slope shape might be consistent across runs, so fixing or constraining the slope (e.g., bounding it within 1 < Slope < 5) is entirely reasonable and will increase model stability. You can also use nlsLM() from minpack.lm - it should provide better convergence and bounds: >library(minpack.lm) >mod <- nlsLM(Fluo ~ MaxFluo / (1 + exp(-(Cycle - HalfFluoCycle)/Slope)) >+ >Noise, > data = df, > start = list(MaxFluo = MAX_FLUO, > HalfFluoCycle = HLF_CYC, > Slope = 2, > Noise = S), > lower = c(0, 0, 0.5, 0), upper = c(Inf, 35, 10, Inf)) Next best step would be to package this into a reusable function — either for applying it across multiple datasets or for doing reverse lookups (like finding the Cycle corresponding to a specific Fluo reading). Best, Gregg On Thursday, April 24th, 2025 at 12:27 PM, Luigi Marongiu <marongiu.lu...@gmail.com> wrote: > > > Thank you. Following your tips I tried to guess the starting values > using an approached that determined (1) the background level (the > fluorescence before the take over signal), (2) the slope between this > point and the maximum: > `df <- data.frame(Cycle=1:35, Fluo=c(15908.54, 16246.92, 16722.43, 17104.29, > 16794.93, 17031.44, 17299.05, 17185.49, 17362.28, 17127.43, 17368.96, > 17513.19, 17593.95, 17626.37, 18308.29, 18768.12, 19955.26, 22493.85, > 27088.12, 36539.44, 53694.18, 84663.18, 138835.64, 223331.89, 336409.69, > 457729.88, 561233.12, 637536.31, 688985.88, 723158.56, 746575.62, 766274.75, > 776087.75, 785144.81, 791573.81) ) # estimate starting values S = > sd(df$Fluo[1:3])*10 for (j in 4:nrow(df)) { x = df$Fluo[1:j] s = sd(x) q = > df$Fluo[j] cat("SD = ", S, "Fluo = ", q, "\\n") if(q<S) { break } else { S = > sd(df$Fluo[1:j])*10 } } MIN_FLUO = q MIN_IDX = which (df$Fluo == MIN_FLUO) > MIN_CYC = df$Cycle[MIN_IDX] MAX_FLUO = max(df$Fluo) MAX_IDX = which (df$Fluo > == MAX_FLUO) MAX_CYC = df$Cycle[MAX_IDX] HLF_CYC = MIN_CYC + (MAX_CYC - > MIN_CYC)/2 q_mod = lm(Fluo~Cycle, df[MIN_IDX:MAX_IDX,]) SLP = > q_mod$coefficients[2] # plot plot(Fluo~Cycle, df) points(MIN_FLUO~MIN_CYC, > col="blue", pch=16) points(MAX_FLUO~MAX_CYC, col="red", pch=16) > abline(h=MAX_FLUO, col="red") abline(q_mod, col="green") abline(v=HLF_CYC, > col="gold") abline(h=S, col="purple") legend("topleft", > legend=c("HalfFluoCycle", "MaxFluo", "Slope", "Noise"), lty=1, col=c("gold", > "red", "green", "purple"), title="Estimated parameters", lwd=2) > legend("left", legend=c("Min", "Max"), title="Selected interval", pch=16, > col=c("blue", "red")) # regression mod1 = nls(Fluo ~ (MaxFluo / (1+ > exp(-(Cycle - HalfFluoCycle)/Slope)) + Noise), data=df, > start=list(MaxFluo=MAX_FLUO, HalfFluoCycle=HLF_CYC, Slope=SLP, Noise=S)) mod2 > = nls(Fluo ~ (MaxFluo / (1+ exp(-(Cycle - HalfFluoCycle)/Slope)) + Noise), > data=df, start=list(MaxFluo=MAX_FLUO, HalfFluoCycle=HLF_CYC, Slope=2, > Noise=S)) lines(df$Cycle, predict(mod2), col = "orange", lwd = 2)` > Model 1 failed probably because SLP = 57724.29 , which is a weird > slope value; model 2 instead worked, using your suggested value of 2. > If I could hard-code the slope to 2, unless there is a better way to > determine such a slope, then I think I could try to extend the > approach to other data... > Best regards > Luigi > > On Wed, Apr 23, 2025 at 6:43 PM Gregg Powell g.a.pow...@protonmail.com wrote: > > > Hello Luigi, > > > > Great follow-up — Looks like you’re on the right track using nls() for > > nonlinear regression. You're fitting a logistic-like sigmoidal model (as in > > Rutledge’s paper), I think both errors you’re encountering are signs of bad > > starting values or a maybe model formulation issue. > > > > Breakdown of Your Model Attempt > > Your model formula: > > > > > Fluo ~ (MaxFluo / (1 + exp(-(Cycle - (HalfFluoCycle / Slope)))) + 1) > > > > This expression has perhaps two issues: > > > > 1. Parameter placement: In the paper, the term inside exp() should look > > like -(Cycle - HalfFluoCycle)/Slope — not Cycle - (HalfFluoCycle / Slope). > > > > 2. +1 additive constant: You called it backgroundSignal, but it’s > > hard-coded as +1. That constant should be a parameter (e.g., Bkg), > > otherwise the model fit has no way to adjust it. > > > > Alternative Model Form: > > Here’s a likely corrected version of the model based on Rutledge’s > > sigmoidal fit: > > > > > mod1 <- nls(Fluo ~ MaxFluo / (1 + exp(-(Cycle - HalfFluoCycle)/Slope)) > > > > + Bkg, > > > data = df, > > > start = list(MaxFluo = max(df$Fluo), > > > HalfFluoCycle = 20, > > > Slope = 2, > > > Bkg = min(df$Fluo))) > > > > Notes on nls() Convergence > > nls() is very sensitive to starting values. Always inspect your plot and > > roughly estimate parameters: > > > > MaxFluo: Use max(Fluo) > > > > Bkg: Use min(Fluo) > > > > HalfFluoCycle: Approximate where the curve inflects > > > > Slope: Try values like 1, 2, 5 depending on steepness > > > > You can also add trace=TRUE to watch convergence progress: > > > > > mod1 <- nls(..., trace=TRUE) > > > > About SSmicmen > > From what I've gathered - this is meant for fitting the Michaelis-Menten > > model (enzyme kinetics), not sigmoidal growth. That's why mod2 fails — > > wrong self-starting model function. > > > > Visualizing the Fit > > After fitting: > > > > > plot(Fluo ~ Cycle, data = df) > > > lines(df$Cycle, predict(mod1), col = "red", lwd = 2) > > > > You might also want to compute the inverse (e.g., solving for Cycle given a > > fluorescence value) — that would involve solving the nonlinear equation > > manually or using uniroot(). > > > > All the best, > > gregg > > > > On Wednesday, April 23rd, 2025 at 8:23 AM, Luigi Marongiu > > marongiu.lu...@gmail.com wrote: > > > > > Further on this, I have found a formula from a paper from Rutledge > > > (DOI: 10.1093/nar/gnh177), which I rendered as > > > `MaxFluo / (1+ exp(-(Cycle-(HalfFluoCycle/Slope)))) + backgroundSignal` > > > I then see that one can use `nls` to fit non-linear regression, so I > > > tried: > > > `df <- data.frame(Cycle=1:35, Fluo=c(15908.54, 16246.92, 16722.43, > > > 17104.29, 16794.93, 17031.44, 17299.05, 17185.49, 17362.28, 17127.43, > > > 17368.96, 17513.19, 17593.95, 17626.37, 18308.29, 18768.12, 19955.26, > > > 22493.85, 27088.12, 36539.44, 53694.18, 84663.18, 138835.64, 223331.89, > > > 336409.69, 457729.88, 561233.12, 637536.31, 688985.88, 723158.56, > > > 746575.62, 766274.75, 776087.75, 785144.81, 791573.81) ) plot(Fluo~Cycle, > > > df) mod1 = nls(Fluo~(MaxFluo / (1+ exp(-(Cycle-(HalfFluoCycle/Slope)))) + > > > 1), data=df, start=list(MaxFluo=max(df$Fluo), HalfFluoCycle=15, > > > Slope=0.1)) mod2 = nls(Fluo~SSmicmen(Fluo, Cycle), data=df)` > > > but I got errors in both cases: > > > ``` > > > > > > mod1 > > > > > Error in nlsModel(formula, mf, start, wts, scaleOffset = scOff, > > > nDcentral = nDcntr) : > > > singular gradient matrix at initial parameter estimates > > > > > > mod2 > > > > > Error in qr.qty(QR.rhs, .swts * ddot(attr(rhs, "gradient"), lin)) : > > > NA/NaN/Inf in foreign function call (arg 5) > > > ``` > > > How can I properly set this regression model? > > > Thank you > > > > > On Wed, Apr 16, 2025 at 7:08 AM Luigi Marongiu marongiu.lu...@gmail.com > > > wrote: > > > > > > Thank you. This topic is more complicated than anticipated. Best > > > > regards, Luigi > > > > > > On Tue, Apr 15, 2025 at 11:09 PM Andrew Robinson a...@unimelb.edu.au > > > > wrote: > > > > > > > A statistical (off-topic!) point to consider: when the GLM was > > > > > fitted, you conditioned on x and let y be the random variable. > > > > > Therefore the model supports predictions of y conditional on x. > > > > > You’re now seeking to make predictions of x conditional on y. That’s > > > > > not the same thing, even in OLS. > > > > > > > It might not matter for your application but it’s probably worth > > > > > thinking about. Simulation experiments might shed some light on that. > > > > > > > Cheers, Andrew > > > > > > > -- > > > > > Andrew Robinson > > > > > Chief Executive Officer, CEBRA and Professor of Biosecurity, > > > > > School/s of BioSciences and Mathematics & Statistics > > > > > University of Melbourne, VIC 3010 Australia > > > > > Tel: (+61) 0403 138 955 > > > > > Email: a...@unimelb.edu.au > > > > > Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/ > > > > > > > I acknowledge the Traditional Owners of the land I inhabit, and pay > > > > > my respects to their Elders. > > > > > > > On 16 Apr 2025 at 1:01 AM +1000, Gregg Powell via R-help > > > > > r-help@r-project.org, wrote: > > > > > > > Take a look at this Luigi... > > > > > > > # The model is: logit(p) = β₀ + β₁*Cycles > > > > > # Where p is the probability (or normalized value in your case) > > > > > > > # The inverse function would be: Cycles = (logit⁻¹(p) - β₀)/β₁ > > > > > # Where logit⁻¹ is the inverse logit function (also called the expit > > > > > >function) > > > > > > > # Extract coefficients from your model > > > > > intercept <- coef(b_model)[1] > > > > > slope <- coef(b_model)[2] > > > > > > > # Define the inverse function > > > > > inverse_predict <- function(p) { > > > > > # p is the probability or normalized value you want to find the > > > > > >cycles for > > > > > # Inverse logit: log(p/(1-p)) which is the logit function > > > > > logit_p <- log(p/(1-p)) > > > > > > > # Solve for Cycles: (logit(p) - intercept)/slope > > > > > cycles <- (logit_p - intercept)/slope > > > > > > > return(cycles) > > > > > } > > > > > > > # Example: What cycle would give a normalized value of 0.5? > > > > > inverse_predict(0.5) > > > > > > > This function takes a probability (normalized value between 0 and 1) > > > > > and returns the cycle value that would produce this probability > > > > > according to your model. > > > > > Also: > > > > > This works because GLM with binomial family uses the logit link > > > > > function by default > > > > > The inverse function will return values outside your original data > > > > > range if needed > > > > > This won't work for p=0 or p=1 exactly (you'd get -Inf or Inf), so > > > > > you might want to add checks > > > > > > > All the best, > > > > > Gregg > > > > > > > On Tuesday, April 15th, 2025 at 5:57 AM, Luigi Marongiu > > > > > marongiu.lu...@gmail.com wrote: > > > > > > > I have fitted a glm model to some data; how can I find the inverse > > > > > function of this model? Since I don't know the y=f(x) implemented by > > > > > glm (this works under the hood), I can't define a f⁻¹(y). > > > > > Is there an R function that can find the inverse of a glm model? > > > > > Thank you. > > > > > > > The working example is: > > > > > `V = c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16, -17.39, > > > > > -14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 15.01, > > > > > 75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 5059.94, > > > > > 6071.93, 6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 10077.19, > > > > > 10452.02, 10751.81, 11017.49, 11240.37, 11427.47, 11570.07, 11684.96, > > > > > 11781.77, 11863.35, 11927.44, 11980.81, 12021.88, 12058.35, 12100.63, > > > > > 12133.57, 12148.89, 12137.09) df = data.frame(Cycles = 1:35, Values = > > > > > V[1:35]) M = max(df$Values) df$Norm = df$Values/M df$Norm[df$Norm<0] > > > > > = 0 b_model = glm(Norm ~ Cycles, data=df, family=binomial) plot(Norm > > > > > ~ Cycles, df, main="Normalized view", xlab=expression(bold("Time")), > > > > > ylab=expression(bold("Signal (normalized)")), type="p", col="cyan") > > > > > lines(b_model$fitted.values ~ df$Cycles, col="blue", lwd=2)` > > > > > > > ______________________________________________ > > > > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > > > PLEASE do read the posting guide > > > > > https://www.R-project.org/posting-guide.html > > > > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > > > > Best regards, > > > > Luigi > > > > > -- > > > Best regards, > > > Luigi > > > > > -- > Best regards, > Luigi
signature.asc
Description: OpenPGP digital signature
______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide https://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.