Re: [R] Estimating regression with constraints in model coefficients

Jeff Newmiller via R-help Mon, 21 Apr 2025 12:32:36 -0700

Logarithms are being referred to in two contexts here... log of likelihood, and 
log of domain variables. I just wanted to highlight that the latter has a 
surprisingly simple theoretical basis in optimization.


If you have a function that you want to find an optimum of, but an input 
variable needs to be blocked from exceeding a limit, then if you let the 
optimization engine guess any unconstrained value within (-Inf,Inf) and you add 
lines at the beginning that apply antilog (exp()) to the number that the 
optimizer guesses, then your algorithm never sees numbers on the other side of 
that limit. The optimizer engine can guess all over the place but the function 
will just creep closer or further from the limit. (The limit for log is zero, 
but you can use subtraction/anti-subtraction to move it wherever you need it to 
be.) When the optimizer returns its answer, just remember to apply the same 
transformation as in the optimized function.

There can be numerical issues if you are searching near 1 so often analysts 
will use the expm1 function and log1p functions instead of exp and log... but 
you should be careful to only apply those if the nature of your optimization 
function lends itself to those functions and you have scaled your calculations 
appropriately... if not they may create biases in your results. [1] is a 
cautionary discussion on this point.

[1] https://gist.github.com/jdnewmil/99301a88de702ad2fcbaef33326b08b4

On April 21, 2025 11:25:38 AM PDT, Christofer Bogaso 
<[email protected]> wrote:
>Hi Gregg,
>
>I am sincerely thankful for this workout.
>
>Could you please suggest any text book on how to create log-likelihood
>for an ordinal model like this? Most of my online search point me
>directly to some R function etc, but a theoretical discussion on this
>subject would be really helpful to construct the same.
>
>Thanks and regards,
>
>On Mon, Apr 21, 2025 at 9:55 PM Gregg Powell <[email protected]> wrote:
>>
>> Christofer,
>>
>> Given the constraints you mentioned—bounded parameters, no intercept, and a 
>> sum constraint—you're outside the capabilities of most off-the-shelf ordinal 
>> logistic regression functions in R like vglm or polr.
>>
>> The most flexible recommendation at this point is to implement custom 
>> likelihood optimization using constrOptim() or nloptr::nloptr() with a 
>> manually coded log-likelihood function for the cumulative logit model.
>>
>> Since you need:
>>
>> Coefficient bounds (e.g., lb ≤ β ≤ ub),
>>
>> No intercept,
>>
>> And a constraint like sum(β) = C,
>>
>> …you'll need to code your own objective function. Here's something of a 
>> high-level outline of the approach:
>>
>> A. Model Setup
>> Let your design matrix X be n x p, and let Y take ordered values 1, 2, ..., 
>> K.
>>
>> B. Parameters
>> Assume the thresholds (θ_k) are fixed (or removed entirely), and you’re 
>> estimating only the coefficient vector β. Your constraints are:
>>
>> Box constraints: lb ≤ β ≤ ub
>>
>> Equality constraint: sum(β) = C
>>
>> C. Likelihood
>> The probability of category k is given by:
>>
>> P(Y = k) = logistic(θ_k - Xβ) - logistic(θ_{k-1} - Xβ)
>>
>> Without intercepts, this becomes more like:
>>
>> P(Y ≤ k) = 1 / (1 + exp(-(c_k - Xβ)))
>>
>> …where c_k are fixed thresholds.
>>
>> Implementation using nloptr
>> This example shows a rough sketch using nloptr, which allows both equality 
>> and bound constraints:
>>
>> >library(nloptr)
>> >
>> ># Custom negative log-likelihood function
>> >negLL <- function(beta, X, y, K, cutpoints) {
>> >  eta <- X %*% beta
>> >  loglik <- 0
>> >  for (k in 1:K) {
>> >    pk <- plogis(cutpoints[k] - eta) - plogis(cutpoints[k - 1] - eta)
>> >    loglik <- loglik + sum(log(pk[y == k]))
>> >  }
>> >  return(-loglik)
>> >}
>> >
>> ># Constraint: sum(beta) = C
>> >sum_constraint <- function(beta, C) {
>> >  return(sum(beta) - C)
>> >}
>> >
>> ># Define objective and constraint wrapper
>> >objective <- function(beta) negLL(beta, X, y, K, cutpoints)
>> >eq_constraint <- function(beta) sum_constraint(beta, C = 2)  # example >C
>> >
>> ># Run optimization
>> >res <- nloptr(
>> >  x0 = rep(0, ncol(X)),
>> >  eval_f = objective,
>> >  lb = lower_bounds,
>> >  ub = upper_bounds,
>> >  eval_g_eq = eq_constraint,
>> >  opts = list(algorithm = "NLOPT_LD_SLSQP", xtol_rel = 1e-8)
>> >)
>>
>>
>>
>> The next step would be writing the actual log-likelihood for your specific 
>> problem or verifying constraint implementation.
>>
>> Manually coding the log-likelihood for an ordinal model is nontrivial... so 
>> a bit of a challenge :)
>>
>>
>>
>> All the best,
>> gregg powell
>> Sierra Vista, AZ
>
>______________________________________________
>[email protected] mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Estimating regression with constraints in model coefficients

Reply via email to