
I want to fit Y using M alone, and I also want to fit Y using M+P.

M<-c(12.7,13.7,15.6,14.5,17.4)
N<-c(0.27,0.29,0.27,0.28,0.25)
P<-c(-0.55,-0.47,-0.49,-0.48,-0.44)

I fitted the following

df1<- data.frame(X1=M, Y=N, X2=P)
Fit<-nls(Y~pweibull(M, shape= shape, scale= scale), data= df1, start=list(shape=0.1, scale=1))
Fit<-nls(Y~pweibull(M+P, shape= shape, scale= scale), data= df1, start=list(shape=0.1, scale=1))

This is what showed

Error in numericDeriv
Missing value or an infinity produced when evaluating the model

In addition: warning message

In pweibull(M, shape= shape, scale= scale): NaNs produced
Nick Cox
  • A problem here might be that the points do not fit the Weibull distribution very well, so that the gradient goes outside the computational range (leading to NA or Inf values). – Sextus Empiricus Oct 05 '23 at 08:03
  • Also, fitting a distribution with nls (least squares) is a quick solution, but you might want to use a chi-squared cost function or likelihood ratio. – Sextus Empiricus Oct 05 '23 at 08:05

1 Answer


You are fitting an increasing function, pweibull, to a set of decreasing points. That is asking for problems.
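The mismatch can be checked directly. The following is a minimal sketch using the data from the question; the shape and scale values are arbitrary illustrative choices, not fitted values.

```r
M <- c(12.7, 13.7, 15.6, 14.5, 17.4)
N <- c(0.27, 0.29, 0.27, 0.28, 0.25)

# Negative correlation: N tends to fall as M grows
trend <- cor(M, N)

# pweibull is a cumulative distribution function, so along increasing x
# it can only rise or stay flat (shape = 1, scale = 40 chosen for illustration)
xs <- sort(M)
cdf_vals <- pweibull(xs, shape = 1, scale = 40)
is_nondecreasing <- all(diff(cdf_vals) >= 0)

print(trend)
print(is_nondecreasing)
```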

Below is an example using the optim function, which is easier to modify and shows that some form of fitting can be made to work.

# example
M = c(12.7, 13.7, 15.6, 14.5, 17.4)
N = c(0.27, 0.29, 0.27, 0.28, 0.25)

plot(M, N)

# least squares cost function
cost = function(par) {
  return(sum((N - pweibull(M, par[1], par[2]))^2))
}

xpar = c(1, 40)            # start parameters
xs = seq(12, 18, 0.01)     # for plotting curves

# make ten gradient descent steps and plot the result each time
for (i in 1:10) {
  xpar = optim(xpar, cost, control = list(maxit = 1),
               method = 'L-BFGS-B', lower = c(0, 0))$par
  lines(xs, pweibull(xs, xpar[1], xpar[2]))
}


What made the fitting work was introducing lower bounds for the parameters:

lower = c(0,0)

These fitting methods compute a gradient, and while doing so they may search in a region with negative parameter values, for which the pweibull function returns NaN.
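A small check of that behaviour, as a sketch: the negative shape value below is an arbitrary illustration of what an unconstrained search step might try.

```r
x <- 12.7

# A negative shape is outside the Weibull parameter space: the result is NaN
# (R also emits a "NaNs produced" warning, suppressed here)
bad <- suppressWarnings(pweibull(x, shape = -1, scale = 1))

# Positive parameters give an ordinary probability
ok <- pweibull(x, shape = 1, scale = 40)

print(bad)
print(ok)
```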

That problem is solved by restricting the values of the parameters.
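Box constraints are one way to restrict the parameters. Another common option, shown here as an alternative sketch rather than the method used above, is to optimise over the logarithms of the parameters, so that shape and scale stay positive without any bounds. The names cost_log, start, fit and est are introduced only for this illustration.

```r
M <- c(12.7, 13.7, 15.6, 14.5, 17.4)
N <- c(0.27, 0.29, 0.27, 0.28, 0.25)

# Least squares cost on the log scale: exp() keeps shape and scale positive
cost_log <- function(logpar) {
  sum((N - pweibull(M, shape = exp(logpar[1]), scale = exp(logpar[2])))^2)
}

start <- log(c(1, 40))          # start parameters, on the log scale
fit <- optim(start, cost_log)   # default Nelder-Mead, no bounds needed
est <- exp(fit$par)             # back-transform to shape and scale

print(est)
print(fit$value)
```

This only removes the NaN problem. It does not make the data any easier to fit, so the estimates should not be read as meaningful.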

This does not solve the main cause of the problem. You are trying to fit data points that have no good solution, and the algorithm will not work easily. A comparable case is the question "Why is logistic regression particularly prone to overfitting in high dimensions?", where the parameters of the logistic function fit best when they are at infinity.
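To illustrate what "no good solution" can look like here, the sketch below evaluates the least squares cost along one hand-picked path of parameters: the shape shrinks toward zero while the scale grows so that the curve flattens toward mean(N). The path, and the names flat_cost, c0, shapes and scales, are assumptions made for this illustration only.

```r
M <- c(12.7, 13.7, 15.6, 14.5, 17.4)
N <- c(0.27, 0.29, 0.27, 0.28, 0.25)

cost <- function(par) sum((N - pweibull(M, par[1], par[2]))^2)

# Cost of a flat line at mean(N), used here as a reference value
flat_cost <- sum((N - mean(N))^2)

# Constant chosen so that 1 - exp(-c0) equals mean(N)
c0 <- -log(1 - mean(N))

# Shrink the shape and let the scale grow as c0^(-1/shape)
shapes <- c(1, 0.1, 0.01, 0.002)
scales <- c0^(-1 / shapes)
costs <- mapply(function(k, s) cost(c(k, s)), shapes, scales)

print(data.frame(shape = shapes, scale = scales, cost = costs))
print(flat_cost)
```

Along this path the cost keeps falling toward the flat-line value while the scale grows without bound, so an optimizer following it never arrives at a finite best pair of parameters.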

Nick Cox