How to efficiently fit a spline with a linear piece between two constant pieces

Question

I have two dimensional data, to which I'd like to fit a spline function. A spline would be defined as this:

So, constant line -> linear function -> constant line.

I want to find out values of x_0, x_1, y_0 and a minimising RMSE (or otherwise fitting well). How can I do this efficiently in Python?

Because this is a variation of the question at https://stats.stackexchange.com/questions/291436, you will likely find at least one of the many answers there to be helpful. The principal complication concerns estimating $x_0,$ which does not appear within the model in a linear fashion: this requires a nonlinear fitting method, such as profile likelihood (or its least squares equivalent). As you might surmise from the referenced thread, creating a fully general and correct program for this changepoint problem is not a small matter: it would be best to find existing Python code. — whuber, Dec 30 '22 at 14:32
@whuber that's true in general, but in this case where there are only two changepoints to estimate we may be in luck if the OP's data is not overwhelmingly large: we can simply exhaustively try a fine grid of values $\underset{j}{\min}(\mathbf{x}_j)\leq x_0 \leq x_1 \leq \underset{j}{\max}(\mathbf{x}_j)$, implemented via a nested for loop, taking advantage of profile least squares as you advocate. Whether or not this qualifies as "efficient" as the OP requests depends; I simply note that this approach is embarrassingly parallel. — John Madden, Dec 30 '22 at 16:49
@John That would be easy to implement and parallel, as you say. In terms of total computation, deploying a nonlinear optimizer would likely be the most efficient. — whuber, Dec 30 '22 at 18:41

0 Answers0