There is a simple model which is both effective and assumes almost nothing about the knots. Since it is a convex problem, solving it is stable and robust.
It is based on a denoising model and addresses the main challenge: estimating the locations of the segment joins (the knots).
There are 2 assumptions to be made:
- The model is piecewise linear.
- The number of knots is small compared to the number of samples.
Combined, the assumptions mean that the 2nd derivative of the estimated signal is nonzero at only a few samples, namely at the knots.
By using the ${L}_{1}$ norm to promote sparsity the problem can be formulated as:
$$ \arg \min_{\boldsymbol{x}} \underbrace{\frac{1}{2} {\left\| \boldsymbol{x} - \boldsymbol{y} \right\|}_{2}^{2}}_{\text{Denoising}} + \lambda \underbrace{\sum_{i = 2}^{n - 1} \left| {x}_{i - 1} - 2 {x}_{i} + {x}_{i + 1} \right|}_{\text{Sparse 2nd derivative}} = \frac{1}{2} {\left\| \boldsymbol{x} - \boldsymbol{y} \right\|}_{2}^{2} + \lambda {\left\| \boldsymbol{D} \boldsymbol{x} \right\|}_{1} $$
This is very similar to the Total Variation (TV) Denoising model. The difference is in the $\boldsymbol{D}$ matrix: in the TV Denoising case it represents the 1st order forward finite differences operator, while in this case it represents the central 2nd order finite differences operator.
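To make the difference between the two operators concrete, here is a minimal NumPy sketch. The function names `first_diff_matrix` and `second_diff_matrix` are hypothetical helpers made up for this illustration; they are not taken from the repository code. It builds both matrices densely and shows that, on a noiseless piecewise linear signal, the 2nd order operator is nonzero only at the knot while the 1st order operator is nonzero everywhere.

```python
import numpy as np

def first_diff_matrix(n):
    """(n - 1) x n forward difference operator (the D of TV denoising)."""
    D = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

def second_diff_matrix(n):
    """(n - 2) x n central 2nd difference operator, each row is [1, -2, 1]."""
    D = np.zeros((n - 2, n))
    idx = np.arange(n - 2)
    D[idx, idx] = 1.0
    D[idx, idx + 1] = -2.0
    D[idx, idx + 2] = 1.0
    return D

# Noiseless piecewise linear signal with a single knot at sample index 5
t = np.arange(10, dtype=float)
x = np.where(t <= 5, t, 5.0 - 2.0 * (t - 5.0))

d1 = first_diff_matrix(10) @ x   # slopes: nonzero on every segment
d2 = second_diff_matrix(10) @ x  # slope changes: nonzero only at the knot

# Row i of the 2nd difference is centered on sample i + 1
knots = np.flatnonzero(np.abs(d2) > 1e-12) + 1
print("nonzeros in D1 x:", np.count_nonzero(d1))
print("knot indices from D2 x:", knots)
```

For long signals a sparse representation (for example `scipy.sparse.diags`) would be the more practical choice; the dense form is used here only to keep the sketch dependency free.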
The data:

With both the noise level and the knots unknown, this is the result of the model:

The problem is solved using ADMM, which gives the same results as a DCP (Disciplined Convex Programming) solver.
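For readers who want to experiment, below is a rough sketch of one standard way to apply ADMM to this objective, using the splitting $\boldsymbol{z} = \boldsymbol{D} \boldsymbol{x}$ and the scaled dual variable $\boldsymbol{u}$. It is an illustration under my own assumptions, not the repository code: the function names, the penalty parameter `rho`, the fixed iteration count, and the synthetic test signal are all made up for the example.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_trend_filter_admm(y, lam, rho=1.0, num_iter=500):
    """Approximately solve min_x 0.5 * ||x - y||_2^2 + lam * ||D x||_1,
    where D is the central 2nd order finite differences operator.
    rho and num_iter are illustrative defaults, not tuned values."""
    n = y.size
    # (n - 2) x n operator with rows [1, -2, 1]
    D = np.zeros((n - 2, n))
    idx = np.arange(n - 2)
    D[idx, idx] = 1.0
    D[idx, idx + 1] = -2.0
    D[idx, idx + 2] = 1.0

    # System matrix of the x update; it does not change between iterations,
    # so a real implementation would factor it once (it is banded).
    A = np.eye(n) + rho * (D.T @ D)

    x = y.copy()
    z = np.zeros(n - 2)  # auxiliary variable, z = D x at convergence
    u = np.zeros(n - 2)  # scaled dual variable
    for _ in range(num_iter):
        x = np.linalg.solve(A, y + rho * (D.T @ (z - u)))  # quadratic step
        Dx = D @ x
        z = soft_threshold(Dx + u, lam / rho)              # L1 prox step
        u = u + Dx - z                                     # dual update
    return x

# Synthetic example: noisy piecewise linear signal (knots chosen arbitrarily)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)
clean = np.interp(t, [0.0, 0.3, 0.7, 1.0], [0.0, 2.0, -1.0, 1.0])
y = clean + 0.1 * rng.standard_normal(t.size)
x_hat = l1_trend_filter_admm(y, lam=5.0)
```

The value of `lam` controls the number of knots: larger values give fewer slope changes. A fixed iteration count keeps the sketch short; monitoring the primal and dual residuals would be the usual stopping rule.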
The full code is available on my StackExchange Signal Processing GitHub Repository (see the `SignalProcessing\Q1227` folder).