1

I am working on a susceptible ($S = S_1$ or $S=S_2$), exposed ($E$), infected ($I$), hospitalised ($H$), deceased ($D$) and recovered ($R$) model. Below is my model,

\begin{equation} \begin{split} \frac{dS_H}{dt} &= \sigma\Pi - \left( \lambda_1 + \lambda_2 + m_1 + \mu \right) S_H \\ \frac{dS_L}{dt} &= \left( 1-\sigma \right) \Pi - \left( \lambda_1 \tau_1 + \tau_2 \lambda_2 + m_2 + \mu \right) S_L \\ \frac{dV_1}{dt} &= m_1 S_H - \left( \epsilon \lambda_1 + \mu \right) V_1 \\ \frac{dV_2}{dt} &= m_2 S_L - \left( \epsilon \lambda_1 \tau_1 + \mu \right) V_2 \\ \frac{dV_3}{dt} &= \lambda_2 S_H - \left( \epsilon \lambda_1 + \mu \right) V_3 \\ \frac{dV_4}{dt} &= \tau_2 \lambda_2 S_L - \left( \epsilon \lambda_1 \tau_1 + \mu \right) V_4 \\ \frac{dE}{dt} &= \lambda_1 \left( S_H + \tau_1 S_L + \epsilon V_1 + \epsilon \tau_1 V_2 + \epsilon V_3 + \epsilon \tau_1 V_4 \right) - \left( \alpha + \mu \right) E \\ \frac{dI}{dt} &= \alpha E - \left( h \rho + \left( 1-h \right) \gamma + \mu \right) I \\ \frac{dH}{dt} &= h \rho I - \left(\eta + \mu \right) H \\ \frac{dD}{dt} &= f_1 \left(1-h \right)\gamma I - b D \\ % \frac{dB}{dt} &= \eta f_2 H + b D \\ \frac{dR}{dt} &= \left( 1-h \right) \left( 1-f_1 \right) \gamma I + \eta\left( 1-f_2 \right) H - \mu R \end{split} \end{equation}

where

\begin{eqnarray*} \lambda_1 &=& \frac{1}{N} \left( \beta_1 + ( \beta_0 - \beta_1)e^{-q_1 \frac{V_1+V_2}{N} - q_2 \frac{V_3+V_4}{N} } \right) \left(I + \delta D \right),\\ \lambda_2 &=& \frac{1}{N} \left( r_{l} I + r_d D \right). \end{eqnarray*} Further $S_H(0) = {S_H}_{,0},$ $S_L(0) = S_{L,0},$ $V_1(0) = V_{1,0},$ $V_2(0) = V_{2,0},$ $V_3(0) = V_{3,0},$ $V_4(0) = V_{4,0},$ $E(0) = E_0,$ $I(0) = I_0,$ $H(0) = H_0,$ $D(0) = D_0$ and $R(0) = R_0$.

I fit the cumulative cases variable and cumulative vaccinations of my model. $$ \int_{0}^t E(s) ds $$ and $$ \int_0^t \left( m_1 S_H(s) + m_2 S_L(s) + \lambda_2 S_H(s) + \lambda_2 \tau_2 S_L(s) \right) ds .$$

to the cumulative cases data (denoted by CumCasesData on my code) and the cumulative vaccinations (denoted by CumVaccData on the code), respectively.

The code:

%reset -f

import numpy as np from scipy.integrate import odeint import matplotlib.pyplot as plt from scipy import integrate, optimize import pandas as pd from sklearn.metrics import r2_score

Total population, N, and birth-rate Pi.

Total_pop = 11701878; mu = (1/(60365)); Pi = Total_popmu;

p = 0.70

Calculating f1

Total_cases = 2976 Total_deaths = 1990 Total_hospitalised_persons = 681 Total_nonhospitalised_persons = Total_cases - Total_hospitalised_persons

CFR_hospitalised_persons = (0.351+0.497+0.335+0.513)/4 Total_deaths_among_hospitalised_persons = Total_hospitalised_persons*CFR_hospitalised_persons Total_deaths_among_nonhospitalised_persons = Total_deaths - Total_deaths_among_hospitalised_persons CFR_nonhospitalised_persons = Total_deaths_among_nonhospitalised_persons/Total_nonhospitalised_persons

Days = np.array([ 1, 8, 16, 22, 29, 36, 43, 50, 59, 64, 72, 78, 85, 92, 99, 107, 114, 121, 128, 134,143,150,155,163,171,177,183,190,197,204]);

CumCasesData = np.array([ 43, 57, 102, 111, 122, 132, 142, 150, 162, 181, 216, 238, 268, 300, 333, 373, 421, 453, 500, 539, 585, 608, 625, 658, 699, 734, 785, 816, 840, 872]); CumDeaths = np.array([34,41,59,75,82,91,97,100,106,115,139,155,174,186,209,217,241,268,289,315,356,368,377,402,433,461,484,513,537,548])

CumVaccData = [0,0,0,4130, 6134, 8229, 9572, 11417, 13550, 15285, 17976, 20789, 24142, 26687, 28727, 32625, 35958, 39845, 44447, 48119, 53610, 54153, 56509, 60460, 64403, 69231, 73309, 77680, 80989, 83755]

Data = np.concatenate([CumCasesData,CumVaccData])

alpha = 1/10; gamma = 1/5.6; f1 = CFR_nonhospitalised_persons; epsilon = 0.025; b = 0.58;

f2 = (0.351+0.497+0.335+0.513)/4; rho = 1/5.5 h = Total_hospitalised_persons/Total_cases sigma = 1797039/Total_pop # a = 36397/11701878 c = 1 - a g1 = 0 g2 = 0

def deriv(y,Days,r_l, r_d, beta0, beta1, eta, tau1, tau2, m1, m2, delta, q1, q2): S_H, S_L, V, dVhighGTdt, dVhighRingdt, dVhighdt, dVlowGTdt, dVlowRingdt, dVlowdt, V1, V2, V3, V4, V5, V6, E, CumCases, I, H, CumDeaths, D, B, R, N = y; lambda1 = (beta1+(beta0-beta1) * np.exp( -((q1(V1+V2))/N)-((q2(V3+V4))/N) )) * ((I + deltaD)/N) lambda2 = (r_lI + r_dD) (1/N) dS_Hdt = sigmaPi - ( lambda1S_H + lambda2S_H + m1 S_H + g1S_H) - muS_H; dS_Ldt = (1-sigma)Pi - ( lambda1 tau1 * S_L + lambda2 * tau2 * S_L + m2 * S_L + g2S_L ) - muS_L dVdt = m1S_H + m2S_L + lambda2S_H + lambda2tau2S_L + g1S_H + g2S_L; dVhighGTdt = g1S_H; dVhighRingdt = m1S_H + lambda2S_H ; dVhighdt = m1S_H + lambda2S_H + g1S_H; dVlowGTdt = g2 S_L; dVlowRingdt = m2S_L + lambda2 tau2 S_L + g2S_L; dVlowdt = m2S_L + lambda2 tau2 S_L + g2S_L; dV1dt = m1S_H - epsilon lambda1 * V1 - muV1; dV2dt = m2S_L - epsilon * lambda1 * tau1 * V2 - muV2; dV3dt = lambda2S_H - epsilon * lambda1 * V3 - muV3; dV4dt = tau2 lambda2 S_L - epsilon lambda1 * tau1 * V4 - muV4; dV5dt = g1S_H - epsilon * lambda1 * V5 - muV5; dV6dt = g2S_L - epsilon * lambda1 * tau1* V6 - muV6; dEdt = lambda1(S_H+ tau1* S_L + epsilonV1 + epsilontau1V2 + epsilonV3 + epsilontau1V4 + epsilonV5 + epsilon tau1* V6 ) - alphaE - muE; dCumCasesdt = alphaE;
dIdt = alpha
E - hrhoI - (1-h)gammaI - muI; dHdt = hrhoI - etaH - muH; dCumDeathsdt = f1(1-h)gammaI + etaf2H; dDdt = f1(1-h)gammaI - bD; dBdt = etaf2H + bD dRdt = (1-h)(1-f1)gammaI + eta (1-f2)H - muR; dNdt = Pi- muN + (mu-b)D - etaf2*H return dS_Hdt, dS_Ldt, dVdt, dVhighGTdt, dVhighRingdt, dVhighdt, dVlowGTdt, dVlowRingdt, dVlowdt, dV1dt, dV2dt, dV3dt, dV4dt, dV5dt, dV6dt, dEdt, dCumCasesdt, dIdt, dHdt, dCumDeathsdt, dDdt, dBdt, dRdt, dNdt

#Initial values SH0, SL0, CumVacc0, CumVacchighGT0, CumVacchighRing0, CumVacchigh0, CumVacclowGT0, CumVacclowRing0, CumVacclow0, V10, V20, V30, V40, V50, V60, E0, CumCases0, I0, H0, CumDeaths0, D0, B0, R0, N0 = 1797039, Total_pop-1797039, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 43, 9, 0, 34, 0, 34, 0, Total_pop

# Initial conditions vector

y0 = np.array([SH0, SL0, CumVacc0, CumVacchighGT0, CumVacchighRing0, CumVacchigh0, CumVacclowGT0, CumVacclowRing0, CumVacclow0, V10, V20, V30, V40, V50, V60, E0, CumCases0, I0, H0, CumDeaths0, D0, B0, R0, N0]);

def fit_odeint(Days, r_l, r_d, beta0, beta1, eta, tau1, tau2, m1, m2, delta, q1, q2): ret = odeint(deriv, y0, Days, args=(r_l, r_d, beta0, beta1, eta, tau1, tau2, m1, m2, delta, q1, q2)) SH, SL, CumVacc, CumVacchighGT, CumVacchighRing, CumVacchigh, CumVacclowGT, CumVacclowRing, CumVacclow, V1, V2, V3, V4, V5, V6, E, CumCases, I, H, CumDeaths, D, B, R, N = ret.T Fitted1 = CumVacc Output = np.concatenate([CumCases,CumVacc]) return Output

lower_bounds = [10, 26, 1.6, 0.9, 0.06, 0.02, 0.8, 0.7526008999628935e-07, 5.163295828430289e-06, 0.7, 1150, 500] upper_bounds = [13, 31, 1.8, 1, 0.069, 0.03, 0.9, 0.9026008999628935e-7,6.463295828430289e-06, 0.9, 1200, 2000]

my_bounds = [lower_bounds,upper_bounds] init_vals = np.array([11.92001541438225, 29.966685050861358, 1.70242164e+00, 9.26076365e-01, 6.81429695e-02, 2.44124561e-02, 8.80134762e-01, 9.02595576e-08, 5.86358993e-06, 8.01571823e-01, 1.19400783e+03, 1.57352147e+03])

popt, pcov = optimize.curve_fit(fit_odeint, Days,Data, init_vals, bounds = my_bounds)

r_l, r_d, beta0, beta1, eta, tau1, tau2, m1, m2, delta, q1, q2 = popt ret = odeint(deriv, y0, Days, args=(r_l, r_d, beta0, beta1, eta, tau1, tau2, m1, m2, delta, q1, q2)) SH, SL, CumVacc, CumVacchighGT, CumVacchighRing, CumVacchigh, CumVacclowGT, CumVacclowRing, CumVacclow, V1, V2, V3, V4, V5, V6, E, CumCases, I, H, CumDeaths, D, B, R, N = ret.T

fig, (ax1,ax2) = plt.subplots(1,2, sharex=True, figsize=(15,5))

ax1.scatter(Days,CumCasesData,s=30, facecolors='none', edgecolors='k', label='Data') ax1.plot(Days,CumCases, label='Model output') ax1.set_xlabel('Time (days)') ax1.set_ylabel('Cumulative EVD cases') ax1.grid(False)

ax1.legend()

ax2.scatter(Days,CumVaccData,s=30, facecolors='none', edgecolors='b', label='Data') ax2.plot(Days,CumVacc, label='Model output') ax2.set_xlabel('Time (days)') ax2.set_ylabel('Cumulative ring vaccinations') ax2.grid(False)

fig.savefig('Fig.jpg')

The fitting looks perfect, the image is attached.enter image description here

The problem is that the standard deviation extracted from the covariance matrix (pcov - on my code) is very large. Below are the quantities of the standard deviation of my estimates.

[296.1213779707799,1497.1467843074781,728.3991923017029,413.7925525621536,155533.19079498723, 84.0797571357458, 0.06605222504609057,
0.004324347985989968,  0.00072971899400099, 546.2799958303086
73085.07023277077, 19970.935418618858] 

Now, these large standard deviation results in a very large confidence interval for each of the estimated parameter which is not what I wished for especially with this good looking fitting.

How come that I have this perfect-looking fitting but very large confidence intervals?

Also how to get small or reasonable confidence intervals?

Zizo
  • 113

1 Answers1

3

Here is an early example of fitting an SIR model to a few data points. The variation in the parameters is large because the data caries little information about the parameters.

fitting with different R_0

In your case you have a bit more data points, but with the many parameters in your model there is a very large range of settings that work for your data. The range of the data is only very small.

See also this question (Coronavirus growth rate and its possibly spurious resemblance to vapor pressure model) where a lot of models, even nonsense models can fit the points. What you have is an approximately exponential growth and any function that has some way to model exponential growth with a slight change of the growth rate, is able to reasonably fit the data. In the image below you see two models fitting the data well (green and black curve) and several other work as well too.

growth as differential equation

Another example is from this question Why would you perform transformations over polynomial regression? The data is a little cloud of points that can fit many functions because the range is only small

different fits


Another aspect is that your parameters can be correlated. The standard deviations that you have are for the marginal distributions and it might be that with some reparameterisation, a combination of parameters, you get a small standard deviation.

In the first image (from this question: Fitting SIR model with 2019-nCoV data doesn't conververge), the parameters $\beta$ and $\gamma$ had large deviations, but the parameter $c = \beta-\gamma$ (the growth rate) did not have a large standard deviation.

Another example from this question Why and how does adding an interaction term affects the confidence interval of a main effect? shows the distribution of the slope and intercept parameters. The naive way would be to conclude that both parameters have large standard deviations and there is no significant effect. But, this ignores the correlation. With a reparameterisation the intercept term is clearly significant.

correlation and confidence regions