This is an intrinsically "practical" question, but it leads to a well-defined mathematical problem. Let me start with the practical part:
I regularly back up my data. My backup strategy is based on differential backups: the first backup is a full backup containing all files, and each subsequent backup is either another full backup or a differential backup containing only the data that has been added or changed since the last full backup.<sup>1</sup>
My backup storage is limited to $C$. I use the term "snapshot" to refer to the data from either a full backup or a differential backup.
Question: After how many differential backups should I make a new full backup to maximize the number of snapshots I can squeeze into the storage capacity $C$?
Intuition: A full backup takes up more space than a differential backup, but the more data has been added since the last full backup, the larger each additional differential backup becomes. Therefore, there is a trade-off between one additional large full backup and many ever-growing differential backups.
I tried to formalize my problem as follows:
Denote data that is newly added between $t-1$ and $t$ as $s(t)$ and let $s(0) = d_0$ (some constant, size of initial data). Ignore deletes, i.e. total data at time $\tilde{t}$ is $f(\tilde{t})=\sum_{t=0}^{\tilde{t}} s(t)$, which equals the size of a full backup at this time. The size of a differential backup ($d$) at time $\tilde{t}$ depends on the time of the last full backup $b$: $d(\tilde{t},b) = \sum_{t=b+1}^{\tilde{t}} s(t)$.
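To make these definitions concrete, here is a minimal Python sketch. The list `s` and the function names `full_size` and `diff_size` are illustrative choices, and the numbers are made up for the example.

```python
from typing import Sequence


def full_size(s: Sequence[float], t: int) -> float:
    """f(t): total data at time t, i.e. the size of a full backup taken at t."""
    return sum(s[: t + 1])


def diff_size(s: Sequence[float], t: int, b: int) -> float:
    """d(t, b): data added after the last full backup at time b, up to time t."""
    return sum(s[b + 1 : t + 1])


# Hypothetical data: d_0 = 100 units initially, then 5 new units per period.
s = [100, 5, 5, 5, 5]
print(full_size(s, 0))     # 100
print(full_size(s, 3))     # 115
print(diff_size(s, 3, 0))  # 15
print(diff_size(s, 3, 2))  # 5
```

Since deletes are ignored, the two quantities satisfy $f(\tilde{t}) = f(b) + d(\tilde{t}, b)$ for any $b \leq \tilde{t}$.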
With 1 full backup and only differential backups afterwards, I do not run out of storage capacity $C$ as long as $T$ is small enough<sup>2</sup> that $$f(0) + \sum_{\tilde{t}=1}^T d(\tilde{t}, 0) \leq C.$$
With 2 full backups, each followed by differential backups, let $T_0$ be the time of the last differential backup before the second full backup (which is taken at $T_0+1$). Then $T_1$ must be small enough that $$f(0) + \sum_{\tilde{t}=1}^{T_0} \Big[ d(\tilde{t}, 0) \Big] + f(T_0+1) + \sum_{\tilde{t}=T_0+2}^{T_1} d(\tilde{t}, T_0+1) \leq C.$$
I think the pattern becomes clear now:<sup>3</sup> with $N$ full backups, $T_{N-1}$ must be small enough such that $$f(0) + \sum_{\tilde{t}=1}^{T_0} \Big[ d(\tilde{t}, 0) \Big] + \sum_{n=1}^{N-1} \Big[ f(T_{n-1}+1) + \sum_{\tilde{t}=T_{n-1}+2}^{T_n} d(\tilde{t}, T_{n-1}+1) \Big] \leq C. \tag{1} \label{eq:cond}$$
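The left-hand side of condition (1) can be evaluated directly for a given schedule. The following is a sketch only: `storage_used` is a hypothetical helper name, `T` holds the breakpoints $T_0, \dots, T_{N-1}$, and the data sequence is invented.

```python
from typing import Sequence


def storage_used(s: Sequence[float], T: Sequence[int]) -> float:
    """Left-hand side of condition (1) for breakpoints T = [T_0, ..., T_{N-1}].

    Full backups are taken at times 0, T_0 + 1, ..., T_{N-2} + 1; every other
    time step up to T_{N-1} holds a differential backup.
    """

    def f(t: int) -> float:
        # Size of a full backup at time t.
        return sum(s[: t + 1])

    def d(t: int, b: int) -> float:
        # Size of a differential backup at time t, last full backup at b.
        return sum(s[b + 1 : t + 1])

    total = f(0) + sum(d(t, 0) for t in range(1, T[0] + 1))
    for n in range(1, len(T)):
        b = T[n - 1] + 1  # time of the next full backup
        total += f(b) + sum(d(t, b) for t in range(b + 1, T[n] + 1))
    return total


# Hypothetical data: d_0 = 100, then 5 new units per period.
s = [100] + [5] * 10
print(storage_used(s, [6]))     # one full backup, six differentials: 205
print(storage_used(s, [3, 6]))  # second full backup at time 4: 265
```

A schedule given by `T` is feasible exactly when `storage_used(s, T) <= C`.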
This leads to the ultimate question:
Which $N$ maximizes the largest $T_{N-1}$ for which equation $(\ref{eq:cond})$ holds?
I am not sure how to approach a solution to this problem, nor am I sure if further assumptions are needed to make progress. Maybe it is helpful to assume that the amount of new data in each period $s(t)$ is constant ($s(t) = \bar{s} ~ \forall ~ t$)?
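As a sketch of what that assumption would buy, suppose $s(t) = \bar{s}$ for all $t \geq 1$ while keeping $s(0) = d_0$ (this split between $d_0$ and $\bar{s}$ is my reading of the assumption). Then $f(\tilde{t}) = d_0 + \bar{s}\,\tilde{t}$ and $d(\tilde{t}, b) = \bar{s}\,(\tilde{t} - b)$, so each run of differential backups sums to a triangular number. Writing $m_n = T_n - T_{n-1} - 1$ for the number of differential backups after the $n$-th additional full backup, condition (1) would read $$d_0 + \bar{s}\,\frac{T_0 (T_0+1)}{2} + \sum_{n=1}^{N-1} \Big[ d_0 + \bar{s}\,(T_{n-1}+1) + \bar{s}\,\frac{m_n (m_n+1)}{2} \Big] \leq C.$$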
Plugging my definitions of $f(\tilde{t})$ and $d(\tilde{t}, b)$ into equation $(\ref{eq:cond})$ yields: $$d_0 + \Big[ \sum_{\tilde{t}=1}^{T_0} \sum_{t=1}^{\tilde{t}} s(t) \Big] + \sum_{n=1}^{N-1} \Bigg[ \Big[ \sum_{t=0}^{T_{n-1}+1} s(t) \Big] + \Big[ \sum_{\tilde{t}=T_{n-1}+2}^{T_n} \sum_{t=T_{n-1}+2}^{\tilde{t}} s(t) \Big] \Bigg] \leq C, \tag{1'} \label{eq:cond-plugged-in}$$ but I do not know how to proceed from here.
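Short of a closed-form answer, the trade-off can at least be explored numerically. The sketch below rests on two assumptions that are not part of the problem statement: growth is constant ($s(t) = \bar{s}$ for $t \geq 1$, $s(0) = d_0$), and the schedule is periodic, with every full backup followed by at most `k` differential backups. The function name `snapshots_that_fit` and all numbers are hypothetical, and a non-periodic schedule may well do better.

```python
def snapshots_that_fit(C: float, d0: float, s_bar: float, k: int) -> int:
    """Count snapshots that fit into capacity C when each full backup is
    followed by at most k differential backups (constant growth s_bar)."""
    if s_bar <= 0 or d0 < 0 or k < 0:
        raise ValueError("need s_bar > 0, d0 >= 0 and k >= 0")
    used = 0.0
    count = 0
    last_full = 0
    t = 0
    while True:
        # Take a full backup at t = 0 and whenever k differentials have passed.
        is_full = (t == 0) or (t - last_full > k)
        size = d0 + s_bar * t if is_full else s_bar * (t - last_full)
        if used + size > C:
            return count  # the next snapshot no longer fits
        used += size
        count += 1
        if is_full:
            last_full = t
        t += 1


C, d0, s_bar = 1000, 100, 5
for k in (0, 5, 10**9):
    print(k, snapshots_that_fit(C, d0, s_bar, k))

best_k = max(range(50), key=lambda k: snapshots_that_fit(C, d0, s_bar, k))
print("best k among 0..49:", best_k)
```

For these made-up numbers, taking only full backups (`k = 0`) fits 8 snapshots, never taking a second full backup fits 19, and `k = 5` fits 24, which illustrates that an interior optimum can exist.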
Any help, either on tackling equation $(\ref{eq:cond-plugged-in})$ or on completely different strategies for solving the problem, is highly appreciated!
<sup>1</sup> Note that I am not referring to incremental backups, where each backup contains only the data since the previous incremental backup. Differential backups always contain all data since the last full backup.
<sup>2</sup> I assume that $C$ is large enough for at least 1 full and 1 differential backup.
<sup>3</sup> To be more explicit, the conditions for 3 and 4 full backups are $$f(0) + \sum_{\tilde{t}=1}^{T_0} d(\tilde{t}, 0) + f(T_0+1) + \sum_{\tilde{t}=T_0+2}^{T_1} d(\tilde{t}, T_0+1) + f(T_1+1) + \sum_{\tilde{t}=T_1+2}^{T_2} d(\tilde{t}, T_1+1) \leq C,$$
$$f(0) + \sum_{\tilde{t}=1}^{T_0} d(\tilde{t}, 0) + f(T_0+1) + \sum_{\tilde{t}=T_0+2}^{T_1} d(\tilde{t}, T_0+1) + f(T_1+1) + \sum_{\tilde{t}=T_1+2}^{T_2} d(\tilde{t}, T_1+1) + f(T_2+1) + \sum_{\tilde{t}=T_2+2}^{T_3} d(\tilde{t}, T_2+1) \leq C.$$