If Vax represents a fraction of individuals vaccinated in a facility
The time since vaccination might be considered a moderator of the effect of vaccination, implicit in your including it in an interaction term with the vaccination prevalence. There's some difficulty forcing this scenario into DAGs; see for example Weinberg, Can DAGs Clarify Effect Modification?, Epidemiology 18: 569–572 (2007).
With respect to including time since vaccination as a predictor outside of its interaction term, that's typically the best practice. It seems particularly important here: it's quite possible that there is no substantial interaction between time and prevalence of vaccinations, yet time on its own matters (on the log-odds scale; presumably you're doing logistic regression for these binomial outcomes), given that there have been vaccinations. You don't want to miss that possibility.
There's a good chance that there won't be a monotonic association between that time and outcome. Vaccinations very close to your evaluation times at the end of outbreaks won't have had enough opportunity to provide immunity; immunity from vaccinations long before the evaluation times might well have waned in the interim. There will need to be some flexible modeling of time.*
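As a rough illustration of what "flexible" could mean, here is a minimal sketch of a piecewise-linear (linear spline) basis for time since vaccination. The knot locations, the coefficients, and the names `hinge_basis` and `g` are all hypothetical, chosen only to show a non-monotonic shape on the log-odds scale. In practice a smoother basis such as a restricted cubic spline would usually be preferred, with coefficients estimated from the data.

```python
def hinge_basis(t, knots):
    """Piecewise-linear basis: t itself plus max(0, t - k) for each knot k."""
    return [t] + [max(0.0, t - k) for k in knots]


def g(t, coefs, knots):
    """Contribution of time since vaccination to the log-odds of the outcome."""
    return sum(c * b for c, b in zip(coefs, hinge_basis(t, knots)))


# Hypothetical knots (in days) and coefficients, for illustration only.
KNOTS = [14.0, 180.0]
COEFS = [-0.10, 0.10, 0.005]

# Log-odds fall while immunity builds, level off, then rise again as it wanes.
for days in (0, 7, 14, 90, 180, 365):
    print(days, round(g(days, COEFS, KNOTS), 3))
```

With these made-up numbers the time contribution decreases over the first two weeks, stays flat until the second knot, and then increases, which a single linear term in time could not represent.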
A potential difficulty is that the coefficient(s) reported for Time from Vax in the interaction model of equation 1 apply at a value of Vax = 0, which might seem to make no sense if Vax is a continuous measure. For interpretability of reported coefficients it might help to re-center the Vax values around some typical value, even though predictions from the model should be the same in any case.
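The effect of re-centering can be checked with a little arithmetic. The sketch below assumes that equation 1 has the usual form $\beta_0 + \beta_1 \text{Vax} + \beta_2 \text{Time} + \beta_3 \text{Vax} \times \text{Time}$ with time entered linearly; that form, the coefficient values, and the centering value `C` are assumptions for illustration, not taken from your model.

```python
def eta(vax, time, b0, b1, b2, b3):
    """Linear predictor with a Vax-by-time interaction (assumed form of equation 1)."""
    return b0 + b1 * vax + b2 * time + b3 * vax * time


def recenter(b0, b1, b2, b3, c):
    """Coefficients of the same model rewritten in terms of (Vax - c)."""
    return (b0 + b1 * c, b1, b2 + b3 * c, b3)


# Hypothetical coefficients and centering value, for illustration only.
B = (-1.0, -2.0, 0.01, -0.03)
C = 0.6  # e.g. a typical vaccinated fraction

BC = recenter(*B, C)
print("Time coefficient, original:", B[2])   # slope in time at Vax = 0
print("Time coefficient, centered:", BC[2])  # slope in time at Vax = C

# Predictions agree whichever parameterization is used.
for vax, time in [(0.2, 10.0), (0.6, 30.0), (0.9, 100.0)]:
    assert abs(eta(vax, time, *B) - eta(vax - C, time, *BC)) < 1e-12
```

Only the intercept and the reported time coefficient change; the latter becomes the time slope at Vax = `C` rather than at Vax = 0.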
If Vax is a 0/1 indicator of whether a facility has had vaccinations
This is a much simpler scenario. With Vax = 0 being no vaccinations, recognize that the interaction term is just a product, and specify VaxTime = 0 when Vax = 0. Under your equation 2, for cases with Vax = 0 you have:
$$ f(y) = \beta_0$$
For cases with Vax = 1 you have:
$$ f(y) = \beta_0 + \beta_1 + \beta_3 \text{VaxTime}$$
That is, the interaction term VaxTime*Vax is non-zero only when Vax = 1; it's thus identical to VaxTime. That interaction term can be represented just as VaxTime, covering both Vax = 0 and Vax = 1 situations if you code VaxTime as 0 when Vax = 0. Your two equations then are equivalent, except that $\beta_3$ in your second equation would be numerically equivalent to $\beta_2+\beta_3$ in the first.
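A quick numerical check of that equivalence, as a sketch: `eq1` is my assumed form of your first equation (main effects for both predictors plus the interaction), `eq2` follows the two cases written out above, and the facilities and coefficients are made up.

```python
def eq1(vax, vax_time, b0, b1, b2, b3):
    """Assumed form of equation 1: both main effects plus the interaction."""
    return b0 + b1 * vax + b2 * vax_time + b3 * vax * vax_time


def eq2(vax, vax_time, b0, b1, b3):
    """Equation 2: no separate main effect for VaxTime."""
    return b0 + b1 * vax + b3 * vax * vax_time


# Hypothetical facilities as (Vax, VaxTime), with VaxTime coded 0 when Vax = 0.
facilities = [(0, 0.0), (0, 0.0), (1, 12.0), (1, 45.0), (1, 200.0)]

# Under this coding the product term is identical to VaxTime itself.
assert all(vax * t == t for vax, t in facilities)

# Hypothetical coefficients for equation 1.
b0, b1, b2, b3 = -1.5, -0.8, 0.004, 0.002
for vax, t in facilities:
    # Equation 2 with slope b2 + b3 reproduces equation 1 exactly.
    assert abs(eq1(vax, t, b0, b1, b2, b3) - eq2(vax, t, b0, b1, b2 + b3)) < 1e-12
```

Because the VaxTime and VaxTime*Vax columns are identical under this coding, a fit of the first equation could only identify the sum $\beta_2+\beta_3$, not the two coefficients separately.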
As noted above, you should model VaxTime as some flexible function g(VaxTime); the above simplification of the interaction term to a term g(VaxTime) holds.
I'd worry about causal inference if this weren't a randomized trial, as the characteristics of an institution making a choice not to vaccinate might also carry over to other policies that could affect disease outbreaks.
*A similar argument might be made for your Vax predictor if it's continuous, as things like herd immunity can lead to its having non-linear associations with outcome.