This type of calibration requires making predictions from a Cox model, based on a set of time-varying covariates in the counting-process data format. Such predictions aren't even possible in several software packages. This page and its links go into more detail. There is a substantial question about when, if ever, such predictions make sense: if you have a set of covariate values for an individual at a specified time, you already know that the individual has survived that long. That risks circular reasoning and survivorship bias.
To summarize: the Python lifelines package won't allow such predictions at all. The calibrate() function in the rms package won't work with such data (although that package still provides some post-modeling functionality for such models). The predict.coxph() function in the standard survival package can produce a particular type of per-subject prediction, as outlined in that last link, but you have to pay very strict attention to the data format expected by that function, by the survfit.coxph() function, or by any related function in another package. Read the manuals carefully. (Coding questions per se are off-topic on this site.)
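To make the data-format concern concrete, here is a minimal Python sketch of some sanity checks on counting-process rows, one row per (start, stop] interval per subject. The function name `check_counting_process` and the dictionary keys are hypothetical, and the rule that an event may only appear in a subject's last interval is an assumption that fits single-event data only. It is not a substitute for the format rules in any package's manual.

```python
def check_counting_process(rows):
    """Return a list of problems found in counting-process rows.

    rows: list of dicts with keys "id", "start", "stop", "event" (hypothetical layout).
    Assumes at most one terminal event per subject.
    """
    problems = []
    by_id = {}
    for r in rows:
        by_id.setdefault(r["id"], []).append(r)
    for sid, subject_rows in by_id.items():
        subject_rows = sorted(subject_rows, key=lambda r: r["start"])
        for i, r in enumerate(subject_rows):
            # Each interval must have positive length.
            if not r["start"] < r["stop"]:
                problems.append(f"id {sid}: start >= stop at start={r['start']}")
            # Intervals for one subject must not overlap (gaps are tolerated here).
            if i > 0 and r["start"] < subject_rows[i - 1]["stop"]:
                problems.append(f"id {sid}: overlap at start={r['start']}")
            # With a single terminal event, only the last interval may carry it.
            if r["event"] and i < len(subject_rows) - 1:
                problems.append(f"id {sid}: event before last interval")
    return problems


good = [
    {"id": 1, "start": 0, "stop": 5, "event": 0},
    {"id": 1, "start": 5, "stop": 9, "event": 1},
]
bad = [
    {"id": 2, "start": 0, "stop": 5, "event": 1},
    {"id": 2, "start": 4, "stop": 9, "event": 0},
]
print(check_counting_process(good))
print(check_counting_process(bad))
```

Checks like these catch only structural errors; whether a particular prediction function interprets the rows the way you intend still has to be confirmed against its documentation.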
Even if you are willing to overlook those issues, there still is a problem in how you will define the predicted survival probabilities, at some particular point in time, that would be used to construct a survival model's calibration curve. That's central to the calibration curve, for both "observed" and "predicted" values. For example, the manual page for the calibrate() function in the rms package says:
For survival models, "predicted" means predicted survival probability at a single time point, and "observed" refers to the corresponding Kaplan-Meier survival estimate, stratifying on intervals of predicted survival, or, if the polspline package is installed, the predicted survival probability as a function of transformed predicted survival probability using the flexible hazard regression approach...
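The binning variant of that definition can be sketched in plain Python: group subjects by their predicted survival at the chosen time, then compare each group's mean prediction with its Kaplan-Meier estimate. This is only an illustration of the idea, not the rms implementation; the function names, the equal-size binning rule, and the toy numbers are assumptions.

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t) from right-censored data."""
    surv = 1.0
    event_times = sorted({ti for ti, e in zip(times, events) if e and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, e in zip(times, events) if e and ti == u)
        surv *= 1.0 - deaths / at_risk
    return surv


def binned_calibration(pred, times, events, t, n_bins=2):
    """Return one (mean predicted, KM observed) pair per bin of predicted survival at t.

    Assumes len(pred) >= n_bins; the last bin absorbs any remainder.
    """
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    size = len(order) // n_bins
    pairs = []
    for b in range(n_bins):
        idx = order[b * size:(b + 1) * size] if b < n_bins - 1 else order[b * size:]
        mean_pred = sum(pred[i] for i in idx) / len(idx)
        observed = kaplan_meier([times[i] for i in idx], [events[i] for i in idx], t)
        pairs.append((mean_pred, observed))
    return pairs


# Toy data: predicted survival at t = 3, follow-up times, event indicators.
pred = [0.2, 0.3, 0.8, 0.9]
times = [1, 2, 5, 6]
events = [1, 1, 0, 1]
print(binned_calibration(pred, times, events, t=3))
```

Note that every subject needs a single predicted survival probability at the same time point `t` before any binning can happen, which is the step that becomes problematic below.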
For covariates that are fixed in time, that definition is straightforward. The baseline covariates hold over the entire time course covered by the events in the original data, so the predicted survival for an individual is well defined, based on the baseline hazard and those fixed covariate values, even beyond that individual's last follow-up or event time.
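For the fixed-covariate case, the usual formula is S(t | x) = exp(-H0(t) exp(x'beta)), with H0 the baseline cumulative hazard. A minimal Python sketch follows, assuming the baseline cumulative hazard is available as a step function of (time, value) pairs; the function name `predicted_survival`, the coefficient, and the baseline numbers are all made up for illustration.

```python
import math


def predicted_survival(baseline_cumhaz, beta, x, t):
    """S(t | x) = exp(-H0(t) * exp(x . beta)) for time-fixed covariates.

    baseline_cumhaz: list of (time, H0) pairs sorted by time, read as a step function.
    """
    h0 = 0.0
    for time, value in baseline_cumhaz:
        if time <= t:
            h0 = value
        else:
            break
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return math.exp(-h0 * math.exp(linear_predictor))


# Hypothetical baseline cumulative hazard and coefficient.
baseline = [(1.0, 0.05), (2.0, 0.12), (5.0, 0.30)]
beta = [0.7]
print(predicted_survival(baseline, beta, [1.0], 5.0))
```

Because `x` never changes, the same call works for any `t` covered by the baseline hazard, regardless of how long that particular individual was followed.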
But what do you use for an individual's corresponding "predicted" survival with time-varying covariates? You can get survival-curve estimates for new individuals having time-varying covariate values via the id argument to the survfit() function. But what covariate values do you use after the last observation time for an individual, if for calibration you need a "predicted" survival at a time later than that individual's last observation? Either you make some arbitrary choice, or you are stuck with predictions (and calibration) only at the shortest follow-up time within your test cohort. That's pretty limiting.
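The difficulty shows up as soon as you try to write the calculation down. The sketch below accumulates baseline-hazard increments, each multiplied by exp(beta * x(u)) for the covariate value in effect at that time. It is a simplified single-covariate illustration, not the survfit() algorithm; the function name `survival_tv`, the `carry_forward` switch, and all numbers are hypothetical. Carrying the last value forward is exactly the kind of arbitrary choice described above.

```python
import math


def survival_tv(baseline_cumhaz, beta, rows, t, carry_forward=False):
    """Survival at t along one subject's covariate path.

    baseline_cumhaz: list of (time, H0) pairs sorted by time (step function).
    rows: list of (start, stop, x) intervals for one subject, sorted by start.
    Raises ValueError when a covariate value is needed but unknown.
    """
    cumhaz = 0.0
    prev_h0 = 0.0
    last_stop = rows[-1][1]
    for u, h0 in baseline_cumhaz:
        if u > t:
            break
        jump = h0 - prev_h0
        prev_h0 = h0
        x = None
        for start, stop, value in rows:
            if start < u <= stop:
                x = value
                break
        if x is None:
            if u > last_stop and carry_forward:
                x = rows[-1][2]  # arbitrary choice: last observation carried forward
            else:
                raise ValueError(f"no covariate value known at time {u}")
        cumhaz += jump * math.exp(beta * x)
    return math.exp(-cumhaz)


baseline = [(1.0, 0.05), (2.0, 0.12), (5.0, 0.30)]
rows = [(0.0, 1.5, 0.0), (1.5, 3.0, 1.0)]  # subject last observed at time 3
print(survival_tv(baseline, 0.7, rows, t=2.0))
print(survival_tv(baseline, 0.7, rows, t=5.0, carry_forward=True))
```

Without `carry_forward=True`, the call at `t=5.0` raises an error, because nothing in the data says what the covariate was after time 3.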
So I'm not sure that there is a reliable way to do calibration curves for a Cox model with time-varying covariates. You can do some model validation via cross-validation or bootstrapping, however. For example, the validate() function in the rms package can work on such Cox models because it doesn't require any survival predictions.
[...] Surv(start, stop, event) data format of time-varying covariates. You can do some types of model validation as noted in the answer, but so far as I know there is no reliable way to construct a true calibration curve with such a model and data. – EdM Dec 12 '22 at 14:25
[...] hare approach as you suggested here: https://stats.stackexchange.com/questions/206291/how-to-make-calibration-plot-for-survival-data-without-binning-data? – Flora Grappelli Dec 12 '22 at 15:01
[...] survfit() with the id argument to get survival predictions for all cases and times. You would probably have to code that yourself. – EdM Dec 12 '22 at 15:15
[...] hare method for getting estimates of "observed" probabilities, but I don't have any experience using it directly. I've only used it implicitly in calls to the rms::calibrate() function. Binning by predicted probability of survival is an alternate option, which might work OK if you have a lot of data. – EdM Dec 12 '22 at 15:20
[...] calibrate function does not handle external validation. I use val.surv from rms – Flora Grappelli Dec 12 '22 at 23:23