Section 6.6 of Therneau and Grambsch discusses causes of non-proportionality. A couple of issues there should be dealt with before you jump to assuming that there your predictor requires a time-varying coefficient.
First, your model allows for only a linear association between your continuous predictor AS_avg_total_p_mm and the log-hazard of an event. Strictly linear associations are seldom correct. Mis-specifying the functional form of a continuous predictor's association with outcome can lead to an apparent violation of proportional hazards (PH), while there might be no violation if that functional form were specified properly. See this page, for example. If you perform a flexible fit in the Cox model,* for example with a spline, your problem might go away.
Second, your model only includes that one predictor. Omitting any outcome-associated predictor from a Cox model can lead to bias in the coefficient estimates for included predictors and to apparent violations of PH, violations that would also disappear if you didn't omit that predictor. If you have information on other variables that might be associated with outcome, based your understanding of the subject matter, include them.
With 259 events you have a lot of potential flexibility in dealing with these issues. Frank Harrell recommends, as a general strategy in regression modeling, to decide how many degrees of freedom (df) you can afford to spend (coefficients to estimate, adjusted for penalization), to decide where to spend them, and then to spend them. With 259 events, the rule of thumb of about 15 events per df in survival models means that you could probably afford to spend 15 df. That allows you to include several additional predictors while including a very flexible fit of AS_avg_total_p_mm.
There are several different approaches to smooth fitting of continuous predictors, outlined on this page. The ps() function provided in the survival package is a particular type of penalized smoothing spline. The penalization means that the effective number of degrees of freedom used up is smaller than the number of coefficients. An example in Section 5.8.3 of Therneau and Grambsch uses up only 4 degrees of freedom for 13 coefficients. You also could consider an unpenalized regression spline; see Section 2.4 of Harrell's Regression Modeling Strategies.
Even if PH is violated, there might be no need to find the function of time that fits the predictor's coefficient. First, as AdamO frequently points out (for example here), the coefficient of a Cox model is then a type of time-averaged value. If you use robust standard errors (cluster() term), you can still do valid inference. Second, the plot of the smoothed Schoenfeld residuals over time is the estimated form of the coefficient's value over time. That is probably more useful to the reader than the results of a numerical fit. And, as you've learned, there's no good way to document that your numerical fit solved the PH problem except for showing that the shape of the numerical fit matches the
shape of the Schoenfeld residual plot.
*I mean the Cox model of survival here without any time-varying covariate, not the model that tries to accommodate a general form of a time-varying coefficient as in Section 4.2 of the time-dependence vignette. With a continuous predictor in a Cox model, it's critical to distinguish the functional form of its association with log-hazard from the functional form in time of its potentially time-varying coefficient. You appreciate that, as indicated in a comment, but this needs to be written for those that might later come upon this page.