It sounds like you are doing some subset analyses and single-predictor evaluations. Cox survival analysis, however, can suffer from omitted-variable bias if any outcome-associated predictor is omitted from the model. You are thus much better off fitting a single large model, using as many predictors and interactions as is reasonable without overfitting (typically about 15 events per predictor in unpenalized models), then interrogating that model with respect to specific hypotheses.
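As a rough illustration of that events-per-predictor arithmetic, here is a minimal Python sketch. The function names, the candidate model, and the event count are all hypothetical, and the 15-events figure is only the rule of thumb mentioned above, not a hard limit. The point is that each spline or interaction uses several coefficients, so the budget is spent faster than a count of variable names suggests.

```python
def parameter_budget(n_events, events_per_parameter=15):
    """Rough upper limit on the number of coefficients an unpenalized model can support."""
    if n_events < 0 or events_per_parameter <= 0:
        raise ValueError("n_events must be >= 0 and events_per_parameter > 0")
    return n_events // events_per_parameter


def spline_df(n_knots):
    """Number of coefficients used by a restricted cubic spline with n_knots knots."""
    return n_knots - 1


# Hypothetical candidate model: coefficients used by each term.
candidate = {
    "sex": 1,
    "SBP (4-knot spline)": spline_df(4),
    "BMI (4-knot spline)": spline_df(4),
    "sex x SBP": 1 * spline_df(4),
}

needed = sum(candidate.values())
n_events = 180  # hypothetical number of observed events (deaths), not of participants

print("coefficients needed:", needed)
print("coefficients supported:", parameter_budget(n_events))
```

If the number needed exceeds the budget, that is the signal to reduce the predictor set (blinded to outcome) or to use a penalized model, rather than to fall back on separate single-predictor fits.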
If you "want to know if there is a higher mortality rate from stroke in hypertensive males than in other groups," then you need to include sex, a measure of hypertension, and an interaction between them in the model. If you "want to know whether there is an interaction between hypertension and BMI and its effect on stroke mortality," then you need to include BMI itself and its interaction with your hypertension measure. If you want to distinguish deaths from stroke from other deaths, then you need to use a competing-risks model.
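To make the role of the interaction term concrete, here is a sketch of the first of those models. It assumes, purely for illustration, that sex and hypertension are each coded as 0/1 indicators (in practice blood pressure is better modeled continuously); the symbols $\beta_1, \beta_2, \beta_3$ and the indicator names are mine, not from your analysis.

$$h(t \mid \text{male}, \text{htn}) = h_0(t)\,\exp\left(\beta_1\,\text{male} + \beta_2\,\text{htn} + \beta_3\,\text{male}\times\text{htn}\right)$$

Under that coding, the hazard ratio for hypertension is $\exp(\beta_2)$ among females and $\exp(\beta_2 + \beta_3)$ among males, so $\exp(\beta_3)$ is the factor by which the hypertension effect differs between the sexes. The hazard ratio for hypertensive males relative to non-hypertensive females is $\exp(\beta_1 + \beta_2 + \beta_3)$. Without the $\beta_3$ term the model forces the hypertension effect to be identical in both sexes, which is why the question cannot be answered without it.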
It's not completely clear what you mean by "my supervisor wants BMI and SBP [systolic blood pressure] as continues and categorical variables in the analysis plus its interaction and square term." It's almost always best to model continuous predictors as continuous without categorizing them. The trick is to model them smoothly, for example with regression splines, instead of as simple linear predictors or with pre-defined polynomials like quadratic terms that cover the entire predictor range. If there is a sudden jump in an association with outcome as a predictor value changes, smooth modeling with splines should show that.
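For a sense of what a regression spline adds to the design matrix, here is a minimal pure-Python sketch of a restricted cubic spline basis, using the truncated-power form (linear beyond the outer knots) with a scaling by the squared knot range. The function name and the SBP knot locations are hypothetical; knots are usually placed at quantiles of the observed predictor, and real analyses would use an established implementation rather than this sketch.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots: one linear term plus
    k - 2 nonlinear terms, each constrained to be linear beyond the outer knots.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3 or len(set(t)) != k:
        raise ValueError("need at least 3 distinct knots")

    def cube_plus(u):
        # Truncated cubic: u^3 when u > 0, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    last_gap = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2  # keeps nonlinear terms on a scale similar to x

    basis = [float(x)]
    for j in range(k - 2):
        term = (
            cube_plus(x - t[j])
            - cube_plus(x - t[-2]) * (t[-1] - t[j]) / last_gap
            + cube_plus(x - t[-1]) * (t[-2] - t[j]) / last_gap
        )
        basis.append(term / scale)
    return basis


# Hypothetical knot locations for systolic blood pressure (mmHg).
SBP_KNOTS = [110, 125, 140, 160]

for sbp in (100, 130, 150, 180):
    print(sbp, rcs_basis(sbp, SBP_KNOTS))
```

Each row of output would replace the single SBP column in the model, so a 4-knot spline costs 3 coefficients. The fitted curve is then free to bend between the knots, which is how a threshold-like change in risk would show up without you having to choose a cutoff in advance.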
Beyond that, much depends on the details of your data, in particular the number of events and the number of predictors you intend to evaluate. Chapter 4 of Frank Harrell's course notes and book provides much useful guidance, including how to do outcome-blinded data reduction to match the number of predictors to the number of events, spline modeling (also see Chapter 2), and how to handle interactions between smoothly-modeled predictors.