Based on your current optimization strategy, you likely cannot get away with just the gradient of $f(\cdot)$ evaluated at $\frac{\boldsymbol{x}}{\left| \boldsymbol{x}\right|}$, since normalization isn't linear with respect to $\boldsymbol{x}$. However, you can apply the chain rule by first writing the function as $f(\boldsymbol{u}(\boldsymbol{x}))$, where $\boldsymbol{u}(\boldsymbol{x}) = \frac{\boldsymbol{x}}{\left| \boldsymbol{x}\right|}$. The gradient you say you derived is then $\vec{\nabla}_{u} f$, whereas what you actually want is $\vec{\nabla}_{x} f$.
Using the chain rule (with summation over repeated indices implied throughout), we can write:
\begin{align}
\frac{\partial f}{\partial x_{p}} &= \frac{\partial f}{\partial u_{k}} \frac{\partial u_{k}}{\partial x_{p}}
\end{align}
Since $u_k = x_{k} \left(x_l x_l\right)^{-1/2}$ for every $k$, we can find $\frac{\partial u_{k}}{\partial x_{p}} \;\forall k,p$ as follows:
\begin{align}
\frac{\partial u_{k}}{\partial x_{p}} &= \frac{\partial}{\partial x_{p}} \left( x_{k} \left(x_l x_l\right)^{-1/2} \right) \\
&= \delta_{kp} \left(x_l x_l\right)^{-1/2} + x_{k} \left(-\tfrac{1}{2}\right) \left(x_l x_l\right)^{-3/2} \left(2 x_{p}\right) \\
&= \delta_{kp} \left(x_l x_l\right)^{-1/2} - x_{k} x_{p} \left(x_l x_l\right)^{-3/2} \\
&= \left(\delta_{kp} - \frac{x_{k} x_{p}}{x_l x_l} \right) \left(x_l x_l\right)^{-1/2}
\end{align}
In matrix form, this expression simplifies to:
\begin{align}
\frac{\partial \boldsymbol{u}}{\partial \boldsymbol{x} } &= \frac{1}{\left| \boldsymbol{x}\right|} \left( I - \frac{\boldsymbol{x} \boldsymbol{x}^{T}}{\left| \boldsymbol{x}\right|^2}\right)
\end{align}
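As a sanity check, here is a minimal numpy sketch comparing this closed-form Jacobian against central finite differences (the function names `normalize_jacobian` and `numerical_jacobian` are my own for illustration, not anything from your post):

```python
import numpy as np

def normalize_jacobian(x):
    """Closed-form Jacobian of u(x) = x / |x|:  (I - x x^T / |x|^2) / |x|."""
    r = np.linalg.norm(x)
    return (np.eye(x.size) - np.outer(x, x) / r**2) / r

def numerical_jacobian(x, eps=1e-6):
    """Central finite-difference Jacobian of u(x) = x / |x| for comparison."""
    n = x.size
    J = np.zeros((n, n))
    for p in range(n):
        dx = np.zeros(n)
        dx[p] = eps
        u_plus = (x + dx) / np.linalg.norm(x + dx)
        u_minus = (x - dx) / np.linalg.norm(x - dx)
        J[:, p] = (u_plus - u_minus) / (2 * eps)  # column p holds du_k/dx_p
    return J

x = np.array([1.0, -2.0, 0.5])
print(np.max(np.abs(normalize_jacobian(x) - numerical_jacobian(x))))  # ~1e-10
```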
Thus, assuming you define the gradients as column vectors, the chain rule gives $\vec{\nabla}_{x} f = \left(\frac{\partial \boldsymbol{u}}{\partial \boldsymbol{x}}\right)^{T} \vec{\nabla}_{u} f$, and since this Jacobian is symmetric the transpose can be dropped, yielding the relationship you need:
\begin{align}
\vec{\nabla}_{x} f &= \frac{1}{\left| \boldsymbol{x}\right|} \left( I - \frac{\boldsymbol{x} \boldsymbol{x}^{T}}{\left| \boldsymbol{x}\right|^2}\right)
\vec{\nabla}_{u} f
\end{align}
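To confirm the full relationship end-to-end, here is a small sketch under an assumption of my own choosing: a test function $f(\boldsymbol{u}) = \boldsymbol{a} \cdot \boldsymbol{u}$ for a fixed vector $\boldsymbol{a}$, so that $\vec{\nabla}_{u} f = \boldsymbol{a}$ exactly:

```python
import numpy as np

a = np.array([0.3, 1.2, -0.7])             # fixed vector defining f(u) = a . u
f = lambda x: a @ (x / np.linalg.norm(x))  # f evaluated on the raw, unnormalized x

x = np.array([1.0, -2.0, 0.5])
r = np.linalg.norm(x)

# For this test function, grad wrt u is just a; map it back with the Jacobian.
grad_u = a
grad_x = (np.eye(3) - np.outer(x, x) / r**2) @ grad_u / r

# Compare with central finite differences applied to f(x) directly.
eps = 1e-6
num = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps) for e in np.eye(3)])
print(np.max(np.abs(grad_x - num)))  # ~1e-10
```

For any other $f$ the same two lines apply: compute $\vec{\nabla}_{u} f$ however you already do, then left-multiply by the Jacobian above.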