I'm confused. The Fisher information is $I(\theta)=\operatorname{Var}_x\!\left(s(\theta \mid x)\right)=\operatorname{Var}_x\!\left(\frac{d}{d\theta} \log L(\theta \mid x)\right)$ (the variance of the score), and the MLE can be written as the endpoint of gradient ascent on the log-likelihood: $\theta^{*}=\theta_0 + \eta \sum^{\infty}_{t=0}\frac{d}{d\theta}\log L(\theta_t \mid x)$, with step size $\eta$ and starting point $\theta_0$.
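Concretely, the gradient-ascent update I have in mind (my own notation for $\eta$ and $\theta_0$) is
$$\theta_{t+1} = \theta_t + \eta \, \frac{d}{d\theta}\log L(\theta_t \mid x),$$
which, telescoped from $t=0$, gives the sum above.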
Then shouldn't the variance of the MLE estimate $\theta^*$ be
$$\operatorname{Var}_x(\theta^{*}) \propto \sum^{\infty}_{t=0}\operatorname{Var}_x\!\left(\frac{d}{d\theta} \log L(\theta_t \mid x)\right) = \sum^{\infty}_{t=0} I(\theta_t)$$
given the variance sum law (treating the score terms as uncorrelated)? That is to say, the variance should be proportional to the Fisher information accumulated along the optimization trajectory. But instead (as I understand it) we have $\operatorname{Var}_x(\theta^{*}) = I(\theta^{*})^{-1}$, which is to say that the Fisher information is inversely proportional to the variance of the MLE estimate.
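As a sanity check on that inverse relationship, here is a quick toy simulation (my own sketch, not from any reference: a normal model with known $\sigma$, where the total Fisher information of the sample is $I_n(\theta) = n/\sigma^2$):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: x_i ~ N(theta, sigma^2) with sigma known, so the MLE of theta
# is the sample mean and the total Fisher information is I_n = n / sigma^2.
theta_true, sigma, n, n_reps = 2.0, 3.0, 50, 20_000

# n_reps independent datasets of size n; one MLE (the sample mean) per dataset.
samples = rng.normal(theta_true, sigma, size=(n_reps, n))
mles = samples.mean(axis=1)

total_info = n / sigma**2
print("empirical Var(MLE): ", mles.var())      # ~ sigma^2 / n = 0.18
print("inverse information:", 1.0 / total_info)
```

The two numbers agree, so the inverse relationship certainly seems right empirically; my question is why the sum-of-variances argument above fails.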
My confusion is further compounded by the accepted answer to a similar question: "[...] one often finds the maximum likelihood estimator (mle) of $\theta$ by solving the likelihood equation $\dot{\ell}(\theta)=0$. When the Fisher information, as the variance of the score $\dot{\ell}(\theta)$, is large, then the solution to that equation will be very sensitive to the data, giving a hope for high precision of the mle. [...]"
But if the solution were very sensitive to our data (sample), then wouldn't there be a greater risk that our particular sample is unrepresentative and yields a wildly incorrect solution, and therefore reason to fear lower MLE precision?
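For what it's worth, computing both quantities in the same toy model (again my own sketch, a normal model with known $\sigma$) shows exactly the pattern that puzzles me: the variance of the score grows with the information while the variance of the MLE shrinks.

```python
import numpy as np

rng = np.random.default_rng(1)

def score_var_vs_mle_var(sigma, theta=0.0, n=50, n_reps=20_000):
    """Toy normal model x_i ~ N(theta, sigma^2) with sigma known.

    Score at the true theta: n * (mean(x) - theta) / sigma^2, whose
    variance is the total information n / sigma^2.  MLE: the sample mean.
    """
    x = rng.normal(theta, sigma, size=(n_reps, n))
    score = n * (x.mean(axis=1) - theta) / sigma**2
    return score.var(), x.mean(axis=1).var()

# Smaller sigma <=> larger Fisher information n / sigma^2.
for sigma in (0.5, 2.0):
    v_score, v_mle = score_var_vs_mle_var(sigma)
    print(f"sigma={sigma}: Var(score)={v_score:8.1f}   Var(MLE)={v_mle:.4f}")
```

So a high-variance (high-information) score goes hand in hand with a low-variance MLE, which matches $\operatorname{Var}_x(\theta^{*}) = I(\theta^{*})^{-1}$ but not my reasoning above. What am I missing?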