I assume you're provided with the following quantities:
- The intrinsic parameters of the pinhole camera model in the conventional matrix format $\Pi \in \mathbb{R}^{3\times4}$.
- $n$ stationary 3-D points $P_i=(x_i,y_i,z_i,1)'$, written in homogeneous coordinates, that you can track in the environment. They are expressed in the fixed reference frame $O$ used by the odometry, and their projections onto the camera image plane are $\pi_i(t_j)$. The camera is assumed to be rigidly attached to a moving robot (the joint is kept fixed), so $\pi_i$ varies across the observations taken at the discrete time instants $t_j$.
- $n$ 2-D points $p_i(t_j)$ estimated within the image plane by the tracking algorithm at the discrete instants $t_j$.
- The homogeneous transformation $H(x(t_j),\theta+\delta\theta) \in \mathbb{R}^{4\times4}$ describing the pose of the camera frame mounted on the robot with respect to the root frame $O$, as provided by the odometry. $H$ depends on the base pose $x$ (the translation and rotation of the base), which varies over time, and on the joint angle $\theta+\delta\theta$, which is kept fixed. Here $\theta$ is the initial estimate of the joint angle and $\delta\theta$ is the unknown offset to be found.
The aim is then to minimize the following cost function:
$$
E=\frac{1}{2}\sum_{i=1}^{n} \sum_{t_j} \left\| p_i(t_j)-\tilde{\pi}_i(t_j) \right\|^2,
$$
where
$$
\pi_i(t_j)=\Pi\cdot H^{-1}(x(t_j),\theta+\delta\theta)\cdot P_i.
$$
Here $\pi_i \in \mathbb{R}^3$ has the form $\pi_i=\lambda_i\cdot(u_i,v_i,1)'$, where $\lambda_i$ is a scale factor, so we can take $\tilde{\pi}_i=(u_i,v_i)'$ as the pixel coordinates used in $E$.
The unknowns are $\delta\theta$ and $P_i$. The number of key-points $n$ can be kept low (around 4 to 5 non-coplanar points), since we benefit from the much larger number of observations at the discrete instants $t_j$. The cost function $E$ is a sum of squared residuals, which suits the Levenberg-Marquardt (LM) algorithm, and the Jacobian $J$ of the residuals is easy to derive because we know how to write down $H$.
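As a sketch of why the Jacobian is easy to obtain, consider a single residual $r_i(t_j)=p_i(t_j)-\tilde{\pi}_i(t_j)$ (the symbol $r_i$ is introduced here only for convenience) and take $P_i$ in homogeneous coordinates. Applying the chain rule through the perspective division gives
$$
\frac{\partial r_i}{\partial \delta\theta}=-\frac{1}{\lambda_i}\begin{pmatrix}1&0&-u_i\\0&1&-v_i\end{pmatrix}\Pi\cdot\frac{\partial H^{-1}}{\partial \delta\theta}\cdot P_i,
\qquad
\frac{\partial H^{-1}}{\partial \delta\theta}=-H^{-1}\cdot\frac{\partial H}{\partial \delta\theta}\cdot H^{-1},
$$
$$
\frac{\partial r_i}{\partial (x_i,y_i,z_i)}=-\frac{1}{\lambda_i}\begin{pmatrix}1&0&-u_i\\0&1&-v_i\end{pmatrix}\Pi\cdot H^{-1}\cdot\begin{pmatrix}I_3\\0\end{pmatrix},
$$
where the last $4\times3$ factor simply selects the Cartesian part of $P_i$. The only model-specific ingredient is $\partial H/\partial\delta\theta$, which follows directly from the kinematic expression of $H$.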
Any prior knowledge of the key-point locations $P_i$ in $O$ can be conveniently incorporated into $E$ to speed up convergence.
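To make the procedure concrete, here is a minimal NumPy sketch of the simplest case, in which the $P_i$ are known well enough to be held fixed and $\delta\theta$ is the only unknown. Everything specific in it is an assumption made for illustration: a planar base pose $x=(b_x,b_y,\text{yaw})$, a single pan joint about the vertical axis, the fixed mount rotation `MOUNT`, the intrinsics in `PI`, the synthetic data, and a hand-written LM loop with a forward-difference Jacobian instead of the analytic one.

```python
import numpy as np

def rot_z(a):
    """4x4 homogeneous rotation about the z axis by angle a (rad)."""
    c, s = np.cos(a), np.sin(a)
    H = np.eye(4)
    H[:2, :2] = [[c, -s], [s, c]]
    return H

def base_pose(bx, by, yaw):
    """4x4 pose of the robot base in O (assumed planar: translation + yaw)."""
    H = rot_z(yaw)
    H[:2, 3] = [bx, by]
    return H

# Assumed fixed mount: camera axes (x right, y down, z forward) expressed
# in the robot frame (x forward, y left, z up).
MOUNT = np.eye(4)
MOUNT[:3, :3] = [[0, 0, 1], [-1, 0, 0], [0, -1, 0]]

# Pi = K [I | 0] with made-up intrinsics (focal 500 px, centre 320 x 240).
PI = np.array([[500.0, 0, 320, 0], [0, 500.0, 240, 0], [0, 0, 1, 0]])

def camera_pose(x, joint):
    """H(x, theta + dtheta): base pose, then pan joint, then fixed mount."""
    return base_pose(*x) @ rot_z(joint) @ MOUNT

def project(x, joint, P):
    """Pixel coordinates (2 x n) of homogeneous points P (4 x n) from pose x."""
    pi = PI @ np.linalg.inv(camera_pose(x, joint)) @ P  # rows: lambda*(u, v, 1)
    return pi[:2] / pi[2]

def residuals(dtheta, theta, poses, P, p_obs):
    """Stacked p_i(t_j) - pi_tilde_i(t_j) over all poses t_j and points i."""
    return np.concatenate([(p_obs[j] - project(x, theta + dtheta, P)).ravel()
                           for j, x in enumerate(poses)])

def levenberg_marquardt(fun, z0, iters=50, damping=1e-3, eps=1e-6):
    """Basic LM loop with a forward-difference Jacobian of the residuals."""
    z = np.atleast_1d(np.asarray(z0, dtype=float))
    r = fun(z)
    for _ in range(iters):
        J = np.column_stack([(fun(z + eps * e) - r) / eps
                             for e in np.eye(z.size)])
        step = np.linalg.solve(J.T @ J + damping * np.eye(z.size), -J.T @ r)
        r_new = fun(z + step)
        if r_new @ r_new < r @ r:   # cost decreased: accept, reduce damping
            z, r, damping = z + step, r_new, damping * 0.5
        else:                       # cost increased: reject, raise damping
            damping *= 10.0
    return z

# Synthetic data: 5 non-coplanar points, 20 odometry poses, noise-free tracks.
P = np.array([[4.0, 5.0, 6.0, 5.0, 4.5],
              [-1.0, 0.5, 1.0, -0.5, 0.0],
              [0.2, 1.0, -0.3, 0.5, -0.5],
              [1.0, 1.0, 1.0, 1.0, 1.0]])
poses = [(0.1 * j, 0.05 * np.sin(j), 0.02 * j - 0.2) for j in range(20)]
theta, true_offset = 0.1, 0.05
p_obs = [project(x, theta + true_offset, P) for x in poses]

est = levenberg_marquardt(
    lambda z: residuals(z[0], theta, poses, P, p_obs), [0.0])
print("estimated offset:", est[0])
```

To estimate the $P_i$ as well, their coordinates could be appended to the unknown vector `z`; prior knowledge of their locations would then enter as the initial guess or as extra residuals penalizing the deviation from the known positions.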