Some of the trouble here is that $AUC$ and the likelihood ratio test are based on different ideas.
The $AUC$ measures the extent to which the predictions are separated by true category: the ability of the model to discriminate between the categories. Notably, if you divide the predictions by two, or apply any other monotonically increasing function (multiplying by $1/2$ is such a function), you do not change the order of the predictions, so you do not change how well the predictions separate the two categories. Consequently, $AUC$ says nothing about output calibration: whether a predicted probability of $p$ corresponds to the event truly happening with probability $p$.
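To see this concretely, here is a minimal Python sketch (the toy labels and probabilities are made up for illustration) using scikit-learn's `roc_auc_score`: halving the predicted probabilities preserves their order, so the $AUC$ is identical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 1, 0, 1, 1])           # true labels
p_hat = np.array([0.2, 0.4, 0.5, 0.7, 0.9])  # predicted probabilities

# Halving the probabilities is monotonically increasing, so the
# ranking of the predictions, and therefore the AUC, is unchanged.
print(roc_auc_score(y_true, p_hat))      # 0.8333...
print(roc_auc_score(y_true, p_hat / 2))  # 0.8333..., identical
```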
The likelihood, in contrast, involves not only the ability of the model to discriminate but also the calibration of the outputs. Consequently, if adding a variable slightly lowers the ability to discriminate but dramatically improves the calibration, the likelihood favors adding that variable, even though the $AUC$ suffers when you add it.
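The flip side is easy to see with the same toy data: an order-preserving distortion leaves the $AUC$ alone but hurts the log loss, which is just the negative mean log-likelihood (again a minimal sketch with made-up numbers).

```python
import numpy as np
from sklearn.metrics import log_loss, roc_auc_score

y_true = np.array([0, 1, 0, 1, 1])
p_hat = np.array([0.2, 0.4, 0.5, 0.7, 0.9])

# Same ranking, so discrimination (AUC) is untouched...
print(roc_auc_score(y_true, p_hat) == roc_auc_score(y_true, p_hat / 2))  # True

# ...but the halved probabilities are poorly calibrated, and the
# log loss (the negative mean log-likelihood) gets noticeably worse.
print(log_loss(y_true, p_hat))      # ~0.46
print(log_loss(y_true, p_hat / 2))  # ~0.77
```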
If you want to assess the fit according to the likelihood and also want some kind of "absolute" measure of performance (it is hard to say that any particular score counts as "good", but it is nice to give context to a log-likelihood value, which lacks an easy interpretation like mean absolute error has), you might consider McFadden's $R^2$, which compares the log-likelihood of your model (the numerator of the fraction below) to the log-likelihood of a reasonable baseline model that always predicts the overall event rate (the denominator).
$$
R^2_\text{McFadden} = 1 - \dfrac{
\sum_{i=1}^{N}\left[
y_i\log(\hat y_i) + (1 - y_i)\log(1 - \hat y_i)
\right]
}{
\sum_{i=1}^{N}\left[
y_i\log(\bar y) + (1 - y_i)\log(1 - \bar y)
\right]
}
$$
In the equation above, $y_i\in\left\{0, 1\right\}$ are the true labels, $\hat y_i$ are the predicted probabilities, and $\bar y$ is the overall proportion of observations with the event coded as $1$.
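In code, this is only a few lines. Here is a minimal sketch (the function name, the `eps` clipping guard, and the example numbers are my own choices, not from any standard library):

```python
import numpy as np

def mcfadden_r2(y, p_hat, eps=1e-15):
    """McFadden's R^2: one minus the ratio of the model's log-likelihood
    to that of a null model that always predicts the overall event rate."""
    y = np.asarray(y, dtype=float)
    # Clip to avoid log(0) for predictions of exactly 0 or 1.
    p_hat = np.clip(np.asarray(p_hat, dtype=float), eps, 1 - eps)
    y_bar = y.mean()  # overall proportion of 1s: the null model's prediction
    ll_model = np.sum(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))
    ll_null = np.sum(y * np.log(y_bar) + (1 - y) * np.log(1 - y_bar))
    return 1 - ll_model / ll_null

# For the toy data above: mcfadden_r2([0, 1, 0, 1, 1],
#                                     [0.2, 0.4, 0.5, 0.7, 0.9]) ~ 0.32
```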
While McFadden's $R^2$ does not seem to be as popular in machine learning circles as the $AUC$, it is established in the statistics literature, and a big part of me thinks that it should be a more popular metric.