Quick Take: the two are equivalent, so it does not matter which you use, as long as you are clear about what the terms mean and what numbers you feed into each equation.
Let's break down what the terms mean in each equation.
$$
\text{Logistic Loss}\\
\dfrac{1}{N}\overset{N}{\underset{i=1}{\sum}}
\log\left(1 + \exp(-y_i w^Tx_i)\right)
$$
(This is the full "logistic loss": the equation given in the question is the contribution of a single prediction, and the overall loss is the mean of those contributions.)
$N$ is the sample size.
$y_i\in\{-1,+1\}$ is the $i$th true value.
$w$ is the estimated parameter vector of the logistic regression, with $w^T$ its transpose.
$x_i$ is the $i$th feature vector (your vector of predictors).
Note that $w^Tx_i$ is the predicted value of the logistic regression on the log-odds scale (so before applying the inverse link function to convert to probability). After all, a generalized linear model is $g(\mathbb E[y\vert X=x_i])=w^Tx_i$.
Therefore, the logistic loss will be useful if you have coded your categories as $\pm1$. The predicted values you input into the loss function along with these $\pm1$-coded categories are the log-odds.
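For instance, here is a minimal sketch in R (with made-up log-odds and labels, purely for illustration) of evaluating this loss:

log_odds <- c(2.1, -0.4, 1.3)      # hypothetical values of w^T x_i
y <- c(1, -1, 1)                   # true categories coded as -1/+1
mean(log(1 + exp(-y * log_odds)))  # the logistic loss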
$$
\text{Log Loss}\\
-\dfrac{1}{N}\overset{N}{\underset{i=1}{\sum}}\left[
y_i \log(p(y_i)) + (1 - y_i)\log(1 - p(y_i))
\right]
$$
$N$ is the sample size.
$y_i\in\{0, 1\}$ is the $i$th true value.
$p(y_i)$ is the predicted probability that observation $i$ belongs to category $1$. This is the predicted value of the logistic regression on the probability scale, obtained by applying the inverse of the log-odds link function to the linear predictor of the logistic regression.
$$
p(y_i) = \dfrac{1}{
1 + \exp(-w^Tx_i)
}\\
\Big\Updownarrow\\
w^Tx_i = \log\left(
\dfrac{
p(y_i)
}{
1 - p(y_i)
}
\right)
$$
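(As an aside, base R already provides this pair of transformations: plogis maps log-odds to probabilities, and qlogis maps probabilities back to log-odds.)

plogis(0)    # 1/(1 + exp(-0)) = 0.5: log-odds to probability
qlogis(0.5)  # log(0.5/(1 - 0.5)) = 0: probability to log-odds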
This "log" form of the loss function makes sense when the categories are coded as $0$ and $1$ instead of $\pm1$ and when you have predicted probabilities.
Because you can convert easily between the $\{0,1\}$ and $\{-1,+1\}$ categorical encodings and between log-odds and probabilities, you are free to use whichever you like. Just keep track of what goes into which equation; for instance, do not mix the predicted log-odds with the $\{0,1\}$ encoding.
If you want to use log-odds and $\{-1,+1\}$ encoding, use the "logistic" form of the loss function. If you want to use probability and $\{0,1\}$ encoding, use the "log" form of the loss function.
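In fact, the equivalence can be shown algebraically, not just by simulation. Write $\tilde y_i = 2y_i - 1\in\{-1,+1\}$ for the recoded category, and recall that $1 - p(y_i) = 1/\left(1 + \exp(w^Tx_i)\right)$. Then each observation contributes the same amount to both losses:
$$
y_i = 1\ (\tilde y_i = +1):\quad
-\log(p(y_i)) = \log\left(1 + \exp(-w^Tx_i)\right) = \log\left(1 + \exp(-\tilde y_i w^Tx_i)\right)\\
y_i = 0\ (\tilde y_i = -1):\quad
-\log(1 - p(y_i)) = \log\left(1 + \exp(w^Tx_i)\right) = \log\left(1 + \exp(-\tilde y_i w^Tx_i)\right)
$$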
EDIT
A simulation is not a proof, but it did give me a good feeling to see that, in the simulation below, which calculates the loss both ways at over $25000$ possible parameter values for the logistic regression, the two loss functions give the same value whenever the correct arguments are passed to each.
set.seed(2023)
library(ggplot2)

# Simulate N observations from a logistic regression with
# intercept -2 and slope 4
N <- 100
x <- runif(N, 0, 1)
z <- 4*x - 2
p <- 1/(1 + exp(-z))
y01 <- rbinom(N, 1, p)  # categories coded as {0, 1}
y_pm <- 2 * y01 - 1     # the same categories coded as {-1, +1}

# Grid of candidate (intercept, slope) parameter values
b0s <- seq(-4, 0, 0.025)
b1s <- seq(2, 6, 0.025)
log_losses <- logistic_losses <- rep(NA, length(b0s) * length(b1s))

# "Log" form: takes predicted probabilities and {0, 1}-coded categories
log_loss <- function(p, y){
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

# "Logistic" form: takes predicted log-odds and {-1, +1}-coded categories
logistic_loss <- function(logodds, y){
  mean(log(1 + exp(-y * logodds)))
}

# Evaluate both loss functions at every grid point
counter <- 1
for (i in seq_along(b0s)){
  print(i)  # progress indicator
  intercept <- b0s[i]
  for (j in seq_along(b1s)){
    slope <- b1s[j]
    log_odds <- intercept + slope * x
    probability <- 1/(1 + exp(-log_odds))
    log_losses[counter] <- log_loss(probability, y01)
    logistic_losses[counter] <- logistic_loss(log_odds, y_pm)
    counter <- counter + 1
  }
}

# Regress one loss on the other: slope 1 and intercept 0 mean they agree
L <- lm(log_losses ~ logistic_losses)

# Plot one loss against the other, along with the line y = x
d <- data.frame(
  log_loss = log_losses,
  logistic_loss = logistic_losses
)
ggplot(d, aes(x = logistic_loss, y = log_loss)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0)
summary(L)
Coefficients:
                  Estimate Std. Error    t value Pr(>|t|)    
(Intercept)     -5.384e-15  3.348e-17 -1.608e+02   <2e-16 ***
logistic_losses  1.000e+00  4.422e-17  2.261e+16   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.189e-15 on 25919 degrees of freedom
Multiple R-squared:      1,	Adjusted R-squared:      1
F-statistic: 5.114e+32 on 1 and 25919 DF,  p-value: < 2.2e-16

Indeed, the differences between the two calculations are all on the order of $10^{-16}$, if not smaller.
summary(abs(logistic_losses - log_losses))
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
0.000e+00 0.000e+00 0.000e+00 2.115e-17 0.000e+00 6.661e-16