Take a look at McCullagh and Nelder (1989) Generalized Linear Models, 2nd ed, Section 2.5 (pp 40-43), on iteratively reweighted least squares.
Let $y$ be the 0/1 outcome and let $\eta = g(\mu)$ be the link function. You never calculate $g(y)$ directly, but work with an adjusted dependent variable
$$z = \hat{\eta}_0 + (y-\hat{\mu}_0) \left(\frac{d\eta}{d\mu}\right)_0$$
where $\hat{\eta}_0$ is the current estimate of the linear predictor, $X\hat{\beta}_0$, and $\hat{\mu}_0 = g^{-1}(\hat{\eta}_0)$. So that avoids the problem with $g(0)$ and $g(1)$ being $\pm \infty$.
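In case it's not obvious where that comes from: $z$ is just the first-order Taylor expansion of $g(y)$ about the current fit $\hat{\mu}_0$, with the derivative evaluated at $\hat{\mu}_0$ rather than at $y$:
$$g(y) \approx g(\hat{\mu}_0) + (y - \hat{\mu}_0)\left(\frac{d\eta}{d\mu}\right)_0 = \hat{\eta}_0 + (y - \hat{\mu}_0)\left(\frac{d\eta}{d\mu}\right)_0 = z$$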
For the logit link, $\eta = \ln[\mu / (1-\mu)]$, you'll find that $d\eta/d\mu = 1/[\mu(1-\mu)]$ and so you would have
$$z = \hat{\eta}_0 + \frac{y-\hat{\mu}_0}{\hat{\mu}_0 (1 - \hat{\mu}_0)}$$
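If you want to check that derivative yourself:
$$\frac{d\eta}{d\mu} = \frac{d}{d\mu}\left[\ln \mu - \ln(1-\mu)\right] = \frac{1}{\mu} + \frac{1}{1-\mu} = \frac{1}{\mu(1-\mu)}$$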
You further calculate weights
$$w_0^{-1} = \left(\frac{d\eta}{d\mu}\right)^2_0 v_0$$
where $v_0 = V(\hat{\mu}_0)$ comes from the mean/variance relationship, which for the binary case would be $V(\mu) = \mu(1-\mu)$. For the logit link, since $d\eta/d\mu = 1/[\mu(1-\mu)]$, you end up with weights $w_0 = \hat{\mu}_0(1-\hat{\mu}_0)$.
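Written out for the logit link, so you can see the cancellation:
$$w_0^{-1} = \left(\frac{1}{\hat{\mu}_0(1-\hat{\mu}_0)}\right)^{\!2} \hat{\mu}_0(1-\hat{\mu}_0) = \frac{1}{\hat{\mu}_0(1-\hat{\mu}_0)}, \qquad \text{so} \qquad w_0 = \hat{\mu}_0(1-\hat{\mu}_0)$$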
A key concern is the starting points. You might look at the R source code to see what they do. I wrote down in a notebook to start with
$\tilde{\mu} = 1/4$ if $y = 0$ and $\tilde{\mu} = 3/4$ if $y=1$, but I didn't include a source.
To spell out the iterative algorithm a bit more, focusing on the logit link (a short code sketch implementing these steps follows the two lists below):
At the start you do the following:
- Start with initial "fitted" values, say $\hat{\mu}^{(0)}_i = 1/4$ or $3/4$ according to whether $y_i = 0$ or $1$
- Calculate $\hat{\eta}^{(0)}_i = \ln[\hat{\mu}^{(0)}_i/(1-\hat{\mu}^{(0)}_i)]$
- Calculate $z^{(0)}_i = \hat{\eta}^{(0)}_i + [y_i-\hat{\mu}^{(0)}_i]/[\hat{\mu}^{(0)}_i (1 - \hat{\mu}^{(0)}_i)]$
- Calculate the weights $w^{(0)}_i = \hat{\mu}^{(0)}_i (1-\hat{\mu}^{(0)}_i)$
- Regress the $z^{(0)}_i$ on $X$ using weights $w^{(0)}_i$, to get initial estimates $\hat{\beta}^{(0)}$
Then, at each iteration, you do the following:
- Calculate $\hat{\eta}^{(s)}_i = (X \hat{\beta}^{(s-1)})_i$
- Calculate $\hat{\mu}^{(s)}_i = \exp(\hat{\eta}^{(s)}_i)/[1+\exp(\hat{\eta}^{(s)}_i)]$
- Calculate $z^{(s)}_i = \hat{\eta}^{(s)}_i + [y_i-\hat{\mu}^{(s)}_i]/[\hat{\mu}^{(s)}_i (1 - \hat{\mu}^{(s)}_i)]$
- Calculate the weights $w^{(s)}_i = \hat{\mu}^{(s)}_i (1-\hat{\mu}^{(s)}_i)$
- Regress the $z^{(s)}_i$ on $X$ using weights $w^{(s)}_i$, to get revised estimates $\hat{\beta}^{(s)}$
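If it helps to see the bookkeeping in code, here's a minimal sketch of those two lists in Python/numpy (the function name, the convergence check, and the simulated data are mine; real GLM fitting routines add various safeguards on top of this):

```python
import numpy as np

def logistic_irls(X, y, max_iter=25, tol=1e-8):
    """Fit a logistic regression by IRLS; X should include an intercept column."""
    # initial "fitted" values: 1/4 or 3/4 according to whether y_i is 0 or 1
    mu = np.where(y == 1, 0.75, 0.25)
    eta = np.log(mu / (1 - mu))                    # eta^(0) = logit(mu^(0))
    beta = None
    for _ in range(max_iter):
        z = eta + (y - mu) / (mu * (1 - mu))       # adjusted dependent variable
        w = mu * (1 - mu)                          # IRLS weights for the logit link
        # weighted least squares of z on X: solve (X'WX) beta = X'Wz
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if beta is not None and np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
        eta = X @ beta                             # updated linear predictor
        mu = 1 / (1 + np.exp(-eta))                # updated fitted values
    return beta

# simulated example: true intercept 0.5, true slope 1.5
rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.5 * x))))
X = np.column_stack([np.ones_like(x), x])
print(logistic_irls(X, y))     # estimates should be near (0.5, 1.5)
```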
This is all just for regular logistic regression. For the local logistic regression version, there is some discussion in Chapter 4 of Loader (1999) Local Regression and Likelihood (but frankly, I didn't really follow it).
A Google search for "local logistic regression IRLS" revealed these notes from Patrick Breheny, which say (pg 8):
> The weight given to an observation $i$ in a given iteration of the IRLS algorithm is then a product of the weight coming from the quadratic approximation to the likelihood and the weight coming from the kernel ($w_i = w_{1i} w_{2i}$).
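So, as best I understand it, for local logistic regression you run the same IRLS loop separately at each target point $x_0$, but multiply the likelihood-based weight $w_{1i} = \hat{\mu}_i(1-\hat{\mu}_i)$ by a kernel weight $w_{2i}$ that downweights observations far from $x_0$. Here's a rough sketch of a single local fit, assuming a Gaussian kernel and a local-linear basis (the function name and the bandwidth handling are mine, and I haven't checked this against Loader's or Breheny's code):

```python
import numpy as np

def local_logistic_at(x, y, x0, h, max_iter=25):
    """One local-linear logistic fit at target point x0, with bandwidth h."""
    X = np.column_stack([np.ones_like(x), x - x0])   # local linear basis centered at x0
    w2 = np.exp(-0.5 * ((x - x0) / h) ** 2)          # w_{2i}: Gaussian kernel weights
    mu = np.where(y == 1, 0.75, 0.25)                # same starting values as before
    eta = np.log(mu / (1 - mu))
    for _ in range(max_iter):
        z = eta + (y - mu) / (mu * (1 - mu))         # adjusted dependent variable
        w = mu * (1 - mu) * w2                       # w_i = w_{1i} * w_{2i}
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ beta
        mu = 1 / (1 + np.exp(-eta))
    # the intercept is the estimated logit at x0; return it as a probability
    return 1 / (1 + np.exp(-beta[0]))
```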