I'm trying to grok how the standard errors for a marginal effect are computed.
I know that if $\eta_i = x^T\beta$, then the square of the standard error of $\eta_i$ is $x^T \Sigma x$, where $\Sigma$ is the covariance matrix obtained from the regression.
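To make that identity concrete, here is a minimal numpy sketch (the covariance matrix and covariate vector are made up for illustration):

```python
import numpy as np

# Hypothetical covariance matrix of beta-hat and one design row x
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
x = np.array([1.0, 2.0])

# Quadratic form x^T Sigma x gives the variance of eta = x^T beta-hat
var_eta = x @ Sigma @ x
se_eta = np.sqrt(var_eta)
print(se_eta)  # ≈ 0.6633
```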
Suppose then I want to estimate the marginal effect (on the link scale) for a Poisson regression model. If my data are housed in a dataframe df and I want a point estimate of the marginal effect of txt (a binary indicator), then I can simply do
control = df.assign(txt='Control')
treatment = df.assign(txt='Treatment')
Here, I'm using statsmodels. The predict method returns estimates on the natural (response) scale:
treat_y = fit.predict(treatment)
control_y = fit.predict(control)
Take logs to get things back onto the scale of the linear predictor.
np.log(treat_y).mean() - np.log(control_y).mean()
The point estimate agrees with {marginaleffects}, a very good R library for computing these quantities. However, I seem to be computing the standard error incorrectly.
My first step is to construct design matrices for the counterfactual datasets where everyone has txt='Control' and where everyone has txt='Treatment'.
X_control = patsy.dmatrix(fit.model.data.design_info, control, return_type="matrix")
X_treat = patsy.dmatrix(fit.model.data.design_info, treatment, return_type="matrix")
Now I can compute the variance of each prediction:
cov_mat = fit.cov_params().values
var_control = ((X_control @ cov_mat) * X_control).sum(1)
var_treat = ((X_treat @ cov_mat) * X_treat).sum(1)
Finally, since I am averaging the predictions, I take the mean of the variances:
se_T = var_treat.mean()
se_C = var_control.mean()
Since I want the standard error of the difference, the variances should add:
se = np.sqrt(se_T + se_C)
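For reference on the averaging step: because every prediction uses the same estimated β, the variance of a mean of linear predictors is a quadratic form in the averaged design row, which in general differs from the mean of the per-row variances. A small sketch with made-up Σ and design rows:

```python
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])      # hypothetical covariance of beta-hat
X = rng.normal(size=(5, 2))           # hypothetical design rows

# Variance of the mean of eta_i = x_i^T beta-hat:
# Var(mean_i eta_i) = xbar^T Sigma xbar, where xbar is the mean design row
xbar = X.mean(axis=0)
var_mean = xbar @ Sigma @ xbar

# Mean of the per-row variances (the quantity computed above)
mean_var = ((X @ Sigma) * X).sum(axis=1).mean()

print(var_mean, mean_var)  # generally not equal
```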
This returns a standard error of 0.12, whereas marginaleffects returns a standard error of 0.067. The code required to generate the data is below. I'd much appreciate some help in computing the standard error for a marginal effect.
Code for Data
library(tidyverse)
library(marginaleffects)
library(sandwich)
set.seed(0)
N <- 1000
txt <- sample(c('Treatment', 'Control'), size=N, replace=T)
g <- sample(letters[1:3], size=N, replace=T)
X <- model.matrix(~txt*g)
beta <- c(log(0.4), 0.02, 0.2, 0.25, 0, 0)
eta <- X %*% beta
lam <- exp(eta)
y <- rbinom(N, 1, lam)
tibble(txt, g, y) %>%
write_csv('test.csv')
The model I am fitting in R is fit <- glm(y ~ txt*g, data = d, family = poisson) (using a robust covariance). In Python, the model is
import numpy as np
import patsy
from statsmodels.formula.api import poisson
import pandas as pd
df = pd.read_csv('test.csv')
fit = poisson('y ~ txt * g', data=df).fit(cov_type='HC0')
fit.summary()
se = np.sqrt(se_T + se_C). – dimitriy Jan 24 '23 at 18:04
attr(comparisons(model), "jacobian") for an intermediate result. – Vincent Jan 28 '23 at 12:32