Does ANCOVA require homogeneity of regression slopes? In other words, do the slopes of the lines need to be the same in order to use this method?
Per one blog post, ANCOVA can be used to discriminate between models with different slopes, but another blog post says the slopes must be parallel for ANCOVA to be used. Please explain the discrepancy, in plain language if possible. I have included some simulated data with Python code below as an example for context.
Example Data:
## Module Imports
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
## Generate sample data for Z statistic method
g1_count = 100 # group 1 number of datapoints
g2_count = 100 # group 2 number of datapoints
x1 = np.arange(5, 15, 10/g1_count) # array of x values for group 1
y1 = x1 + 3 + np.random.normal(0, 1, g1_count) # array of y values for group 1
x2 = np.arange(0, 10, 10/g2_count) # array of x values for group 2
y2 = 2*x2 + 3 + np.random.normal(0, 1, g2_count) # array of y values for group 2
df = pd.DataFrame({"x1": x1, "x2": x2, "y1": y1, "y2": y2}) # create dataframe
## Plot Data
fig, ax = plt.subplots()
ax.scatter(x1,y1)
ax.scatter(x2,y2)
## Sample data for ANCOVA, combines x and y values to a single column each and adds a categorical column
x_c = np.concatenate((x1, x2))
y_c = np.concatenate((y1, y2))
group_list = np.repeat(["A", "B"], [len(x1), len(x2)]) # group label for each row
df_c = pd.DataFrame({"x": x_c, "y": y_c, "group": group_list})
df_c
## Fit Model and See Summary Statistics
lm_ancova = smf.ols('y ~ group + x', data=df_c).fit()
lm_ancova.summary()
The null hypothesis is rejected: given the low p-values, the two linear models are likely not the same.
However, an ANOVA test on the interaction model, as described here, indicates that the assumption of no interaction between the group and the covariate (x) fails.
inter_lm = smf.ols('y ~ group * x', data=df_c).fit() # fit linear interaction model
sm.stats.anova_lm(inter_lm, typ=3) # ANOVA test on interaction model
Any suggestions for a more appropriate test implemented in Python would be much appreciated.



