I've seen the post "How can we show ONLY features that are correlated over a certain threshold in a heatmap?", which shows only the correlations that exceed a certain threshold on a heatmap, using the following piece of code:
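
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Assumes newdf is a numeric DataFrame and corr = newdf.corr() was computed earlier
    components = list()
    visited = set()
    print(newdf.columns)
    for col in newdf.columns:
        if col in visited:
            continue
        # Breadth-first search over columns linked by |correlation| > 0.999
        component = set([col, ])
        just_visited = [col, ]
        visited.add(col)
        while just_visited:
            c = just_visited.pop(0)
            for idx, val in corr[c].items():
                if abs(val) > 0.999 and idx not in visited:
                    just_visited.append(idx)
                    visited.add(idx)
                    component.add(idx)
        components.append(component)

    # One heatmap per group of mutually linked columns
    for component in components:
        plt.figure(figsize=(12,8))
        labels = list(component)  # recent pandas versions reject a set as a .loc indexer
        sns.heatmap(corr.loc[labels, labels], cmap="Reds")
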
My question is: how do you include on a heatmap only the features whose correlations exceed a certain threshold and are also statistically significant (p-values less than 0.05)?
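
To make the goal concrete, here is a rough sketch of the kind of filter I have in mind. It assumes Pearson correlations, all-numeric columns, and `scipy.stats.pearsonr` for the p-values; the function name `significant_corr` and the default `threshold` are made up for illustration, and I'm not sure this is the right approach:

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr


def significant_corr(df, threshold=0.5, alpha=0.05):
    """Return (corr, pvals, keep) for a numeric DataFrame (hypothetical helper).

    keep.loc[a, b] is True when |r| > threshold and p < alpha for columns a != b.
    """
    cols = df.columns
    corr = df.corr()  # Pearson correlation by default
    # p-values start at 1, so the diagonal (self-correlation) is never kept
    pvals = pd.DataFrame(np.ones((len(cols), len(cols))), index=cols, columns=cols)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            pair = df[[a, b]].dropna()  # pairwise-complete rows, like df.corr()
            _, p = pearsonr(pair[a], pair[b])
            pvals.loc[a, b] = pvals.loc[b, a] = p
    keep = (corr.abs() > threshold) & (pvals < alpha)
    return corr, pvals, keep
```

With that, the features to plot would presumably be `selected = keep.columns[keep.any(axis=0)]`, and the non-qualifying cells could be hidden with something like `sns.heatmap(corr.loc[selected, selected], mask=~keep.loc[selected, selected], cmap="Reds")`, since `sns.heatmap` does not draw cells where `mask` is True.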