The short, but perhaps unsatisfying answer is: when you have a prior reason to think that the effect of one variable might depend on what's going on with another variable.
For example, let's say I'm trying to model student scores on a math test as a function of math test scores in the previous year and a binary variable indicating whether the student attended a (randomly assigned) refresher course in rudimentary math topics.
Given that the course covered only rudimentary material, there are good theoretical reasons to think it might produce a bigger impact on students who started at a lower baseline, and little or no impact on students who were already doing well (and thus already knew the rudimentary topics it covered). So I should include an interaction term between prior test scores and course attendance to test whether this is the case (here I would predict a negative and significant interaction coefficient).
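As a sketch of what that test looks like in practice, here is a minimal example using simulated data and the statsmodels formula API. The data, the variable names, and the effect sizes are all made up for illustration; the only point is that the model includes the product term and that we examine its coefficient:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical simulated data where the course helps low-baseline students most
rng = np.random.default_rng(0)
n = 500
prior = rng.normal(50, 10, n)       # prior-year math score
course = rng.integers(0, 2, n)      # randomly assigned refresher course (0/1)
# The course's true effect shrinks as the prior score rises (negative interaction)
score = 10 + 0.8 * prior + course * (20 - 0.3 * prior) + rng.normal(0, 5, n)

df = pd.DataFrame({"score": score, "prior": prior, "course": course})
# 'prior * course' expands to prior + course + prior:course (the interaction)
model = smf.ols("score ~ prior * course", data=df).fit()
print(model.params["prior:course"], model.pvalues["prior:course"])
```

With data generated this way, the `prior:course` coefficient comes out negative and significant, matching the theoretical prediction above.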
Note that this decision was made purely based on my prior theoretical understanding of how the variables should (or could) work. I didn't run a model first without the interaction term and then check some diagnostic or run some post hoc tests.
In general, when you are trying to decide how to specify a model - including whether to include interaction terms - you want to base these decisions on prior theory and literature. It can be tempting to search for some sort of algorithmic approach ("if this number here is less than .05, then include an interaction"), as you seem to be doing, but such approaches tend to cause big problems in practice - like unintentional p-hacking. See prior discussions here about the problems with other attempts to specify models using "algorithmic" approaches.
In the case of interaction terms, there is always a large number of interactions you COULD specify in any model. But if you try to check them all, you will run into a multiple comparisons problem: because you ran so many statistical tests, some of them will come up significant at the .05 level just due to random chance. And some of these interaction terms - even if significant - will make no substantive sense. Finally, including interaction terms eats up degrees of freedom, makes the model harder to interpret, and reduces statistical power. So you only want to include an interaction term if you think the benefit (in terms of interpretation and model fit) outweighs these costs.
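The multiple comparisons point is easy to demonstrate with a toy simulation (my own illustration, not from any real dataset). Here each candidate "interaction" predictor is pure noise with zero true effect, yet screening many of them at alpha = .05 finds at least one "significant" result most of the time, close to the analytic rate of 1 - 0.95^k:

```python
import numpy as np
from scipy import stats

# With k independent null tests at alpha = .05, the chance of at least one
# false positive is 1 - 0.95**k, not 0.05. For k = 20 that is about 0.64.
rng = np.random.default_rng(1)
n, k, trials = 200, 20, 500

false_hits = 0
for _ in range(trials):
    y = rng.normal(size=n)  # outcome: pure noise
    pvals = []
    for _ in range(k):
        x = rng.normal(size=n)  # a candidate predictor with zero true effect
        _, p = stats.pearsonr(x, y)
        pvals.append(p)
    if min(pvals) < 0.05:  # "found a significant interaction!"
        false_hits += 1

rate = false_hits / trials
print(rate)  # far above the nominal 0.05
```

Fitting twenty actual interaction terms would behave the same way; the correlation test just keeps the sketch short.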
In short: take a step back from diagnostics and think about what the variables you are considering for your model are actually doing, and why and how they might relate to the dependent variable. If you can think of a good substantive reason why the effect of one variable might depend on the level of another variable, then consider testing for an interaction between them.