You have to be careful when you say "the anova() function": even within R, it can mean different things depending on the type of model and the package involved.
For evaluating the single B:C interaction coefficient, an analysis of deviance would best be a nested comparison of the first model you show, which includes that interaction term, against the second model with the same predictors but the B:C interaction omitted. You then evaluate the p-value against your pre-specified $\alpha$ cutoff.
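For concreteness, here's a minimal sketch of that nested comparison, assuming a data frame `dat` with a binary outcome `y` and predictors `A`, `B`, and `C` (those names and the binomial family are placeholders, not details from your question):

```r
## Nested analysis-of-deviance comparison; `dat`, `y`, `A`, `B`, and `C`
## are hypothetical stand-ins for your actual data and predictors.
fit_full    <- glm(y ~ A + B + C + B:C, family = binomial, data = dat)
fit_reduced <- glm(y ~ A + B + C, family = binomial, data = dat)

## Likelihood-ratio (chi-squared) test for the B:C interaction.
anova(fit_reduced, fit_full, test = "Chisq")
```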
According to the help page for anova.glm(), if you instead specify a single model then you get a sequential term-by-term analysis. That might lead to different apparent "significance" results if you change the order of predictors in the model.
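To see that sequential behavior, compare the call above with anova() applied to a single model (same hypothetical fit as before):

```r
## Sequential (Type I) analysis of deviance: each term is tested as it is
## added in formula order, so reordering A, B, C can change the results.
anova(fit_full, test = "Chisq")
```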
The second part of the question is harder. You can compare different models with respect to goodness of fit, adjusted for the number of predictors in the model. In your example, the nested anova() just described tells you whether adding the B:C interaction improves the simpler model enough to justify including that interaction. The p-value serves that purpose "at significance level $\alpha$."
If the models being compared don't involve nested sets of predictors, you can't use anova() for comparisons. Some suggest measures like the Akaike Information Criterion (AIC) in that case, but that approach isn't universally accepted and there's no "significance level" associated with it.
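As an illustration only, AIC() can be applied to non-nested fits; `fit_alt` below is a hypothetical model with a different predictor `D` that isn't in your question:

```r
## AIC comparison of non-nested models (hypothetical alternative predictor D).
fit_alt <- glm(y ~ A + D, family = binomial, data = dat)
AIC(fit_full, fit_alt)   # lower AIC = better fit after penalizing complexity
```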
A more general check of goodness of fit is to evaluate how well the modeling process works on multiple bootstrapped samples of the data. Again, though, there's no "significance level" for that. You have to gauge, based on your understanding of the subject matter, whether the model is good enough for your purposes.
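One way to implement that, if your model happens to be a logistic regression, is bootstrap validation with the rms package; this is only a sketch under the same hypothetical setup used above:

```r
## Optimism-corrected bootstrap validation (rms package); model formula and
## data names are the same hypothetical ones used in the earlier sketches.
library(rms)
fit_lrm <- lrm(y ~ A + B + C + B:C, data = dat, x = TRUE, y = TRUE)
validate(fit_lrm, B = 200)   # repeats the modeling on 200 bootstrap resamples
```

The output summarizes how much the apparent fit indices shrink when the modeling process is repeated on resampled data, which is the kind of judgment call, rather than a formal significance test, described above.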