
Has anyone ever made a concept map for detecting overfitting, explainability, or interpretability? I have been making one, but I wanted to get the site's feedback before posting it on stats.stackexchange.com. I realized after reading this meta-thread how important the concept of interpretability, and how it applies to "what is wrong with my model?", is.

To answer this, I have been making some concept charts of Interpretability and Explainability, as this is an important part of diagnosing problems with models, detecting overfitting, and evaluating methods to reduce overfitting. However, I am still a novice on this topic, so I wanted to get other people's feedback and help to make sure these are accurate.

Feature-based methods: https://drive.google.com/file/d/1bp0Ogf2UFN2yqjoIlJK5PVevXRc8BwXo/view?usp=sharing

Anomaly/Example-Based methods: https://www.lucidchart.com/documents/edit/9cb1b163-53bf-4e15-a5b0-df17c5e6e8df/0

What do you think? What should be removed? What should be elaborated on further?

How exactly is this related to CV and Data Science Stack Exchange?

If there is no general way to interpret R^2 or any other single measure, then why not compare against diagnostic measures that do generalize? If there is no context-free way to decide whether model metrics are meaningful, then why not at least have the context depend only on the data rather than on both the model and the data?

On Stack Exchange, these charts would be a follow-up or footnote to those threads. They are good explanations, but they are quickly becoming outdated. I realized this while originally following up on mtk's meta-question.

As to why this is important and relevant to statistics, let me give an example based on my very novice understanding. You cannot always compare the Adjusted R^2, AUC, or prediction error across different model types or ensemble models, and ensemble models also make overfitting harder to detect. Related to Validation vs. Test, you cannot always use R^2, AUC, or prediction error to detect overfitting for ensemble models or neural networks. Hence, if there is no general way to interpret R^2, why not have diagnostic measures that do generalize: measures agnostic enough to be assumption-free, which means they are modular enough to rely only on the data, not the model. In this way our models distill insights about what might be wrong with the data itself that prevents a good model or causes overfitting. Looking ahead, as the field of model explanations grows, this may someday answer how to know if your machine learning model is hopeless in an even more general way, beyond forecastability.
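To make the overfitting point concrete, here is a minimal sketch with made-up data (my own toy example, not taken from any of the linked threads): on its own, the training R^2 of a flexible ensemble looks great, and only the gap against held-out data exposes the overfitting.

```python
# Minimal sketch (toy data): training R^2 alone cannot flag overfitting,
# but the gap between training and held-out R^2 can, for any model type.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = X[:, 0] + 0.5 * rng.normal(size=300)  # one real signal feature, the rest is noise

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

r2_train = r2_score(y_tr, model.predict(X_tr))  # looks excellent in isolation
r2_test = r2_score(y_te, model.predict(X_te))   # the train/test gap is the diagnostic
print(f"train R^2 = {r2_train:.2f}, test R^2 = {r2_test:.2f}, gap = {r2_train - r2_test:.2f}")
```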

The Goal:

I started with what is well documented. If everything goes well, I am hoping to expand further into the parts below that I am unsure about.

  • A concept map for neural network methods that are not universal, i.e., methods for dealing with problems specific to certain types of neural networks.
  • An understanding of up-weighting and how it works
  • A concept map for adversarial examples and adversarial modeling
  • A concept map for prototypes and criticisms
  • A concept map for classification-specific anomaly detection methods. I have heard about isolation forests, but am not familiar with them (see the sketch after this list).
  • A concept map for change detection (yes, it needs to be done)
  • A concept map for diagnostic plots that are specific to particular neural network applications and types.
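
Since isolation forests came up in the list above, here is a minimal sketch of what I understand them to do: they score anomalies from the data alone, with no fitted predictive model involved (toy data and default settings; corrections welcome).

```python
# Minimal sketch (toy data): an isolation forest assigns anomaly scores
# using only the data itself, independent of any predictive model.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
X_normal = rng.normal(size=(500, 2))
X_outliers = rng.uniform(low=-6, high=6, size=(10, 2))  # a few obvious outliers
X = np.vstack([X_normal, X_outliers])

iso = IsolationForest(n_estimators=100, random_state=0).fit(X)
scores = iso.decision_function(X)  # lower score = more anomalous
labels = iso.predict(X)            # -1 flags anomalies, +1 flags inliers

print("points flagged as anomalies:", int((labels == -1).sum()))
```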

I was also debating whether to include the following; I am not sure it is a good idea, so feedback would help:

  • A concept map for AUC, AIC, R^2, Adjusted R^2.

Once most of this is done, it will be ready to answer the questions above.

I have migrated this Q to the main site, as it is not about [stats.SE] or the SE system. You may want to work on clarifying & constraining your question, though. It strikes me as potentially too broad & at risk of being closed here as well. – gung - Reinstate Monica Jul 30 '19 at 11:23

0 Answers