
Is there a machine learning technique that can be used to detect the slightest change in data? I know this can be done using a hash, but I was just wondering if there is any machine learning technique out there that can do this as well, or close to it.

1 Answer


In comments, OP has explained that this proposal isn't oriented around solving any particular problem. While it might have some utility in a specific use-case, the general idea doesn't make a whole lot of sense.

Suppose your data is represented by a vector, and you want to know if there is a "slight change" (to use OP's words) between vector $x$ and vector $y$. This is easy enough: just measure a norm, such as the L2 norm. If the norm is positive, then there is a change! We can even be more specific about whether a change was "slight" or "large." Choose some $\epsilon > 0$:

  • If $\| x - y \| = 0$, then there is no change.
  • If $0 < \| x - y \| < \epsilon$, then the change is "slight."
  • If $\| x - y \| \geq \epsilon$, then the change is "large."
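The three cases above can be sketched in a few lines of NumPy. The function name `classify_change` and the threshold value are my own choices for illustration:

```python
import numpy as np

def classify_change(x, y, eps=1e-3):
    """Classify the difference between two vectors by its L2 norm."""
    d = np.linalg.norm(x - y)
    if d == 0:
        return "none"
    elif d < eps:
        return "slight"
    else:
        return "large"

x = np.array([1.0, 2.0, 3.0])
print(classify_change(x, x))          # identical vectors -> "none"
print(classify_change(x, x + 1e-6))   # tiny perturbation -> "slight"
print(classify_change(x, x + 1.0))    # big perturbation  -> "large"
```

Note that with floating-point data you would typically never test for an exact zero in practice; the `eps` cutoff does the real work.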

You can do similar comparisons for other types of data.

As a general observation, the purpose of ML is to generalize from a limited sample, so that in the future, new inputs that "look like" inputs used during training are mapped to outputs that are similar to training outputs. In this sense, hashing to detect slight changes and machine learning are opposite, because a robust machine learning method will not be overly sensitive to minute changes in the input (at least, it won't be sensitive to minute changes whenever those minute changes are not important to the output). For example, adversarial inputs are a problem in computer vision, because imperceptibly small changes to the input image can cause dramatic changes to the classification (see: Can't deep learning models now be said to be interpretable? Are nodes features?).

Locality-sensitive hashing (LSH) is a method to group semantically similar objects together in an unsupervised manner by constructing hash codes in a specific way, such as grouping near-duplicate text documents together. Sometimes, LSH uses machine learning methods to achieve this. However, a good LSH scheme should ignore "slight changes" to the input. In the near-duplicate text document example, we want some document and the same document with a handful of misspellings and comma splices to be grouped together. So this is actually the opposite of being sensitive to "slight changes."

Sycorax