Context
I'm following Lewis's (1995) exposition of normalized cross-correlation for template matching (Section 2).
The cross-correlation of the image and the feature at $u,v$ is denoted by $c(u,v)$ and defined as $$ c(u,v) = \sum_{x,y} f(x,y)t(x-u,y-v) $$ where $f$ is the image and the sum is over $x$, $y$ under the window containing the feature $t$ positioned at $u$, $v$. There are several disadvantages to using $c(u,v)$ for template matching:
- If the image energy $\sum_{x,y}f^{2}(x,y)$ varies with position, matching can fail. For example, the correlation between the feature and an exactly matching region in the image may be less than the correlation between the feature and a bright spot.
- The range of $c(u, v)$ is dependent on the size of the feature.
- $c(u,v)$ is not invariant to changes in image amplitude such as those caused by changing lighting conditions across the image sequence.
The normalized cross-correlation $\gamma(u,v)$ overcomes these difficulties by normalizing the image and feature vectors to unit length, yielding a cosine-like correlation coefficient: $$\gamma(u,v) = \frac{\sum_{x,y}[f(x,y)-\bar{f}_{u,v}][t(x-u,y-v)-\bar{t}]}{\{\sum_{x,y}[f(x,y)-\bar{f}_{u,v}]^{2}\sum_{x,y}[t(x-u,y-v)-\bar{t}]^{2}\}^{0.5}}$$ where $\bar{t}$ is the mean of the feature and $\bar{f}_{u,v}$ is the mean of $f(x,y)$ in the region under the feature.
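To make the two definitions concrete, here is a minimal NumPy sketch of $c(u,v)$ and $\gamma(u,v)$ evaluated at a single position. The function names, the row/column indexing of `(u, v)`, the toy image, and the choice to return 0 for a flat window are my own assumptions for illustration, not anything taken from Lewis's paper.

```python
import numpy as np

def cross_correlation(f, t, u, v):
    """Raw c(u, v): sum of f * t over the window of t placed at (u, v)."""
    h, w = t.shape
    window = f[u:u + h, v:v + w]
    return float(np.sum(window * t))

def ncc(f, t, u, v):
    """gamma(u, v): window and template are both demeaned before normalizing."""
    h, w = t.shape
    window = f[u:u + h, v:v + w]
    fw = window - window.mean()   # f(x, y) minus the local window mean
    tc = t - t.mean()             # t minus its own mean
    denom = np.sqrt(np.sum(fw ** 2) * np.sum(tc ** 2))
    # Assumption: a flat window (zero variance) is scored as 0.
    return float(np.sum(fw * tc) / denom) if denom > 0 else 0.0

# Toy data (hypothetical): an exact copy of t at (0, 0), a bright spot at (3, 3).
t = np.array([[1.0, 2.0], [3.0, 4.0]])
f = np.zeros((6, 6))
f[0:2, 0:2] = t
f[3:5, 3:5] = np.array([[10.0, 10.0], [10.0, 9.0]])

c_match, c_bright = cross_correlation(f, t, 0, 0), cross_correlation(f, t, 3, 3)
g_match, g_bright = ncc(f, t, 0, 0), ncc(f, t, 3, 3)
print(c_match, c_bright)   # raw correlation is larger at the bright spot
print(g_match, g_bright)   # gamma is largest at the exact match
```

On this toy image, the raw correlation prefers the bright spot over the exact match, which is the first disadvantage listed above, while $\gamma$ ranks the exact match highest.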
The actual question(s)
I understand why working with normalized feature and image vectors is useful and yields a well-behaved cosine-like measure of similarity. But I'm having a hard time understanding why $\gamma(\cdot)$ normalizes the demeaned feature and image vectors instead of the feature and image vectors themselves. Would $\gamma'(u,v)$ as defined below have desirable template matching properties? Is the centering operation changing the angles of the feature and image vectors? $$\gamma'(u,v) = \frac{\sum_{x,y}f(x,y)\,t(x-u,y-v)}{\{\sum_{x,y}f(x,y)^{2}\sum_{x,y}t(x-u,y-v)^{2}\}^{0.5}}$$
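One way I tried to probe this numerically is the sketch below. It compares $\gamma$ and $\gamma'$ on a single window, so the $(u,v)$ indexing is dropped. The helper names and the three toy windows (an exact match, the same match with a constant brightness offset, and a featureless flat patch) are hypothetical choices of mine.

```python
import numpy as np

def gamma(window, t):
    """Centered version: demean both vectors, then take the cosine."""
    fw = window - window.mean()
    tc = t - t.mean()
    return float(np.sum(fw * tc) / np.sqrt(np.sum(fw ** 2) * np.sum(tc ** 2)))

def gamma_prime(window, t):
    """Uncentered version: cosine of the raw vectors."""
    return float(np.sum(window * t) / np.sqrt(np.sum(window ** 2) * np.sum(t ** 2)))

t = np.array([[1.0, 2.0], [3.0, 4.0]])

match = t.copy()              # exact copy of the template
offset = t + 100.0            # same pattern under a constant brightness offset
flat = np.full((2, 2), 5.0)   # featureless patch (gamma is undefined here: zero variance)

g_match, g_offset = gamma(match, t), gamma(offset, t)
gp_match, gp_offset, gp_flat = (gamma_prime(w, t) for w in (match, offset, flat))

print("gamma :", g_match, g_offset)
print("gamma':", gp_match, gp_offset, gp_flat)
```

In this toy case, $\gamma$ is unchanged by the additive offset, whereas $\gamma'$ drops below 1 for the offset match and gives the flat patch a score close to it. That suggests centering does change the angle between the vectors, but I'd like to understand whether this is the actual motivation.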
References
Lewis, J. P. (1995). Fast Normalized Cross-Correlation. http://scribblethink.org/Work/nvisionInterface/nip.pdf