Naturally, the answer to the former would be 0 in the case of a perfect fit. But how would one go about finding the value for the worst possible fit?
Background: I'm doing a simple experiment to crack a Caesar cipher, which is in no way more efficient than just brute-forcing the 26 possible shifts, but that's beside the point. Anyway, the strategy is to generate a histogram of the characters in an encoded text and, from that, a probability vector containing the relative frequencies of all letters in the alphabet. So the observation count is the number of letters in said text, and there are 26 character frequencies to check against.
After doing the same with a longer example text, and thus generating a vector of expected occurrences, I keep chi-squaring both vectors while re-encoding the ciphered text with each of the 26 possible shifts, to arrive at the most probable shift: the one with the minimum chi-squared value.
So at every iteration I check against the current minimum chi-squared value and update it if appropriate, and I'd like to initialize that value with its theoretical maximum, which brings me to the question of how to determine that.
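To make the procedure concrete, here is a rough sketch of the loop described above. All function names are made up for illustration, letters the reference text never contains are simply skipped (an assumption, to avoid dividing by zero), and the running minimum starts at infinity purely as a placeholder, since the theoretical maximum is exactly what I'm unsure about.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase  # the 26 categories


def letter_counts(text):
    """Histogram of a-z in text, as a list of 26 counts."""
    counts = Counter(c for c in text.lower() if c in ALPHABET)
    return [counts[c] for c in ALPHABET]


def shift_text(text, shift):
    """Apply a Caesar shift to the lowercased text; other characters are kept."""
    shifted = ALPHABET[shift:] + ALPHABET[:shift]
    return text.lower().translate(str.maketrans(ALPHABET, shifted))


def chi_squared(observed, expected_probs):
    """Pearson statistic: sum of (O - E)^2 / E with E = N * p."""
    n = sum(observed)
    # Assumption: categories with p == 0 are skipped to avoid division by zero.
    return sum(
        (o - n * p) ** 2 / (n * p)
        for o, p in zip(observed, expected_probs)
        if p > 0
    )


def best_shift(ciphertext, reference_text):
    """Try all 26 shifts and return (shift, statistic) for the smallest statistic."""
    ref_counts = letter_counts(reference_text)
    total = sum(ref_counts)
    expected_probs = [c / total for c in ref_counts]

    best, best_stat = None, float("inf")  # placeholder start value
    for shift in range(26):
        candidate = shift_text(ciphertext, shift)
        stat = chi_squared(letter_counts(candidate), expected_probs)
        if stat < best_stat:
            best, best_stat = shift, stat
    return best, best_stat
```

Note that the shift returned is the one applied to the ciphertext, so a text encoded with a shift of 3 should come back with 23 (that is, 26 - 3).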
From what I've gathered, it's supposed to be N(k-1), with N = number of observations (so in this case the total number of characters in the encoded text?) and k being the size of the smaller of the two vectors (both 26 in this case?). But the value I arrive at with said formula seems far too high. Am I getting the variables mixed up? Stumped.
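One way to sanity-check the N(k-1) figure is to compute the statistic directly for the most lopsided case I can think of: every observation landing in a single category. The sketch below assumes uniform expected frequencies (1/k each), which is not what real letter frequencies look like, so it only checks the formula in that special case; with a non-uniform expected vector the number may well come out differently. The values of `n` and `k` are arbitrary.

```python
def chi_squared(observed, expected_probs):
    """Pearson statistic: sum of (O - E)^2 / E with E = N * p."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_probs))


n, k = 100, 26                    # arbitrary example: 100 letters, 26 categories
uniform = [1 / k] * k             # assumption: uniform expected frequencies
all_in_one = [n] + [0] * (k - 1)  # every observation in one category

stat = chi_squared(all_in_one, uniform)
formula = n * (k - 1)

print(stat, formula)  # the two agree up to floating-point rounding
```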
I reckon this is a rather noobish question, and I apologize. Thanks a lot in advance to anyone caring to answer!