I saw in the Coursera machine learning classes that it is possible to normalize data in two ways:
data = (data - mean(data)) / (max(data) - min(data))
or you can use the Octave function std(), which computes the standard deviation, and divide by that instead:
data = (data - mean(data)) / std(data)
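For concreteness, here is a minimal sketch of both scalings applied column by column. The sample matrix is made up for illustration, and the variable names (mu, range_X, sigma) are my own. It relies on Octave's automatic broadcasting; older MATLAB versions would need bsxfun.

```octave
% Hypothetical sample data: columns are size (m^2) and number of rooms.
X = [20 2; 150 5; 400 8; 1000 20];

mu      = mean(X);            % column means (1 x 2)
range_X = max(X) - min(X);    % column ranges (1 x 2)
sigma   = std(X);             % column sample standard deviations (1 x 2)

% First approach: subtract the mean, divide by the range.
X_range = (X - mu) ./ range_X;

% Second approach: subtract the mean, divide by the standard deviation.
X_std = (X - mu) ./ sigma;

disp(X_range);
disp(X_std);
```

With either approach each column ends up with mean zero; the first gives every column a range of exactly 1, the second a standard deviation of exactly 1.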
Which normalization is better for a matrix containing house size (20 m² to 1000 m²), number of rooms (2 to 20), and house price (10000 to 15000000)? And why is it better? I'm using linear regression to predict a house's price, with size and number of rooms as the features. However, when I try to plot the data using the plot function in Octave, I get an error saying that the values are too high. If I normalize my data using either approach, I can plot it. So, which is better, and why? When should I use std(), and when the other approach?
() in your first equation. – Nick Cox Mar 18 '16 at 19:35
std() in your notation doesn't change the data; it just calculates a summary measure. The first operation, in the usual interpretation of the equation, just changes the scale and flushes out the units of measurement. Otherwise it does nothing fundamental to affect any regression. – Nick Cox Mar 18 '16 at 19:37
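To illustrate the point that rescaling does nothing fundamental to the regression, here is a small sketch with made-up numbers (not data from the question). It fits ordinary least squares with the backslash operator, once on the raw features and once on standardized features, and compares the fitted prices. The variable names are hypothetical.

```octave
% Hypothetical data: size (m^2), number of rooms, and price.
X = [20 2; 150 5; 400 8; 1000 20; 60 3];
y = [10000; 900000; 3000000; 15000000; 250000];
m = size(X, 1);

% Least-squares fit on the raw features, with an intercept column.
A_raw     = [ones(m, 1), X];
theta_raw = A_raw \ y;
pred_raw  = A_raw * theta_raw;

% Least-squares fit on the standardized features.
X_std     = (X - mean(X)) ./ std(X);
A_std     = [ones(m, 1), X_std];
theta_std = A_std \ y;
pred_std  = A_std * theta_std;

% The coefficients differ, but the fitted prices agree up to rounding error.
disp(max(abs(pred_raw - pred_std)));
```

The coefficients change because they are expressed in different units, but the predictions are the same. Scaling matters mainly for numerical convenience, for example when fitting by gradient descent or when plotting.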