How does gamma in SVM RBF kernel influence the accuracy?

Question

I am working on a classification program using an SVM with an RBF kernel. To find the best parameters C and gamma, I used grid search and got the image below. What confuses me is that when gamma varies from 0.3 to 3, the accuracy changes very rapidly. I wonder what happens in this region.

I think the good models should be found on the diagonal: a higher C with a lower gamma, or a lower C with a higher gamma.

Could anyone help me to explain the performance variation when gamma is between 0.3 and 3?

I didn't do feature scaling for the first image, so the features are not scaled. I guess that could be a reason, because when I normalized the data, the image turned out like this:

The yellow part between gamma=1 and 10 is gone, and the accuracy seems to decrease slightly.

score 3 · Answer 1 · answered Jun 27 '16 at 16:53

For simplicity, first scale your data $X$ so that $median \|X_i - X_j\| \approx 1$: half the neighbors are < 1 away, and half > 1, on average.
What $e^{-gamma\ dist^2}$ does is down-weight (attenuate) more distant neighbors. By how much? Make a little table:

dist:                  [0    .5  1   2    3]
                       ---------------------
exp( - 0.3 * dist^2 ): [100  93  74  30   7] %
exp( -   1 * dist^2 ): [100  78  37   2   0] %
exp( -   3 * dist^2 ): [100  47   5   0   0] %

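So $gamma = 3$ down-weights the more distant half of the points (those with $dist \ge 1$) to between 5 % and 0,
$gamma = 1$ to between 37 % and 0,
and $gamma = 0.3$ attenuates them even less (74 % at $dist = 1$). (The range 0.3 .. 3 is way too big: the kernel changes character completely across it.)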
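The table above can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the helper name `rbf_weight` is made up for illustration.

```python
import math

def rbf_weight(dist, gamma):
    """RBF kernel value exp(-gamma * dist^2) for a given distance."""
    return math.exp(-gamma * dist ** 2)

dists = [0, 0.5, 1, 2, 3]
gammas = [0.3, 1, 3]

# One row per gamma, weights expressed as rounded percentages.
table = {g: [round(100 * rbf_weight(d, g)) for d in dists] for g in gammas}

print("dist:     ", dists)
for g, row in table.items():
    print(f"gamma={g}:", row, "%")
```

Changing `gammas` or `dists` shows how quickly the weights collapse as gamma grows.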

A simple rule of thumb: start with $gamma = 3$, for distances scaled to median 1.
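One possible way to get distances scaled to median 1 is to divide the data by its median pairwise distance. The sketch below assumes NumPy and uses randomly generated placeholder data; `median_pairwise_distance` is a hypothetical helper, not a library function.

```python
import numpy as np

def median_pairwise_distance(X):
    """Median Euclidean distance over all distinct pairs of rows in X."""
    diffs = X[:, None, :] - X[None, :, :]        # shape (n, n, d)
    dists = np.sqrt((diffs ** 2).sum(axis=-1))   # shape (n, n)
    i, j = np.triu_indices(len(X), k=1)          # each pair once, no diagonal
    return np.median(dists[i, j])

# Placeholder data on an arbitrary scale (an assumption, not the asker's data).
rng = np.random.default_rng(0)
X = rng.normal(scale=50.0, size=(200, 5))

m = median_pairwise_distance(X)
X_scaled = X / m
print(median_pairwise_distance(X_scaled))        # close to 1 by construction
```

Since $e^{-gamma\ (dist/m)^2} = e^{-(gamma/m^2)\ dist^2}$, using `gamma / m**2` on the unscaled data gives the same kernel as `gamma` on `X_scaled`. The all-pairs array needs O(n^2 d) memory, so for large data a random subsample of rows is probably enough to estimate the median.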

Could you try $gamma = 2, 3, 4$ for your scaled data?
Also, plotting the sample distributions of $dist = \|X_i - X_j\|$ and $e^{ -gamma\ dist^2 }$ might be useful.
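As a rough sketch of that check, the distributions can be summarized by quantiles before plotting anything. The data here is a random placeholder and NumPy is assumed; with real features, substitute your own `X`.

```python
import numpy as np

# Placeholder data (an assumption); replace with the real feature matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))

# Pairwise Euclidean distances, each pair counted once.
sq = (X ** 2).sum(axis=1)
d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
i, j = np.triu_indices(len(X), k=1)
dist = np.sqrt(d2[i, j])
dist /= np.median(dist)                      # rescale so the median distance is 1

qs = [0.1, 0.25, 0.5, 0.75, 0.9]
print("dist quantiles:", np.round(np.quantile(dist, qs), 2))
for gamma in (0.3, 1, 3):
    k = np.exp(-gamma * dist ** 2)           # kernel value for every pair
    print(f"gamma={gamma}: kernel quantiles", np.round(np.quantile(k, qs), 3))
```

Passing `dist` and `k` to a histogram function (for example `matplotlib.pyplot.hist`) gives the plots themselves.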

denis


Thanks for the answer, denis. However, the features for this image are not scaled. (I tried both scaled and unscaled features, and the latter gave me better prediction accuracy.) When I tried the Gaussian SVM with scaled data, the part with gamma from 1 to 10 was all black, while the left part didn't change much. Also, I don't understand why the median distance is 1 for scaled data. Is there any reference? – Shiyu Jun 28 '16 at 09:41
@Shiyu, instead of scaling, just vary C and gamma over a small white range in your plot, e.g. gamma [.005, .01, .02] and C [10, 100]. See also http://stats.stackexchange.com/questions/81537/gridsearch-for-svm-parameter-estimation . – denis Jun 29 '16 at 08:57
Thank you @denis, I did as you suggested and found another region where the accuracy varied from 60% to 80%. The linked question you suggested is also helpful. – Shiyu Jun 29 '16 at 11:35
You're welcome. Another link, with plots: https://yunhaocsblog.wordpress.com/2014/07/27/the-effects-of-hyperparameters-in-svm/ – denis Jun 29 '16 at 16:07
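
The narrow grid suggested in the comments (gamma in [.005, .01, .02], C in [10, 100]) could be searched roughly as follows. This sketch assumes scikit-learn and uses a synthetic dataset from `make_classification` as a stand-in for the asker's data, so the scores it prints say nothing about the original problem; the fold count is also an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption, not the data from the question).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Small grid around the region of interest, as suggested in the comments.
param_grid = {"C": [10, 100], "gamma": [0.005, 0.01, 0.02]}

search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best:", search.best_params_, round(search.best_score_, 3))
results = zip(search.cv_results_["params"], search.cv_results_["mean_test_score"])
for params, score in results:
    print(params, round(score, 3))
```

Printing every grid point, not just the best one, makes it easier to see how sharply accuracy changes with gamma.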
