Most Popular
1500 questions
14
votes
1 answer
Intuition for the regularization parameter in SVM
How does varying the regularization parameter in an SVM change the decision boundary for a non-separable dataset? A visual answer and/or some commentary on the limiting behaviors (for large and small regularization) would be very helpful.
ASX
14
votes
1 answer
How to measure the similarity between two images?
I have two groups of images, one of cats and one of dogs, each containing 2000 images.
My goal is to cluster the images using k-means.
Assume image1 is x and image2 is y. Here we need to measure the similarity between any…
jason
14
votes
2 answers
Validation vs. test vs. training accuracy. Which one should I compare for claiming overfit?
I have read in several answers here and on the Internet that cross-validation helps indicate whether a model will generalize well and whether it is overfitting.
But I am confused about which two accuracies/errors among…
A.B
14
votes
4 answers
Meaning of "hue" in seaborn barplot
Seaborn barplot has three parameters.
x, y, hue : names of variables in data or vector data, optional
Question
What is hue? It seems to be an attribute to plot, but why is it called "hue"? When I googled it, the results were about color.
Google
Hue -…
mon
14
votes
4 answers
Looking for example infrastructure stacks/workflows/pipelines
I'm trying to understand how all the "big data" components play together in a real-world use case, e.g. Hadoop, MongoDB/NoSQL, Storm, Kafka, ... I know that this is quite a wide range of tools used for different types, but I'd like to get to know…
chrshmmmr
14
votes
3 answers
MAD vs RMSE vs MAE vs MSLE vs R²: When to use which?
In regression problems, you can use various metrics to check how well your model is doing:
Mean Absolute Deviation (MAD): In $[0, \infty)$, the smaller the better
Root Mean Squared Error (RMSE): In $[0, \infty)$, the smaller the…
Martin Thoma
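As a rough illustration of two of the metrics named in this question's title, here is a minimal sketch that computes MAE and RMSE in plain Python. The function names and sample values are hypothetical, chosen only for illustration, and the excerpt's "Mean Absolute Deviation" may be defined differently in the full question.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |y - y_hat|; lies in [0, inf)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Illustrative data (assumed, not from the question): residuals are 1, 0, -2, 0.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 7.0]

mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
print(mae_value, rmse_value)
```

Because RMSE squares the residuals before averaging, it is never smaller than MAE on the same data and reacts more strongly to a few large errors.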
14
votes
3 answers
Can't understand Output shape of a Dense layer - keras
I am following an online tutorial to classify images and started off with dense layers as a starting point to classify cifar10 data.
# Create a model and add layers
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(32, 32,…
BhanuKiran
14
votes
3 answers
Increasing SpaCy max NLP limit
I'm getting this error:
[E088] Text of length 1029371 exceeds maximum of 1000000. The v2.x parser and NER models require roughly 1GB of temporary memory per 100,000 characters in the input. This means long texts may cause memory allocation errors.…
D500
14
votes
3 answers
How does tensor product/multiplication work in TensorFlow?
In TensorFlow, I saw the following example:
import tensorflow as tf
import numpy as np
mat_a = tf.constant(np.arange(1,13, dtype=np.int32), shape=[2,2,3])
mat_b = tf.constant(np.arange(12,24, dtype=np.int32), shape=[2,3,2])
mul_c =…
frt132
14
votes
2 answers
Efficient dimensionality reduction for large dataset
I have a dataset with ~1M rows and ~500K sparse features. I want to reduce the dimensionality to somewhere on the order of 1K-5K dense features.
sklearn.decomposition.PCA doesn't work on sparse data, and I've tried using…
timleathart
14
votes
1 answer
How does the naive Bayes classifier handle missing data in training?
Naive Bayes apparently handles missing data differently, depending on whether they exist in training or testing/classification instances.
When classifying instances, the attribute with the missing value is simply not included in the probability…
matsair
14
votes
2 answers
Interpreting the Root Mean Squared Error (RMSE)!
I read all about the pros and cons of RMSE vs. other absolute errors, namely mean absolute error (MAE). See the following references:
MAE and RMSE — Which Metric is Better?
What's the bottom line? How to compare models
Or this nice blogpost, or this…
TwinPenguins
14
votes
5 answers
When to remove correlated variables
Can somebody please suggest the correct stage at which to remove correlated variables: before or after feature engineering?
bp89
14
votes
2 answers
Why averaging the gradient works in Gradient Descent?
In full-batch gradient descent or minibatch GD, we compute gradients from several training examples. We then average these estimates to get a "high-quality" gradient and finally use it to update the network in a single step.
But why…
Kari
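The averaging step described in this question can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the asker's setup: a one-parameter model `y_hat = w * x`, a squared-error loss `0.5 * (w * x - y) ** 2`, and hypothetical names (`per_example_grad`, `averaged_step`) and data.

```python
def per_example_grad(w, x, y):
    """Gradient of 0.5 * (w*x - y)**2 with respect to w for one example."""
    return (w * x - y) * x

def averaged_step(w, xs, ys, lr):
    """Average the per-example gradients, then apply one update to w."""
    grads = [per_example_grad(w, x, y) for x, y in zip(xs, ys)]
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad, avg_grad

# Illustrative data (assumed): generated by the true weight w = 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = 0.0
w_after_one, first_avg_grad = averaged_step(w, xs, ys, lr=0.1)

# Repeating the averaged update moves w toward the weight that fits the data.
w_final = w
for _ in range(50):
    w_final, _ = averaged_step(w_final, xs, ys, lr=0.1)
print(w_after_one, w_final)
```

In this sketch the averaged gradient equals the gradient of the mean loss over the batch, which is why a single update with it behaves like one step on the batch objective.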
14
votes
1 answer
Stratify on regression
I have worked on classification problems, and stratified cross-validation is one of the most useful and simple techniques I've found. In that case, what it means is to build a training and validation set that have the same proportions of classes of…
David Masip