Most Popular

1500 questions
18
votes
1 answer

How to handle a zero factor in Naive Bayes Classifier calculation?

Suppose I train a Naive Bayes classifier on a training data set and one attribute value has probability zero. How do I handle this when I later want to predict the classification of new data? The problem is, if there is a zero in…
fragant
  • 333
  • 1
  • 2
  • 6
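
The zero factor described in this excerpt can be reproduced in a few lines. The sketch below is illustrative only: the toy data, the `likelihood` helper, and the choice of additive (Laplace) smoothing with `alpha=1.0` are assumptions, not taken from the question or its answer. Additive smoothing is one common remedy, shown here to make the problem concrete.

```python
from collections import Counter

def likelihood(value, observed, n_categories, alpha=0.0):
    """Estimate P(value | class) from the values observed for one class.

    alpha is the additive-smoothing constant; alpha=0.0 gives the raw
    relative frequency, which is zero for any value never seen in training.
    """
    counts = Counter(observed)
    return (counts[value] + alpha) / (len(observed) + alpha * n_categories)

# Hypothetical toy data: one attribute's values for a single class.
observed = ["red", "red", "blue", "red"]
categories = ["red", "blue", "green"]

# "green" never occurs, so the raw estimate is zero and would zero out
# the whole product of per-attribute likelihoods.
unsmoothed = likelihood("green", observed, len(categories))

# With alpha=1.0 every category gets a pseudo-count, so no factor is zero.
smoothed = likelihood("green", observed, len(categories), alpha=1.0)
```

With smoothing, the estimates over all three categories still sum to one, so they remain a valid probability distribution.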
18
votes
2 answers

How many images per class are sufficient for training a CNN

I'm starting a project where the task is to identify sneaker types from images. I'm currently reading into TensorFlow and Torch implementations. My question is: how many images per class are required to reach a reasonable classification…
Feynman27
  • 301
  • 1
  • 2
  • 5
18
votes
2 answers

List of feature engineering techniques

Is there any resource with a list of feature engineering techniques? A mapping of type of data, model and feature engineering technique would be a gold mine.
icm
  • 529
  • 2
  • 5
  • 9
18
votes
4 answers

Question about bias in Convolutional Networks

I am trying to figure out how many weights and biases are needed for a CNN. Say I have a (3, 32, 32)-image and want to apply a (32, 5, 5)-filter. For each feature map I have 5x5 weights, so I should have 3 x (5x5) x 32 parameters. Now I need to add…
user
  • 1,993
  • 6
  • 21
  • 38
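
The weight count in this excerpt can be checked with a short calculation. This is a sketch under one assumption not stated in the excerpt: that the layer uses one bias per output feature map, which is the usual convention in common deep learning frameworks. The helper name `conv_param_count` is hypothetical.

```python
def conv_param_count(in_channels, out_channels, kernel_h, kernel_w, bias=True):
    """Count the parameters of a single 2D convolution layer.

    Assumes one bias per output feature map (an assumption, not something
    the excerpt states).
    """
    weights = in_channels * kernel_h * kernel_w * out_channels
    biases = out_channels if bias else 0
    return weights, biases

# The excerpt's example: a 3-channel image and 32 filters of size 5x5.
weights, biases = conv_param_count(in_channels=3, out_channels=32,
                                   kernel_h=5, kernel_w=5)
```

Note that the 32x32 spatial size of the image does not enter the count, because the same filter weights are shared across all positions.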
18
votes
5 answers

Merging sparse and dense data in machine learning to improve the performance

I have sparse features that are predictive, and I also have some dense features that are predictive. I need to combine these features to improve the overall performance of the classifier. The thing is, when I try to combine these…
Sagar Waghmode
  • 231
  • 2
  • 7
17
votes
5 answers

Number of epochs in Gensim Word2Vec implementation

There's an iter parameter in the gensim Word2Vec implementation class gensim.models.word2vec.Word2Vec(sentences=None, size=100, alpha=0.025, window=5, min_count=5, max_vocab_size=None, sample=0, seed=1, workers=1, min_alpha=0.0001, sg=1, hs=1,…
alvas
  • 2,410
  • 7
  • 25
  • 40
17
votes
1 answer

Algorithms for text clustering

I need to cluster a huge number of sentences into groups by their meanings. This is similar to the problem where you have lots of sentences and want to group them by meaning. What algorithms are suggested for this? I don't know…
Maxim Galushka
  • 303
  • 1
  • 2
  • 7
17
votes
5 answers

Decision tree vs. KNN

In which cases is it better to use a decision tree, and in which a KNN? Why use one of them in certain cases and the other in different cases? (Looking at functionality, not at the algorithm.) Does anyone have some explanations or references…
gchavez1
  • 173
  • 1
  • 1
  • 4
17
votes
3 answers

With unbalanced classes, do I have to use undersampling on my validation/testing datasets?

I’m a beginner in machine learning and I’m facing a situation. I’m working on a Real Time Bidding problem with the IPinYou dataset, and I’m trying to do click prediction. The thing is that, as you may know, the dataset is very unbalanced: around…
jmvllt
  • 619
  • 1
  • 8
  • 15
17
votes
3 answers

Bagging vs Dropout in Deep Neural Networks

Bagging is the generation of multiple predictors that work together as an ensemble, acting as a single predictor. Dropout is a technique that teaches a neural network to average over all possible subnetworks. Looking at the most important Kaggle competitions, it seems that…
emanuele
  • 415
  • 1
  • 4
  • 8
17
votes
2 answers

Recommending movies with additional features using collaborative filtering

I am trying to build a recommendation system using collaborative filtering. I have the usual [user, movie, rating] information. I would like to incorporate an additional feature like 'language' or 'duration of movie'. I am not sure what techniques I…
Sidhha
  • 397
  • 3
  • 10
17
votes
5 answers

Should I use a decision tree or logistic regression for classification?

I am working on a classification problem. I have a dataset containing equal numbers of categorical variables and continuous variables. How do I decide which technique to use, between a decision tree and logistic regression? Is it right to assume…
Arun
  • 717
  • 3
  • 10
  • 27
17
votes
5 answers

Detecting cats visually by means of anomaly detection

I have a hobby project which I am contemplating committing to as a way of increasing my so far limited experience of machine learning. I have taken and completed the Coursera MOOC on the topic. My question is with regards to the feasibility of the…
Frost
  • 273
  • 2
  • 5
17
votes
2 answers

What is the use of [SEP] in the BERT paper?

I know that [CLS] means the start of a sentence and [SEP] makes BERT know the second sentence has begun. However, I have a question. If I have 2 sentences, which are s1 and s2, and our fine-tuning task is the same. In one way, I add special tokens…
xiangqing shen
  • 171
  • 1
  • 1
  • 3
17
votes
3 answers

Word2Vec: how to choose the embedding size parameter

I'm running word2vec over a collection of documents. I understand that the size of the model is the number of dimensions of the vector space that the word is embedded into, and that different dimensions are somewhat related to different, independent…
Neil
  • 257
  • 1
  • 2
  • 8