Most Popular
1500 questions
16 votes, 2 answers
Binary classification model for unbalanced data
I have a dataset with the following specifications:
Training dataset with 193,176 samples with 2,821 positives
Test Dataset with 82,887 samples with 673 positives
There are 10 features.
I want to perform a binary classification (0 or 1). The issue…
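To put the class imbalance in numbers, here is a minimal sketch that works out the positive rate of each split from the counts quoted above. The variable and function names are illustrative and not part of the question; the always-negative baseline is only a reference point, not a suggested model.

```python
# Counts are taken from the question; the names are illustrative assumptions.
train_total, train_pos = 193_176, 2_821
test_total, test_pos = 82_887, 673


def positive_rate(positives: int, total: int) -> float:
    """Return the fraction of samples that belong to the positive class."""
    return positives / total


train_rate = positive_rate(train_pos, train_total)
test_rate = positive_rate(test_pos, test_total)

# Accuracy of a trivial classifier that always predicts the negative class.
baseline_accuracy = 1 - test_rate

print(f"train positive rate: {train_rate:.4%}")
print(f"test positive rate:  {test_rate:.4%}")
print(f"always-negative test accuracy: {baseline_accuracy:.4%}")
```

Because a constant prediction already scores this high on accuracy, metrics such as precision, recall, or area under the precision-recall curve are likely to be more informative for this kind of data.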
tejaskhot
16 votes, 3 answers
Why does my network need so many epochs to learn?
I'm working on a relation classification task for natural language processing and I have some questions about the learning process. I implemented a convolutional neural network using PyTorch, and I'm trying to select the best hyper-parameters.
The…
user3319400
16 votes, 1 answer
Do you have to normalize data when building decision trees using R?
So, our data set this week has 14 attributes and each column has very different values. One column has values below 1 while another column has values that go from three to four whole digits.
We learned normalization last week and it seems like…
Jae
16 votes, 1 answer
What is the difference between feature generation and feature extraction?
Can anybody tell me what the purpose of feature generation is, and why feature space enrichment is needed before classifying an image? Is it a necessary step?
Is there any method to enrich the feature space?
Saratha Priya
16 votes, 4 answers
What are the implications for training a Tree Ensemble with highly biased datasets?
I have a highly biased binary dataset - I have 1000x more examples of the negative class than the positive class. I would like to train a Tree Ensemble (like Extra Random Trees or a Random Forest) on this data but it's difficult to create training…
gallamine
16 votes, 4 answers
Best way to classify datasets with mixed types of attributes
I would like to know what is the best way to classify a data set composed of mixed types of attributes, for example, textual and numerical. I know I can convert textual to boolean, but the vocabulary is diverse and data become too sparse. I also…
user900
16 votes, 6 answers
How to prepare the varied size input in CNN prediction
I want to build a CNN model in Keras that can be fed images of different sizes. From other questions, I understand how to define the model, e.g. Input =(None,None,3). However, I'm not sure how to prepare the input/output…
kainamanama
16 votes, 6 answers
Are there any good out-of-the-box language models for Python?
I'm prototyping an application and I need a language model to compute perplexity on some generated sentences.
Is there any trained language model in Python I can readily use? Something simple like
model = LanguageModel('en')
p1 =…
Fred
16 votes, 5 answers
What more does TensorFlow offer to Keras?
I'm aware that Keras serves as a high-level interface to TensorFlow.
But it seems to me that Keras can handle many tasks on its own (data input, model creation, training, evaluation).
Furthermore, some of TensorFlow's functionality can be…
Javier
16 votes, 5 answers
Why does adding a dropout layer improve deep/machine learning performance, given that dropout suppresses some neurons from the model?
If removing some neurons results in a better performing model, why not use a simpler neural network with fewer layers and fewer neurons in the first place? Why build a bigger, more complicated model in the beginning and suppress parts of it later?
user781486
16 votes, 3 answers
Zero Mean and Unit Variance
I'm studying Data Scaling, and in particular the Standardization method.
I've understood the math behind it, but it's not clear to me why it's important to give the features zero mean and unit variance.
Can you explain why?
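For reference, standardization itself can be sketched in a few lines. This is a minimal example using only the Python standard library; the two feature columns are made-up values chosen to sit on very different scales, and the `standardize` helper is a hypothetical name, not something from the question.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) variance."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mu) / sigma for v in values]


# Made-up features on very different scales (illustrative only).
small_scale = [0.1, 0.4, 0.7, 0.9]
large_scale = [1200.0, 3400.0, 5600.0, 9800.0]

z_small = standardize(small_scale)
z_large = standardize(large_scale)

# After standardization both features are expressed in the same units:
# the number of standard deviations away from their own mean.
print(fmean(z_small), pstdev(z_small))
print(fmean(z_large), pstdev(z_large))
```

After the transform, neither feature dominates a distance or a gradient step simply because it was recorded in larger units.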
Qwerto
16 votes, 2 answers
One Hot Encoding vs Word Embedding - When to choose one or another?
A colleague of mine is in an interesting situation: he has quite a large set of possible values for a given categorical feature (about 300 different values).
The usual data science approach would be to perform a One-Hot Encoding.
However, wouldn't…
Jonathan DEKHTIAR
16 votes, 2 answers
Word2Vec embeddings with TF-IDF
When you train a word2vec model (using, for instance, gensim), you supply a list of words/sentences. But there does not seem to be a way to specify weights for the words, calculated, for instance, using TF-IDF.
Is the usual practice to multiply the…
SFD
16 votes, 1 answer
Train Accuracy vs Test Accuracy vs Confusion matrix
After developing my predictive model using Random Forest, I get the following metrics:
Train Accuracy :: 0.9764634601043997
Test Accuracy :: 0.7933284397683713
Confusion matrix [[28292 1474]
…
Pedro Alves
16 votes, 3 answers
Is there a rule of thumb for designing neural networks?
I know that a neural network architecture is mostly based on the problem itself and the types of input/output, but still, there's always a "square one" when starting to build one. So my question is: given an input dataset of MxN (M is the number…
shakedzy