Most Popular
1500 questions
25
votes
2 answers
Feature Transformation on Input data
I was reading about the solution to this OTTO Kaggle challenge and the first place solution seems to use several transforms for the input data X, for example log(X + 1), sqrt(X + 3/8), etc. Is there a general guideline on when to apply which kind…
terenceflow
- 406
- 1
- 4
- 6
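For readers unfamiliar with the two transforms named in the excerpt above, here is a minimal NumPy sketch. The matrix `X` is made-up, non-negative count data chosen for illustration only; it is not taken from the OTTO dataset, and this is not the winning solution's code.

```python
import numpy as np

# Hypothetical count-valued feature matrix (rows = samples, columns = features).
X = np.array([[0.0, 1.0, 10.0],
              [3.0, 0.0, 100.0]])

# log(X + 1): compresses large counts; log1p stays accurate for values near zero.
X_log = np.log1p(X)

# sqrt(X + 3/8): a square-root transform with a small offset.
X_sqrt = np.sqrt(X + 3.0 / 8.0)

print(X_log)
print(X_sqrt)
```

Both transforms keep zero counts finite and shrink the gap between small and very large values, which is presumably why they show up for skewed count features.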
25
votes
2 answers
What do the alpha and beta hyperparameters contribute to in Latent Dirichlet allocation?
LDA has two hyperparameters; tuning them changes the induced topics.
What do the alpha and beta hyperparameters contribute to LDA?
How do the topics change if one or the other hyperparameter increases or decreases?
Why are they hyperparameters…
alvas
- 2,410
- 7
- 25
- 40
25
votes
1 answer
Deep Neural Network - Backpropagation with ReLU
I'm having some difficulty in deriving back propagation with ReLU, and I did some work, but I'm not sure if I'm on the right track.
Cost Function: $\frac{1}{2}(y-\hat y)^2$ where $y$ is the real value, and $\hat y$ is a predicted value. Also assume…
user1157751
- 689
- 1
- 8
- 22
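The excerpt above is cut off before the network is defined, so the following is only a sketch under loud assumptions: a single neuron with $\hat y = \mathrm{ReLU}(w \cdot x + b)$ and the stated cost $\frac{1}{2}(y-\hat y)^2$. The names `w`, `b`, `x` and the helper `loss_and_grads` are hypothetical and not from the question.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def loss_and_grads(w, b, x, y):
    """Single ReLU neuron (an assumption; the question's network is not shown)."""
    z = w @ x + b                        # pre-activation
    y_hat = relu(z)                      # prediction
    loss = 0.5 * (y - y_hat) ** 2        # cost from the question
    dloss_dyhat = y_hat - y              # d/d y_hat of 0.5 * (y - y_hat)^2
    dyhat_dz = 1.0 if z > 0 else 0.0     # ReLU derivative (taken as 0 at z == 0)
    dz = dloss_dyhat * dyhat_dz          # chain rule
    return loss, dz * x, dz              # loss, dloss/dw, dloss/db

w = np.array([0.5, -0.2])
b = 0.1
x = np.array([1.0, 2.0])
y = 1.0
loss, grad_w, grad_b = loss_and_grads(w, b, x, y)
print(loss, grad_w, grad_b)
```

The only ReLU-specific step is `dyhat_dz`: the gradient passes through unchanged when the pre-activation is positive and is blocked otherwise.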
25
votes
4 answers
Is feature engineering still useful when using XGBoost?
I was reading the material related to XGBoost. It seems that this method does not require any variable scaling since it is based on trees, and that it can capture complex non-linear patterns and interactions. And it can handle both numerical and…
KevinKim
- 635
- 1
- 7
- 13
25
votes
9 answers
How can I get prediction for only one instance in Keras?
When I ask Keras to predict with a fitted model on a new, unlabeled dataset like this:
model1.predict_classes(X_test)
it works fine. But when I try to make a prediction for only one row, it…
Hendrik
- 8,587
- 17
- 42
- 55
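The excerpt above is truncated before the error is shown, so this sketch assumes the usual cause: indexing a single row drops the batch dimension the model expects. It uses only NumPy to show the shapes; `X_test` here is a made-up array, and the `model1` call appears only in a comment because the fitted model is not part of the excerpt.

```python
import numpy as np

# Hypothetical test set: 3 rows, 4 features each.
X_test = np.arange(12, dtype=float).reshape(3, 4)

row = X_test[0]                          # shape (4,): batch dimension is gone
batch_a = X_test[0:1]                    # shape (1, 4): slicing keeps it
batch_b = row.reshape(1, -1)             # shape (1, 4): restore it explicitly
batch_c = np.expand_dims(row, axis=0)    # shape (1, 4): same result

print(row.shape, batch_a.shape, batch_b.shape, batch_c.shape)

# With a fitted model, any of the (1, 4) arrays could then be passed in,
# e.g. model1.predict(batch_a)  (not run here; model1 is not defined).
```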
25
votes
4 answers
Is there any data tidying tool for Python/pandas similar to R's tidyr tool?
I'm working on a Kaggle challenge where some variables are represented by rows instead of columns (Telstra Network Disruption). I am currently searching for the equivalent of gather(), separate() and spread(), which can be found in R tidyr tool.
cpumar
- 807
- 1
- 9
- 14
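As a rough illustration of the three tidyr verbs named above, here is a pandas sketch. The pairing `gather` to `melt`, `separate` to `str.split(expand=True)`, and `spread` to `pivot` is a commonly used correspondence, not an exact one-to-one mapping, and the small frame `wide` is invented for the example (it is not the Telstra data).

```python
import pandas as pd

# Hypothetical wide table: one column per measured feature.
wide = pd.DataFrame({"id": [1, 2],
                     "feature_a": [10, 30],
                     "feature_b": [20, 40]})

# Roughly gather(): wide -> long, one row per (id, key) pair.
long = wide.melt(id_vars="id", var_name="key", value_name="value")

# Roughly separate(): split one string column into several columns.
long[["kind", "name"]] = long["key"].str.split("_", expand=True)

# Roughly spread(): long -> wide again, one column per key.
wide_again = long.pivot(index="id", columns="key", values="value").reset_index()

print(long)
print(wide_again)
```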
25
votes
5 answers
Clustering based on similarity scores
Assume that we have a set of elements E and a similarity (not distance) function sim(ei, ej) between two elements ei,ej ∈ E.
How could we (efficiently) cluster the elements of E, using sim?
k-means, for example, requires a given k, Canopy…
vefthym
- 503
- 1
- 6
- 13
24
votes
2 answers
Choosing between TensorFlow or Theano as backend for Keras
Keras supports both TensorFlow and Theano as backend: what are the pros/cons of choosing one versus the other, besides the fact that currently not all operations are implemented with the TensorFlow backend?
Franck Dernoncourt
- 5,690
- 10
- 40
- 76
24
votes
2 answers
What's the difference between the cell and hidden state in LSTM?
LSTM cells consist of two types of states, the cell state and hidden state.
How do cell and hidden states differ, in terms of their functionality? What information do they carry?
user105907
24
votes
3 answers
Starting my career as a Data Scientist, is Software Engineering experience required?
I am an MSc student at the University of Edinburgh, specialized in machine learning and natural language processing. I had some practical courses focused on data mining, and others dealing with machine learning, Bayesian statistics and graphical…
cpumar
- 807
- 1
- 9
- 14
24
votes
1 answer
What is Monte Carlo dropout?
I understand how to use MC dropout from this answer, but I don't understand how MC dropout works, what its purpose is, and how it differs from normal dropout.
Arka Mallick
- 600
- 2
- 7
- 16
24
votes
4 answers
Looking for a good package for anomaly detection in time series
Is there a comprehensive open source package (preferably in Python or R) that can be used for anomaly detection in time series?
There is a one-class SVM in scikit-learn, but it is not designed for time series data. I’m looking for more…
pythinker
- 1,247
- 2
- 7
- 17
24
votes
5 answers
Convolutional neural network overfitting. Dropout not helping
I am playing a little with convnets. Specifically, I am using the Kaggle cats-vs-dogs dataset, which consists of 25000 images labeled as either cat or dog (12500 each).
I've managed to achieve around 85% classification accuracy on my test set,…
Juan Antonio Gomez Moriano
- 1,197
- 1
- 8
- 17
24
votes
3 answers
Python implementation of cost function in logistic regression: why dot multiplication in one expression but element-wise multiplication in another
I have a very basic question which relates to Python, numpy and multiplication of matrices in the setting of logistic regression.
First, let me apologise for not using math notation.
I am confused about the use of matrix dot multiplication versus…
GhostRider
- 353
- 1
- 2
- 8
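The dot-versus-element-wise distinction in the question above can be seen in a small NumPy sketch. The data, `theta`, and the `sigmoid` helper are assumptions made for illustration; the asker's actual code is not shown in the excerpt.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical design matrix (first column is the intercept), labels, parameters.
X = np.array([[1.0, 0.5],
              [1.0, -1.5],
              [1.0, 2.0]])
y = np.array([1.0, 0.0, 1.0])
theta = np.array([0.1, 0.3])
m = len(y)

# Matrix-vector (dot) product: one linear score per sample.
h = sigmoid(X @ theta)

# Element-wise products followed by an explicit sum over samples...
cost_elementwise = -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m

# ...equal dot products of 1-D vectors, since a dot product multiplies and sums.
cost_dot = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m

print(cost_elementwise, cost_dot)
```

In this setup the dot product is needed in `X @ theta` because features are combined per sample, while in the cost the dot product is just a compact way of writing multiply-then-sum.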
24
votes
6 answers
Keras -- Transfer learning -- changing Input tensor shape
This post seems to indicate that what I want to accomplish is not possible. However, I'm not convinced of this -- given what I've already done, I don't see why what I want to do cannot be achieved...
I have two image datasets where one has images…
aweeeezy
- 501
- 2
- 5
- 9