Most Popular

1500 questions
25
votes
2 answers

Feature Transformation on Input data

I was reading about the solution to this OTTO Kaggle challenge and the first place solution seems to use several transforms for the input data X, for example Log(X+1), sqrt( X + 3/8), etc. Is there a general guideline on when to apply which kind…
terenceflow
  • 406
  • 1
  • 4
  • 6
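
For readers unfamiliar with the two transforms named in the excerpt above, here is a minimal sketch of what Log(X+1) and sqrt(X + 3/8) do to a row of non-negative count features. The function names and the sample values are made up for illustration; nothing here comes from the winning solution's actual code.

```python
import math


def log1p_transform(row):
    """Apply log(x + 1) to each value; maps 0 to 0 and compresses large counts."""
    return [math.log1p(x) for x in row]


def sqrt_shift_transform(row):
    """Apply sqrt(x + 3/8) to each value, the second transform named in the question."""
    return [math.sqrt(x + 3.0 / 8.0) for x in row]


# Hypothetical count features, not taken from the OTTO data.
counts = [0, 1, 3, 10, 100]
print(log1p_transform(counts))
print(sqrt_shift_transform(counts))
```

Both transforms are monotonic, so they preserve the ordering of values while shrinking the gap between large and small counts.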
25
votes
2 answers

What do the alpha and beta hyperparameters contribute to in Latent Dirichlet allocation?

LDA has two hyperparameters; tuning them changes the induced topics. What do the alpha and beta hyperparameters contribute to LDA? How do the topics change if one or the other hyperparameter increases or decreases? Why are they hyperparameters…
alvas
  • 2,410
  • 7
  • 25
  • 40
25
votes
1 answer

Deep Neural Network - Backpropagation with ReLU

I'm having some difficulty deriving backpropagation with ReLU. I did some work, but I'm not sure if I'm on the right track. Cost Function: $\frac{1}{2}(y-\hat y)^2$ where $y$ is the real value, and $\hat y$ is a predicted value. Also assume…
user1157751
  • 689
  • 1
  • 8
  • 22
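
As a rough illustration of the chain rule for the cost $\frac{1}{2}(y-\hat y)^2$ quoted above, here is a sketch for a single ReLU unit. The network in the question is truncated, so the one-neuron setup $\hat y = \mathrm{ReLU}(wx + b)$ and all names (`w`, `b`, `forward_backward`) are assumptions made for this example, not the asker's architecture.

```python
def relu(z):
    """ReLU activation: max(0, z)."""
    return max(0.0, z)


def relu_grad(z):
    """Derivative of ReLU. Using 0 at z == 0 is a common convention, assumed here."""
    return 1.0 if z > 0 else 0.0


def forward_backward(w, b, x, y):
    """Return (cost, dcost/dw, dcost/db) for y_hat = relu(w * x + b)."""
    z = w * x + b
    y_hat = relu(z)
    cost = 0.5 * (y - y_hat) ** 2
    dcost_dyhat = -(y - y_hat)         # derivative of the cost w.r.t. y_hat
    dcost_dz = dcost_dyhat * relu_grad(z)  # chain rule through the ReLU
    return cost, dcost_dz * x, dcost_dz


print(forward_backward(w=2.0, b=1.0, x=3.0, y=10.0))
```

When the pre-activation is negative, the ReLU derivative is zero and no gradient flows back to `w` or `b`.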
25
votes
4 answers

Is feature engineering still useful when using XGBoost?

I was reading the material related to XGBoost. It seems that this method does not require any variable scaling since it is based on trees, and that it can capture complex non-linear patterns and interactions. And it can handle both numerical and…
KevinKim
  • 635
  • 1
  • 7
  • 13
25
votes
9 answers

How can I get prediction for only one instance in Keras?

When I ask Keras to predict with a fitted model on a new, unlabeled dataset like this: model1.predict_classes(X_test), it works fine. But when I try to make a prediction for only one row, it…
Hendrik
  • 8,587
  • 17
  • 42
  • 55
25
votes
4 answers

Is there any data tidying tool for Python/pandas similar to R's tidyr?

I'm working on a Kaggle challenge where some variables are represented by rows instead of columns (Telstra Network Disruption). I am currently searching for the equivalent of gather(), separate() and spread() from R's tidyr.
cpumar
  • 807
  • 1
  • 9
  • 14
25
votes
5 answers

Clustering based on similarity scores

Assume that we have a set of elements E and a similarity (not distance) function sim(ei, ej) between two elements ei,ej ∈ E. How could we (efficiently) cluster the elements of E, using sim? k-means, for example, requires a given k, Canopy…
vefthym
  • 503
  • 1
  • 6
  • 13
24
votes
2 answers

Choosing between TensorFlow and Theano as backend for Keras

Keras supports both TensorFlow and Theano as backends: what are the pros/cons of choosing one versus the other, besides the fact that currently not all operations are implemented with the TensorFlow backend?
Franck Dernoncourt
  • 5,690
  • 10
  • 40
  • 76
24
votes
2 answers

What's the difference between the cell and hidden state in LSTM?

LSTM cells consist of two types of states, the cell state and hidden state. How do cell and hidden states differ, in terms of their functionality? What information do they carry?
user105907
24
votes
3 answers

Starting my career as a Data Scientist, is Software Engineering experience required?

I am an MSc student at the University of Edinburgh, specializing in machine learning and natural language processing. I had some practical courses focused on data mining, and others dealing with machine learning, Bayesian statistics and graphical…
cpumar
  • 807
  • 1
  • 9
  • 14
24
votes
1 answer

What is Monte Carlo dropout?

I understand how to use MC dropout from this answer, but I don't understand how MC dropout works, what its purpose is, and how it differs from normal dropout.
Arka Mallick
  • 600
  • 2
  • 7
  • 16
24
votes
4 answers

Looking for a good package for anomaly detection in time series

Is there a comprehensive open-source package (preferably in Python or R) that can be used for anomaly detection in time series? There is a one-class SVM package in scikit-learn, but it is not for time series data. I’m looking for more…
pythinker
  • 1,247
  • 2
  • 7
  • 17
24
votes
5 answers

Convolutional neural network overfitting. Dropout not helping

I am playing a little with convnets. Specifically, I am using the Kaggle cats-vs-dogs dataset, which consists of 25,000 images labeled as either cat or dog (12,500 each). I've managed to achieve around 85% classification accuracy on my test set,…
24
votes
3 answers

Python implementation of cost function in logistic regression: why dot multiplication in one expression but element-wise multiplication in another

I have a very basic question which relates to Python, numpy and multiplication of matrices in the setting of logistic regression. First, let me apologise for not using math notation. I am confused about the use of matrix dot multiplication versus…
GhostRider
  • 353
  • 1
  • 2
  • 8
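
The excerpt above is cut off, so the following is only a guess at the kind of code being asked about: a sketch of the logistic regression cost written two ways in NumPy, once with element-wise multiplication followed by a mean, and once with dot products. All names (`sigmoid`, `cost_elementwise`, `cost_dot`, the toy data) are assumptions for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def cost_elementwise(theta, X, y):
    """Cross-entropy cost using element-wise products and an explicit mean."""
    h = sigmoid(X @ theta)  # matrix-vector product: one prediction per row of X
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))


def cost_dot(theta, X, y):
    """Same cost; each dot product multiplies element-wise and sums in one step."""
    h = sigmoid(X @ theta)
    m = y.shape[0]
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m


# Toy data, made up for this example.
X = np.array([[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]])
y = np.array([1.0, 0.0, 1.0])
theta = np.zeros(2)
print(cost_elementwise(theta, X, y), cost_dot(theta, X, y))
```

The two versions agree because a dot product of two 1-D arrays is the sum of their element-wise products; `X @ theta` is a genuine matrix-vector product in both.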
24
votes
6 answers

Keras -- Transfer learning -- changing Input tensor shape

This post seems to indicate that what I want to accomplish is not possible. However, I'm not convinced of this -- given what I've already done, I don't see why what I want to do cannot be achieved... I have two image datasets where one has images…
aweeeezy
  • 501
  • 2
  • 5
  • 9