Most Popular

1500 questions
11
votes
1 answer

Keras LSTM with 1D time series

I'm learning how to use Keras and I've had reasonable success with my labelled dataset using the examples in Chollet's Deep Learning with Python. The dataset is ~1000 time series of length 3125 with 3 potential classes. I'd like to go beyond the…
user1147964
  • 165
  • 1
  • 3
  • 8
11
votes
2 answers

How do attention mechanisms in RNNs learn weights for a variable length input

Attention mechanisms in RNNs are reasonably common in sequence-to-sequence models. I understand that the decoder learns a weight vector $\alpha$ which is used to form a weighted sum of the output vectors from the encoder network. This is used to…
davidparks21
  • 423
  • 4
  • 17
11
votes
3 answers

What is a policy in machine learning?

While I was reading the paper "Grounded Action Transformation for Robot Learning in Simulation", I came across the term "policy". Could someone explain to me what that actually is (in general and in the particular context of the paper)?
Ramya Raj
  • 111
  • 1
  • 4
11
votes
2 answers

Trying to use TensorFlow to predict financial time series data

I'm new to ML and TensorFlow (I started a few hours ago), and I'm trying to use it to predict the next few data points in a time series. I'm taking my input and doing this with it: /----------- x…
Isvara
  • 211
  • 2
  • 5
11
votes
1 answer

Data preprocessing: Should we normalise images pixel-wise?

Let me present a toy example and some reasoning I had about image normalisation: Suppose we have a CNN architecture to classify NxN grayscale images into two categories. Pixel values range from 0 (black) to 255 (white). Class 0: Images that…
lucasrodesg
  • 235
  • 2
  • 7
11
votes
2 answers

Difference between RMSProp with momentum and Adam Optimizers

According to this scintillating blog post, Adam is very similar to RMSProp with momentum. From the TensorFlow documentation we see that tf.train.RMSPropOptimizer has the following parameters: __init__( learning_rate, decay=0.9, momentum=0.0, …
hans
  • 253
  • 1
  • 3
  • 9
11
votes
1 answer

What are "VGG54" and "VGG22" derived from the VGG19 CNN?

In the paper Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network by Christian Ledig et al., the distance between images (used in the loss function) is calculated from feature maps extracted from the VGG19 network.…
Lafayette
  • 604
  • 5
  • 19
11
votes
2 answers

Find optimal P(X|Y) given I have a model that has good performance when trained on P(Y|X)

Input data: $X$ -> features of a t-shirt (colour, logo, etc.); $Y$ -> profit margin. I have trained a random forest on the above $X$ and $Y$ and have achieved reasonable accuracy on test data. So, I have $P(Y|X)$. Now, I would like to find $P(X|Y)$, i.e…
claudius
  • 153
  • 8
11
votes
1 answer

Difference between interpolate() and fillna() in pandas

Since the interpolate and fillna methods both do the same work of filling NA values, what is the basic difference between the two? What is the significance of having these two different methods? Can anyone explain it to me in layman's terms? I already visited…
Debuggerrr
  • 215
  • 1
  • 2
  • 8
11
votes
2 answers

Consequence of Feature Scaling

I am currently using SVM and scaling my training features to the range of [0,1]. I first fit/transform my training set and then apply the same transformation to my testing set. For example: ### Configure transformation and apply to training set …
mike1886
  • 933
  • 9
  • 17
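
The workflow described in the excerpt above (fit the scaling on the training set, then reuse the same transformation on the test set) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the asker's code: the helper names `fit_min_max` and `transform_min_max` and the toy data are hypothetical, and a library scaler would normally be used instead.

```python
def fit_min_max(rows):
    """Learn per-feature minimum and maximum from the training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def transform_min_max(rows, mins, maxs):
    """Scale rows with previously learned statistics (no refitting)."""
    scaled = []
    for row in rows:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0  # guard constant features
            for x, lo, hi in zip(row, mins, maxs)
        ])
    return scaled


# Made-up toy data (an assumption for illustration), two features per sample.
X_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
X_test = [[0.0, 25.0], [4.0, 40.0]]

# Statistics come from the training set only ...
mins, maxs = fit_min_max(X_train)
X_train_scaled = transform_min_max(X_train, mins, maxs)
# ... and are reused unchanged on the test set.
X_test_scaled = transform_min_max(X_test, mins, maxs)
```

With this toy data the scaled training features lie in [0, 1], while some scaled test values fall outside that range, because the test set contains values beyond the training minimum and maximum.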
11
votes
3 answers

What regression to use to calculate the result of an election in a multiparty system?

I want to make a prediction for the result of the parliamentary elections. My output will be the % each party receives. There are more than 2 parties, so logistic regression is not a viable option. I could make a separate regression for each party but…
Viktor
  • 850
  • 1
  • 6
  • 17
11
votes
2 answers

When do we say that the dataset is not classifiable?

I have many times analysed a dataset on which I could not really do any sort of classification. To see whether I can get a classifier I have usually used the following steps: Generate box plots of label against numerical values. Reduce the…
vc_dim
  • 188
  • 9
11
votes
4 answers

Using Clustering in text processing

Hi, this is my first question on the Data Science stack. I want to create an algorithm for text classification. Suppose I have a large set of texts and articles, let's say around 5000 plain texts. I first use a simple function to determine the…
Rashid
  • 213
  • 1
  • 4
11
votes
3 answers

Relationship between KS, AUROC, and Gini

Common model validation statistics like the Kolmogorov–Smirnov test (KS), AUROC, and Gini coefficient are all functionally related. However, my question has to do with proving how these are all related. I am curious if anyone can help me prove these…
Steven
  • 111
  • 1
  • 1
  • 3
11
votes
3 answers

Can GPS coordinates (latitude and longitude) be used as features in a linear model?

I have data sets that contain, among many features, GPS coordinates (latitude and longitude). I'd like to use these data sets to explore problems such as: (1) computing ETA to drive between start and end points; and (2) estimating the amount of…