Most Popular

1500 questions
14 votes · 3 answers

How can I perform stratified sampling for multi-label multi-class classification?

I am asking this question for a few reasons: the dataset at hand is imbalanced, and I used the code below: x = dataset[['Message']] y = dataset[['Label1', 'Label2']] train_data, test_data = train_test_split(x, test_size = 0.1, stratify=y, random_state =…
Divyanshu Shekhar · 569 · 1 · 5 · 15
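
A minimal sketch of one common workaround, assuming the column names from the excerpt (Message, Label1, Label2) and a toy dataframe in place of the asker's data: stratify on a composite key built from both label columns. For truly multi-label data, iterative stratification (e.g. scikit-multilearn's iterative_train_test_split) is the usual alternative.

    import pandas as pd
    from sklearn.model_selection import train_test_split

    # Toy stand-in for the question's `dataset` (columns Message, Label1, Label2).
    dataset = pd.DataFrame({
        'Message': [f'text {i}' for i in range(100)],
        'Label1': [i % 2 for i in range(100)],
        'Label2': [(i // 2) % 2 for i in range(100)],
    })

    x = dataset[['Message']]
    y = dataset[['Label1', 'Label2']]

    # Stratify on a composite key so every (Label1, Label2) combination keeps
    # roughly the same proportion in the train and test splits.
    strat_key = y['Label1'].astype(str) + '_' + y['Label2'].astype(str)

    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.1, stratify=strat_key, random_state=42)
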
14 votes · 2 answers

Prioritized Replay, what does Importance Sampling really do?

I can't understand the purpose of importance-sampling (IS) weights in Prioritized Replay (page 5). A transition is more likely to be sampled from experience replay the larger its "cost" is. My understanding is that 'IS' helps with smoothly…
Kari · 2,726 · 2 · 20 · 49
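
For reference, a minimal sketch of the importance-sampling weights from the paper, $w_i = (N \cdot P(i))^{-\beta}$ normalised by the largest weight, using made-up priorities:

    import numpy as np

    priorities = np.array([0.1, 0.5, 2.0, 4.0])  # hypothetical TD-error priorities
    probs = priorities / priorities.sum()        # P(i): sampling probabilities
    N = len(priorities)
    beta = 0.4                                   # annealed towards 1 during training

    weights = (N * probs) ** (-beta)
    weights /= weights.max()                     # keep weights <= 1 for stability

    # Frequently sampled (high-priority) transitions get the smallest weights,
    # which scales down their gradient contribution and undoes the sampling bias.
    print(weights)
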
14 votes · 1 answer

How to predict future values over a time horizon with Keras?

I just built this LSTM neural network with Keras: import numpy as np import pandas as pd from sklearn import preprocessing from keras.layers.core import Dense, Dropout, Activation from keras.activations import linear from…
Nbenz · 283 · 2 · 3 · 7
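
A minimal sketch of one common approach, recursive multi-step forecasting, with a made-up sine series and standalone keras imports (tf.keras would only change the import lines): predict one step, append it to the input window, and repeat.

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # Hypothetical setup: windows of the last 10 values predict the next value.
    window = 10
    model = Sequential([
        LSTM(32, input_shape=(window, 1)),
        Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')
    # ... model.fit(X_train, y_train, ...) would go here ...

    # Recursive multi-step forecast: feed each prediction back in as input.
    history = np.sin(np.linspace(0, 20, 100))    # made-up series
    last_window = history[-window:].reshape(1, window, 1)
    forecast = []
    for _ in range(5):                           # predict 5 steps ahead
        next_value = model.predict(last_window, verbose=0)[0, 0]
        forecast.append(next_value)
        last_window = np.append(last_window[:, 1:, :],
                                [[[next_value]]], axis=1)
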
14 votes · 4 answers

Big data case study or use case example

I have read a lot of blogs/articles on how different types of industries are using big data analytics, but most of these articles fail to mention: what kind of data these companies used, what the size of the data was, what kind of tools and technologies they…
Brown_Dynamite · 241 · 2 · 6
14 votes · 3 answers

Why is there a $2$ in the denominator of the mean squared error function?

In the famous Deep Learning Book, in chapter 1, equation 6, the Quadratic Cost (or Mean Squared Error) in a neural network is defined as $ C(w, b) = \frac{1}{2n}\sum_{x}||y(x)-a||^2 $ where $w$ is the set of all weights and $b$ the set of all…
Silas Berger · 161 · 1 · 5
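
The usual justification the question is pointing at: the $\frac{1}{2}$ is purely for convenience, because it cancels the factor of $2$ produced by differentiating the square, and rescaling a cost by a positive constant does not change where its minimum is. For a single training example $x$, using the cost quoted above,

$$
\frac{\partial}{\partial a}\,\frac{1}{2}\,\|y(x)-a\|^{2} \;=\; a - y(x),
$$

whereas without the $\frac{1}{2}$ the gradient would carry an extra factor of $2$.
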
14 votes · 4 answers

Is PCA considered a machine learning algorithm?

I've understood that principal component analysis is a dimensionality reduction technique, i.e. given 10 input features, it will produce a smaller number of independent features that are orthogonal linear transformations of the original features. Is…
Victor · 611 · 3 · 8 · 19
14 votes · 2 answers

How to implement "one-to-many" and "many-to-many" sequence prediction in Keras?

I struggle to interpret the Keras coding difference for one-to-many (e.g. classification of single images) and many-to-many (e.g. classification of image sequences) sequence labeling. I frequently see two different kinds of code: Type 1 is where…
Hendrik · 8,587 · 17 · 42 · 55
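
For orientation, a sketch (made-up sizes, standalone keras imports) of how the cases usually differ: return_sequences=True plus TimeDistributed for many-to-many, and RepeatVector for one-to-many.

    from keras.models import Sequential
    from keras.layers import LSTM, Dense, TimeDistributed, RepeatVector

    timesteps, features, n_classes = 8, 16, 5

    # Many-to-one: the LSTM returns only its last state, one label per sequence.
    many_to_one = Sequential([
        LSTM(32, input_shape=(timesteps, features)),
        Dense(n_classes, activation='softmax'),
    ])

    # Many-to-many: return_sequences=True keeps one output per timestep, and
    # TimeDistributed applies the same Dense classifier at every step.
    many_to_many = Sequential([
        LSTM(32, return_sequences=True, input_shape=(timesteps, features)),
        TimeDistributed(Dense(n_classes, activation='softmax')),
    ])

    # One-to-many: a single input vector is repeated and decoded into a sequence.
    one_to_many = Sequential([
        RepeatVector(timesteps, input_shape=(features,)),
        LSTM(32, return_sequences=True),
        TimeDistributed(Dense(n_classes, activation='softmax')),
    ])
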
14 votes · 3 answers

After grouping to the minimum value in pandas, how do I display the entire matching row along with the min() value?

The dataframe contains
>> df
             A       B                 C
A
196512  196512    1325  12.9010511000000
196512  196512  114569  12.9267705000000
196512  196512  118910  12.8983353775637
196512  196512  100688  12.9505091000000
196795  196795  …
Sam Joe · 173 · 1 · 1 · 10
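
A minimal sketch of one common idiom, using a toy frame with the question's column names: groupby(...)['C'].idxmin() gives the row labels of the per-group minima, and .loc pulls back the full rows.

    import pandas as pd

    # Toy frame with the same column names as in the question.
    df = pd.DataFrame({
        'A': [196512, 196512, 196512, 196795],
        'B': [1325, 114569, 118910, 100688],
        'C': [12.9010511, 12.9267705, 12.8983354, 12.9505091],
    })

    # idxmin() returns the row label of the minimum C within each group of A,
    # and .loc pulls back those full rows (not just the minimum value).
    rows_with_min_c = df.loc[df.groupby('A')['C'].idxmin()]
    print(rows_with_min_c)
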
14 votes · 3 answers

How can I fit categorical data types for random forest classification?

I need to find the accuracy of a training dataset by applying the Random Forest algorithm, but my dataset contains both categorical and numeric features. When I try to fit the data, I get an error: 'Input contains NaN, infinity or a value too…
IS2057 · 295 · 1 · 7 · 21
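
A minimal sketch of one common fix, with a hypothetical mixed-type frame: impute the missing values that trigger the 'Input contains NaN' error, then one-hot encode the categorical column before fitting.

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    # Hypothetical frame: one categorical and one numeric feature, with gaps.
    df = pd.DataFrame({
        'color':  ['red', 'blue', 'green', 'blue', None],
        'amount': [1.0, 2.5, None, 4.0, 5.5],
        'target': [0, 1, 0, 1, 1],
    })

    # 1) Fill the missing values that cause the NaN error.
    df['color'] = df['color'].fillna('missing')
    df['amount'] = df['amount'].fillna(df['amount'].median())

    # 2) One-hot encode the categorical column so everything is numeric.
    X = pd.get_dummies(df[['color', 'amount']], columns=['color'])
    y = df['target']

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, y)
    print(clf.score(X, y))
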
14 votes · 1 answer

Input normalization for ReLU?

Let's assume a vanilla MLP for classification with a given activation function for hidden layers. I know it is a known best practice to normalize the input of the network between 0 and 1 if sigmoid is the activation function and -0.5 and 0.5 if tanh…
Taiko · 243 · 1 · 2 · 6
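
The question is specific to ReLU, but a common activation-agnostic sketch (hypothetical data, scikit-learn's StandardScaler) is to standardize each feature to zero mean and unit variance, fitting the scaler on the training set only.

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    # Hypothetical raw feature matrix with very different scales per column.
    X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
    X_test = np.array([[1.5, 250.0]])

    # Zero-mean / unit-variance scaling; fit on the training data only,
    # then apply the same transformation to the test data.
    scaler = StandardScaler().fit(X_train)
    X_train_scaled = scaler.transform(X_train)
    X_test_scaled = scaler.transform(X_test)
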
14 votes · 1 answer

Irregular Precision-Recall Curve

I'd expect that for a precision-recall curve, precision decreases while recall increases monotonically. I have a plot that is not smooth and looks funny. I used scikit-learn to compute the values for plotting the curve. Is the curve below abnormal? If yes, why…
Anderlecht · 261 · 2 · 7
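
For reference, a small sketch (with made-up labels and scores) of how scikit-learn computes the points: recall is monotone as the threshold rises, but precision can move up or down at each step, so a jagged curve is expected.

    import numpy as np
    from sklearn.metrics import precision_recall_curve

    # Made-up labels and scores just to show where the values come from.
    y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
    y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.55, 0.6, 0.9, 0.3])

    precision, recall, thresholds = precision_recall_curve(y_true, y_score)

    # Precision is not guaranteed to decrease monotonically along the curve.
    for p, r in zip(precision, recall):
        print(f'precision={p:.2f}  recall={r:.2f}')
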
14 votes · 4 answers

What are graph embeddings?

I recently came across graph embeddings such as DeepWalk and LINE. However, I still do not have a clear idea of what is meant by graph embeddings and when to use them (applications). Any suggestions are welcome!
Volka · 711 · 3 · 6 · 21
14 votes · 2 answers

Keras Multiple “Softmax” in last layer possible?

Is it possible to implement multiple softmaxes in the last layer in Keras, so that the sum over nodes 1-4 is 1, the sum over nodes 5-8 is 1, and so on? Or should I go for a different network design?
cgn.dev · 249 · 1 · 2 · 4
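
One sketch of how this is commonly done (made-up layer sizes, standalone keras imports): give the model one softmax output head per group of nodes via the functional API, with a loss per head.

    from keras.models import Model
    from keras.layers import Input, Dense

    # Hypothetical sizes: 8 output nodes split into two groups of 4,
    # each group normalised by its own softmax.
    inputs = Input(shape=(20,))
    hidden = Dense(64, activation='relu')(inputs)

    group_1 = Dense(4, activation='softmax', name='group_1')(hidden)  # nodes 1-4 sum to 1
    group_2 = Dense(4, activation='softmax', name='group_2')(hidden)  # nodes 5-8 sum to 1

    model = Model(inputs, [group_1, group_2])
    model.compile(optimizer='adam',
                  loss={'group_1': 'categorical_crossentropy',
                        'group_2': 'categorical_crossentropy'})
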
14 votes · 3 answers

Using TF-IDF with other features in scikit-learn

What is the best/correct way to combine text analysis with other features? For example, I have a dataset with some text but also other features/categories. scikit-learn's TF-IDF vectorizer transforms text data into sparse matrices. I can use these…
lte__ · 1,320 · 5 · 18 · 27
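
One common pattern, sketched here with a hypothetical frame: keep the TF-IDF output sparse and hstack the additional numeric columns onto it (scikit-learn's ColumnTransformer or FeatureUnion achieve the same thing inside a pipeline).

    import pandas as pd
    from scipy.sparse import hstack, csr_matrix
    from sklearn.feature_extraction.text import TfidfVectorizer

    # Hypothetical frame: free text plus a numeric feature.
    df = pd.DataFrame({
        'text': ['cheap flights to paris', 'meeting at noon', 'win a free prize'],
        'num_links': [3, 0, 5],
    })

    # TF-IDF gives a sparse matrix; keep it sparse and stack the extra column on.
    tfidf = TfidfVectorizer()
    text_features = tfidf.fit_transform(df['text'])
    other_features = csr_matrix(df[['num_links']].values.astype(float))

    X = hstack([text_features, other_features])  # still sparse, ready for most sklearn models
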
14 votes · 3 answers

What is the difference between Dilated Convolution and Deconvolution?

These two convolution operations are very common in deep learning right now. I read about dilated convolutional layers in this paper: WAVENET: A GENERATIVE MODEL FOR RAW AUDIO, and about de-convolution in this paper: Fully Convolutional Networks for…
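
A quick sketch (hypothetical shapes, standalone keras imports) of the two operations side by side: dilation enlarges the receptive field without changing the resolution, while a transposed (de-)convolution is a learned upsampling.

    from keras.models import Sequential
    from keras.layers import Conv2D, Conv2DTranspose

    # Dilated convolution: the kernel is applied with gaps (dilation_rate), which
    # enlarges the receptive field without adding parameters or downsampling.
    dilated = Sequential([
        Conv2D(16, kernel_size=3, dilation_rate=2, padding='same',
               input_shape=(64, 64, 3)),
    ])

    # Transposed ("de-")convolution: increases spatial resolution,
    # e.g. 32x32 -> 64x64 with strides=2.
    deconv = Sequential([
        Conv2DTranspose(16, kernel_size=3, strides=2, padding='same',
                        input_shape=(32, 32, 3)),
    ])

    print(dilated.output_shape)  # (None, 64, 64, 16)
    print(deconv.output_shape)   # (None, 64, 64, 16)
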