Most Popular

1500 questions
14 votes · 3 answers

How can I perform stratified sampling for multi-label multi-class classification?

I am asking this question for a few reasons: the dataset at hand is imbalanced, and I used the code below: x = dataset[['Message']] y = dataset[['Label1', 'Label2']] train_data, test_data = train_test_split(x, test_size = 0.1, stratify=y, random_state =…
Divyanshu Shekhar · 569 · 1 · 5 · 15
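
A minimal sketch of one common workaround, assuming the column names from the excerpt (Message, Label1, Label2) and a toy dataframe in place of the asker's data: stratify on a composite key built from both label columns. For truly multi-label data, iterative stratification (e.g. scikit-multilearn's iterative_train_test_split) is the usual alternative.

    import pandas as pd
    from sklearn.model_selection import train_test_split

    # Toy stand-in for the question's `dataset` (columns Message, Label1, Label2).
    dataset = pd.DataFrame({
        'Message': [f'text {i}' for i in range(100)],
        'Label1': [i % 2 for i in range(100)],
        'Label2': [(i // 2) % 2 for i in range(100)],
    })

    x = dataset[['Message']]
    y = dataset[['Label1', 'Label2']]

    # Stratify on a composite key so every (Label1, Label2) combination keeps
    # roughly the same proportion in the train and test splits.
    strat_key = y['Label1'].astype(str) + '_' + y['Label2'].astype(str)

    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.1, stratify=strat_key, random_state=42)
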
14 votes · 2 answers

Prioritized Replay, what does Importance Sampling really do?

I can't understand the purpose of importance-sampling (IS) weights in Prioritized Replay (page 5). A transition is more likely to be sampled from experience replay the larger its "cost" is. My understanding is that 'IS' helps with smoothly…
Kari · 2,726 · 2 · 20 · 49
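
For reference, a minimal sketch of the importance-sampling weights from the paper, $w_i = (N \cdot P(i))^{-\beta}$ normalised by the largest weight, using made-up priorities:

    import numpy as np

    priorities = np.array([0.1, 0.5, 2.0, 4.0])  # hypothetical TD-error priorities
    probs = priorities / priorities.sum()        # P(i): sampling probabilities
    N = len(priorities)
    beta = 0.4                                   # annealed towards 1 during training

    weights = (N * probs) ** (-beta)
    weights /= weights.max()                     # keep weights <= 1 for stability

    # Frequently sampled (high-priority) transitions get the smallest weights,
    # which scales down their gradient contribution and undoes the sampling bias.
    print(weights)
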
14 votes · 1 answer

How to predict future values over a time horizon with Keras?

I just built this LSTM neural network with Keras: import numpy as np import pandas as pd from sklearn import preprocessing from keras.layers.core import Dense, Dropout, Activation from keras.activations import linear from…
Nbenz · 283 · 2 · 3 · 7
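
A minimal sketch of one common approach, recursive multi-step forecasting, with a made-up sine series and standalone keras imports (tf.keras would only change the import lines): predict one step, append it to the input window, and repeat.

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # Hypothetical setup: windows of the last 10 values predict the next value.
    window = 10
    model = Sequential([
        LSTM(32, input_shape=(window, 1)),
        Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')
    # ... model.fit(X_train, y_train, ...) would go here ...

    # Recursive multi-step forecast: feed each prediction back in as input.
    history = np.sin(np.linspace(0, 20, 100))    # made-up series
    last_window = history[-window:].reshape(1, window, 1)
    forecast = []
    for _ in range(5):                           # predict 5 steps ahead
        next_value = model.predict(last_window, verbose=0)[0, 0]
        forecast.append(next_value)
        last_window = np.append(last_window[:, 1:, :],
                                [[[next_value]]], axis=1)
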
14 votes · 4 answers

Big data case study or use case example

I have read a lot of blogs/articles on how different types of industries are using big data analytics, but most of these articles fail to mention: what kind of data these companies used, what the size of the data was, what kind of tools and technologies they…
Brown_Dynamite · 241 · 2 · 6
14 votes · 3 answers

Why is there a $2$ in the denominator of the mean squared error function?

In the famous Deep Learning Book, in chapter 1, equation 6, the Quadratic Cost (or Mean Squared Error) in a neural network is defined as $ C(w, b) = \frac{1}{2n}\sum_{x}||y(x)-a||^2 $ where $w$ is the set of all weights and $b$ the set of all…
Silas Berger · 161 · 1 · 5
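
The usual justification the question is pointing at: the $\frac{1}{2}$ is purely for convenience, because it cancels the factor of $2$ produced by differentiating the square, and rescaling a cost by a positive constant does not change where its minimum is. For a single training example $x$, using the cost quoted above,

$$
\frac{\partial}{\partial a}\,\frac{1}{2}\,\|y(x)-a\|^{2} \;=\; a - y(x),
$$

whereas without the $\frac{1}{2}$ the gradient would carry an extra factor of $2$.
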
14 votes · 4 answers

Is PCA considered a machine learning algorithm?

I've understood that principal component analysis is a dimensionality reduction technique, i.e. given 10 input features, it will produce a smaller number of independent features that are orthogonal linear transformations of the original features. Is…
Victor · 611 · 3 · 8 · 19
14 votes · 2 answers

How to implement "one-to-many" and "many-to-many" sequence prediction in Keras?

I struggle to interpret the Keras coding difference for one-to-many (e.g. classification of single images) and many-to-many (e.g. classification of image sequences) sequence labeling. I frequently see two different kinds of code: Type 1 is where…
Hendrik · 8,587 · 17 · 42 · 55
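
For orientation, a sketch (made-up sizes, standalone keras imports) of how the cases usually differ: return_sequences=True plus TimeDistributed for many-to-many, and RepeatVector for one-to-many.

    from keras.models import Sequential
    from keras.layers import LSTM, Dense, TimeDistributed, RepeatVector

    timesteps, features, n_classes = 8, 16, 5

    # Many-to-one: the LSTM returns only its last state, one label per sequence.
    many_to_one = Sequential([
        LSTM(32, input_shape=(timesteps, features)),
        Dense(n_classes, activation='softmax'),
    ])

    # Many-to-many: return_sequences=True keeps one output per timestep, and
    # TimeDistributed applies the same Dense classifier at every step.
    many_to_many = Sequential([
        LSTM(32, return_sequences=True, input_shape=(timesteps, features)),
        TimeDistributed(Dense(n_classes, activation='softmax')),
    ])

    # One-to-many: a single input vector is repeated and decoded into a sequence.
    one_to_many = Sequential([
        RepeatVector(timesteps, input_shape=(features,)),
        LSTM(32, return_sequences=True),
        TimeDistributed(Dense(n_classes, activation='softmax')),
    ])
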
14 votes · 3 answers

After grouping to the minimum value in pandas, how do I display the entire matching row along with the min() value?

The dataframe contains
>> df
             A       B                 C
A
196512  196512    1325  12.9010511000000
196512  196512  114569  12.9267705000000
196512  196512  118910  12.8983353775637
196512  196512  100688  12.9505091000000
196795  196795  …
Sam Joe · 173 · 1 · 1 · 10
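
A minimal sketch of one common idiom, using a toy frame with the question's column names: groupby(...)['C'].idxmin() gives the row labels of the per-group minima, and .loc pulls back the full rows.

    import pandas as pd

    # Toy frame with the same column names as in the question.
    df = pd.DataFrame({
        'A': [196512, 196512, 196512, 196795],
        'B': [1325, 114569, 118910, 100688],
        'C': [12.9010511, 12.9267705, 12.8983354, 12.9505091],
    })

    # idxmin() returns the row label of the minimum C within each group of A,
    # and .loc pulls back those full rows (not just the minimum value).
    rows_with_min_c = df.loc[df.groupby('A')['C'].idxmin()]
    print(rows_with_min_c)
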
14 votes · 3 answers

How can I fit categorical data types for random forest classification?

I need to find the accuracy of a training dataset by applying the Random Forest algorithm, but my dataset contains both categorical and numeric features. When I try to fit the data, I get an error: 'Input contains NaN, infinity or a value too…
IS2057 · 295 · 1 · 7 · 21
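
A minimal sketch of one common fix, with a hypothetical mixed-type frame: impute the missing values that trigger the 'Input contains NaN' error, then one-hot encode the categorical column before fitting.

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    # Hypothetical frame: one categorical and one numeric feature, with gaps.
    df = pd.DataFrame({
        'color':  ['red', 'blue', 'green', 'blue', None],
        'amount': [1.0, 2.5, None, 4.0, 5.5],
        'target': [0, 1, 0, 1, 1],
    })

    # 1) Fill the missing values that cause the NaN error.
    df['color'] = df['color'].fillna('missing')
    df['amount'] = df['amount'].fillna(df['amount'].median())

    # 2) One-hot encode the categorical column so everything is numeric.
    X = pd.get_dummies(df[['color', 'amount']], columns=['color'])
    y = df['target']

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, y)
    print(clf.score(X, y))
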
14 votes · 1 answer

Input normalization for ReLU?

Let's assume a vanilla MLP for classification with a given activation function for hidden layers. I know it is a known best practice to normalize the input of the network between 0 and 1 if sigmoid is the activation function and -0.5 and 0.5 if tanh…
Taiko · 243 · 1 · 2 · 6
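
The question is specific to ReLU, but a common activation-agnostic sketch (hypothetical data, scikit-learn's StandardScaler) is to standardize each feature to zero mean and unit variance, fitting the scaler on the training set only.

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    # Hypothetical raw feature matrix with very different scales per column.
    X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
    X_test = np.array([[1.5, 250.0]])

    # Zero-mean / unit-variance scaling; fit on the training data only,
    # then apply the same transformation to the test data.
    scaler = StandardScaler().fit(X_train)
    X_train_scaled = scaler.transform(X_train)
    X_test_scaled = scaler.transform(X_test)
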
14 votes · 1 answer

Irregular Precision-Recall Curve

I'd expect that for a precision-recall curve, precision decreases while recall increases monotonically. I have a plot that is not smooth and looks funny. I used scikit-learn to compute the values for plotting the curve. Is the curve below abnormal? If yes, why…
Anderlecht · 261 · 2 · 7
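
For reference, a small sketch (with made-up labels and scores) of how scikit-learn computes the points: recall is monotone as the threshold rises, but precision can move up or down at each step, so a jagged curve is expected.

    import numpy as np
    from sklearn.metrics import precision_recall_curve

    # Made-up labels and scores just to show where the values come from.
    y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
    y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.55, 0.6, 0.9, 0.3])

    precision, recall, thresholds = precision_recall_curve(y_true, y_score)

    # Precision is not guaranteed to decrease monotonically along the curve.
    for p, r in zip(precision, recall):
        print(f'precision={p:.2f}  recall={r:.2f}')
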
14 votes · 4 answers

What are graph embeddings?

I recently came across graph embeddings such as DeepWalk and LINE. However, I still do not have a clear idea of what is meant by graph embeddings and when to use them (applications). Any suggestions are welcome!
Volka · 711 · 3 · 6 · 21
14 votes · 2 answers

Keras Multiple “Softmax” in last layer possible?

Is it possible to implement multiple softmaxes in the last layer in Keras, so that the sum over nodes 1-4 is 1, the sum over nodes 5-8 is 1, and so on? Or should I go for a different network design?
cgn.dev · 249 · 1 · 2 · 4
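
One sketch of how this is commonly done (made-up layer sizes, standalone keras imports): give the model one softmax output head per group of nodes via the functional API, with a loss per head.

    from keras.models import Model
    from keras.layers import Input, Dense

    # Hypothetical sizes: 8 output nodes split into two groups of 4,
    # each group normalised by its own softmax.
    inputs = Input(shape=(20,))
    hidden = Dense(64, activation='relu')(inputs)

    group_1 = Dense(4, activation='softmax', name='group_1')(hidden)  # nodes 1-4 sum to 1
    group_2 = Dense(4, activation='softmax', name='group_2')(hidden)  # nodes 5-8 sum to 1

    model = Model(inputs, [group_1, group_2])
    model.compile(optimizer='adam',
                  loss={'group_1': 'categorical_crossentropy',
                        'group_2': 'categorical_crossentropy'})
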
14 votes · 3 answers

Using TF-IDF with other features in scikit-learn

What is the best/correct way to combine text analysis with other features? For example, I have a dataset with some text but also other features/categories. scikit-learn's TF-IDF vectorizer transforms text data into sparse matrices. I can use these…
lte__ · 1,320 · 5 · 18 · 27
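
One common pattern, sketched here with a hypothetical frame: keep the TF-IDF output sparse and hstack the additional numeric columns onto it (scikit-learn's ColumnTransformer or FeatureUnion achieve the same thing inside a pipeline).

    import pandas as pd
    from scipy.sparse import hstack, csr_matrix
    from sklearn.feature_extraction.text import TfidfVectorizer

    # Hypothetical frame: free text plus a numeric feature.
    df = pd.DataFrame({
        'text': ['cheap flights to paris', 'meeting at noon', 'win a free prize'],
        'num_links': [3, 0, 5],
    })

    # TF-IDF gives a sparse matrix; keep it sparse and stack the extra column on.
    tfidf = TfidfVectorizer()
    text_features = tfidf.fit_transform(df['text'])
    other_features = csr_matrix(df[['num_links']].values.astype(float))

    X = hstack([text_features, other_features])  # still sparse, ready for most sklearn models
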
14 votes · 3 answers

What is the difference between Dilated Convolution and Deconvolution?

These two convolution operations are very common in deep learning right now. I read about dilated convolutional layers in this paper: WAVENET: A GENERATIVE MODEL FOR RAW AUDIO, and about de-convolution in this paper: Fully Convolutional Networks for…
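
A quick sketch (hypothetical shapes, standalone keras imports) of the two operations side by side: dilation enlarges the receptive field without changing the resolution, while a transposed (de-)convolution is a learned upsampling.

    from keras.models import Sequential
    from keras.layers import Conv2D, Conv2DTranspose

    # Dilated convolution: the kernel is applied with gaps (dilation_rate), which
    # enlarges the receptive field without adding parameters or downsampling.
    dilated = Sequential([
        Conv2D(16, kernel_size=3, dilation_rate=2, padding='same',
               input_shape=(64, 64, 3)),
    ])

    # Transposed ("de-")convolution: increases spatial resolution,
    # e.g. 32x32 -> 64x64 with strides=2.
    deconv = Sequential([
        Conv2DTranspose(16, kernel_size=3, strides=2, padding='same',
                        input_shape=(32, 32, 3)),
    ])

    print(dilated.output_shape)  # (None, 64, 64, 16)
    print(deconv.output_shape)   # (None, 64, 64, 16)
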