Highest Voted Questions - Data Science Stack Exchange

11

votes

1 answer

CNN - imbalanced classes, class weights vs data augmentation

I have a dataset with a few strongly imbalanced classes; e.g., the smallest class is about 54 times smaller than the largest. Therefore, data augmentation to equalize the class sizes seems like a bad idea to me (in the example above each…
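As a rough illustration of the class-weight alternative named in the title, here is a minimal sketch that derives inverse-frequency weights from label counts. The helper name `balanced_class_weights` and the toy labels are hypothetical; the formula n_samples / (n_classes * class_count) is the same heuristic scikit-learn applies for `class_weight="balanced"`. Whether it suits a particular CNN is an assumption to validate.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Toy labels (hypothetical): the largest class is 54 times the smallest.
labels = ["large"] * 540 + ["small"] * 10
weights = balanced_class_weights(labels)
print(weights)
```

A dictionary like this, keyed by integer class index rather than by name, could be passed to a training call that accepts per-class weights, such as the `class_weight` argument of Keras `Model.fit`.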

asked Mar 16 '19 at 15:50

I.D.M

175
1
10

11

votes

2 answers

What is the most efficient method for hyperparameter optimization in scikit-learn?

An overview of the hyperparameter optimization process in scikit-learn is here. Exhaustive grid search will find the best set of hyperparameters among the combinations in the grid. The downside is that exhaustive grid search is slow. Random search is faster than grid…
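To make the grid-versus-random trade-off concrete, here is a small sketch using scikit-learn's `GridSearchCV` and `RandomizedSearchCV`. The estimator (`SVC`), the iris dataset, the parameter ranges, and the budget of 5 random candidates are all assumptions chosen for illustration, not part of the question.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Exhaustive search: every combination in the grid is evaluated (3 x 3 = 9 candidates).
grid = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
    cv=3,
)
grid.fit(X, y)

# Random search: a fixed budget of candidates sampled from distributions.
rand = RandomizedSearchCV(
    SVC(),
    param_distributions={"C": loguniform(1e-1, 1e1), "gamma": loguniform(1e-2, 1e0)},
    n_iter=5,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

n_grid = len(grid.cv_results_["params"])
n_rand = len(rand.cv_results_["params"])
print(n_grid, grid.best_params_)
print(n_rand, rand.best_params_)
```

The cost of the grid grows multiplicatively with each added parameter, while the random search cost is fixed by `n_iter`.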

asked Mar 13 '19 at 19:42

Brian Spiering

21,136
2
26
109

11

votes

2 answers

Pandas merge column duplicate and sum value

How can I merge duplicate values in a column and sum their values?

What I have:
A 30
A 40
B 50

What I need:
A 70
B 50

DataFrame for this example:
d = {'address': ["A", "A", "B"], 'balances': [30, 40, 50]}
df = pd.DataFrame(data=d)
df
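One common way to get this result is a group-by followed by a sum. The sketch below reuses the DataFrame from the question; the variable name `result` is introduced here for illustration.

```python
import pandas as pd

d = {'address': ["A", "A", "B"], 'balances': [30, 40, 50]}
df = pd.DataFrame(data=d)

# Group rows that share an address and sum their balances.
# as_index=False keeps 'address' as a regular column instead of the index.
result = df.groupby('address', as_index=False)['balances'].sum()
print(result)
```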

asked Mar 10 '19 at 06:37

Руслан Миров

213
1
2
5

11

votes

4 answers

Learning ordinal regression in R?

I'm working on a project and need resources to get me up to speed. The dataset is around 35000 observations on 30 or so variables. About half the variables are categorical with some having many different possible values, i.e. if you split the…

asked Jun 19 '14 at 03:43

Matt Hall

111
3

11

votes

7 answers

What is the ideal database that allows fast cosine distance?

I'm currently trying to store many feature vectors in a database so that, upon request, I can compare an incoming feature vector against many others (if not all) stored in the db. I would need to compute the cosine distance and only return, for…
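For reference, the comparison being described can be sketched in memory with NumPy before choosing a database. The function name `cosine_top_k` and the toy vectors are hypothetical, and the sketch assumes no stored vector is all zeros.

```python
import numpy as np

def cosine_top_k(query, stored, k):
    """Return indices and cosine distances of the k stored vectors closest to query."""
    # Normalise to unit length so the dot product equals the cosine similarity.
    stored_unit = stored / np.linalg.norm(stored, axis=1, keepdims=True)
    query_unit = query / np.linalg.norm(query)
    distances = 1.0 - stored_unit @ query_unit
    nearest = np.argsort(distances)[:k]
    return nearest, distances[nearest]

# Toy data (hypothetical): four stored 2-D feature vectors and one query.
stored = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
query = np.array([2.0, 0.1])
idx, dist = cosine_top_k(query, stored, k=2)
print(idx, dist)
```

This brute-force scan is linear in the number of stored vectors; storing pre-normalised vectors removes the per-query normalisation of `stored`.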

asked Feb 13 '19 at 19:18

G4bri3l

223
2
7

11

votes

1 answer

When to use Dense, Conv1/2D, Dropout, Flatten, and all the other layers?

I have a binary classification problem and want to build a NN model that classifies the data as either class 0 or class 1. My current implementation looks like the following: # Split dataset in train and test data X_train, X_test, Y_train, Y_test =…

asked Jan 16 '19 at 22:27

ZelelB

1,057
2
11
14

11

votes

1 answer

Confusion about Entity Embeddings of Categorical Variables - Working Example!

Problem Statement: I cannot get Entity Embeddings of Categorical Variables to work for a simple dataset. I have followed the original GitHub repository, the paper, other blog posts [1, 2, 3], and this Kaggle kernel; it is still not working. Data Part:…

asked Dec 16 '18 at 22:06

TwinPenguins

4,249
3
19
53

11

votes

2 answers

Batch normalization vs batch size

I have noticed that the performance of my VGG 16 network improves if I increase the batch size from $64$ to $256$. I have also observed that, with batch size $64$, the results with and without batch normalization differ considerably. With batch…

asked Nov 29 '18 at 19:52

Arka Mallick

600
2
7
16

11

votes

2 answers

How to rename columns that have the same name?

I would like to rename the columns, but the DataFrame contains duplicate column names. How do I rename them? df.columns Output: Index(['Goods', 'Durable goods', 'Services', 'Exports', 'Goods', 'Services', 'Imports', 'Goods',…
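One possible approach is to make the names unique first by suffixing repeats, after which an ordinary rename works. The helper `dedupe` below is hypothetical, and the list contains only the column names visible in the truncated output above.

```python
from collections import Counter

def dedupe(names):
    """Append _2, _3, ... to the second and later occurrences of each name."""
    seen = Counter()
    unique = []
    for name in names:
        seen[name] += 1
        unique.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return unique

# Only the names visible in the excerpt; the real Index is longer.
columns = ['Goods', 'Durable goods', 'Services', 'Exports',
           'Goods', 'Services', 'Imports', 'Goods']
new_columns = dedupe(columns)
print(new_columns)

# With a DataFrame, the result can be assigned back:
# df.columns = dedupe(df.columns)
```

The sketch does not guard against a suffixed name colliding with an existing column such as a literal `Goods_2`.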

asked Nov 20 '18 at 10:26

Antony Naveen

121
1
1
5

11

votes

3 answers

Best languages for scientific computing

It seems as though most languages have some number of scientific computing libraries available. Python has SciPy; Rust has SciRust; C++ has several, including ViennaCL and Armadillo; Java has Java Numerics and Colt, as well as several others. Not to…

asked Jun 16 '14 at 19:14

ragingSloth

1,824
3
14
15

11

votes

2 answers

How does Implicit Quantile-Regression Network (IQN) differ from QR-DQN?

For several months I browsed the internet hoping to find a user-friendly explanation of the Implicit Quantile Regression Network (IQN). But, it seems there is none at all. How does IQN differ from Quantile Regression Network, in plain language? In…

asked Nov 07 '18 at 14:57

Kari

2,726
2
20
49

11

votes

2 answers

Does batch normalization mean that sigmoids work better than ReLUs?

Batch normalization and ReLUs are both solutions to the vanishing gradient problem. If we're using batch normalization, should we then use sigmoids? Or are there features of ReLUs that make them worthwhile even when using batchnorm? I suppose that…

asked Sep 28 '18 at 16:39

generic_user

499
3
10

11

votes

2 answers

Is max_depth in scikit the equivalent of pruning in decision trees?

I was analyzing the classifier created using a decision tree. There is a tuning parameter called max_depth in scikit-learn's decision tree. Is this equivalent to pruning a decision tree? If not, how could I prune a decision tree using scikit-learn? dt_ap =…
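To illustrate the distinction, the sketch below contrasts limiting depth while the tree is grown with cost-complexity pruning applied after growth. It assumes scikit-learn 0.22 or later, where `ccp_alpha` and `cost_complexity_pruning_path` are available; the iris dataset and the choice of a mid-path alpha are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Unrestricted tree for comparison.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# Pre-pruning: stop growing at a fixed depth.
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Post-pruning: grow fully, then collapse subtrees by cost-complexity.
path = full.cost_complexity_pruning_path(X, y)
alpha = path.ccp_alphas[len(path.ccp_alphas) // 2]  # arbitrary mid-path alpha
pruned = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0).fit(X, y)

print(full.tree_.node_count, shallow.tree_.node_count, pruned.tree_.node_count)
```

In practice the alpha would be chosen by cross-validation rather than picked from the middle of the path.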

asked Sep 23 '18 at 06:50

Suhail Gupta

601
8
15

11

votes

2 answers

Why do neural networks not perform well on structured data?

I was recently working on a classification problem where decision trees performed better than neural networks. I tried various combinations with neural networks, altering the number of neurons / hidden layers with an objective to beat the…

asked Sep 18 '18 at 01:54

Suhail Gupta

601
8
15

11

votes

2 answers

RL Advantage function why A = Q-V instead of A=V-Q?

In RL Course by David Silver - Lecture 7: Policy Gradient Methods, David explains what an advantage function is, and how it is the difference between Q(s,a) and V(s). Preliminary, from this post: First recall that a policy $\pi$ is a mapping…
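One way to read the sign convention is to write it out under the standard definitions. The expectation identities below are textbook material, not taken from the truncated post.

```latex
\begin{aligned}
A^{\pi}(s, a) &= Q^{\pi}(s, a) - V^{\pi}(s), \\
V^{\pi}(s) &= \mathbb{E}_{a \sim \pi(\cdot \mid s)}\left[ Q^{\pi}(s, a) \right], \\
\mathbb{E}_{a \sim \pi(\cdot \mid s)}\left[ A^{\pi}(s, a) \right] &= V^{\pi}(s) - V^{\pi}(s) = 0 .
\end{aligned}
```

With this sign, $A^{\pi}(s, a) > 0$ means action $a$ is better than the policy's average behaviour in state $s$; defining $A = V - Q$ would carry the same information with the sign flipped.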

asked Sep 01 '18 at 03:08

Kari

2,726
2
20
49
