Highest Voted Questions - Data Science Stack Exchange

15

votes

4 answers

How to use SimpleImputer Class to replace missing values with mean values using Python?

This is my code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing Dataset
dataset = pd.read_csv('C:/Users/Rupali Singh/Desktop/ML A-Z/Machine Learning A-Z Template Folder/Part 1 - Data…

asked May 13 '19 at 14:01

Rupali Singh

195
1
2
8

15

votes

4 answers

LightGBM gives different results (metrics) depending on the columns order

I have two nearly identical datasets, A and B, that differ only in column order. I then train a LightGBM model on each of the two datasets with the following steps: Divide each dataset into training and testing (use the same random seed…

asked Apr 30 '19 at 17:09

Duy Bui

251
2
5

15

votes

1 answer

What is the difference between ImageNet and ImageNet1k? How to download it?

Some papers mention just ImageNet, while others mention the ImageNet 1k database. What is the difference between the two? Are they the same, or is the latter a subset of the former? I'm working on Generative Adversarial Nets. I wanted to train it on…

asked Mar 17 '19 at 07:58

Nagabhushan S N

724
3
8
23

15

votes

4 answers

Is Gradient Descent central to every optimizer?

I want to know whether Gradient descent is the main algorithm used in optimizers like Adam, Adagrad, RMSProp and several other optimizers.

asked Mar 12 '19 at 10:04

rawwar

861
2
12
23

15

votes

2 answers

Why should we use (or not) dropout on the input layer?

People generally avoid using dropout at the input layer itself. But wouldn't it be better to use it? Adding dropout (given that it's randomized it will probably end up acting like another regularizer) should make the model more robust. It will make…

asked Sep 19 '18 at 19:59

Aditya

2,470
2
16
35

15

votes

2 answers

Why aren't Genetic Algorithms used for optimizing neural networks?

From my understanding, Genetic Algorithms are powerful tools for multi-objective optimization. Furthermore, training Neural Networks (especially deep ones) is hard and has many issues (non-convex cost functions - local minima, vanishing and…

asked Sep 16 '18 at 08:34

cat91

413
2
7

15

votes

2 answers

Dropout on which layers of LSTM?

Using a multi-layer LSTM with dropout, is it advisable to put dropout on all hidden layers as well as the output Dense layers? In Hinton's paper (which proposed Dropout) he only put Dropout on the Dense layers, but that was because the hidden inner…

asked Sep 13 '18 at 13:17

BigBadMe

750
1
7
18

15

votes

2 answers

What is the difference between Hadoop and NoSQL?

I have heard about many tools and frameworks that help people process their data in a big data environment. One is Hadoop and another is the NoSQL concept. How do they differ in terms of processing? Are they complementary?

asked May 14 '14 at 10:44

рüффп

295
5
16

15

votes

1 answer

Can gradient boosted trees fit any function?

For neural networks we have the universal approximation theorem which states that neural networks can approximate any continuous function on a compact subset of $R^n$. Is there a similar result for gradient boosted trees? It seems reasonable since…

decision-trees

asked Jun 07 '18 at 16:30

Imran

2,381
12
22

15

votes

2 answers

What is one hot encoding in tensorflow?

I am currently taking a course on TensorFlow in which they use tf.one_hot(indices, depth). I don't understand how these indices change into that binary sequence. Can somebody please explain the exact process?

asked Apr 12 '18 at 09:42

thanatoz

2,405
4
16
39
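
For the tf.one_hot question above, a minimal NumPy sketch of the index-to-vector mapping may help. It is not TensorFlow's implementation: the helper name one_hot_sketch is hypothetical, and it assumes the documented defaults of tf.one_hot (on value 1, off value 0, the new axis appended last, and out-of-range indices mapped to an all-zero row).

```python
import numpy as np


def one_hot_sketch(indices, depth):
    """Hypothetical helper that mimics the default behaviour of tf.one_hot."""
    indices = np.asarray(indices)
    # Candidate class positions: 0, 1, ..., depth - 1.
    positions = np.arange(depth)
    # Compare every index against every position. Each index yields a row
    # that is True only where position == index, so an index outside
    # [0, depth) matches nothing and produces an all-zero row.
    return (indices[..., None] == positions).astype(np.float32)


# Each input index becomes a row of length `depth` with a single 1.0
# in the column given by that index.
encoded = one_hot_sketch([0, 2, 1], depth=3)
print(encoded)
```

The broadcasted comparison is used instead of indexing into an identity matrix because a negative index would wrap around in NumPy, which is not how out-of-range indices are described for tf.one_hot.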

15

votes

3 answers

Is there a person class in ImageNet? Are there any classes related to humans?

If I look at one of the many sources for the ImageNet classes on the Internet I cannot find a single class related to human beings (and no, harvestman is not someone who harvests, but it's what I knew as a daddy longlegs, a kind of spider :-). How…

asked Feb 11 '18 at 08:21

DeltaIV

399
1
3
14

15

votes

1 answer

Multi task learning in Keras

I am trying to implement shared layers in Keras. I do see that Keras has keras.layers.concatenate, but I am unsure from documentation about its use. Can I use it to create multiple shared layers? What would be the best way to implement a simple…

asked Feb 05 '18 at 19:56

Aditya

253
1
2
7

15

votes

1 answer

Why do we need to add START + END symbols when using Recurrent Neural Nets for Sequence-to-Sequence Models?

In Sequence-to-Sequence models, we often see that START (e.g. <s>) and END (e.g. </s>) symbols are added to the inputs and outputs before training the model and before inference/decoding on unseen data. E.g.…

asked Jan 23 '18 at 09:39

alvas

2,410
7
25
40

15

votes

2 answers

Visualizing deep neural network training

I'm trying to find an equivalent of Hinton Diagrams for multilayer networks to plot the weights during training. The trained network is somewhat similar to a Deep SRN, i.e. it has a large number of weight matrices which would make the…

asked Dec 10 '14 at 10:15

runDOSrun

293
2
10

15

votes

4 answers

Data Science Tools Using Scala

I know that Spark is fully integrated with Scala. Its use case is specifically for large data sets. Which other tools have good Scala support? Is Scala best suited for larger data sets? Or is it also suited for smaller data sets?

asked Dec 10 '14 at 06:37

sheldonkreger

1,169
8
20
