Most Popular
1500 questions
15
votes
4 answers
How to calculate the output shape of conv2d_transpose?
I am currently coding a GAN to generate MNIST digits, but the generator doesn't work. First I sample z with shape 100 per batch and pass it through a layer to get the shape (7, 7, 256). Then a conv2d_transpose layer takes it to (28, 28, 1) (which is basically…
snowparrot
15
votes
1 answer
PyTorch vs. TensorFlow Eager
Google recently included Eager mode in TensorFlow's nightly builds, an imperative API for accessing TensorFlow's computation capabilities.
How does TensorFlow Eager compare to PyTorch?
Some aspects that could affect the comparison could be:
Advantages…
noe
15
votes
2 answers
Why do we need to handle data imbalance?
I would like to know why we need to deal with data imbalance. I know how to deal with it and the different methods for solving the issue: upsampling, downsampling, or using SMOTE.
For example, if I have a rare disease that affects 1 out of 100 people, and…
sara
15
votes
1 answer
Why should I also normalize the output data?
I'm new to data science and Neural Networks in general.
Looking around, many people say it is better to normalize the data before doing anything with the NN. I understand how normalizing the input data can be useful.
However, I really don't see how…
Euler_Salter
15
votes
3 answers
Modelling Unevenly Spaced Time Series
I have a continuous variable, sampled over a period of a year at irregular intervals. Some days have more than one observation per hour, while other periods have nothing for days. This makes it particularly difficult to detect patterns in the time…
doublebyte
15
votes
3 answers
Improve Pandas dataframe filtering speed
I have a dataset with 19 columns and about 250k rows. I have worked with bigger datasets, but this time, Pandas decided to play with my nerves.
I tried to split the original dataset into 3 sub-dataframes based on some simple rules. However, it takes…
Tasos
15
votes
1 answer
On-line random forests by adding more single Decision Trees
A Random Forest (RF) is an ensemble of Decision Trees (DTs). By using bagging, each DT is trained on a different data subset. Hence, is there any way of implementing an on-line random forest by adding more decision trees on new data?
For…
tashuhka
15
votes
2 answers
Why should the initialization of weights and bias be chosen around 0?
I read this:
To train our neural network, we will initialize each parameter $W^{(l)}_{ij}$ and each $b^{(l)}_i$ to a small random value near zero (say according to a ${\rm Normal}(0, \epsilon^2)$ distribution for some small $\epsilon$, say 0.01)
from Stanford…
cinqS
15
votes
1 answer
Make Keras run on multi-machine multi-core CPU system
I'm working on a Seq2Seq model using LSTM from Keras (with the Theano backend) and I would like to parallelize the processes, because even a few MBs of data need several hours for training.
It is clear that GPUs are far better at parallelization…
chmodsss
15
votes
4 answers
Can we generate a huge dataset with Generative Adversarial Networks?
I'm dealing with a problem where I couldn't find enough data (images) to feed into my deep neural network for training.
I was so inspired by the paper Generative Adversarial Text to Image Synthesis published by Scott Reed et al. on Generative…
Alwyn Mathew
15
votes
1 answer
What is a 1D Convolutional Layer in Deep Learning?
I have a good general understanding of the role and mechanism of convolutional layers in Deep Learning for image processing in the case of 2D or 3D implementations: they "simply" try to catch 2D patterns in images (in 3 channels in the case of 3D).
But…
Hendrik
15
votes
4 answers
How to specify important attributes?
Assume a set of loosely structured data (e.g. Web tables/Linked Open Data), composed of many data sources. The data follow no common schema, and each source can use synonymous attributes to describe the values (e.g. "nationality" vs…
vefthym
15
votes
1 answer
How many LSTM cells should I use?
Are there any rules of thumb (or actual rules) pertaining to the minimum, maximum and "reasonable" number of LSTM cells I should use? Specifically, I am referring to BasicLSTMCell from TensorFlow and its num_units property.
Please assume that I have a…
user27994
15
votes
1 answer
Is stratified sampling necessary (random forest, Python)?
I use Python to run a random forest model on my imbalanced dataset (the target variable is a binary class). When splitting the data into training and testing sets, I struggled with whether to use stratified sampling (like the code shown) or not. So far, I…
LUSAQX
15
votes
2 answers
In XGBoost, should we evaluate results with a Precision-Recall curve or ROC?
I am using XGBoost for payment fraud detection. The objective is binary classification, and the data is very unbalanced. One out of every 3-4k transactions is fraud.
I would expect the best way to evaluate the results is a Precision-Recall (PR)…
davidjhp