Most Popular
1500 questions
12
votes
2 answers
Why is training taking so long on my GPU?
Details:
GPU: GTX 1080
Training: ~1.1 Million images belonging to 10 classes
Validation: ~150 Thousand images belonging to 10 classes
Time per Epoch: ~10 hours
I've set up CUDA, cuDNN and TensorFlow (TensorFlow GPU as well).
I don't think my model is…
Rahul
- 121
- 1
- 5
12
votes
3 answers
Initialize perceptron weights with zero
I'm new to data science, so please don't blast me.
In a textbook I found:
Now, the reason we don't initialize the weights to zero is that the
learning rate (eta) only has an effect on the classification outcome
if the weights are…
Poiera
- 451
- 1
- 5
- 9
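The textbook claim quoted in this question can be checked with a small experiment. The sketch below is a minimal perceptron written for illustration only (the toy data, function names and learning rates are assumptions, not taken from the question). With all weights starting at zero, the learned weights are simply scaled by eta, so the predicted classes do not change.

```python
def predict(w, xi):
    """Return the class label (+1 or -1); w[0] is the bias term."""
    net = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1 if net >= 0.0 else -1


def train_perceptron(X, y, eta, w_init, epochs=10):
    """Train with the classic perceptron rule: w += eta * (target - prediction) * x."""
    w = list(w_init)
    for _ in range(epochs):
        for xi, target in zip(X, y):
            update = eta * (target - predict(w, xi))
            w[0] += update
            for j, xj in enumerate(xi):
                w[j + 1] += update * xj
    return w


# Hypothetical, linearly separable toy data.
X = [[2, 1], [3, 2], [-1, -2], [-2, -1]]
y = [1, 1, -1, -1]

zeros = [0.0, 0.0, 0.0]
w_one = train_perceptron(X, y, eta=1.0, w_init=zeros)
w_quarter = train_perceptron(X, y, eta=0.25, w_init=zeros)

print(w_one)      # [-2.0, 2.0, 4.0]
print(w_quarter)  # [-0.5, 0.5, 1.0], i.e. 0.25 * w_one
```

Because the two weight vectors differ only by a positive factor, the sign of the net input, and therefore every prediction, is the same for both learning rates.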
12
votes
1 answer
How do I implement the sigmoid function in Octave?
Given that the sigmoid function is defined as hθ(x) = g(θ^T x), how can I implement this function in Octave, given that g = zeros(size(z))?
Shuryu Kisuke
- 223
- 1
- 2
- 5
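For the sigmoid question above, the function is g(z) = 1 / (1 + e^(-z)) applied to every element of z. In Octave the usual elementwise form is g = 1 ./ (1 + exp(-z)). The sketch below shows the same idea in plain Python for illustration; the function name and the nested-list handling are assumptions, not part of the question.

```python
import math


def sigmoid(z):
    """Apply g(z) = 1 / (1 + exp(-z)) to a number or, elementwise, to nested lists."""
    if isinstance(z, (list, tuple)):
        return [sigmoid(v) for v in z]
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z use the equivalent form exp(z) / (1 + exp(z)),
    # which avoids overflow in exp(-z) for large negative inputs.
    ez = math.exp(z)
    return ez / (1.0 + ez)


print(sigmoid(0))                  # 0.5
print(sigmoid([[0, 2], [-2, 0]]))  # same shape as the input
```

The output has the same shape as the input, which is what the g = zeros(size(z)) placeholder in the question suggests.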
12
votes
2 answers
Predict task duration
I'm trying to create a regression model that predicts the duration of a task. The training data I have consists of roughly 40 thousand completed tasks with these variables:
Who performed the task (~250 different people)
What part (subproject) of…
Jurgy
- 238
- 2
- 11
12
votes
6 answers
How to get the number of syllables in a word?
I have already gone through this post which uses nltk's cmudict for counting the number of syllables in a word:
from nltk.corpus import cmudict
d = cmudict.dict()
def nsyl(word):
    return [len(list(y for y in x if y[-1].isdigit())) for x in…
Dawny33
- 8,296
- 12
- 48
- 104
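A dictionary lookup such as cmudict can only answer for words it contains. As a rough fallback, one common heuristic counts groups of consecutive vowels. The sketch below is an assumption for illustration (the function name and the silent-e rule are not from the question), and it will miscount many English words.

```python
import re


def estimate_syllables(word):
    """Estimate the syllable count by counting vowel groups (heuristic, not exact)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # A trailing silent "e" (as in "cake") usually adds no syllable,
    # while a final "le" (as in "table") usually does.
    if count > 1 and word.endswith("e") and not word.endswith("le"):
        count -= 1
    return max(count, 1)


for w in ["cat", "cake", "table", "banana"]:
    print(w, estimate_syllables(w))
```

A reasonable design is to try the dictionary first and fall back to a heuristic like this only for unknown words.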
12
votes
2 answers
Tradeoffs between Storm and Hadoop (MapReduce)
Can someone kindly tell me about the trade-offs involved when choosing between Storm and MapReduce in a Hadoop cluster for data processing? Of course, aside from the obvious one, that Hadoop (processing via MapReduce in a Hadoop cluster) is a batch…
mbbce
- 347
- 2
- 8
12
votes
1 answer
Can HDF5 be reliably written to and read from simultaneously by separate python processes?
I'm writing a script to record live data over time into a single HDF5 file which includes my whole dataset for this project. I'm working with Python 3.6 and decided to create a command line tool using click to gather the data.
My concern is what…
basse
- 297
- 3
- 8
12
votes
1 answer
What feature engineering is necessary with tree based algorithms?
I understand data hygiene, which is probably the most basic feature engineering. That is, making sure all your data is properly loaded, making sure N/As are treated as a special value rather than a number between -1 and 1, and tagging your…
William Entriken
- 423
- 1
- 4
- 10
12
votes
3 answers
Find the consecutive zeros in a DataFrame and do a conditional replacement
I have a dataset like this:
Sample Dataframe
import pandas as pd
df = pd.DataFrame({
    'names': ['A','B','C','D','E','F','G','H','I','J','K','L'],
    'col1': [0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0],
    'col2': [0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0,…
Kevin
- 533
- 2
- 5
- 12
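The first step in this kind of problem is usually to label each zero with the length of the run it belongs to. The sketch below does that in plain Python on the col1 values shown in the excerpt. The replacement rule itself is cut off in the excerpt, so min_run and fill are hypothetical parameters chosen only for illustration.

```python
from itertools import groupby


def zero_run_lengths(values):
    """For each position, give the length of its run of zeros (0 for non-zero values)."""
    lengths = []
    for is_zero, run in groupby(values, key=lambda v: v == 0):
        n = len(list(run))
        lengths.extend([n if is_zero else 0] * n)
    return lengths


def replace_long_zero_runs(values, min_run, fill):
    """Hypothetical rule: replace zeros that sit in a run of at least min_run zeros."""
    runs = zero_run_lengths(values)
    return [fill if v == 0 and n >= min_run else v for v, n in zip(values, runs)]


col1 = [0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0]
print(zero_run_lengths(col1))
# [1, 0, 1, 0, 0, 0, 3, 3, 3, 0, 2, 2]
print(replace_long_zero_runs(col1, min_run=3, fill=-1))
# [0, 1, 0, 1, 1, 1, -1, -1, -1, 1, 0, 0]
```

The same run labels can be attached to a DataFrame column and used as the condition for whatever replacement the full question asks for.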
12
votes
3 answers
Instances vs. cores when using EC2
Working on what could often be called "medium data" projects, I've been able to parallelize my code (mostly for modeling and prediction in Python) on a single system across anywhere from 4 to 32 cores. Now I'm looking at scaling up to clusters on…
Therriault
- 871
- 1
- 8
- 13
12
votes
3 answers
Relation between convolution in math and CNN
I've read an explanation of convolution and understand it to some extent. Can somebody help me understand how this operation relates to convolution in convolutional neural nets? Is the filter like the function g, which applies the weights?
noname7619
- 323
- 2
- 9
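One concrete way to see the relation asked about here: mathematical convolution flips the kernel g before sliding it over the signal, while the "convolution" layer in most deep learning frameworks slides the kernel without flipping it, which is cross-correlation. The 1D sketch below uses made-up numbers and hypothetical function names to show the difference.

```python
def cross_correlate(signal, kernel):
    """Slide the kernel over the signal without flipping it ('valid' positions only)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def convolve(signal, kernel):
    """Mathematical convolution: the same sliding sum, but with the kernel reversed."""
    return cross_correlate(signal, kernel[::-1])


signal = [1, 2, 3, 4]
kernel = [1, 0, -1]  # plays the role of g, i.e. the filter weights

print(cross_correlate(signal, kernel))  # [-2, -2]
print(convolve(signal, kernel))         # [2, 2]
```

Since the filter weights in a CNN are learned, the flip makes no practical difference: the network can learn the reversed kernel directly. For a symmetric kernel the two operations give identical results.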
12
votes
3 answers
Xgboost - How to use feature_importances_ with XGBRegressor()?
How can we get feature_importances_ when performing regression with XGBRegressor()?
Is there something like XGBClassifier().feature_importances_?
Simone
- 705
- 1
- 14
- 23
12
votes
2 answers
What is the feature matrix in word2vec?
I'm a beginner in neural networks and currently I'm exploring the word2vec model. However, I'm having a tough time understanding what exactly the feature matrix is.
I can understand that the first matrix is a one-hot encoding vector for a given…
Satrajit Maitra
- 121
- 1
- 4
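One common reading of "feature matrix" in word2vec is the input-to-hidden weight matrix, in which each row holds the vector of one vocabulary word. Under that assumption, multiplying a one-hot vector by the matrix simply selects a row. The sketch below uses a made-up 3-word vocabulary and made-up weights.

```python
def one_hot(index, size):
    """Return a vector of zeros with a single 1 at the given index."""
    return [1 if i == index else 0 for i in range(size)]


def vector_times_matrix(v, M):
    """Multiply a row vector v (length n) by an n x d matrix M."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


# Hypothetical weights: 3 vocabulary words, 2 features per word.
W = [
    [0.1, 0.2],  # word 0
    [0.3, 0.4],  # word 1
    [0.5, 0.6],  # word 2
]

hidden = vector_times_matrix(one_hot(1, 3), W)
print(hidden)  # [0.3, 0.4], the row of W for word 1
```

In practice implementations skip the multiplication and look up the row by index, which gives the same result.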
12
votes
4 answers
How to know the model has started overfitting?
I hope the following excerpts will provide an insight into what my question is going to be. These are from here.
The learning then gradually slows down. Finally, at around epoch 280 the classification accuracy pretty much stops improving. Later…
figs_and_nuts
- 833
- 1
- 5
- 14
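A common practical signal for the situation described in this excerpt is that the validation score stops improving while training continues. The sketch below is a minimal patience-based check over a list of per-epoch validation scores. The function name, the patience value and the sample scores are assumptions for illustration, not taken from the question or its source.

```python
def early_stopping_epoch(val_scores, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for validation scores where higher is better.

    stop_epoch is the first epoch at which the score has failed to improve
    for `patience` epochs in a row, or None if that never happens.
    """
    best_score = float("-inf")
    best_epoch = None
    for epoch, score in enumerate(val_scores):
        if score > best_score + min_delta:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return None, best_epoch


# Hypothetical validation accuracies: improvement stops after epoch 2.
val_acc = [0.50, 0.60, 0.70, 0.69, 0.68, 0.67]
print(early_stopping_epoch(val_acc, patience=3))  # (5, 2)
```

The weights saved at best_epoch are the ones usually kept, since later epochs fit the training data more closely without helping on held-out data.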
12
votes
2 answers
Naming conventions for dataframes
I often find myself writing code like the following (oversimplified example):
df = read_csv('customer_data_export.csv')
df2 = df.query("date > '2017-01-10'")
data = df_filtered.groupby('transaction_id').sum()
plot_data = pivot_table(data,…
Max Flander
- 316
- 1
- 2
- 7