Most Popular

1500 questions
13
votes
1 answer

How does the max_features parameter work in DecisionTreeClassifier?

What is the parameter max_features in DecisionTreeClassifier responsible for? I thought it defines the number of features the tree uses to generate its nodes. But in spite of the different values of this parameter (n = 1 and 2), my tree employs both…
James Flash
  • 301
  • 1
  • 2
  • 8
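A minimal sketch for the max_features question above, assuming scikit-learn is installed. In scikit-learn, max_features limits how many features are considered at each split, not how many the whole tree may use, so a tree with max_features=1 can still end up splitting on several features. The toy data below is made up for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data (made up for illustration): four classes on a 2x2 grid,
# so both features are needed to separate them.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 10)
y = 2 * X[:, 0] + X[:, 1]

# max_features=1: one candidate feature is drawn at random for each split.
clf = DecisionTreeClassifier(max_features=1, random_state=0).fit(X, y)

# tree_.feature holds the split feature of each node; negative values mark leaves.
used_features = sorted({int(f) for f in clf.tree_.feature if f >= 0})
print("features used across the tree:", used_features)
print("training accuracy:", clf.score(X, y))
```

Because a different feature can be drawn at every node, the fitted tree here uses both columns even though each individual split only looked at one.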
13
votes
5 answers

How does sigmoid activation work in multi-class classification problems?

I know that for a problem with multiple classes we usually use softmax, but can we also use sigmoid? I have tried to implement digit classification with sigmoid at the output layer, and it works. What I don't understand is how it works.
bharath chandra
  • 131
  • 1
  • 1
  • 4
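A small sketch for the sigmoid question above, using made-up logits. Sigmoid is applied to each output independently and is monotonic, so the largest logit still gets the largest score and argmax picks the same class as softmax would. The difference is that sigmoid scores do not sum to 1.

```python
import numpy as np

def sigmoid(z):
    # Element-wise logistic function; each output is scored independently.
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the max for numerical stability; outputs sum to 1.
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, -1.0, 0.5])  # made-up scores for three classes

sig = sigmoid(logits)
soft = softmax(logits)

pred_sig = int(np.argmax(sig))
pred_soft = int(np.argmax(soft))

print("sigmoid scores:", sig, "sum =", sig.sum())
print("softmax scores:", soft, "sum =", soft.sum())
print("predicted class (sigmoid, softmax):", pred_sig, pred_soft)
```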
13
votes
1 answer

Can Reinforcement learning be applied for time series forecasting?

Can Reinforcement learning be applied for time series forecasting?
Osama Dar
  • 599
  • 2
  • 8
  • 19
13
votes
3 answers

How can I do classification with categorical data which is not fixed?

I have a classification problem with both categorical and numerical data. The problem I'm facing is that my categorical data is not fixed: the new candidate whose label I want to predict may have a new category which was not observed…
Marisa
  • 281
  • 1
  • 2
  • 6
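One common way to handle the unseen-category problem described above, sketched with scikit-learn's OneHotEncoder and its handle_unknown="ignore" option. This is only one possible approach, not necessarily what the answers recommend, and the color values are made up for illustration.

```python
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training column with two observed categories.
train = [["red"], ["green"], ["red"]]

# handle_unknown="ignore" encodes categories not seen during fit as all zeros
# instead of raising an error at transform time.
encoder = OneHotEncoder(handle_unknown="ignore").fit(train)

seen = encoder.transform([["green"]]).toarray()
unseen = encoder.transform([["blue"]]).toarray()

print("categories learned:", encoder.categories_)
print("encoding of a seen category:", seen)
print("encoding of an unseen category:", unseen)
```

The all-zero row means the model receives no information from that column for the new candidate, which may or may not be acceptable depending on how important the feature is.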
13
votes
1 answer

Neo4j vs OrientDB vs Titan

I am working on a data-science project related to social relationship mining and need to store data in a graph database. Initially I chose Neo4j as the database, but it seems Neo4j doesn't scale well. The alternatives I found are Titan and…
Sreejithc321
  • 1,920
  • 3
  • 18
  • 33
13
votes
1 answer

Source of Arthur Samuel's definition of machine learning

Many people seem to agree that Arthur Samuel wrote or said in 1959 that machine learning is the "Field of study that gives computers the ability to learn without being explicitly programmed". For example the quote is contained in this page, that…
Pierre Cattin
  • 263
  • 1
  • 2
  • 6
13
votes
2 answers

K-fold cross validation when using fit_generator and flow_from_directory() in Keras

I am using flow_from_directory() and fit_generator in my deep learning model, and I want to use cross-validation to train the CNN model.
datagen = ImageDataGenerator(rotation_range=15,width_shift_range=0.2, …
Noran
  • 768
  • 3
  • 8
  • 21
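A rough sketch of one way to approach the cross-validation question above: since flow_from_directory() reads a fixed folder, a common workaround is to split the list of file paths with scikit-learn's KFold and build a generator per fold from those lists (for example with Keras's flow_from_dataframe). The file names below are hypothetical, and the Keras part is left out so the snippet runs without it.

```python
from sklearn.model_selection import KFold

# Hypothetical image paths; in practice these would be collected from disk.
image_paths = [f"images/img_{i}.png" for i in range(10)]

kfold = KFold(n_splits=5, shuffle=True, random_state=0)

folds = []
for train_idx, val_idx in kfold.split(image_paths):
    # Map index arrays back to file names for this fold.
    train_files = [image_paths[i] for i in train_idx]
    val_files = [image_paths[i] for i in val_idx]
    folds.append((train_files, val_files))

for k, (train_files, val_files) in enumerate(folds):
    print(f"fold {k}: {len(train_files)} train files, {len(val_files)} validation files")
```

Each (train_files, val_files) pair can then be turned into its own pair of generators, with a fresh model trained per fold.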
13
votes
3 answers

Why is the F-measure preferred for classification tasks?

Why is the F-measure usually used for (supervised) classification tasks, whereas the G-measure (or Fowlkes–Mallows index) is generally used for (unsupervised) clustering tasks? The F-measure is the harmonic mean of the precision and recall. The…
Bruno Lubascher
  • 3,548
  • 1
  • 12
  • 36
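To make the two measures in the question above concrete, here is a small sketch: the F-measure is the harmonic mean of precision and recall, and the G-measure (Fowlkes–Mallows index) is their geometric mean. The precision and recall values are made up for illustration.

```python
import math

def f_measure(precision, recall):
    # Harmonic mean of precision and recall (F1).
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def g_measure(precision, recall):
    # Geometric mean of precision and recall (Fowlkes-Mallows index).
    return math.sqrt(precision * recall)

precision, recall = 0.5, 1.0  # made-up values

f_score = f_measure(precision, recall)
g_score = g_measure(precision, recall)

print("F-measure:", f_score)
print("G-measure:", g_score)
```

The harmonic mean is never larger than the geometric mean, so the F-measure penalizes an imbalance between precision and recall more strongly.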
13
votes
2 answers

Batch Size of Stateful LSTM in Keras

My model is defined as below:
## defining the model
batch_size = 1
def my_model():
    input_x = Input(batch_shape=(batch_size, look_back, 4), name='input')
    drop = Dropout(0.5)
    lstm_1 = LSTM(100, return_sequences=True,…
Jazz
  • 420
  • 1
  • 5
  • 15
13
votes
2 answers

How does generalised advantage estimation work?

I've been trying to add GAE to my A2C implementation for a while now, but I can't quite seem to grok how it works. My understanding is that it reduces the variance of the advantage estimation function by kind of 'averaging out' (or…
Omegastick
  • 233
  • 1
  • 3
  • 6
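For the GAE question above, a minimal NumPy sketch of the usual recursion: compute the one-step TD errors delta_t = r_t + gamma * V(s_{t+1}) - V(s_t), then accumulate them backwards with weight gamma * lambda. It assumes a single episode with no terminal masking, and the rewards and values are made up.

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalised advantage estimates for one trajectory.

    `values` must have one more entry than `rewards`: the last entry is the
    bootstrap value of the state reached after the final reward.
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    # One-step TD errors.
    deltas = rewards + gamma * values[1:] - values[:-1]
    advantages = np.zeros_like(rewards)
    running = 0.0
    # Backward pass: A_t = delta_t + gamma * lam * A_{t+1}.
    for t in reversed(range(len(rewards))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages

rewards = [1.0, 1.0, 1.0]           # made-up rewards
values = [0.5, 0.5, 0.5, 0.0]       # made-up value estimates, last one is the bootstrap

adv_td = gae(rewards, values, gamma=0.9, lam=0.0)   # reduces to one-step TD errors
adv_mc = gae(rewards, values, gamma=0.9, lam=1.0)   # reduces to discounted return minus V
print("lambda=0:", adv_td)
print("lambda=1:", adv_mc)
```

Setting lam between 0 and 1 interpolates between the low-variance, higher-bias one-step estimate and the high-variance Monte Carlo estimate.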
13
votes
1 answer

What is the difference between Fully Connected layer and Bilinear layer in CNN?

What is the difference between Fully Connected layers and Bilinear layers in deep learning?
N.IT
  • 1,995
  • 4
  • 19
  • 35
13
votes
3 answers

What are the consequences of not freezing layers in transfer learning?

I am trying to fine-tune some code from a Kaggle kernel. The model uses pretrained VGG16 weights (via 'imagenet') for transfer learning. However, I notice there is no freezing of layers as is recommended in a Keras blog. One approach would be…
Borealis
  • 347
  • 2
  • 4
  • 16
13
votes
3 answers

Why do we need 2 matrices for word2vec or GloVe

Word2vec and GloVe are the two best-known word embedding methods. Many works have pointed out that these two models are actually very close to each other and that, under some assumptions, they perform a matrix factorization of the PPMI of the co-occurrences…
Robin
  • 1,337
  • 9
  • 19
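For readers unfamiliar with the PPMI matrix mentioned in the question above, here is a short NumPy sketch of how positive pointwise mutual information is computed from a word-by-context co-occurrence count matrix. The counts are made up, and this only illustrates the matrix itself, not the factorization argument.

```python
import numpy as np

def ppmi(counts):
    """Positive PMI from a (word x context) co-occurrence count matrix."""
    counts = np.asarray(counts, dtype=float)
    p_wc = counts / counts.sum()               # joint probabilities
    p_w = p_wc.sum(axis=1, keepdims=True)      # word marginals
    p_c = p_wc.sum(axis=0, keepdims=True)      # context marginals
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))
    # Zero counts give -inf (or nan for empty rows); treat them as 0.
    pmi[~np.isfinite(pmi)] = 0.0
    return np.maximum(pmi, 0.0)

counts = [[2, 0],
          [0, 2]]  # made-up co-occurrence counts

ppmi_matrix = ppmi(counts)
print(ppmi_matrix)
```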
13
votes
5 answers

What are helpful annotation tools (if any)

I'm looking for tools that would help me and my team annotate training sets. I work in an environment with large sets of data, some of which are un- or semi-structured. In many cases there are registrations that help in finding a ground truth. In…
S van Balen
  • 1,364
  • 1
  • 9
  • 28
13
votes
3 answers

Train new data to pre-trained model

Let's say I've trained my model and made my predictions. My question is: how can I add some new data to my pre-trained model without retraining the model from the beginning?
porfgian
  • 173
  • 1
  • 1
  • 10
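One possible approach to the question above, sketched with scikit-learn: some estimators expose partial_fit, which updates an already-trained model with a new batch instead of refitting from scratch. Whether this applies depends on the model type; SGDClassifier and the random data below are just stand-ins for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Made-up "old" and "new" batches; the label depends on the first feature.
X_old = rng.normal(size=(100, 3))
y_old = (X_old[:, 0] > 0).astype(int)
X_new = rng.normal(size=(20, 3))
y_new = (X_new[:, 0] > 0).astype(int)

model = SGDClassifier(random_state=0)

# The first call must list every class the model will ever see.
model.partial_fit(X_old, y_old, classes=np.array([0, 1]))

# Later, update the same model with the new data only.
model.partial_fit(X_new, y_new)

predictions = model.predict(X_new)
print("classes known to the model:", model.classes_)
print("predictions on the new batch:", predictions)
```

Models without partial_fit (for example a standard decision tree) generally have to be retrained on the combined data.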