Most Popular
1500 questions
11
votes
2 answers
Counting indexes in pandas
I feel like this is a rudimentary question but I'm very new to this and just haven't been able to crack it / find the answer.
Ultimately what I'm trying to do here is to count unique values in a certain column and then determine which of those…
Mr. Hasquestions
- 113
- 1
- 1
- 6
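A minimal sketch of the counting step this question describes, assuming a hypothetical DataFrame `df` with a column named `city` (both invented for illustration). `value_counts()` reports how often each distinct value occurs, and `nunique()` reports how many distinct values there are. The rest of the question is truncated in the excerpt, so only the counting step is shown.

```python
import pandas as pd

# Hypothetical sample data; the column name "city" is an assumption.
df = pd.DataFrame({"city": ["Oslo", "Paris", "Oslo", "Rome", "Oslo", "Paris"]})

# Occurrences of each distinct value, most frequent first.
counts = df["city"].value_counts()

# Number of distinct values in the column.
n_unique = df["city"].nunique()

print(counts)
print(n_unique)
```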
11
votes
1 answer
Multiple categorical values for a single feature: how to convert them to binary using Python
I have a data set of movies which has 28 columns. One of them is genres. For each row in this data set, the value for column genres is of the form "Action|Animation|Comedy|Family|Fantasy". I want to encode them using pandas.get_dummies() but since…
aks_Nin
- 111
- 1
- 1
- 4
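One possible approach to the encoding described in this question, sketched under assumptions: the two-row `movies` frame and its `title` column are hypothetical, and only the pipe-delimited `genres` format comes from the excerpt. `Series.str.get_dummies(sep="|")` splits each string on the separator and emits one 0/1 column per distinct token.

```python
import pandas as pd

# Hypothetical sample; only the "A|B|C" genres format is taken from the question.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "genres": ["Action|Animation|Comedy", "Comedy|Family"],
})

# One indicator column per genre; a row may have several 1s.
genre_flags = movies["genres"].str.get_dummies(sep="|")

# Replace the original string column with the indicator columns.
encoded = pd.concat([movies.drop(columns="genres"), genre_flags], axis=1)
print(encoded)
```

Unlike `pandas.get_dummies()` applied to the raw column, this treats each genre as its own feature rather than treating every distinct combination string as a single category.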
11
votes
4 answers
Feature selection and classification accuracy relation
One methodology for selecting a subset of the available features for your classifier is to rank them according to a criterion (such as information gain) and then calculate the accuracy using your classifier and a subset of the ranked…
Pauline
- 113
- 1
- 1
- 6
11
votes
2 answers
Unable to figure out the linear embedding layer in the convolutional neural network?
I have the network architecture from the paper "Learning Fine-grained Image Similarity with Deep Ranking" and I am unable to figure out how the output from the three parallel networks is merged using the linear embedding layer.
The only information…
A. Sam
- 233
- 1
- 6
11
votes
2 answers
Early stopping in multi-output deep learning
When working with a neural network with more than one output, what is generally advised as the best strategy for early-stopping the training process?
Given that I am currently monitoring the net validation loss (validation loss from n different…
didgeridoo92
- 111
- 1
- 4
11
votes
2 answers
What does Negative Log Likelihood mean?
I have a data set which has continuous independent variables and a continuous dependent variable. To predict the dependent variable using the independent variables, I've run an ensemble of regression models and tried to compare them against each…
Minu
- 805
- 2
- 9
- 18
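For context on the term in this question: the negative log likelihood is the negative sum of the log-probabilities a model assigns to the observed targets, so lower values indicate a better fit. The sketch below assumes (this is not stated in the question) that each prediction is treated as the mean of a Gaussian with a fixed standard deviation `sigma`; the function name and the sample numbers are hypothetical.

```python
import math

def gaussian_nll(y_true, y_pred, sigma=1.0):
    """Summed negative log likelihood of y_true under N(y_pred, sigma^2)."""
    nll = 0.0
    for y, mu in zip(y_true, y_pred):
        # -log of the Gaussian density evaluated at y
        nll += 0.5 * math.log(2 * math.pi * sigma ** 2)
        nll += (y - mu) ** 2 / (2 * sigma ** 2)
    return nll

# Hypothetical targets and two sets of predictions.
y_true = [1.0, 2.0, 3.0]
good = gaussian_nll(y_true, [1.1, 1.9, 3.2])
bad = gaussian_nll(y_true, [3.0, 0.0, 1.0])

# Predictions closer to the targets yield a lower NLL.
print(good < bad)
```

With `sigma` held fixed, ranking models by this quantity is equivalent to ranking them by squared error; the two differ only when each model also estimates its own variance.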
11
votes
1 answer
What is the significance of model merging in Keras?
I have learned that Keras has a functionality to "merge" two models according to the following:
from keras.models import Sequential
from keras.layers import Dense, Merge
left_branch = Sequential()
left_branch.add(Dense(32, input_dim=784))
right_branch =…
Hendrik
- 8,587
- 17
- 42
- 55
11
votes
3 answers
Can map-reduce algorithms written for MongoDB be ported to Hadoop later?
In our company, we have a MongoDB database containing a lot of unstructured data, on which we need to run map-reduce algorithms to generate reports and other analyses. We have two approaches to select from for implementing the required…
Amir Ali Akbari
- 1,393
- 3
- 13
- 25
11
votes
1 answer
Calculate cosine similarity in Apache Spark
I have a DataFrame with IDF of certain words computed.
For example
(10,[0,1,2,3,4,5],[0.413734499590671,0.4244680552337798,0.4761400657781007, 1.4004620708967006,0.37876590175292424,0.48374466516332])
.... and so on
Now, given a query Q, I can…
Ganesh Krishnan
- 243
- 1
- 2
- 6
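A sketch of the cosine similarity computation itself, in plain Python rather than Spark. The document vector reuses the `(size, indices, values)` sparse form shown in the excerpt; the query vector and the helper names `to_dense` and `cosine_similarity` are hypothetical. The same arithmetic applies in Spark, but the Spark API calls are deliberately not shown here.

```python
import math

def to_dense(size, indices, values):
    """Expand a (size, indices, values) sparse vector into a dense list."""
    vec = [0.0] * size
    for i, v in zip(indices, values):
        vec[i] = v
    return vec

def cosine_similarity(a, b):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

# Document vector taken from the excerpt.
doc = to_dense(10, [0, 1, 2, 3, 4, 5],
               [0.413734499590671, 0.4244680552337798, 0.4761400657781007,
                1.4004620708967006, 0.37876590175292424, 0.48374466516332])

# Hypothetical query vector that shares only term 3 with the document.
query = to_dense(10, [3], [1.0])

print(cosine_similarity(doc, query))
```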
11
votes
3 answers
Is TensorFlow a complete Machine Learning Library?
I am new to TensorFlow and I need to understand the capabilities and shortcomings of TensorFlow before I can use it. I know that it is a deep learning framework, but apart from that which other machine learning algorithms can we use with tensor…
Swaroop
- 213
- 1
- 2
- 6
11
votes
2 answers
Bookkeeping of experiment runs and results
I am a hands-on researcher and I like testing out viable solutions, so I tend to run a lot of experiments. For example, if I am calculating a similarity score between documents, I might want to try out many measures. In fact, for each measure I…
machine-wisdom
- 113
- 5
11
votes
3 answers
How can I classify text considering word order, instead of just using a bag-of-words approach?
I've made a Naive Bayes classifier that uses the bag-of-words technique to classify spam posts on a message board. It works, but I think I could get much better results if my models considered the word orderings and phrases. (ex: 'girls' and 'live'…
Yerk
- 211
- 1
- 5
11
votes
4 answers
Choosing regularization method in neural networks
When training neural networks, there are at least 4 ways to regularize the network:
L1 Regularization
L2 Regularization
Dropout
Batch Normalization
plus of course other things like weight sharing and reducing the number of connections, which…
Thomas Johnson
- 665
- 1
- 7
- 11
11
votes
2 answers
How much time do scikit-learn classifiers take to classify?
I am planning to use the scikit-learn linear support vector machine (SVM) classifier for text classification on a corpus consisting of 1 million labeled documents. What I am planning to do is, when a user enters some keyword, the classifier will first…
user3498
- 111
- 2
11
votes
7 answers
ChatGPT's Architecture - Decoder Only? Or Encoder-Decoder?
Does ChatGPT use an encoder-decoder architecture, or a decoder-only architecture? I have been coming across Medium and TowardsDataScience articles suggesting that it has an encoder-decoder architecture (see sources below):
--…
user141493
- 251
- 1
- 3
- 9