Highest Voted Questions - Data Science Stack Exchange

10

votes

5 answers

Time-series grouped cross-validation

I have data with the following structure:

created_at | customer_id | features | target
2019-01-01 | 2 | xxxxxxxx | y
2019-01-02 | 3 | xxxxxxxx | y
2019-01-03 | 3 | xxxxxxxx | y
...

That is, a session…

asked Jul 14 '20 at 07:05

David Masip

10

votes

1 answer

How is GPT able to handle large vocabularies?

From what I understand, GPT and GPT-2 are trained to predict the $N^{th}$ word in a sentence given the previous $N-1$ words. When the vocabulary size is very large (100k+ words) how is it able to generate any meaningful prediction? Shouldn't it…

asked Jul 11 '20 at 03:33

AAC

10

votes

4 answers

Can Boosted Trees predict below the minimum value of the training label?

I am using Gradient Boosted Trees (with CatBoost) for a regression task. Can GB trees predict a label that is below the minimum (or above the maximum) seen in training? For instance, if the minimum value of the label is 10, would…

asked Jul 06 '20 at 11:39

Yairh

10

votes

2 answers

Is this Neo4j comparison to RDBMS execution time correct?

Background: The following is from the book Graph Databases, which covers a performance test mentioned in the book Neo4j in Action: Relationships in a graph naturally form paths. Querying, or traversing, the graph involves following paths. Because of…

asked May 15 '14 at 01:22

blunders

10

votes

2 answers

What is a good interpretation of this 'learning curve' plot?

I read about validation_curve and how to interpret it to tell whether there is overfitting or underfitting, but how can I interpret the plot when the data is the error, like this: The X-axis is "Nº of examples of training". The red line is the train error. Green…

asked Jun 27 '20 at 09:36

Tlaloc-ES

10

votes

5 answers

AttributeError: module 'tensorflow.python.keras.utils' has no attribute 'to_categorical'

I'm trying to run the code below in my Jupyter Notebook. I get: AttributeError: module 'tensorflow.python.keras.utils' has no attribute 'to_categorical' This is code from a Kaggle tutorial. I have installed Keras and TensorFlow. import numpy as np …

asked Jun 18 '20 at 08:36

vojtak

10

votes

3 answers

BPE vs WordPiece Tokenization - when to use / which?

What's the general tradeoff between choosing BPE vs WordPiece Tokenization? When is one preferable to the other? Are there any differences in model performance between the two? I'm looking for a general overall answer, backed up with specific…

asked Jun 02 '20 at 14:21

vgoklani

10

votes

4 answers

Skewed multi-class data

I have a dataset which contains ~100,000 samples of 50 classes. I have been using SVM with an RBF kernel to train and predict new data. The problem though is the dataset is skewed towards different classes. For example, Class 1 - 30 (~3% each),…

asked Jul 14 '14 at 13:53

mike1886

10

votes

1 answer

Is Minimax Linkage a Lance-Williams hierarchical clustering?

I found the following article on "Hierarchical Clustering With Prototypes via Minimax Linkage". It is stated in Property 6 that Minimax linkage cannot be written using Lance–Williams updates. A succinct proof using a counter-example is…

asked Sep 02 '15 at 13:34

mic

10

votes

2 answers

What are some key strengths of BERT over ELMO/ULMFiT?

I see the BERT family being used as a benchmark everywhere for NLP tasks. What are some key strengths of BERT over models like ELMO or ULMFiT?

asked Feb 16 '20 at 03:28

Akshay

10

votes

2 answers

How to get feature importance from a keras deep learning model?

In the case of scikit-learn's models, we can get feature importance using the relevant attributes of the model. I've been working on an RNN, using LSTMs for text embedding. Is there any way to get the feature importance of various features from the…

asked Feb 14 '20 at 15:27

soham_dhole

10

votes

4 answers

How to impute Missing values not the usual way?

I have a dataset of 4712 records and am working on binary classification. Label 1 is 33% and Label 0 is 67%. I can't drop records because my sample is already small. There are a few columns that have around 250-350 missing records. How do I know…

asked Jan 11 '20 at 07:52

The Great

10

votes

2 answers

Reducing the dimensionality of word embeddings

I trained word embeddings with 300 dimensions. Now, I would like to have word embeddings with 50 dimensions: is it better to retrain the word embeddings with 50 dimensions, or can I use some dimensionality reduction method to scale the word…

asked Jul 28 '15 at 17:54

Franck Dernoncourt

10

votes

2 answers

How are samples selected from training data in Xgboost

In Random Forest, each tree is not fed the full batch of training data, only a sample. How does this work for XGBoost? If this sampling happens as well, how does it work for this ML algorithm?

asked Jan 08 '20 at 09:32

Aman Raparia

10

votes

3 answers

What is the correct way to call Keras flow_from_directory() method?

In the following article there is an instruction that the dataset needs to be divided into train, validation and test folders, where the test folder should not contain the labeled subfolders. Instead it should only contain a single folder (i.e.…

keras

asked Jan 06 '20 at 18:29

Tauno
