Highest Voted Questions - Data Science Stack Exchange

14

votes

2 answers

Class token in ViT and BERT

I'm trying to understand the architecture of the ViT Paper, and noticed they use a CLASS token like in BERT. To the best of my understanding this token is used to gather knowledge of the entire class, and is then solely used to predict the class of…

asked Mar 14 '21 at 18:01

Shir

241
1
2
5

14

votes

2 answers

What to do when testing data has less features than training data?

Let's say we are predicting the sales of a shop and my training data has two sets of features: one about the store sales with the dates (the field "Store" is not unique), and one about the store types (the field "Store" is unique here). So the matrix…

asked Nov 18 '15 at 09:12

alvas

2,410
7
25
40

14

votes

9 answers

Is Python suitable for big data

I read in this post, "Is the R language suitable for Big Data", that big data constitutes 5TB, and while it does a good job of providing information about the feasibility of working with this type of data in R, it provides very little information about…

asked Jul 18 '14 at 22:34

ragingSloth

1,824
3
14
15

14

votes

3 answers

When are p-values deceptive?

What are the data conditions that we should watch out for, where p-values may not be the best way of deciding statistical significance? Are there specific problem types that fall into this category?

asked May 14 '14 at 22:12

user179

143
1
4

14

votes

10 answers

How can I appropriately handle cleaning of gender data?

I’m a data science student and I’ve begun working with an open mental health dataset. As part of this, I need to clean the data so that I can perform an analysis of it. In this dataset, the gender field is a string that could have had anything…

asked Mar 20 '20 at 04:23

nick012000

263
2
9

14

votes

5 answers

How can you include information not present in an image for neural networks?

I am training a CNN to identify objects in images (one label per image). However, I have additional information about these images that cannot be retrieved by looking at the image itself. In more detail, I'm talking about the physical location of…

asked Feb 21 '20 at 09:08

seb

143
1
6

14

votes

3 answers

what is darknet and why is it needed for YOLO object detection?

What is darknet and why is it needed for YOLO object detection? I read that it's a neural network written in C, but why is it needed for YOLO object detection when we have a lot of machine learning frameworks and APIs like TensorFlow, Keras, and PyTorch? I'm…

asked Jan 06 '20 at 09:59

star

1,471
7
19
29

14

votes

2 answers

What are the good parameter ranges for BERT hyperparameters while finetuning it on a very small dataset?

I need to finetune a BERT model (from the Hugging Face repository) on a sentence classification task. However, my dataset is really small. I have 12K sentences and only 10% of them are from positive classes. Does anyone here have any experience on…

asked Dec 10 '19 at 18:31

zwlayer

259
1
2
8

14

votes

3 answers

How to automatically mount my Google Drive to Google Colab

I have recently discovered Google Colab and I am wondering if there is an option to permanently authorize Google Colab to access and mount my Google Drive. Code: from google.colab import drive; drive.mount('/content/drive') Output: Go to this URL in a browser:…

asked Dec 09 '19 at 14:59

Georgi Stoyanov

243
1
2
5

14

votes

2 answers

Preprocessing for Text Classification in Transformer Models (BERT variants)

This might be silly to ask, but I am wondering if one should carry out the conventional text preprocessing steps for training one of the transformer models. I remember that for training Word2Vec or GloVe, we needed to perform an extensive text cleaning…

asked Nov 08 '19 at 06:28

TwinPenguins

4,249
3
19
53

14

votes

2 answers

SHAP value analysis gives different feature importance on train and test set

Should SHAP value analysis be done on the train or test set? What does it mean if the feature importance based on mean |SHAP value| is different between the train and test set of my lightgbm model? I intend to use SHAP analysis to identify how each…

asked Oct 07 '19 at 19:10

pbk

143
1
5

14

votes

3 answers

Where can I download historical market capitalization and daily turnover data for stocks?

There are plenty of sources which provide the historical stock data but they only provide the OHLC fields along with volume and adjusted close. Also a couple of sources I found provide market cap data sets but they're restricted to US stocks. Yahoo…

dataset

asked Jun 25 '14 at 18:06

tejaskhot

4,065
7
20
18

14

votes

2 answers

Fast k-means like algorithm for $10^{10}$ points?

I am looking to do k-means clustering on a set of 10-dimensional points. The catch: there are $10^{10}$ points. I am looking for just the center and size of the largest clusters (let's say 10 to 100 clusters); I don't care about what cluster each…

asked May 11 '15 at 06:53

Alex I

3,152
1
21
27

14

votes

1 answer

Differences between gradient calculated by different reduction methods in PyTorch

I'm playing with the different reduction methods provided in the built-in loss functions. In particular, I would like to compare the following: the averaged gradient obtained by performing a backward pass for each loss value calculated with reduction="none"; the…

asked Jul 05 '19 at 18:12

Zhuoran Liu

141
1
3

14

votes

5 answers

How to make LightGBM to suppress output?

I have tried for a while to figure out how to "shut up" LightGBM. In particular, I would like to suppress the output of LightGBM during training (i.e. feedback on the boosting steps). My model: params = { 'objective': 'regression', …

asked Jun 17 '19 at 15:06

Peter

7,446
5
19
49
