Highest Voted Questions - Data Science Stack Exchange

9

votes

2 answers

How to model user's buying behavior on Amazon?

For our final course project in Data Science, we proposed the following: Given the Amazon Reviews Dataset, we plan to come up with an algorithm (that's roughly based on Personalized PageRank) that determines a strategic position for placing ads on…

asked Nov 05 '15 at 17:06

Pavan Manjunath

191
2

9

votes

2 answers

High accuracy on test-set, what could go wrong?

You are given a pre-trained binary ML classification model with 99% accuracy on the test-set (assume the customer required 95% and that the test-set is balanced). We would like to deploy our model in production. What could go wrong? How would you…

asked Dec 17 '20 at 11:20

CodeHoarder

193
1
4

9

votes

2 answers

Effect of Stop-Word Removal on Transformers for Text Classification

The domain here is essentially topic classification, so not necessarily a problem where stop-words have an impact on the analysis (as opposed to, say, sentiment analysis where structure can affect meaning). With respect to the positional encoding…

asked Dec 03 '20 at 20:24

Andy

650
4
13

9

votes

3 answers

Sentiment Analysis Tutorial

I am trying to understand sentiment analysis and how to apply it using any language (R, Python, etc.). I would like to know if there is a good place on the internet for a tutorial that I can follow. I googled, but I wasn't very satisfied because they…

asked Oct 12 '15 at 03:31

KurioZ7

285
3
7

9

votes

3 answers

Why is 10000 used as the denominator in Positional Encodings in the Transformer Model?

I was working through the Attention Is All You Need paper, and while the motivation for positional encodings makes sense and other Stack Exchange answers filled me in on the motivation for their structure, I still don't understand why…

asked Oct 01 '20 at 21:27

ThirtyOneTwentySeven

191
1
3

9

votes

1 answer

Why is the cosine distance used to measure the similarity between word embeddings?

When computing the similarity between words, cosine similarity or distance is computed on the word vectors. Why aren't other distance metrics, such as Euclidean distance, suitable for this task? Let us consider two vectors a and b, where a = [-1,2,-3]…

asked Sep 03 '20 at 12:45

Ashwin Geet D'Sa

1,129
2
9
20

9

votes

2 answers

Does "feature importance" depend on the model type?

I was working on a small classification problem (breast cancer data set from sklearn), and trying to decide which features were most important to predict the labels. I understand that there are several ways to define "important feature" here…

asked Aug 24 '20 at 14:19

Frank

200
1
5

9

votes

1 answer

Original Meaning of "Intelligence" in "Business Intelligence"

What does the term "Intelligence" originally stand for in "Business Intelligence"? Is it meant as in "Artificial Intelligence" or as in "Intelligence Agency"? In other words, does "Business Intelligence" mean: "Acting smart &…

asked Sep 05 '15 at 16:42

Seyed Mohammad

193
6

9

votes

2 answers

image_dataset_from_directory VS flow_from_directory

What is the main difference between flow_from_directory and image_dataset_from_directory in Keras? Which one should I use?

asked Jul 28 '20 at 07:38

Bala venkatesh

391
1
3
10

9

votes

1 answer

Is it possible to have stratified train-test split of a set based on two columns?

Consider a dataframe that contains two columns, text and label. I can very easily create a stratified train-test split using sklearn.model_selection.train_test_split. The only thing I have to do is to set the column I want to use for the…

asked Jul 23 '20 at 13:09

Aventinus

213
1
3
7

9

votes

3 answers

Multivariate Time series analysis: When is a CNN vs. LSTM appropriate?

I have multiple features in a time series and want to predict the values of the same features for the next time step. I have already trained an LSTM which is working okay, but takes a bit long to train. So now my question: is it reasonable to use a…

asked Jul 20 '20 at 14:01

drops

220
2
7

9

votes

3 answers

How to set up and run Conda on Google Colab

I am interested in using Google Colab for data modeling. How do I install conda, create an environment, and run Python in a notebook? I did some searching and found some helpful hints, but had several issues with this. I can only get a partially…

asked Jun 13 '20 at 14:41

Donald S

1,939
3
8
28

9

votes

2 answers

Why is leaky ReLU not so common in practice?

Leaky ReLU does not map any value to 0, so training always continues, and I can't think of any disadvantages it has. Yet leaky ReLU is less popular than ReLU in practice. Can someone explain why?

asked May 14 '20 at 02:30

Prashant Gupta

201
2
4

9

votes

2 answers

Is BERT a language model?

Is BERT a language model in the sense of a function that gets a sentence and returns a probability? I know its main usage is sentence embedding, but can it also provide this functionality?

asked May 13 '20 at 12:22

Amit Keinan

796
6
19

9

votes

2 answers

What is the meaning of a quadratic relation when r = 0?

A website (on page 4) says: The correlation coefficient is a measure of linear relationship and thus a value of r = 0 does not imply there is no relationship between the variables. For example in the following scatterplot which implies no…

asked Apr 23 '20 at 02:43

Subhash C. Davar

613
5
18
