Highest Voted Questions - Data Science Stack Exchange

9

votes

2 answers

Feature reduction for an uncorrelated data set

I am working on a classification problem with a training data set that has 100 features. No pair of features shows any visible correlation. This can be seen in the example pair plot for some of the features: I am trying to find the right way to…

asked Sep 04 '19 at 18:45

Ruben Kazumov

211
1
4

9

votes

3 answers

Is There a Way to Re-Calibrate Predicted Probabilities After Using Class Weights?

I have classification data with far more negative instances than positive instances. I have used class weights in my models and have achieved the discrimination I want but the predicted probabilities from the models do not match the actual…

asked Sep 03 '19 at 20:36

from keras import michael

370
3
13
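
For the class-weight question above, one commonly cited starting point is a prior-shift correction of the odds. This is a minimal sketch rather than the accepted answer: it assumes the weighted model is well calibrated with respect to the weighted data, and that positives were up-weighted by a single factor (positive class weight divided by negative class weight). The function name `undo_class_weight` and the parameter `pos_weight_ratio` are hypothetical.

```python
def undo_class_weight(p_weighted, pos_weight_ratio):
    """Map a probability from a class-weighted model back to the unweighted scale.

    Assumption: weighting multiplies the odds of the positive class by
    pos_weight_ratio, so dividing the odds by that ratio reverses it.
    """
    if not 0.0 <= p_weighted <= 1.0:
        raise ValueError("p_weighted must be in [0, 1]")
    if pos_weight_ratio <= 0:
        raise ValueError("pos_weight_ratio must be positive")
    # Equivalent to: odds = p / (1 - p); odds /= ratio; p = odds / (1 + odds),
    # written so that p_weighted == 1.0 does not divide by zero.
    return p_weighted / (p_weighted + pos_weight_ratio * (1.0 - p_weighted))


# A score of 0.5 from a model trained with positives weighted 9x
# corresponds to odds of 1:9 on the original scale.
corrected = undo_class_weight(0.5, 9.0)
print(corrected)
```

If the model is not calibrated even on the weighted data, a fitted calibration step on held-out, unweighted data may be the safer route.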

9

votes

1 answer

How is the cross-product transformation defined for binary features?

I am reading the paper on Wide & Deep learning and for the wide component, it states that one of the most important transformations is the cross-product transformation. This is defined as follows: $$\phi_{k}(\mathbf{x})=\prod_{i=1}^{d} x_{i}^{c_{k…

asked Aug 12 '19 at 12:42

Dimitris Poulopoulos

93
1
3
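
As a rough illustration of the cross-product question above: the formula in the excerpt is cut off, so this sketch assumes the usual reading of the Wide & Deep paper, where each exponent is 0 or 1 and marks whether a feature takes part in the transformation. Under that assumption, the product over binary features behaves like a logical AND of the selected features. The function name `cross_product` and the example vectors are hypothetical.

```python
from math import prod


def cross_product(x, c):
    """Product of x_i ** c_i over all features.

    Assumes x holds binary features (0 or 1) and c holds binary
    exponents (1 = feature is part of this transformation).
    """
    if len(x) != len(c):
        raise ValueError("x and c must have the same length")
    # x_i ** 0 == 1 (including 0 ** 0 in Python), so unselected
    # features never change the product.
    return prod(xi ** ci for xi, ci in zip(x, c))


x = [1, 0, 1]                          # three binary features
both_on = cross_product(x, [1, 0, 1])  # selects features 0 and 2
one_off = cross_product(x, [1, 1, 0])  # selects features 0 and 1
print(both_on, one_off)
```

The result is 1 only when every selected feature equals 1, which is why the transformation captures feature co-occurrence.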

9

votes

3 answers

In an elbow curve, how do I find the point from where the curve starts to rise?

I am computing a distance metric on my data. The result is then being sorted in ascending order. The samples having distance more than a specific threshold are to be marked as outliers and will be discarded. Below is a plot of all distance…

asked Aug 07 '19 at 10:49

Faiz Kidwai

235
1
2
12
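
For the elbow question above, one simple heuristic is to pick the point that lies farthest below the straight line joining the first and last points of the sorted curve. This is a sketch of that heuristic only, not the method from the answers; the function name `find_elbow` and the sample distances are made up for illustration.

```python
def find_elbow(sorted_values):
    """Return the index of the elbow in an ascending sequence.

    Both axes are rescaled to [0, 1]; the elbow is the point with the
    largest gap between the chord (the diagonal y = x) and the curve.
    """
    n = len(sorted_values)
    if n < 3:
        raise ValueError("need at least 3 points")
    lo, hi = sorted_values[0], sorted_values[-1]
    if hi == lo:
        raise ValueError("all values are equal; there is no elbow")
    best_index, best_gap = 0, float("-inf")
    for i, value in enumerate(sorted_values):
        x_norm = i / (n - 1)
        y_norm = (value - lo) / (hi - lo)
        gap = x_norm - y_norm  # how far the curve sits below the chord
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index


distances = sorted([1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 6.0, 12.0])
elbow = find_elbow(distances)
threshold = distances[elbow]
outliers = [d for d in distances if d > threshold]
print(elbow, threshold, outliers)
```

The heuristic is sensitive to extreme maximum values, since they stretch the vertical axis, so the chosen threshold is worth checking against the plot.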

9

votes

2 answers

Is Faster RCNN the same thing as VGG-16, RESNET-50, etc... or not?

My understanding is that Faster RCNN is an architecture for performing object detection. It finds objects in an image and classifies them. My understanding is also that VGG-16, RESNET-50, etc... also find objects in images and classify them. Are…

asked Jun 26 '19 at 14:32

b19wh33l5

91
1
2

9

votes

2 answers

What does an Input layer of shape=(None,) or (None,12) actually mean?

Is this telling the model that there are two dimensions (i.e. it’s a matrix) but we don’t yet know the size of that particular dimension? If so, how can the model be compiled? Doesn’t the size of each dimension affect the number of nodes in middle…

asked Jun 20 '19 at 14:01

Nic Cottrell

303
1
2
10

9

votes

1 answer

How does SQL Server Analysis Services compare to R?

This may be too broad a question and invite heavy opinions, but I am finding it hard to find information about running various algorithms using SQL Server Analysis Services Data Mining projects versus using R. This is mainly because all the data…

asked Mar 27 '15 at 08:41

Fastidious

213
2
7

9

votes

2 answers

How to determine input shape in keras?

I am having difficulty finding where my error is while building deep learning models, but I typically have issues when setting the input layer input shape. This is my model: model = Sequential([ Dense(32, activation='relu', input_shape=(1461,…

asked Jun 12 '19 at 03:21

Josh Zwiebel

193
1
1
6

9

votes

3 answers

Why do RNNs usually have fewer hidden layers than CNNs?

CNNs can have hundreds of hidden layers and since they are often used with image data, having many layers captures more complexity. However, as far as I have seen, RNNs usually have few layers e.g. 2-4. For example, for electrocardiogram (ECG)…

asked Jun 09 '19 at 02:18

KRL

221
1
4

9

votes

3 answers

How to detect cardboard boxes using a neural network

I'm trying to train a neural network to detect cardboard boxes along with multiple classes of persons (people). Although it easily detects persons and classifies them correctly, it's incredibly hard to detect cardboard boxes. The boxes look…

asked May 28 '19 at 11:47

Martin Brisiak

151
1
7

9

votes

3 answers

Score matrix string similarity

I have a load of documents, which have a load of key value pairs in them. The key might not be unique so there might be multiple keys of the same type with different values. I want to compare the similarity of the keys between 2 documents. More…

asked Jun 22 '14 at 21:45

David

95
5

9

votes

2 answers

What do the percentile values from pandas describe() tell us about our data?

Let's say this is my dataframe: x=[0.09, 0.95, 0.93, 0.93, 0.34, 0.29, 0.14, 0.23, 0.91, 0.31, 0.62, 0.29, 0.71, 0.26, 0.79, 0.3 , 0.1 , 0.73, 0.63, 0.61] x=pd.DataFrame(x) When we call x.describe() on this dataframe, we get this result: >>>…

asked May 25 '19 at 16:48

Eka

301
1
3
10
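
To make the percentile question above concrete, here is a small sketch that reuses the values from the excerpt and checks what share of the data falls at or below each reported percentile. It uses a `Series` instead of the `DataFrame` from the question to keep the indexing short; the variable names are illustrative.

```python
import pandas as pd

# Values copied from the question excerpt.
values = [0.09, 0.95, 0.93, 0.93, 0.34, 0.29, 0.14, 0.23, 0.91, 0.31,
          0.62, 0.29, 0.71, 0.26, 0.79, 0.3, 0.1, 0.73, 0.63, 0.61]
s = pd.Series(values)

summary = s.describe()  # count, mean, std, min, 25%, 50%, 75%, max
print(summary)

# Share of observations at or below each percentile value.
shares = {label: (s <= summary[label]).mean() for label in ("25%", "50%", "75%")}
print(shares)
```

With 20 values, the 25%, 50%, and 75% rows are interpolated between neighbouring sorted observations, so they need not be values that occur in the data.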

9

votes

2 answers

How to Use SHAP Kernel Explainer with Pipeline models?

I have a pandas DataFrame X. I would like to find the prediction explanation of a particular model. My model is given below: pipeline = Pipeline(steps= [ ('imputer', imputer_function()), ('classifier', RandomForestClassifier() …

asked May 23 '19 at 14:57

Nayana Madhu

436
1
3
8

9

votes

1 answer

How to arrange the dataset/images for CNN+LSTM

I am working on an image classification problem (for example, Class A and Class B) using transfer learning with ResNet50 as the base model (in Keras). There is a time factor involved in this classification. For example, I need sufficient evidence to make…

asked May 13 '19 at 02:44

deepguy

1,441
8
18
39

9

votes

2 answers

Does feature selection matter for decision tree algorithms?

Background: Currently I'm working on my thesis project, which is to build Tree-based ensemble methods for classification on a large data set. Before I started with modeling, I've spent a large amount of time on feature selection using…

asked May 08 '19 at 13:17

Ping

91
1
1
4
