Most Popular

1500 questions
9
votes
2 answers

Feature reduction for an uncorrelated data set

I am working on a classification problem with a training data set that has 100 features. No pair of features shows a visible correlation, as can be seen in the example pair plot for some of the features: I am trying to find the right way to…
Ruben Kazumov
  • 211
  • 1
  • 4
9
votes
3 answers

Is There a Way to Re-Calibrate Predicted Probabilities After Using Class Weights?

I have classification data with far more negative instances than positive instances. I have used class weights in my models and have achieved the discrimination I want but the predicted probabilities from the models do not match the actual…
9
votes
1 answer

How is the cross-product transformation defined for binary features?

I am reading the paper on Wide & Deep learning and for the wide component, it states that one of the most important transformations is the cross-product transformation. This is defined as follows: $$\phi_{k}(\mathbf{x})=\prod_{i=1}^{d} x_{i}^{c_{k…
9
votes
3 answers

In an elbow curve, how do I find the point where the curve starts to rise?

I am computing a distance metric on my data. The results are then sorted in ascending order. Samples with a distance greater than a specific threshold are to be marked as outliers and discarded. Below is a plot of all distance…
Faiz Kidwai
  • 235
  • 1
  • 2
  • 12
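The procedure described in this excerpt (sort the distances, pick a threshold, discard everything above it) can be sketched as follows. This is only one possible heuristic, not the method from the question or its answers: it picks the sorted point that lies farthest below the straight line joining the first and last points, after scaling both axes to [0, 1]. The function name `elbow_index` and the sample distances are hypothetical.

```python
def elbow_index(sorted_values):
    """Return the index of the point farthest below the chord that joins
    the first and last points of an ascending curve (axes scaled to [0, 1])."""
    n = len(sorted_values)
    if n < 3:
        raise ValueError("need at least three values to locate an elbow")
    lo, hi = sorted_values[0], sorted_values[-1]
    if hi == lo:
        raise ValueError("all values are equal; the curve has no elbow")

    best_index, best_gap = 0, float("-inf")
    for i, value in enumerate(sorted_values):
        x = i / (n - 1)                # position along the sorted samples
        y = (value - lo) / (hi - lo)   # scaled distance value
        gap = x - y                    # how far the curve sits below the chord
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index


# Hypothetical distances: mostly small values plus a few large ones.
distances = sorted([1.2, 5.0, 1.0, 1.4, 20.0, 1.1, 1.5, 10.0, 1.3])
knee = elbow_index(distances)
threshold = distances[knee]
outliers = [d for d in distances if d > threshold]
print(threshold, outliers)
```

Scaling both axes first keeps the result independent of the units of the distance metric. On noisy curves the chosen point can shift, so the threshold is worth checking against the plot.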
9
votes
2 answers

Is Faster RCNN the same thing as VGG-16, RESNET-50, etc... or not?

My understanding is that Faster RCNN is an architecture for performing object detection. It finds objects in an image and classifies them. My understanding is also that VGG-16, RESNET-50, etc... also find objects in images and classify them. Are…
b19wh33l5
  • 91
  • 1
  • 2
9
votes
2 answers

What does an Input layer of shape=(None,) or (None,12) actually mean?

Is this telling the model that there are two dimensions (i.e. it’s a matrix) but we don’t yet know the size of that particular dimension? If so, how can the model be compiled? Doesn’t the size of each dimension affect the number of nodes in middle…
Nic Cottrell
  • 303
  • 1
  • 2
  • 10
9
votes
1 answer

How does SQL Server Analysis Services compare to R?

This may be too broad of a question with heavy opinions, but I really am finding it hard to seek information about running various algorithms using SQL Server Analysis Service Data Mining projects versus using R. This is mainly because all the data…
Fastidious
  • 213
  • 2
  • 7
9
votes
2 answers

How to determine input shape in Keras?

I am having difficulty finding where my error is while building deep learning models, but I typically have issues when setting the input layer input shape. This is my model: model = Sequential([ Dense(32, activation='relu', input_shape=(1461,…
Josh Zwiebel
  • 193
  • 1
  • 1
  • 6
9
votes
3 answers

Why do RNNs usually have fewer hidden layers than CNNs?

CNNs can have hundreds of hidden layers and since they are often used with image data, having many layers captures more complexity. However, as far as I have seen, RNNs usually have few layers e.g. 2-4. For example, for electrocardiogram (ECG)…
KRL
  • 221
  • 1
  • 4
9
votes
3 answers

How to detect cardboard boxes using a neural network

I'm trying to train a neural network to detect cardboard boxes along with multiple classes of persons (people). Although it's easy to detect persons and classify them correctly, it's incredibly hard to detect cardboard boxes. The boxes look…
Martin Brisiak
  • 151
  • 1
  • 7
9
votes
3 answers

Score matrix string similarity

I have a load of documents, which have a load of key value pairs in them. The key might not be unique so there might be multiple keys of the same type with different values. I want to compare the similarity of the keys between 2 documents. More…
David
  • 95
  • 5
9
votes
2 answers

What do the percentile values from pandas describe() tell us about our data?

Let's say this is my dataframe: x=[0.09, 0.95, 0.93, 0.93, 0.34, 0.29, 0.14, 0.23, 0.91, 0.31, 0.62, 0.29, 0.71, 0.26, 0.79, 0.3, 0.1, 0.73, 0.63, 0.61] x=pd.DataFrame(x) When we call x.describe() on this dataframe, we get the following result: >>>…
Eka
  • 301
  • 1
  • 3
  • 10
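To make the percentile rows concrete, here is a minimal sketch that recomputes the same kind of summary for the list in this excerpt using only the Python standard library, so it runs without pandas. It assumes the defaults of `describe()`: the 25th, 50th and 75th percentiles with linear interpolation, and the sample standard deviation. The `summary` dictionary is a hypothetical stand-in for the real output table.

```python
import statistics

x = [0.09, 0.95, 0.93, 0.93, 0.34, 0.29, 0.14, 0.23, 0.91, 0.31,
     0.62, 0.29, 0.71, 0.26, 0.79, 0.3, 0.1, 0.73, 0.63, 0.61]

# method="inclusive" interpolates linearly between the sorted values,
# treating the smallest value as the 0th and the largest as the 100th percentile.
q1, median, q3 = statistics.quantiles(x, n=4, method="inclusive")

summary = {
    "count": len(x),
    "mean": statistics.fmean(x),
    "std": statistics.stdev(x),  # sample standard deviation (n - 1 divisor)
    "min": min(x),
    "25%": q1,      # a quarter of the values lie at or below this
    "50%": median,  # half of the values lie at or below this
    "75%": q3,      # three quarters of the values lie at or below this
    "max": max(x),
}

for name, value in summary.items():
    print(f"{name:>5}  {value:.6f}")
```

For this list the three quartiles come out at 0.2825, 0.475 and 0.745, so the middle half of the values falls between roughly 0.28 and 0.75.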
9
votes
2 answers

How to Use SHAP Kernel Explainer with Pipeline models?

I have a pandas DataFrame X. I would like to explain the predictions of a particular model. My model is given below: pipeline = Pipeline(steps= [ ('imputer', imputer_function()), ('classifier', RandomForestClassifier() …
Nayana Madhu
  • 436
  • 1
  • 3
  • 8
9
votes
1 answer

How to arrange the dataset/images for CNN+LSTM

I am working on an image classification problem (for example, Class A and Class B) using transfer learning with ResNet50 as the base model (in Keras). There is a time factor involved in this classification. For example, I need sufficient evidence to make…
deepguy
  • 1,441
  • 8
  • 18
  • 39
9
votes
2 answers

Does feature selection matter for decision tree algorithms?

Background: I'm currently working on my thesis project, which is to build tree-based ensemble methods for classification on a large data set. Before I started modeling, I spent a large amount of time on feature selection using…
Ping
  • 91
  • 1
  • 1
  • 4