Most Popular
1500 questions
10
votes
2 answers
Using Cross Validation technique for a CNN model
I am working on a CNN model. As always, I used batches with epochs to train my model. When it completed training and validation, finally I used a test set to measure the model performance and generate confusion matrix. Now I want to use Cross…
Hunar
- 1,147
- 2
- 11
- 33
10
votes
1 answer
Why Gaussian latent variable (noise) for GAN?
When I was reading about GAN, the thing I don't understand is why people often choose the input to a GAN (z) to be samples from a Gaussian? - and then are there also potential problems associated with this?
asahi kibou
- 143
- 1
- 5
10
votes
2 answers
How can I detect blocks of text from scanned document images
ORIGINAL IMAGE:
GOAL:
I want to separate texts into individual paragraphs by placing bounding boxes over them (as shown above).
I tried it do this via traditional computer vision approach using opencv.
I plotted character level bounding…
DGS
- 291
- 1
- 3
- 7
10
votes
4 answers
Gas consumption outliers detection - Neural network project. Bad results
I tried to detect outliers in the energy gas consumption of some dutch buildings, building a neural network model. I have very bad results, but I can't find the reason.
I am not an expert so I would like to ask you what I can improve and what I'm…
marcodena
- 1,667
- 4
- 14
- 17
10
votes
5 answers
Products classification by name
I am a beginner with machine learning, and I'm trying to build a model to classify products by category according to the words present in the product name.
My goal is to predict the category of some new product, just by observing the categories for…
elias
- 153
- 1
- 7
10
votes
2 answers
Cutting numbers into fixed buckets
I am trying to put numeric data into fixed number of buckets using Python/R.
I have data in key:value format {1 : 12.3, 2 : 4.7, 3 : 7.4, 4 : 15.9, ......, 50 : 24.1}, which is device_id:data_usages I need to bucket based on value into nine buckets…
rp346
- 221
- 1
- 2
- 7
10
votes
4 answers
Online machine learning tutorial
Does anyone know some good tutorials on online machine learning technics?
I.e. how it can be used in real-time environments, what are key differences compared to normal machine learning methods etc.
UPD: Thank you everyone for answers, by "online" I…
Igor Bobriakov
- 1,071
- 2
- 9
- 11
10
votes
3 answers
LightGBM - Why Exclusive Feature Bundling (EFB)?
I'm currently studying GBDT and started reading LightGBM's research paper.
In section 4. they explain the Exclusive Feature Bundling algorithm, which aims at reducing the number of features by regrouping mutually exclusive features into bundles,…
Tom
- 103
- 1
- 5
10
votes
1 answer
How to calculate Cumulative Sum with Groupby in Python?
I am trying to calculate cumulative sum with groupby using Pandas's DataFrame. However, I don't get expected output.
My Source Code:
import pandas as pd
Employee = [['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
['CSE', 'CSE', 'EEE', 'EEE', 'CE',…
sakamoto sato
- 103
- 1
- 1
- 5
10
votes
4 answers
Which is first ? Tuning the parameters or selecting the model
I've been reading about how we split our data into 3 parts; generally, we use the validation set to help us tune the parameters and the test set to have an unbiased estimate on how well does our model perform and thus we can compare models based on…
Ahmed Gharbi
- 103
- 1
- 6
10
votes
1 answer
How to find the count of consecutive same string values in a pandas dataframe?
Assume that we have the following pandas dataframe:
df = pd.DataFrame({'col1':['A>G','C>T','C>T','G>T','C>T', 'A>G','A>G','A>G'],'col2':['TCT','ACA','TCA','TCA','GCT', 'ACT','CTG','ATG'],…
burcak
- 203
- 1
- 2
- 4
10
votes
5 answers
DeprecationWarning: The 'categorical_features' keyword is deprecated in version 0.20
I was watching Machine Learning A- Z from SuperDataScience but when I was doing below code sample:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:, :-1].values
from…
Navid
- 201
- 1
- 3
- 5
10
votes
2 answers
How do i calculate prediction probability of a class in Java Weka Api?
I am developing a prediction model using Java Weka api. I can predict class for the new instance using the following code:
double predictClass = classifer.classifyInstance(instance)
However, I need class probability instead of class value. Thanks…
Howa Begum
- 348
- 1
- 6
10
votes
4 answers
Log loss vs accuracy for deciding between different learning rates?
While model tuning using cross validation and grid search I was plotting the graph of different learning rate against log loss and accuracy separately.
Log loss
When I used log loss as score in grid search to identify the best learning rate out of…
CodeMaster GoGo
- 778
- 1
- 6
- 15
10
votes
2 answers
Mapping column values of one DataFrame to another DataFrame using a key with different header names
I have two data frames df1 and df2 which look something like this.
cat1 cat2 cat3
0 10 25 12
1 11 22 14
2 12 30 15
all_cats cat_codes
0 10 A
1 11 B
2 12 C
3 25 …
Danny
- 1,148
- 1
- 8
- 16