Most Popular
1500 questions
13
votes
5 answers
Generate PDF from Jupyter notebook without code
I have a Jupyter notebook that contains markdown, code, and outputs (graphs). I would like to generate a PDF from this notebook.
I tried to hide the code using HTML code that I got from here, then tried to download the notebook as a PDF, but the code still shows up. But…
GIRISH kuniyal
13
votes
8 answers
If A and B are correlated and A and C are correlated, why is it possible for B and C to be uncorrelated?
Let's say:
A and B are correlated
A and C are correlated
B and C are uncorrelated
How is it possible for B and C to be uncorrelated when they are both correlated with A?
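One way to see that this situation can occur is a small simulation. The sketch below is only an illustration under an assumed construction that is not part of the question: B and C are drawn independently, and A is defined as their sum. The seed, sample size, and the helper `pearson` are arbitrary choices made for the example.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
n = 20_000              # sample size, also an arbitrary choice

# Assumption: B and C are independent draws, so they are uncorrelated by construction.
b = [rng.gauss(0.0, 1.0) for _ in range(n)]
c = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Assumption: A is the sum of B and C, so it shares variance with each of them.
a = [bi + ci for bi, ci in zip(b, c)]

corr_ab = pearson(a, b)
corr_ac = pearson(a, c)
corr_bc = pearson(b, c)

print(f"corr(A, B) = {corr_ab:.3f}")
print(f"corr(A, C) = {corr_ac:.3f}")
print(f"corr(B, C) = {corr_bc:.3f}")
```

Under this construction the theoretical correlation of A with each of B and C is 1/sqrt(2), roughly 0.71, while the correlation between B and C is zero, so the sample values should land close to those numbers.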
Ashley
13
votes
2 answers
Is FPGrowth still considered "state of the art" in frequent pattern mining?
As far as I know, the development of algorithms to solve the Frequent Pattern Mining (FPM) problem has had a few main checkpoints. First, the Apriori algorithm was proposed in 1993 by Agrawal et al., along with the…
Rubens
13
votes
3 answers
What do Python's pandas/matplotlib/seaborn bring to the table that Tableau does not?
I spent the past year learning Python. As a person who thought coding was impossible to learn for those outside of the CS/IT sphere, I was obviously gobsmacked by the power of a few lines of Python code!
Having arrived at an intermediate level…
Uralan
13
votes
2 answers
Efficient algorithm to compute the ROC curve for a classifier consisting of an ensemble of disjoint classifiers
Suppose I have classifiers C_1 ... C_n that are disjoint in the sense that no two will return true on the same input (e.g. the nodes in a decision tree). I want to build a new classifier that is the union of some subset of these (e.g. I want to…
Josh Brown Kramer
13
votes
4 answers
How can I provide an answer to Neural Network skeptics?
After giving several talks on NNs, I always have a skeptic who wants a real measure of how good the model is. How do you know the model is truly accurate?
I explain the use of test data etc. to evaluate the total error; however, there is always…
Shinobii
13
votes
1 answer
Feature selection using feature importances in random forests with scikit-learn
I have plotted the feature importances in random forests with scikit-learn. In order to improve the prediction using random forests, how can I use the plot information to remove features? I.e. how to spot whether a feature is useless or even worse…
Franck Dernoncourt
13
votes
5 answers
In industry, what type of new data science algorithms does one develop?
I've seen several job descriptions for data science which include developing a novel algorithm to be a part of production environments.
Can you give some input on what exactly could be meant here? Would they mean an algorithm that behaves somewhat…
Mariah
13
votes
2 answers
Activation function between LSTM layers
I'm aware that the LSTM cell uses both sigmoid and tanh activation functions internally; however, when creating a stacked LSTM architecture, does it make sense to pass their outputs through an activation function (e.g. ReLU)?
So do we prefer this:
model =…
lsfischer
13
votes
8 answers
I am a programmer, how do I get into the field of Data Science?
First of all, this term sounds so obscure.
Anyway, I am a software programmer. One of the languages I can code in is Python. Speaking of data, I can use SQL and can do data scraping. What I figured out so far after reading so many articles that Data…
Volatil3
13
votes
3 answers
Measuring performance of different classifiers with different sample sizes
I'm currently using several different classifiers on various entities extracted from text, and using precision/recall as a summary of how well each separate classifier performs across a given dataset.
I'm wondering if there's a meaningful way of…
Dave Challis
13
votes
3 answers
Are ontologies and the Semantic Web dead?
Is the Semantic Web dead? Are ontologies dead?
I am developing a work plan for my thesis about "A knowledge base through a set ontology for interest groups around wetlands". I have been researching and developing ontologies for it but I am still…
Antonio Edgar Martinez
13
votes
4 answers
Pandas: change the value of a column based on a condition on another column
I have values in column1 and values in column2.
What I want to achieve: where column2 == 2, leave it as 2 if column1 < 30, else change it to 3 if column1 > 90.
Here is what I did so far; the problem is that 2 does not change to 3 where…
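For reference, a conditional update of this kind is often written with a boolean mask and `DataFrame.loc`. The following is only a minimal sketch, not the asker's code: the sample data is made up, and it assumes the intended rule is "set column2 to 3 where column2 == 2 and column1 > 90, and leave every other row unchanged".

```python
import pandas as pd

# Made-up sample data for illustration; only the column names come from the question.
df = pd.DataFrame(
    {
        "column1": [10, 95, 50, 120, 20],
        "column2": [2, 2, 2, 1, 3],
    }
)

# Assumed rule: rows where column2 is 2 AND column1 is above 90 get the value 3.
# Each comparison needs its own parentheses because & binds tighter than == and >.
mask = (df["column2"] == 2) & (df["column1"] > 90)

# Assign through .loc on the original frame to avoid chained-indexing problems.
df.loc[mask, "column2"] = 3

print(df)
```

Rows where column2 == 2 and column1 < 30 already hold the value 2, so under this reading they need no explicit handling.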
Koko
13
votes
6 answers
Datasets understanding best practices
I am a CS master's student in data mining. My supervisor once told me that before I run any classifier or do anything with a dataset, I must fully understand the data and make sure that the data is clean and correct.
My questions:
What are the best…
Jack Twain
13
votes
1 answer
How to know if a model is overfitting or underfitting by looking at a graph
I recently got my hands on TensorBoard. Can you tell me what features I should look for in the graph (accuracy and validation accuracy)?
And please do enlighten me about the concept of underfitting as well.
Nikhil.Nixel