Most Popular

1500 questions
14
votes
2 answers

Reshaping of data for deep learning using Keras

I am a beginner to Keras and I have started with the MNIST example to understand how the library actually works. The code snippet of the MNIST problem in the Keras example folder is given as: import numpy as np np.random.seed(1337) # for…
enterML
  • 3,031
  • 9
  • 27
  • 38
14
votes
3 answers

Creating neural net for xor function

It is a well-known fact that a 1-layer network cannot predict the xor function, since it is not linearly separable. I attempted to create a 2-layer network, using the logistic sigmoid function and backprop, to predict xor. My network has 2 neurons…
user
  • 1,993
  • 6
  • 21
  • 38
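To illustrate the separability point in the excerpt above, here is a minimal sketch of a 2-layer network that computes xor. It is not the asker's network: the weights are hand-picked rather than learned by backprop, and a step activation stands in for the logistic sigmoid so the outputs are exact. All names (`W_hidden`, `b_hidden`, `w_out`, `b_out`) are illustrative assumptions.

```python
import numpy as np

def step(z):
    # Hard threshold; an assumption used here instead of the sigmoid
    return (z > 0).astype(int)

# The four xor inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden layer: first unit acts as OR, second unit acts as AND
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])

# Output unit fires when OR is on and AND is off
w_out = np.array([1.0, -1.0])
b_out = -0.5

hidden = step(X @ W_hidden + b_hidden)
xor_pred = step(hidden @ w_out + b_out)
print(xor_pred)  # [0 1 1 0]
```

The hidden layer maps the inputs into a space where a single line separates the two classes, which is what a 1-layer network cannot do on the raw inputs.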
14
votes
4 answers

Studying machine learning algorithms: depth of understanding vs. number of algorithms

Recently I was introduced to the field of Data Science (it's been approximately 6 months), and I started the journey with the Machine Learning course by Andrew Ng and after that started working on the Data Science Specialization by JHU. On practical application…
Vinay Tiwari
  • 151
  • 4
14
votes
1 answer

Machine learning libraries for Ruby

Are there any machine learning libraries for Ruby that are relatively complete (including a wide variety of algorithms for supervised and unsupervised learning), robustly tested, and well-documented? I love Python's scikit-learn for its incredible…
the911s
  • 321
  • 1
  • 8
14
votes
3 answers

How do I merge two data frames in Python Pandas?

I have two data frames df1 and df2 and I would like to merge them into a single data frame. It is as if df1 and df2 were created by splitting a single data frame down the center vertically, like tearing a piece of paper that contains a list in half…
sebastianspiegel
  • 891
  • 4
  • 11
  • 16
14
votes
4 answers

Do you actually need math for your data science job?

I am a physicist working in a data scientist role. I was told everywhere that my degree is a very good starting point because I know a lot of math and it is crucial for this job. But other than understanding the math behind the models' calculations…
Physicist92
  • 141
  • 3
13
votes
3 answers

What is the difference between NLP and text mining?

As discussed with Sean in this Meta post, I thought it would be nice to have a question which can help people who were confused like me to understand the differences between text mining and NLP! So, what are the differences between NLP and…
Dawny33
  • 8,296
  • 12
  • 48
  • 104
13
votes
1 answer

What is the difference between xgboost binary:logistic and reg:logistic?

What is the difference in R in xgboost between binary:logistic and reg:logistic? Is it only in evaluation metric? If yes, how does RMSE on binary classification compare to error rate? Is the relationship between the metrics more or less monotonic,…
user2530062
  • 297
  • 2
  • 8
13
votes
1 answer

Classify Customers based on 2 features AND a Time series of events

I need help on what should be my next step in an algorithm I am designing. Due to NDAs, I can't disclose much, but I'll try to be generic and understandable. Basically, after several steps in the algorithm, I have this: For each customer that I…
JusefPol
  • 131
  • 4
13
votes
3 answers

How to create US state choropleth map

I have a value associated with each US state (let's pretend it's the average temperature in January for each state). I want to display this data as a heat map of the United States. To be clear, it would be a map of the US with each state having a…
user15180
  • 131
  • 1
  • 1
  • 3
13
votes
1 answer

Do I have to standardize my new polynomial features?

I have a vector X with n previously standardized features. If I want to generate new polynomial features (let's say by adding squared features), do I need to standardize these new features again after computing them? Because knowing that my…
jmvllt
  • 619
  • 1
  • 8
  • 15
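A quick numerical check of the setup in the excerpt above, as a sketch on synthetic data (the distribution and sample size are arbitrary assumptions). It shows only that squaring a standardized feature yields a feature that is no longer zero-mean and unit-variance, and that a second standardization restores those properties.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=1000)

# Standardize the original feature (population std, ddof=0)
x_std = (x - x.mean()) / x.std()

# New polynomial feature: the square of the standardized feature
x_sq = x_std ** 2
print(x_sq.mean(), x_sq.std())  # mean is 1 by construction, std is generally not 1

# Standardizing again brings it back to zero mean and unit variance
x_sq_std = (x_sq - x_sq.mean()) / x_sq.std()
print(x_sq_std.mean(), x_sq_std.std())
```

The mean of the squared feature equals the variance of the standardized one, which is why it comes out as 1 rather than 0.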
13
votes
5 answers

Best Julia library for neural networks

I have been using this library for basic neural network construction and analysis. However, it does not have support for building multi-layered neural networks, etc. So, I would like to know of any nice libraries for doing advanced neural networks…
Dawny33
  • 8,296
  • 12
  • 48
  • 104
13
votes
3 answers

Unbalanced classes -- How to minimize false negatives?

I have a dataset that has a binary class attribute. There are 623 instances with class +1 (cancer positive) and 101,671 instances with class -1 (cancer negative). I've tried various algorithms (Naive Bayes, Random Forest, AODE, C4.5) and all of them…
user798275
  • 293
  • 2
  • 3
  • 5
13
votes
3 answers

Unsupervised feature learning for NER

I have implemented an NER system using the CRF algorithm with my handcrafted features, which gave quite good results. The thing is that I used lots of different features, including POS tags and lemmas. Now I want to make the same NER for different…
MaticDiba
  • 651
  • 1
  • 6
  • 10
13
votes
3 answers

Efficient database model for storing data indexed by n-grams

I'm working on an application which requires creating a very large database of n-grams that exist in a large text corpus. I need three efficient operation types: lookup and insertion indexed by the n-gram itself, and querying for all n-grams that…
Phonon
  • 298
  • 2
  • 6