Most Popular

1500 questions
22
votes
2 answers

Loading own train data and labels in dataloader using pytorch?

I have x_data and labels separately. How can I combine and load them in the model using torch.utils.data.DataLoader? I have a dataset that I created and the training data has 20k samples and the labels are also separate. Let's say I want to load a…
Amarnath
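
One common way to pair separately stored features and labels is to wrap both in a `TensorDataset` and hand that to a `DataLoader`. The sketch below assumes the data fits in memory as tensors; the feature width of 10, the binary labels, and the batch size of 64 are placeholders, not values from the question.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data: 20k samples as in the question; 10 features and
# binary labels are assumptions made only for this illustration.
x_data = torch.randn(20_000, 10)
labels = torch.randint(0, 2, (20_000,))

# TensorDataset indexes both tensors together, so sample i is (x_data[i], labels[i]).
dataset = TensorDataset(x_data, labels)

# DataLoader batches and shuffles the paired samples.
loader = DataLoader(dataset, batch_size=64, shuffle=True)

# Each iteration yields one batch of features and the matching labels.
xb, yb = next(iter(loader))
```

If the data does not fit in memory, a custom `torch.utils.data.Dataset` subclass with `__len__` and `__getitem__` would likely be the better fit.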
22
votes
3 answers

When to use Standard Scaler and when Normalizer?

I understand what Standard Scaler does and what Normalizer does, per the scikit-learn documentation: Normalizer, Standard Scaler. I know when Standard Scaler is applied. But in which scenario is Normalizer applied? Are there scenarios where one is…
Heisenbug
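
For orientation, the arithmetic behind the two transformers can be sketched in plain NumPy. This assumes the default settings of scikit-learn's `StandardScaler` (per-column mean removal and scaling by the population standard deviation) and `Normalizer` (per-row L2 norm); the toy matrix is made up.

```python
import numpy as np

# Toy data: 3 samples (rows), 2 features (columns) on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Column-wise standardization: each feature gets zero mean and unit variance.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Row-wise normalization: each sample is rescaled to unit L2 length.
X_norm = X / np.linalg.norm(X, axis=1, keepdims=True)
```

The first operates on features across samples, the second on each sample independently, which is why the two are not interchangeable.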
22
votes
1 answer

What's the difference between Sklearn F1 score 'micro' and 'weighted' for a multi class classification problem?

I have a multi-class classification problem with class imbalance. I searched for the best metric to evaluate my model. Scikit-learn has multiple ways of calculating the F1 score. I would like to understand the differences. What do you recommend…
Fractale
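
As a rough illustration of the two averaging modes, here is a pure-Python sketch of the definitions: 'micro' pools true positives, false positives and false negatives over all classes before computing F1, while 'weighted' computes F1 per class and averages by each class's support. The function name and the toy labels are hypothetical, and this is not scikit-learn's own implementation.

```python
from collections import Counter

def f1_micro_and_weighted(y_true, y_pred):
    """Return (micro F1, support-weighted F1) for single-label multi-class data."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: one F1 from the pooled counts.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))

    # Weighted: per-class F1, averaged by the number of true samples per class.
    support = Counter(y_true)
    weighted = sum(support[c] * f1(tp[c], fp[c], fn[c]) for c in labels) / len(y_true)
    return micro, weighted

# Imbalanced toy example: class 0 dominates.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 2, 0, 0]
micro, weighted = f1_micro_and_weighted(y_true, y_pred)
```

In the single-label case the micro average coincides with plain accuracy, whereas the weighted average still reflects how well each individual class is handled.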
22
votes
1 answer

Understanding Timestamps and Batchsize of Keras LSTM considering Hiddenstates and TBPTT

What I am trying to do is predict the next data point $x_t$ for each point in the time series $[x_0, x_1, x_2, \dots, x_T]$ in the context of a real-time data stream; in theory the series is infinite. If a new value $x$ is…
KenMarsu
22
votes
2 answers

How to adjust the hyperparameters of MLP classifier to get more perfect performance

I am just getting started with the multi-layer perceptron (MLP), and I got this accuracy when classifying the DEAP data with an MLP. However, I have no idea how to adjust the hyperparameters to improve the result. Here are the details of my code and…
Irving.ren
22
votes
3 answers

How does multicollinearity affect neural networks?

Multicollinearity is a problem for linear regression because the results become unstable / depend too much on single elements (source). (Also, the inverse of $X^TX$ doesn't exist so the standard OLS estimator does not exist ... I have no idea how,…
Martin Thoma
22
votes
2 answers

Sliding window leads to overfitting in LSTM?

Will I overfit my LSTM if I train it via the sliding-window approach? Why do people not seem to use it for LSTMs? For a simplified example, assume that we have to predict the sequence of characters: A B C D E F G H I J K L M N O P Q R S T U V W X Y…
Kari
22
votes
1 answer

Decision trees: leaf-wise (best-first) and level-wise tree traverse

Issue 1: I am confused by the description of LightGBM regarding the way the tree is expanded. They state: "Most decision tree learning algorithms grow tree by level (depth)-wise, like the following image:" Question 1: Which "most" algorithms…
kkk
22
votes
1 answer

What are the pros and cons of Keras and TFLearn?

What are the pros and cons of Keras and TFLearn? When is one library preferred over the other?
Ankit Bindal
22
votes
2 answers

Data science without knowledge of a specific topic, is it worth pursuing as a career?

I had a conversation with someone recently and mentioned my interest in data analysis and how I intended to learn the necessary skills and tools. They suggested to me that, while it is great to learn the tools and build the skills, there is little…
user3754366
22
votes
1 answer

XGBRegressor vs. xgboost.train huge speed difference?

If I train my model using the following code: import xgboost as xg params = {'max_depth':3, 'min_child_weight':10, 'learning_rate':0.3, 'subsample':0.5, 'colsample_bytree':0.6, 'obj':'reg:linear', 'n_estimators':1000, 'eta':0.3} features =…
user1566200
22
votes
7 answers

Good "frequent sequence mining" packages in Python?

Has anyone used (and liked) any good "frequent sequence mining" packages in Python other than the FPM in MLlib? I am looking for a stable package, preferably one that is still maintained. Thank you!
Edamame
22
votes
2 answers

Convert a pandas column of int to timestamp datatype

I have a dataframe that, among other things, contains a column holding the number of milliseconds elapsed since 1970-1-1. I need to convert this column of ints to timestamp data, so I can then ultimately convert it to a column of datetime data by adding…
Austin Capobianco
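
A minimal sketch of the conversion, assuming the integers are milliseconds since the Unix epoch in UTC. The column names `ms_since_epoch` and `timestamp` and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical frame with one column of epoch milliseconds.
df = pd.DataFrame({"ms_since_epoch": [0, 1_500_000_000_000, 1_600_000_000_123]})

# unit="ms" tells pandas to interpret the integers as milliseconds since 1970-01-01.
df["timestamp"] = pd.to_datetime(df["ms_since_epoch"], unit="ms")
```

The result is a timezone-naive datetime column; passing `utc=True` to `pd.to_datetime` would make it timezone-aware instead.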
22
votes
4 answers

Data science / machine learning books for mathematicians

I have found other requests for references here. In particular in: Where to start, which books and Books about the "Science" in Data Science? I have glanced at: Artificial Intelligence: A Modern Approach (Russell & Norvig) Machine Learning:…
Contactomorph
22
votes
1 answer

Difference of Activation Functions in Neural Networks in general

I have studied the activation function types for neural networks. The functions themselves are quite straightforward, but the application difference is not entirely clear. It's reasonable that one differentiates between logical and linear type…
Hendrik