Most Popular
1500 questions
44
votes
2 answers
How to derive the standard error of linear regression coefficient
For this univariate linear regression model
$$y_i = \beta_0 + \beta_1x_i+\epsilon_i$$
given data set $D=\{(x_1,y_1),...,(x_n,y_n)\}$, the coefficient estimates are
$$\hat\beta_1=\frac{\sum_ix_iy_i-n\bar x\bar y}{\sum_ix_i^2-n\bar x^2}$$…
avocado
- 3,581
- 6
- 35
- 49
44
votes
5 answers
What is the significance of logistic regression coefficients?
I am currently reading a paper concerning voting location and voting preference in the 2000 and 2004 election. In it, there is a chart which displays logistic regression coefficients. From courses years back and a little reading up, I understand…
amccormack
- 543
44
votes
4 answers
How can SVM 'find' an infinite feature space where linear separation is always possible?
What is the intuition behind the fact that an SVM with a Gaussian Kernel has infinite dimensional feature space?
user36162
- 581
44
votes
9 answers
Why use vector error correction model?
I am confused about the Vector Error Correction Model (VECM).
Technical background:
VECM offers a way to apply a Vector Autoregressive Model (VAR) to integrated multivariate time series. The textbooks name some problems in applying a…
DatamineR
- 1,627
- 4
- 19
- 26
44
votes
2 answers
How to interpret glmnet?
I am trying to fit a multivariate linear regression model with approximately 60 predictor variables and 30 observations, so I am using the glmnet package for regularized regression because p>n.
I have been going through documentation and other…
Alice
- 885
44
votes
1 answer
Manually calculated $R^2$ doesn't match up with randomForest() $R^2$ for testing new data
I know this is a fairly specific R question, but I may be thinking about proportion variance explained, $R^2$, incorrectly. Here goes.
I'm trying to use the R package randomForest. I have some training data and testing data. When I fit a random…
Stephen Turner
- 4,293
44
votes
9 answers
Tiny (real) datasets for giving examples in class?
When teaching an introductory level class, the teachers I know tend to invent some numbers and a story in order to exemplify the method they are teaching.
What I would prefer is to tell a real story with real numbers. However, these stories need…
Tal Galili
- 21,541
44
votes
4 answers
Polynomial regression using scikit-learn
I am trying to use scikit-learn for polynomial regression. From what I read, polynomial regression is a special case of linear regression. I was hoping that maybe one of scikit's generalized linear models can be parameterised to fit higher order…
Mihai Damian
- 543
44
votes
5 answers
How to derive the least square estimator for multiple linear regression?
In the simple linear regression case $y=\beta_0+\beta_1x$, you can derive the least square estimator $\hat\beta_1=\frac{\sum(x_i-\bar x)(y_i-\bar y)}{\sum(x_i-\bar x)^2}$ such that you don't have to know $\hat\beta_0$ to estimate…
Saber CN
- 819
44
votes
2 answers
Evidence for man-made global warming hits 'gold standard': how did they do this?
This message in a Reuters article from 25.02.2019 is currently all over the news:
Evidence for man-made global warming hits 'gold standard'
[Scientists] said confidence that human activities were raising the heat at the Earth’s surface had reached…
Sextus Empiricus
- 77,915
44
votes
2 answers
Simulation of logistic regression power analysis - designed experiments
This question is in response to an answer given by @Greg Snow in regards to a question I asked concerning power analysis with logistic regression and SAS Proc GLMPOWER.
If I am designing an experiment and will analyze the results in a factorial…
B_Miner
- 8,630
44
votes
6 answers
How to quasi match two vectors of strings (in R)?
I am not sure how this should be termed, so please correct me if you know a better term.
I've got two lists. One of 55 items (e.g: a vector of strings), the other of 92. The item names are similar but not identical.
I wish to find the best…
Tal Galili
- 21,541
44
votes
6 answers
Why do I get a 100% accuracy decision tree?
I'm getting a 100% accuracy for my decision tree. What am I doing wrong?
This is my code:
import pandas as pd
import json
import numpy as np
import sklearn
import matplotlib.pyplot as plt
data =…
Nadjla
- 451
44
votes
1 answer
Existence of the moment generating function and variance
Can a distribution with finite mean and infinite variance have a moment generating function? What about a distribution with finite mean and finite variance but infinite higher moments?
Mgf
- 441
44
votes
3 answers
Training loss increases with time
I am training a model (Recurrent Neural Network) to classify 4 types of sequences. As I run my training, I see the training loss going down until the point where I correctly classify over 90% of the samples in my training batches. However, a couple of…
dins2018
- 443