Most Popular
1500 questions
39
votes
9 answers
Is overfitting "better" than underfitting?
I've understood the main concepts behind overfitting and underfitting, even though some reasons as to why they occur might not be as clear to me.
But what I am wondering is: isn't overfitting "better" than underfitting?
If we compare how well the…
LeLuc
- 651
39
votes
6 answers
Under which assumptions a regression can be interpreted causally?
First, don't panic. Yes, there are many similar questions on this site. But I believe none gives a conclusive answer to the question below. Please bear with me.
Consider a data generation process $\text{D}_X(x_1, ... , x_n|\theta)$, where…
luchonacho
- 2,742
39
votes
2 answers
A fair die is rolled 1,000 times. What is the probability of rolling the same number 5 times in a row?
A fair die is rolled 1,000 times. What is the probability of rolling the same number 5 times in a row? How do you solve this type of question for a variable number of throws and number of repeats?
AutisticRat
- 637
39
votes
3 answers
How to calculate goodness of fit in glm (R)
I have the following result from running the glm function.
How can I interpret the following values:
Null deviance
Residual deviance
AIC
Do they have something to do with the goodness of fit? Can I calculate some goodness of fit measure from these…
learner
- 915
39
votes
2 answers
Understanding p-value
I know that there is a lot of material explaining the p-value. However, the concept is not easy to grasp firmly without further clarification.
Here is the definition of p-value from Wikipedia:
The p-value is the probability of obtaining a test…
JDL
- 501
39
votes
3 answers
Why is the Dirichlet distribution the prior for the multinomial distribution?
In the LDA topic model algorithm, I saw this assumption, but I don't know why the Dirichlet distribution was chosen. Could we instead pair a uniform distribution with the multinomial?
ColinBinWang
- 555
39
votes
8 answers
Graphical data overview (summary) function in R
I'm sure I've come across a function like this in an R package before, but after extensive Googling I can't seem to find it anywhere. The function I'm thinking of produced a graphical summary for a variable given to it, producing output with some…
robintw
- 2,117
39
votes
1 answer
How do you deal with "nested" variables in a regression model?
Consider a statistical problem where you have a response variable that you want to describe conditional on an explanatory variable and a nested variable, where the nested variable only arises as a meaningful variable for particular values of the…
Ben
- 124,856
39
votes
6 answers
If a credible interval has a flat prior, is a 95% confidence interval equal to a 95% credible interval?
I'm very new to Bayesian statistics, and this may be a silly question. Nevertheless:
Consider a credible interval with a prior that specifies a uniform distribution. For example, from 0 to 1, where 0 to 1 represents the full range of possible values…
pomodoro
- 793
39
votes
6 answers
What is the connection between credible regions and Bayesian hypothesis tests?
In frequentist statistics, there is a close connection between confidence intervals and tests. Using inference about $\mu$ in the $\rm N(\mu,\sigma^2)$ distribution as an example, the $1-\alpha$ confidence interval
$$\bar{x}\pm…
MånsT
- 11,979
39
votes
5 answers
Clustering a dataset with both discrete and continuous variables
I have a dataset X which has 10 dimensions, 4 of which are discrete values.
In fact, those 4 discrete variables are ordinal, i.e. a higher value implies a higher/better semantic.
2 of these discrete variables are categorical in the sense that for…
ptikobj
- 601
39
votes
10 answers
Why are survival times assumed to be exponentially distributed?
I am learning survival analysis from this post on UCLA IDRE and got tripped up at section 1.2.1. The tutorial says:
... if the survival times were known to be exponentially distributed, then the probability of observing a survival time ...
Why…
Haitao Du
- 36,852
39
votes
3 answers
How to do logistic regression in R when outcome is fractional (a ratio of two counts)?
I'm reviewing a paper which has the following biological experiment. A device is used to expose cells to varying amounts of fluid shear stress. As greater shear stress is applied to the cells, more of them start to detach from the substrate. At each…
thecity2
- 1,955
39
votes
4 answers
How should Feature Selection and Hyperparameter optimization be ordered in the machine learning pipeline?
My objective is to classify sensor signals.
The concept of my solution so far is:
i) Engineering features from raw signal
ii) Selecting relevant features with ReliefF and a clustering approach
iii) Apply N.N, Random Forest and SVM
However, I am…
Grunwalski
- 645
39
votes
4 answers
Is there any supervised-learning problem that (deep) neural networks obviously couldn't outperform any other methods?
I have seen that people have put a lot of effort into SVMs and kernels, and they look pretty interesting as a starting point in machine learning. But if we expect that we could almost always find a better-performing solution using a (deep) neural network, what is…
fordicus
- 595