Most Popular
1500 questions
37
votes
2 answers
In survival analysis, why do we use semi-parametric models (Cox proportional hazards) instead of fully parametric models?
I've been studying the Cox Proportional Hazards model, and this question is glossed over in most texts.
Cox proposed fitting the coefficients of the Hazard function using a partial likelihood method, but why not just fit the coefficients of a…
user1956609
- 645
37
votes
4 answers
Internal vs external cross-validation and model selection
My understanding is that with cross validation and model selection we try to address two things:
P1. Estimate the expected loss on the population when training with our sample
P2. Measure and report our uncertainty of this estimation (variance,…
Amelio Vazquez-Reina
- 19,346
37
votes
4 answers
When to log transform a time series before fitting an ARIMA model
I have previously used Forecast Pro to forecast univariate time series, but am switching my workflow over to R. The forecast package for R contains a lot of useful functions, but one thing it doesn't do is any kind of data transformation before…
Zach
- 23,766
37
votes
4 answers
Why is a sample covariance matrix singular when sample size is less than number of variables?
Let's say I have a $p$-dimensional multivariate Gaussian distribution. And I take $n$ observations (each of them a $p$-vector) from this distribution and calculate the sample covariance matrix $S$. In this paper, the authors state that the sample…
user34790
- 6,757
37
votes
7 answers
Good sources for learning Markov chain Monte Carlo (MCMC)
Any suggestions for a good source to learn MCMC methods?
dram
- 111
37
votes
10 answers
The moose must flow, but how?
Background
Suppose I collected a data set of the latitude and longitude of moose tracks within an irregular polygon, and also took a compass bearing of the direction that the hooves pointed in.
Image Credit: © Galen Seilis 2022 (used with…
Galen
- 8,442
37
votes
6 answers
Backpropagation vs Genetic Algorithm for Neural Network training
I've read a few papers discussing pros and cons of each method, some arguing that GA doesn't give any improvement in finding the optimal solution while others show that it is more effective. It seems GA is generally preferred in literature (although…
sashkello
- 2,244
37
votes
3 answers
Why are Gaussian process models called non-parametric?
I am a bit confused. Why are Gaussian processes called non-parametric models?
They do assume that the functional values, or a subset of them, have a Gaussian prior with mean 0 and covariance function given as the kernel function. These kernel…
user34790
- 6,757
37
votes
3 answers
Gradient of Hinge loss
I'm trying to implement basic gradient descent and I'm testing it with a hinge loss function i.e. $l_{\text{hinge}} = \max(0,1-y\ \boldsymbol{x}\cdot\boldsymbol{w})$. However, I'm confused about the gradient of the hinge loss. I'm under the…
brcs
- 533
37
votes
2 answers
Distributions other than the normal where mean and variance are independent
I was wondering if there are any distributions besides the normal where the mean and variance are independent of each other (or in other words, where the variance is not a function of the mean).
Wolfgang
- 16,997
37
votes
5 answers
What is a good use of the 'comment' function in R?
I just discovered the comment function in R. Example:
x <- matrix(1:12, 3,4)
comment(x) <- c("This is my very important data from experiment #0234",
"Jun 5, 1998")
x
comment(x)
This is the first time I came across this function and was…
Tal Galili
- 21,541
37
votes
3 answers
The relationship between the gamma distribution and the normal distribution
I recently found it necessary to derive a pdf for the square of a normal random variable with mean 0. For whatever reason, I chose not to normalise the variance beforehand. If I did this correctly then this pdf is as follows:
$$
N^2(x; \sigma^2) =…
timxyz
- 473
37
votes
2 answers
Interpretation of simple predictions to odds ratios in logistic regression
I'm somewhat new to using logistic regression, and a bit confused by a discrepancy between my interpretations of the following values which I thought would be the same:
exponentiated beta values
predicted probability of the outcome using beta…
mike
- 867
37
votes
5 answers
Timing functions in R
I would like to measure the time that it takes to repeat the running of a function. Are replicate() and using for-loops equivalent? For example:
system.time(replicate(1000, f()));
system.time(for(i in 1:1000){f()});
Which is the preferred…
Tim
- 19,445
37
votes
3 answers
Why are Beta/Dirichlet regressions not considered Generalized Linear Models?
The premise is this quote from the vignette of the R package betareg:
Furthermore, the model shares some properties (such as linear
predictor, link function, dispersion parameter) with generalized
linear models (GLMs; McCullagh and Nelder 1989), but…
Firebug
- 19,076