Most Popular

1500 questions
50
votes
6 answers

Percentage of overlapping regions of two normal distributions

Given two normal distributions with parameters $\sigma_1,\ \mu_1$ and $\sigma_2,\ \mu_2$, how can I calculate the percentage of overlap between the two distributions? I suppose this problem has a specific name; are you aware of any…
50
votes
17 answers

What is your favorite data visualization blog?

What is the best blog on data visualization? I'm making this question a community wiki since it is highly subjective. Please limit each answer to one link. Please note the following criteria for proposed answers: [A]cceptable answers to…
Shane
  • 12,461
49
votes
15 answers

Expected ratio of girls to boys at birth

I came across a question in a job interview aptitude test for critical thinking. It goes something like this: The Zorganian Republic has some very strange customs. Couples only wish to have female children as only females can inherit the…
49
votes
8 answers

Would a Bayesian admit that there is one fixed parameter value?

In Bayesian data analysis, parameters are treated as random variables. This stems from the Bayesian subjective conceptualization of probability. But do Bayesians theoretically acknowledge that there is one true fixed parameter value out in the 'real…
ATJ
  • 1,861
49
votes
5 answers

Statistical test to tell whether two samples are pulled from the same population?

Let's say I have two samples. If I want to tell whether they are pulled from different populations, I can run a t-test. But let's say I want to test whether the samples are from the same population. How does one do this? That is, how do I calculate…
user1566200
  • 1,047
49
votes
5 answers

AIC guidelines in model selection

I typically use BIC as my understanding is that it values parsimony more strongly than does AIC. However, I have decided to use a more comprehensive approach now and would like to use AIC as well. I know that Raftery (1995) presented nice guidelines…
49
votes
2 answers

What is the adjusted R-squared formula in lm in R and how should it be interpreted?

What is the exact formula used in R's lm() for the adjusted R-squared? How can I interpret it? Adjusted R-squared formulas: there seem to be several formulas for calculating adjusted R-squared. Wherry’s formula:…
user1272262
49
votes
13 answers

Can machine learning decode the SHA256 hashes?

I have a 64-character SHA256 hash. I'm hoping to train a model that can predict whether the plaintext used to generate the hash begins with a 1 or not. Regardless of whether this is "possible", what algorithm would be the best approach? My initial…
John
  • 531
49
votes
4 answers

Does correlation = 0.2 mean that there is an association "in only 1 in 5 people"?

In The Idiot Brain: A Neuroscientist Explains What Your Head is Really Up To, Dean Burnett wrote: The correlation between height and intelligence is usually cited as being about $0.2$, meaning height and intelligence seem to be associated in only…
Sitak
  • 593
49
votes
5 answers

Entropy of an image

What is the most correct way, in information-theoretic or physical terms, to compute the entropy of an image? I don't care about computational efficiency right now; I want it to be as theoretically correct as possible. Let's start with a gray-scale image. One…
49
votes
12 answers

Can simple linear regression be done without using plots and linear algebra?

I'm completely blind and come from a programming background. What I'm trying to do is to learn machine learning, and to do this, I first need to learn about linear regression. All the explanations on the Internet I am finding about this subject plot…
49
votes
6 answers

Is there an explanation for why there are so many natural phenomena that follow normal distribution?

I think this is a fascinating topic and I do not fully understand it. What law of physics makes it so that so many natural phenomena have a normal distribution? It would seem more intuitive for them to have a uniform distribution. It is so hard for me…
yoyo_fun
  • 659
49
votes
1 answer

Is regression with L1 regularization the same as Lasso, and with L2 regularization the same as ridge regression? And how to write "Lasso"?

I'm a software engineer learning machine learning, particularly through Andrew Ng's machine learning courses. While studying linear regression with regularization, I've found terms that are confusing: Regression with L1 regularization or L2…
49
votes
4 answers

How to sample from a normal distribution with known mean and variance using a conventional programming language?

I've never had a course in statistics, so I hope I'm asking in the right place here. Suppose I have only two data describing a normal distribution: the mean $\mu$ and variance $\sigma^2$. I want to use a computer to randomly sample from this…
Fixee
  • 635
49
votes
4 answers

How do you use the 'test' dataset after cross-validation?

In some lectures and tutorials I've seen, the advice is to split your data into three parts: training, validation, and test. But it is not clear how the test dataset should be used, nor how this approach is better than cross-validation over the whole…
Serhiy
  • 1,068