3

Under uncertainty, a precise probability cannot be assigned; see my other question: How valid is assignment of probabilities when evidence is totally lacking, as in Pascal's Wager? In that case, either no probability can be assigned, or a range of probabilities is assigned (in the case of complete uncertainty, [0, 1]).

How does this change when there is some evidence, but not full evidence? For example, many cosmologists posit the existence of an infinite universe under the assumption that the universe is flat. This does not directly follow, yet many physicists say that it is "likely."

In general, how can evidence slightly favoring one hypothesis over another be used in support of that hypothesis, if the prior probability of the hypothesis is not known? Indeed, if the prior ranges over [0, 1] and any value in that interval can rationally be chosen, we could choose 1. By Bayes' Theorem, that choice makes evidence useless for either supporting or refuting the hypothesis.

Question: In what sense can we use evidence to change our probabilities of a given proposition if the prior probability of that proposition cannot be known (or range over an interval)?
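To make the worry concrete, here is a small illustrative sketch (Python; the 0.9/0.1 likelihoods are made-up numbers, not taken from any real problem). It shows that a prior of exactly 0 or 1 cannot be moved by any evidence under Bayes' theorem, while an intermediate prior can be:

```python
# Sketch of Bayes' theorem with extreme priors. The likelihood values
# (0.9 for H, 0.1 for not-H) are arbitrary illustrative choices.

def posterior(prior, p_e_given_h, p_e_given_not_h):
    """P(H|E) via Bayes' theorem, expanding P(E) by total probability."""
    evidence = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / evidence

# Evidence favouring H moves a 0.5 prior substantially...
print(posterior(0.5, 0.9, 0.1))  # 0.9
# ...but cannot move a dogmatic prior of exactly 1 or 0 at all:
print(posterior(1.0, 0.9, 0.1))  # 1.0
print(posterior(0.0, 0.9, 0.1))  # 0.0
```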

Josh
  • 355
  • 1
  • 6
  • Given that probabilities are in the range [0,1], we can apply some math to them, we can often compute maximum likelihoods. In particular, enough repetition can make the original distribution and probabilities highly unlikely to matter. We really can measure the maximum likelihood they do matter due to the Law of Large numbers. The whole mechanism used by most scientists is modern Normal Theory statistical techniques, which rely upon the normality of measures on large enough data sets to give answers that are less than a fixed probability of lying outside a given range. –  Aug 22 '19 at 19:56
  • @jobermark First, my question isn't really aimed at cases where evidence is abundant, but more where evidence is few and far between. Second, I'm not sure that it really matters how much evidence you have if the probability is either 0 or 1 (Bayes' theorem). But yes, I think the probability does converge in (0, 1), but not [0, 1]. Perhaps (0, 1) should be used in scientific questions? – Josh Aug 22 '19 at 20:15
  • It is not that prior probabilities can not be known, it is that there is nothing to know. In many cases, Bayesian priors are an artifice assigned more or less based on technical convenience (Wikipedia lists some schemes). It is how they are updated that really matters. – Conifold Aug 22 '19 at 20:44
  • @Conifold So the ability to incorporate evidence, and how much pull that evidence has, is dependent on the prior you choose? Is there, perhaps, another way to evaluate evidence besides having to update your prior? Besides, of course, frequentist probability. – Josh Aug 22 '19 at 22:20
  • Sure, Bayesian methodology has a lot of critics, starting with Popper. But applying probabilities to assessing the quality of evidence is a stretch to begin with, so it is not surprising that it leads to more stretches, and results of questionable meaningfulness. I think tracking the relative change of Bayesian probabilities has value, but the absolute numbers are often meaningless (and can be manipulated by shifting the prior). – Conifold Aug 22 '19 at 22:34
  • 1
    @Josh But then the answer is obviously 'It can't that is why we verify theories with more than one test.' People don't judge theories on little bits of data, they judge it on subjective criteria, or adequate data. –  Aug 22 '19 at 22:53
  • @Conifold What about probabilities that are generated "internally"? For example, a math problem where you have a best guess, but are not completely sure that that is, in fact, the correct answer. Can we meaningfully assign a probability here? It seems not, because there would be no evidence to persuade you that your answer is wrong, and evidence does exist to persuade you that it is correct. Then again, you could be wrong (perhaps previous experience can tell you you might be wrong?). If you can assign probability, how could you determine a specific, non-arbitrary value? – Josh Aug 24 '19 at 03:26
  • These are questions to those who do assign them, see SEP Subjective Probability Theory. – Conifold Aug 24 '19 at 03:32
  • @Conifold Yes, but is it rational to assign in this case? It is not a case of complete ignorance, like what I've asked about in my other question (which you gave a very good answer to :)). There is knowledge to sway you one way or the other, but it is not clear how much one should be swayed. Is it just as rational to assign a 100% chance of being correct as a 50%, as the evidence is not objective (and, I suppose, fairly unhelpful)? – Josh Aug 24 '19 at 03:45
  • I do not subscribe to such assignments, so can't help you there. – Conifold Aug 24 '19 at 03:46
  • @Conifold So this is a case of "uncertain subjective probabilities," as you say in your other answer? Sorry for the confusion, but I'm not entirely sure when something passes from being "uncertain" to "certain." – Josh Aug 24 '19 at 04:14
  • For the infinite universe question the problem is not the priors, it is defining a likelihood to use in Bayes rule. – Dikran Marsupial Sep 05 '23 at 15:27
  • 1
    A major issue here is that according to subjectivist Bayesianism there is no such thing as a true unknown prior probability/distribution, rather this is chosen from the point of view of the person to which the uncertainty refers. So the issue is not whether you "know" or "don't know" the prior, rather you have to construct it, optimally expressing your own uncertainty. Having done that, you can update your priors using Bayes' rule when new evidence comes in, as explained in existing answers. – Christian Hennig Sep 05 '23 at 18:20

2 Answers

3

Bayesians wouldn't assign a specific numeric probability to such an event. Instead they would adopt a prior that was a probability distribution over the probability of the proposition being true. An objectivist Bayesian would probably choose a distribution that encoded only the fact that we don't know what that probability is, and use something like a Beta(1,1) prior, which is uniform on the interval [0,1], i.e. every value of the probability that the proposition is true is equally likely.

If any evidence came in, they could use Bayes' rule to update their prior to produce a posterior, which would also be a probability distribution over the probability that the proposition is true.
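As an illustrative sketch of this updating step (Python; the evidence counts are invented for illustration), conjugacy makes the update simple: a Beta(a, b) prior combined with s successes in n Bernoulli trials yields a Beta(a + s, b + n - s) posterior:

```python
# Beta-Bernoulli updating sketch. The trial counts below are made up
# purely to illustrate the mechanics of the conjugate update.

a, b = 1, 1                # Beta(1,1): uniform on [0, 1], "we don't know p"
successes, trials = 7, 10  # hypothetical evidence

# Conjugate update: Beta(a, b) -> Beta(a + s, b + n - s)
a_post = a + successes
b_post = b + (trials - successes)

# Posterior mean of p under Beta(a_post, b_post) is a/(a+b):
posterior_mean = a_post / (a_post + b_post)
print(posterior_mean)  # 8/12 ≈ 0.667
```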

Either way, if you want to decide what course of action to take, you work out a loss function, which tells you how much you will lose or gain under each strategy, depending on whether the proposition is true or not in reality. We then work out the "expected loss" of each course of action by marginalising over our uncertainty about whether the proposition is true (in this case a sum of the losses, weighted by their posterior probabilities; I'm not going to spell it out on an SE without LaTeX). We then rationally choose the course of action with the lowest expected loss.

In the case of Pascal's wager, the losses consist of:

(i) the cost of behaving as if God exists when God does exist
(ii) the cost of behaving as if God exists when God does not exist
(iii) the cost of behaving as if God does not exist when God does exist
(iv) the cost of behaving as if God does not exist when God does not exist
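A hedged sketch of the expected-loss calculation for these four cases (Python; the loss numbers are placeholders chosen for illustration, not claims about the actual stakes):

```python
# Hypothetical loss table for cases (i)-(iv); all numbers are placeholders.
loss_believe_given_god = 0        # (i)
loss_believe_given_no_god = 1     # (ii) e.g. forgone worldly pleasures
loss_disbelieve_given_god = 1000  # (iii) stand-in for a very large loss
loss_disbelieve_given_no_god = 0  # (iv)

def expected_losses(p_god):
    """Expected loss of each strategy, marginalising over God's existence."""
    believe = (p_god * loss_believe_given_god
               + (1 - p_god) * loss_believe_given_no_god)
    disbelieve = (p_god * loss_disbelieve_given_god
                  + (1 - p_god) * loss_disbelieve_given_no_god)
    return believe, disbelieve

# Even a small p_god makes "believe" the lower-loss action,
# because loss (iii) dwarfs the others:
believe, disbelieve = expected_losses(0.01)
print(believe < disbelieve)  # True
```

This illustrates the point below: once one loss is made enormous, it dominates the decision almost regardless of the prior.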

So the Bayesian scheme doesn't tell you what to do; that depends on the losses, which are a matter of your judgement, and on your prior (which may be an objective, uninformative prior). It does, however, provide a rational means of going from a prior belief and a set of losses (and perhaps some evidence) to the course of action that is most likely to minimise your losses.

For details, the standard reference is Berger "Statistical Decision Theory and Bayesian Analysis", which is published by Springer.

Just a few quotes from the previous question that the OP mentions, first from Keynes:

About these matters there is no scientific basis on which to form any calculable probability whatever. We simply do not know.

The uniform uninformative prior encodes the knowledge that we simply do not know. It expresses no preference for any probability that God exists (in the case of Pascal's wager). Note it is only an interval in the sense that it covers the entire interval on which probabilities are defined; I suspect "interval" in the previous discussion means a proper subset of [0,1].

And from Taleb:

It eliminates the need for us to understand the probabilities of a rare event (there are fundamental limits to our knowledge of these); rather, we can focus on the payoff and benefits of an event if it takes place.

If we reason from an uninformative prior, and have little or no evidence, the optimal course of action is essentially determined by the "payoff and benefits" (or in this case losses; Bayesians are a pessimistic lot! ;o). The marginalisation over the probability that the proposition is true means that the prior doesn't affect the outcome greatly, simply because it is so vague and uninformative. So it all comes down to the costs. The real problem with Pascal's wager is that the cost of an eternity in Hell is essentially infinite, so that loss dominates the decision. The difficulty is entirely in justifying the losses.

Dikran Marsupial
  • 2,118
  • 10
  • 15
  • 4
    Choosing a supposedly uninformative prior in most real situations is hard. There is dependence on the parametrisation, and the principle of indifference can often be interpreted in various ways. In multivariate parameter spaces supposedly uninformative priors can implicitly impose far stronger information than what most people believe. See for example here https://www.tandfonline.com/doi/abs/10.1080/01621459.1996.10477003 – Christian Hennig Sep 05 '23 at 17:18
  • 1
    To add on to that, the idea of an “uninformative” prior makes no sense. If one is truly uninformative, one can’t assign a probability to it. –  Sep 05 '23 at 18:00
  • @thinkingman again asserting that doesn't make it true. "one can't assign a probability to it": you don't assign a probability to an uninformative prior; the uninformative prior assigns a probability (equal, in this case) to different probabilities of the truth of the proposition. – Dikran Marsupial Sep 05 '23 at 18:07
  • 2
    @ChristianHennig I completely agree, and if this was the statistics SE I would have mentioned it, but the discussion is currently at a far lower level than that so I thought it inappropriate here. As it happens, I wouldn't view a Jeffreys prior as completely uninformative in that it encodes the knowledge that an invariance to parameterization is required (however that can be objectively justified). There is also the point that if you use an uninformative prior in situations where you do have solid knowledge (e.g. from physics) it can give obviously wrong [given that knowledge] conclusions. – Dikran Marsupial Sep 05 '23 at 18:10
  • Note I wrote "and use something like a Beta(1,1) prior" to encode a lack of knowledge, in order to hint that there were deeper levels of truth to this issue, rather than a fully prescriptive "and use a Beta(1,1) prior". In the case of Pascal's wager the precise choice of (minimally) "uninformative" prior is unlikely to make much of a difference, as the decision of how to act depends almost solely on the costs/losses/benefits. – Dikran Marsupial Sep 05 '23 at 18:25
  • 1
    The principle of indifference leads to contradictions. “If we are truly ignorant about a set of alternatives, then we are also ignorant about combinations of alternatives and about subdivisions of alternatives. However, the principle of indifference when applied to alternatives, or their combinations, or their subdivisions, yields different probability assignments” - Fine –  Sep 05 '23 at 18:34
  • 1
    Proponents reply that it rather codifies the way in which such ignorance should be epistemically managed — for anything other than an equal assignment of probabilities would represent the possession of some knowledge. Critics counter-reply that in a state of complete ignorance, it is better to assign imprecise probabilities (perhaps ranging over the entire [0, 1] interval), or to eschew the assignment of probabilities altogether.” See https://plato.stanford.edu/entries/probability-interpret/ –  Sep 05 '23 at 18:36
  • @thinkingman what subdivisions and combinations are involved in Pascal's wager? Either Pascal's deity exists or it doesn't. – Dikran Marsupial Sep 05 '23 at 18:39
  • "it is better to assign imprecise probabilities (perhaps ranging over the entire [0, 1] interval), or to eschew the assignment of probabilities altogether.” wonder what they meant by that? ;o) – Dikran Marsupial Sep 05 '23 at 18:43
  • 1
    You have clearly not read the philosophical literature. Please read https://en.m.wikipedia.org/wiki/Bertrand_paradox_(probability). “Either Pascal's deity exists or it doesn't.” Yes, that is exactly why it is a fool’s errand to assign a probability to it. –  Sep 05 '23 at 18:43
  • 2
    " the principle of indifference may not produce definite, well-defined results for probabilities if it is applied uncritically when the domain of possibilities is infinite." is whether Pascal's deity exists or not an infinite domain? No, “Either Pascal's deity exists or it doesn't.”. Quit quote mining. – Dikran Marsupial Sep 05 '23 at 18:56
  • 1
    Almost every proposition that we assign a probability to is binary. We just map it into an infinite domain. It will either rain tomorrow or not. John was either the murderer or not. The proposed probability domain is usually in a range of [0,1]. If you don’t understand this despite being on the statistics SE for so long, I give up. Anyways, my argument is to not assign probabilities and I’ve stated my reasons for it. I’m not interested in having the same discussion with you for the 300th time after your constant refusals to understand things. –  Sep 05 '23 at 19:06
  • 2
    "We just map it into an infinite domain." why would we do that? You are just making stuff up now. – Dikran Marsupial Sep 05 '23 at 19:08
  • BTW if this gets moved to chat, please don't move Christian's comment and my reply. – Dikran Marsupial Sep 05 '23 at 19:13
0

P(H|E) = [P(H) * P(E|H)] / P(E) is Bayes' Theorem.

Hypothesis (H): All Swans are White

The probability that all swans are white = P(H) = 50%

The probability that there's evidence (there are white swans) given all swans are white = P(E|H) > 50% (say P(E|H) = 70%)

The probability that there's evidence (there are white swans) = P(E) = 50%

P(H|E) = [50% * 70%]/50% = 70%.

So, the probability that all swans are white went up from 50% to 70%, after we saw some white swans (after we found some evidence).
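The arithmetic can be checked directly (Python; the 50% and 70% figures are the illustrative choices made above, not derived values):

```python
# Direct check of the swan example's arithmetic using the answer's
# illustrative figures.
p_h = 0.5          # P(H): all swans are white
p_e_given_h = 0.7  # P(E|H): white swans observed, given H
p_e = 0.5          # P(E): white swans observed

p_h_given_e = p_h * p_e_given_h / p_e  # Bayes' Theorem
print(p_h_given_e)  # 0.7
```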

Agent Smith
  • 3,642
  • 9
  • 30