How do we know if something is generated from a random process?

Question

Suppose a machine outputs a digit between 1 - 9 over and over. You try to figure out if it's random. The most natural definition of random seems to be that it exhibits no pattern. You simply look at the data, check whether there's some pattern inherent to it, and determine if it's random. But it exhibiting some pattern by itself can't be it.

Clearly, if someone predicts that the next 5 digits will be 5, 4, 2, 1, and 9, and it ends up coming out as such, you now have some suspicion that the pattern may not be random, even though the sequence itself is patternless.

Even without a prediction, if the machine spit out your birthday, you would be rightly surprised. It might spit out 19980910 and you now have suspicion that some agent is at play even though the sequence 19980910 itself does not exhibit a pattern.

So what constitutes as evidence of a non random process? A pattern OR a sequence being resulted from a prediction OR a sequence being meaningful? The last of these seems debatable of course since random processes can create meaningful sequences. So if the sequence is meaningful then, how meaningful does it have to be before it is considered non random?

By definition, if a machine outputs the numbers then the numbers are not random. Machines are deterministic. Regarding the claim that a sequence that looks random is random, you might consider the decimal digits of pi. They are perfectly deterministic. There's an algorithm that, given a positive integer n, will output the n-th digit after a finite number of operations. The digits are not random. But they look random. There's random as in randomly generated, and random as in looking statistically random. Two different things. — user4894, Jan 05 '23 at 23:18
By running randomness tests on the output. Formally, identifying an algorithmically random sequence requires knowing the entire infinite sequence, but in practice sufficiently long finite sequences that pass randomness tests are good enough. Roughly, they cannot be compressed into a generating code that is substantially shorter than the sequence itself, "meaningfulness" is irrelevant. — Conifold, Jan 05 '23 at 23:35
@Conifold You have conflated the two kinds of randomness. Any sufficiently long initial segment of the digits of pi displays statistical randomness; but there are many very short close-form expressions that generate the entire infinite sequence of digits. In other words the digit stream is statistically random yet highly compressible. — user4894, Jan 05 '23 at 23:59
I think what you mean to ask is whether one can determine if a sequence is produced by a random process simply by examining the sequence. I would have said "yes" until I saw @user4894's comment about pi . I think this question might be better asked on the math stackexchange. — David Gudeman, Jan 06 '23 at 03:03
If you had any complete theory to fix the landscape of randomness with its intrinsic complexity then it's probably incomputable. There's a famous paradox of statistics where for two twenty bits of binary strings one is 01001101...010 and the other is 0000...000 (all 0), apparently the latter won't pass any Diehard (random) test while it shares exactly the same probability as the former from a same uniform Head-Tail stationary stochastic process generator. And you can further demand such a data generator to follow an arbitrary random distribution, thus you jump to another level of randomness... — Double Knot, Jan 06 '23 at 05:11
@user4894 Nope, the digits of pi do not pass all statistical randomness tests, although they do pass some simple ones that are commonly performed, exactly because there is a compressing algorithm. "Since its inception, Martin-Löf randomness has been shown to admit many equivalent characterizations — in terms of compression, randomness tests, and gambling — that bear little outward resemblance to the original definition." — Conifold, Jan 06 '23 at 07:57
@Conifold Name a statistical randomness test the digits of pi don't pass. On the contrary, pi is widely believe though not proven to be normal; meaning that its digits ARE statistically random. Of course if pi is not normal, you'd be right. I don't suppose you have a proof for that assertion. No? Then you made a claim you can't support. You're wrong. — user4894, Jan 06 '23 at 18:17
@Conifold Source and context of your quote would be helpful. Also, it does not say what you claim. It doesn't claim that no computable real can be statistically random. That's the claim you made, to which I'd appreciate a specific reference. — user4894, Jan 06 '23 at 18:47
Related question on Cross Validated: How do you know something isn't random? — Sandejo, Jan 06 '23 at 23:20
@user4894 Normality only gives you uniform natural density (which is not sigma additive), not passing of all randomness tests. The latter is equivalent to non-compression by Martin-Löf's theorem on constructive null covers. The quote and the theorem can be found under the second link in the original comment. The test digits of pi do not pass is mentioned there as well, Martin-Löf's RANDc test, which is universal. There is even a link to his 1966 paper with proofs. — Conifold, Jan 07 '23 at 08:55
@Conifold I did not see any claim in either article that pi fails any of the specifically named tests. I would be grateful for a specific reference that there exists a test for randomness that detects the algorithmic nature of pi, or for that matter any other random-looking computable number. Are you claiming that no computable bitstring can pass all known tests for randomness? And doesn't this claim depend on what's known at any given time? I read long ago that a pseudo-random sequence "passes all known statistical tests of randomness." Clearly this is no longer true, but how false is it? — user4894, Jan 12 '23 at 03:30
@Conifold Also see https://www.purdue.edu/uns/html4ever/2005/050426.Fischbach.pi.html, http://www.yaroslavvb.com/papers/marsaglia-on.pdf, https://stats.stackexchange.com/questions/44445/are-the-digits-of-pi-statistically-randomm, https://www.jstor.org/stable/2685604, and https://www2.lbl.gov/Science-Articles/Archive/pi-random.html, among many. The fact that so many researchers are even asking the question shows there's no known answer. To the best of my understanding I do not believe you've made your case. I don't see why a computable sequence can't be statistically random. — user4894, Jan 12 '23 at 04:06

score 2 · Answer 1 · answered Jan 06 '23 at 00:23

For any given finite output, there an infinite number of deterministic processes that could have generated that output. As additional output is observed, infinitely many possibilities will be ruled out, but infinitely many possibilities will remain. No matter what output is observed, it is impossible to rule out an underlying deterministic algorithm.

score 0 · Answer 2 · answered Jan 07 '23 at 03:48

Nothing random can be generated from a process.

There cannot be any randomness if there is anything generated. Generation has to be by some method, some way, some mechanism. The very existence of method / way / mechanism go against the notion of randomness.

The second blow is the word process itself. You have said that in your own question, process. Process has to be defined (how else will you implement it). To define it you have to choose some way and reject all other ways. Your path therefore is set so where you reach is also set. You cannot generate random stuff if you have a process.

How do we know if something is generated from a random process?

2 Answers2