Suppose you decide ahead of time that you will reject
the null hypothesis that a coin is fair if ten tosses
contain a run of Heads or Tails of length $7$ or longer.
What is the significance level of that test?
Put another way, what is the probability that a
fair coin will have a run of $7$ or more among ten tosses?
In R, the procedure rle (for Run Length Encoding)
provides a way to approximate this probability
by simulation.
Consider one experiment with ten tosses:
set.seed(2022)
x = rbinom(10, 1, .5)
x
[1] 1 1 0 1 0 1 0 0 0 1
rle(x)
Run Length Encoding
lengths: int [1:7] 2 1 1 1 1 3 1
values : int [1:7] 1 0 1 0 1 0 1
We see that there are seven runs, the longest
of which has length $3.$
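The same summaries can be pulled out of the rle object directly. A minimal sketch: the vector below is typed in by hand from the output above (so it does not depend on the seed), and the name r is only for illustration.

```r
# The ten tosses shown above, entered by hand
x = c(1, 1, 0, 1, 0, 1, 0, 0, 0, 1)
r = rle(x)
length(r$lengths)                # number of runs: 7
max(r$lengths)                   # length of the longest run: 3
r$values[which.max(r$lengths)]   # face of the (first) longest run: 0
```

The expression max(r$lengths) is the quantity that matters for the run-length criterion.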
Now we look at $100\,000$ ten-toss experiments
to get an idea of the distribution of the length $W$ of the longest
run in ten independent tosses of a fair coin.
set.seed(320)
w = replicate(10^5,
max(rle(rbinom(10,1,.5))$lengths))
table(w)
w
1 2 3 4 5 6 7 8 9 10
191 17251 36152 24492 12338 5590 2426 998 360 202
mean(w >= 7)
[1] 0.03986
Thus it seems that the probability of getting a run of length seven or longer is about $0.04,$ or 4%. So the run-length criterion gives a test with significance level of about 4%, which is below 5%: if we observe such a run, we
reject $H_0$ at the 5% level of significance.
There are theoretical results about run lengths
derived for use in such runs tests. You can google
"runs test" for discussions of distributions of run lengths.
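With only ten tosses, the simulated value can also be checked by brute force, because under $H_0$ each of the $2^{10} = 1024$ possible outcomes is equally likely. The following is a minimal sketch, not a general runs-test procedure; the names outcomes, w.exact, and p.exact are mine.

```r
# All 2^10 = 1024 outcomes of ten tosses (0 = Tails, 1 = Heads), one per row
outcomes = unname(as.matrix(expand.grid(rep(list(0:1), 10))))
# Longest run (of either face) in each outcome
w.exact = apply(outcomes, 1, function(x) max(rle(x)$lengths))
sum(w.exact >= 7)              # number of outcomes with a run of 7 or more: 40
p.exact = mean(w.exact >= 7)   # exact P(W >= 7) for a fair coin
p.exact                        # 40/1024 = 0.0390625
```

So the exact significance level of the run-length test is $40/1024 \approx 0.039,$ in good agreement with a simulated value near $0.04.$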
The histogram below shows the approximate distribution of $W.$ The area of the bars to the
right of the vertical orange line is about $0.04.$
cutp = (0:10)+.5
hist(w, prob=TRUE, breaks=cutp, col="skyblue2")
abline(v = 6.5, col="orange")

Note on R code for simulation: The numeric vector w
contains maximum run lengths in 100,000 ten-toss
experiments. The logical vector w >= 7 has
100,000 TRUEs and FALSEs, and mean(w >= 7) gives
the proportion of TRUEs.
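How far might a simulated proportion like this be from the true probability? A rough sketch, assuming the usual normal approximation for a binomial proportion; p.hat and m are copied from the simulation above, and se and moe are names I introduce here.

```r
p.hat = 0.03986    # simulated proportion of experiments with W >= 7
m = 10^5           # number of simulated ten-toss experiments
se = sqrt(p.hat * (1 - p.hat) / m)   # approximate standard error of p.hat
moe = 1.96 * se                      # approximate 95% margin of simulation error
c(lower = p.hat - moe, upper = p.hat + moe)
```

The margin is roughly $0.0012,$ so with this many iterations the simulation is accurate to about two decimal places, which is enough to see that the probability is below $0.05.$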
1s as long as seven does seem unlikely. Have you studied distributions of runs? Also, at first glance it may seem unlikely to get seven or more heads in ten tosses of a fair coin, but the actual probability of that is about $0.17 > 0.05 = 5\%.$ [In R, where pbinom is a binomial CDF, the code 1 - pbinom(6, 10, .5) returns $0.171875.$] You are correct to observe that each of the $2^{10}$ possible outcomes from ten tosses is equally likely. In testing a null hypothesis, it is best to agree--before taking data--about the criterion for rejection. – BruceET Mar 20 '22 at 17:22
01 subsequence is three times more likely than having four such subsequences, allowing one to arrive at the opposite conclusion. – whuber Mar 20 '22 at 18:25