"A statistic t=T(X) is sufficient for underlying parameter θ precisely if the conditional probability distribution of the data X, given the statistic t = T(X), does not depend on the parameter θ."
If the sampling distribution for some data $X$ does not depend on $\theta$ then how can that data say anything about $\theta$?
It would be like estimating some value by observing something irrelevant (that does not depend on the value to be estimated).
This is a general statement. In this case the data in question is, more specifically, 'the rest of the data conditional on the sufficient statistic'. That is confusing, because the sampling distribution of the rest of the data does depend on the parameter to be estimated. It is only the conditional distribution of that data, given the sufficient statistic, that does not depend on the parameter to be estimated.
example 3 (different outcomes of data, but with the same probability for a given $\theta$)
(edit: based on the comments I came up with a much simpler and more intuitive explanation)
Say you have an urn problem where you try to estimate the fraction of blue balls in the urn. You perform an experiment by drawing balls with replacement.
Say you got "$x_1 = red$, $x_2 = blue$, $x_3 = red$, $x_4 = blue$, $x_5 = blue$"
That is 3 blue balls in total (the total is the sufficient statistic). Based on this you could make a point estimate of 3/5 = 0.6 for the fraction of blue balls in the urn. (In reality you should take a bigger sample if you want a narrow confidence interval, but that makes this example difficult to write down.)
Now, does it matter (for the fraction) which particular balls $x_i$ were blue, beyond the fact that we already know the total number is 3? Is the estimate going to be different for "$x_1 = blue$, $x_2 = red$, $x_3 = red$, $x_4 = blue$, $x_5 = blue$" or any other observation that also has 3 blue balls in total? Each of these outcomes, with a total of 3 blue balls, is equally probable, whatever the value of the fraction. So knowing which one occurred gives no further information about the fraction of blue balls in the urn.
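The claim that these outcomes are on an equal footing can be checked with a short calculation. As a sketch, write $\theta$ for the fraction of blue balls and $T$ for the total number of blue balls in the five draws (the symbol $T$ is borrowed from the quoted definition), and assume $0 < \theta < 1$. For any particular sequence $x$ with exactly three blue balls:

$$P(X = x \mid T = 3) = \frac{P(X = x,\ T = 3)}{P(T = 3)} = \frac{\theta^3 (1-\theta)^2}{\binom{5}{3}\,\theta^3 (1-\theta)^2} = \frac{1}{10}$$

The factors containing $\theta$ cancel, so the conditional probability of the particular ordering, given the total, is the same for every value of $\theta$.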
We could tabulate all the different outcomes and how the probability of observing them depends on $\theta$ (the fraction of blue balls in the urn):
observation probability of observing given theta
bbbbb (1-theta)^0(theta)^5
rbbbb (1-theta)^1(theta)^4
brbbb (1-theta)^1(theta)^4
bbrbb (1-theta)^1(theta)^4
bbbrb (1-theta)^1(theta)^4
bbbbr (1-theta)^1(theta)^4
rrbbb (1-theta)^2(theta)^3
rbrbb (1-theta)^2(theta)^3
rbbrb (1-theta)^2(theta)^3
rbbbr (1-theta)^2(theta)^3
brrbb (1-theta)^2(theta)^3
brbrb (1-theta)^2(theta)^3
brbbr (1-theta)^2(theta)^3
bbrrb (1-theta)^2(theta)^3
bbrbr (1-theta)^2(theta)^3
bbbrr (1-theta)^2(theta)^3
rrrbb (1-theta)^3(theta)^2
rrbrb (1-theta)^3(theta)^2
rbrrb (1-theta)^3(theta)^2
brrrb (1-theta)^3(theta)^2
rrbbr (1-theta)^3(theta)^2
rbrbr (1-theta)^3(theta)^2
brrbr (1-theta)^3(theta)^2
rbbrr (1-theta)^3(theta)^2
brbrr (1-theta)^3(theta)^2
bbrrr (1-theta)^3(theta)^2
brrrr (1-theta)^4(theta)^1
rbrrr (1-theta)^4(theta)^1
rrbrr (1-theta)^4(theta)^1
rrrbr (1-theta)^4(theta)^1
rrrrb (1-theta)^4(theta)^1
rrrrr (1-theta)^5(theta)^0
Notice that in the table above there are groups of potential observations/outcomes whose probability of being observed has exactly the same dependence on $\theta$. This means that it doesn't matter whether you observe rbrbb or brrbb: they relate to $\theta$ in the same way. All the observations with three blue balls can be considered to provide the same information about $\theta$.
This is, more or less, what the sufficient statistic does. It groups together the observations whose likelihood depends on $\theta$ in the same way.
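The grouping can also be checked numerically. The following is only a minimal sketch, not part of the original argument: the function names `sequence_probability` and `conditional_given_total` are made up for this illustration, and it assumes five draws with replacement and $0 < \theta < 1$. It enumerates all 32 sequences and computes the probability of each sequence conditional on its number of blue balls.

```python
from collections import defaultdict
from itertools import product
from math import comb, isclose


def sequence_probability(seq, theta):
    """Probability of one specific sequence of draws, e.g. 'rbrbb'."""
    blues = seq.count("b")
    return theta ** blues * (1 - theta) ** (len(seq) - blues)


def conditional_given_total(n, theta):
    """P(sequence | number of blue balls) for every sequence of n draws.

    Assumes 0 < theta < 1, so that every total has nonzero probability.
    """
    sequences = ["".join(s) for s in product("br", repeat=n)]
    # Probability of each total: sum over all sequences with that total.
    total_prob = defaultdict(float)
    for seq in sequences:
        total_prob[seq.count("b")] += sequence_probability(seq, theta)
    return {
        seq: sequence_probability(seq, theta) / total_prob[seq.count("b")]
        for seq in sequences
    }


for theta in (0.2, 0.6, 0.9):
    cond = conditional_given_total(5, theta)
    # Within a group, every sequence should get 1 / (number of sequences in the group).
    same_for_all = all(
        isclose(p, 1 / comb(5, seq.count("b"))) for seq, p in cond.items()
    )
    print(theta, cond["rbrbb"], cond["brrbb"], same_for_all)
```

For each value of `theta` the two printed conditional probabilities should agree with each other and with $1/\binom{5}{3} = 1/10$, up to floating-point rounding. Changing `theta` changes how likely each group is, but not the probabilities inside a group.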
I have deleted examples 1 and 2 because they made the post very large, but you can still see them in the history of this post.