> At a purely formal level, one could call probability theory the study of measure spaces with total measure one, but that would be like calling number theory the study of strings of digits which terminate.
>
> -- Terry Tao, *Topics in Random Matrix Theory*
I think this is the really fundamental thing. If we've got a probability space $(\Omega, \mathscr F, P)$ and a random variable $X : \Omega \to \mathbb R$ with pushforward measure $P_X := P \circ X^{-1}$, then the reason a density $f = \frac{\text d P_X}{\text d\mu}$ (with respect to a dominating measure $\mu$, e.g. Lebesgue measure for pdfs or counting measure for pmfs) integrates to one is precisely that $P(\Omega) = 1$. And that's more fundamental than the pdf vs pmf distinction.
Here's the proof:
$$
\int_{\mathbb R} f \,\text d\mu = \int_{\mathbb R} \,\text dP_X = P_X(\mathbb R) = P\left(\{\omega \in \Omega : X(\omega) \in \mathbb R\}\right) = P(\Omega) = 1.
$$
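To make the identity concrete numerically (a minimal sketch in Python; the standard normal density and the helper names `normal_pdf` and `integrate` are my own illustrative choices, not anything from the argument above), a Riemann sum of a density over a wide interval comes out to one:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    # density f = dP_X/dmu for a Normal(mu, sigma^2) pushforward measure,
    # with mu here being Lebesgue measure on the real line
    return math.exp(-((x - mu) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, n=200_000):
    # midpoint Riemann sum approximating the integral of f over [a, b]
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# the mass outside [-10, 10] is negligible for the standard normal
total = integrate(normal_pdf, -10.0, 10.0)
print(round(total, 6))  # → 1.0
```

Of course the computation only *checks* the identity; the reason it holds is the chain of equalities above, ultimately $P(\Omega) = 1$.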
This is almost a rephrasing of AdamO's answer (+1) because all CDFs are càdlàg, and there's a one-to-one relationship between the set of CDFs on $\mathbb R$ and the set of all probability measures on $(\mathbb R, \mathbb B)$, but since the CDF of a RV is defined in terms of its distribution, I view probability spaces as the place to "start" with this kind of endeavor.
I'm updating to elaborate on the correspondence between CDFs and probability measures and how both are reasonable answers for this question.
We first start with two probability measures and analyze the corresponding CDFs; we then go the other way, starting with a CDF and looking at the measure it induces.
Let $Q$ and $R$ be probability measures on $(\mathbb R, \mathbb B)$ and let $F_Q$ and $F_R$ be their respective CDFs (i.e. $F_Q(a) = Q\left((-\infty, a]\right)$ and similarly for $R$). Both $Q$ and $R$ could arise as pushforward measures of random variables (i.e. as distributions), but for this argument it doesn't matter where they came from.
The key idea is this: if $Q$ and $R$ agree on a rich enough collection of sets, then they agree on the $\sigma$-algebra generated by those sets. Intuitively, if we've got a well-behaved collection of events that, through a countable number of complements, intersections, and unions forms all of $\mathbb B$, then agreeing on all of those sets leaves no wiggle room for disagreeing on any Borel set.
Let's formalize that. Let $\mathscr S = \{(-\infty, a] : a \in \mathbb R\}$ and let $\mathcal L = \{A \subseteq \mathbb R : Q(A) = R(A)\}$, i.e. $\mathcal L$ is the subset of $\mathcal P(\mathbb R)$ on which $Q$ and $R$ agree (and are defined). Note that we're allowing for them to agree on non-Borel sets since $\mathcal L$ as defined isn't necessarily a subset of $\mathbb B$. Our goal is to show that $\mathbb B \subseteq \mathcal L$.
It turns out that $\sigma(\mathscr S)$ (the $\sigma$-algebra generated by $\mathscr S$) is in fact $\mathbb B$, so we hope that $\mathscr S$ is a sufficiently big collection of events that if $Q = R$ everywhere on $\mathscr S$ then they're forced to be equal on all of $\mathbb B$.
Note that $\mathscr S$ is closed under finite intersections, so $\mathscr S$ is a $\pi$-system. Meanwhile $\mathcal L$ contains $\mathbb R$ (both measures assign it measure one) and is closed under complements and countable disjoint unions (this follows from countable additivity together with $Q(\mathbb R) = R(\mathbb R) = 1$), so $\mathcal L$ is a $\lambda$-system. If $F_Q = F_R$ then $\mathscr S \subseteq \mathcal L$, so by the $\pi$-$\lambda$ theorem we have $\sigma(\mathscr S) = \mathbb B \subseteq \mathcal L$. The elements of $\mathscr S$ are nowhere near as complex as an arbitrary Borel set, but because any Borel set can be formed from countably many complements, unions, and intersections of elements of $\mathscr S$, agreement of $Q$ and $R$ on every element of $\mathscr S$ carries through to agreement on every $B \in \mathbb B$.
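The practical upshot can be sketched in code: once the half-line probabilities $F(a) = Q\left((-\infty, a]\right)$ are fixed, the probabilities of intervals and complements (and, iterating, of any Borel set) are pinned down. A small Python illustration, using an Exponential(1) CDF as an assumed example (the helper names `exp_cdf`, `prob_interval`, and `prob_complement` are mine):

```python
import math

def exp_cdf(a, rate=1.0):
    # F(a) = Q((-inf, a]) for an Exponential(rate) measure; an assumed example
    return 0.0 if a < 0 else 1.0 - math.exp(-rate * a)

def prob_interval(F, a, b):
    # Q((a, b]) = Q((-inf, b]) - Q((-inf, a]): determined by half-lines alone
    return F(b) - F(a)

def prob_complement(F, a):
    # Q((a, inf)) = 1 - Q((-inf, a]): complements stay determined as well
    return 1.0 - F(a)

print(prob_interval(exp_cdf, 1.0, 2.0))   # e^{-1} - e^{-2}
print(prob_complement(exp_cdf, 1.0))      # e^{-1}
```

Two measures that agree on every $(-\infty, a]$ would return identical values from both helpers, which is the informal content of the $\pi$-$\lambda$ argument.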
We have just shown that if $F_Q = F_R$ then $Q = R$ (on $\mathbb B$), which means that the map $Q \mapsto F_Q$ from $\mathscr P := \{P : P \text { is a probability measure on } (\mathbb R, \mathbb B)\}$ to $\mathcal F := \{F : \mathbb R \to \mathbb R : F \text { is a CDF}\}$ is an injection.
Now if we want to think about going the other direction, we want to start with a CDF $F$ and show that there is a unique probability measure $Q$ such that $F(a) = Q\left((-\infty, a]\right)$. This will establish that our mapping $Q \mapsto F_Q$ is in fact a bijection. For this direction, we define $F$ without any reference to probability or measures.
We first define a Stieltjes measure function as a function $G : \mathbb R \to \mathbb R$ such that
- $G$ is non-decreasing
- $G$ is right-continuous
(note that being càdlàg follows from this definition, but because of the extra non-decreasing constraint "most" càdlàg functions are not Stieltjes measure functions).
It can be shown that each Stieltjes measure function $G$ induces a unique measure $\mu$ on $(\mathbb R, \mathbb B)$ defined by
$$
\mu\left((a, b]\right) = G(b) - G(a)
$$
(see e.g. Durrett's Probability: Theory and Examples for details on this). For example, Lebesgue measure is induced by $G(x) = x$.
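On half-open intervals the induced measure is almost a one-liner. Here is a minimal sketch, where `stieltjes_measure` is a hypothetical helper name of mine; with $G(x) = x$ it recovers Lebesgue measure (interval length):

```python
def stieltjes_measure(G):
    # measure of a half-open interval (a, b] induced by the Stieltjes
    # measure function G, i.e. mu((a, b]) = G(b) - G(a)
    def mu(a, b):
        return G(b) - G(a)
    return mu

# G(x) = x induces Lebesgue measure: the length of (2, 5] is 3
lebesgue = stieltjes_measure(lambda x: x)
print(lebesgue(2.0, 5.0))  # → 3.0
```

Extending this from half-open intervals to all of $\mathbb B$ (and proving uniqueness) is exactly where Carathéodory-style machinery comes in, which is the part deferred to the reference above.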
Now noting that a CDF is a Stieltjes function $F$ with the additional properties that $\lim_{x\to-\infty} F(x) := F(-\infty) = 0$ and $\lim_{x\to\infty} F(x) := F(\infty) = 1$, we can apply that result to show that for every CDF $F$ we get a unique measure $Q$ on $(\mathbb R, \mathbb B)$ defined by
$$
Q\left((a, b]\right) = F(b) - F(a).
$$
Note how $Q\left((-\infty, a]\right) = F(a) - F(-\infty) = F(a)$ and $Q\left(\mathbb R\right) = F(\infty) - F(-\infty) = 1$, so $Q$ is a probability measure and is exactly the one we would have used to define $F$ if we were going the other direction.
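As a numeric check of these two identities on a concrete CDF (the logistic CDF is an assumed example, with $\pm 50$ standing in for $\mp\infty$; the names `logistic_cdf` and `Q` are mine):

```python
import math

def logistic_cdf(x):
    # a concrete CDF: the standard logistic, F(x) = 1 / (1 + e^{-x})
    return 1.0 / (1.0 + math.exp(-x))

def Q(F, a, b):
    # the measure induced by the CDF F: Q((a, b]) = F(b) - F(a)
    return F(b) - F(a)

# Q((-inf, a]) ~ F(a): here F(0) = 0.5
print(round(Q(logistic_cdf, -50.0, 0.0), 6))   # → 0.5
# Q(R) ~ F(inf) - F(-inf) = 1, so Q is a probability measure
print(round(Q(logistic_cdf, -50.0, 50.0), 6))  # → 1.0
```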
Altogether we have now seen that the mapping $Q \mapsto F_Q$ is one-to-one and onto, so we really do have a bijection between $\mathscr P$ and $\mathcal F$. Bringing this back to the actual question: we could equivalently hold up either CDFs or probability measures as the object we declare probability to be the study of (while recognizing that this is a somewhat facetious endeavor). I personally still prefer probability spaces, because I feel the theory flows more naturally in that direction, but CDFs are not "wrong".