Remember that a continuous probability distribution can be represented analytically by its density function, whose graph is a curve in the plane.
Suppose we have two curves represented by f(x) and g(x). One way to define the distance between them is the greatest value, taken over all x, of |f(x) - g(x)|, that is, the largest vertical gap between the two ordinates at a common abscissa x.
If this value is small, then the two functions are close at every x. If it is large, we only know that they differ substantially at some point; nothing is guaranteed elsewhere. This gives one notion of "statistical distance."
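As a rough illustration of the distance described above, here is a minimal Python sketch. It approximates the greatest value of |f(x) - g(x)| by checking evenly spaced points on a finite interval, so it can miss a gap that falls between grid points or outside the interval. The names `normal_pdf` and `sup_distance`, the interval [-6, 6], and the choice of normal densities are all assumptions made for the example.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def sup_distance(f, g, lo, hi, n=2001):
    """Approximate the largest |f(x) - g(x)| over n evenly spaced points in [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max(abs(f(lo + i * step) - g(lo + i * step)) for i in range(n))

# Three example curves: a reference density, a slightly shifted one, and a far one.
f = normal_pdf
g = lambda x: normal_pdf(x, mu=0.1)
h = lambda x: normal_pdf(x, mu=2.0)

d_near = sup_distance(f, g, -6.0, 6.0)  # small: the curves nearly coincide
d_far = sup_distance(f, h, -6.0, 6.0)   # large: the curves are clearly apart

print(d_near, d_far)
```

The slightly shifted curve should give a much smaller value than the one shifted by 2, which matches the idea that a small distance means the curves are close everywhere.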
Often, we want to check whether our guess of the true probability distribution is correct. One way of doing this is to estimate the density of the sample using computational methods. Then, using the distance above, we can see how close that estimate is to our guess.
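The procedure just described might look like the following sketch. It is only an illustration under stated assumptions: the sample is simulated with a fixed seed, the density is estimated with a Gaussian kernel density estimate, and the bandwidth comes from a common normal-reference rule of thumb (1.06 * sd * n^(-1/5)). None of these choices come from the text above, and comparing the two distances is an informal check, not a formal hypothesis test.

```python
import math
import random
import statistics

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def kde(sample, bandwidth):
    """Return a Gaussian kernel density estimate built from the sample."""
    n = len(sample)
    def density(x):
        # Average of normal bumps centred on each observation.
        return sum(normal_pdf(x, mu=s, sigma=bandwidth) for s in sample) / n
    return density

def sup_distance(f, g, lo, hi, n=241):
    """Approximate the largest |f(x) - g(x)| over n evenly spaced points in [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max(abs(f(lo + i * step) - g(lo + i * step)) for i in range(n))

# Simulated data (assumption): 500 draws from a standard normal distribution.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(500)]

# Rule-of-thumb bandwidth (assumption, not prescribed by the text).
bandwidth = 1.06 * statistics.stdev(sample) * len(sample) ** (-0.2)
estimated = kde(sample, bandwidth)

# Two candidate guesses for the true density.
good_guess = lambda x: normal_pdf(x, mu=0.0, sigma=1.0)
bad_guess = lambda x: normal_pdf(x, mu=2.0, sigma=1.0)

d_good = sup_distance(estimated, good_guess, -6.0, 6.0)
d_bad = sup_distance(estimated, bad_guess, -6.0, 6.0)

print(d_good, d_bad)
```

The guess that matches the distribution the data were drawn from should sit much closer to the estimated density than the shifted guess does. How small the distance must be before a guess counts as correct depends on the sample size and the estimator, so the threshold is a judgment call in this sketch.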
This is just one reason I can think of.