Suppose I have a proportion estimate, defined as the fraction of $N$ trials in which some event occurs. I call this the sample proportion $\hat{p}$, and I call the corresponding true population proportion $p$. Suppose I also have something akin to a bootstrap over different inputs, so that I can generate many such pairs $(\hat{p}, p)$, and I would like to compare the two values in each pair as my evaluation criterion. My first thought is to use the difference as the metric of how far apart they are:
$$ \hat{p}-p $$
However, are there better metrics for comparing these? Would the ratio be better in certain cases?
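
To make the two candidates concrete, here is a minimal sketch of how I would compute them for a few pairs. The function name `compare_proportions` and the example values are made up purely for illustration, and returning NaN for the ratio when $p = 0$ is just one possible convention, not something I am committed to.

```python
import math

def compare_proportions(p_hat, p):
    """Return (difference, ratio) for one (sample, true) proportion pair."""
    diff = p_hat - p
    # The ratio is undefined when the true proportion is 0; NaN is an assumed convention.
    ratio = p_hat / p if p != 0 else math.nan
    return diff, ratio

# Hypothetical (p_hat, p) pairs, not real data.
pairs = [(0.52, 0.50), (0.011, 0.010), (0.90, 0.95)]
results = [compare_proportions(p_hat, p) for p_hat, p in pairs]

for (p_hat, p), (diff, ratio) in zip(pairs, results):
    print(f"p_hat={p_hat:.3f}  p={p:.3f}  diff={diff:+.4f}  ratio={ratio:.4f}")
```

In this made-up example, the pair $(0.011, 0.010)$ has a difference of only about $0.001$ but a ratio of about $1.1$, which is the kind of case where I wonder whether the ratio is the more informative metric.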