I commonly calculate precision and recall for information extracted from text, but I'm not sure how to calculate the margin of error for those precision and recall values.
For example, say I have a sample of 1,000 given names drawn from an unknown amount of data, and my goal is to identify the gender of each name. My system does its magic, and I determine that I am assigning gender with 90% precision and 90% recall.
How do I calculate the margin of error for the precision and recall given that the values I've calculated are over a sample of data?
I've found a formula for the Maximum Margin of Error on http://www.had2know.com/business/compute-margin-of-error.html
but this doesn't seem like it would really apply in this kind of situation.
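For reference, here is a minimal sketch of what I understand that kind of formula to compute. It assumes the standard normal-approximation margin of error for a sample proportion, with the "maximum" being the worst case p = 0.5 at 95% confidence (z = 1.96). The function name and defaults are my own, and which sample size n would even be appropriate for precision or recall is exactly the part I'm unsure about.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Normal-approximation margin of error for a sample proportion.

    n: sample size (assumed to be a simple random sample)
    p: observed proportion; p = 0.5 gives the worst-case ("maximum") margin
    z: critical value; 1.96 corresponds to roughly 95% confidence
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return z * math.sqrt(p * (1.0 - p) / n)

# Worst case for a sample of 1,000 items
print(round(margin_of_error(1000), 4))
# Same sample size, but plugging in an observed proportion of 0.9
print(round(margin_of_error(1000, p=0.9), 4))
```

This treats the measured value as a simple proportion over all 1,000 sampled names, which may not be the right denominator for precision or recall.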
(Note: I also asked this question on the Linguistics Stack Exchange, under the title "Calculating margin of error for precision and recall", and was directed here.)