I keep reading that people are estimating the size of the Google+ population using statistical methods. Here is one description of such a model:
> My model is simple. I start with US Census Bureau data about surname popularity in the U.S., and compare it to the number of Google+ users with each surname. I split the U.S. users from the non-U.S. users. By using a sample of 100-200 surnames, I am able to accurately estimate the total percentage of the U.S. population that has signed up for Google+. Then I use that number and a calculated ratio of U.S. to non-U.S. users to generate my worldwide estimates. My ratio is 1 US user for every 2.12 non-U.S. users. That ratio was calculated on July 4th through a laborious effort, and I haven't updated it since. That is definitely a weakness in my model that I hope to address soon. The ratio will likely change over time.
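If I'm reading the model right, each surname in the sample yields its own estimate of the U.S. penetration rate, and those estimates are averaged and scaled up. Here is a minimal sketch of that calculation; all the counts below and the names `census_counts` / `gplus_counts` are hypothetical stand-ins, not the author's actual data:

```python
# A minimal sketch of the surname-based model quoted above.
# The counts are hypothetical: real inputs would be the US Census Bureau
# surname tables and a count of Google+ profiles bearing each surname.

US_POPULATION = 312_000_000          # rough 2011 US population
NON_US_RATIO = 2.12                  # non-US users per US user, per the quote

# Hypothetical data: US residents with each surname (census),
# and observed US Google+ users with that surname.
census_counts = {"Smith": 2_376_206, "Johnson": 1_857_160, "Nguyen": 310_125}
gplus_counts  = {"Smith": 7_600,     "Johnson": 5_900,     "Nguyen": 1_100}

# Each surname gives an independent estimate of the penetration rate:
# (G+ users with surname) / (US residents with surname).
rates = [gplus_counts[s] / census_counts[s] for s in census_counts]
penetration = sum(rates) / len(rates)        # average over the surname sample

us_users = penetration * US_POPULATION
worldwide = us_users * (1 + NON_US_RATIO)    # add non-US users via the ratio

print(f"estimated US penetration: {penetration:.4%}")
print(f"estimated US users:       {us_users:,.0f}")
print(f"estimated worldwide:      {worldwide:,.0f}")
```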
How is this possible? I don't see how a fixed sample of 100-200 surnames tells you what percentage of the U.S. population is participating. Let's take two cases:
- case 1: there are 10,000 Google+ users
- case 2: there are 1,000,000 Google+ users
Why would a sample of surname counts look statistically different in these two cases?
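To make the two cases concrete, here is a toy simulation of the per-surname counts one would observe in each case. The surname frequencies are rough stand-ins for census figures, and it assumes each Google+ user is an independent draw from the U.S. population:

```python
import numpy as np

# Toy simulation of the two cases above (all numbers hypothetical).
# If each Google+ user is an independent draw from the US population,
# a surname held by 0.88% of the population should appear on roughly
# 0.88% of the profiles, however many profiles exist in total.

rng = np.random.default_rng(0)
surname_freq = {"Smith": 0.0088, "Johnson": 0.0069}  # share of US population

for total_users in (10_000, 1_000_000):              # case 1 and case 2
    counts = {s: int(rng.binomial(total_users, f))
              for s, f in surname_freq.items()}
    print(f"{total_users:>9,} users -> per-surname counts: {counts}")
```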