I am a software engineer looking to build an A/B testing tool. I don't have a solid stats background but have been doing quite a bit of reading over the last few days.
I am following the methodology described here and will summarize the relevant points below.
The tool will allow designers and domain experts to configure a website to split traffic received at a specific URL between two or more URLs. For example, traffic arriving at http://example.com/hello1 could be split between http://example.com/hello1 and http://example.com/hello2. Traffic would be split evenly between target URLs and the performance of the marketing processes at each of the target URLs will be compared.
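A minimal sketch of how an even split could be implemented, assuming each visitor carries some stable identifier such as a cookie value. The function name `assign_variant` and the `visitor_id` parameter are hypothetical, chosen only for illustration. Hashing the identifier means a returning visitor always lands on the same target URL.

```python
import hashlib
from typing import Sequence


def assign_variant(visitor_id: str, targets: Sequence[str]) -> str:
    """Deterministically map a visitor to one of the target URLs.

    Assumption: visitor_id is stable across requests (e.g. a cookie value).
    """
    if not targets:
        raise ValueError("at least one target URL is required")
    # Hash the identifier so buckets are spread approximately uniformly.
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer and reduce it to a bucket index.
    bucket = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[bucket]


if __name__ == "__main__":
    urls = ["http://example.com/hello1", "http://example.com/hello2"]
    print(assign_variant("visitor-42", urls))
```

The split is only approximately even for any finite number of visitors, which is usually acceptable for this kind of test.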
In this experiment, the sample size N will correspond to the number of visitors. The test will measure "conversions", a term describing when a visitor commits to a specific action in a marketing process. Conversion rates are expressed as percentages, and a higher conversion rate is desirable. This makes the test a comparison of independent proportions. The tool needs to be easy to use while still producing reliable results, so selecting an appropriate value of N is important.
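For reference, a comparison of two independent proportions is commonly done with a pooled two-proportion z-test. The sketch below is one standard way to do it and is not necessarily what the linked article uses; the function and parameter names are hypothetical, and it relies on the normal approximation, so it assumes reasonably large visitor counts in each variant.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a, conv_b: number of conversions in each variant.
    n_a, n_b: number of visitors in each variant.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; no variation in the data")
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```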
In the linked article above, a power analysis for two independent proportions is used to find N. This method requires that one know the conversion rate of the control in advance and specify the desired improvement in conversion rate. It also specifies a significance level of 5% (that is, 95% confidence) and a statistical power of 80%.
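As a rough sketch of what such a power analysis computes, the following uses the common normal-approximation formula for two proportions with unpooled variance. This is an assumption on my part about the general technique; the linked article may use a slightly different formula or a library routine. The function name and parameters are hypothetical, the target rate is given as an absolute conversion rate, and the defaults correspond to the 95% / 80% settings mentioned above.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(p_control: float, p_target: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed in EACH variant for a two-sided test.

    p_control: assumed conversion rate of the control (e.g. 0.10).
    p_target:  conversion rate one hopes to detect (e.g. 0.12).
    """
    if p_control == p_target:
        raise ValueError("p_control and p_target must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the binomial variances of the two proportions (unpooled).
    variance = p_control * (1 - p_control) + p_target * (1 - p_target)
    effect = p_target - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


if __name__ == "__main__":
    print(sample_size_per_variant(0.10, 0.12))
```

Note how strongly the result depends on the assumed control rate and on the size of the improvement: halving the difference to detect roughly quadruples the required N.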
Questions:
- Is this method of determining N sound? If so, what is the safest way to determine the conversion rate of the control prior to beginning the test?
- Are there sound ways of determining N that don't require that one know conversion rates of the control in advance?
- Is the methodology in the linked article sound? If not, are there any accessible and easily digestible methods out there that you could link me to?