I am currently experimenting with a TV attribution approach proposed by Google: Liu, Y., Schwarzkopf, Y., & Koehler, J. (2017). TV Impact on Online Searches.
They propose comparing website traffic after TV spots to an estimated baseline. Because spots may overlap, they are first pooled into spotgroups, and the spotgroup uplifts are later disaggregated to the individual spots based on channel, impressions and hour of day.
The approach is outlined quite clearly (see Section 2.2 on p. 5f in the paper above), but I am not sure how to implement the model in R. I have also posted my current state of experimentation with some toy data below.
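To make the pooling step concrete, here is a minimal sketch of how overlapping spots could be assigned to spotgroups. The data frame, the column names and the 5-minute window are my own assumptions for illustration; the paper may define overlap differently.

```r
# Hypothetical spot log: airing time in minutes since midnight.
spots <- data.frame(
  spot_id = 1:6,
  minute  = c(600, 603, 640, 900, 904, 907)
)

# Assumed rule (not taken from the paper): a spot joins the current
# spotgroup if it airs within `window` minutes of the previous spot;
# otherwise a new spotgroup starts.
window <- 5
spots <- spots[order(spots$minute), ]
spots$spotgroup <- cumsum(c(TRUE, diff(spots$minute) > window))

spots
```

With this rule, chains of closely spaced spots end up in one group even if the first and last spot are more than `window` minutes apart, which is the behaviour I would expect for overlapping response windows.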
Basically this boils down to questions related to linear regression:
- How can I implement the proposed multiplication between Channel and Impressions? Would this be a simple interaction effect (say ch1:imp1 in R)?
- How can I account for a flexible number of spots within each spotgroup (compare the sum over m in formula (3))?
There are plenty of spotgroups with only one spot. There are also quite a few different channels (around 20). How can I best model this? Do I have to create dummy variables manually?
(There may also be multiple spots with the same channel within the same spotgroup.)
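One direction I am considering for the flexible number of spots is to keep the spots in long format (one row per spot) and sum impressions per spotgroup and channel, so that each channel becomes one column regardless of how many spots a spotgroup contains. The sketch below assumes a purely additive effect of impressions per channel and leaves out hour of day, so it may not match formula (3) exactly; all data and names are made up.

```r
# Hypothetical long-format spot data: one row per spot.
spots <- data.frame(
  spotgroup   = c(1, 1, 2, 3, 3, 3, 4, 5, 5),
  channel     = factor(c("A", "C", "B", "A", "A", "C", "C", "A", "B")),
  impressions = c(5000, 300, 10000, 1000, 2000, 500, 1000, 4000, 500)
)

# Hypothetical uplift per spotgroup, ordered by spotgroup id 1..5.
uplift <- c(200, 400, 10, 80, 90)

# Sum impressions per spotgroup and channel: one column per channel,
# zeros where a channel did not air in that spotgroup. Spots sharing
# a channel within a spotgroup are added up.
X <- as.data.frame.matrix(
  xtabs(impressions ~ spotgroup + channel, data = spots)
)

# Rows of X are sorted by spotgroup id, matching the order of `uplift`.
design <- cbind(uplift = uplift, X)

# One coefficient per channel (uplift per impression), no intercept.
# This additive form is an assumption, not the paper's exact model.
fit <- lm(uplift ~ 0 + ., data = design)
coef(fit)
```

This avoids creating dummy variables by hand, since xtabs() generates one column per channel level, but I am unsure whether it is a faithful implementation of the proposed model.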
Desired Model
Current Attempts
uplift_df <- tibble::tribble(
~uplift, ~ch1, ~imp1, ~hour1, ~ch2, ~imp2, ~hour2, ~ch3, ~imp3, ~hour3,
200, 1, 5000, 19, 0, 0, 0, 1, 300, 19,
50, 1, 4000, 22, 1, 500, 22, 0, 0, 0,
400, 0, 0, 0, 1, 10000, 14, 1, 500, 14,
80, 0, 0, 0, 0, 0, 0, 1, 1000, 21,
10, 1, 1000, 12, 1, 2000, 13, 1, 500, 12,
100, 1, 8000, 14, 0, 0, 0, 1, 300, 14,
90, 1, 4000, 12, 1, 500, 12, 0, 0, 0,
250, 0, 0, 0, 1, 10000, 14, 1, 500, 14,
50, 0, 0, 0, 0, 0, 0, 1, 1000, 21,
20, 1, 4000, 12, 1, 2000, 13, 1, 500, 12
)
lm(uplift ~ ch1:imp1 + hour1 + ch2:imp2 + hour2 + ch3:imp3 + hour3, data = uplift_df)
#>
#> Call:
#> lm(formula = uplift ~ ch1:imp1 + hour1 + ch2:imp2 + hour2 + ch3:imp3 +
#> hour3, data = uplift_df)
#>
#> Coefficients:
#> (Intercept)        hour1        hour2        hour3     ch1:imp1
#>  278.551180   -43.146585    31.957221    64.208258    -0.001157
#>    ch2:imp2     ch3:imp3
#>   -0.051877    -1.575838
Created on 2019-02-20 by the reprex package (v0.2.1)
