FR:
Accuracy: 0.64 F1_score: 0.391
TP: 232 FP: 553
FN: 170 TN: 1056
US:
Accuracy: 0.709 F1_score: 0.463
TP: 251 FP: 433
FN: 150 TN: 1172
CA:
Accuracy: 0.88 F1_score: 0.725
TP: 306 FP: 150
FN: 82 TN: 1401
DE:
Accuracy: 0.509 F1_score: 0.366
TP: 293 FP: 893
FN: 120 TN: 758
UK:
Accuracy: 0.929 F1_score: 0.824
TP: 329 FP: 74
FN: 67 TN: 1510
ROAS
A compound Poisson-like process?
What is ROAS?
ROAS, or return on ad spend, is the ratio of the revenue driven by an advertising campaign to the cost of that campaign, \(\frac{R}{C}\). How would ROAS be distributed? Revenue is roughly the number of conversions driven by the campaign, \(N\), times the revenue per conversion, \(P\): \(R=NP\).
How is ROAS distributed?
Let’s take the number of conversions (\(N\)) from a given number of impressions to be Poisson distributed (potentially taking some liberties with how impressions are measured). \[N \sim Pois(\lambda)\]
The revenue per conversion (\(P\)) and the ad expenditure (\(C\)) are drawn from some strictly positive distribution; I choose a lognormal distribution. \[\begin{aligned}P &\sim Lognormal(\mu_p, \sigma_p^2) \\ C &\sim Lognormal(\mu_c, \sigma_c^2)\end{aligned}\]
We can now think of ROAS as \(N\frac{P}{C}\). We can make some further simplifying assumptions:
- The revenue driven per conversion and the average ad expenditure do not depend on the content of the creative.
- The conversion rate is independent of the revenue driven per conversion and ad spend.
- The only influence the content of the creative \(X\) has is on the rate of conversions \(\lambda\).
Under these assumptions: \[\begin{aligned}\mu(creative)&=E[ROAS|creative]\\&=E[N\frac{P}{C}|creative]\\&\approx E[N|creative]E[\frac{P}{C}]\\ if \, P \perp \!\!\! \perp C \, \& \, E[C]\gg1 \implies &\approx E[N|creative]\frac{E[P]}{E[C]}\end{aligned}\]
These assumptions may not be realistic. For instance, conversion rates may be lower for more expensive products, meaning \(cov(N, P) \neq 0\). Or a creative could persuade a person to buy a more expensive laptop than the one they originally intended to purchase, so that \(P|X \neq P\).
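To make the generative story concrete, here is a minimal simulation sketch using only the Python standard library. The parameter values and the helper names `sample_poisson`, `simulate_roas`, and `exact_mean_roas` are illustrative assumptions, not taken from the analysis. Under the independence assumptions, \(P/C\) is itself lognormal, so the simulated mean can be compared with \(\lambda e^{\mu_p-\mu_c+(\sigma_p^2+\sigma_c^2)/2}\). The ratio-of-means shortcut \(\lambda E[P]/E[C]\) differs from this by a factor of \(e^{\sigma_c^2}\), which is close to 1 only when \(\sigma_c\) is small.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_roas(lam, mu_p, sigma_p, mu_c, sigma_c, n_draws, seed=0):
    """Simulate ROAS = N * P / C with N, P, C drawn independently."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        n = sample_poisson(lam, rng)           # conversions
        p = rng.lognormvariate(mu_p, sigma_p)  # revenue per conversion
        c = rng.lognormvariate(mu_c, sigma_c)  # ad spend
        draws.append(n * p / c)
    return draws


def exact_mean_roas(lam, mu_p, sigma_p, mu_c, sigma_c):
    """E[N] * E[P/C], using P/C ~ Lognormal(mu_p - mu_c, sigma_p^2 + sigma_c^2)."""
    return lam * math.exp(mu_p - mu_c + (sigma_p**2 + sigma_c**2) / 2)


# Hypothetical parameter values, chosen only for illustration.
params = dict(lam=3.0, mu_p=3.0, sigma_p=0.3, mu_c=4.0, sigma_c=0.2)
draws = simulate_roas(n_draws=100_000, **params)
mc_mean = sum(draws) / len(draws)
print(f"simulated mean ROAS: {mc_mean:.3f}")
print(f"closed-form mean:    {exact_mean_roas(**params):.3f}")
```

Because \(N\) can be zero, a noticeable share of the simulated ROAS values are exactly zero, which is one more way the response departs from a normal distribution.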
The Model
A linear model will be used to predict ROAS based on the content of the creative. This type of model is relatively easy to implement and given certain assumptions can produce results with nice statistical properties.
OLS
OLS requires: \[\begin{aligned} ROAS|X &= X\beta + \varepsilon \\ \varepsilon &\sim N(0, \sigma^2) \\ Var(\varepsilon|X) &= \sigma^2 \end{aligned}\]
Given the generative model above, this is very unlikely to be true. But we can still use OLS to get a sense of the relationship between ROAS and the creative. Let’s see how well OLS recovers that relationship.
| features | true_betas | pred_betas | pvalues |
|---|---|---|---|
| feature_0 | -0.117 | -0.223 | 0.000 |
| feature_1 | 0.124 | 0.147 | 0.002 |
| feature_2 | -0.061 | -0.004 | 0.935 |
| feature_3 | 0.066 | 0.134 | 0.005 |
| feature_4 | 0.135 | 0.213 | 0.000 |
| feature_5 | 0.003 | 0.032 | 0.506 |
| feature_6 | 0.006 | 0.055 | 0.248 |
| feature_7 | -0.000 | 0.002 | 0.970 |
| feature_8 | -0.141 | -0.315 | 0.000 |
| feature_9 | 0.017 | -0.015 | 0.753 |
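One way to see why the constant-variance assumption is unlikely to hold: under the generative model above, with \(N \sim Pois(\lambda_X)\) independent of the ratio \(P/C\), the conditional variance of ROAS follows from \(E[N^2|X]=\lambda_X+\lambda_X^2\). This is a sketch under the stated independence assumptions, not a result from the fitted models. \[\begin{aligned}Var(ROAS|X) &= E[N^2|X]\,E\left[\left(\tfrac{P}{C}\right)^2\right] - \left(E[N|X]\,E\left[\tfrac{P}{C}\right]\right)^2 \\&= \lambda_X\,E\left[\left(\tfrac{P}{C}\right)^2\right] + \lambda_X^2\,Var\left(\tfrac{P}{C}\right)\end{aligned}\] The variance therefore grows with \(\lambda_X\), the very quantity the creative is assumed to influence, so the errors are heteroscedastic as well as non-normal.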
GLM and Link Functions
A model should only be able to produce plausible responses: ROAS cannot be negative, so the model should not be able to predict a negative ROAS.
A natural link between \(\mathbb{R}\) (the world of a linear model \(X\beta\)) and \(\mathbb{R}^+\setminus\{0\}\) (the world of mean ROAS) is a log link. \[\begin{aligned}\mu(X) &\approx E[N|X]\frac{E[P]}{E[C]} \\&=\lambda_{X}\frac{e^{\mu_p+\frac{\sigma_p^2}{2}}}{e^{\mu_c+\frac{\sigma_c^2}{2}}} \\&= e^{\beta_0}e^{X\beta}\end{aligned}\] or, \[\begin{aligned}\log(\mu(X)) &\approx \log(E[N|X])+\log(E[P])-\log(E[C]) \\&= \log(\lambda_X) + \mu_p+\frac{\sigma_p^2}{2} -\mu_c-\frac{\sigma_c^2}{2}\\&= X\beta + \beta_0\end{aligned}\] Because we assumed that \(P\) and \(C\) are independent of \(X\) and that \(cov(N, P)=cov(N,C)=0\), their means get absorbed into the constant term \(\beta_0\).
FR:
Accuracy: 1.0 F1_score: 0.999
TP: 401 FP: 0
FN: 1 TN: 1609
US:
Accuracy: 1.0 F1_score: 0.999
TP: 400 FP: 0
FN: 1 TN: 1605
CA:
Accuracy: 0.999 F1_score: 0.999
TP: 387 FP: 0
FN: 1 TN: 1551
DE:
Accuracy: 1.0 F1_score: 0.999
TP: 412 FP: 0
FN: 1 TN: 1651
UK:
Accuracy: 0.999 F1_score: 0.999
TP: 395 FP: 0
FN: 1 TN: 1584
| features | true_betas | pred_betas | pvalues |
|---|---|---|---|
| feature_0 | -0.117 | -0.047 | 0.028 |
| feature_1 | 0.124 | 0.039 | 0.079 |
| feature_2 | -0.061 | 0.021 | 0.323 |
| feature_3 | 0.066 | 0.016 | 0.464 |
| feature_4 | 0.135 | 0.059 | 0.009 |
| feature_5 | 0.003 | 0.010 | 0.656 |
| feature_6 | 0.006 | 0.018 | 0.401 |
| feature_7 | -0.000 | -0.008 | 0.715 |
| feature_8 | -0.141 | -0.062 | 0.004 |
| feature_9 | 0.017 | -0.013 | 0.545 |
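The log-link idea can be sketched with a small Poisson regression fitted by Newton-Raphson, using only the Python standard library. The source does not state which GLM family or fitting routine was used, so the Poisson family, the single binary feature, the coefficient values, and the function names here are all assumptions made for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def fit_poisson_log_link(xs, ys, n_iter=25):
    """Newton-Raphson fit of log E[y|x] = b0 + b1 * x for a single feature."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0  # start at the intercept-only fit
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            mu = math.exp(b0 + b1 * x)  # always positive thanks to the log link
            g0 += y - mu                # score for the intercept
            g1 += (y - mu) * x          # score for the slope
            h00 += mu                   # entries of the information matrix
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: one binary creative feature with a known effect.
rng = random.Random(0)
true_b0, true_b1 = 0.5, 0.5
xs = [rng.randint(0, 1) for _ in range(5000)]
ys = [sample_poisson(math.exp(true_b0 + true_b1 * x), rng) for x in xs]

b0, b1 = fit_poisson_log_link(xs, ys)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"multiplicative effect of the feature: {math.exp(b1):.3f}")
```

With a log link, a coefficient acts multiplicatively: turning the feature on scales the predicted mean by \(e^{\beta}\), and the prediction \(e^{\beta_0 + X\beta}\) can never be negative.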
We observe revenue \(NP|X\) and ad spend \(C|X\), and can use these to estimate ROAS, \(\frac{NP}{C}\). Alternatively, we can model revenue per impression directly. This removes the influence of the cost of the ad campaign, which neither affects nor is affected by the content of the creative. We can use the same model as above to estimate the revenue per impression \(E[NP|X]\) and then use the ad spend to estimate ROAS. This is the same as the first model, but with revenue per impression rather than ROAS as the dependent variable.
FR:
Accuracy: 0.8 F1_score: 0.52
TP: 218 FP: 218
FN: 184 TN: 1391
US:
Accuracy: 0.85 F1_score: 0.632
TP: 259 FP: 159
FN: 142 TN: 1446
CA:
Accuracy: 0.94 F1_score: 0.856
TP: 345 FP: 73
FN: 43 TN: 1478
DE:
Accuracy: 0.689 F1_score: 0.46
TP: 273 FP: 502
FN: 140 TN: 1149
UK:
Accuracy: 0.994 F1_score: 0.986
TP: 390 FP: 5
FN: 6 TN: 1579
| features | true_betas | pred_betas | pvalues |
|---|---|---|---|
| feature_0 | -0.117 | -0.108 | 0.000 |
| feature_1 | 0.124 | 0.110 | 0.000 |
| feature_2 | -0.061 | -0.050 | 0.005 |
| feature_3 | 0.066 | 0.055 | 0.003 |
| feature_4 | 0.135 | 0.115 | 0.000 |
| feature_5 | 0.003 | -0.006 | 0.743 |
| feature_6 | 0.006 | -0.004 | 0.841 |
| feature_7 | -0.000 | -0.005 | 0.785 |
| feature_8 | -0.141 | -0.127 | 0.000 |
| feature_9 | 0.017 | 0.007 | 0.705 |
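As a rough sketch of the last step of the revenue-per-impression approach, a predicted revenue per impression can be turned back into a ROAS estimate by scaling by the number of impressions and dividing by the observed spend. The function name and the numbers below are hypothetical and only illustrate the arithmetic.

```python
def roas_from_revenue_per_impression(revenue_per_impression, impressions, spend):
    """Convert predicted revenue per impression into a ROAS estimate."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return revenue_per_impression * impressions / spend


# Hypothetical campaign: 0.02 predicted revenue per impression,
# 50,000 impressions, and 400 units of ad spend.
estimate = roas_from_revenue_per_impression(0.02, 50_000, 400.0)
print(f"estimated ROAS: {estimate:.2f}")
```

Keeping spend out of the regression and applying it only at this stage means the noise in \(C\) never enters the coefficient estimates.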