Brian G Birkhead, Managing Director, Coniak Ltd
Introduction
The aim of Media Attribution is to apportion accurately the sales value observed (or likely to be observed) in any particular period to each of the media “in play” in that period, in order to plan media budgets and optimise the return on them.
The routes by which customers can touch an organisation to research and potentially make purchases are legion: the set of all possible sequences of search engine, social media, display media, affiliate links, email and so on (with each of these media potentially appearing more than once in any particular sequence).
All media “touches” appearing in a sequence associated with a particular individual visitor may legitimately contribute to any eventually observed sale.
Attribution techniques fall into two broad classes – media mix models and fractional attribution models – which serve different but complementary purposes for an organisation.
Media Mix approaches (an example of which is econometric modelling) use a wide range of industry- and organisation-specific data to understand media impacts over time. They cover all broad-reach media, which cannot always be linked back to individuals, and deliver a high-level view of the return on media spend (descriptively and predictively) over long time periods, incorporating time-lagged effects and a range of external factors in their model forms.
Fractional Attribution models, by contrast, operate on individual-level data and try to assign to each medium the appropriate proportion of conversions/value from orders in generally shorter time periods. They use large amounts of detailed data (digital and direct) gathered across a range of platforms, devices and channels and combine it to create individuals’ visit and conversion paths, which form the unit records for the models.
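To make the idea of a “unit record” concrete, here is a minimal sketch (in Python) of one plausible shape for a single visitor’s path record; the field names are purely illustrative and not a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class Touch:
    """One media touch on a visitor's journey (illustrative fields only)."""
    channel: str          # e.g. "paid_search", "display", "email"
    timestamp: datetime   # when the touch occurred

@dataclass
class PathRecord:
    """One visitor's combined visit-and-conversion path: the model's unit record."""
    visitor_id: str
    touches: List[Touch] = field(default_factory=list)  # ordered media touches
    converted: bool = False                              # did the path end in an order?
    order_value: Optional[float] = None                  # order value if converted
```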
For an organisation to gather these data itself may require significant investment, and many have to fall back on off-the-shelf fractional attribution modelling services – such as those provided by Google Analytics – which offer only a limited set of modelling options, using conversion path data that they hold and do not make available at individual level for the organisation to analyse independently.
In my view, these two general approaches (Media Mix & Fractional Attribution) are complementary and an intelligent organisation will use them in combination to inform their media strategies & expenditure both at the “highest” level and at the campaign level.
In this paper, we put econometric modelling (and media mix modelling generally) to one side as an established and robust way of informing and managing long-term media budgets and we focus on some of the main fractional attribution options available for use “at ground level”.
I will summarise and briefly critique some of the main approaches in this space and go on to recommend what in my view is the preferred method, which offers flexibility and transparency, actively encourages the use of marketers’ business intelligence in the creation of models and delivers statistically robust, interpretable and actionable results.
A Brief Review of Some of the Main Fractional Attribution Options
- Rules-based Attribution
There are a number of simple rules-based attribution “models” available (e.g. within Google Analytics) which operate only on converted visits, taking no account of non-conversions.
Routinely using these models strikes me as less than clever since, in truth, they are not models but simply heuristics.
Making subjective rule choices based on opinion, hope or speculation cannot be right when trying to estimate returns on marketing investments. This is not a rational way to go about managing substantial budgets.
Attribution of sales must vary according to the time, the sequence and the type of the individual media interactions and simplistic, rules-based models cannot be appropriate.
This means that first-touch and last-touch models, linear attribution (assigning the same proportion of sales to each medium touched along the conversion path) and time-decay models (weighting by recency of media touch) in most cases just won’t cut the mustard. All will produce wildly different results and all will lack justification in equal measure.
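For concreteness, the sketch below shows how these four heuristics might be coded against a single converted path. The channel names, the seven-day half-life for time decay and the order value of 100 are made-up examples, not recommendations.

```python
from collections import defaultdict

# Illustrative only: the four common heuristic rules applied to one converted path.
# Each path element is (channel, days before conversion); the values are invented.

def first_touch(path, value):
    return {path[0][0]: value}                     # all credit to the first channel

def last_touch(path, value):
    return {path[-1][0]: value}                    # all credit to the last channel

def linear(path, value):
    shares = defaultdict(float)
    for channel, _ in path:
        shares[channel] += value / len(path)       # equal credit per touch
    return dict(shares)

def time_decay(path, value, half_life_days=7.0):
    # Weight each touch by 2^(-days_before_conversion / half_life), then normalise.
    weights = [(channel, 0.5 ** (days / half_life_days)) for channel, days in path]
    total = sum(w for _, w in weights)
    shares = defaultdict(float)
    for channel, w in weights:
        shares[channel] += value * w / total
    return dict(shares)

path = [("display", 9), ("paid_search", 3), ("email", 0)]
for rule in (first_touch, last_touch, linear, time_decay):
    print(rule.__name__, rule(path, 100.0))
```

Running the four rules on the same path makes the point directly: each produces a materially different allocation of the same 100 units of value.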
Given that this is a very high stakes game, more rational & intelligent approaches are called for which acknowledge that this problem is essentially a data-driven, predictive modelling one rather than one that can be solved by invoking arbitrary, heuristic rules.
To use an approach because it is simple and readily available is in my view, dumb.
NOTE: Rules-based models do not provide estimates of truly incremental contributions to value since they tend not to allow explicitly for sales that result from underlying brand equity or from other non-trackable media impacts. Instead, they set out to allocate all the observed sales value to the media under consideration. This means that their estimates are effectively just an ordering of the relative contribution of the different media.
- Algorithmic Sequence & Markov Modelling
This is a technique that predicts conversion probabilities for different path sequences. By aggregating over the set of sequences that contain a particular medium and comparing this with an aggregation over an otherwise identical set of sequences that do not contain that medium, an estimate is made of that medium’s impact on conversions. Estimates of this sort across all media then deliver the relative contribution of each medium to observed conversions.
If done well, it is an approach which makes more sense than most heuristic alternatives. However, it uses only the customer journey sequences and takes no explicit account of more complex factors, such as the time between media touches.
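For readers who want to see the mechanics, here is a minimal sketch of the general idea on toy data: build first-order transition probabilities from observed paths, estimate the overall conversion probability with and without each medium (the so-called removal effect), and normalise the differences into relative credit. It illustrates the technique in general, not any particular vendor’s implementation.

```python
from collections import defaultdict

def transition_probs(paths):
    """First-order transition probabilities from (channel list, converted?) paths."""
    counts = defaultdict(lambda: defaultdict(int))
    for channels, converted in paths:
        states = ["start"] + channels + ["conversion" if converted else "null"]
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    return {s: {t: n / sum(nxt.values()) for t, n in nxt.items()}
            for s, nxt in counts.items()}

def conversion_prob(T, removed=None, iters=200):
    """P(eventually reach "conversion" from "start"), optionally with one channel removed."""
    p = defaultdict(float)
    p["conversion"] = 1.0
    for _ in range(iters):
        for s, nxt in T.items():
            if s == removed:
                continue
            p[s] = sum(prob * (0.0 if t == removed else p[t])
                       for t, prob in nxt.items())
    return p["start"]

# Toy paths: (ordered channel touches, did the path convert?)
paths = [(["display", "paid_search"], True),
         (["email"], False),
         (["paid_search", "email"], True),
         (["display"], False)]

T = transition_probs(paths)
base = conversion_prob(T)
channels = {c for chs, _ in paths for c in chs}
removal = {c: 1 - conversion_prob(T, removed=c) / base for c in channels}
total = sum(removal.values())
attribution = {c: r / total for c, r in removal.items()}  # relative credit per channel
print(attribution)
```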
- Probabilistic Attribution Modelling
These approaches apply statistical and estimation modelling procedures to customer pathway and conversion data alongside customer attribute variables to derive the probability of conversion en route to estimating the fractional contribution of different media to observed conversions.
They break down into two distinct classes:
- (i) those that use “standard” model forms (such as logistic regression) to represent the underlying relationship and go on to estimate model parameters statistically
- (ii) those that use machine learning, with standard and non-standard model forms, to explore a large number of solutions in order to select the “best” one
We will discuss class (ii) first.
Smart models of this sort (often more mathematical than statistical) may require fewer explicit assumptions about the data distributions and inter-factor relationships than conventional statistical models (class (i)) require and, as new journey data arrives, they are automatically updated and refined. They are therefore able to adapt quickly to changing environmental and business factors. For this reason, they have been described in terms of “attribution delivered with the maximal flexibility, speed, automation and scalability that big data demands”.
However, this approach (particularly at the very “smart” end of the spectrum), apart from carrying the danger of over-fitting, suffers (justifiably, in my view) from being perceived as a black box – a form of undirected analysis in which control, transparency of method and clarity of explanation are often implicitly sacrificed to the machine.
This transfer of control puts a great onus on the data structures and formats that are presented to the smart algorithm and demands a meticulous, resource-intensive data preparation phase which cannot itself be fully automated.
These considerations, a general suspicion of smart algorithms per se, and skills and cost issues together currently place this approach within the purview of only a few organisations, and sticking with the status quo attribution paradigm is a much easier and emotionally frictionless decision for the rest. Doubtless, this will change in the future as solutions emerge that overcome these barriers to adoption and as a body of empirical evidence supporting their validity grows and is critically scrutinised.
We now turn to my currently preferred approach, class (i).
Despite its dependency on fixed model forms and its lesser adaptive capability, the use of established, statistical models on customer path & characteristic data to attribute media contributions has a number of substantial advantages over the other options:
1. it engages the marketer at every stage of the modelling process
2. it is transparent
3. it uses statistically robust procedures which are already familiar and well-established in the marketing arena across a large number of applications
4. all (raw or derived) variables relating to the frequency, nature, sequencing and timing of customer pathways can be considered and tested for inclusion in the model in a systematic way (this could & should also be done under class (ii), of course)
5. it allows for the explicit inclusion & estimation of interaction (halo) effects between media (by inclusion of the appropriate cross-terms in the regression equation)
6. it accommodates and estimates a “base” level of value that would result irrespective of the media being present (or from media that cannot be tracked at the individual level)…
7. … and therefore genuinely delivers estimates of incremental media contributions
8. it makes use of analytical resource and skills that already exist in most organisations
9. it is profoundly more appropriate to this complex problem than the commonly used heuristic models
In the literature, most class (i) methods focus on using regression solely to estimate the impact of media (and customer variables) on the likelihood of conversion, in order to assign to each medium the appropriate proportion of all conversions observed in a given period.
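As a concrete, deliberately simplified illustration of that pattern, the sketch below fits a logistic regression to toy path-level touch counts and turns the fitted model into fractional credit for one converted path by counterfactually zeroing out each channel in turn. The toy data, the feature construction and the counterfactual step are my own assumptions for illustration; cross-terms for halo effects would simply be additional columns, and the intercept plays the role of the “base” conversion rate.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

channels = ["display", "paid_search", "email"]

# Each row: touch counts per channel for one visitor path (invented data).
X = np.array([[1, 1, 0],
              [0, 0, 1],
              [0, 1, 1],
              [1, 0, 0],
              [2, 1, 1],
              [0, 0, 0]], dtype=float)
y = np.array([1, 0, 1, 0, 1, 0])            # 1 = converted

model = LogisticRegression().fit(X, y)       # the intercept captures a "base" conversion rate

# Fractional credit for one converted path: compare the predicted conversion
# probability with the path as observed and with each channel's touches removed.
path = X[4:5]
p_full = model.predict_proba(path)[0, 1]
lift = {}
for j, ch in enumerate(channels):
    counterfactual = path.copy()
    counterfactual[0, j] = 0.0
    lift[ch] = p_full - model.predict_proba(counterfactual)[0, 1]

total = sum(lift.values())
shares = {ch: v / total for ch, v in lift.items()}   # normalised credit per channel
print(shares)
```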
In recent work with Fusion-Analytics*, my own company, Coniak, has developed a refinement of this approach** which carries all of benefits 1-9 above and which combines two individual-level regression models to deliver robust estimates of the incremental contribution of media to the value observed in a period. The new methodology implicitly acknowledges and allows for the fact that the different types of customer who visit may correlate with the different media to which they have been exposed during their journey, and that those customer types, in turn, may not only convert at different rates but also place orders of significantly different value.
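Without disclosing the specifics of that methodology, the sketch below gives my own illustrative reading of the general two-model idea: one individual-level model for the probability of conversion given the path and customer attributes, a second for order value given conversion, with expected value as their product and a medium’s incremental contribution estimated counterfactually. The data, features and model choices here are assumptions for illustration only, not the Coniak / Fusion-Analytics implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

# Columns: display, paid_search, email touch counts plus one customer attribute
# (e.g. a returning-visitor flag). All values are invented for illustration.
X = np.array([[1, 1, 0, 0], [0, 0, 1, 1], [0, 1, 1, 0],
              [1, 0, 0, 1], [2, 1, 1, 0], [0, 0, 0, 1]], dtype=float)
converted = np.array([1, 0, 1, 0, 1, 0])
order_value = np.array([120.0, 0.0, 80.0, 0.0, 200.0, 0.0])

# Model 1: probability of conversion given the path and customer attributes.
p_model = LogisticRegression().fit(X, converted)
# Model 2: order value given conversion, fitted on converted paths only.
v_model = LinearRegression().fit(X[converted == 1], order_value[converted == 1])

def expected_value(x):
    """E[value] = P(convert | x) * E[order value | convert, x]."""
    x = x.reshape(1, -1)
    return p_model.predict_proba(x)[0, 1] * v_model.predict(x)[0]

# Incremental contribution of each medium for one path: expected value as observed
# minus expected value with that medium counterfactually removed.
x = X[4].copy()
for j, medium in enumerate(["display", "paid_search", "email"]):
    without = x.copy()
    without[j] = 0.0
    print(medium, expected_value(x) - expected_value(without))
```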
This approach, to me, currently seems the truly “smart” one to use!
* Fusion-Analytics provides a platform for capturing the necessary multi-channel path data
** For more information on this methodology contact:
brian.birkhead@coniak.co.uk or howard@fusion-analytics.co.uk