Attribution Modelling

In this series of articles I’m talking about the 5 steps of digital marketing attribution: Classification, Pathing, Attribution, Valuation and Optimisation. In the last article I covered the details of pathing, which built on the article about channel classification. That means today we can get into the real heart of digital marketing attribution by examining the different types of attribution model.

To give a value to a marketing channel we’re going to go back to every converting customer, evaluate every marketing touchpoint before conversion and give it a score, and then we’re going to sum up those scores to give an overall value for each marketing channel. Let’s start with a typical conversion path which could have occurred on our website.


And then in more detail (because we’ll need it later)

04/04/2016 SEO
05/04/2016 PPC
15/04/2016 PPC
15/04/2016 DIRECT
15/04/2016 CONVERSION

There are two main types of attribution model that can be applied to a conversion path: rule-based and data-driven. Rule-based models are far easier to understand, so let’s start there.

Rule-based Attribution Models

Rule-based attribution models apply a fixed rule on how to weight each marketing touchpoint in a conversion path. This makes them straightforward to implement once you’ve got the attribution paths available.

Last touch

Let’s start with the grandad of all attribution models, the last-touch model. The rule for this is super simple: give all of the conversion credit to the last touched marketing channel. In this case that means “DIRECT”. The real strength of this model is its simplicity. To build a last-touch attribution model you just keep track of the last marketing touchpoint in a cookie, and when a conversion occurs you record that touchpoint along with the conversion. Done. No need for complicated pathing logic, no cross-device journeys to worry about, no processing systems required beyond the standard web tracking pixels.
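As a minimal sketch of that rule, assuming the path is already available as a list of (date, channel) pairs in chronological order (the `path` structure and the `last_touch` name are just illustrative, not taken from any particular tool):

```python
from datetime import date

# The example conversion path from above, oldest touchpoint first.
path = [
    (date(2016, 4, 4), "SEO"),
    (date(2016, 4, 5), "PPC"),
    (date(2016, 4, 15), "PPC"),
    (date(2016, 4, 15), "DIRECT"),
]


def last_touch(path):
    """Give the whole conversion credit to the final touchpoint's channel."""
    if not path:
        return {}
    _, channel = path[-1]
    return {channel: 1.0}


credit = last_touch(path)
print(credit)
```

In a real last-touch setup you wouldn’t even hold the whole path, only the most recent channel, which is exactly why this model is so cheap to run.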

As we can see from the fact we attribute this conversion to “DIRECT”, there is a danger here of ignoring the value of the marketing channels that came just before the last touch. Just because the last touch was direct doesn’t mean we should ignore the marketing touchpoint that occurred an hour earlier. Surely that should get some credit?

Last non-direct touch

A common improvement to the last touch model is to push the conversion credit back to the most recent non-direct marketing channel whenever “DIRECT” sits at the tail end of the path. For the example conversion above we’d now give the conversion credit to “PPC”.
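Here’s a rough sketch of that rule, assuming the path has been reduced to a list of channel names in chronological order. The function name and the fallback for all-DIRECT paths are my own assumptions; you may prefer to handle that case differently.

```python
def last_non_direct_touch(channels, direct="DIRECT"):
    """Credit the most recent channel that is not DIRECT.

    Assumption: if every touchpoint is DIRECT, the credit stays with DIRECT.
    """
    if not channels:
        return {}
    # Walk backwards from the conversion until we find a non-direct channel.
    for channel in reversed(channels):
        if channel != direct:
            return {channel: 1.0}
    return {direct: 1.0}


example_credit = last_non_direct_touch(["SEO", "PPC", "PPC", "DIRECT"])
print(example_credit)
```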

This now gives a bit more credit to marketing, fairly so, but last touch models do have a major flaw. They give all of the credit to the channels which are good at finally converting prospects, but they give no credit to the channel that introduced that prospect to your business. There’s a danger that by optimising to last touch you’ll strangle your sales funnel higher up.

First touch

And that leads us onto the next commonly known model, the first-touch model. In this case we do need to worry about pathing, and we do need to consider cross-device paths to be totally accurate. Once all that’s done we simply apply the conversion credit to the initial marketing channel in the journey. Note that you also need to think about how far back you’re going to look. In general it’s hard to say that a touchpoint that occurred 6 months ago really qualifies as introducing the prospect, unless they were continuously engaged from that point until the conversion event. The simple thing to do is to only take your paths back 30 days. Most paths will fall inside this time period anyway.
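A first-touch rule with a lookback window might be sketched like this. It assumes the pathing work has already been done and gives us (date, channel) pairs; the `first_touch` name and the `lookback_days` parameter are illustrative. The touchpoint from October 2015 is invented purely to show the window in action.

```python
from datetime import date, timedelta


def first_touch(path, conversion_date, lookback_days=30):
    """Credit the earliest touchpoint inside the lookback window."""
    window_start = conversion_date - timedelta(days=lookback_days)
    in_window = [
        (touch_date, channel)
        for touch_date, channel in path
        if window_start <= touch_date <= conversion_date
    ]
    if not in_window:
        return {}
    _, channel = min(in_window, key=lambda touch: touch[0])
    return {channel: 1.0}


path = [
    (date(2015, 10, 1), "DISPLAY"),  # invented old touchpoint, outside the window
    (date(2016, 4, 4), "SEO"),
    (date(2016, 4, 5), "PPC"),
    (date(2016, 4, 15), "PPC"),
    (date(2016, 4, 15), "DIRECT"),
]

first_credit = first_touch(path, conversion_date=date(2016, 4, 15))
print(first_credit)
```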

For the example shown above we will now give the conversion credit to SEO. But again, this seems a little unfair as we’re no longer giving any credit to PPC, and without PPC this lead may have ended up with a competitor. For this reason it’s worth using both the first and last touch models to evaluate marketing channels. From these two models you can get an idea of the first/last touch skew (a channel’s last-touch credit divided by its first-touch credit). Where the ratio is >1 a channel is better at completion, where the ratio is <1 the channel is better at introducing prospects.
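The skew calculation itself is tiny. In this sketch the channel totals are invented numbers purely for illustration, and returning `None` for a channel with no first-touch credit is my own choice rather than any standard.

```python
def touch_skew(last_touch_totals, first_touch_totals):
    """Last-touch credit divided by first-touch credit, per channel."""
    skew = {}
    for channel in set(last_touch_totals) | set(first_touch_totals):
        first = first_touch_totals.get(channel, 0.0)
        last = last_touch_totals.get(channel, 0.0)
        # Assumption: the ratio is undefined without any first-touch credit.
        skew[channel] = last / first if first else None
    return skew


# Invented totals, summed over many conversions.
last_totals = {"PPC": 60, "SEO": 20}
first_totals = {"PPC": 30, "SEO": 40}

skew = touch_skew(last_totals, first_totals)
print(skew)
```

With these made-up totals PPC comes out above 1 (a closer) and SEO below 1 (an introducer).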

First touch does give us more information than only looking at last touch, but it does feel like we should be able to do better.

Linear attribution

How about if we change our model so that we give partial credit for the conversion to all channels that were involved in a conversion path? Every touchpoint gets an equal share, so for the example above we’ll give 25% credit to DIRECT, 50% credit to PPC (two of the four touchpoints) and 25% credit to SEO.

This does seem somewhat fairer, and we can even ignore the DIRECT touchpoints to give 67% credit to PPC and 33% credit to SEO. But there is also a problem with this model. What happens if our very long conversion path started 30 days ago, and was then inactive until the day of conversion? We’ll end up crediting a lot of value to those marketing channels that were touched 30 days ago. But they shouldn’t be getting the same credit as the touchpoints from just before conversion.
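The linear split described above can be sketched as follows, again assuming a simple list of channel names. The `ignore` parameter is an illustrative way of dropping DIRECT touchpoints before sharing out the credit.

```python
from collections import Counter


def linear_attribution(channels, ignore=()):
    """Share one conversion equally across every touchpoint that is kept."""
    kept = [channel for channel in channels if channel not in ignore]
    if not kept:
        return {}
    counts = Counter(kept)
    return {channel: count / len(kept) for channel, count in counts.items()}


channels = ["SEO", "PPC", "PPC", "DIRECT"]

with_direct = linear_attribution(channels)
without_direct = linear_attribution(channels, ignore=("DIRECT",))
print(with_direct)
print({channel: round(share, 2) for channel, share in without_direct.items()})
```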


Time decay

So next up let’s use a time-decay model. In this case we weight the partial credit based on the recency of the marketing touchpoint to the conversion event. You could try lots of different weighting curves here, but for simplicity let’s use a linear decay over a 30 day window, so that each touchpoint is weighted by (30 - days before conversion) / 30, and calculate the credit for our current conversion path. First up let’s calculate a weighted influence for each touchpoint.

04/04/2016 SEO = (30-11)/30 = 0.633
05/04/2016 PPC = (30-10)/30 = 0.667
15/04/2016 PPC = (30-0)/30 = 1
15/04/2016 DIRECT = (30-0)/30 = 1
15/04/2016 CONVERSION

Now divide through by the total weighting and our credit scores are…

SEO = 0.192
PPC = 0.202 + 0.303 = 0.505
DIRECT = 0.303

We can keep refining our models with more and more complexity, such as U-shaped attribution (weight introducing and completing channels more than intermediate channels), but ultimately we’re just making our rules more and more complex, and we don’t really know if we’re improving our attribution modelling. The main improvement for all attribution models is discounting DIRECT touchpoints, which for time-decay gives us…

SEO = 0.275
PPC = 0.290 + 0.435 = 0.725
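The time-decay numbers above can be reproduced with a short sketch. It assumes a linear decay over a 30 day window, as in the worked example; the function and parameter names are illustrative.

```python
from datetime import date


def time_decay(path, conversion_date, window_days=30, ignore=()):
    """Linear time decay: weight = (window - days before conversion) / window."""
    weights = {}
    for touch_date, channel in path:
        if channel in ignore:
            continue
        age_days = (conversion_date - touch_date).days
        weight = max(window_days - age_days, 0) / window_days
        weights[channel] = weights.get(channel, 0.0) + weight
    total = sum(weights.values())
    if total == 0:
        return {}
    # Normalise so the credit for one conversion sums to 1.
    return {channel: weight / total for channel, weight in weights.items()}


path = [
    (date(2016, 4, 4), "SEO"),
    (date(2016, 4, 5), "PPC"),
    (date(2016, 4, 15), "PPC"),
    (date(2016, 4, 15), "DIRECT"),
]
converted_on = date(2016, 4, 15)

all_channels = time_decay(path, converted_on)
no_direct = time_decay(path, converted_on, ignore=("DIRECT",))
print({channel: round(credit, 3) for channel, credit in all_channels.items()})
print({channel: round(credit, 3) for channel, credit in no_direct.items()})
```

Swapping the linear weight for an exponential or any other curve only means changing the `weight` line.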

Before we move on from these rule-based systems there is one more common attribution model I want to mention.

Any Touch

An any touch model assigns credit to all channels that are involved during a conversion. In the case above we’d end up giving a credit of 1 to SEO, PPC and DIRECT. This could be considered fair as we’re going to end up giving credit to all channels involved in conversions, and when you sum up the scores for each marketing channel you’ll get the number of conversions that the channel participated in. Put another way, this is the maximum possible impact that the channel had.
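As a sketch, summing any touch credit over a set of converting paths looks like this. The second path is invented so that there is more than one conversion to count.

```python
from collections import Counter


def any_touch_totals(converting_paths):
    """Count, per channel, the conversions that channel took part in."""
    totals = Counter()
    for channels in converting_paths:
        # set() so a channel touched twice in one path still only scores 1.
        totals.update(set(channels))
    return totals


converting_paths = [
    ["SEO", "PPC", "PPC", "DIRECT"],  # the example path from above
    ["PPC", "DIRECT"],                # invented second conversion
]

totals = any_touch_totals(converting_paths)
print(dict(totals))
```

Note that the channel totals add up to more than the two conversions we actually had, which is the tell-tale sign of an any touch model.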

It turns out that lots of businesses actually end up with this model accidentally. It’s particularly likely where each marketing channel is owned by a different team. That’s because each team is running analytics that looks for their contribution to the conversion, without considering any of the other marketing channels that might have been involved.

Generally speaking by introducing multi-channel marketing attribution we’re trying to get away from the Any Touch model, but it’s still a useful reference when evaluating marketing performance.

Data-driven Attribution Models

Now we get on to the crazy maths bit.

Data-driven attribution modelling typically uses Bayesian probability to evaluate whether or not the presence of a marketing touchpoint in a customer’s journey is likely to lead to increased conversion.

Consider the case where we’re trying to evaluate the actual contribution of display advertising. We’ll look through all of the paths we see on our website and try to find those which are identical except for the presence of the display channel.

SEO -> Display -> PPC -> Conversion
SEO -> PPC -> Conversion
SEO -> Display -> PPC

Now we can look at the probability of conversion for the paths which include Display, and for the paths that don’t include it. Given these numbers we can estimate the value that the Display channel has in terms of encouraging conversion. When we want to get smarter we can start to look at whether the order of channels matters, whether the time between touchpoints matters, and so on. It can all get quite complicated, but at its heart this analysis will lead to a bunch of weights that you can apply to each marketing touchpoint in a converting path.
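To make the comparison concrete, here is a deliberately naive sketch. It is not a full Bayesian or Markov model; it just groups paths that are identical once the channel of interest is removed, and compares conversion rates with and without it. The path counts are invented for illustration and the function name is my own.

```python
from collections import defaultdict


def matched_lift(paths, channel):
    """Conversion-rate difference between otherwise identical paths.

    paths is a list of (channels, converted) pairs. Returns, for each base
    path, rate(with channel) - rate(without channel).
    """
    # For each base path keep [conversions, path count] for both variants.
    groups = defaultdict(lambda: {"with": [0, 0], "without": [0, 0]})
    for channels, converted in paths:
        base = tuple(c for c in channels if c != channel)
        side = "with" if channel in channels else "without"
        groups[base][side][0] += int(converted)
        groups[base][side][1] += 1

    lifts = {}
    for base, group in groups.items():
        (conv_with, n_with), (conv_without, n_without) = group["with"], group["without"]
        if n_with and n_without:  # only compare where both variants were seen
            lifts[base] = conv_with / n_with - conv_without / n_without
    return lifts


# Invented counts: 2 of 4 paths convert with Display, 1 of 4 without it.
paths = (
    [(("SEO", "Display", "PPC"), True)] * 2
    + [(("SEO", "Display", "PPC"), False)] * 2
    + [(("SEO", "PPC"), True)] * 1
    + [(("SEO", "PPC"), False)] * 3
)

lifts = matched_lift(paths, "Display")
print(lifts)
```

A real model would need far more paths per group before a difference like this means anything, and it would still be a correlation rather than proof that Display caused the extra conversions.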

Now before we get too carried away with this, data driven modelling isn’t perfect and it can provide misleading results.

Correlation vs Causality

Most data driven models have a problem distinguishing between correlation and causality. The difference is that correlated events often happen near conversions, but don’t necessarily cause the conversion to occur.

If we run a company where we’re trying to get people to sign up for an account, we could add a marketing touchpoint to all paths when people visit the “How do I apply?” page. This event will likely be strongly *correlated* to a conversion, but it’s unlikely to have high causality. The customer was already deep in the consideration cycle.

It’s very tough to factor this out of data driven attribution models, so bear this in mind if they indicate spending even more on brand PPC. Perhaps branded PPC clicks just correlate to conversion events rather than causing them!

The random event

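An interesting way to test your data-driven model is to introduce a new marketing channel, let’s call it RANDOM, and insert it at random into paths across your system, converting and non-converting alike. Because RANDOM has nothing to do with whether a customer converts, a correctly working algorithm should give this marketing channel a weighting of approximately zero.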
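Here is a sketch of that test using synthetic data. A simple difference in conversion rates stands in for “your model”, so swap in whatever your real algorithm produces. The 20% conversion rate, the path contents and the seeds are all assumptions made up for the demo.

```python
import random


def random_channel_lift(paths, probability=0.5, seed=0):
    """Tag a random subset of paths with RANDOM and compare conversion rates."""
    rng = random.Random(seed)
    with_conv = with_n = without_conv = without_n = 0
    for _channels, converted in paths:
        if rng.random() < probability:  # this path "saw" the RANDOM channel
            with_n += 1
            with_conv += int(converted)
        else:
            without_n += 1
            without_conv += int(converted)
    if not with_n or not without_n:
        return None
    return with_conv / with_n - without_conv / without_n


# Synthetic paths: identical journeys, roughly 20% of which convert.
data_rng = random.Random(1)
paths = [(("SEO", "PPC"), data_rng.random() < 0.2) for _ in range(20000)]

lift = random_channel_lift(paths)
print(f"lift from RANDOM: {lift:+.4f}")
```

With finite data the answer won’t be exactly zero, so what you’re really checking is that the weighting for RANDOM is small compared with the weightings of your real channels.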

So that’s it for today. We’ve now got through a lot of the nuts and bolts of attribution, and you should have lots of different ways of evaluating which marketing channels are leading to conversions. The main thing here to take away is that there is no perfect answer. You need someone looking at this data who knows what they’re talking about and is familiar with digital marketing and your particular business.

Next time we’ll talk about alternative ways of valuing these conversion events which is where we’ll really start seeing the value of attribution modelling. As ever, any questions, pop them down below.

2 Responses

  1. Jamie says:

    Hi Dom,

    Great article. I am really interested in testing a Bayesian probability attribution model, are there any websites or books you would recommend reading to help build one?

    Many thanks,

  2. Dom Penfold says:

    Information is relatively sparse on the ground here as few companies have published their algorithms. The main firms that do this are VisualIQ and GA360 with Adometry.

    One of our data scientists implemented our approach internally but we didn’t publish results or approach. A quick google for Markov Chain Analysis should turn up some interesting papers.

    Good luck. I’d be interested in knowing how this works for you.
