Attribution Models: Detailed Review and Comparison
In this article, we’ll look at the basic principles, pros, and cons of the most well-known attribution models, from standard models in Google Analytics to Markov chains and the Shapley value. You’ll learn how to evaluate the combined influence of campaigns (not just the last click) and how to choose the model that best suits your business.
- Why is attribution needed?
- What attribution models are available?
- Attribution models based on the position of the channel in the chain
- First Interaction or First Click
- Last Interaction or Last Click
- Last Non-Direct Click
- Linear Model
- Time Decay (and other models that take time into account)
- Position-Based or U-Shaped
- What’s the problem with standard attribution models?
- Algorithmic attribution models
- Brief conclusions
Why is attribution needed?
To answer this question, let’s look at a typical sales funnel. It consists of four main stages:
- Acquaintance with the brand
- Consideration, when the user already knows about the brand but is thinking about whether to buy (includes looking at reviews, comparing prices, and selecting the product)
- Conversion (purchase)
- Repeat purchases
You probably know that retaining customers is much cheaper than attracting new ones. Less personalized campaigns are aimed at attracting customers. These campaigns have much wider coverage and are more difficult to assess.
In contrast, more targeted campaigns are aimed at keeping customers; their effectiveness is much easier to assess because we already know the specific user and can link all their purchases and interactions with the company.
How can you know which advertising channels and campaigns work best and at what stage of the funnel? Attribution helps you figure this out.
Attribution is the distribution of conversion value among the campaigns that moved the user through the funnel. It helps answer the question of how much each channel influenced the conversion.
“Half the money I spend on advertising is wasted; the trouble is, I don’t know which half.”
By choosing the right attribution model for your business, you’ll be able to optimally distribute your advertising budget and, as a result, reduce costs and increase revenue. Let’s figure out which model is suitable for you.
What attribution models are available?
There are dozens of possible attribution models. They can be classified in different ways depending on the logic used in their calculations.
- If we look at what place a channel occupied in the customer journey before the order, then we’re using an attribution model based on position (Time Decay, Position-Based). If the calculation takes into account all data and not only the position of the channel in the chain, then it’s an algorithmic attribution model (Data-Driven, Markov chains).
- If we give all the value to only one channel that participated in the funnel, then it’s a single-channel model (Last Click, First Click). If the value is distributed among all channels in the chain, then it’s a multi-channel attribution model (Linear, Time Decay).
Let’s look at each model in detail.
Attribution models based on the position of the channel in the chain
Let’s start with the simplest attribution models, based on position, which are available in the free version of Google Analytics.
1. First Interaction or First Click
With these models, all the value received from a conversion is attributed to the first source that led the user down the funnel. For example, if we have a chain of four contacts, as shown below, according to the First Click model, the entire value of the conversion will be attributed to the CPC channel.
Pros: Easy to set up and use, as there are no calculations or arguments regarding the distribution of value between channels.
Cons: Doesn’t show the whole picture and overestimates channels at the top; users typically interact with several touchpoints before a purchase, which the First Interaction model completely ignores.
Suitable for businesses that want to increase brand awareness and audience reach as well as understand where to buy traffic that will convert. Useful for marketers who focus exclusively on demand generation and brand awareness.
2. Last Interaction or Last Click
According to this model, the entire value of the conversion goes to the last channel that the user came in contact with before the conversion. The contribution of all other channels is ignored. In our example, all the value will go to the Direct channel.
Pros: A popular model that’s familiar to many marketers; ideal for evaluating campaigns for quick purchases, such as for seasonal items.
Cons: Like all single-channel models, it ignores the role of other sources in the chain before ordering.
Suitable for businesses with a short sales cycle that use up to three advertising channels.
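To make the two single-channel rules concrete, here is a minimal Python sketch. The four-touch chain and the $100 order value are hypothetical, chosen only to mirror the example above (CPC first, Direct last); the function name `single_touch` is ours and not part of any analytics tool.

```python
def single_touch(path, value, position):
    """Give the whole conversion value to one channel in the path.

    position is "first" (First Click) or "last" (Last Click).
    """
    if not path:
        return {}
    winner = path[0] if position == "first" else path[-1]
    return {winner: value}


# Hypothetical chain of four contacts and a hypothetical order value
path = ["cpc", "organic", "email", "direct"]
first_click = single_touch(path, 100.0, "first")
last_click = single_touch(path, 100.0, "last")
print(first_click)
print(last_click)
```

Both rules ignore every touchpoint except one, which is exactly the limitation described in the cons above.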
3. Last Non-Direct Click
Google Analytics reports use this model by default. The entire conversion value is assigned to the last channel in the chain. However, if that last channel is Direct, then the value is attributed to the previous channel. The logic is simple: if the user came to you from a bookmark or entered a URL, then they most likely were already familiar with your brand.
Pros: Allows you to ignore channels that are insignificant in terms of advertising costs and focus on paid sources. In addition, Last Non-Direct Click can be used for comparing with other attribution models.
Cons: Doesn’t take into account the contribution of other channels. In addition, often the penultimate source in the chain is email. We understand that the customer came from somewhere and left their email address. But using Last Non-Direct Click, we underestimate the sources that helped the customer get acquainted with the brand, leave their email, and eventually decide to buy.
Suitable for businesses that want to evaluate the effectiveness of a particular paid channel and for whom brand recognition is no longer so important.
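The Last Non-Direct Click rule can be sketched in a few lines of Python. This is an illustration under our own assumptions, not the Google Analytics implementation: the channel labels and order value are hypothetical, and the sketch keeps walking back past several Direct sessions in a row, which goes slightly beyond the one-step description above.

```python
def last_non_direct_click(path, value, direct="direct"):
    """Credit the most recent channel that is not Direct.

    Falls back to Direct only when the path contains nothing else.
    """
    if not path:
        return {}
    for channel in reversed(path):
        if channel != direct:
            return {channel: value}
    return {direct: value}


# Hypothetical chain: the last session is Direct, so email gets the credit
credit = last_non_direct_click(["cpc", "organic", "email", "direct"], 100.0)
print(credit)
```

Note how email receives the full value here, which illustrates the con described above: the channels that brought the customer to the newsletter in the first place get nothing.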
4. Linear Model
This basic model simply divides the transaction value equally among all sources in the chain.
Pros: Simple, but at the same time more advanced than single-channel attribution models, as it takes into account all sessions before conversion.
Cons: Of little use when you need to reallocate the budget. Dividing value equally among channels is rarely the best option, since channels aren’t equally effective.
Suitable for businesses such as B2B companies with a long sales cycle, for which it’s important to maintain contact with the client at all stages of the funnel.
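As a rough Python sketch of the Linear model (the chain and the $100 value are hypothetical), every session gets an equal share, so a channel that appears twice in the chain collects two shares:

```python
def linear(path, value):
    """Split the conversion value equally among all sessions in the path."""
    if not path:
        return {}
    share = value / len(path)
    credit = {}
    for channel in path:
        # A channel that appears in several sessions accumulates several shares
        credit[channel] = credit.get(channel, 0.0) + share
    return credit


linear_credit = linear(["cpc", "organic", "email", "direct"], 100.0)
repeat_credit = linear(["cpc", "email", "cpc", "direct"], 100.0)
print(linear_credit)
print(repeat_credit)
```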
5. Time Decay (and other models that take time into account)
With the Time Decay model, the value of a transaction is distributed between channels incrementally. That is, the first source in the chain receives the least value, and the source that was the last and closest in time to the conversion receives the most value.
Pros: All channels in the chain get their piece of the cake. The most credit is given to the channel that pushed the user to purchase.
Cons: The contribution of sources that led the user to the funnel is greatly underestimated.
Suitable for those who want to evaluate the effectiveness of promotional campaigns that are limited in time.
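The description above doesn’t fix a particular decay formula, so the Python sketch below makes one up for illustration: exponential decay with an assumed 7-day half-life, meaning a session that happened 7 days before the conversion gets half the weight of a session on the conversion day. The touchpoints, the order value, and the `half_life_days` parameter are all assumptions.

```python
def time_decay(touches, value, half_life_days=7.0):
    """Distribute value with weights that halve every `half_life_days`.

    touches is a list of (channel, days_before_conversion) pairs.
    """
    if not touches:
        return {}
    weights = [2 ** (-days / half_life_days) for _, days in touches]
    total = sum(weights)
    credit = {}
    for (channel, _), weight in zip(touches, weights):
        credit[channel] = credit.get(channel, 0.0) + value * weight / total
    return credit


# Hypothetical chain: CPC two weeks before the order, email one week before,
# Direct on the day of the order
touches = [("cpc", 14), ("email", 7), ("direct", 0)]
decay_credit = time_decay(touches, 100.0)
print(decay_credit)
```

With these assumptions, the weights are 0.25, 0.5, and 1, so Direct receives four times as much value as CPC. A shorter half-life would shift even more value to the last sessions.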
6. Position-Based or U-Shaped
With these models, most credit is given to two sources (40% each): the one that introduced the user to the brand and the one that closed the deal. The remaining 20% is divided equally among all channels in the middle of the funnel.
Pros: Gives the greatest value to the channels that in most cases play the most important role: those that attract the customer and motivate the conversion.
Cons: Sometimes the sessions in the middle of the chain encourage the user through the funnel much more than it seems at first glance. For example, they help the customer add a product to the cart, subscribe to the newsletter, or click Follow the price. With the Position-Based model, such sessions and their sources are underestimated.
Suitable for businesses for whom it’s equally important to attract a new audience and convert existing visitors into buyers.
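Here is a minimal Python sketch of the 40/20/40 split. The chain and order value are hypothetical, and the handling of chains with only one or two sessions (100% and 50/50) is our own assumption, since the description above only covers chains that have a middle.

```python
def position_based(path, value, first=0.4, last=0.4):
    """U-shaped split: fixed shares for the first and last sessions,
    the remainder divided equally among the sessions in between."""
    if not path:
        return {}
    n = len(path)
    if n == 1:
        weights = [1.0]
    elif n == 2:
        weights = [0.5, 0.5]  # assumption: no middle sessions, split evenly
    else:
        middle = (1.0 - first - last) / (n - 2)
        weights = [first] + [middle] * (n - 2) + [last]
    credit = {}
    for channel, weight in zip(path, weights):
        credit[channel] = credit.get(channel, 0.0) + value * weight
    return credit


u_credit = position_based(["cpc", "organic", "email", "direct"], 100.0)
print(u_credit)
```

In this chain, CPC and Direct each get about $40, while the two middle sessions share the remaining $20.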
What’s the problem with standard attribution models?
According to a 2017 AdRoll survey, 44% of marketers in the US and Europe use Last Click attribution. In our experience, this percentage is even greater in the CIS markets. Algorithmic attribution models based on data are used by only 18% of marketers.
At the same time, 72.4% of those who still use Last Click say that they don’t know why they’re using it – it’s just how things have always been done. When choosing a model, many marketers prefer the one that looks easiest and is most understandable, even though it underestimates all sessions in the chain except for the last click.
In our opinion, there are three main reasons why marketers do this:
- They don’t understand the potential of more complex attribution models. For example, if we told you that by switching to an algorithmic model you could increase revenue by 20%, would you switch? Probably yes.
- There’s no single person who’s responsible for attribution, so different marketers use different models for the same campaign. As a result, the total attributed income may be more than the real income earned by the business.
- Scattered data. Models that are available in Google Analytics are certainly convenient: all the data is in one system, and you can use standard reports. However, these reports won’t tell you anything about offline data, the execution of your orders, or the ROPO effect.
Once you eliminate these obstacles, it will be much easier to solve the problem of attribution.
Google Ads, DoubleClick, and some other services also have their own attribution models. Their common disadvantage is that you can only use the internal data of the service for calculations.
Algorithmic attribution models
Imagine that you’re setting up ads on Google Ads or Facebook. Evaluating a campaign with Last Non-Direct Click will show you whether the campaign is working and how the audience responds to it. For that purpose, inside your advertising admin panel or Google Analytics, it doesn’t make sense to use complex attribution models.
But if you want to evaluate the influence of all your sources on each other, then you already need to combine data from different advertising services, Google Analytics, and your CRM into one system and use more complex attribution models. By doing so, you’ll learn which advertising channels work well together and at what stages.
Algorithmic attribution models include Data-Driven (in Google Analytics 360), Markov chains, OWOX BI Attribution, and custom algorithms.
1. Data-Driven Attribution
Users of the paid version of Google Analytics have access to a data-based attribution model. All attribution models described previously use rules that are set by a web analytics system or by you. In contrast, a Data-Driven model does not have any predefined rules: it calculates the value of channels using your data and the Shapley value.
A distinctive feature of the Data-Driven model is that it doesn’t take into account the position of a channel in the chain, but rather evaluates how the presence of this channel affected the conversion. If you change the order of sessions, the value of the channels in the Shapley model won’t change.
According to Wikipedia, the Shapley value (which belongs to cooperative game theory) is the optimal distribution of winnings between players. It’s a distribution in which the gain of each player is equal to their average contribution to the well-being of the group as a whole.
To understand how a Data-Driven model works, consider a specific example. Suppose we have two chains that lead to transactions:
- Facebook → Direct → $500 transaction
- Direct → $300 transaction
We’ve specifically used short chains in our example so as not to complicate an already complicated formula.
Now, we’ll determine how much each channel brought in separately and how much they brought in together:
V1 (Facebook + Direct) = $500
V2 (Direct) = $300
V3 (Facebook) = $0
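The Shapley value of a channel is calculated using the following formula:
\[ F_i = \sum_{K \ni i} \frac{(k-1)!\,(n-k)!}{n!}\,\bigl(v(K) - v(K \setminus \{i\})\bigr) \]
F_i is the value of channel i; the sum runs over every coalition K that includes channel i
n is the number of players (in our case, advertising channels)
v(K) is the value brought in by the channels in coalition K
k is the number of participants in the coalition K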
If we insert the values from our example into this formula, we get the following:
F1 = (1 - 1)! × (2 - 1)! / 2! × (0 - 0) + (2 - 1)! × (2 - 2)! / 2! × ($500 - $300) = 0 + $100 = $100
F2 = (1 - 1)! × (2 - 1)! / 2! × ($300 - 0) + (2 - 1)! × (2 - 2)! / 2! × ($500 - 0) = $150 + $250 = $400
F1 is the value of the Facebook channel.
F2 is the value of the Direct channel.
We’ll now explain this in simple words for those who got scared by the formula :) Let’s start with Facebook.
This channel hasn’t brought us anything on its own, so the first element we’ll have is 0.
Facebook, in combination with Direct, brought in $500, and Direct alone brought in $300. We subtract the money earned by Direct from the amount that the combination of channels has brought in, then divide the result by two: ($500 - $300) / 2 = $100. This is our second element.
Now add $0 + $100 = $100. This is the value of the Facebook channel.
Next, consider the value of the Direct channel. Independently, it brought in $300. Divide that by two and you get $150. The Facebook + Direct combination brought in $500, while Facebook alone brought in nothing, so we subtract $0 from $500 and divide the result by two, which gives us $250. We add these numbers and get $400 as the value of the Direct channel.
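If you’d rather check the arithmetic in code, the following Python sketch computes Shapley values for any small set of channels and reproduces the $100 and $400 from our example. It’s an illustration of the general Shapley calculation, not the algorithm Google Analytics 360 runs internally; the function name and the dictionary layout are our own choices.

```python
from itertools import combinations
from math import factorial


def shapley_values(channels, v):
    """v maps a frozenset of channels to the value that coalition brought in.

    Coalitions missing from v are treated as having brought in nothing.
    """
    n = len(channels)
    result = {}
    for channel in channels:
        others = [c for c in channels if c != channel]
        total = 0.0
        for size in range(len(others) + 1):
            for subset in combinations(others, size):
                without = frozenset(subset)
                with_channel = without | {channel}
                k = len(with_channel)
                weight = factorial(k - 1) * factorial(n - k) / factorial(n)
                # Marginal contribution of the channel to this coalition
                total += weight * (v.get(with_channel, 0.0) - v.get(without, 0.0))
        result[channel] = total
    return result


# Coalition values from the example above
v = {
    frozenset({"facebook"}): 0.0,
    frozenset({"direct"}): 300.0,
    frozenset({"facebook", "direct"}): 500.0,
}
values = shapley_values(["facebook", "direct"], v)
print(values)
```

The number of coalitions doubles with every added channel, so a brute-force loop like this is only practical for a handful of channels.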
For more information on the Shapley value, watch the video on Coursera.
Pros: The most objective and reliable model, as it evaluates channels using your own data.
Cons: You need to have a sufficient amount of data in Google Analytics, so this model isn’t suitable for all companies. It doesn’t evaluate progress through the funnel, and you can’t connect offline data from the CRM or see detailed information for each transaction.
Suitable for anyone who wants to know which campaigns and keywords work as efficiently as possible and use this information to distribute the marketing budget. Not suitable for businesses that need to know not only the value but also the position of the channel in the chain.
2. Markov chains
A Markov chain is a sequence of random events with a finite number of outcomes, characterized by the fact that with a fixed present, the future is independent of the past.
Initially, Markov chains were used by weathermen, bookmakers, and others to solve problems with forecasting. People began to use them to evaluate advertising campaigns relatively recently, with the development of the digital market. Attribution based on Markov chains helps answer the question of how the lack of a channel will affect conversions.
To understand how Markov chains work, consider a specific example from e-commerce. Suppose we have three chains:
C1 → C2 → C3 → Purchase
C1 → Unsuccessful conversion
C2 → C3 → Unsuccessful conversion
C1, C2, and C3 are sessions with three different sources – for example, Google CPC, Facebook, and email.
Fill in the following table:
|Customer path|Inside view of the model|Breakdown by pairs|
|C1 → C2 → C3 → Conversion|Start → C1 → C2 → C3 → Conversion|Start → C1, C1 → C2, C2 → C3, C3 → Conversion|
|C1|Start → C1 → (null)|Start → C1, C1 → (null)|
|C2 → C3|Start → C2 → C3 → (null)|Start → C2, C2 → C3, C3 → (null)|
In the first column we have the customer path – three chains, in our example. The second column shows how the path will look inside the model. We added the entrance to the funnel (the Start stage) and the exit from the funnel (Conversion or Null, meaning failed conversion). In the third column, we divided the channels into pairs, since we need to evaluate all possible transitions from one step to the next.
Then we need to calculate the probabilities for each of the possible transition options and put them in a separate table. These figures are calculated empirically; that is, real data about user actions is analyzed – for example, from Google Analytics. This is done using a programming language such as R or Python.
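|Transition|Probability|
|Start → C1|2/3|
|Start → C2|1/3|
|All from Start|3/3|
|C1 → C2|1/2|
|C1 → (null)|1/2|
|All from C1|2/2|
|C2 → C3|2/2|
|All from C2|2/2|
|C3 → Conversion|1/2|
|C3 → (null)|1/2|
|All from C3|2/2|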
The numbers in the table above are just examples. To make it clear what these numbers mean, let’s show them on a chart:
In this chart, we see all possible transition options from our example. Everything starts with the start phase. From there, a third of users go to channel C2 and two-thirds go to C1. Further, half of users from channel C1 leave the funnel, and the other half goes to C2, then to C3. Finally, 50% of the remaining users make a purchase. There are a few more options, but you understand the principle.
Note that in our example, there are essentially only two paths to conversion, and both of them go through channel C2.
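The pair breakdown and the transition probabilities can be computed with a short script. Below is a minimal Python sketch that works on the three chains from our example; the state names `start`, `conversion`, and `null` and the function name are our own labels, and in practice the chains would come from your analytics data rather than a hard-coded list.

```python
from collections import Counter, defaultdict


def transition_probabilities(paths):
    """paths is a list of (channels, converted) pairs.

    Returns {state: {next_state: probability}} estimated from the counts.
    """
    counts = defaultdict(Counter)
    for channels, converted in paths:
        end = "conversion" if converted else "null"
        states = ["start", *channels, end]
        # Break the chain into pairs: start -> C1, C1 -> C2, and so on
        for src, dst in zip(states, states[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: count / sum(dsts.values()) for dst, count in dsts.items()}
        for src, dsts in counts.items()
    }


# The three chains from the example above
paths = [
    (["C1", "C2", "C3"], True),
    (["C1"], False),
    (["C2", "C3"], False),
]
probs = transition_probabilities(paths)
for state, transitions in probs.items():
    print(state, transitions)
```

For the three chains above, this gives two-thirds of users going from the start to C1, one-third going to C2, and half of the users who reach C3 converting.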
How are such sessions evaluated? With the removal effect. That is, we delete each of the sources in turn and see how its absence would affect the number of conversions:
For example, if we remove the C1 source from our example, we lose 50% of the conversions. How did we arrive at this figure?
Calculating the value of channels is carried out in three stages:
1. First, we need to calculate the probability of conversion for each of the channels. More precisely, we need to figure out how many conversions we get if we remove a specific channel from the chain.
The conversion probability (P) for each channel is calculated using the following formula:
P1 = (0.33 × 1 × 0.5) = 0.167
P2 = (0.33 × 0 × 0.5) = 0
P3 = (0.33 × 1 × 0) = 0
Let’s take a closer look at the first formula. This is the conversion probability with channel C1 removed. We remove C1 from the model and multiply the transition probabilities along the remaining path to a purchase, Start → C2 → C3 → Conversion. That is, we multiply 0.33 by 1 by 0.5. As a result, we get 0.167, or 16.7%. This is the conversion rate we would get if we removed the C1 source from the funnel.
If we remove channels C2 and C3, then we’ll have no conversions at all.
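2. Next, we determine the removal effect (R) for each channel. This shows the share of conversions we’ll lose if we remove the channel from the funnel. To calculate it, divide the conversion probability without the channel (P) by the conversion probability of the complete model, then subtract the result from 1 (i.e. 100%). In our example, the complete model converts with a probability of 0.33: 0.67 × 0.5 × 1 × 0.5 = 0.167 for the path through C1, plus 0.33 × 1 × 0.5 = 0.167 for the path that starts at C2.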
R1 = 1 - 0.167 / 0.33 = 0.5
R2 = 1 - 0 = 1
R3 = 1 - 0 = 1
3. Finally, we calculate the value (V) of each channel. To do that, take the percentage of lost conversions (R) and divide it by the sum of all coefficients (R1, R2, and R3).
V1 = 0.5 / (0.5 + 1 + 1) = 0.2
V2 = 1 / (0.5 + 1 + 1) = 0.4
V3 = 1 / (0.5 + 1 + 1) = 0.4
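The three stages can be sketched in Python as follows. The transition probabilities are the ones from our example, typed in by hand; the function names are ours, and the sketch assumes that users who would have gone to a removed channel are simply lost. It’s meant to show the mechanics, not to serve as a production implementation.

```python
def conversion_probability(probs, removed=None, iterations=200):
    """Probability of reaching "conversion" from "start".

    Transitions into the removed channel are treated as lost users.
    """
    states = set(probs) | {dst for dsts in probs.values() for dst in dsts}
    reach = {state: 0.0 for state in states}
    reach["conversion"] = 1.0
    # Repeated sweeps converge to the true probabilities, even with loops
    for _ in range(iterations):
        for src, dsts in probs.items():
            if src == removed:
                continue
            reach[src] = sum(
                p * reach[dst] for dst, p in dsts.items() if dst != removed
            )
    return reach["start"]


def removal_effect_attribution(probs, channels):
    """Removal effect R per channel, normalized into value shares V."""
    baseline = conversion_probability(probs)
    effects = {
        ch: 1.0 - conversion_probability(probs, removed=ch) / baseline
        for ch in channels
    }
    total = sum(effects.values())
    return {ch: effect / total for ch, effect in effects.items()}


# Transition probabilities from the example above
probs = {
    "start": {"C1": 2 / 3, "C2": 1 / 3},
    "C1": {"C2": 0.5, "null": 0.5},
    "C2": {"C3": 1.0},
    "C3": {"conversion": 0.5, "null": 0.5},
}
shares = removal_effect_attribution(probs, ["C1", "C2", "C3"])
print(shares)
```

The resulting shares are approximately 0.2 for C1 and 0.4 each for C2 and C3, matching V1, V2, and V3 above. To get monetary values, multiply each share by the total conversion value.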
Pros: An attribution model based on Markov chains allows you to evaluate the mutual influence of channels on conversions and find out which channel is the most significant.
Cons: Underestimates the first channel in the chain; requires programming skills.
Suitable for businesses that have all their data collected in a single system.
3. OWOX BI Attribution
OWOX BI Attribution helps you assess the mutual influence of channels on moving a customer through the funnel and achieving a conversion.
Previously, OWOX BI calculated the value of channels using a proprietary algorithm. But recently, we began to use the Markov chain in our calculations. We’ll soon publish a detailed article on this topic, where we’ll describe all the changes and advantages of the new algorithm. In the meantime, you can see how our updated attribution works by signing up for a free trial.
In the OWOX BI Attribution model, you can use the following information:
- Data on user behavior from Google Analytics collected in Google BigQuery using standard exports from GA 360 or using the OWOX BI Pipeline
- Data from any advertising services that you use
- Data from your internal CRM system
All this data can be analyzed together and used to configure funnel steps (the default funnel is the Enhanced Ecommerce one). You can add any steps – for example, transactions from your CRM and any other online and offline events (calls, meetings, etc.).
In addition, we use data from the CRM to build end-to-end analytics. That is, we use one identifier to associate actions with a user across mobile, desktop, and any other devices. As a result, you’ll get a complete picture of a user’s interaction with your business and can take into account the effect of all advertising channels on conversions.
Pros: Allows you to find an effective channel and say exactly where it’s effective. There are no restrictions on the minimum amount of data needed. OWOX BI identifies new and returning users and shows detailed information for each transaction, including which session, source/medium, and user actions in the funnel led to it. It can also take into account margin and fulfilled orders from your CRM.
Additionally, you can compare the effectiveness of advertising with your current attribution model and the OWOX BI Attribution model to identify undervalued or overvalued campaigns.
Cons: Underestimates the first step of the funnel.
Suitable for anyone who wants to take into account every step of the user in the funnel and honestly evaluate advertising channels.
Brief conclusions
According to the same 2017 AdRoll survey we mentioned earlier in this article, 70% of marketers find it difficult to apply attribution results. But not applying the results of attribution doesn’t make sense!
It doesn’t matter what reports you have in Google Sheets or Google Data Studio. If you don’t distribute your budget based on this data and don’t change anything in your strategy, then there’s little sense in having these reports.
If you want to know the real effectiveness of your marketing:
- Set a clear task for yourself, determine the KPIs for evaluating it, and decide what reports and dashboards should look like.
- Determine who’s responsible for evaluating advertising campaigns in your company.
- Determine the path of your users.
- Take into account not only online but also offline data.
- Verify the quality of incoming data: check whether there are any duplicate transactions on the site and whether every step that you want to take into account in the funnel is monitored.
- Try to compare different attribution models and find a flexible solution that’s optimal for your business.
- Make decisions based on data.
If you didn’t find an attribution model suitable for your project in this article, you can create a custom model as Answear did. Write to us at firstname.lastname@example.org and we’ll be happy to help you set up attribution. If you have any questions, ask in the comments.