ML Funnel Based Attribution Model

The attribution model used in your project is at the heart of evaluating your advertising channels and managing the marketing budget. Therefore the quality and reliability of the attribution model have a direct effect on the implementation of the sales plan and overall business growth.

If your attribution model gives you incorrect numbers, you have probably encountered unexpectedly steep spikes in CPA or dips in ROAS while managing your advertising campaigns.

In one of our previous articles, we reviewed in depth the shortcomings of the most popular attribution models. In this article, we’ll review the logic and benefits of the attribution model based on the data about a customer’s journey through the conversion funnel. This will help you get an objective evaluation of advertising campaigns based on their mutual influence.

This article also describes how to implement an ML funnel based attribution model using Google Cloud Platform services, Google Analytics, and OWOX BI.

With OWOX BI, you’ll get an objective assessment of your advertising channels as our experts find a perfect match of attribution modeling and your business needs. Book a demo meeting with OWOX BI specialists and find out what can be done for your company.

Attribution challenge

Even in Google Analytics reports, one can easily see that only a small minority of customers place orders during their first session on the website.

Path Length Report in Google Analytics

This means that, to distribute the value of each order (the income or revenue from it) correctly, you need to evaluate every one of the buyer’s sessions, not only the final one. Once each session is evaluated, you can find the true value of your advertising campaigns by grouping the sessions that came from the same campaign.

As a result, if a user interacted with several advertising campaigns before placing an order, each of the campaigns will get its own portion of the order value.

The key question is: how should the value of a particular order be distributed among the sessions that contributed to it?

Attribution solution

To solve the attribution challenge, remember that an order is not a goal in and of itself: every business has a conversion funnel. For example, online stores usually identify the following steps: useful visit, product page view, adding to the cart, and placing an order. A useful visit is a visit during which the user viewed 2 or more pages or performed an action. Such visits do not increase the bounce rate.

Steps of a conversion funnel in an online store

The efficiency of advertising campaigns differs at different stages of the funnel. For example, display advertising is effective in attracting new visitors at the upper stages of the funnel, retargeting is effective in bringing visitors back, and emails motivate them to make a purchase.

Therefore, it’s necessary to evaluate each step in the conversion process, not only the final one (i.e., a purchase). This makes it possible to calculate the value of even those sessions during which no order was made, but which helped the user move on to the next step.

The calculation procedure is the following:

  1. Calculating the value of the progression through each step of the funnel.
  2. Evaluating the sessions that helped move through the step, considering the value of the step.
  3. Grouping the sessions by advertising campaigns and obtaining the value of the campaigns.

Now, to determine the value of progression through the funnel steps, remember one more important detail: not all steps are equally easy to move through. For example, the image below shows that the probability of a Product Page View is higher than that of Adding to Cart. The probability (highlighted in red) is the proportion of users who progress to the next step from the previous one. The more difficult a step is to move through, the lower its probability and, consequently, the higher the value of the session that helped the user pass it.

The probabilities of progression through the purchase funnel

Here and below, for more clarity, the calculations are given for one of the thousands of possible funnels.

  • Of course, on any website there are hundreds of user segments, and the probabilities of progression within the funnel are different for each of these segments. For example, the probabilities differ depending on the users’ location and type (new or returning).
  • It’s important to use only user properties when setting up funnels. It would be a mistake to use session properties (e.g., device type) or advertising campaign properties (e.g., keyword), because one user may use multiple devices on the way to an order, or reach the website through different keywords. What matters here is comparing how efficiently the advertising budget is spent across different advertising campaigns.

The value of steps, based on the probability of progression through each step, can be calculated by a few different methods. After hundreds of experiments, we chose the method that proved highly resistant to noisy data and gave excellent results even for projects with a small number of visitors. Noisy data is corrupted or incomplete data in web analytics systems. If it is not properly processed, it may lead to less accurate conclusions.

In this method, each step of the funnel gets a score equal to 1 minus the probability of progressing through that step. The lower the probability, the higher the score the step gets. The value of a step is its score as a proportion of the total score of all the steps:

Calculations to determine the value of steps within the funnel
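The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `step_values` and the probabilities are assumptions, chosen so that the hardest step receives 36% of the value, and they are not the figures from the illustration.

```python
def step_values(probabilities):
    """Return each funnel step's share of the order value.

    probabilities maps a step name to the probability (0..1) of progressing
    through that step. Score = 1 - probability; value = score / total score.
    """
    scores = {step: 1.0 - p for step, p in probabilities.items()}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("every step has probability 1, so no step has a score")
    return {step: score / total for step, score in scores.items()}


# Hypothetical probabilities for one funnel of one user segment.
funnel = {
    "useful_visit": 0.60,
    "product_page_view": 0.50,
    "add_to_cart": 0.10,
    "order": 0.30,
}

for step, value in step_values(funnel).items():
    print(f"{step}: {value:.0%}")
```

With these assumed inputs the scores are 0.4, 0.5, 0.9 and 0.7, so the shares come to 16%, 20%, 36% and 28%, which add up to 100%.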

Please note:

  • All the calculations are based on real customer behavior and will differ for each particular website and customer segment. This eliminates the human error that can occur when an attribution model is based on gut feeling.
  • The lower the probability of moving through a certain step, the higher the value assigned to that step. In our case, the Adding to Cart step, which is the most difficult for users to move through, got 36% of the total order value.
  • The total attributed value is always 100%. In contrast to assigning value with the help of assisted conversions or individual attribution models, we do not distribute more value than we have obtained.
  • It’s easy to see that the probability of a Useful Visit equals 1 minus the bounce rate. First-time visitors typically bounce more often, so for segments of first-time buyers the value of the first visit is higher than for returning buyers. This is fully consistent with the interests of the business.

In the image below, the numbers highlighted in green show the portion of value assigned to progression through each of the steps in the funnel.

The result of distributing value over the steps in the funnel

Now that we know the value of each step of the funnel, we need to evaluate each session. This is simple: the value of a session is the sum of the values of the steps that were passed through for the first time during that session.

Value is assigned only to the sessions that helped a user move through one of the steps in the funnel, and only when the whole journey resulted in a purchase. Knowing the source of each session, we only need to group the sessions by advertising campaign. As a result, we obtain the value of advertising campaigns, considering their impact on users’ progression through each step in the funnel, not only the final one.
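As a rough sketch of the session evaluation and grouping described above, the following Python splits the value of one completed order across campaigns. The function name `attribute_order`, the step shares, the campaign names, and the journey are all hypothetical.

```python
from collections import defaultdict


def attribute_order(order_value, step_values, sessions):
    """Split one order's value across campaigns.

    sessions is the buyer's chronological list of (campaign, steps) pairs,
    where steps are the funnel steps reached during that session. Each step
    is credited once, to the first session in which it was passed.
    """
    credited = set()
    by_campaign = defaultdict(float)
    for campaign, steps in sessions:
        for step in steps:
            if step in step_values and step not in credited:
                credited.add(step)
                by_campaign[campaign] += order_value * step_values[step]
    return dict(by_campaign)


# Hypothetical step shares (they sum to 1) and a hypothetical journey
# that ended in a purchase.
STEP_VALUES = {
    "useful_visit": 0.16,
    "product_page_view": 0.20,
    "add_to_cart": 0.36,
    "order": 0.28,
}
journey = [
    ("campaign_a", ["useful_visit", "product_page_view"]),
    ("campaign_b", ["useful_visit", "product_page_view", "add_to_cart"]),
    ("campaign_a", ["useful_visit", "add_to_cart", "order"]),
]

for campaign, value in attribute_order(100.0, STEP_VALUES, journey).items():
    print(f"{campaign}: {value:.2f}")
```

In this assumed journey, campaign_a is credited for the useful visit, the product page view, and the order (64 of 100), while campaign_b is credited for the first add to cart (36 of 100). Repeating a step in a later session earns nothing.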

Attribution results

The value (profit or revenue from orders) assigned to advertising channels by ML Funnel Based attribution will differ from the results of the Last Non-Direct Click attribution model.

The reason is simple: with Last Non-Direct Click attribution, the whole order value is attributed to a single session. For example, if a user first visited the website through the Display channel, then returned and found the product thanks to retargeting, and finally completed the purchase by clicking a link in an email, the Email channel gets 100% of the order value. ML Funnel Based attribution considers the impact on the whole conversion process, so the value is distributed across the channels that moved the user toward the purchase at each stage of the funnel.
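The Display, retargeting, and Email example can be put side by side in a short Python sketch. The per-session shares below are assumed for illustration only, and both function names are hypothetical.

```python
def last_non_direct_click(order_value, sessions):
    """Give the whole order value to the last session whose channel is not direct."""
    for channel, _share in reversed(sessions):
        if channel != "direct":
            return {channel: order_value}
    # Every session was direct, so direct keeps the value.
    return {"direct": order_value}


def funnel_based(order_value, sessions):
    """Give each channel the value of the funnel steps first passed in its sessions."""
    result = {}
    for channel, share in sessions:
        result[channel] = result.get(channel, 0.0) + order_value * share
    return result


# (channel, assumed share of the order value earned by the steps that were
# first passed during that session), in chronological order.
sessions = [("display", 0.36), ("retargeting", 0.36), ("email", 0.28)]

print(last_non_direct_click(100.0, sessions))
print(funnel_based(100.0, sessions))
```

Under Last Non-Direct Click, Email receives the whole 100. Under the funnel based split with these assumed shares, Display and retargeting receive 36 each and Email receives 28.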

The difference in the attributed value will be even greater if you build reports by advertising campaign rather than by channel group.

The discrepancy is greater because campaigns within one channel may affect the funnel in different ways and offset one another in channel-level totals. In campaign-level reports, the difference becomes obvious immediately.

ML Funnel Based Attribution implementation

Let’s talk about collecting and combining data. This material will be useful for analysts and technical experts.

To implement the attribution model, you need to configure the following data pipelines in Google BigQuery:

  1. Transactions.
  2. User actions from the very first interaction with the advertising up to the transaction.
  3. Online advertising costs.



The CRM definitely has the highest-quality transactional data. Unlike Google Analytics, the CRM has information about:

  • orders that were not placed through the site. This is important for many types of businesses:
    • multichannel retailers that receive more than half of their income from brick-and-mortar stores or a call center;
    • banks, insurance companies, and B2B companies that only accept applications online, where the real value of an application becomes clear only after some time;
    • subscription services that charge users regularly;
  • margin rather than revenue from the transactions. If you sell physical goods, the margin probably differs by category. It might be difficult to transfer margins together with order information, because this data is sensitive and becomes known only after the product has been purchased;
  • the order status. A certain share of the orders shown in Google Analytics may be canceled or fraudulent;
  • all online orders. Up to 20% of orders may never reach Google Analytics reports because of page load speed or the way JavaScript tracking works.

You can use libraries and applications to automate data transfer from your CRM to Google BigQuery:

  1. SDKs for .NET, Java, PHP, and Python
  2. Uploading CSV and JSON files from the command line
  3. The ODBC driver from CData
  4. ETL applications

The big advantage of Google BigQuery is that your IT department can choose whichever method is most convenient for them and import data in any format. You don’t have to worry about exact field names or custom dimension indexes. This saves time on integration, as well as the effort of persuading the IT department to build it.

If the data you need is stored in different services, such as order statuses in one and the margins of sold goods in another, you can load it through independent pipelines. The data structure and field names can be arbitrary; the only thing you need is keys to merge the data on.
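To illustrate merging on a key, here is a minimal Python sketch that combines rows from three independent sources into one record per order. The field names (`order_id`, `revenue`, `status`, `margin`) and the helper `merge_by_key` are hypothetical. In Google BigQuery itself the same thing would normally be done with a JOIN.

```python
def merge_by_key(key, *sources):
    """Merge rows from several sources into one record per key value.

    Each source is a list of dicts. Field names are arbitrary as long as
    every row carries the shared key.
    """
    merged = {}
    for rows in sources:
        for row in rows:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


# Hypothetical rows from three independent pipelines.
orders = [
    {"order_id": "A-1", "revenue": 120.0},
    {"order_id": "A-2", "revenue": 80.0},
]
statuses = [
    {"order_id": "A-1", "status": "completed"},
    {"order_id": "A-2", "status": "canceled"},
]
margins = [{"order_id": "A-1", "margin": 30.0}]

for record in merge_by_key("order_id", orders, statuses, margins):
    print(record)
```

The sketch keeps every order even when one source has no row for it, so a missing margin simply leaves that field absent rather than dropping the order.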

When you create a data pipeline in OWOX BI, you immediately get the recommended data structure, which you can use as a starting point.

Even if you cannot automate the import of transaction data from your CRM to Google BigQuery, you have two options for taking advantage of ML funnel based attribution.

  1. You can attribute the value of the transactions collected by Google Analytics, without involving your IT department. In this case, you cannot take the order completion rate and the gross margin into account, but you will see how much the ROI of different channels changes when you consider the contribution of each visit before the order.
  2. If you can export the necessary information from your CRM, for example for one month, you can use the free BigQuery Reports Add-on for Google Sheets to send the data to Google BigQuery and use it for calculations. In this case, the data will not be updated automatically for future periods, but you’ll see how much the ROI of different channels changes when you consider the order completion rate and gross margin.

The result of your efforts is a table in Google BigQuery containing information about transactions that your CFO can trust.

User behavior from the first interaction with the ad up to the transaction

User actions are what tie efforts to results, and advertising costs to transactions. The accuracy of the attribution model depends on the completeness and precision of this data. The mere fact of a visit from a particular source is not enough to evaluate the effect of the actions that took place during the session. Therefore, Google BigQuery should collect all the information about user activity, and this data should be unsampled and precise down to each hit.

You can get the information in two ways:

  1. You can activate the standard export to Google BigQuery, if you use Google Analytics 360.
  2. Or, if you are using the standard version of Google Analytics, you can collect unsampled data in Google BigQuery using OWOX BI Google Analytics to Google BigQuery.

These two integration methods have their own pros and cons. The important thing is you will receive raw and unsampled data about all user actions in Google BigQuery.

The result of this stage is one more table in Google BigQuery.

The Google Analytics Core Reporting API v3 is not suitable for this purpose, since a query can return up to 7 dimensions and at least one metric is required. This means the results you get will always be aggregated.

Advertising costs

The most convenient way to import online advertising costs into Google BigQuery is from Google Analytics. First, collect them in Google Analytics:

  1. Connect Google Analytics to your Google AdWords account to import the costs of Google campaigns. You’ve already done it, right?
  2. Set up automatic import of costs from Yahoo!, Bing, and Facebook using OWOX BI Pipeline. It’s completely free, so nothing prevents you from doing it.
  3. At least once a month, upload the advertising costs of the remaining sources via OWOX BI. They will be automatically allocated to each day of the month in proportion to the number of visits from that particular source.
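The proportional allocation mentioned in the last step can be sketched as follows. This is an illustration of the arithmetic only, not the OWOX BI implementation. The function name, dates, and figures are hypothetical, and raising an error when a source has no visits is an assumption of this sketch.

```python
def allocate_monthly_cost(monthly_cost, visits_by_day):
    """Spread a monthly cost over days in proportion to daily visits."""
    total_visits = sum(visits_by_day.values())
    if total_visits <= 0:
        # Assumption of this sketch: refuse to guess when there are no visits.
        raise ValueError("no visits recorded, so there is nothing to allocate by")
    return {
        day: monthly_cost * visits / total_visits
        for day, visits in visits_by_day.items()
    }


# Hypothetical visits from one source on three days of the month.
visits = {"2024-03-01": 50, "2024-03-02": 150, "2024-03-03": 300}

for day, cost in allocate_monthly_cost(1000.0, visits).items():
    print(f"{day}: {cost:.2f}")
```

With these assumed numbers, the three days receive 100, 300, and 600 of the 1000 total, matching their 10%, 30%, and 60% shares of visits.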

The last thing you have to do is activate the data pipeline from Google Analytics to Google BigQuery via OWOX BI. Note that the cost data in Google BigQuery is updated automatically whenever it is updated in Google Analytics.

As a result, all the necessary data for the calculation of ML funnel based attribution is collected in Google BigQuery and is available for processing.

The described attribution method is based on two beliefs:

  1. The goal is to lead buyers through the whole conversion process to a purchase.
  2. The more difficult it is to move through a certain step of the funnel, the more valuable moving through that step is for the business.

It should be mentioned that, although the calculations for one segment or one order can easily be made in Excel, you won’t be able to make them for all of the website’s visitors. The reason is that there may be hundreds of different segments on a website, and there may not be enough data to obtain statistically significant results for each segment, even on high-traffic websites. Therefore, implementing the model requires programming. In our experience, Google BigQuery is best suited for implementing the model, and it allows for collecting unsampled data even if you don’t have Google Analytics 360.

What’s next? Evaluate the additional value of the attribution model.

  • How to determine the steps of the funnel?

    If you run an online store, the steps in the Enhanced Ecommerce module in Google Analytics will quite likely be right for you. If the final conversion on the website is submitting a form, then the first visit, a service page view, opening the application form, and submitting the form can be identified as the steps of the conversion funnel.
  • What if we have many different funnels on our website?

    Steps within any funnel can be compared to the stages of a typical purchase funnel: awareness → interest → consideration → purchase. Thus, a single attribution model can be used to measure several different funnels.
  • What is the difference between the results of calculation and Google Analytics attribution models?

    The model used by default in Google Analytics reports assigns all the value to the last non-direct channel the customer came through before the conversion (Last Non-Direct Click). The Funnel Based attribution model assigns value to advertising campaigns differently. For example, channels that contribute most at the earlier stages of the funnel are usually undervalued by the Last Non-Direct Click attribution model. The Funnel Based attribution model considers their contribution, so they get a higher value. At the same time, channels at the lower stages of the funnel get less value, because under Last Non-Direct Click attribution they took the value that belonged to preceding channels.
  • How can I use the results?

    The Funnel Based Attribution model allows for measuring the efficiency of advertising channels with well-known KPIs such as ROAS, ROI, and CPA. This means you can use the results in your standard reports to compare the efficiency of advertising channels. You can also use the results to calculate correction factors for each of the channels. Such factors, which reflect the mutual impact of the channels, can be applied to the operational reports built according to the Last Non-Direct Click attribution model.
  • What does the efficiency of the attribution model depend on?

    The more channels used by your customers on their journey to a purchase, the higher the efficiency of the Funnel Based attribution model. You can evaluate the additional value of implementing Funnel Based attribution model in the Multi-Channel Funnels report in Google Analytics.
  • What are the advantages of the ML Funnel Based Attribution model?

    1. It evaluates each step in the conversion process, not only the last one.
    2. It attributes conversion value on the basis of data on the behavior of real users, not just subjective assumptions.
    3. It is fully transparent and you can easily check the calculations for each particular order.
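As a loose illustration of the answer to "How can I use the results?", the sketch below computes ROAS and CPA from attributed values and derives a per-channel correction factor. Defining the factor as the ratio of funnel based value to Last Non-Direct Click value is an assumption of this sketch, since the exact definition is not given here. All names and figures are hypothetical.

```python
def channel_kpis(attributed_value, cost, conversions):
    """Return ROAS and CPA for one channel (None where a KPI is undefined)."""
    roas = attributed_value / cost if cost else None
    cpa = cost / conversions if conversions else None
    return {"roas": roas, "cpa": cpa}


def correction_factors(funnel_based, last_click):
    """Assumed definition: funnel based value divided by last-click value."""
    return {
        channel: funnel_based[channel] / last_click[channel]
        for channel in funnel_based
        if last_click.get(channel)
    }


# Hypothetical attributed values per channel under the two models.
funnel_based = {"display": 3600.0, "email": 2800.0}
last_click = {"display": 1800.0, "email": 5600.0}

print(channel_kpis(attributed_value=3600.0, cost=1200.0, conversions=30))
print(correction_factors(funnel_based, last_click))
```

Under this assumed definition, a factor above 1 marks a channel that Last Non-Direct Click undervalues (display, 2.0), and a factor below 1 marks one that it overvalues (email, 0.5).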