ML Funnel Based Attribution Implementation

This article describes how to implement a ML funnel based attribution model, using Google Cloud Platform services, Google Analytics and OWOX BI.

Let’s talk about collecting and combining data. This material will be useful for analysts and technical experts. You can read more about the calculation algorithm and the benefits of ML funnel based attribution model in a dedicated article.

In order to to implement the attribution model, you need to configure the following data pipelines in Google BigQuery:

  1. Transactions.
  2. User actions from the very first interaction with the advertising up to the transaction
  3. Online advertising costs.


CRM has definitely the highest quality of transactional data. Unlike Google Analytics, CRM has information about:

  • orders that were placed not through the site. This is important for many types of businesses;
    • multichannel retailers, who receive more than half of their income from brick and mortar stores or from a call-center;
    • banks, insurance companies and b2b-companies that only accept applications online and their real value clears up after some time;
    • subscription services, that charge users regularly;
  • margin rather than revenue from the transactions. If you sell physical goods, the margin probably differs depending on the category. It might be difficult to transfer margins together with order information, because this data is sensitive and becomes known after the the product was purchased;
  • the order status. A certain share of orders, that is shown in Google Analytics, can be canceled or fraud;
  • all online orders. It is known that up to 20% of the orders don’t reach Google Analytics reports because of the page load speed or nature of JavaScript.

You can use libraries and applications to automate data transfer from CRM to Google BigQuery.

The big advantage of Google BigQuery is that your IT department can easily choose the most convenient way for them and import data in any format. It frees you from worrying about the correct field names or custom dimension indexes. It will save you time on integration and your energy to persuade the IT department to make this integration.

If the data you need is stored in different services, you can download it through independent pipelines. For example, the order status and margins of sold goods. The data structure and field names can also be arbitrary, the only thing you need is the keys to merge them.

Creating a data pipeline in OWOX BI, you can immediately get the recommended data structure and use it as basic.

Even if you can not automate the import of transaction data from CRM to Google BigQuery you have two options to take advantage of ML funnel based attribution.

  1. You can attribute the value of the transaction, collected by Google Analytics, without involving your IT department.
    In this case, you can not consider the order completion rate and the gross margin, but you will see how much the ROI of different channels will change when you take into account the contribution of each of the visits before ordering.
  2. If you can export the necessary information from your CRM, for example, for one month, then using free BigQuery Reports Add-on for Google Sheets you will be able to send the data into Google BigQuery and use it for calculations.
    In this case, the data will not be updated automatically for future periods, but you’ll see how much the ROI of different channels will change when you consider the order completion rate and gross margin.

The result of your efforts is the table in Google BigQuery containing the information about transactions that your CFO can trust.

таблица в Google BigQuery

User behavior from the first interaction with the ad up to the transaction

Действия пользователей от взаимодействия с рекламой до транзакций

User actions — this is what brings together the efforts and the results, advertising costs and transactions. The accuracy of the attribution model depends on the completeness and precision of this data. The single fact of the visit from a particular source is not enough to evaluate the efficiency of the actions that took place in the session. Therefore, Google BigQuery should be collecting all the information about user activity. This data should be unsampled and precise up to each hit.

You can get the information in two ways:

  1. You can activate the standard export to Google BigQuery, if you use Google Analytics 360.
  2. Or, if you are using the standard version of Google Analytics, you can collect unsampled data in Google BigQuery using OWOX BI Google Analytics to Google BigQuery.

These two integration methods have their own pros and cons. The important thing is you will receive raw and unsampled data about all user actions in Google BigQuery:

The result of this stage is one more table in Google BigQuery.

таблица в Google BigQuery

Google Analytics Core V3 is not suitable for this purpose, since the query can return up to 7 dimension, and the metric is a required parameter. It means that the results you get will always be aggregated.

Advertising costs

Расходы на онлайн-рекламу

The most convenient way to import online advertising costs in Google BigQuery is from Google Analytics. Firstly, collect them in Google Analytics:

  1. Connect Google Analytics with Google AdWords account to import costs of Google campaigns. You’ve already done it, right?
  2. Setup the automatic import of Yahoo!, Bing and Facebook using OWOX BI Pipeline. It’s completely free. Nothing prevents you from doing it.
  3. At least once a month upload advertising costs of the remaining sources via OWOX BI. They will be automatically allocated for each day of the month proportionally to the number of visits from that particular source.

The last thing you have to do is to activate data pipelines from Google Analytics in Google BigQuery via OWOX BI. Note that the cost data will be updated automatically if it is updated in Google Analytics.


As a result, all the necessary data for the calculation of ML funnel based attribution is collected in Google BigQuery and is available for processing.

Used tools