New OWOX BI algorithm for collecting session data
Change is constant; this is our reality, and sometimes it’s hard to accept. But the changes we want to share with you will impress each and every OWOX BI user.
You may already know that OWOX BI has recently changed the logic of session data collection; the new features have been available for all users since March 1. In this article, we’ll tell you more about our new features and how they can benefit your business.
Straight to the point
Previously, OWOX BI built session tables in Google BigQuery using session data from Google Analytics. On the one hand, this approach ensured a complete data fit across sources. On the other hand, part of the data could be lost on the way from the website to the analytics system and cloud storage because of limitations of the Google Analytics Core Reporting API.
To avoid this problem, we’ve changed the algorithm for collecting session data. Now session data is formed using hit data from OWOX BI, so Google Analytics limits can’t influence the process of building session tables. Thanks to this new algorithm, you can gather complete unsampled user behavior data on your website and attribute that data to the correct sessions. Let’s see all the benefits in detail.
1. Avoid sampling
Before: If users of your website generate more than 200,000 sessions a day, Google Analytics (the free version) will typically apply sampling. Thus, your Google BigQuery project will receive only a randomly picked sample of the sessions.
Now: OWOX BI users get raw, unsampled data for all sessions in Google BigQuery, regardless of traffic volume or business size.
2. Get complete data with hit-level accuracy
Before: If you reach the hit volume and data collection limits in Google Analytics, the system won’t process data over those limits, and that data won’t get into Google BigQuery. These limits are 200,000 hits per user per day, 10 million hits a month, and 500 hits per session. Also, Google ignores hits larger than 8 KB.
Now: The new OWOX BI algorithm doesn’t have any of these limits, and the maximum size of a hit is doubled to 16 KB. This means you can gather data about all user actions on your website.
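To get a feel for what these limits mean in practice, here’s a minimal Python sketch that counts how many hits in a single session would be skipped under a given set of limits. It is only an illustration: the function name and the list of hit sizes are made up for this example, the size limits are assumed to be 8 × 1024 and 16 × 1024 bytes, and the sketch doesn’t reproduce how either system actually processes hits.

```python
# Limits quoted in this article (sizes assumed to be in bytes, 1 KB = 1024 bytes).
GA_MAX_HITS_PER_SESSION = 500
GA_MAX_HIT_BYTES = 8 * 1024
OWOX_MAX_HIT_BYTES = 16 * 1024


def count_dropped_hits(hit_sizes, max_hits, max_bytes):
    """Count hits in one session that would be skipped under the given limits.

    hit_sizes: sizes of the session's hits in bytes, in the order they were sent.
    max_hits:  per-session hit cap, or None if there is no cap.
    max_bytes: maximum accepted size of a single hit.
    """
    accepted = 0
    dropped = 0
    for size in hit_sizes:
        too_big = size > max_bytes
        over_cap = max_hits is not None and accepted >= max_hits
        if too_big or over_cap:
            dropped += 1
        else:
            accepted += 1
    return dropped


# Hypothetical session: one 10,000-byte hit followed by 600 hits of 1,000 bytes.
session = [10_000] + [1_000] * 600
dropped_ga = count_dropped_hits(session, GA_MAX_HITS_PER_SESSION, GA_MAX_HIT_BYTES)
dropped_owox = count_dropped_hits(session, None, OWOX_MAX_HIT_BYTES)
```

With the Google Analytics limits, the oversized hit and every hit after the first 500 accepted ones are counted as dropped; with no per-session cap and a 16 KB size limit, nothing in this sample session is dropped.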
3. Get data in time
Before: Session data may be updated with more than a 24-hour delay because these updates depend on information being updated in the Google Analytics API.
Now: Hits are grouped into sessions on the OWOX BI side, so session table collection will never be paused because of exceeded limits or lack of access to Google Analytics. Data gets into BigQuery faster, which is really important if you use it for triggered email marketing or real-time report updates.
4. Define the traffic source correctly
Before: In Google Analytics, the traffic source is defined by the Last Non-Direct Click model. If the last channel in the customer action chain was direct, this channel is ignored and replaced with the last non-direct channel.
For example, say your customer googled and found a product on your website. They remembered your website address and after some time entered it directly into the URL field. In this case, the source of this session would be google/organic rather than direct/none. This makes it impossible to know the precise share of the direct channel as a traffic source for your website.
Now: The new OWOX BI algorithm also uses the Last Non-Direct Click model for defining the traffic source. But we’ve added a special trafficSource.isTrueDirect field to the session table to help you identify direct traffic. This Boolean field gets a true value if a session started with a direct visit to the website and a false value if the session follows a paid channel session. Thanks to this field, you can evaluate the real value of paid channels and their effect on conversions. In Google Analytics, the isTrueDirect field works differently: it gets the "true" value when there is a direct source or when two sessions have the same campaign data.
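As a rough illustration of how this field can be used, the Python sketch below calculates the share of true direct sessions in a small set of session rows. The row shape (a nested dictionary mirroring trafficSource.isTrueDirect) and the sample values are assumptions made for the example, not an export of a real OWOX BI table.

```python
def true_direct_share(sessions):
    """Return the fraction of sessions flagged as true direct visits."""
    if not sessions:
        return 0.0
    direct = sum(1 for s in sessions if s["trafficSource"]["isTrueDirect"])
    return direct / len(sessions)


# Hypothetical rows. The second session keeps google/organic as its source
# (Last Non-Direct Click), but the flag shows the visit itself was direct.
sample_sessions = [
    {"trafficSource": {"source": "google", "medium": "organic", "isTrueDirect": False}},
    {"trafficSource": {"source": "google", "medium": "organic", "isTrueDirect": True}},
    {"trafficSource": {"source": "google", "medium": "cpc", "isTrueDirect": False}},
    {"trafficSource": {"source": "newsletter", "medium": "email", "isTrueDirect": False}},
]

share = true_direct_share(sample_sessions)
```

In this sample, one session out of four is a true direct visit, even though none of them has direct/none as its reported source.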
5. Track the customer path across different websites
Before: If you have a couple of websites and you want to track how their audiences overlap, you can set up cross-domain measurement in Google Analytics. But this solution works only if your audience uses cross-links to move from one of your websites to another. What if you need to track one user who visits both websites at different times when the websites aren’t cross-linked? We’ve taken care of it.
Now: We’ve added an OWOX User ID to the session data tables. This anonymous user identifier will help you combine data about user actions from your websites, even if they aren’t linked directly. You’ll also be able to group tracked users into separate advertising audiences and avoid paying several times for the same traffic.
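Here’s a minimal Python sketch of the idea: find the users who appear in the session data of two websites. The field name owoxUserId is hypothetical (check the actual column name in your session tables), and the sample rows are invented for the example.

```python
def audience_overlap(site_a_sessions, site_b_sessions, id_field="owoxUserId"):
    """Return the set of user IDs seen in the sessions of both websites.

    id_field is a hypothetical column name used only for this sketch.
    Sessions with a missing or empty ID are ignored.
    """
    users_a = {s[id_field] for s in site_a_sessions if s.get(id_field)}
    users_b = {s[id_field] for s in site_b_sessions if s.get(id_field)}
    return users_a & users_b


# Hypothetical session rows from two websites that aren't cross-linked.
shop_sessions = [{"owoxUserId": "u1"}, {"owoxUserId": "u2"}, {"owoxUserId": "u1"}]
blog_sessions = [{"owoxUserId": "u2"}, {"owoxUserId": "u3"}, {"owoxUserId": None}]

shared_users = audience_overlap(shop_sessions, blog_sessions)
```

The resulting set contains only the users seen on both websites, which is the group you’d want to avoid targeting twice.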
6. Attribute all website events to correct sessions
Before: If you’re tracking events on your website with the help of the Measurement Protocol, remember that some of those events may be lost. In Google Analytics (including the paid version, GA360), the maximum time allowed between the moment a hit occurs and the moment it’s sent, which is passed in the &qt parameter, is only four hours. If the gap is longer, the event won’t be attributed at all. For example, say your visitor pays online on your website. The transaction will be counted after receiving a bank confirmation, even if that takes a couple of days. In such a situation, the transaction won’t be attributed to the correct session, which means the source that led to that transaction won’t be evaluated correctly.
If the &qt parameter is sent without a value, a separate session for the event will be created automatically. In either case, the precision of your data suffers.
Now: In sessions based on the new OWOX BI algorithm, the maximum duration for the &qt parameter is extended to 30 days. This means that your events sent by the Measurement Protocol will be attributed to the correct session.
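To make the &qt logic more concrete, here’s a minimal Python sketch that calculates the queue time for a delayed transaction event and builds a Measurement Protocol payload string. In the Measurement Protocol, qt is the time in milliseconds between the moment the hit occurred and the moment it’s sent. The tracking ID, client ID, event category, and event action below are placeholders, and the two limits are simply the values quoted in this article.

```python
from datetime import datetime
from urllib.parse import urlencode

GA_MAX_QUEUE_MS = 4 * 60 * 60 * 1000            # four hours, as quoted above
OWOX_MAX_QUEUE_MS = 30 * 24 * 60 * 60 * 1000    # 30 days, as quoted above


def queue_time_ms(event_time, send_time):
    """Milliseconds between when the hit occurred and when it is sent."""
    return int((send_time - event_time).total_seconds() * 1000)


def build_event_payload(tracking_id, client_id, event_time, send_time):
    """Build a URL-encoded Measurement Protocol event payload with qt set."""
    params = {
        "v": "1",                    # protocol version
        "tid": tracking_id,          # placeholder property ID
        "cid": client_id,            # placeholder client ID
        "t": "event",
        "ec": "ecommerce",           # hypothetical event category
        "ea": "payment_confirmed",   # hypothetical event action
        "qt": str(queue_time_ms(event_time, send_time)),
    }
    return urlencode(params)


# The visitor paid on March 1; the bank confirmation arrived two days later.
paid_at = datetime(2020, 3, 1, 12, 0, 0)
confirmed_at = datetime(2020, 3, 3, 12, 0, 0)

qt = queue_time_ms(paid_at, confirmed_at)
payload = build_event_payload("UA-XXXXX-Y", "555", paid_at, confirmed_at)
fits_ga_window = qt <= GA_MAX_QUEUE_MS
fits_owox_window = qt <= OWOX_MAX_QUEUE_MS
```

In this example, the two-day gap is well beyond the four-hour window but comfortably inside the 30-day one.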
Other differences between the old and new OWOX BI algorithms
In the new OWOX BI algorithm, tables have the same structure they had in the old one. Only the logic by which values are defined differs for some fields:
How to set up session data collection based on the OWOX BI algorithm:
- When creating a new Google Analytics ⟶ Google BigQuery streaming pipeline, session data will automatically start to be collected using the new algorithm.
- To change the algorithm for collecting session data in existing streams, go to stream settings and turn on Session Data Collection. After that, the old algorithm will become unavailable.
- With the new OWOX BI algorithm, you won’t need a custom dimension at the session level in Google Analytics. You can delete this dimension and add other necessary parameters.
- You won’t need to update the OWOX BI tracking code on your website.
- To support automatic tagging by Google Ads (gclid), you’ll need to set up collection of raw Google Ads data reports in BigQuery. You can do this in one click using the native integration with Google Data Transfer.
- If you have a list of referral exclusions in Google Analytics, duplicate it in the stream settings for OWOX BI.
Should I change my queries to tables in Google BigQuery?
The structure of the new tables is the same as that of the old tables, so you only need to change the table names in your queries: replace the old session_streaming_ table names with the new owoxbi_sessions_ table names.
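If you keep your SQL in files or scripts, the rename can be automated. Below is a minimal Python sketch that swaps the table prefix in a query string. The project and dataset names in the sample are made up, and it’s worth reviewing the result before running it, since a plain text replacement can’t tell a table name from, say, a comment.

```python
import re

OLD_PREFIX = "session_streaming_"
NEW_PREFIX = "owoxbi_sessions_"


def migrate_query(sql):
    """Replace the old session table prefix with the new one.

    The word boundary keeps longer names that merely end with the old
    prefix (for example, my_session_streaming_) from being rewritten.
    """
    return re.sub(r"\b" + re.escape(OLD_PREFIX), NEW_PREFIX, sql)


# Hypothetical query against a made-up project and dataset.
old_sql = "SELECT * FROM `my_project.my_dataset.session_streaming_20200301`"
new_sql = migrate_query(old_sql)
```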
To sum up
If you want to know whether it’s time for you to switch to the new OWOX BI algorithm, answer the following questions. If you answer yes to at least one, you know what to do:
- Do you have approximately 200,000 sessions per day? Or do you think that you’ll reach this number in a couple of months?
- Have you experienced sampling before?
- Have you ever built non-standard reports and experienced sampling?
- Have you ever reached the limit of 500 hits per session?
- Are you interested in correctly tracking direct traffic but don’t want to buy the whole GA360 pack?
- Do you want to combine website audiences and analyze their overlap?
- Do you want to be able to send transaction hits within 30 days, not 4 hours?
- Is it important for you to get session table data as soon as possible?
Even if these questions don’t sound relevant to you at the moment, it’s better to prepare as early as possible.
If you haven’t tried OWOX BI Pipeline yet, you’re welcome to take it for a test drive with a free 14-day trial.
If you have any questions, ask in the comments below.