Google Analytics Alternatives: How to keep using the Google tech stack and be GDPR-compliant
Changing technology stacks is painful and expensive:
- A steep learning curve and the need to study new technologies slow down processes and require new hires.
- Developers and analysts need to implement new markup on the site. This not only requires significant resources but also delays other urgent tasks.
The good news is that companies don’t have to change their tech stack. They just need to set everything up correctly.
In this article, you will learn what businesses should do to be GDPR-compliant while using the Google tech stack.
With OWOX BI, you can ensure compliance with the GDPR while working with sensitive data. Don't waste time and resources on reprocessing data or learning and adopting a new tech stack.
Table of contents
- Good Old Days of Digital Analytics
- Digital Analytics in 2022
- How to Keep Using the Google Tech Stack and Be GDPR-Compliant
- Google BigQuery Data Schema with Consent Mode
- Data Reporting Starts with Data Lineage
- Short conclusions
Good Old Days of Digital Analytics
A few years ago, everyone who worked in data analytics imagined the coming years as a beautiful world where data and personalization were everywhere, with the ad tech stack developing rapidly.
What do we know about those good old days?
- 99.5% of specialists used Google Tag Manager to send data wherever they wanted.
- 85.7% of specialists used Google Analytics for website data collection.
- Almost everyone used ETL and DWH for data processing.
- It was really easy to define keys and use them to join data and build any reports you wanted.
A variety of data visualization tools, including Google Data Studio and Google Sheets, connected seamlessly to data warehouses.
In short, it was definitely way easier to deal with data without all of today’s external requirements.
Digital Analytics in 2022
Today, we have to put extra effort into working with users’ data. We don’t have flying cars, and data personalization is not everywhere. Instead, we have requirements and limitations that create additional concerns.
Browsers limit the use of third-party cookies
Browsers and platforms limit the lifetime of cookies set by third-party domains. This affects the persistence of identifiers that analytics systems rely on, such as Client ID in Google Analytics. Because of this, a significant amount of information on the effectiveness of advertising channels will no longer be available:
- The share of conversions for new visitors will grow. These will not actually be “new” visitors, however, but rather former “returning” visitors that have been assigned a new cookie.
- The share of direct / none conversions will grow.
- The ROI of paid ads in reports will have a 10% to 20% margin of error. Most often, the reported ROI will be lower than the actual one.
Google Analytics is not GDPR-compliant
After the EU’s General Data Protection Regulation (GDPR) went into effect, Google Analytics users in Europe faced a problem. Google Analytics has become illegal for website operators to use in several countries due to decisions by European data protection authorities, as it does not comply with the GDPR.
Now, businesses must remove Google Analytics from their websites or face fines for violating the GDPR. Users of Google Analytics operating in the European Union or serving customers in EU countries should take immediate action to ensure that no personal data is transferred to servers in the United States or find an alternative analytics platform that complies with the GDPR.
In addition, to comply with GDPR requirements, websites must use consent mode. That is, a website must not identify users who decline cookies. This leads to the following problem.
Consent mode reduces the number of conversions for which a traffic source can be identified
Advertisers will continue to collect user activity data, but they won’t be able to determine which interactions with ads lead to conversions. The average share of users who reject cookies on websites with consent mode implemented is 30%. Depending on the type of website, this share can reach 40%.
The volume of online conversions in marketing reports will remain the same, but those conversions will no longer be linked to the click source or to completed orders from the CRM. As a result, you won’t be able to attribute a large share of conversions to advertising campaigns, and the ROI in your reports will be understated.
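To make the effect concrete, here is a minimal Python sketch of how reported ROI shifts when a share of conversions loses its traffic source. The spend and revenue figures are hypothetical; only the 30% rejection share comes from the text above. The sketch also assumes that revenue from users who reject cookies simply drops out of campaign attribution.

```python
def reported_roi(ad_spend: float, true_revenue: float, rejection_share: float) -> float:
    """ROI as seen in reports when revenue from non-consenting users is unattributed."""
    attributed_revenue = true_revenue * (1 - rejection_share)
    return (attributed_revenue - ad_spend) / ad_spend


# Hypothetical campaign: 10,000 spent, 15,000 in revenue actually driven by ads.
true_roi = reported_roi(10_000, 15_000, rejection_share=0.0)
seen_roi = reported_roi(10_000, 15_000, rejection_share=0.3)

print(f"true ROI: {true_roi:.0%}, reported ROI: {seen_roi:.0%}")
```

With these assumed numbers, a campaign with a true ROI of 50% shows up in reports at roughly 5%, even though total conversions and revenue have not changed.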
Today, when an analyst begins to think about collecting, processing, and transforming data, they have to answer the following tricky questions:
- What should I do with consenting and non-consenting users? How can I distinguish them and get data in my reports that can be trusted?
- What kind of consent do I have to ask for to track UTM parameters? (It’s essential to track UTM parameters in order to match sessions/website conversions with your campaigns.)
- To which endpoints can I send users’ data? (Double-check what kinds of services you use before you send data there.)
- What kind of data can I track for non-consenting users?
- How can I make sure that European customers’ data is processed and stored in an EU location?
- How does PII data flow through all my data pipelines and transformations?
Those who have already had conversations with their legal teams know how frustrating it can be to provide a clear answer to what’s going on with PII data on its journey to the final report.
- How can you build roll-up reports for all regions if all those regions have different laws and regulations and also different servers?
- Why are direct traffic and the share of new users unexpectedly increasing?
Let’s do our best to cover all of the questions above to make analysts’ lives easier in the coming weeks, months, and probably years.
How to Keep Using the Google Tech Stack and Be GDPR-Compliant
Almost every marketing team has an established Google tech stack that everyone is used to and that has worked reliably for years. However, the limitations and innovations described above are forcing companies to look for other tools for working with data. The good news is that you can continue to use the familiar Google tech stack as long as you follow these guidelines.
1. Check out geo reports in Google Analytics
You have to understand which regions your website visitors come from. How many are from the US vs the EU? Start with the countries that account for most of your traffic: open your geo reports and list the countries where the majority of your visitors come from.
2. Learn about data protection laws in visitors’ regions
What laws apply to visitors from these countries? Thankfully, there’s a great website that combines all the laws and regulations around the world and makes it easy to determine which you have to follow in order to be compliant.
3. Deduplicate and prioritize requirements
Once you’ve completed steps one and two, you have to deduplicate all those requirements from different countries. Consult with lawyers to translate from legal English to data analysts’ English.
At the end of this stage, you will have figured out all the privacy restrictions no matter which platform you’re going to send data to. It’s not only about Google.
4. Implement consent mode correctly
Finally, you have to implement consent mode. It’s really easy to implement those rules with the help of third-party tags or third-party products that are integrated with GTM. Follow these links to find GTM templates in order to ask your visitors for consent to send their data to analytics services.
Now we get to the data processing stage. At the previous stage, you worked out what kind of data you can collect with what kind of consent. Now you can start capturing and processing this data.
We can no longer just send PII data to GA as we did before, not even if the data from GA is then exported to Google BigQuery (GBQ) and the GBQ location is set to the EU. This is because EU law does not allow sending PII directly to GA without a proper setup.
1. Configure Google Analytics and Google Tag Manager
This is not the hardest task. All you need to do is go over this checklist, accept the new Google DPA, and disable the Data sharing settings. Most importantly, ghost hits and Google signals have to be disabled as well.
With this done, you make GA privacy-compliant by preventing the collection of PII without consent.
However, as soon as you adjust all these settings in GA, you will find that the really important data is nowhere to be found in GA or, consequently, in Google BigQuery Export.
We are talking about granular location data, some PII data that you need for certain reports, and custom dimensions that are used as keys to join GA data with, for instance, CRM data.
Obviously, this state of affairs won't work for you because at the end of the day, as an analyst, you want to build an actionable report and you want to deal with SQL-accessible data. Luckily, there is another solution you can implement: server-side tracking.
2. Set up cookieless server-side tracking
You can use the OWOX solution or build your own.
Either way, the most important thing about the server is that it must be located in the EU. This is how you can be sure that all PII data is filtered before you send it to any other service.
Based on our experience, server-side tracking increases the accuracy of acquisition campaign tracking by 20%. So there is a business reason, not just a legal reason, to migrate to server-side tracking.
3. Set up a server-side tag manager
The third step is setting up a server-side tag manager. Why is it important? Because you want control over all the data you send, not just to your analytics service but to all third-party ad services as well (for example, Facebook and Bing).
At this point, you can host your server-side tag manager in an EU location and filter out all PII fields such as IP address. You can send just the data required for each ad service.
This is how you can export data in a way that complies with GDPR requirements.
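The filtering step can be sketched in a few lines of Python. This is not how any particular server-side tag manager is implemented; the destination names, field names, and allowlists below are hypothetical and only illustrate the idea of forwarding just the fields each service needs.

```python
from typing import Any

# Hypothetical allowlists: which event fields each destination may receive.
DESTINATION_ALLOWLIST: dict[str, set[str]] = {
    "analytics": {"event_name", "timestamp", "page_path", "utm_source", "utm_medium"},
    "facebook": {"event_name", "timestamp", "value", "currency"},
    "bing": {"event_name", "timestamp", "value"},
}


def filter_event(event: dict[str, Any], destination: str) -> dict[str, Any]:
    """Return a copy of the event containing only fields allowed for the destination."""
    allowed = DESTINATION_ALLOWLIST[destination]
    return {key: value for key, value in event.items() if key in allowed}


incoming = {
    "event_name": "purchase",
    "timestamp": "2022-06-01T12:00:00Z",
    "page_path": "/checkout/thanks",
    "value": 49.9,
    "currency": "EUR",
    "ip_address": "203.0.113.7",  # PII: not in any allowlist, so never forwarded
    "user_agent": "Mozilla/5.0",  # usable for fingerprinting: never forwarded
}

for destination in DESTINATION_ALLOWLIST:
    print(destination, filter_event(incoming, destination))
```

The sketch uses an allowlist rather than a blocklist of PII fields, so a new field added to the event later is not forwarded anywhere until someone explicitly permits it.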
If you still face objections from the legal team, they will probably sound like this: “How do we make sure that nobody can access our visitors’ PII data in Google BigQuery?”
There is a solution for this as well. You can turn on customer-managed encryption keys (CMEK) in Cloud KMS and encrypt your data so that no one, and I mean no one, can read it without access to your key.
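As a rough illustration, the Python sketch below builds a BigQuery `CREATE TABLE` statement that protects a table with a customer-managed key through the `kms_key_name` table option. The project, dataset, table, key ring, and key names are hypothetical, and the `europe` key location is an assumption for a dataset stored in the EU multi-region. Check the key location against your own dataset before using anything like this.

```python
def kms_key_resource(project: str, location: str, key_ring: str, key: str) -> str:
    """Build a Cloud KMS key resource name."""
    return (
        f"projects/{project}/locations/{location}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )


def create_table_ddl(table: str, columns: dict[str, str], kms_key: str) -> str:
    """Build a CREATE TABLE statement for a table encrypted with the given key."""
    column_defs = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return (
        f"CREATE TABLE `{table}` (\n  {column_defs}\n)\n"
        f"OPTIONS (kms_key_name = '{kms_key}');"
    )


# All names below are placeholders for illustration only.
key = kms_key_resource("my-project", "europe", "analytics-ring", "sessions-key")
ddl = create_table_ddl(
    "my-project.analytics_eu.sessions",
    {"session_id": "STRING", "consent_mode": "STRING", "pageviews": "INT64"},
    key,
)
print(ddl)
```

The generated statement can be run in the BigQuery console or through any BigQuery client once the BigQuery service account has been granted permission to use the key.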
To be honest, we haven’t encountered any organization that would still have doubts about using GCP once they have followed all of these recommendations.
Google BigQuery Data Schema with Consent Mode
Now let’s jump to some more practical recommendations. What does consent mode look like?
As soon as you start sending data with consent information (for example, using OWOX BI), you will get a dedicated parameter that contains the consent state.
Here is a session table. As you can see, it has a dedicated ConsentMode field that contains the value of consent granted on the website.
In order to collect data for analytics purposes, you have to get consent, and you can determine which consent was given from the value of the ConsentMode parameter. The Google consent mode values that cover analytics are G101 and G111. If the gcs parameter has one of these values, you may collect data for analytics purposes.
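A small Python sketch of this check is shown below. It assumes the value has the form `G1` followed by two digits, where the first digit is the ads consent and the second is the analytics consent (1 for granted, 0 for denied). That reading is consistent with G101 and G111 being the analytics values, but treat the decoding as an assumption and the function names as hypothetical.

```python
def parse_gcs(gcs: str) -> dict[str, bool]:
    """Decode a consent value such as 'G101' into per-purpose flags.

    Assumed format: 'G1' + ads digit + analytics digit (1 = granted, 0 = denied).
    """
    if len(gcs) != 4 or not gcs.startswith("G1") or any(c not in "01" for c in gcs[2:]):
        raise ValueError(f"unexpected consent value: {gcs!r}")
    return {"ads": gcs[2] == "1", "analytics": gcs[3] == "1"}


def analytics_allowed(gcs: str) -> bool:
    """True when the consent value permits collecting data for analytics."""
    return parse_gcs(gcs)["analytics"]


for value in ("G100", "G101", "G110", "G111"):
    print(value, analytics_allowed(value))
```

In a pipeline, a check like this would run on the ConsentMode field of each session before any user-level processing takes place.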
However, if your website visitors haven’t given their consent, you can still store their data but without any personally identifiable information, just like how your web server logs contain IP addresses and user agents but don’t have unique user IDs.
Let’s have a look at how it works.
Imagine you have not been granted consent. Now, each hit will have a new client ID and OWOX user ID.
On top of that, granular location data will be unavailable. The idea behind this is the following:
You cannot collect any kind of data that can directly or indirectly identify the individual. What kind of data is that? City, latitude, longitude, browser (meaning minor version number and user agent), anything that can be used for fingerprinting including device brand/model, and so on.
However, you can store non-PII data, such as pageviews, that cannot be used to identify individuals. Below, you will find out why you need this data.
The most obvious idea is to get the totals, right? We believe that everyone would like to have accurate totals in terms of page views and number of conversions, and it doesn’t matter which particular users these metrics come from.
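The behavior described above can be sketched in Python as follows. The field names and the list of identifying fields are hypothetical, chosen to mirror the examples in the text; this is an illustration of the idea, not the way OWOX BI or Google Analytics implements it.

```python
import uuid
from typing import Any

# Fields treated as directly or indirectly identifying (names are hypothetical).
IDENTIFYING_FIELDS = {
    "city", "latitude", "longitude", "ip_address",
    "user_agent", "browser_version", "device_brand", "device_model",
}


def prepare_hit(hit: dict[str, Any], analytics_consent: bool) -> dict[str, Any]:
    """Return the hit as it may be stored, depending on consent."""
    if analytics_consent:
        return dict(hit)
    cleaned = {k: v for k, v in hit.items() if k not in IDENTIFYING_FIELDS}
    # Fresh random IDs per hit, so hits cannot be stitched into a user journey.
    cleaned["client_id"] = str(uuid.uuid4())
    cleaned["user_id"] = str(uuid.uuid4())
    return cleaned


raw_hits = [
    ({"client_id": "111.222", "user_id": "u1", "type": "pageview", "city": "Berlin"}, True),
    ({"client_id": "333.444", "user_id": "u2", "type": "pageview", "city": "Paris"}, False),
    ({"client_id": "333.444", "user_id": "u2", "type": "pageview", "city": "Paris"}, False),
]

stored = [prepare_hit(hit, consent) for hit, consent in raw_hits]
total_pageviews = sum(1 for hit in stored if hit["type"] == "pageview")
print(total_pageviews)  # 3
```

All three pageviews are counted in the total, while the two hits without consent carry no location data and cannot be linked to each other.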
Data Reporting Starts with Data Lineage
Now, let’s move to data reporting, which starts with data lineage. Once you have collected all your data, you will have to answer how your PII data flows and how you set up and control all your data transformations, joins, and cleaning.
It would be great to have a dedicated tool that shows all those transformations and how you arrived at the final report in the clearest and the most auditable way — a tool that would help you understand if your PII flows correctly.
For instance, as soon as you collect data from different regions, you’ll need to join it in order to build a roll-up. Or say that data for consenting and non-consenting users is stored separately, and an overall metric needs to be calculated in one report. To do this, you need to know the data schema and keep dozens of transformations in your head. And if an error suddenly appears in the calculations, without a clear data lineage you will spend a lot of time finding and fixing it. These are just a few of hundreds of use cases for data lineage.
To solve this problem, which our clients have often faced, we have created a transformation graph in OWOX BI that clearly shows how, where, and why your data is moving. With it, you can easily see the calculation logic and influence it:
- Track how data moves and changes from connectors to dashboards.
- Set and control data transformations and metrics calculation logic in each report.
- Manage SQL transformations in a few clicks.
- Schedule data updates to keep data fresh.
- Immediately see any error or delay in updating data.
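Independent of any particular tool, the core idea of lineage can be sketched in Python as a graph of tables and their upstream sources. The table names and the set of PII-bearing sources below are hypothetical; the sketch only shows how to answer the question of which PII sources feed a given report.

```python
# Hypothetical lineage graph: each table lists the upstream tables it is built from.
UPSTREAM: dict[str, list[str]] = {
    "raw_hits_eu": [],
    "raw_hits_us": [],
    "crm_orders": [],
    "sessions_eu": ["raw_hits_eu"],
    "sessions_us": ["raw_hits_us"],
    "sessions_rollup": ["sessions_eu", "sessions_us"],
    "roi_report": ["sessions_rollup", "crm_orders"],
}

# Source tables assumed to contain PII in this example.
PII_SOURCES = {"raw_hits_eu", "crm_orders"}


def pii_origins(table: str) -> set[str]:
    """Return the PII-bearing source tables that feed into the given table."""
    origins = {table} if table in PII_SOURCES else set()
    for parent in UPSTREAM[table]:
        origins |= pii_origins(parent)
    return origins


print(sorted(pii_origins("roi_report")))  # ['crm_orders', 'raw_hits_eu']
```

The result is deliberately conservative: a table is flagged whenever a PII source is anywhere upstream, even if a transformation along the way has already dropped the sensitive columns.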
1. Create a data catalog
First of all, a data catalog is a way to organize your inventory of data assets, especially those that contain PII data. You need to clearly label what type of PII each asset contains and how it is protected. For instance, you might encrypt your data or hash it, depending on how you are going to use it.
2. Assign an owner for each data asset
Secondly, you have to assign an owner for each data asset. For instance, you can set yourself as the owner for Visitors in order to easily understand who owns the data and what kinds of fields are related to PII data.
3. Define PII security on a column basis
Last but not least, you can even define PII data security on a column basis to determine if you’d like to encrypt the data or hash it.
The great news is that Google Cloud offers a simple way to use column-level encryption without any need to rewrite all SQL queries from scratch.
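The three steps above can be combined into one small catalog structure. The Python sketch below is only an illustration: the class names, the asset, its owner, and the salt are hypothetical, and salted SHA-256 hashing stands in for whatever protection your legal and security teams agree on.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Column:
    name: str
    pii_type: Optional[str] = None  # e.g. "email"; None means the column is not PII
    protection: str = "none"        # "none", "hash", or "encrypt"


@dataclass
class DataAsset:
    name: str
    owner: str
    columns: list[Column] = field(default_factory=list)

    def pii_columns(self) -> list[str]:
        """Names of the columns that are marked as PII."""
        return [c.name for c in self.columns if c.pii_type is not None]


def hash_value(value: str, salt: str) -> str:
    """Salted SHA-256 hash, usable as a join key without exposing the raw value."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


visitors = DataAsset(
    name="Visitors",
    owner="analyst@example.com",
    columns=[
        Column("email", pii_type="email", protection="hash"),
        Column("phone", pii_type="phone", protection="encrypt"),
        Column("pageviews"),
    ],
)

print(visitors.owner, visitors.pii_columns())
print(hash_value("jane@example.com", salt="demo-salt"))
```

Hashing keeps a column usable for joins, for example with CRM data, while encryption suits columns that someone with the right key must be able to read back.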
Short conclusions
By following the recommendations in this article, you will be able to:
- Get all of your data in Google BigQuery
- Filter out all PII data for non-consenting users
- Keep non-PII data for non-consenting users in order to get totals and build roll-up reports
- Tell your legal team how your data flows through all pipelines