[GA4] BigQuery Export: How to Query GA4 Event Data

GA4 Google BigQuery
SQL Copilot for BigQuery

Are you leveraging the robust capabilities of BigQuery to query your Google Analytics 4 event data?

Understanding this integration can significantly enhance how you visualize and interpret user interactions, opening new doors to data-driven decision-making. In the digital analytics landscape, understanding how to query GA4 data in BigQuery has become essential.

Professionals in various roles, from digital analysts to BI experts, need precise skills to extract actionable insights from this data. Mastering these techniques allows you to delve deeper into user behavior and optimize strategies effectively.

This article will guide you through the process of querying GA4 event data in BigQuery. You'll learn how to navigate and extract specific event data, empowering you to make informed decisions.

This article is Part 3 in our series, "[GA4] BigQuery Export."

If you're following along, check out the previous articles in the series:

Why You Should Use GA4 with BigQuery

Integrating GA4 with BigQuery offers a powerful solution for professionals who need to analyze their website data. The combination of GA4’s event-based model, instead of the session-based model known before, and BigQuery’s robust querying capabilities allows for detailed analysis of user interactions.

The benefits of using GA4 with BigQuery:

1. Access to Raw Event Data: Unlike the aggregated data in the standard GA4 interface, BigQuery provides access to detailed, raw event data, allowing for more granular analysis.

2. Scalable Data Processing: BigQuery’s powerful querying engine can handle large datasets quickly, making it possible to run complex queries.

3. Custom Reporting: When storing GA4 data in BigQuery, you can create tailored reports and dashboards in Looker Studio that meet precise business needs, and exceeds what you can get inside GA4 interface.

4. Enhanced Insights: The ability to join GA4 conversions data with data from other platforms (like cost data from facebook or google ads) is a step towards end-to-end full-funnel analytics.

5. Better Marketing Strategies: By analyzing detailed event data, marketing managers can gain valuable insights to refine their strategies and improve campaign performance.

Querying GA4 event data in BigQuery is the essential skill for gaining detailed insights into user behavior and website interactions.

Learning the essential steps, including basic queries, dynamic date range selection, and traffic sources analysis, will be helpful for any data professional or marketing expert.

Report

Get BigQuery Reports in Seconds

Seamlessly generate and update reports in Google Sheets—no complex setup needed

Start Reporting Now

Basic Queries for Examining Collected Events in GA4 Report

If you’ve already set up GA4 to export data to BigQuery, your next step might be to build reports to see what events are being collected and analyze those events using SQL queries.

When analyzing GA4 data in BigQuery, it's important to start with the basics, focusing on regular, non-nested fields. Those fields in GA4 data are just data points that do not require additional processing to extract or analyze - just query.

These fields are stored in a flat structure within the dataset, making them simpler to query. Understanding these fields is crucial for performing basic analysis in BigQuery.

Some fields are:

  • Event_name: Represents the specific name of the event being tracked, such as page_view, click, purchase, etc.
  • Event_date: Captures the date on which the event occurred, typically formatted as YYYYMMDD.
  • Page_location: Represents the URL of the page where the event occurred.

Here is the list of a few repeated fields, they are more complex than the simple once, but it’s still quite easy to query them.

  • Traffic_source: This field group includes attributes like source (traffic_source.source), medium (traffic_source.medium), and campaign (traffic_source.campaign) which provide information on how the user arrived at your site.
  • Geo.country: Indicates the country from which the user accessed your site (which is hidden in the Geo field).

Working with non-nested fields in BigQuery is relatively straightforward.

You can perform simple SQL queries to group, filter, and aggregate data based on these fields, providing quick insights into your website's performance.

These fields are easy to access and manipulate, making them ideal for initial data exploration.

Events Count

Understanding the frequency of different events on your website is crucial for analyzing user behavior and tracking key interactions.

The Events Count query allows you to see how many times each event has occurred, providing a clear overview of your site's most common user actions.

Example:

The following SQL query counts the occurrences of each event in your GA4 dataset and orders them by frequency:

SELECT
  event_name,
  COUNT(*) AS event_count
FROM  `project_id.your_dataset_id.events_20240804`
GROUP BY
  event_name
ORDER BY
  event_count DESC
LIMIT
  10

Here:

  • SELECT: This clause specifies the columns you want to retrieve. In this case, event_name is selected to show the name of each event, and COUNT(*) AS event_count is used to count the occurrences of each event.
  • FROM: Here, you specify the source of your data.
  • GROUP BY: This clause groups the data by event_name, ensuring that the count is calculated for each unique event name.
  • ORDER BY: This orders the results by the event_count in descending order, so the most frequent events appear at the top.
  • LIMIT: This limits the results to the top 10 events, which can be adjusted based on your needs.

This query provides a quick snapshot of event frequencies, helping you understand which interactions are most common on your website.

Traffic Sources Report

A Traffic Report helps you understand your website's traffic sources and how users are arriving at your site.

You can get valuable insights about the usefulness of different marketing channels and campaigns by scanning traffic sources, mediums, and campaign names.

Example:

This query breaks down traffic by source, medium, and campaign name, giving you a clear view of how users are reaching your site.

SELECT
  traffic_source.source,
  traffic_source.medium,
  traffic_source.name,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM `project_id.your_dataset_id.events_20240801`
GROUP BY
  traffic_source.source,
  traffic_source.medium,
  traffic_source.name
ORDER BY
  users DESC

Here:

  • SELECT: taffic_source.source, traffic_source.medium, and traffic_source.name identify the traffic source, medium, and campaign name, respectively.
  • COUNT(DISTINCT user_pseudo_id) AS users: Counts the number of unique users (identified by user_pseudo_id) for each source, medium, and campaign name combination.
  • FROM: Specifies the source of your data.
  • GROUP BY: Groups the data by traffic_source.source, traffic_source.medium, and traffic_source.name to calculate the number of users for each unique combination of these fields.
  • ORDER BY: Orders the results by the users count in descending order, so the sources with the most users appear first.

This query provides a detailed breakdown of your website's traffic sources, helping you understand which channels are driving the most users.

Traffic Report by Date

A Traffic Report by Date allows you to analyze how your website's traffic changes over time.

By breaking down traffic data by date, source, medium, and campaign name, you can track the effectiveness of your marketing efforts on a daily basis.

Example:

Let’s explore an example query that provides a detailed view of your website’s traffic by date, source, medium, and campaign name.

SELECT
  event_date AS date,
  traffic_source.source,
  traffic_source.medium,
  traffic_source.name,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM `project_id.your_dataset_id.events_202408*`
GROUP BY
  date,
  traffic_source.source,
  traffic_source.medium,
  traffic_source.name
ORDER BY
  date DESC,
  users DESC
LIMIT
  10

Here:

  • SELECT event_date AS date, traffic_source.source, traffic_source.medium, and traffic_source.name: Retrieves the event date, traffic source, medium, and campaign name
  • COUNT(DISTINCT user_pseudo_id) AS users: Counts the unique users for each combination of date, source, medium, and campaign.
  • FROM: Specifies the source of your data, in this case, accessing multiple tables.
  • GROUP BY: Groups the data by date, traffic_source.source, traffic_source.medium, and traffic_source.name to calculate the number of users for each combination on a daily basis.
  • ORDER BY: Orders the results first by date in descending order to show the most recent data first, and then by users in descending order to highlight the most significant traffic sources.
  • LIMIT: Limits the output to the top 10 entries, which can be adjusted based on the volume of data you need to analyze.

This query provides a comprehensive view of your website's traffic over time, helping you identify trends and assess the daily impact of different marketing channels.

If you're interested in exploring more SQL queries for analyzing GA4 data in BigQuery, OWOX's guide offers a comprehensive library of queries. This resource offers detailed examples and explanations to help you enhance your data analysis skills and make the most of your GA4 exports. Check it out here.

Uncover in-depth insights

[GA4] BigQuery Export SQL Queries Library

Download now

Bonus for readers

[GA4] BigQuery Export SQL Queries Library

Hire SQL-Assistant Powered by AI

Leveraging AI-powered tools can significantly enhance your ability to analyze complex datasets and generate crucial insights. An AI-powered SQL assistant can automate query generation, making it easier to work with large datasets like those from GA4 in BigQuery.

OWOX SQL Copilot is an AI-powered tool designed to assist with generating SQL queries for BigQuery. It simplifies the process of querying complex datasets, offering automated suggestions and query templates.

With OWOX SQL Copilot, even those with limited SQL knowledge can efficiently extract insights from GA4 data and create custom reports.

Report

Unlock BigQuery Insights in Google Sheets

Report on what matters to you. Integrate corporate BigQuery data into a familiar spreadsheet interface. Get insightful, up-to-date reports with just a few clicks

Unlock Data Insights Now

Dynamic and Static Date Range Selection

Date ranges in BigQuery can be categorized into two types: dynamic and static. Dynamic date ranges adjust automatically based on the current date, allowing for real-time analysis and continuous monitoring of data trends. This is particularly useful for dashboards and reports that need to reflect the latest data without manual updates.

Static date ranges, on the other hand, are fixed and do not change over time. These are ideal for analyzing specific periods, such as comparing performance across different months or evaluating the impact of a particular marketing campaign.

Date Range Specific Queries (Using Wildcards)

When working with large datasets in BigQuery, it's important to have flexible methods for selecting date ranges to suit your analytical needs. One efficient way to do this is by using wildcards in your SQL queries.

Wildcards allow you to dynamically select tables based on specific patterns in their names, making it easier to query data from particular time periods without manually listing each table.

Wildcards can be used in the FROM clause to select tables based on their naming conventions, which typically include date components. Here are some examples of how to use wildcards for selecting different date ranges:

All Tables in the Dataset:

FROM `project_id.dataset_id.events_*`

This will select data from all available tables in the dataset, regardless of the date.

Full Year (2023):

FROM `project_id.dataset_id.events_2023*`

This selects data from all tables corresponding to the year 2023.

Specific Month (January 2024):

FROM `project_id.dataset_id.events_202401*`

This selects data from all tables corresponding to January 2024.

First 9 Days of January 2024:

FROM `project_id.dataset_id.events_2024010*`

This selects data from the first nine days of January 2024.

Static Date Range Query Examples

When analyzing data over a specific, unchanging period, static date range queries are the go-to method. Static date range queries are particularly useful for evaluating performance across a set period, such as comparing data from the same month across different years or analyzing the impact of a campaign that ran for a specific duration.

Example Query: Static Date Range – 30 Days

This query retrieves data from a fixed date range, such as the entire month of July 2024.

SELECT
  *
FROM
  `project_id.your_dataset_id.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20240701' AND '20240731'

Here:

  • SELECT *: This clause selects all columns from the specified tables. It retrieves every piece of data available within the date range defined by the _TABLE_SUFFIX filter.
  • FROM `project_id.your_dataset_id.events_*`: This specifies the dataset and tables from which to pull data.
  • WHERE _TABLE_SUFFIX BETWEEN '20240701' AND '20240731': filters the data to include only the tables corresponding to dates from July 1st, 2024, to July 31st, 2024, by using _TABLE_SUFFIX.

Dynamic Date Range Query

Dynamic date range queries, on the other hand, adjust automatically based on the current date. These queries are ideal for ongoing analysis, such as monitoring the last 7 or 30 days of data without needing to manually update the date range.

Example Query: Dynamic Date Range – Last 30 Days

This query dynamically retrieves data from the last 30 days, updating every day to include the most recent data.

SELECT
  *
FROM
  `project_id.your_dataset_id.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 day))
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))

Example Query: Dynamic Date Range – Last 7 Days

For a shorter, more focused analysis, this query retrieves data from the last 7 days.

SELECT
  *
FROM
  `project_id.your_dataset_id.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 day))
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))

Example Query: Mixed Static and Dynamic Date Range

You can also combine static and dynamic dates in your queries to capture data from a fixed starting point up to a dynamically determined end date. This approach is useful when you want to analyze data from a specific start date until the most recent available data.

This query retrieves data from July 1st, 2024, up to yesterday's date:

SELECT
  *
FROM
  `project_id.your_dataset_id.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20240701' AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))

Working with Dimensions of Date and Timestamp

In GA4 and BigQuery, date and timestamp dimensions are essential for analyzing time-based data.

  • Date Dimension: This typically refers to a specific day, such as '2021-01-01'. It’s useful for grouping data by day, allowing you to analyze daily trends and patterns.
  • Timestamp Dimension: A timestamp provides more granularity, capturing the exact moment an event occurred, including the date and time down to the second (e.g., '2021-01-01 12:34:56').

These dimensions allow you to perform temporal analysis, helping you understand when specific actions occur and how they correlate with other data points over time.

Examples of Queries to these Parameters

Let’s explore some examples of how to query these default date and timestamp dimensions using GA4 data in BigQuery. These examples will demonstrate how to extract and analyze data based on specific time intervals.

1. Querying by event_date

This query retrieves data based on the event_date, which represents the date on which the event was logged in the format YYYYMMDD.

SELECT
    event_date,  
    COUNT(*) AS event_count
FROM
    `project_id.your_dataset_id.events_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20240701' AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
GROUP BY
    event_date
ORDER BY
    event_date DESC

2. Querying by event_timestamp

This query extracts the precise time (in microseconds, UTC) at which an event was logged on the client, using event_timestamp.

SELECT
    TIMESTAMP_MICROS(event_timestamp) AS event_time, 
    COUNT(*) AS event_count
FROM
    `project_id.your_dataset_id.events_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20240701' AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
GROUP BY
    event_time
ORDER BY
    event_time DESC

3. Querying by event_previous_timestamp

This query analyzes the difference in time between consecutive events by looking at the event_previous_timestamp.

SELECT
    TIMESTAMP_MICROS(event_timestamp) AS current_event_time,
    TIMESTAMP_MICROS(event_previous_timestamp) AS previous_event_time,
    TIMESTAMP_MICROS(event_timestamp) - TIMESTAMP_MICROS(event_previous_timestamp) AS time_difference
FROM
    `project_id.your_dataset_id.events_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20240701' AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
ORDER BY
    current_event_time DESC

4. Querying by event_server_timestamp_offset

This query examines the offset between the event's collection time and the upload time, using event_server_timestamp_offset.

SELECT
    event_date,
    event_server_timestamp_offset, 
    COUNT(*) AS event_count
FROM
    `project_id.your_dataset_id.events_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20240701' AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
GROUP BY
    event_date, event_server_timestamp_offset
ORDER BY
    event_server_timestamp_offset DESC

5. Querying by user_first_touch_timestamp

This query retrieves the first time (in microseconds) a user opened the app or visited the site, using user_first_touch_timestamp.

SELECT
    user_first_touch_timestamp,  
    COUNT(DISTINCT user_pseudo_id) AS user_count
FROM
    `project_id.your_dataset_id.events_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20240701' AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
GROUP BY
    user_first_touch_timestamp
ORDER BY
    user_first_touch_timestamp DESC

6. Querying by set_timestamp_micros

This query retrieves the timestamp when a specific user property was last set, using set_timestamp_micros.

SELECT
    (SELECT value.set_timestamp_micros FROM UNNEST(user_properties) WHERE key = '') AS user_set_timestamp_micros,  
    COUNT(*) AS event_count
FROM
    `project_id.your_dataset_id.events_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20240701' AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
GROUP BY
    user_set_timestamp_micros
ORDER BY
    user_set_timestamp_micros DESC

These examples demonstrate how to use different date and timestamp dimensions to extract detailed insights from your GA4 data in BigQuery.

Each dimension provides a unique perspective on user behavior and event timing, enabling more precise and informed analysis.

SQL Copilot

Generate SQL Queries 50х Faster with AI

Use natural language to generate, dry-run, optimize, and debug SQL queries

Get started now

Analyzing Specific Events from GA4 Report in BigQuery

When analyzing GA4 data in BigQuery, you might often want to focus on specific events, such as purchase, page_view, or other custom events relevant to your business. This targeted approach allows for a more detailed analysis of user interactions and behaviors related to specific actions on your website or app.

Possible Values for the Parameter event_name

In GA4, the event_name parameter identifies the particular actions that users take on your website or app. These events can be classified into two types: automatic and custom.

Automatic Events: These are built into GA4 and are automatically tracked without any additional setup. (page_view, scroll, click).

Custom Events: These are events that you define based on specific actions that are important to your business (purchase, form_submit). Custom events are implemented through additional setups, such as using Google Tag Manager (GTM) or the dataLayer.

Querying Pageview Events

A pageview event in GA4 represents each time a user loads a page or URL on your website. This event is automatically tracked by GA4 and is crucial for understanding user behavior, such as which pages are most visited, how users navigate through your site, and how often they return to specific pages.

Example:

To focus specifically on pageview events, you can use the following query. This will filter your dataset to include only the events where the event_name is page_view.

SELECT
  *
FROM `project_id.your_dataset_id.events_*`
WHERE
  event_name = 'page_view'
LIMIT
  10

Here:

  • SELECT *: This clause selects all columns from the specified tables. It retrieves every piece of data related to the filtered events, allowing you to analyze the full scope of the pageview events.
  • FROM `project_id.your_dataset_id.events_*`: This specifies the dataset and tables from which to pull data.
  • WHERE event_name = 'page_view': This condition filters the data to include only those events where the event_name is exactly page_view.
  • LIMIT 10: This limits the query results to the first 10 records, making the output manageable and easier to review.

Querying for pageview events is an essential step in understanding how users interact with your website. By isolating these events, you can gain insights into which pages are getting the most traffic, how users navigate through your site.

Filtering by Attributes such as Country

When analyzing GA4 data in BigQuery, you might want to narrow down your focus to specific attributes like the user's country. This is useful if you’re interested in understanding user behavior from a particular geographic region.

Example:

To filter your data by country, in this case, the United States, you can modify your query to include a WHERE clause that specifies both the event name and the country.

SELECT
  *
FROM  `project_id.your_dataset_id.events_*`
WHERE
  event_name = 'page_view'
  AND geo.country = 'United States'
LIMIT
  10

Here:

  • SELECT: This clause selects all columns from the specified tables, retrieving every piece of data related to the filtered events.
  • FROM `project_id.your_dataset_id.events_*`: Specifies the dataset and tables from which to pull data. The wildcard * includes all tables that match this pattern within the dataset.
  • WHERE event_name = 'page_view' AND geo.country = 'United States': This condition filters the data to include only those events where the event_name is page_view and the geo.country is the United States.
  • LIMIT 10: Limits the query results to the first 10 records, making the output manageable and easier to review. Adjust this limit based on your specific needs.

Filtering your GA4 data by attributes such as country allows for a more targeted analysis of user behavior. By focusing on users from specific regions like the United States, you can better understand their interactions with your site, which pages they visit, and how they navigate through your content.

Advanced Querying Techniques for Event Data in BigQuery

As you become more familiar with querying GA4 data in BigQuery, you'll likely want to dive deeper into the data to uncover more complex insights. Advanced querying techniques enable you to handle nested and repeated fields, calculate custom metrics, and perform detailed segmentation of your data.

Nested / Repeated Fields Introduction

In GA4 and BigQuery, nested and repeated fields are used to store more complex data structures. These fields allow for the representation of hierarchical data within a single row of a table, which is essential for capturing detailed user interactions and events.

  • Nested Fields: These are fields that contain another record or a list of records as their value. This nesting allows for a more organized representation of data, but it requires special handling when querying.
  • Repeated Fields: These are arrays of nested fields. These repeated fields allow you to store multiple related entries within a single event, making the data structure more flexible.

Nested fields in GA4 represent complex data structures within a single event, allowing multiple related data points to be stored together. However, when you need to export this data to a traditional relational database, the nested structure can become cumbersome because relational databases typically expect flat, normalized tables.

To work with nested and repeated fields in BigQuery, you often need to use the UNNEST() function. This function flattens the nested data, allowing you to query it as if it were in a traditional relational table format.

Below are two examples of queries that utilize nested or repeated fields in GA4 data exported to BigQuery. These examples demonstrate how to handle complex data structures such as event parameters and items associated with e-commerce events.

Example 1: Accessing the ga_session_number from Event Parameters

This query extracts the session number from the event_params nested field and counts the number of page views per session.

SELECT
  user_pseudo_id,
  (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') AS sessionNumber,
  COUNTIF(event_name = 'page_view') AS pageviews
FROM
  `project_id.your_dataset_id.events_*`
GROUP BY
  user_pseudo_id, sessionNumber
ORDER BY
  pageviews DESC

Here:

  • SELECT user_pseudo_id: Selects the unique identifier for each user, helping to group data by individual users.
  • UNNEST(event_params): Flattens the nested event_params array, making each key-value pair accessible as separate rows.
  • WHERE key = 'ga_session_number': Filters the event parameters to extract the session number.
  • value.int_value AS sessionNumber: Retrieves the session number value and assigns it an alias for use in the query.
  • COUNTIF(event_name = 'page_view') AS pageviews: Counts occurrences of the page_view event for each session, representing the number of pages viewed in that session.
  • FROM `project_id.your_dataset_id.events_*`: Specifies the dataset and tables to query, with the wildcard * including all relevant tables.
  • GROUP BY user_pseudo_id, sessionNumber: Groups results by user and session, ensuring pageviews are counted for each session.
  • ORDER BY pageviews DESC: Orders results by the number of pageviews in descending order.

This query is useful for analyzing session-based metrics, allowing you to track how many pages users view during each session.

Example 2: Analyzing Items in E-Commerce Events

This query retrieves the most popular products by quantity sold, filtering only for purchase events.

SELECT
  item.item_name,
  SUM(item.quantity) AS total_quantity,
  SUM(item.price) AS total_revenue
FROM
  `project_id.your_dataset_id.events_*`,
  UNNEST(items) AS item
WHERE
  event_name = 'purchase'
GROUP BY
  item.item_name
ORDER BY
  total_quantity DESC

Here:

  • SELECT item.item_name: Selects the product name from the unnested items field.
  • SUM(item.quantity) AS total_quantity: Sums the quantities of each product purchased to calculate the total units sold.
  • SUM(item.price) AS total_revenue: Sums the prices of each product to calculate the total revenue generated from sales.
  • FROM project_id.your_dataset_id.events_*, UNNEST(items) AS item: Specifies the dataset and tables to query, and flattens the repeated items field to access individual product details.
  • WHERE event_name = 'purchase': Filters the data to include only purchase events, ensuring the analysis focuses on completed transactions.
  • GROUP BY item.item_name: Groups the results by product name to calculate total quantities and revenues for each product.
  • ORDER BY total_quantity DESC: Orders the results by total quantity sold, highlighting the most popular products.

This query is useful for e-commerce analysis, helping you identify which products are the most popular and generate the most revenue.

Report

Simplify BigQuery Reporting in Sheets

Easily analyze corporate data directly into Google Sheets. Query, run, and automatically update reports aligned with your business needs

Simplify Reporting Now

How to Query Total Users / New Users / Active Users

In GA4 and BigQuery, understanding user behavior is crucial for analyzing website performance. Key metrics such as total users, new users, and active users help you gauge the engagement and reach of your site. These metrics are often used to assess the effectiveness of your marketing efforts, user retention, and overall site growth.

Example:

The following SQL query in BigQuery calculates the total number of users, active users, the percentage of active users, new users, and the percentage of new users from GA4 event data.

COUNT(DISTINCT user_pseudo_id) AS users,
  -- active users (the number of engaged users)
  COUNT(DISTINCT
    CASE
      WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') > 0 
      OR (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' 
      THEN user_pseudo_id
      ELSE NULL
    END
  ) AS active_users,
  -- % active users (the percentage of engaged users from the total users)
  COUNT(DISTINCT
    CASE
      WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') > 0 
      OR (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' 
      THEN user_pseudo_id
      ELSE NULL
    END
  ) / COUNT(DISTINCT user_pseudo_id) AS percent_active_users,
  -- new users (the number of users who interacted with your site for the first time)
  COUNT(DISTINCT
    CASE
      WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 
      THEN user_pseudo_id
      ELSE NULL
    END
  ) AS new_users,
  -- % new users (the percentage of new users from the total users)
  COUNT(DISTINCT
    CASE
      WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 
      THEN user_pseudo_id
      ELSE NULL
    END
  ) / COUNT(DISTINCT user_pseudo_id) AS percent_new_users
FROM
`project_id.your_dataset_id.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20210101'
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
LIMIT
  10

Here:

  • COUNT(DISTINCT user_pseudo_id) AS users: This calculates the total number of unique users by counting distinct user_pseudo_id values.
  • COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') > 0 OR (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' THEN user_pseudo_id ELSE NULL END) AS active_users: This part counts users who have engaged with the site, defined by having an engagement_time_msec greater than 0 or a session_engaged value of '1'.
  • COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') > 0 OR (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' THEN user_pseudo_id ELSE NULL END) / COUNT(DISTINCT user_pseudo_id) AS percent_active_users: This calculates the percentage of active users by dividing the number of active users by the total number of users.
  • COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 THEN user_pseudo_id ELSE NULL END) AS new_users: This part counts new users, identified by checking if the ga_session_number is equal to 1, indicating their first session.
  • COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 THEN user_pseudo_id ELSE NULL END) / COUNT(DISTINCT user_pseudo_id) AS percent_new_users: This calculates the percentage of new users by dividing the number of new users by the total number of users.
  • FROM project_id.your_dataset_id.events_*:Specifies the dataset and tables to query, with the wildcard * indicating that all tables within the specified date range should be included.
  • WHERE _TABLE_SUFFIX BETWEEN '20210101' AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day)): Filters the data to include only events between the specified date range.
  • LIMIT 10: Limits the result set to 10 rows for manageability and performance.

This query provides a comprehensive overview of user engagement by calculating key metrics such as total users, active users, new users, and their respective percentages.

How to Query Engagement Details & Bounce Rate

Understanding user engagement and bounce rates is crucial for analyzing website performance. Engagement metrics help you determine how users interact with your site, while bounce rates indicate how many users leave your site without interacting.

Example:

Imagine you want to analyze the engagement details and bounce rates for your e-commerce site over a week in August 2024. The following SQL query can help you extract this information:

SELECT
  -- Query date
  event_date AS date,

  -- Query sessions count
  COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'))) AS sessions,

  -- Query engaged sessions count
  COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END) AS engaged_sessions,

  -- Query engagement rate
  SAFE_DIVIDE(
    COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END),
    COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))
  ) AS engagement_rate,

  -- Query bounced sessions count
  COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'))) - 
  COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END) AS bounces,

  -- Query bounce rate
  SAFE_DIVIDE(
    COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'))) - 
    COUNT(DISTINCT CASE WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END),
    COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))
  ) AS bounce_rate,

  -- Query event count per session
  SAFE_DIVIDE(
    COUNT(DISTINCT CASE WHEN event_name = 'page_view' THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END),
    COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))
  ) AS event_count_per_session

FROM   `project_id.your_dataset_id.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20240801' AND '20240807'
GROUP BY date

Here:

  • event_date AS date: Extracts the date on which events occurred.
  • COUNT(DISTINCT CONCAT(user_pseudo_id, ...)) AS sessions: Counts the total number of unique sessions by concatenating the user ID and session ID.
  • COUNT(DISTINCT CASE WHEN ... AS engaged_sessions: Counts the number of engaged sessions where users interacted meaningfully (e.g., session_engaged equals '1').
  • SAFE_DIVIDE(... AS engagement_rate: Calculates the engagement rate by dividing engaged sessions by total sessions.
  • COUNT(DISTINCT CONCAT(...) - COUNT(DISTINCT CASE WHEN ... AS bounces: Calculates the number of bounced sessions (where users left without meaningful engagement).
  • SAFE_DIVIDE(... AS bounce_rate: Calculates the bounce rate by dividing bounces by total sessions.
  • SAFE_DIVIDE(... AS event_count_per_session: Calculates the average number of events per session.
  • WHERE _TABLE_SUFFIX BETWEEN '20240801' AND '20240807': Filters the data to include only events between the specified date range.

By focusing on specific dates, you can track how well your site is retaining users and how effectively they are engaging with your content. This kind of analysis helps in identifying trends and making data-driven decisions to improve user experience and reduce bounce rates.

If you're interested in delving deeper into SQL queries for analyzing GA4 data in BigQuery, OWOX's guide offers a comprehensive collection of examples. This resource is filled with detailed queries and explanations, helping you enhance your data analysis skills and maximize the value of your GA4 exports.

Hire an AI-Powered SQL Assistant

OWOX SQL Copilot is an AI-powered assistant designed to help you generate SQL queries for BigQuery. It streamlines the process of querying large datasets by providing automated suggestions and ready-to-use templates. With OWOX SQL Copilot, even users with limited SQL expertise can efficiently extract insights from GA4 data and create custom reports with ease.

SQL Copilot

Spend Less Time Writing SQL

Use natural language to generate, dry-run, optimize, and debug SQL queries

Get started now

Traffic Source Analysis of GA4 Event Data in BigQuery

Knowing where your website traffic comes from is essential for enhancing your marketing strategies and improving overall site performance. Traffic source analysis in GA4 allows you to identify the origins of your site visitors, whether they arrive through organic search, paid campaigns, social media, or direct traffic.

Querying for Traffic Source Details

Traffic sources help you understand where your visitors are coming from, whether it's through organic search, paid campaigns, social media, or direct visits. By querying traffic source details, you can gain insights into the performance of your marketing efforts and optimize strategies for better results.

It’s important to note that traffic source data can be accessed at different levels in GA4:

1. User Attraction Level (traffic_source.source): This level provides information about the initial source that attracted a user to your site. It captures the broader view of how users are initially engaging with your brand.

Example: Querying Traffic Source at the User Level

In this example, we will focus on querying traffic source details at the user level using the traffic_source.source parameter. This will help us understand which platforms or networks are the initial points of entry for users visiting the site.

SELECT
    traffic_source.source AS traffic_source
FROM
    `project_id.your_dataset_id.events_*`
WHERE
    -- Define the date range (July 1, 2024, to the day before the current date)
    _table_suffix BETWEEN '20240701' AND format_date('%Y%m%d', date_sub(current_date(), INTERVAL 1 day))
GROUP BY 
traffic_source

Here:

  • SELECT Clause: traffic_source.source AS traffic_source: This selects the source of the traffic, such as 'google,' 'facebook,' or 'direct,' indicating the platform or network that first acquired the user.
  • FROM Clause: Specifies the dataset and tables in BigQuery where your GA4 data is stored. The wildcard (*) allows you to include multiple tables based on the specified date range.
  • WHERE Clause: Filters the data to include only events that occurred between July 1, 2024, and the day before the current date. The _table_suffix is used to match the specific date partitions in the dataset.

2. Session Level: The source (based on session) refers to the specific platform, website, or network that directed a user to your site during a particular session. Traffic sources at the session level are tracked using event_params. This level captures the source of traffic for each individual session.

Example: Querying Session-Based Traffic Source in GA4

In this example, we will query session-based traffic source details in GA4 using BigQuery.

WITH prep AS (
    SELECT
        user_pseudo_id,
        (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS session_id,
        ARRAY_AGG((SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source') IGNORE NULLS ORDER BY event_timestamp)[SAFE_OFFSET(0)] AS source
    FROM
        `project_id.your_dataset_id.events_*`
    WHERE
        _TABLE_SUFFIX BETWEEN '20240701' AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 day))
    GROUP BY
        user_pseudo_id,
        session_id
)

SELECT
    COALESCE(source, '(direct)') AS source_session,
    COUNT(DISTINCT CONCAT(user_pseudo_id, session_id)) AS sessions
FROM
    prep
GROUP BY
    source_session
ORDER BY
    sessions DESC

Here:

  • user_pseudo_id: Selects the unique identifier for each user, which is used to track sessions.
  • session_id: Extracts the session ID from the event_params, identifying each unique session for the user.
  • ARRAY_AGG(...) AS source: Aggregates the source values (indicating the traffic source) for each session, ordering them by the event timestamp to ensure the correct source is captured.
  • COALESCE(source, '(direct)') AS source_session: Selects the source associated with the session. If no source is found, it defaults to '(direct)' to account for direct traffic.
  • COUNT(DISTINCT CONCAT(user_pseudo_id, session_id)) AS sessions: Counts the number of unique sessions for each traffic source, combining the user_pseudo_id and session_id to ensure distinct session counts.
  • FROM prep: Pulls the prepared data from the Common Table Expression (CTE) prep.
  • GROUP BY source_session: Groups the results by traffic source to aggregate session data.
  • ORDER BY sessions DESC: Orders the results by the number of sessions in descending order, highlighting the most frequent traffic sources.

Grouping and Ordering by Traffic Metrics

Grouping and ordering traffic metrics in BigQuery allows you to organize and analyze data based on specific criteria, such as the traffic source, medium, and campaign.

By doing so, you can gain insights into the performance of different channels and understand how users interact with your site based on their point of entry.

Example: Grouping and Ordering Traffic Metrics

In this example, we’ll query traffic source data, grouping by channel and ordering by the number of sessions.

The query will calculate various metrics, including

  • sessions,
  • first sessions,
  • users,
  • pageviews,
  • purchases,
  • revenue,
  • bounce rate,
  • pages per session, and
  • pages per user.

These metrics will be grouped into channel groupings like Organic Search, Direct, Paid, Social, and Referral, allowing you to see how each channel performs.

SELECT
  CASE
    WHEN REGEXP_CONTAINS(medium, r'(?mi)(organic)') THEN 'Organic Search'
    WHEN REGEXP_CONTAINS(medium, r'(?mi)(none)') THEN 'Direct'
    WHEN REGEXP_CONTAINS(medium, r'(?mi)(display)') THEN 'Paid'
    WHEN REGEXP_CONTAINS(medium, r'(?mi)(not set)') THEN 'Not set'
    WHEN REGEXP_CONTAINS(medium, r'(?mi)(^offline$)') THEN 'Offline'
    WHEN REGEXP_CONTAINS(medium, r'(?mi)(cpc)|(ppc)|(cpm)|(adv)|(_ad)|(^ad)|(paid)|(remark)|(target)') THEN 'Paid'
    WHEN REGEXP_CONTAINS(medium, r'(?mi)(insta)|(twitter)|(telegram)|(social)|(smm)|(facebook)|(youtube)') THEN 'Social'
  ELSE
    'Referral'
  END AS channelGrouping,
  *,
  pageviews/sessions AS page_per_session,
  pageviews/users AS page_per_user
FROM (
  SELECT
    event_date,
    traffic_source.source AS source,
    traffic_source.medium AS medium,
    SUM(IF(event_name = 'session_start', 1, NULL)) AS sessions,
    SUM(IF(event_name = 'first_visit', 1, NULL)) AS first_sessions,
    COUNT(DISTINCT user_pseudo_id) AS users,
    SUM(IF(event_name = 'page_view', 1, NULL)) AS pageviews,
    SUM(IF(event_name = 'purchase', 1, NULL)) AS purchases,
    SUM(ecommerce.purchase_revenue) AS revenue,
    SAFE_DIVIDE(
      COUNT(DISTINCT CONCAT(user_pseudo_id, (
        SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'
      ))) - COUNT(DISTINCT CASE
        WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE KEY = 'session_engaged') = '1' THEN CONCAT(user_pseudo_id, (
          SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'
        ))
      END),
      COUNT(DISTINCT CONCAT(user_pseudo_id, (
        SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = 'ga_session_id'
      )))
    ) AS bounce_rate
  FROM
    `project_id.your_dataset_id.events_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20240801' AND '20240810'
  GROUP BY
    event_date,
    traffic_source.source,
    traffic_source.medium
  ORDER BY
    sessions DESC
)

Here:

  • Channel Grouping (CASE Statement): The CASE statement categorizes traffic based on the medium parameter, grouping it into channels like 'Organic Search,' 'Direct,' 'Paid,' 'Social,' and 'Referral.'
  • sessions: Counts the number of session start events.
  • first_sessions: Counts the number of first visit events, indicating new users.
  • users: Counts the number of distinct users based on their user_pseudo_id.
  • pageviews: Sums the number of page views.
  • purchases: Sums the number of purchase events.
  • revenue: Sums the total revenue from ecommerce purchases.
  • bounce_rate: Calculates the bounce rate by dividing the number of bounced sessions by the total sessions.
  • page_per_session: Calculates the average number of pages viewed per session by dividing pageviews by sessions.
  • page_per_user: Calculates the average number of pages viewed per user by dividing pageviews by users.
  • GROUP BY: Groups the data by event_date, traffic_source.source, and traffic_source.medium, which allows for aggregation of metrics by these dimensions.
  • ORDER BY sessions DESC: Orders the results by the number of sessions in descending order, highlighting the most popular traffic sources.

This query provides a comprehensive analysis of traffic metrics, grouped by channel and ordered by session count. By categorizing traffic into broader channel groupings and calculating key metrics, this query helps you understand how different channels contribute to your website’s performance.

How to Query User Count by Source / Medium and Date

Querying user count by source, medium, and date allows you to analyze how different traffic channels contribute to user acquisition over time. This type of query is useful for understanding the effectiveness of your marketing efforts, as it shows which sources and mediums are driving the most users to your site on specific dates.

Key Elements of the Query:

  • Source: The platform or network that referred the user to your site (e.g., 'google', 'facebook').
  • Medium: The type of traffic (e.g., 'organic', 'cpc', 'email').
  • Date: The specific day on which users visited your site.

By combining these elements in a query, you can track user acquisition trends, identify peak traffic days, and optimize your marketing strategies based on the performance of different channels.

Example: Querying User Count and Engagement Metrics by Source, Medium, and Date

This example demonstrates how to query detailed user engagement metrics in GA4 using BigQuery. The query captures various metrics, all grouped by source, medium, and date.

SELECT
  event_date,
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source') AS source,
  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium') AS medium,
  COUNT(DISTINCT user_pseudo_id) AS users,
  -- Number of active users (users with engagement)
  COUNT(DISTINCT CASE 
    WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engagement_time_msec') > 0 
      OR (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' 
    THEN user_pseudo_id 
    ELSE NULL 
  END) AS active_users,
  -- Number of new users
  COUNT(DISTINCT CASE 
    WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 
    THEN user_pseudo_id 
    ELSE NULL 
  END) AS new_users,
  -- Percentage of new users out of total users
  COUNT(DISTINCT CASE 
    WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 
    THEN user_pseudo_id 
    ELSE NULL 
  END) * 1.0 / COUNT(DISTINCT user_pseudo_id) AS percentage_new_users,
  -- Count of new sessions by new users
  COUNT(DISTINCT CASE 
    WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 
    THEN CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) 
    ELSE NULL 
  END) AS new_sessions,
  -- Percentage of new sessions out of total sessions
  COUNT(DISTINCT CASE 
    WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_number') = 1 
    THEN CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) 
    ELSE NULL 
  END) * 1.0 / COUNT(DISTINCT CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING))) AS percentage_new_sessions,
  -- Average number of sessions per user
  COUNT(DISTINCT CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING))) * 1.0 / COUNT(DISTINCT user_pseudo_id) AS number_of_sessions_per_user,
  -- Average number of page view events per user
  COUNTIF(event_name = 'page_view') * 1.0 / COUNT(DISTINCT user_pseudo_id) AS event_count_per_user,
  -- Average number of engaged sessions per user
  COUNT(DISTINCT CASE 
    WHEN (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = '1' 
    THEN CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) 
  END) * 1.0 / COUNT(DISTINCT user_pseudo_id) AS engaged_sessions_per_user
FROM
  `project_id.your_dataset_id.events_*`
WHERE
  -- Define start and end date
  _TABLE_SUFFIX BETWEEN '20240801' AND '20240810'
GROUP BY
  event_date, source, medium

Here:

  • event_date: The specific date when the events occurred.
  • source: Extracts the source of traffic (e.g., 'google', 'facebook') from the event_params.
  • medium: Extracts the medium (e.g., 'organic', 'cpc') from the event_params.
  • users: Counts the total number of unique users (based on user_pseudo_id).
  • active_users: Counts users with either engagement time (engagement_time_msec) or session engagement (session_engaged), indicating active participation on the site.
  • new_users: Counts users who are visiting the site for the first time (ga_session_number = 1).
  • percentage_new_users: Calculates the proportion of new users out of the total users.
  • new_sessions: Counts the sessions initiated by new users.
  • percentage_new_sessions: Calculates the proportion of new sessions out of the total sessions.
  • number_of_sessions_per_user: Calculates the average number of sessions per user.
  • event_count_per_user: Calculates the average number of page views per user.
  • engaged_sessions_per_user: Calculates the average number of engaged sessions per user, where the user had meaningful interaction during the session.
  • FROM Clause: Specifies the dataset and table to query from.
  • WHERE Clause: Filters data based on the specified date range.
  • GROUP BY Clause: Groups the results by event date, source, and medium.

This query is a comprehensive tool for analyzing user engagement metrics by traffic source, medium, and date.

If you're looking to dive deeper into SQL queries to analyze GA4 data in BigQuery, OWOX’s guide provides a rich library of examples. This resource is packed with detailed queries and explanations, making it easier to sharpen your data analysis skills and get the most out of your GA4 exports. You can explore it here.

Uncover in-depth insights

[GA4] BigQuery Export SQL Queries Library

Download now

Bonus for readers

[GA4] BigQuery Export SQL Queries Library

Utilize OWOX Reports Extension for Google Sheets for Automatically Updated Reports

The OWOX Reports Extension for Google Sheets is an invaluable tool for streamlining report creation and maintenance. It automates report update, ensuring that data remains current without manual intervention.

This enables you to focus more on analyzing insights rather than managing data refreshes, making your reporting more efficient and effective.

Report

Get BigQuery Reports in Seconds

Seamlessly generate and update reports in Google Sheets—no complex setup needed

Start Reporting Now

By utilizing this extension, you can fully harness the power of your GA4 data within BigQuery. Following the outlined steps in this article, you'll be able to extract meaningful insights, optimize your marketing strategies, and make informed business decisions based on accurate, up-to-date information.

FAQ

Expand all Close all
  • What are GA4 event parameters?

    GA4 event parameters are additional pieces of information that can be attached to events in Google Analytics 4 (GA4) to provide more context about user interactions. These parameters can be predefined (such as page_title or session_id) or custom (defined by you) to track specific details relevant to your business. They help you gain deeper insights by capturing the specifics of user actions, allowing for more detailed analysis in reports.

  • How do I link my GA4 property to BigQuery?

    To link GA4 to BigQuery, go to your GA4 property, navigate to 'Admin,' select 'BigQuery Linking,' and follow the prompts to link your BigQuery project. Choose your data streams and export frequency, then submit to start exporting GA4 data for advanced analysis.

  • Can I analyze real-time data from GA4 in BigQuery?

    Yes, by enabling streaming export in your GA4 property, you can analyze real-time data in BigQuery. This setup allows data to flow into BigQuery within minutes of user interactions, providing timely insights.

  • How do I create custom reports using GA4 event parameters in BigQuery?

    Create custom reports by writing SQL queries in BigQuery. Use SELECT, UNNEST, and WHERE clauses to filter and analyze GA4 event parameters. Customize your reports by grouping data, calculating metrics, and scheduling queries for automated updates.