A detailed guide to cohort analysis in Google Analytics and Google Sheets
Obviously, first-time buyers and repeat customers behave differently on your website and bring in different amounts of revenue. With the right marketing strategy, these differences can be very beneficial for your business. But how can you identify patterns and understand what distinguishes people who first visited your site a month ago from those who first visited a year ago? Cohort analysis lets you do this.
In this article, we’ll tell you what cohort analysis is and what it’s for. We’ll talk about the possibilities and limitations of cohort analysis in Google Analytics and describe in detail four ways of conducting a cohort analysis in Google Sheets.
Table of contents
What cohort analysis is and where to apply it
A cohort, in marketing terms, is a group of people who perform a certain action in a given period of time: for example, a group of people who first visit your site in December or make their first purchase between November 1–7. This action distinguishes the cohort from all your other customers.
The whole idea of cohort analysis is to divide users into groups based on certain criteria and examine how the behavior of these groups changes over time. Cohort analysis helps you understand how your marketing efforts affect key performance indicators: conversion and retention rates, LTV, ROI, CAC, etc.
Let’s review a few examples of what you can do using cohort analysis.
1. More accurately evaluate the ROI of ad campaigns
Not all users who visit your site for the first time immediately complete a conversion action. Some people need time to study your offer, review the pros and cons, compare it with competitors, and then decide whether to buy, subscribe, or register. This is especially true for businesses with delayed conversions, long sales cycles, and expensive products.
The longer the period from the first visit to conversion, the higher the chance of incorrectly evaluating the effectiveness of your advertising channels. Cohort analysis helps solve this problem. Let’s say that in November you launch a paid ad campaign in Google Ads. A month later, you review the results and see that ROI is below 100%. Your first thought is to turn off the ad. But let’s not jump to conclusions. Instead, let’s wait and see for a couple more months, just as an experiment. If you analyse the cohort of users who were attracted by this ad campaign in November and users at the same ROI report for January, you might see that ROI has grown significantly. This means people were thinking for more than a month but eventually made a purchase.
2. Find and retain loyal customers
With the help of cohort analysis, you can find where your most loyal customers come from. For example, you can create a cohort of visitors who logged in to your site for the first time between January and June. Then you can segment them by attraction channel and compare the retention rate or repeat purchase rate by month for each channel. It’s clear that channels with higher rates are more worthy of your attention and investment.
By analyzing cohort behavior, you can better plan your marketing campaigns. For example, you may notice that the user activity in a cohort falls sharply in the fifth month. You can then send an email with a personal discount to warm up their interest.
3. Predict and increase LTV
Lifetime value (LTV) is the revenue that a company receives from a customer over the course of their relationship. This metric is difficult to calculate until someone ceases to be your customer. However, it’s possible to calculate the revenue from a single cohort over one month and predict how much money this cohort will bring in over a longer period with the help of LTV calculations.
What’s more, you can segment cohorts across ad channels to compare LTV and customer acquisition cost (CAC). This will help you to understand the ROI period for each channel and redistribute your marketing budget more effectively.
4. Conduct A/B testing
Cohort analysis can also be used to evaluate the results of A/B tests. Let’s say you’ve updated the button design on your site. To find out whether this has affected the conversion rate, you create two variants of one page. Half of your users see the old button (part A), the other half see the new one (part B). Next, you consider the A/B conversion ratio based on button clicks.
However, a usual A/B test doesn’t answer the question of how a new design will affect conversions in the long run. You can use cohort analysis to find this out. Starting three to four weeks after the release, build a cohort report that considers users who first came to the site from a page with the new button. Then, compare the conversion rate of these users against a cohort who didn’t see the new design. This difference will clearly show you how the new design has affected conversions.
5. Analyze the performance of mobile apps
Cohort analysis takes a special place in mobile app analytics, where it helps to assess the user retention rate based on app version. Using cohort analysis, you also can determine which sources bring the most active users:
What do you need for cohort analysis?
No matter in which system you plan to build a cohort report, you’ll need to consider four key parameters:
- The attribute that defines the cohort: date of the first visit to the site, date of purchase or registration, etc.
- Cohort size: the time interval over which to define your cohort — days, weeks, or months
- Report period: the time during which you’ll analyze the cohort’s behavior
- The key metric for analysis: ROI, retention rate, LTV, etc.
The choice of an attribute to define a cohort with depends on what indicator you want to analyze and improve. Need to calculate the retention rate and stop the outflow of users? Explore the behavior of people who first visited, registered, or installed the app at the same time. Want to calculate and increase LTV? Form cohorts by date of first purchase.
This doesn’t mean that there should be only one attribute that defines a cohort. You can use any conditions to create a cohort according to your needs. For example, in a cohort, you can identify users who both registered in July and made their first purchase that same month.
Cohort analysis in Google Analytics
To create a report on cohorts in Google Analytics, go to the Audience —> Cohort Analysis menu. At the very top, you’ll see these settings:
Here you can set the four parameters mentioned above: the cohort type (which is also the attribute that defines the cohort), cohort size, an indicator to analyze, and a reporting period.
Since cohort analysis in Google Analytics is in beta testing, there are certain limitations:
You can create cohorts only by date of the first visit.
- You can analyze only one indicator in one cohort report. In total, there are 14 indicators available in the settings.
- There are three options available in the Cohort size field: day, week, and month.
- You can’t set your own reporting period, since the date range is limited and depends on the cohort size. For example, the maximum reporting period for cohorts created by day is 30 days; by week, 12 weeks; and by month, 3 months.
- You can’t filter data in the standard settings, for example by traffic type, device type, or another parameter. For this, you need to use segments.
There’s a graph under the settings that shows the dynamics of an analyzed indicator for all users and offers three cohorts to choose from.
In the screenshot above, we see that the customer retention rate for all users during the third week is 4.83%, and for the cohort of users who first visited the site from February 3–9 the retention rate is 0.99%.
Under the graph, you’ll see a table with data for each cohort for the entire reporting period. Let’s figure out how to interpret it. For example, let’s choose:
- Indicator — customer retention rate
- Cohort size — by week
- Data range — the last 6 weeks
As a result, we get this table:
In the first column, we see how many people visited the site during the reporting period as well as the number of users in each weekly cohort. The remaining columns show how the retention rate has changed for all users and for each cohort from week to week. The higher the retention rate, the darker the color of the cell.
From the screenshot above, it’s clear that from December 30 through January 5, 6,604 new users logged on to the site. A week later, only 4.47% of them returned to the site; after two weeks, 2.32% returned, and so on.
In general, cohort analysis in Google Analytics helps you understand your data in more detail. For example, you might see that sales per quarter are growing due to the inflow of new customers. However, if you dig deeper and look at the cohort report by weeks, you may notice that sales in all cohorts fall sharply on the sixth week. Now you know when to launch remarketing.
Cohort reports in other services
In addition to Google Analytics, built-in cohort reports are available in many advertising and analytics services. They are found in almost all analytics systems for mobile applications.
AppsFlyer has more flexible settings than Google Analytics, allowing you to add multiple filters to a cohort report. You can also set a minimum cohort size to exclude cohorts with too few users.
Cohort reports in AppMetrica (Yandex.Metrica for mobile applications) and Adjust are used mainly to track retained users. In Adjust, you can add a second indicator for analysis, for example the number of sessions per user:
An example of a cohort report in AppMetrica:
Mixpanel and Kissmetrics also support cohort analysis. These are also analytics systems for websites and mobile applications that work with data at the scale of users, not visits and page views. The report settings in these services are very similar to the settings in Google Analytics, but there are some differences.
In Kissmetrics, you can form a cohort by two attributes at a time. For example, you can build a cohort of users who visited the site and converted to a subscription. You can also group people not only by time, as is commonly done in cohort reports, but also by any other attribute such as purchase amount, location, and traffic source.
An example of a cohort report in Kissmetrics:
Cohort reports in OWOX BI Smart Data
Recently, we added cohort analysis to our own OWOX BI Smart Data service for data visualization. As a data source, OWOX BI Smart Data uses your Google BigQuery project, to which you can upload information from Google Analytics, advertising services, and CRM and ERP systems.
To get a cohort report in Smart Data, simply type the word «cohort» in the search line and the system will show you a drop-down menu in which you can select your desired metric, cohort size, and reporting period:
As of now, OWOX BI Smart Data can only create cohorts by user acquisition date. But in the near future, we’ll add more attributes.
An example of a cohort report in Smart Data:
By the way, you can examine cohort reports and other OWOX BI features here:
Cohort reports in Google Sheets
If you lack the cohort analysis capabilities you need in Google Analytics and you don’t use other analytics systems, you can build a cohort report manually in Google Sheets. You can do this in one of four ways, which we’ll discuss in more detail below — namely pivot tables, formulas, filter formulas, and Apps Script.
To build a cohort report using the first three methods, you should prepare and upload the source data to Google Sheets. This data should be structured in the following way:
It’s important that in the first column you have a start month/week/day for cohort formation (registration_week), in the second column you have the transaction month/week/day (transaction_week), and then any columns (purchases, in this case) with data on the metric you’re analyzing.
You can upload data to Google Sheets in any convenient way. In our example, we use raw data that we export from Google Analytics to Google BigQuery with the help of OWOX BI Pipeline. Then, using a SQL query, we select the necessary information:
SELECT registration_week, transaction_week, SUM ( purchases ) AS purchases FROM ( SELECT IFNULL ( user.id, clientId ) AS userID, STRING ( YEAR ( date )) + ’-’ + RIGHT ( ’0′ + STRING ( WEEK ( date )), 2 ) AS registration_week FROM TABLE_DATE_RANGE ( [projectID:datasetID.session_streaming_], TIMESTAMP ( ’2017-08-01′ ), CURRENT_TIMESTAMP ()) WHERE hits.eventInfo.eventCategory = ’registration’ ) AS t1 LEFT JOIN EACH ( SELECT IFNULL ( user.id, clientId ) AS userID, EXACT_COUNT_DISTINCT ( hits.transaction.transactionId ) AS purchases, STRING ( YEAR ( date )) + ’-’ + RIGHT ( ’0′ + STRING ( MONTH ( date )), 2 ) AS transaction_week FROM TABLE_DATE_RANGE ( [projectID:datasetID.session_streaming_], TIMESTAMP ( ’2017-08-01′ ), CURRENT_TIMESTAMP()) WHERE hits.eCommerceAction.action_type = ’purchase’ GROUP BY userID, transaction_week) AS t2 ON t1.userID = t2.userID WHERE REPLACE ( registration_week, ’-’, ’’ ) <= REPLACE ( transaction_week, ’-’, ’’ ) GROUP BY registration_week, transaction_week, ORDER BY registration_week, transaction_week,
In the first part of the request, we select users depending on the week in which they registered. So the cohort formation attribute is registration, and the cohort size is a week.
In the second part, we choose the number of purchases per week. This is the metric we want to analyze. We link all parts of the request by User ID.
Then we upload the data to Google Sheets using the OWOX BI BigQuery Reports Add-on. One of the advantages of this add-on is that you can set calculations to happen at whatever frequency you need and the report will be updated automatically.
1. Cohort report using pivot tables
The easiest way to build a cohort report is with a pivot table. Remember that in Google Sheets, we already have data in the necessary structure: column A is the week of registration, column B is the week of transactions, and column C is the number of purchases. Now select the relevant date range, open the Data tab, and select Pivot Table. You’ll see a panel with settings on the right:
In the Rows block, select the week of registration; in the Columns block, select the transaction week (additional columns show the following weeks); and in the Values block, select the number of purchases. Then apply conditional formatting to make the report more convenient and you’ll get this table:
If you’ve analyzed several metrics, the report may look like this:
You can use filters in the pivot table, but it’s not convenient. Every time you want to filter the data, you’ll have to open the table settings and look for the value you need.
2. Cohort report using formulas
The same cohort report can be built using three simple formulas. For this, we create a new sheet in Google
Sheets and apply these formulas:
We insert this formula into cell A2. It pulls all weeks of cohort formation from the source data table into column A.
This formula is used in cell B2. It pulls all transaction weeks from the column in the source data table and converts them into a row.
We insert this formula into cell B3 at the intersection of the week of registration and the week of the transaction. This formula summarizes all purchases from column C of the source data table.
Note that Example in these formulas is the name of our spreadsheet in Google Sheets with the source data. Next, we need to stretch these formulas over the entire sheet and apply conditional formatting. Then the report is ready.
3. Cohort report using formulas and filters
If you want to conveniently filter data in the report by any parameter, the third formula should be made a bit more complicated. Let’s figure out how to do it.
First, you need to add source data parameters to the table to filter the data for the cohort report. In our example, we have a traffic source, a channel, and a campaign name. You can add any parameter you need: for example, location data (continent, country, city), mobile application version, product category, or something else.
To make the most lifelike example, we add more analyzed metrics to the report and also change the cohort size from week to month. As a result, we get a table with the source data:
After that, create an additional sheet and pull all registration (cohort formation) months from the source data table using the first formula: = UNIQUE (Example! A: A). This will make it easier to access the data. Let’s name this sheet «Month.»
Later on, we’ll add a new sheet that will contain our report and create two filters by Source/Medium and Campaign. To do this, open the Data —> Data Validation tab. In the Cell range field, enter the cell address to specify which cell will display the filter drop-down list. In the Criteria field, leave the List from a range condition and specify the column address with the necessary parameters from the source data table. Click Save.
As a result, we get this filter:
Then, using the formula =Month!A$2, we pull up (in column A) the months of registration from the Month sheet. Each value must be repeated as many times as there are analyzed metrics (in our example, we have three metrics). That’s why we pulled up months of cohort formation in a separate sheet.
Since we have only three parameters in our example (Users, Revenue, Cost) and they’re the same for all cohorts, we enter the names manually in order not to use formulas to pull them from another sheet.
In cell C2, we use the formula =TRANSPOSE(UNIQUE (Example2! B2:B)), which pulls up months of transactions from the source data table and converts them into a row:
And finally, at the intersection of the month of registration, the subsequent month, and a specific metric, we insert our third formula:
This formula calculates the required metric in the desired cell based on filters.
Since we have two filters, we need to include all possible combinations in the formula:
- Both filters are empty
- Both filters are filled
- Only the first filter is filled
- Only the second filter is filled
For each metric, the formula in the cell remains unchanged; only the column by which the sum of a specific metric is calculated changes. For example, for the Users metric, we calculate the sum of the column Example2!$E:$E; for the revenue metric, the column Example2!$H:$H; and for the Cost metric, the column Example2!$I:$I.
If you want to have more than two filters in your report, you’ll need to increase the number of conditions and the size of the formula. You can learn how to do this from the webinar «Cohort Analysis 101.» In this webinar, you’ll hear our analyst Anastasiya Chausova explain in detail what part of the formula is responsible for what. Anastasiya shows how to build all the reports described in this article in Google Sheets. Fill out the form below and we’ll email you a link to the webinar, a presentation, and sample reports.
4. Cohort report using Google Apps Script
Google Apps Script is a programming language that allows you to add features and process data in Sheets and other Google services. To build a cohort report using Apps Script, you don’t need to transfer data to Google Sheets. The table with the source data is created in Google BigQuery, and from there the script exports it to the report.
- Create a menu on the Google Sheets toolbar to run your own calculations in a click.
- Execute data queries in BigQuery and access the results of these queries.
- Create cells that will be used as filters. Parameters in such cells must be entered manually. This means the filter won’t have a drop-down list, as in a report built using formulas. You specify the value by which you want to filter the data in the cell yourself, and this value is used in the SQL query to the data in Google BigQuery.
- Change the way the report is displayed: apply conditional formatting, make the necessary breaks and indents, change the order of items, flip tables, and much more.
To connect Apps Script to Google Sheets, open the Tools —> Script editor tab. In the opened editor, select Resources —> Additional functions of Google Services and activate the BigQuery API and the Google Sheets API:
Then go to Google Cloud Platform to activate these APIs there as well. After that, you can run scripts in the Script editor that will take data from Google BigQuery and generate cohort reports in Google Sheets.
We wrote a script template for a report with one parameter and two filters. You can create a data table in Google BigQuery with the structure described above. Then you can insert a project name into the script template as well as tables in BigQuery and get a ready-made report:
In addition, you can easily modify the script template to build a report with the fields and number of metrics you need. Here’s an example of how Boodmo did it. Fill out the form above and we’ll email you the script template for cohort analysis.
If we compare the methods of cohort analysis by complexity, then the simplest option is Google Analytics. However, its capabilities and settings are limited. The second easiest method is pivot tables in Google Sheets. Here you can form cohorts by any parameter, not just by the date of the first visit as in Google Analytics. Options with formulas and Apps Script are a bit more complicated, as they require additional knowledge.
If you need to use filters often, the most convenient option is to create a report using formulas: then all filters will be in a drop-down list. Sure, you can use filters in all Google Sheets reports; but in the pivot table, you need to go to the table settings. With the Apps Script method, you have to specify the filter values manually in each cell. If you make a mistake, your SQL query will return nothing.
We’ve prepared a table comparing the pros and cons of all the cohort analysis methods described in this article:
If you have any questions, leave a comment below and we’ll be happy to answer!