What is raw data and how to use it
Vlada Malysheva, Creative Writer @ OWOX
What do vegetables and data have in common? Both offer more benefits in their raw form. While standard Google Analytics 4 reports can quickly satisfy your hunger, raw data lets you cook something unique and get fresh insights.
You probably have access to gigabytes of user data daily, but this amount doesn’t bring value until you make it work. In this article, we look at what raw data is, why it’s needed, and how to get it and use it.
Note: This article was originally created in 2020, and was updated in 2023 due to significant changes in privacy restrictions that form the new rules for data collection and user behavior analytics processes.
What is raw data and aggregated data?
The words “data” and “information” are often used as synonyms. However, these things are fundamentally different.
Data is fixed information about events and phenomena that is stored on some storage medium. Information is the result of processing data to solve specific tasks.
Raw data refers to unprocessed and unstructured information or facts that have been collected and recorded. It is the initial data that is gathered directly from various sources without any manipulation, organization, or analysis. Raw data can take many forms, including text, numbers, images, audio, or any other data type, and it may not be immediately useful or informative until it undergoes processing, cleaning, and transformation.
Aggregated data is the result of processing and summarizing raw data to provide higher-level insights, statistics, or condensed information. It involves combining, calculating, or transforming raw data to create a more manageable and meaningful dataset.
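To make the distinction concrete, here is a minimal Python sketch. The event rows, field names, and the `aggregate_by_event` helper are hypothetical and only illustrate the idea: raw data keeps one record per user action, while aggregated data keeps one summary row per group.

```python
from collections import defaultdict

# Hypothetical raw event rows: one record per user action, nothing summarized.
raw_events = [
    {"user_id": "u1", "event": "page_view", "revenue": 0.0},
    {"user_id": "u1", "event": "purchase", "revenue": 40.0},
    {"user_id": "u2", "event": "page_view", "revenue": 0.0},
    {"user_id": "u2", "event": "page_view", "revenue": 0.0},
    {"user_id": "u3", "event": "purchase", "revenue": 15.5},
]


def aggregate_by_event(events):
    """Summarize raw rows into one row per event type (count and total revenue)."""
    summary = defaultdict(lambda: {"count": 0, "revenue": 0.0})
    for row in events:
        summary[row["event"]]["count"] += 1
        summary[row["event"]]["revenue"] += row["revenue"]
    return dict(summary)


aggregated = aggregate_by_event(raw_events)
print(aggregated)
```

Once only the aggregated rows are stored, questions such as "which user made which purchase" can no longer be answered. That is the practical reason to keep the raw rows.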
Examples of Raw Data
Raw data in marketing often takes the form of customer behavior records. For instance, this could include customer IDs, impressions, clicks, comments, purchase dates, product SKUs, and transaction amounts.
This data can be collected from various sources, such as online sales platforms, point-of-sale systems, or customer surveys.
Analyzing raw data like this can help marketing analysts uncover trends, identify customer preferences, and develop targeted marketing strategies to enhance customer engagement and boost sales.
For example, you can collect data in a Google BigQuery repository, and when you run a SQL query on it, BigQuery provides information in response.
In informatics, analytics, marketing, and some other fields, this “data” and “information” have special names: raw (unprocessed) and aggregated (processed) data. How does aggregation work in Google Analytics 4 reports?
When the number of rows with values for one parameter exceeds the specified limit (50,000 per day and 1 million per period), the system aggregates the remaining values into a single (other) row.
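The effect of such a row limit can be sketched in a few lines of Python. This is an illustration only: the `collapse_to_other` function, the sample pages, and the rule of keeping the largest rows are assumptions, not a description of how Google Analytics 4 actually selects rows.

```python
def collapse_to_other(rows, limit):
    """Keep the `limit` largest rows and fold the rest into one "(other)" row.

    `rows` is a list of (dimension_value, count) pairs.
    """
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    kept, rest = ranked[:limit], ranked[limit:]
    if rest:
        # Everything past the limit loses its individual dimension value.
        kept.append(("(other)", sum(count for _, count in rest)))
    return kept


# Hypothetical page view counts per page path.
page_views = [("/home", 500), ("/pricing", 120), ("/blog/a", 30), ("/blog/b", 20), ("/blog/c", 5)]
report = collapse_to_other(page_views, limit=2)
print(report)  # [('/home', 500), ('/pricing', 120), ('(other)', 55)]
```

The total is preserved, but the three blog pages can no longer be told apart in the report.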
Here are some examples of raw data collected from various sources:
1. Website Data: For example, Google Analytics 4 collects raw data about all user interactions with your website like page views, user events, and transactions. It is then processed to generate reports by applying sampling, aggregation, and filters (if you’ve configured them).
2. Social Media Engagement Metrics: Raw data from social media platforms like Facebook, Twitter, and Instagram includes metrics such as likes, shares, comments, and click-through rates. Marketers use this data to gauge the effectiveness of their social media campaigns.
3. Customer Surveys: Raw survey responses provide valuable insights into customer preferences, satisfaction levels, and pain points. Marketing analysts can analyze this data to tailor products, services, and messaging to meet customer needs better.
4. Ad Impressions and Clicks: Data on ad impressions, clicks, and conversions from online advertising campaigns are crucial for evaluating ad performance and optimizing ad spend.
5. Email Marketing Data: Raw data from email marketing campaigns includes metrics like open rates, click-through rates, and unsubscribe rates. This data helps marketers refine their email strategies for better engagement.
6. Sales Transaction Data: Information on customer purchases, including product details, prices, and transaction dates. It is essential for assessing product performance and identifying opportunities for cross-selling or upselling.
7. Market Research Data: Data from market research studies, such as demographic information, consumer preferences, and industry trends, provides a broader context for marketing decisions.
Why raw data is needed
The less data you use for analysis, the less accurate your results are. Sampling can distort reporting and lead to inefficient decisions. As a result of sampling, you risk overlooking ads that make a profit or, conversely, spending money on inefficient campaigns.
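A small Python sketch shows how this can happen. The session data is hypothetical, and the sample is taken deterministically (every tenth session) only so that the result is reproducible; real sampling is typically random, but the risk is the same.

```python
# Hypothetical sessions as (campaign, revenue) pairs.
# Campaign "B" has few sessions, but two of them bring large orders.
sessions = [("A", 1.0)] * 95 + [("B", 0.0)] * 3 + [("B", 200.0)] * 2


def revenue_by_campaign(rows, scale=1):
    """Sum revenue per campaign, optionally scaling a sample up to the full size."""
    totals = {}
    for campaign, revenue in rows:
        totals[campaign] = totals.get(campaign, 0.0) + revenue * scale
    return totals


full = revenue_by_campaign(sessions)

# A 10% sample: every tenth session, scaled back up by a factor of 10.
sample = sessions[::10]
estimated = revenue_by_campaign(sample, scale=10)

print(full)       # {'A': 95.0, 'B': 400.0}
print(estimated)  # {'A': 100.0}
```

In the complete data, campaign B earns most of the revenue. In the sample, it does not appear at all, so a report built on the sample would suggest cutting the most profitable campaign.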
It’s convenient to work with aggregated information to track your website’s main KPIs, but it’s not enough to solve more complex problems. Only with raw data can you:
Perform a deep analysis of metrics and their dependencies
Track the user’s entire journey from first touch to purchase
Build any reports without the limits and restrictions of Google Analytics 4 and reveal valuable insights
Merge information from different sources and set up advanced analytics
Create complex sales funnels that match your business structure
Evaluate the mutual impact of channels on sales using funnel-based attribution models
Segment your audience and set up more precise targeting
Predict conversions using machine learning
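As one example from the list above, a custom sales funnel can be built directly from raw events. The sketch below is a simplified illustration: the step names, the sample events, and the `funnel_counts` helper are hypothetical, and the events are assumed to be in chronological order.

```python
# Hypothetical funnel definition; replace with the steps of your own business.
FUNNEL_STEPS = ["view_product", "add_to_cart", "checkout", "purchase"]

# Hypothetical raw events as (user_id, event_name), assumed to be in chronological order.
journey_events = [
    ("u1", "view_product"), ("u1", "add_to_cart"), ("u1", "checkout"), ("u1", "purchase"),
    ("u2", "view_product"), ("u2", "add_to_cart"),
    ("u3", "view_product"),
    ("u4", "add_to_cart"), ("u4", "purchase"),  # never viewed a product, so not counted
]


def funnel_counts(events, steps):
    """Count users who reached each step after completing all previous steps in order."""
    progress = {}  # user_id -> index of the next expected step
    for user_id, event in events:
        next_step = progress.get(user_id, 0)
        if next_step < len(steps) and event == steps[next_step]:
            progress[user_id] = next_step + 1
    return [sum(1 for reached in progress.values() if reached > i) for i in range(len(steps))]


counts = funnel_counts(journey_events, FUNNEL_STEPS)
print(dict(zip(FUNNEL_STEPS, counts)))
# {'view_product': 3, 'add_to_cart': 2, 'checkout': 1, 'purchase': 1}
```

Because the funnel is computed from individual events, you can change the steps or the ordering rule at any time and recalculate the funnel for historical data.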
Benefits of raw data
1. Make better decisions
By collecting statistics from your website to Google BigQuery or another tool, you can bypass sampling and other restrictions of Google Analytics 4. This will allow you to analyze your complete figures and make better decisions based on them.
See also: How to collect complete user behavior data on your website and cost data from advertising services with minimal resource expenses.
2. Build any reports without restrictions
Google Analytics 4 and all other analytics systems limit your ability to generate reports. For example, these systems place limits on the number of parameters and dimensions as well as their compatibility with each other. With access to raw data, you can build reports with any number and combination of metrics you need.
For example, you can conduct cohort analysis in terms of dimensions that are relevant to your business.
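A cohort report of this kind can be sketched in Python as follows. The purchase records and the choice of cohort (month of first purchase) are assumptions for illustration; with raw data you could group by any dimension you collect.

```python
from collections import defaultdict

# Hypothetical raw purchase records: (user_id, "YYYY-MM" of the purchase).
purchases = [
    ("u1", "2023-01"), ("u1", "2023-02"), ("u1", "2023-03"),
    ("u2", "2023-01"),
    ("u3", "2023-02"), ("u3", "2023-03"),
]


def month_index(year_month):
    """Convert "YYYY-MM" to a running month number so months can be subtracted."""
    year, month = map(int, year_month.split("-"))
    return year * 12 + month


def cohort_table(rows):
    """Map (cohort month, months since first purchase) to the number of active users."""
    first = {}
    for user, ym in rows:
        if user not in first or month_index(ym) < month_index(first[user]):
            first[user] = ym
    active = defaultdict(set)
    for user, ym in rows:
        offset = month_index(ym) - month_index(first[user])
        active[(first[user], offset)].add(user)
    return {key: len(users) for key, users in sorted(active.items())}


cohorts = cohort_table(purchases)
for (cohort, offset), users in cohorts.items():
    print(cohort, "month", offset, "active users:", users)
```

Here the January cohort starts with two buyers, and one of them is still purchasing two months later.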
3. Run advanced analytics based on your rules
You can combine website statistics with information from advertising services, call tracking systems, emails, and your CRM to set up advanced analytics.
With these statistics, you can consider all user touchpoints with your company, analyze the conversion paths, evaluate the impact of all marketing efforts (both online and offline) on business indicators, find the most effective marketing channels, and quickly optimize those that lose money.
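The sketch below shows the idea of merging sources with SQL. It uses Python's built-in `sqlite3` module as a local stand-in for a data warehouse such as Google BigQuery. The table names, columns, and values are hypothetical, and the example assumes one session per user; real attribution across many sessions needs more careful logic.

```python
import sqlite3

# In-memory database as a stand-in for a data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site_sessions (user_id TEXT, campaign TEXT);  -- website data
    CREATE TABLE crm_orders (user_id TEXT, amount REAL);       -- CRM data
    CREATE TABLE ad_costs (campaign TEXT, cost REAL);          -- advertising service data
    INSERT INTO site_sessions VALUES ('u1', 'search'), ('u2', 'search'), ('u3', 'social');
    INSERT INTO crm_orders VALUES ('u1', 100.0), ('u3', 30.0), ('u3', 20.0);
    INSERT INTO ad_costs VALUES ('search', 40.0), ('social', 50.0);
""")

# Join costs, sessions, and orders to compare spend with revenue per campaign.
query = """
    SELECT c.campaign,
           c.cost,
           COALESCE(SUM(o.amount), 0) AS revenue
    FROM ad_costs AS c
    LEFT JOIN site_sessions AS s ON s.campaign = c.campaign
    LEFT JOIN crm_orders AS o ON o.user_id = s.user_id
    GROUP BY c.campaign, c.cost
    ORDER BY c.campaign
"""
campaign_report = conn.execute(query).fetchall()
for campaign, cost, revenue in campaign_report:
    print(campaign, "cost:", cost, "revenue:", revenue)
```

In this toy data, the search campaign returns 100.0 in revenue for 40.0 in cost, while the social campaign only breaks even. None of the three sources could show that on its own.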
See also: How to use analytics not only to create reports but to avoid depleting your budget.
4. Target users more accurately
Using raw data, you can segment users based on their actions on your website (browsing pages, clicking links, adding items to the shopping cart, etc.), then send them triggered emails. In addition, you can automatically upload audiences to advertising services to launch remarketing campaigns and set up a bid management strategy for each audience segment.
Audience segmentation helps you make ads more relevant, increase customer conversions and loyalty, optimize your marketing strategy, and reduce costs.
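As a minimal sketch of action-based segmentation, the Python snippet below finds users who added an item to the cart but never purchased. The event names and the `cart_abandoners` helper are hypothetical; the resulting list is the kind of audience you might use for a triggered email or a remarketing campaign.

```python
# Hypothetical raw events as (user_id, event_name).
site_events = [
    ("u1", "add_to_cart"), ("u1", "purchase"),
    ("u2", "add_to_cart"),
    ("u3", "page_view"),
    ("u4", "add_to_cart"), ("u4", "page_view"),
]


def cart_abandoners(events):
    """Return users who added something to the cart but have no purchase event."""
    added = {user for user, event in events if event == "add_to_cart"}
    bought = {user for user, event in events if event == "purchase"}
    return sorted(added - bought)


segment = cart_abandoners(site_events)
print(segment)  # ['u2', 'u4']
```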
5. Protect against bots and fraud in CPA networks
Only with raw data can you detect suspicious activity on your website, such as too many registrations per day. It also lets you identify unscrupulous CPA partners who may substitute the traffic source on the application page.
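A very simple check for registration spikes could look like the sketch below. The dates, the `suspicious_days` helper, and the rule "more than three times the median day" are assumptions chosen for illustration; a production check would need thresholds tuned to your own traffic.

```python
from collections import Counter
from statistics import median

# Hypothetical raw registration records, reduced to the registration date.
registration_dates = (
    ["2023-05-01"] * 3 + ["2023-05-02"] * 4 + ["2023-05-03"] * 3
    + ["2023-05-04"] * 20 + ["2023-05-05"] * 4
)


def suspicious_days(dates, factor=3.0):
    """Return days whose registration count exceeds `factor` times the median day."""
    per_day = Counter(dates)
    typical = median(per_day.values())
    return sorted(day for day, count in per_day.items() if count > factor * typical)


flagged = suspicious_days(registration_dates)
print(flagged)  # ['2023-05-04']
```

With the raw rows at hand, you can then inspect the flagged day in detail, for example by traffic source or partner ID.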
Here is how we tackled fraud in CPA networks for Raiffeisen Bank International.
6. Avoid risks of vendor lock-in
By collecting data in Google BigQuery, you’re independent of ETL services and other tools that you use. This means you can benefit from your statistics even if you decide to disconnect from a service and use your own solution.
Steps to Use Raw Data
Using raw data involves several steps, depending on the specific needs and objectives of the analysis. Here are the detailed steps on how to work with raw data:
Step 1. Data Collection
This step involves collecting data from various sources, such as databases, APIs, surveys, sensors, or other relevant data repositories.
The collected data can come in various formats, such as spreadsheets, text files, and databases, or as unstructured data such as images and text.
Identify and retrieve the data needed for your analysis, making sure it's comprehensive, accurate, and relevant to your research or project.
Step 2. Data Preparation
After collecting the data, it often needs cleaning and preprocessing. This step involves:
Data Cleaning: Identifying and handling missing or erroneous values, duplicates, and outliers.
Data Transformation: Converting the cleaned data into a suitable format (e.g., converting text to numerical values) and scaling features.
Data Integration: Combining data from multiple sources if necessary.
Data Reduction: Reducing data dimensionality if it's too complex.
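The cleaning and transformation parts of this step can be sketched as follows. The order records and the rules in `prepare` (deduplicate by `order_id`, drop rows without an amount, convert text to numbers) are assumptions for illustration; the right rules depend on your data.

```python
# Hypothetical raw rows as they might arrive from an export: everything is text.
raw_rows = [
    {"order_id": "1001", "amount": "49.90", "country": "DE"},
    {"order_id": "1001", "amount": "49.90", "country": "DE"},  # duplicate
    {"order_id": "1002", "amount": "", "country": "FR"},       # missing amount
    {"order_id": "1003", "amount": "15", "country": " us "},   # untidy text
]


def prepare(rows):
    """Deduplicate by order_id, drop rows without an amount, and normalize types."""
    seen, cleaned = set(), []
    for row in rows:
        if row["order_id"] in seen or not row["amount"].strip():
            continue
        seen.add(row["order_id"])
        cleaned.append({
            "order_id": int(row["order_id"]),          # text -> integer
            "amount": float(row["amount"]),            # text -> number
            "country": row["country"].strip().upper(), # consistent country codes
        })
    return cleaned


clean_rows = prepare(raw_rows)
print(clean_rows)
```

Dropping rows with missing values is only one option. Depending on the analysis, you might instead fill them in or flag them for review.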
Step 3. Data Analysis
This is the core of data processing: you apply various statistical, machine learning, or analytical techniques to extract insights, patterns, and relationships from the data.
Step 4. Data Visualization
Data visualization is crucial for presenting the findings effectively. Create visual representations like charts, graphs, and plots to make complex data more understandable.
Visualization can reveal trends, anomalies, and patterns that might not be apparent in raw data or numerical analysis alone.
Step 5. Reporting
Reporting involves documenting the methods, findings, and conclusions clearly and concisely.
It may include recommendations or actions based on the analysis and visualizations.
Where to store raw data
To collect, store, and process raw data, we recommend using Google BigQuery cloud storage because it:
Allows you to upload large amounts of information and quickly process it with SQL
Scales flexibly and provides more opportunities as your business grows
Guarantees security and gives you full control over access to your project with your Google account and two-factor authentication
Seamlessly integrates with other Google products and popular visualization and reporting systems
See also: What problems might you encounter when building reports in Google Analytics and how can you solve them using Google BigQuery?
How to collect raw data with OWOX BI
OWOX BI collects raw data to Google BigQuery directly from your website. This service isn’t constrained by the limitations of Google Analytics 4, allowing you to build reports without sampling and according to any parameters.
At the same time, OWOX BI uses a data structure compatible with Google Analytics 4, meaning you can run any SQL queries written for Google Analytics 4. This saves time for your team when preparing reports.
OWOX BI sends complete, non-aggregated statistics on user behavior from your website to Google BigQuery. It also supports events of unlimited size. As a result, you’ll get a full picture of user activity on your website.
In addition, with OWOX BI, you can collect an unlimited number of user parameters and dimensions in Google BigQuery. You can use them to segment users by any feature and build deep reports for detailed analysis.
To collect raw data from your website to Google BigQuery:
Sign in to OWOX BI using your Google account.
Provide the necessary account access and create data pipelines.
Copy the tracking code and add it to your website in a way that’s convenient for you.
Building reports on raw data with OWOX BI
OWOX BI not only collects raw data from your website but automatically combines it with statistics from advertising services, call tracking systems, email systems, and CRMs so you can receive reports without the help of analysts or knowledge of SQL.
With the simple report builder in the OWOX BI Smart Data service, you can select the necessary metrics and build any reports on advertising campaigns, ROPO, RFM, LTV, and cohort analysis.
Benefits of OWOX BI Smart Data
Build reports without technical training
Do you regularly need ad campaign reports but lack the time to learn SQL or wait for a response from an analyst? With OWOX BI, you don’t need to understand the data structure. Simply select the parameters and key metrics you want to see in your report and the Smart Data Report Builder will instantly provide you with an understandable graph and table.
Focus on your business, not your data sources and structure
OWOX BI provides figures for reports when you need them and doesn’t limit you to ready-made dashboards. Connect your information to the model once and spend the rest of your time on analysis and conclusions.
Each report is based on your business model
Our specialists will set up a data model that takes into account the peculiarities of your business. You’ll be able to assess the impact of all marketing efforts — both online and offline — on your business performance.
In addition, you can use data collected by OWOX BI to create reports in Google Sheets, Looker Studio (formerly known as Google Data Studio), Power BI, Tableau, and other visualization systems that integrate with BigQuery.
If you need reports tailored to your business, OWOX BI analysts are ready to help you set them up. Sign up for a demo to discuss how OWOX BI can help you grow your business with confidence.
Where to store raw data
To collect, store, and process raw data, we recommend using Google BigQuery cloud storage because it:
- Allows you to upload large amounts of information and quickly process it with SQL
- Scales flexibly and provides more opportunities as your business grows
- Guarantees security and gives you full control over access to your project with your Google account and two-factor authentication
- Allows you to pay only for the volume of statistics collected and processed
- Seamlessly integrates with other Google products and popular visualization and reporting systems
Why you need to gather raw unsampled data
Google Analytics is an undisputed leader among web analytics services. It’s free, easy to work with, and it provides insights into the key KPIs of online businesses. However, there are limitations in the system that prevent you from getting deeper into the data and exploring it from all sides.
1. The data you see in Google Analytics reports is always aggregated, and you can’t control this process.
2. Sampling can seriously distort your data and lead to wrong business decisions.
3. Reports can contain only a limited number and only specific combinations of parameters and key metrics.
4. The number of rows in a report is limited.
5. Data processing time: if you use the free version of Google Analytics, you may need to wait 24 to 48 hours for the system to finish processing data.
4 ways to gather raw data
1. OWOX BI Pipeline.
2. Use Google Analytics APIs.
3. BigQuery export for Google Analytics 360.
4. Build your own connector.