All resources

What Is the NORMALIZE Function in BigQuery?

The NORMALIZE Function in BigQuery converts text strings into a standardized Unicode form, ensuring consistent representation of characters across datasets.

The NORMALIZE Function in BigQuery helps eliminate discrepancies caused by variations in text encoding or accent marks. By unifying text data, the NORMALIZE Function supports cleaner comparisons, searches, and transformations, critical for analytics, reporting, and multilingual data management.

Why You Should Use the NORMALIZE Function in BigQuery

Using the NORMALIZE Function ensures that text data is consistent and comparable, regardless of how it was entered or sourced.

  • Improves Data Accuracy: Prevents mismatches caused by inconsistent encoding or special characters.
  • Simplifies Data Integration: Aligns text from multiple systems or languages.
  • Enhances Search Operations: Ensures that queries return accurate results even with accented or alternate characters.
  • Supports Multilingual Data: Handles inputs with global alphabets and diacritics seamlessly.

With normalized text, data analysts can ensure better consistency in joins, filters, and aggregations.

How the NORMALIZE Function Works in BigQuery

The NORMALIZE Function reformats text according to Unicode normalization forms, making visually identical strings structurally consistent.
Process Overview:

  1. The function takes an input string and applies a specified normalization form.
  2. It then outputs a standardized version of the string that’s uniform across systems.
  3. Common normalization forms include NFC, NFD, NFKC, and NFKD.

This process ensures that equivalent characters—like accented letters, are treated as the same value for analysis and comparison.

NORMALIZE Function Syntax and Optional Parameters

The NORMALIZE Function can be customized to use different Unicode normalization forms.
Syntax:

NORMALIZE(value[, form])
  • value: The string to normalize.
  • form: (Optional) The normalization type. Available options are NFC, NFD, NFKC, and NFKD.

If no form is specified, BigQuery defaults to NFC. Each form handles text differently, such as combining or decomposing accented characters for consistent encoding.

Example of Using the NORMALIZE Function in BigQuery

Here’s an example showing how NORMALIZE standardizes text data:

SELECT NORMALIZE('e\u0301', 'NFC') AS normalized_text;

Result:

é


In this case, the function converts the decomposed version of the accented “e” (e + ́) into its single combined form (é). This ensures that comparisons between visually identical but structurally different texts yield consistent results.

Best Practices for Using the NORMALIZE Function in BigQuery

Follow these best practices to use the NORMALIZE Function effectively in BigQuery:

  • Identify Encoding Issues: Audit text fields from various sources before normalization.
  • Choose the Right Form: Use NFC for general normalization and NFKC for broader compatibility.
  • Integrate Early: Normalize data during ingestion or preprocessing to avoid mismatches later.
  • Test Comparisons: Verify outputs after normalization to confirm consistency.
  • Combine with CASEFOLD: Use NORMALIZE_AND_CASEFOLD when you also need case-insensitive comparison.

These guidelines help ensure cleaner, more accurate text processing in analytical workflows.

Standardize Your Data with OWOX Data Marts

OWOX Data Marts Cloud automates SQL transformations like NORMALIZE to ensure consistent, high-quality text data across analytics environments. It allows analysts to define standardized data logic, manage refreshes, and deliver clean, comparable datasets to Sheets or BI tools. With built-in governance and automation, OWOX helps teams maintain accuracy and trust across global, multilingual data pipelines.

Enable Self-Service Analytics on top of your BigQuery Data
Get Started Free
Glossary terms

Learn more about analytics

Quick & easy explanations of the most important data terms

See all terms →
From the blog

Learn how teams ship analytics faster

Deep dives on data marts, governance, and modern reporting workflows.

See all articles →
What users are saying

Not testimonials. Comment threads.

From people who actually use the product. Each quote is attached to a specific claim.

A1
· re: warehouse integration
KP
Katya P.
BI Manager

Finally, a tool that doesn't ask business users to learn a new dashboarding UI. Our marketing team already knows Sheets. OWOX just delivers the right data.

C3
· re: governance
MR
Marco R.
Head of Data

Joinable data marts concept was the thing that sold us. We can now use the semantic layer without building one.

E7
· re: open source
JC
James C.
Data Analyst

Self-hosted the OSS version on Digital Ocean. Zero vendor lock-in. Contributed a Shopify connector back in week two.

Google Sheets in modern analytics

Google Sheets, powered by governed data marts

Google Sheets were never designed to be a system of record. With OWOX Data Marts, Sheets becomes a trusted analysis layer — powered by governed data marts defined upstream in your warehouse.

Business teams keep the flexibility they love
Data teams retain control over logic and definitions
No more fragile joins duplicated across spreadsheets
See how it works