All resources

Mastering Data Transformation and Cleaning in Google Sheets with the QUERY Function

Cleaning and organizing data in spreadsheets can often feel like a never-ending task, especially when you're working with large or messy datasets. Manually sorting, filtering, or removing duplicates takes time and increases the chance of mistakes. 

That’s where the QUERY function in Google Sheets can help. It allows you to pull specific information, sort records, and clean up your data using just a few simple instructions.

i-radius

Whether you're a data analyst managing reports, a marketing manager tracking campaign results, or a small business owner handling customer data, learning how to use QUERY can make your workflow faster and more accurate. This guide will show you exactly how to use it to transform your raw data into clean, organized insights, all within Google Sheets.

Why Use the QUERY Function for Data Transformation and Cleaning?

The QUERY function is one of the most powerful tools in Google Sheets because it lets you filter, sort, and organize your data quickly using simple, SQL-like commands. Instead of scrolling through rows and manually applying filters, QUERY pulls exactly what you need in just one step.

It helps you avoid repetitive tasks like manual sorting or copying filtered data. You can also transform your dataset by grouping, summarizing, or formatting data on the fly. All of this happens inside Google Sheets, making your reporting and analysis much faster and more efficient, with no extra tools or complex formulas needed.

Syntax of the QUERY Function

The QUERY function helps you extract and organize data in Google Sheets using a simple, SQL-like language. Below is the syntax and a breakdown of each part:

Syntax:

1=QUERY(data, query, [headers])

Here's what each parameter means:

  • data: The range of cells where you want to apply the query.
  • query: The instruction written in SQL-like language that tells Google Sheets what to retrieve.
  • [headers] (optional): Indicates how many rows at the top of the data range are headers. If not provided, Google Sheets will try to detect it automatically.

Example of the QUERY Function

Suppose you have a list of customer orders, and you want to view only those that have been delivered. This helps quickly filter completed orders in the sheet. To display only delivered orders, use the following formula:

1=QUERY(B2:E12, "SELECT * WHERE E = 'Delivered'", 1)

Here's what each parameter means:

  • B2:E12: This is the full range of your dataset, including headers.
  • SELECT *: Selects all columns.
  • WHERE E = 'Delivered': Filters rows where the Status column (column E) is "Delivered".
  • 1: Indicates that the first row contains the headers.
 Filtering rows in Google Sheets using the QUERY function to return entries that meet a specific condition in a column. i-shadow

This filters the data to show only rows where the Status is "Delivered". 

Practical Applications of the QUERY Function for Data Transformation and Cleaning

The QUERY function isn’t just for filtering, it can clean, organize, and transform your data in many ways. Below are practical examples to help you handle common data tasks easily.

Removing Duplicates using QUERY and UNIQUE Functions

Duplicate entries can clutter your dataset and lead to inaccurate analysis. In Google Sheets, you can’t remove duplicates using QUERY alone. Instead, combine QUERY and UNIQUE together for the best results.

Suppose you have a dataset in range B2:D15 containing Customer Name, Product, and Category. Some rows appear more than once. You want to display only unique customer-product-category combinations for clean analysis. 

You can use the following formula:

1=QUERY(UNIQUE(B2:D15), "SELECT *", 1)

Here's what each parameter means:

  • UNIQUE(B2:D15): Filters out all duplicate rows from the dataset
  • QUERY(..., "SELECT *", 1): Selects and displays all columns, including headers
Using QUERY with UNIQUE in Google Sheets to remove duplicate rows and display a clean list of unique records. i-shadow

This grabs unique rows from B2 to D15 and displays all columns. It's perfect for cleaning up repeated entries in your dataset.

💡 Want to eliminate duplicates and extract meaningful insights from your data? Learn how the QUERY and UNIQUE functions work together to streamline your data management process in Google Sheets. 

Filtering Data with WHERE and 'contains' in QUERY

Filtering data with QUERY lets you extract only the rows that meet specific conditions. Using the WHERE clause with contains helps you find partial matches within a column, like keywords in addresses or product names.

Suppose you have a dataset that includes Customer Name, Product, Status, and Address. You want to view only those entries where the Address contains "Austin", regardless of the full text. You can use the formula below:

1=QUERY(B2:E12, "SELECT * WHERE E contains 'Austin'", 1)

Here's what each parameter means:

  • B3:E12: The data range with headers
  • "SELECT * WHERE E contains 'Austin'": Selects all rows where the Address column contains the word “Austin”
  • 1: Tells QUERY that the first row contains headers
Using the QUERY function with a WHERE clause and CONTAINS to filter rows based on partial text matches in a specific column. i-shadow

This returns only the records with "Austin" in the address, helping you narrow down your data quickly.

Using AND & OR Operators in QUERY for Multiple Conditions 

When your data needs to meet more than one condition, using AND and OR with the QUERY function is the way to go. It helps you filter rows based on combined criteria.

Example:
Suppose you have a dataset in range B2:E12 showing Customer Name, Product, Status, and Address. You want to extract only the rows where the Status is "Delivered", and the Address contains "Austin"

Use the formula below:

1=QUERY(B2:E12, "SELECT * WHERE D = 'Delivered' AND E contains 'Austin'", 1)

Here's what each parameter means:

  • B2:E12: The full range of data
  • "SELECT * WHERE D = 'Delivered' AND E contains 'Austin'": Filters rows that meet both conditions
  • 1: Indicates the first row has headers
Using QUERY with multiple conditions in Google Sheets to filter data based on matching text values across different columns. i-shadow

This returns only the delivered orders shipped to Austin, helping you narrow down to very specific results.

💡 Want to make better decisions with your data? Check out OWOX’s guide to the IF function in Google Sheets and learn how to automate logic, simplify reporting, and create smarter workflows in seconds.

Using QUERY with ARRAYFORMULA and PROPER for Text Formatting 

Text in spreadsheets can often be inconsistent, especially when data is entered in all lowercase. To make your sheet look cleaner, you can use ARRAYFORMULA with PROPER and QUERY to fix capitalization automatically.

Suppose you have a dataset with Customer Name, Product, and Status. Some names and products are in lowercase. You want to convert them to proper case without changing the original data. 

Here’s the formula:

1=ARRAYFORMULA(QUERY({PROPER(B2:B12), PROPER(C2:C12), D2:D12}, "SELECT *", 0))

Here's what each parameter means:

  • PROPER(B2:B12), PROPER(C2:C12): Capitalize Customer Name and Product text
  • D2:D12: Leaves the Status column unchanged
  • "SELECT *", 0: Displays all columns with no headers in the output
Combining ARRAYFORMULA, PROPER, and QUERY in Google Sheets to format text and return a cleaned-up version of data. i-shadow

This gives you a well-formatted view of your data, perfect for reporting or sharing. 

Addressing Common Challenges with the QUERY Function for Transformation and Cleaning

While the QUERY function is powerful, it's not without its challenges. Users often run into errors that can be frustrating if they're not sure what’s causing them or how to fix them.

Fixing #N/A Error in QUERY

⚠️ Error: The #N/A error appears when the QUERY function runs but doesn’t return any matching results. This often happens because the conditions in your WHERE clause are too strict, or the data you’re looking for isn’t actually in the selected range, even though it might seem like it is.

✅ Solution: Start by checking your filter conditions. Loosen overly specific criteria and ensure the data you want is present in the selected range. Also, look out for hidden rows, typos, or extra spaces that might cause mismatches.

Troubleshooting #REF Error in QUERY

⚠️ Error: This error shows up when there’s a problem with your cell reference or output range. It usually means the cell you’re trying to display the result in is already filled with something, or the sheet you're referencing has been deleted, renamed, or isn’t spelled correctly.

✅ Solution: Confirm that your output cell is empty and there’s enough room for the data to appear. Double-check the sheet name and references in your formula. Fixing these usually resolves the #REF error immediately.

Resolving PARSE_ERROR in QUERY

⚠️ Error: A PARSE_ERROR means there’s something wrong with the structure of your QUERY formula. It could be due to a missing keyword like SELECT, an incorrect column label, unmatched quotation marks, or even a misplaced comma.
✅ Solution: Go over your query string carefully. Make sure all column names match the actual headers, text values are inside single quotes, and commas and spaces are placed correctly. Even small formatting mistakes can trigger this error, so double-check everything.

Handling NO_COLUMN Error in QUERY

⚠️ Error: The NO_COLUMN error occurs when your QUERY function tries to call a column that isn’t included in your data range. This happens if your SELECT statement refers to a column label (like Col3) that doesn't exist in the selected range.

✅ Solution: Expand the range in your formula to include all the columns you want to work with. Also, make sure the column labels you're referencing actually exist in the dataset. Updating the range usually resolves this issue.

Best Practices and Tips for Using QUERY Function for Transformation and Cleaning

To get the most out of the QUERY function, it's important to follow some best practices. These tips will help you keep your data clean, accurate and your queries running smoothly.

Maintain Consistent Data Formatting for Accurate Queries

Inconsistent data formats can cause the QUERY function to break or return wrong results. For example, mixing numbers and text in the same column can confuse the function. To avoid this, make sure each column has the same type of data, like all dates, all text, or all numbers. 

You can use functions like TEXT(), VALUE(), or ARRAYFORMULA() to clean up and standardize formats across your sheet. This helps the QUERY work smoothly and reduces the chance of errors.

Combine QUERY with Other Functions for Advanced Data Cleaning

You can make your QUERY results even more powerful by combining them with other Google Sheets functions. Using tools like TRIM() to remove spaces, PROPER() to fix capitalization, or IFERROR() to hide errors makes your data cleaner and more readable. For example:

1=QUERY({A2:A10, TRIM(B2:B10)}, "SELECT Col1, Col2 WHERE Col2 IS NOT NULL")

This removes extra spaces before querying. Combining functions like this helps automate cleanup and gives you cleaner outputs with less manual work.

Optimize QUERY Performance for Large Datasets

When working with large datasets, your sheet can slow down if the QUERY function pulls too much information. Instead of using SELECT *, only select the columns you actually need. You can also use LIMIT to show fewer rows or ORDER BY and WHERE to sort and filter the data more efficiently. For example:

1=QUERY(A2:F1000, "SELECT A, E WHERE E = 'Delivered' LIMIT 50", 1)

This formula shows only customer names and status where the order is delivered and limits the results to 50 rows.

Regularly Validate and Refresh Data Sources

To make sure your queries always give you accurate results, it’s important to work with clean and current data. Check for missing values, outdated entries, or inconsistent formats before running a QUERY. 

You can also use tools like data validation rules to control inputs or IMPORT functions to automatically pull in fresh data from other sources. Keeping your dataset up-to-date ensures your reports are reliable and saves you from re-checking everything manually. 

Powerful Functions to Search and Analyze Data in Google Sheets

Google Sheets offers more than just basic filters. With advanced functions, you can quickly search, filter, and analyze large sets of data. These tools help you find specific results, apply custom logic, and save time when working with complex spreadsheets. 

  • SEARCH: Finds the position of a specific word or character inside a text string, helping you locate keywords or confirm if text exists in a cell.
  • SUM: Adds up a range of numbers, making it useful for calculating totals like sales, expenses, or quantities.
  • AVERAGE: Calculates the mean of a group of numbers, helping you identify the central value in a dataset.
  • IMPORTRANGE: Imports data from another Google Sheet, allowing you to combine or link data across different files.
  • VLOOKUP: Searches for a value in the first column of a range and returns a related value from another column in the same row.
  • CONCATENATE: Combines multiple text strings into one, simplifying the process of merging data from different cells or sources.

Streamline Your Data Analysis with OWOX: Reports, Charts & Pivots Extension

If you’re looking to turn your QUERY results into clear, visual insights, the OWOX: Reports, Charts & Pivots Extension is a tool worth trying. It works right inside Google Sheets and lets you create charts, graphs, and pivot tables from your data, without needing to manually set up each one. This saves time and helps you spot patterns or trends more easily. 

With OWOX, you can quickly build dashboards and automate reports that update as your data changes. It’s especially useful for marketers, analysts, and business owners who want to make faster, smarter decisions without spending hours formatting spreadsheets.

FAQ

What is the QUERY function in Google Sheets used for?
How can I remove duplicates using the QUERY function?
How do I filter data based on multiple conditions in QUERY?
Why is my QUERY function returning a #N/A error?
Can I use QUERY with ARRAYFORMULA for data transformation?
How do I rename column headers in QUERY results?

You might also like

2,000 companies rely on us

Oops! Something went wrong while submitting the form...