Learn Google Sheets QUERY Function: Step-by-Step Guide

The Google Sheets QUERY function is a powerful tool that allows you to manipulate data efficiently for various purposes. Whether you need to extract essential information from complex datasets or create reports and charts, the QUERY function has got you covered.

One notable advantage of this function is its ability to execute queries in the Google Visualization API Query language, making it incredibly versatile for data manipulations.

In essence, the QUERY function stands as the ultimate function for data manipulation in Google Sheets. This comprehensive tutorial will guide you step-by-step to master it.

QUERY Function: Syntax and Arguments

Let’s start with the syntax of the QUERY function in Google Sheets. Unlike most other functions, understanding how to use it isn’t straightforward just by reviewing the arguments. But don’t worry, with my examples and step-by-step guides, you’ll gain a clear understanding in no time.

Syntax:

QUERY(data, query, [headers])

Arguments:

  • data: The range of cells (table) on which to perform the query. It’s crucial to avoid mixed-type data in columns to ensure accurate results.
  • query: The query you want to run, enclosed in double-quotes. Alternatively, you can enter the appropriate text in a cell and refer to it.
  • headers: The number of header rows in your data (table). Typically, there will be one header row or none. If this argument is omitted or set to -1, the header value is guessed based on the content of the source data.

Example:

=QUERY(A1:F1000, "Select A, C", 1)

In this example, A1:F1000 represents the data, "Select A, C" is the query, and 1 indicates the header. The formula returns columns A and C.

To master the Google Sheets QUERY function, you must concentrate on understanding the query argument, which is a broad topic.

Simplifying the ‘Query’ Argument

When you come across a QUERY formula written by someone else, the query segment within quotation marks can be the most confusing part. It contains language clauses (keywords), data manipulation functions, and language elements.

Attempting to learn them all in one tutorial is not practical. Therefore, our approach here is to guide you in learning essential data manipulation techniques using the Google Sheets QUERY function.

In the process, I will direct you to more advanced tutorials on our website (Crawlan.com) to master one of the most powerful functions in Google Sheets. Here we go!

Sample Data for Testing the QUERY Function

Before we dive into the QUERY function, let’s consider some sample data to work with. We’ll use a table with six columns and sixteen rows, located in the range A1:F16. The field labels in A1:F1 are as follows:

A B C D E F
1 2 3 4 5 6

You can obtain a copy of the full data by clicking here (Crawlan.com). We’ll use this data to learn the QUERY function. But first, let’s review the syntax.

There are three arguments in the syntax of the QUERY function. The first and last ones are data and headers, respectively. As per our sample table, the data will be A1:F (open range) or A1:F16 (closed range), and the headers will be 1. Now let’s focus on the query argument.

Selecting Columns in the QUERY Function (SELECT Clause)

This section covers how to select columns in the QUERY function.

Basic Usage Examples of Google Sheets QUERY Function

In our sample data, consisting of six columns, you can manipulate and select specific columns using the SELECT clause (keyword) in the QUERY function.

=QUERY(Sheet1!A1:F16, "select A, B, F, C, D, E", 1)

The formula above returns the columns in a different order.

=QUERY(Sheet1!A1:F16, "select A, E", 1)

This formula returns only columns A and E.

If you replace "select A, E" with "select *", all columns of the data source table are returned in their original order.

An alternative formula to return all columns in their original order is:

=QUERY(Sheet1!A1:F16,, 1)

In this formula, the query argument is omitted. Here, only the data and headers are specified.

If you specify 16 in the headers instead of 1, the formula will return a 1-row x 6-column output with all values as a header row.

Selecting Results from Aggregation Functions, Scalar Functions, and Arithmetic Operators

Above, we explored selecting existing columns in the table without any manipulation. Now, let’s see how to process the values in the columns and return results using aggregation functions, scalar functions, and arithmetic operators.

=QUERY(Sheet1!A1:F16, "select A, YEAR(E)", 1)

This formula returns the names in column A and the year of joining in column E. Column E in the source data table contains the date of joining.

To find the average of values in column F, which contains the age, you can use the following formula:

=QUERY(Sheet1!A1:F16, "select AVG(F)", 1)

To find the age at the time of joining, we can subtract the year of DOB from the year of joining. Here is an example:

=QUERY(Sheet1!A1:F16, "select A, YEAR(E) - YEAR(B)", 1)

Note: We are only considering years. Therefore, the age won’t be entirely accurate due to the whole months/years issue.

These examples demonstrate how to select and manipulate data in the QUERY function. For more advanced usage, please refer to our tutorials on Crawlan.com.

Filtering Rows Using Google Sheets QUERY Function (WHERE Clause)

Above, we explored how to use the SELECT keyword in the QUERY function for selecting columns. Now, let’s learn how to filter rows using the WHERE clause.

In the WHERE clause, we use comparison operators or logical operators for filtering purposes.

To filter the gender “M” in our sample data, we can use the following formula:

=QUERY(Sheet1!A1:F, "Select A, D WHERE D='M'", 1)

The formula above filters names with a gender of “M”.

To filter names whose ages are 12, we can use the following formula:

=QUERY(Sheet1!A1:F, "Select A WHERE F=12", 1)

The formula filters names with an age of 12.

The subsequent query filters names whose joining dates are 01/01/2015:

=QUERY(Sheet1!A1:F, "Select A WHERE E=DATE '2015-01-01'", 1)

These three examples demonstrate how to use the WHERE keyword in the QUERY function to filter rows based on specific conditions. For comprehensive data manipulation, we highly recommend exploring our tutorials on Crawlan.com.

Grouping and Aggregation (GROUP BY Clause)

Now, let’s explore how to group and aggregate data using the QUERY function.

For this purpose, we utilize the GROUP BY clause, which is a comprehensive topic, similar to the WHERE clause.

Here are important considerations when using the GROUP BY keyword combination in the QUERY function:

  • It aggregates values across the rows.
  • A single row is created for each distinct combination of values in the GROUP BY clause.
  • The data is automatically sorted by the grouping columns.
  • All the column identifiers specified in the SELECT clause must be part of the GROUP BY clause. However, you can include unspecified column identifiers as well.

To group the data by age and return the count, you can use the following formula:

=QUERY(Sheet1!A1:F16, "Select F, COUNT(F) GROUP BY F", 1)

To group the data by age and date of joining and return the count, you can use the following formula:

=QUERY(Sheet1!A1:F16, "SELECT F, E, COUNT(F) GROUP BY F, E", 1)

These examples demonstrate how to use aggregation functions in the QUERY function. For more advanced usage, please refer to our tutorials on Crawlan.com.

Sorting Data (SORTBY Clause)

With the ORDER BY clause, we can sort the rows in the source data by the values in specified columns.

To sort the names in column A in ascending order, you can use the following formula:

=QUERY(Sheet1!A1:F16, "SELECT A, B, F ORDER BY A ASC", 1)

To sort the DOB in column B in descending order and then the names in column A in ascending order, you can use the following formula:

=QUERY(Sheet1!A1:F16, "SELECT A, B, F ORDER BY B DESC, A ASC", 1)

You can use either column identifiers or the result of aggregation functions, scalar functions, or operators in the ORDER BY clause.

Limiting the Number of Rows (LIMIT Clause)

The LIMIT clause allows you to limit the number of rows returned by the QUERY function.

To limit the number of rows returned to 3, you can use the following formula:

=QUERY(Sheet1!A1:F, "SELECT F, COUNT(F) WHERE F IS NOT NULL GROUP BY F ORDER BY COUNT(F) DESC LIMIT 3", 1)

Offset Rows in Google Sheets QUERY Function (OFFSET Clause)

The OFFSET clause allows you to skip a given number of rows.

To skip the first 3 rows, you can use the following formula:

=QUERY(Sheet1!A1:F, "SELECT * OFFSET 3", 1)

In the second example, we have also used the LIMIT keyword. The QUERY first offsets rows and then applies the limit.

Labeling and Formatting Query Result (LABEL and FORMAT Clauses)

Sometimes, the Query formula may return ‘ugly’ headers. You can use the LABEL clause to set custom labels for columns.

=QUERY(Sheet1!A1:F16, "SELECT A, YEAR(E) - YEAR(B) LABEL YEAR(E) - YEAR(B) 'Age at the time of Joining'", 1)

This formula replaces the label difference(year(Date of Joining)year(DOB)) with Age at the time of Joining.

These are the essential techniques you need to master the QUERY function in Google Sheets. For more detailed explanations and examples, please refer to our tutorials on Crawlan.com.

Reference: Query Language Reference (Version 0.7)

Thanks for reading and happy querying!

Related:

  • The Alternative to SQL IN Operator in Google Sheets Query (Also Not IN)
  • How to Use Date Values (Date Serial Numbers) in Google Sheets Query

Related posts