Utilizing REGEXEXTRACT in Google Sheets

Video google sheet regex extract

Published on January 21, 2021 by Jake Sheridan

Are you tired of manually sifting through large amounts of data in Google Sheets? Do you want to supercharge your data management skills? Well, look no further! In this article, I will introduce you to the amazing world of regex functions in Google Sheets, specifically the powerful REGEXEXTRACT function.

What is the REGEXEXTRACT function?

The REGEXEXTRACT function in Google Sheets is used to extract substrings that match a specific regular expression. Regular expressions, or regex for short, are widely used by programmers, but they can also be beneficial for Google Sheets users. A regular expression is a sequence of characters that defines a search pattern, allowing you to find specific information. With regular expressions, you can express practically anything!

How does the REGEXEXTRACT function work?

The syntax of the REGEXEXTRACT function is as follows:

=REGEXEXTRACT(text, regular_expression)

At first glance, the syntax may seem intimidating, but don’t fret! There are numerous online resources available to help you build your own regular expressions. Once you grasp the concept and understand how to use it, REGEXEXTRACT becomes a powerful tool that significantly speeds up your processes.

You can use the REGEXEXTRACT function to search for exact matches, but its true power lies in the special characters that allow you to customize your searches. Let’s take an example regular expression: ([A-Z]w+). This regular expression translates to:

  • A-Z: The substring we’re looking for starts with an uppercase letter.
  • w: It should be a word character (not a number or space).
  • +: It can be followed by 1 or more characters.

When to use REGEXEXTRACT in marketing?

The REGEXEXTRACT function can be a game-changer in various marketing scenarios, such as:

  • Extracting specific data from a list.
  • Extracting email addresses or links from text.
  • Extracting a domain from a URL.

How to use the REGEXEXTRACT function in Google Sheets

The first argument, text, refers to the input text you want to search within. The second argument, regular_expression, is the expression to match. The function will return the first part of the text that matches the provided regular expression. Keep in mind that the REGEXEXTRACT function works only with text.

Let’s say you have a list of URLs and you need to extract certain words from them. For example, the URLs contain the language of the pages, and you want to extract those languages. In regex, the | character represents the logical OR. You can search for the specific text by connecting all the possible words with OR relationships. The formula will search for the words “Spanish” OR “French” OR “German,” and so on.

The function to achieve this is:

=REGEXEXTRACT(A2, "Spanish|French|German|English|Italian|Dutch")

Regardless of where the word appears in the text, the function will find and return it as the result. It reads the source string from left to right and returns the first correct result it finds. For example, if “Italian” is found first in the fifth row, it will ignore the presence of “English.”

An example of the REGEXEXTRACT formula

It is also common to extract the base domains from complete URLs. The REGEXEXTRACT function is suitable for this task as well. The regular expression used here is more complex, but you can learn the basics to get comfortable with such expressions, or simply search on Google and build them as you go.

=REGEXEXTRACT(A2,"^(?:https?://)?([^:/n]+)")

This regular expression excludes “https://” and then excludes everything after the first “/” character. As a result, the remaining part is the base domain.

In conclusion

REGEXEXTRACT can save you a tremendous amount of time. Once you understand how to use it, you can replace most of your LEFT, RIGHT, MID, or FIND functions with this regex function. Embrace the power of regular expressions and unlock new possibilities in your Google Sheets workflows!

Published on Crawlan.com

Related posts