ECMAScript - Introducing String "matchAll" Method in ES2020 (ES11)

June 20, 2020 5 min read ECMAScript

An introduction to the "String.prototype.matchAll" proposal, which has reached stage 4 in the TC39 process and is included in the 2020 language specification, the 11th edition.

Every year, an edition of the ECMAScript Language Specification is released with the new proposals that are officially ready. In practical terms, a proposal is attached to the next expected edition once it is accepted and reaches stage 4 in the TC39 process:

The stages of TC39 process

In this article, we’re going to examine and explain the “String.prototype.matchAll” proposal, which has reached stage 4 and belongs to ECMAScript 2020 - the 11th edition.

Motivation

Capturing groups in regular expressions are sequences of characters that are enclosed within parentheses and treated as a single unit. This means that a quantifier placed right after the closing parenthesis applies to the entire group (and not just to the single preceding character). In addition, the text matched by each group is recorded separately in the match result.
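To make the effect of grouping concrete, here is a minimal sketch. The sample string and both patterns are assumptions chosen for illustration; they are not taken from the original examples.

```javascript
// Hypothetical sample input, used only for this illustration.
const sample = 'ababab';

// With a group, the + quantifier repeats the whole "ab" unit.
const grouped = sample.match(/(ab)+/);
// Without a group, the + quantifier repeats only the preceding "b".
const ungrouped = sample.match(/ab+/);

console.log(grouped[0]);   // 'ababab'
console.log(grouped[1]);   // 'ab' (the text captured by the group's last repetition)
console.log(ungrouped[0]); // 'ab'
```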

When invoking the String.prototype.match method with a regular expression containing capturing groups, the result is typically an array (or null if no matches are found) whose content differs depending on whether we use the g flag or not.

Let’s demonstrate the difference between the results.

All Matches without Capturing Groups

To begin with, we execute the method against a global regular expression that’s built from multiple groups:

As it stands, the result doesn’t contain the text matched by the capturing groups at all - only the full text of each match. For instance, “e” and “st1” are strings matching the capturing groups - so we would expect them to be contained as well.

That is, the capturing groups are ignored. 🤷🏻‍♂️
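The original snippet isn't reproduced here, so the following is a minimal sketch of the behavior. The sample string 'test1test2' and the regex are assumptions, chosen so that the groups match the “e” and “st1” strings mentioned above.

```javascript
// Assumed sample input and regex (hypothetical, for illustration only).
const text = 'test1test2';
const globalRegex = /t(e)(st(\d?))/g;

// With the g flag, match returns only the full text of every match.
const allMatches = text.match(globalRegex);

console.log(allMatches); // [ 'test1', 'test2' ]
```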

The First Match with Capturing Groups

This time we execute against the same regex without using the g flag:

Well, the capturing groups are indeed considered in the result - but it refers only to the first match. 🤦🏻‍♂️

Notice that the result isn’t just an array of plain strings - it actually has additional properties: index, input and groups. Later, we’ll name these objects “matching values”.
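As a minimal sketch of this case, assuming the same hypothetical sample string 'test1test2' and the same pattern without the g flag:

```javascript
// Assumed sample input and regex (hypothetical, for illustration only).
const text = 'test1test2';
const nonGlobalRegex = /t(e)(st(\d?))/;

// Without the g flag, match returns the first match together with its groups.
const firstMatch = text.match(nonGlobalRegex);

console.log(firstMatch[0]);       // 'test1'
console.log(firstMatch.slice(1)); // [ 'e', 'st1', '1' ]
console.log(firstMatch.index);    // 0
console.log(firstMatch.input);    // 'test1test2'
console.log(firstMatch.groups);   // undefined (this regex has no named groups)
```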


Having said that, what if we’d like to combine both use-cases above so that:

  • We retrieve results for all matches, including their capturing groups
  • We retrieve them simply, without a complicated workaround

How do we achieve that?

The Proposal

The proposal specifies a new String.prototype.matchAll method that addresses the case in question.

Here’s the official definition out of the specification:

Performs a regular expression match of the String representing the this value against regexp and returns an iterator. Each iteration result’s value is an Array object containing the results of the match, or null if the String did not match.

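We can understand from the definition that:

  • The method returns an iterator representing the results of the matches
  • Each value produced by the iterator contains all the results of a single match

Note that, despite the “or null” wording, the method itself doesn’t return null in practice when there are no matches - the returned iterator is simply empty.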

Put simply, the results of each match (namely “matching values”) truly consider capturing groups and are accessible through the returned iterator.
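One point worth checking is the no-match case. The sketch below assumes a hypothetical sample string and a pattern that cannot match it. In current engines the call yields an iterator that produces nothing, rather than null.

```javascript
// Assumed sample input and a pattern that does not occur in it (both hypothetical).
const text = 'test1test2';
const noMatchIterator = text.matchAll(/xyz/g);

console.log(noMatchIterator === null);                // false
console.log(typeof noMatchIterator[Symbol.iterator]); // 'function'
console.log([...noMatchIterator]);                    // []
```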

Next, we’re going to introduce several practical usages for the method.

All Matches with Capturing Groups

Starting with executing the String.prototype.matchAll method against our regex:

The method simply takes a regex, like String.prototype.match does, but differs in the result type. Importantly, only global regular expressions are acceptable - which means we must use the g flag to avoid getting a TypeError.
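A minimal sketch of both situations, assuming the hypothetical sample string 'test1test2' and the same pattern with and without the g flag:

```javascript
// Assumed sample input (hypothetical, for illustration only).
const text = 'test1test2';

// With the g flag we get an iterator back.
const iterator = text.matchAll(/t(e)(st(\d?))/g);
console.log(typeof iterator.next); // 'function'

// Without the g flag the call throws.
let errorName = null;
try {
  text.matchAll(/t(e)(st(\d?))/);
} catch (error) {
  errorName = error.name;
}
console.log(errorName); // 'TypeError'
```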

Here we explore the result:

Since we already know that the result is an iterator, we spread it into resultAsArray to easily access the matching values.

The first matching value is absolutely identical to the result in the case of String.prototype.match without the g flag. The cool thing, however, is having an additional matching value - which refers to the second match and also considers its capturing groups!
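As a sketch of what that exploration might look like, assuming the hypothetical sample string 'test1test2' (only the name resultAsArray comes from the text above):

```javascript
// Assumed sample input and regex (hypothetical, for illustration only).
const text = 'test1test2';
const regex = /t(e)(st(\d?))/g;

// Spread the iterator into an array to inspect the matching values.
const resultAsArray = [...text.matchAll(regex)];

console.log(resultAsArray.length);         // 2
console.log(Array.from(resultAsArray[0])); // [ 'test1', 'e', 'st1', '1' ]
console.log(resultAsArray[0].index);       // 0
console.log(Array.from(resultAsArray[1])); // [ 'test2', 'e', 'st2', '2' ]
console.log(resultAsArray[1].index);       // 5
```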

Thereby, instead of messing with complicated solutions to combine the global matching results with capturing groups - the method provides this combination simply and natively. 💪🏻

Iterating the Matches

So far, the common practice for iterating over all the matching values was a while loop (typically around RegExp.prototype.exec):
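A minimal sketch of such a loop, assuming it is built on RegExp.prototype.exec and using the hypothetical sample string 'test1test2':

```javascript
// Assumed sample input and regex (hypothetical, for illustration only).
const text = 'test1test2';
const regex = /t(e)(st(\d?))/g;

const visited = [];
let match;
// exec advances regex.lastIndex on every call and returns null when no match remains.
while ((match = regex.exec(text)) !== null) {
  visited.push(`${match[0]} at index ${match.index}`);
}

console.log(visited); // [ 'test1 at index 0', 'test2 at index 5' ]
```

Note that this approach depends on the mutable lastIndex state of the regex object, which is easy to get wrong when the same regex is reused.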

Now with String.prototype.matchAll, we can take advantage of the returned iterator and use a for...of loop instead:

The output obviously remains the same.
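A sketch of the iterator-based loop, under the same assumptions (hypothetical sample string and regex):

```javascript
// Assumed sample input and regex (hypothetical, for illustration only).
const text = 'test1test2';
const regex = /t(e)(st(\d?))/g;

const visited = [];
// The iterator returned by matchAll can be consumed directly by for...of.
for (const match of text.matchAll(regex)) {
  visited.push(`${match[0]} at index ${match.index}`);
}

console.log(visited); // [ 'test1 at index 0', 'test2 at index 5' ]
```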

Also, we can destructure the iterator to access a specific matching value:

It’s worth mentioning that both practices above exhaust the iterator (we reached the last value) - meaning that, in order to iterate or destructure again (destructuring actually iterates behind the scenes), we need a fresh, non-consumed iterator, which we get by reinvoking the method.
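The following sketch illustrates destructuring and exhaustion together, assuming the hypothetical sample string 'test1test2', which contains exactly two matches:

```javascript
// Assumed sample input and regex (hypothetical, for illustration only).
const text = 'test1test2';
const regex = /t(e)(st(\d?))/g;

const iterator = text.matchAll(regex);
// Destructuring pulls values out of the iterator one by one.
const [firstMatch, secondMatch] = iterator;

console.log(firstMatch[0]);  // 'test1'
console.log(secondMatch[0]); // 'test2'

// Both matches were consumed, so nothing is left in this iterator.
const leftovers = [...iterator];
console.log(leftovers); // []

// Reinvoking the method gives a fresh iterator.
const freshCount = [...text.matchAll(regex)].length;
console.log(freshCount); // 2
```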

Hence, another way to go is by creating an array to be iterated repeatedly:

Thus it’s possible to iterate the matching values as we please without reinvoking the method.
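A minimal sketch of the array approach, again assuming the hypothetical sample string and regex:

```javascript
// Assumed sample input and regex (hypothetical, for illustration only).
const text = 'test1test2';
const regex = /t(e)(st(\d?))/g;

// Collect the matching values once...
const matches = Array.from(text.matchAll(regex));

// ...and iterate over them as many times as needed.
const fullMatches = matches.map((match) => match[0]);
const digits = matches.map((match) => match[3]);

console.log(fullMatches); // [ 'test1', 'test2' ]
console.log(digits);      // [ '1', '2' ]
```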

The truth is that in the majority of cases we could settle for an iterator, and that’s the primary reason behind the design decision to always return an iterator instead of an array. Other than that, there is a matter of performance - the iterator performs the matching lazily, once for each invocation of its next method. Practically, this means that if we decide to break the loop in the middle or destructure only part of the values, the remaining matches won’t be performed. By contrast, if the result were an array - all the matching values would be collected beforehand without exception. This is definitely significant when there are tons of potential matches and/or capturing groups.
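The lazy behavior can be observed with a small experiment. The sketch below is only an illustration: CountingRegExp is a hypothetical helper (a RegExp subclass that counts calls to exec), and the sample string with three matches is an assumption.

```javascript
let execCalls = 0;

// Hypothetical helper for illustration: counts how often a match is attempted.
class CountingRegExp extends RegExp {
  exec(input) {
    execCalls += 1;
    return super.exec(input);
  }
}

// Assumed sample input containing three potential matches.
const text = 'test1test2test3';
const regex = new CountingRegExp('t(e)(st(\\d?))', 'g');

let firstMatchText = null;
for (const match of text.matchAll(regex)) {
  firstMatchText = match[0];
  break; // stop after the first matching value
}

console.log(firstMatchText); // 'test1'
console.log(execCalls);      // 1 (the second and third matches were never attempted)
```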

Anyway, on the whole, the point is we’re not restricted. The iterator provides the versatility to iterate, destructure or transform by choice to an array - depending on our use-case.

Summary

Today we explained the primary motivation behind the “String.prototype.matchAll” proposal and introduced concrete usages of the method.

Let’s recap:

  • The proposal belongs to ECMAScript 2020, which is the 11th edition
  • When using String.prototype.match with the g flag, the result doesn’t consider capturing groups
  • When using String.prototype.match without the g flag, the result considers the capturing groups but refers only to the first match
  • The proposal specifies a new method called String.prototype.matchAll
  • String.prototype.matchAll returns an iterator representing the matches, allowing us to iterate, destructure or transform to an array if necessary
  • String.prototype.matchAll returns an empty iterator (not null) if there are no matches
  • String.prototype.matchAll considers all matches, including capturing groups, in a simple usage
  • String.prototype.matchAll throws a TypeError when given a non-global regular expression
  • After exhausting the iterator, we need to reinvoke String.prototype.matchAll to iterate once more
