How often does JavaScript recompile regex literals in functions?

Given this function:

function doThing(values,things){
  var thatRegex = /^http:\/\//i; // is this created once or on every execution?
  if (values.match(thatRegex)) return values;
  return things;
}

How often does the JavaScript engine have to create the regex? Once per execution or once per page load/script parse?

To prevent needless answers or comments, I personally favor putting the regex outside the function, not inside. The question is about the behavior of the language, because I’m not sure where to look this up, or if this is an engine issue.


EDIT:

I was reminded I didn’t mention that this was going to be used in a loop. My apologies:

var newList = [];
// Assumes ListOfItems1 and ListOfItems2 are arrays
ListOfItems1.forEach(function (item1) {
  ListOfItems2.forEach(function (item2) {
    newList.push(doThing(item1, item2));
  });
});

Given that it’s going to be called many times in a loop, it makes sense to define the regex outside the function; that’s the idea.

Also note that the script is deliberately generic, so that only the behavior and cost of the regex creation are examined.

From Mozilla’s JavaScript Guide on regular expressions:

Regular expression literals provide compilation of the regular expression when the script is evaluated. When the regular expression will remain constant, use this for better performance.

And from the ECMA-262 spec, §7.8.5 Regular Expression Literals:

A regular expression literal is an input element that is converted to a RegExp object (see 15.10) each time the literal is evaluated.

In other words, the pattern is compiled once, when the script is first parsed.

It’s worth noting also, from the ES5 spec, that two literals will compile to two distinct instances of RegExp, even if the literals themselves are the same. Thus if a given literal appears twice within your script, it will be compiled twice, to two distinct instances:

Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals’ contents are identical.

… each time the literal is evaluated, a new object is created as if by the expression new RegExp(Pattern, Flags) where RegExp is the standard built-in constructor with that name.
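As a minimal sketch of what the two quoted passages imply, assuming an ES5-or-later engine (the function name makeRe is hypothetical and only used for illustration):

```javascript
// Two textually identical literals still produce two distinct RegExp objects.
var a = /^http:\/\//i;
var b = /^http:\/\//i;
console.log(a === b); // false

// Hypothetical helper: the literal inside is evaluated on every call,
// so each call returns a fresh RegExp object (ES5+ behavior).
function makeRe() {
  return /^http:\/\//i;
}

var first = makeRe();
var second = makeRe();
console.log(first === second);               // false
console.log(first.source === second.source); // true
```

The objects differ in identity only; their source and flags are the same.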

The provided answers don’t clearly distinguish between two different processes behind the scenes: regexp compilation, and regexp object creation when execution reaches an expression that creates a regexp object.

Yes, by using regexp literal syntax you gain the performance benefit of one-time regexp compilation.

But if your code runs in an ES5+ environment, every time the code path enters the doThing() function in your example, it actually creates a new RegExp object, though without needing to compile the regexp again and again.

In ES5, literal syntax produces a new RegExp object every time the code path hits an expression that creates a regexp via a literal:

function getRE() {
    var re = /[a-z]/;
    re.foo = "bar";
    return re;
}

var reg = getRE(),
    re2 = getRE();

console.log(reg === re2); // false
reg.foo = "baz";
console.log(re2.foo); // "bar"

To back the above statements with actual numbers, take a look at the performance difference between the storedRegExp and inlineRegExp tests in this jsperf.

storedRegExp would be about 5 to 20% faster across browsers than inlineRegExp; the difference is the overhead of creating (and garbage collecting) a new RegExp object every time.

Conclusion:
If you’re heavily using your literal regexps, consider caching them outside the scope where they are needed, so that not only are they compiled once, but the actual regexp objects for them are created only once as well.
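A minimal sketch of that caching advice, applied to the question's example. The names HTTP_PREFIX, urls and fallbacks, the sample data, and the use of test() in place of match() are assumptions made for illustration, not part of the original code:

```javascript
// Created once, when this script is evaluated, and reused by every call.
var HTTP_PREFIX = /^http:\/\//i;

function doThing(value, fallback) {
  // test() is enough here because only a yes/no answer is needed.
  return HTTP_PREFIX.test(value) ? value : fallback;
}

// Hypothetical sample data standing in for the question's two lists.
var urls = ["http://example.com", "ftp://example.com"];
var fallbacks = ["a", "b"];

var newList = [];
urls.forEach(function (url) {
  fallbacks.forEach(function (fallback) {
    newList.push(doThing(url, fallback));
  });
});

console.log(newList);
// ["http://example.com", "http://example.com", "a", "b"]
```

Because this pattern has no g or y flag, the shared object carries no lastIndex state from one call to the next, so reusing it is safe here.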

There are two “regular expression” type objects in JavaScript:
regular expression instances and the RegExp object.

Also, there are two ways to create regular expression instances:

  1. using the /regex/ syntax and
  2. using new RegExp('regex');

Each of these will create a new regular expression instance each time.

However there is only ONE global RegExp object.

var input="abcdef";
var r1 = /(abc)/;
var r2 = /(def)/;
r1.exec(input);
alert(RegExp.$1); //outputs 'abc'
r2.exec(input);
alert(RegExp.$1); //outputs 'def'
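For comparison, a small sketch that reads the same captures from each match result rather than from the shared RegExp.$1 property, which is a legacy static that the most recent successful match overwrites. The variable names m1, m2 and shared are hypothetical, and the last check assumes an engine that still supports the legacy static properties:

```javascript
var input = "abcdef";
var r1 = /(abc)/;
var r2 = /(def)/;

var m1 = r1.exec(input); // the match array belongs to this call only
var m2 = r2.exec(input);
var shared = RegExp.$1;  // read immediately: reflects the most recent match

console.log(m1[1]);  // "abc"
console.log(m2[1]);  // "def"
console.log(shared); // "def"
```

The per-call match arrays stay independent, while the global RegExp object only remembers the last match.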

The actual pattern is compiled as the script is loaded when you use syntax 1:

The pattern argument is compiled into an internal format before use. For Syntax 1, pattern is compiled as the script is loaded. For Syntax 2, pattern is compiled just before use, or when the compile method is called.

But you could still get a different regular expression instance on each call. Test this in Chrome vs. Firefox:

function testregex() {
    var localreg = /abc/;
    // On the second call, compare the new instance with the stored one
    if (testregex.reg != null) {
        alert(localreg === testregex.reg);
    }
    testregex.reg = localreg;
}
testregex();
testregex();

It’s VERY little overhead, but if you want exactly one regex, it’s safest to create a single instance outside of your function.

The regex will be compiled every time you call the function if it’s not in literal form.
Since you are including it in a literal form, you’ve got nothing to worry about.

Here’s a quote from websina.com:

Regular expression literals provide compilation of the regular expression when the script is evaluated. When the regular expression will remain constant, use this for better performance.

Calling the constructor function of the RegExp object, as follows:
re = new RegExp("ab+c")

Using the constructor function provides runtime compilation of the regular expression. Use the constructor function when you know the regular expression pattern will be changing, or you don’t know the pattern and are getting it from another source, such as user input.
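A minimal sketch of the constructor case described in the quote, where the pattern is only known at run time. The helpers escapeRegExp and makePrefixMatcher are hypothetical names, and escaping the input first is an assumption about how untrusted text would usually be handled:

```javascript
// Hypothetical helper: escape characters that are special inside a pattern,
// so that input is matched as literal text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds a case-insensitive "starts with" matcher.
// The pattern is compiled here, at run time, on each call.
function makePrefixMatcher(prefix) {
  return new RegExp("^" + escapeRegExp(prefix), "i");
}

var matcher = makePrefixMatcher("http://");
console.log(matcher.test("HTTP://example.com")); // true
console.log(matcher.test("ftp://example.com"));  // false

// The dot is escaped, so it no longer matches an arbitrary character.
console.log(makePrefixMatcher("a.b").test("axb")); // false
```

If the same dynamic pattern is needed repeatedly, the returned object can be stored and reused just like a hoisted literal.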


The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
