Introduction to JavaScript Regular Expressions

Preface

PS (2018/03/27): Improved the formatting of the article and added some new test code.
Regular expressions come up all the time in front-end work, from small form validations to the plug-ins seen everywhere. Simple string methods can meet basic requirements, but they lack flexibility. Once the requirements become complex, no technique fits better than regular expressions. They are also a hurdle for many programmers.
The reference material below is taken from the JavaScript RegExp object documentation.

new RegExp(pattern, attributes)

Represents regular expressions, which are powerful tools for performing pattern matching on strings.

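Parameter    Description
pattern      A string that specifies the pattern of the regular expression, or another regular expression.
attributes   Optional modifier string containing flags such as "g", "i" and "m". Before ECMAScript standardization, the m flag was not supported. In ES5 and earlier, this parameter must be omitted if pattern is a regular expression rather than a string; since ES2015 it may be supplied, in which case it replaces the flags of the original.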

Return value:
A new RegExp object with the specified pattern and flag. If the parameter pattern is a regular expression rather than a string, the RegExp() constructor creates a new RegExp object with the same pattern and flag as the specified RegExp.
If RegExp() is called as a function without the new operator, it behaves the same as when it is called with the new operator, except that when pattern is a regular expression, it only returns pattern instead of creating a new RegExp object.

Throws:
SyntaxError - Thrown if pattern is not a valid regular expression, or if attributes contains an unsupported flag character (originally anything other than "g", "i" and "m"; newer engines also accept flags such as "s", "u" and "y").
TypeError - In ES5 and earlier, thrown if pattern is a RegExp object and the attributes parameter is not omitted. Since ES2015 this combination is allowed.
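
As a rough illustration of the constructor described above, the sketch below builds the same pattern once as a literal and once from a string. The variable names are made up for this example. Note that backslashes have to be doubled inside a string pattern.

```javascript
// Literal syntax: the pattern is fixed when the script is parsed.
const literal = /\d+/g;

// Constructor syntax: the pattern is a string, so "\d" must be written "\\d".
const built = new RegExp("\\d+", "g");

console.log(literal.source === built.source); // true
console.log(built.flags);                     // g
console.log("a1b22c333".match(built));        // [ '1', '22', '333' ]

// An invalid pattern throws a SyntaxError at construction time.
let message = "";
try {
  new RegExp("(");
} catch (err) {
  message = err.name;
}
console.log(message); // SyntaxError
```

The constructor form is mainly useful when the pattern is only known at run time, for example when it is assembled from user input.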

RegExp Object Methods

RegExpObject.compile(regexp,modifier)

It can be used to compile regular expressions during script execution, as well as to change and recompile regular expressions (compiling regular expressions into internal formats for faster execution).

Parameter    Description
regexp       The regular expression.
modifier     Specifies the type of match. "g" is used for global matching, "i" for case-insensitive matching, and "gi" for global, case-insensitive matching.

Note that compile() only changes and recompiles the pattern; it does not perform a match itself.

var str1 = "Hello World",
  str2 = "Hello World",
  patt = /man/g;
//Normal rule
console.time();
str1 = str1.replace(patt, "person");
console.timeEnd();
//Compiled rules
console.time();
patt.compile(patt);
str2 = str2.replace(patt, "person");
console.timeEnd();

// default: 2.818ms
// default: 0.096ms

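The second timing is much lower, but read it with caution: the first measurement likely also includes one-time warm-up costs, and compile() is deprecated in current JavaScript, so the gap cannot be credited to compile() alone.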

RegExpObject.exec(string)

Used to retrieve matches of regular expressions in strings.

Parameter    Description
string       Required. The string to search.

Return value: An array in which the match results are stored. If no match is found, the return value is null.
Description: Calling the test() method of a RegExp object r with the string s is equivalent to the expression r.exec(s) != null.
Important: When the pattern has the g flag, exec() resumes from lastIndex. If you want to search a new string after matching in another one, you must manually reset the lastIndex property to 0.
Note: Whether or not RegExpObject is global, exec() adds complete details to the array it returns. This is the difference between exec() and match(), which returns much less information in global mode. For a long time, calling exec() repeatedly in a loop was therefore the only way to get complete match information for a global pattern; String.prototype.matchAll() (ES2020) now provides the same details.

var str = "Every man in the world! Every woman on earth!",
  patt = /man/g,
  result;

while ((result = patt.exec(str)) != null) {
  console.log(result);
  console.log(patt.lastIndex);
}

// [ 'man',
//   index: 6,
//   input: 'Every man in the world! Every woman on earth!',
//   groups: undefined ]
// 9
// [ 'man',
//   index: 32,
//   input: 'Every man in the world! Every woman on earth!',
//   groups: undefined ]
// 35


The index property gives the position of the matched text. The input property stores the string that was searched.
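
The exec() loop above can be wrapped in a small helper when every match and its position are needed. This is only a sketch: collectMatches is a hypothetical name, not a built-in, and it assumes the pattern carries the g flag (without it the loop would never end).

```javascript
// Hypothetical helper: gather text and index for every match of a global pattern.
function collectMatches(pattern, text) {
  if (!pattern.global) {
    throw new Error("collectMatches expects a pattern with the g flag");
  }
  pattern.lastIndex = 0; // always start from the beginning of the string
  const found = [];
  let result;
  while ((result = pattern.exec(text)) !== null) {
    found.push({ text: result[0], index: result.index });
    // Guard against zero-length matches, which would not advance lastIndex.
    if (result[0] === "") {
      pattern.lastIndex++;
    }
  }
  return found;
}

const matches = collectMatches(/man/g, "Every man in the world! Every woman on earth!");
console.log(matches);
// [ { text: 'man', index: 6 }, { text: 'man', index: 32 } ]
```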

RegExpObject.test(string)

Used to detect whether a string matches a pattern.

Parameter    Description
string       Required. The string to test.

Return value: true if the string contains text matching RegExpObject, otherwise false.
Description: Calling the test() method of a RegExp object r with the string s is equivalent to the expression r.exec(s) != null.

var str = "Every man in the world! Every woman on earth!",
  patt = /man/g;

console.log(patt.test(str)); // true
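
One thing worth illustrating here is that a pattern with the g flag remembers its position between calls, so repeated test() calls on the same string can alternate between true and false. The sketch below shows this lastIndex behaviour; the variable names are only for this example.

```javascript
const globalPatt = /man/g;
const sample = "Every man";

const first = globalPatt.test(sample);  // true, lastIndex is now 9
const second = globalPatt.test(sample); // false, the search resumed at index 9; lastIndex resets to 0
const third = globalPatt.test(sample);  // true again

console.log(first, second, third); // true false true

// Resetting lastIndex (or dropping the g flag) makes each call independent.
globalPatt.lastIndex = 0;
const afterReset = globalPatt.test(sample);
console.log(afterReset); // true
```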

String Object Methods That Support Regular Expressions

stringObject.search(regexp)

Used to retrieve a substring specified in a string, or to retrieve a substring that matches a regular expression.

Parameter    Description
regexp       Either a substring to look for in stringObject or a RegExp object to match against it.

Return value: The starting position of the first substring in stringObject that matches regexp, or -1 if no matching substring is found.
Note: The search() method does not perform global matching and ignores the g flag. It also ignores the lastIndex property of regexp and always searches from the beginning of the string, which means that it always returns the position of the first match in stringObject.

var str = "How Are you doing today?How Are you doing today?"
console.log(str.search(/are/gi)); // 4
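
A short sketch of the note above: search() reports only the first position, returns -1 when nothing matches, and does not consult lastIndex. The names greeting and arePatt are assumptions for this example.

```javascript
const greeting = "How Are you doing today?";
const arePatt = /are/gi;

arePatt.lastIndex = 10; // search() does not start from here
const position = greeting.search(arePatt);
const missing = greeting.search(/goodbye/);

console.log(position);          // 4
console.log(missing);           // -1
console.log(arePatt.lastIndex); // 10, left as it was
```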

stringObject.match(searchvalue/regexp)

The specified value can be retrieved within a string, or a match of one or more regular expressions can be found.

Parameter     Description
searchvalue   Required. Specifies the string value to search for.
regexp        Required. A RegExp object that specifies the pattern to match. If the argument is not a RegExp object, it is first passed to the RegExp constructor and converted to one.

Return value: An array of match results. The content of the array depends on whether regexp has the global flag g.
Description: The match() method searches stringObject for one or more pieces of text matching regexp. Its behavior depends largely on whether regexp has the g flag.
If regexp does not have the g flag, match() performs a single match in stringObject. If no matching text is found, match() returns null. Otherwise, it returns an array with information about the match. Element 0 of the array holds the matched text, and the remaining elements hold the text matched by the subexpressions of the regular expression. In addition to these regular array elements, the returned array has two properties: index gives the position of the first character of the matched text in stringObject, and input holds a reference to stringObject.
If regexp has the g flag, match() performs a global search to find all matching substrings in stringObject. If no matching substring is found, null is returned. If one or more matching substrings are found, an array is returned. However, the content of the array returned by a global match is quite different: its elements are all the matching substrings in stringObject, and there is no index or input property.
Note: In global mode, match() provides no information about the text matched by subexpressions, nor the position of each matching substring. If you need this information for a global search, use RegExpObject.exec().

var str = "How are you doing today?How are you doing today?"
console.log(str.match(/ a/));
console.log(str.match(/a.e/g));
// [ ' a',
//   index: 3,
//   input: 'How are you doing today?How are you doing today?',
//   groups: undefined ]
// [ 'are', 'are' ]
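
To make the difference between the two modes concrete, here is a small sketch using a pattern with capture groups. The sample string and variable names are assumptions for illustration; matchAll() is an ES2020 addition that the original text does not cover.

```javascript
const dates = "2017-04-10 and 2018-03-27";

// Without g: the first match plus its capture groups and index.
const single = dates.match(/(\d{4})-(\d{2})-(\d{2})/);
console.log(single[0], single[1], single[2], single[3], single.index);
// 2017-04-10 2017 04 10 0

// With g: every full match, but no groups and no index.
const all = dates.match(/(\d{4})-(\d{2})-(\d{2})/g);
console.log(all); // [ '2017-04-10', '2018-03-27' ]

// matchAll (ES2020) keeps groups and index for every match.
const years = Array.from(dates.matchAll(/(\d{4})-(\d{2})-(\d{2})/g), (m) => m[1]);
console.log(years); // [ '2017', '2018' ]
```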

stringObject.replace(regexp/substr,replacement)

Used to replace some characters in a string with others, or to replace a substring that matches a regular expression.

Parameter       Description
regexp/substr   Required. A substring or a RegExp object that specifies the pattern to be replaced. Note that if the value is a string, it is treated as literal text to search for rather than being converted to a RegExp object first.
replacement     Required. A string that specifies the replacement text, or a function that generates it.

Return value: A new string obtained by replacing the first match of regexp (or all matches, if regexp has the g flag) with replacement.

var str = "How are you doing today?"
console.log(str.replace(/ /g, "|"));
console.log(str.replace(/ /g, function () {
  return '-----'
}));
console.log(str.replace(/a.e/g, "were"));

// How|are|you|doing|today?
// How-----are-----you-----doing-----today?
// How were you doing today?
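
The replacement can also refer to capture groups. The sketch below, with made-up sample strings, shows the $n form in a replacement string and the arguments a replacement function receives.

```javascript
const fullName = "Doe, John";

// $1 and $2 refer to the first and second capture groups.
const swapped = fullName.replace(/(\w+), (\w+)/, "$2 $1");
console.log(swapped); // John Doe

// A function receives the match, then each group, then the offset and the whole string.
const capitalized = "How are you".replace(/\b(\w)(\w*)/g, function (match, head, tail) {
  return head.toUpperCase() + tail;
});
console.log(capitalized); // How Are You
```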

stringObject.split(separator,howmany)

Used to split a string into an array of strings.

Parameter    Description
separator    Required. A string or regular expression at which stringObject is split.
howmany      Optional. The maximum length of the returned array. If this parameter is set, no more substrings than this number are returned. If it is not set, the entire string is split regardless of its length.

Return value: An array of strings, created by splitting stringObject into substrings at the boundaries specified by separator. The strings in the returned array do not include separator itself.
However, if separator is a regular expression that contains subexpressions, the returned array includes the strings that match those subexpressions (but not the text that matches the entire regular expression).

var str = "How are you doing today?"
console.log(str.split(" ", 3));
console.log(str.split(" "));
console.log(str.split("are"));

// [ 'How', 'are', 'you' ]
// [ 'How', 'are', 'you', 'doing', 'today?' ]
// [ 'How ', ' you doing today?' ]
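
The examples above only use string separators. As a sketch of the regular expression case, including the subexpression rule from the return value description, consider the following (the sample list is an assumption):

```javascript
const list = "apple, banana;cherry";

// Separator as a regular expression: a comma or semicolon plus optional spaces.
const plain = list.split(/\s*[,;]\s*/);
console.log(plain); // [ 'apple', 'banana', 'cherry' ]

// With a capture group, the captured separators are kept in the result.
const withSeparators = list.split(/\s*([,;])\s*/);
console.log(withSeparators); // [ 'apple', ',', 'banana', ';', 'cherry' ]
```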

Square brackets are used to find characters in a range:

var str = 'abcdaabc164984616464646464AAWEGAWGAG';

console.log(str.match(/[a-f]/g).join(''));
console.log(str.match(/[A-F]/g).join(''));
console.log(str.match(/[A-z]/g).join(''));
console.log(str.match(/[0-9]/g).join(''));
console.log(str.match(/[adgk]/g).join(''));
console.log(str.match(/[^a-z]/g).join(''));
console.log(str.match(/(r|b|g)/g).join(''));

// abcdaabc
// AAEAA
// abcdaabcAAWEGAWGAG
// 164984616464646464
// adaa
// 164984616464646464AAWEGAWGAG
// bb
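
One detail the output above does not reveal: the range [A-z] covers more than letters, because six punctuation characters sit between "Z" and "a" in ASCII. A minimal sketch, with an assumed test string:

```javascript
// The characters between "Z" (code 90) and "a" (code 97) in ASCII.
const between = "[\\]^_`";

console.log(between.match(/[A-z]/g).join("")); // [\]^_`
console.log(between.match(/[A-Za-z]/g));       // null
console.log(/^[A-z]+$/.test("snake_case"));    // true, "_" is inside the A-z range
console.log(/^[A-Za-z]+$/.test("snake_case")); // false
```

When only letters are wanted, [A-Za-z] is the safer choice.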

Metacharacters are characters with a special meaning.

Let's first look at some common, simple metacharacters.

var str = "Every man in the world! 1, 2, 3, Let's go!!";

console.log(str.match(/m.n/g).join(''));
console.log(str.match(/\w/g).join(''));
console.log(str.match(/\W/g).join(''));
console.log(str.match(/\d/g).join(''));
console.log(str.match(/\D/g).join(''));
console.log(str.match(/\s/g).join(''));
console.log(str.match(/\S/g).join(''));

// man
// Everymanintheworld123Letsgo
//     ! , , , ' !!
// 123
// Every man in the world! , , , Let's go!!
// (nine spaces)
// Everymanintheworld!1,2,3,Let'sgo!!

\b and \B match at a word boundary and at a non-boundary, respectively. A word boundary is a position where a word character is not directly adjacent to another word character before or after it. Note that the boundary itself is not included in the match; in other words, the matched word boundary has zero length. (Don't confuse \b with [\b], which matches a backspace character.)

var str = "If you love yourself, you can jump into your life from a springboard of self-confidence. If you love yourself, you can say what you want to say, go where you want to go.";

console.log(str.match(/your\b/)); // -> the standalone word "your"
console.log(str.match(/your\B/)); // -> the "your" inside "yourself"

// [ 'your',
//   index: 40,
//   input: 'If you love yourself, you can jump into your life from a springboard of self-confidence. If you love yourself, you can say what you want to say, go where you want to go.',
//   groups: undefined ]
// [ 'your',
//   index: 12,
//   input: 'If you love yourself, you can jump into your life from a springboard of self-confidence. If you love yourself, you can say what you want to say, go where you want to go.',
//   groups: undefined ]
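
A common use of \b is whole-word matching. The sketch below, with a shortened sample sentence, also shows the [\b] case mentioned above.

```javascript
const sentence = "If you love yourself, you can jump into your life";

// \b on both sides: only the standalone word "your".
const wholeWord = sentence.match(/\byour\b/g);
console.log(wholeWord); // [ 'your' ]

// Without boundaries, the "your" inside "yourself" is counted too.
const anywhere = sentence.match(/your/g);
console.log(anywhere.length); // 2

// Inside a character class, \b means the backspace character (U+0008).
console.log(/[\b]/.test("\b"));     // true
console.log(/[\b]/.test("a word")); // false
```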

I don't know much about numeric character escapes. All three patterns below match the letter W.

var str = "If you love yourself, you can jump into your life from a springboard of self-confidence. If you love yourself, you can say what you want to say, go where you want to go.";

console.log(str.match(/\127/gi)); // Character specified by the octal number xxx
console.log(str.match(/\x57/gi)); // Character specified by the hexadecimal number dd
console.log(str.match(/\u0057/gi)); // Unicode character specified by the hexadecimal number xxxx

The remaining metacharacters are self-explanatory and are not covered here.

Quantifiers

var str = "n, On, Oon, Ooon";

console.log(str.match(/o+n/gi));
console.log(str.match(/o*n/gi));
console.log(str.match(/o?n/gi));

console.log(str.match(/o{1}n/gi));
console.log(str.match(/o{1,2}n/gi));
console.log(str.match(/o{3,}n/gi));

console.log(str.match(/^n/gi));
console.log(str.match(/on$/gi));

console.log(str.match(/O(?=on)/));
console.log(str.match(/O(?!on)/));

// [ 'On', 'Oon', 'Ooon' ]
// [ 'n', 'On', 'Oon', 'Ooon' ]
// [ 'n', 'On', 'on', 'on' ]
// [ 'On', 'on', 'on' ]
// [ 'On', 'Oon', 'oon' ]
// [ 'Ooon' ]
// [ 'n' ]
// [ 'on' ]
// [ 'O', index: 7, input: 'n, On, Oon, Ooon', groups: undefined ]
// [ 'O', index: 3, input: 'n, On, Oon, Ooon', groups: undefined ]
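
Quantifiers and lookaheads are often combined. As one possible illustration, the sketch below inserts thousands separators into a string of digits; groupDigits is a hypothetical helper name, and it assumes its input contains digits only.

```javascript
// Hypothetical helper: insert thousands separators into a string of digits.
function groupDigits(digits) {
  // \B skips the start of the string; (?=(\d{3})+(?!\d)) requires that the
  // digits to the right form complete groups of three up to the end.
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

console.log(groupDigits("1234567")); // 1,234,567
console.log(groupDigits("1000"));    // 1,000
console.log(groupDigits("123"));     // 123
```

Because lookaheads have zero length, nothing is consumed and replace() only inserts the comma at each matching position.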

Capture

Parameter    Description
(n)          Match n and capture the text into an automatically numbered group.
(?:n)        Match n but do not capture the result; this is a non-capturing group and nothing is stored for later use.

One of the most important features of regular expressions is the ability to store the part of the text matched by part of a pattern for later use. Putting parentheses () around a pattern or part of a pattern stores what that part matched in a temporary buffer. Each captured sub-match is stored in the order in which its group is encountered, from left to right, in the pattern. Buffer numbers start at 1 and run consecutively up to a maximum of 99 subexpressions. Each buffer can be accessed using \n inside the pattern (or $n in a replacement string), where n is a one- or two-digit decimal number from 1 to 99 that identifies a particular buffer (subexpression).

var str = "I am the best of the best in the best place";
console.log(str.match(/(the best).*\1/g)) //["the best of the best in the best"]

Notice that \1 matches only a repetition of the text captured by group 1, not the group's pattern.

var str = "aa bb ab";
console.log(str.match(/(\w)\1/g))//["aa", "bb"]
console.log(str.match(/(?:\w)\1/g))//null

In other words, once the group has matched a, \1 can only match a, even though the group's pattern (\w) matches any word character.
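
A typical use of backreferences is spotting accidentally doubled words. This is only a sketch with a made-up sentence:

```javascript
const draft = "This is is a test of the the parser";

// (\w+) captures a word; \1 must then repeat exactly that captured text.
const doubled = draft.match(/\b(\w+) \1\b/g);
console.log(doubled); // [ 'is is', 'the the' ]

// In a replacement string the same group is written $1.
const cleaned = draft.replace(/\b(\w+) \1\b/g, "$1");
console.log(cleaned); // This is a test of the parser
```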
The following is a brief look at alternation ("or"):

var str = "better best";

console.log(str.match(/(better|best)/g));
console.log(str.match(/be(?:tter|st)/g));

Greedy and Lazy Matching

Greedy mode matches as many characters as possible.
Lazy mode matches as few characters as possible.
The difference in syntax is that a lazy quantifier is a greedy quantifier followed by ?.

var str = "<p>123</p><p>abc</p>";

console.log(str.match(/<p>\S*<\/p>/)); // ["<p>123</p><p>abc</p>", index: 0, input: "<p>123</p><p>abc</p>"]
console.log(str.match(/<p>\S*?<\/p>/)); // ["<p>123</p>", index: 0, input: "<p>123</p><p>abc</p>"]
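
Combining a lazy quantifier with the g flag returns one match per element. The following is a small sketch on the same kind of input; matchAll() is an ES2020 method not used in the original article.

```javascript
const html = "<p>123</p><p>abc</p>";

// Lazy quantifier plus the g flag: one match per paragraph.
const paragraphs = html.match(/<p>.*?<\/p>/g);
console.log(paragraphs); // [ '<p>123</p>', '<p>abc</p>' ]

// A capture group with matchAll extracts just the inner text.
const inner = Array.from(html.matchAll(/<p>(.*?)<\/p>/g), (m) => m[1]);
console.log(inner); // [ '123', 'abc' ]

// The greedy version swallows everything up to the last closing tag.
const greedy = html.match(/<p>.*<\/p>/g);
console.log(greedy); // [ '<p>123</p><p>abc</p>' ]
```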

Regular expressions have many more powerful features. This article covers only about half of them, the commonly used ones, because my own level is limited. The other half is only partly supported in JavaScript and is not needed at this stage.

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Now let's get to the main topic. First, a simple date-matching exercise using the most basic digit matching.

var str = "2017.04.10 2017-4-10 2017/04/1";

// Basic version
console.log(str.match(/[0-9]{4}(\.|-|\/)[0-9]{1,2}(\.|-|\/)[0-9]{1,2}/g));
// Metacharacter version
console.log(str.match(/\d{4}.\d{1,2}.\d{1,2}/g));
// Quantifier version
console.log(str.match(/\d{4}(.\d+){2}/g));

// [ '2017.04.10', '2017-4-10', '2017/04/1' ]
// [ '2017.04.10', '2017-4-10', '2017/04/1' ]
// [ '2017.04.10', '2017-4-10', '2017/04/1' ]
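
Note that the unescaped dot in the last two patterns matches any character, not just a separator. The sketch below shows this and one possible stricter variant that uses a backreference so both separators must be the same; datePatt and the extra test inputs are assumptions for illustration.

```javascript
// The unescaped dot matches any character, so this input is accepted by mistake.
const loose = /\d{4}.\d{1,2}.\d{1,2}/.test("2017x04y10");
console.log(loose); // true

// ([.\/-]) captures the first separator and \1 requires the same one again.
const datePatt = /\b\d{4}([.\/-])\d{1,2}\1\d{1,2}\b/g;
const input = "2017.04.10 2017-4-10 2017/04/1 2017x04y10 2017.04-10";
const strictDates = input.match(datePatt);
console.log(strictDates); // [ '2017.04.10', '2017-4-10', '2017/04/1' ]
```

This still checks only the shape of the text; it does not verify that the month and day are valid calendar values.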

A common check for a payment amount: a plain number greater than 0 with at most two decimal places.

var reg = /^(0|[1-9]\d*)?(\.\d{1,2})?$/g;

console.log('023'.match(reg));
console.log('.5'.match(reg));
console.log('255'.match(reg));
console.log('255.1'.match(reg));
console.log('255.31'.match(reg));
console.log('255.313'.match(reg));

// null
// [ '.5' ]
// [ '255' ]
// [ '255.1' ]
// [ '255.31' ]
// null

This took a while to work out. Many of the answers found online are wrong or incomplete. For example:
/^\d*\.?\d{0,2}$/: It also matches malformed input such as 012.
/(^1-9?(.[0-9]{1,2})?$)|(^(0){1}$)|(^[0-9].0-9 ?$)/: Without even looking closely you can tell this is not a suitable answer. It amounts to listing each possibility separately and does not play to the strengths of regular expressions, so I did not study it further.

Step by step, here is what I wrote:
[1-9]\d*: Starts with a digit from 1 to 9, followed by zero or more digits.
^(0|[1-9]\d*)?: Alternatively the integer part may be a single 0, or be omitted altogether.
(\.\d{1,2})?$: The fractional part, one or two decimal digits, is optional.
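
Because both groups in the pattern are optional, it also accepts the empty string, and it accepts zero although the stated rule is "greater than 0". The sketch below shows this and one possible tightening. isValidAmount is a hypothetical helper, and requiring a leading digit (so that ".5" is rejected) is a design choice of this example, not of the original pattern.

```javascript
const amountPatt = /^(0|[1-9]\d*)?(\.\d{1,2})?$/;

// Both groups are optional, so the empty string passes, and so does zero.
console.log(amountPatt.test(""));  // true
console.log(amountPatt.test("0")); // true

// Hypothetical helper: require an integer part and a value above zero.
function isValidAmount(text) {
  return /^(0|[1-9]\d*)(\.\d{1,2})?$/.test(text) && Number(text) > 0;
}

console.log(isValidAmount("255.31")); // true
console.log(isValidAmount("0.5"));    // true
console.log(isValidAmount("0"));      // false
console.log(isValidAmount(""));       // false
console.log(isValidAmount("023"));    // false
console.log(isValidAmount(".5"));     // false, a leading digit is required here
```

The patterns here omit the g flag on purpose, since a global pattern used with test() keeps state in lastIndex between calls.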

A previous interview question asked me to replace the variables in a string, as follows:

var str = "Hello ${name},you are so ${praise}",
  obj = {
    name: 'High circle',
    praise: 'goodly'
  };

console.log(str.replace(/\$\{([^{}]+)\}/g, function (match, key) {
  return obj[key]
}))
// Hello High circle,you are so goodly
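
With the code above, a placeholder whose key is missing from obj would be replaced by the text "undefined". One way to handle that, sketched here with a hypothetical fillTemplate helper, is to leave unknown placeholders untouched:

```javascript
// Hypothetical helper: fill ${key} placeholders from a plain object.
function fillTemplate(template, values) {
  return template.replace(/\$\{([^{}]+)\}/g, function (match, key) {
    // Keep the placeholder as-is when the key is unknown,
    // instead of inserting the string "undefined".
    return Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match;
  });
}

const filled = fillTemplate("Hello ${name}, you are ${age} and ${mood}", { name: "Ann", age: 30 });
console.log(filled); // Hello Ann, you are 30 and ${mood}
```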

Ending - -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
There is a lot more that could be said, but for now this is enough to get familiar with the basics. When I have time, I will continue with some more advanced topics.

Tags: Javascript Attribute ECMAScript less

Posted on Wed, 31 Jul 2019 04:37:14 -0700 by fitzsic