Shell script regular expressions: one of the three swordsmen (grep, egrep)

Regular expressions for Shell scripts

One of the three regular expression swordsmen: grep

1. Before learning regular expressions, let's create a throwaway configuration file to practice on.

[root@localhost ~]# vim chen.txt

#version=DEVEL
# System authorization information
auth --enableshadow --passalgo=sha512
# Use CDROM installation media
cdrom
thethethe
THE
THEASDHAS
# Use graphical install
graphical
# Run the Setup Agent on first boot
firstboot --enable
ignoredisk --only-use=sda
wood
wd
wod
woooooooood
124153
3234
342222222
faasd11
2
ZASASDNA
short
shirt

2. Find specific characters

"-n" displays the line number of each match.
"-i" makes the match case-insensitive.
"-v" inverts the selection. Finding lines that do not contain "the" requires the "-v" option, used here together with "-n" as "-vn".
When color output is enabled, the characters that match are shown in red.

[root@localhost ~]# grep -n 'the' chen.txt
6:thethethe
11:# Run the Setup Agent on first boot
[root@localhost ~]# grep -in 'the' chen.txt
6:thethethe
7:THE
8:THEASDHAS
11:# Run the Setup Agent on first boot
[root@localhost ~]# grep -vn 'the' chen.txt
1:#version=DEVEL
2:# System authorization information
3:auth --enableshadow --passalgo=sha512
4:# Use CDROM installation media
5:cdrom
7:THE
8:THEASDHAS
9:# Use graphical install
10:graphical
12:firstboot --enable
13:ignoredisk --only-use=sda
14:wood
15:wd
16:wod
17:woooooooood
18:124153
19:3234
20:342222222
21:faasd11
22:2
23:ZASASDNA
24:short
25:shirt
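
The three options can be tried side by side on a tiny file. The sketch below is only an illustration: the file is a temporary one created with mktemp, its three lines are made up for the example rather than taken from chen.txt, and the variable names are arbitrary.

```shell
# Build a small throwaway sample (contents are illustrative, not chen.txt).
sample=$(mktemp)
printf '%s\n' 'thethethe' 'THE' 'cdrom' > "$sample"

# -n: prefix each matching line with its line number.
with_n=$(grep -n 'the' "$sample")
# -i: ignore case, so "THE" matches as well.
with_in=$(grep -in 'the' "$sample")
# -v: invert the match, printing the lines that do NOT contain "the".
with_vn=$(grep -vn 'the' "$sample")

printf '%s\n' '-n:' "$with_n" '-in:' "$with_in" '-vn:' "$with_vn"
rm -f "$sample"
```

Note that "-v" without "-i" still treats "THE" as a non-match, which is why it appears in the inverted result.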

3. Brackets "[]" to find set characters
The strings "shirt" and "short" both contain "sh" and "rt", so a single command can find both of them. No matter how many characters are listed inside "[]", the set stands for exactly one character, so "[io]" matches either "i" or "o".

[root@localhost ~]# grep -n 'sh[io]rt' chen.txt    # match "short" or "shirt" using the set [io]
24:short
25:shirt
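
To check that the set really consumes exactly one character, the following sketch adds two near misses. The sample words and the temporary file are assumptions made for this illustration only.

```shell
# Illustrative sample: two words that should match and two that should not.
sample=$(mktemp)
printf '%s\n' 'short' 'shirt' 'shart' 'shrt' > "$sample"

# [io] stands for exactly one character, which must be "i" or "o".
# "shart" fails because "a" is not in the set; "shrt" fails because
# the set cannot match zero characters.
set_match=$(grep -n 'sh[io]rt' "$sample")

echo "$set_match"
rm -f "$sample"
```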

To find lines containing two consecutive "o" characters ("oo"), run the following command.

[root@localhost ~]# grep -n 'oo' chen.txt 
11:# Run the Setup Agent on first boot
12:firstboot --enable
14:wood
17:woooooooood

To require that the character before "oo" is not "w", use the negated set "[^]". For example, "grep -n '[^w]oo' chen.txt" finds the lines of chen.txt in which "oo" is preceded by a character other than "w".

[root@localhost ~]# grep -n '[^w]oo' chen.txt    # match "oo" that is not preceded by "w"
11:# Run the Setup Agent on first boot
12:firstboot --enable
17:woooooooood

In the output above, "woooooooood" also satisfies the rule even though it contains "w". The highlighted characters show why: the match begins at an "o", so the character in front of "oo" is "o" rather than "w", which the rule allows.
To exclude any lowercase letter in front of "oo", use "grep -n '[^a-z]oo' chen.txt", where "a-z" means the lowercase letters and "A-Z" means the uppercase letters. The output below comes from a later version of chen.txt to which lines such as "Foofddd" had been added, so its line numbers differ from the original listing.

[root@localhost ~]# grep -n '[^a-z]oo' chen.txt 
19:Foofddd
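
One way to see which characters actually matched, without relying on color, is grep's "-o" option, which prints only the matched text. This is a small sketch rather than part of the original walkthrough: "-o" is supported by GNU and BSD grep but is not used elsewhere in this article, and the three input words are chosen for illustration.

```shell
# -o prints only the matched text, standing in for the color highlighting.
# "wood":   no match, the only "oo" is preceded by "w".
# "wooood": matches "ooo", because an "o" precedes the later "oo".
# "boot":   matches "boo", because "b" is not "w".
matched=$(printf '%s\n' 'wood' 'wooood' 'boot' | grep -o '[^w]oo')

echo "$matched"
```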

To find lines that contain a digit, use the "grep -n '[0-9]' chen.txt" command.

[root@localhost ~]# grep -n '[0-9]' chen.txt
3:auth --enableshadow --passalgo=sha512
20:124153
21:3234
22:342222222
23:faasd11
24:2

Find the beginning of a line "^" and the end of a line "$"
To find lines that begin with "the", anchor the pattern with "^".

[root@localhost ~]# grep -n '^the' chen.txt
6:thethethe

Lines that begin with a lowercase letter can be matched with the "^[a-z]" rule.

[root@localhost ~]# grep -n '^[a-z]' chen.txt
3:auth --enableshadow --passalgo=sha512
5:cdrom
6:thethethe
10:graphical
12:firstboot --enable
13:ignoredisk --only-use=sda
14:wood
15:wd
16:wod
17:woooooooood
18:dfsjdjoooooof
23:faasd11
26:short
27:shirt

Lines that begin with an uppercase letter can be matched with the "^[A-Z]" rule.

[root@localhost ~]# grep -n '^[A-Z]' chen.txt
7:THE
8:THEASDHAS
19:Foofddd
25:ZASASDNA

To find lines that do not begin with a letter, use the "^[^a-zA-Z]" rule.

[root@localhost ~]# grep -n '^[^a-zA-Z]' chen.txt
1:#version=DEVEL
2:# System authorization information
4:# Use CDROM installation media
9:# Use graphical install
11:# Run the Setup Agent on first boot
20:124153
21:3234
22:342222222
24:2

The "^" symbol means different things inside and outside the "[]" set: inside "[]" it negates the set, while outside "[]" it anchors the match to the beginning of the line. Conversely, to find lines that end with a particular character, use the "$" anchor. For example, the following command finds lines that end with a period (.). Because the period is itself a metacharacter in regular expressions (described later), it must be preceded by the escape character "\" so that it is treated as an ordinary character.

[root@localhost ~]# grep -n '\.$' chen.txt
5:cdrom.
6:thethethe.
9:# Use graphical install.
10:graphical.
11:# Run the Setup Agent on first boot.

To find blank lines, run "grep -n '^$' chen.txt".
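
The anchors from this section can be compared on one small input. The following is a minimal sketch under stated assumptions: the five sample lines and the temporary file are invented for the illustration and are not the contents of chen.txt.

```shell
# Illustrative sample: line 3 is intentionally empty.
sample=$(mktemp)
printf '%s\n' 'the start' 'at the end.' '' '#comment' '3.14 is pi' > "$sample"

starts_with_the=$(grep -n '^the' "$sample")          # "the" at the start of a line
ends_with_dot=$(grep -n '\.$' "$sample")             # a literal "." at the end of a line
blank_lines=$(grep -n '^$' "$sample")                # start immediately followed by end
not_letter_first=$(grep -n '^[^a-zA-Z]' "$sample")   # first character is not a letter

printf '%s\n' "$starts_with_the" "$ends_with_dot" "$blank_lines" "$not_letter_first"
rm -f "$sample"
```

The empty line is not reported by '^[^a-zA-Z]', because a negated set still has to match one character.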

Find any character "." and repeated characters "*"
The period (.) is also a metacharacter in regular expressions and stands for any single character. For example, the following command finds strings of the form "w??d", i.e. four characters that begin with "w" and end with "d".

[root@localhost ~]# grep -n 'w..d' chen.txt
14:wood

In the result above, the string "wood" matches the "w..d" rule. To match oo, ooo, oooo and so on, use the asterisk (*) metacharacter. Note that "*" means zero or more repetitions of the preceding single character. "o*" matches zero or more "o" characters; because the empty string is allowed, running "grep -n 'o*' chen.txt" prints every line of the file. With "oo*", the first "o" must be present and the second "o" may occur zero or more times, so o, oo, ooo and so on all match. Likewise, to find strings containing at least two "o" characters, run "grep -n 'ooo*' chen.txt".

[root@localhost ~]# grep -n 'ooo*' chen.txt
11:# Run the Setup Agent on first boot.
12:firstboot --enable
14:wood
17:woooooooood
18:dfsjdjoooooof
19:Foofddd

To find strings that begin with "w", end with "d", and contain at least one "o" in between, run the following command.

[root@localhost ~]# grep -n 'woo*d' chen.txt
14:wood
16:wod
17:woooooooood

To find strings that begin with "w" and end with "d", where the characters in between are optional, run the following command.

[root@localhost ~]# grep -n 'w.*d' chen.txt
14:wood
15:wd
16:wod
17:woooooooood

To find lines that contain a run of one or more digits, run the following command.

[root@localhost ~]# grep -n '[0-9][0-9]*' chen.txt
3:auth --enableshadow --passalgo=sha512
20:124153
21:3234
22:342222222
23:faasd11
24:2
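
The difference between "o*", "oo*" and ".*" is easier to see on a handful of words. This is a hedged sketch: the sample words and temporary file are assumptions for the illustration, and "-c" (count matching lines) is a standard grep option that the article itself does not use.

```shell
# Illustrative sample: words with zero to three "o" characters, plus "abc".
sample=$(mktemp)
printf '%s\n' 'wd' 'wod' 'wood' 'woood' 'abc' > "$sample"

# "o*" also matches the empty string, so every line counts as a match.
any_line=$(grep -c 'o*' "$sample")
# "woo*d": "w", one "o", then zero or more further "o", then "d".
one_plus=$(grep -n 'woo*d' "$sample")
# "w.*d": "w", any characters (possibly none), then "d".
anything=$(grep -n 'w.*d' "$sample")

printf '%s\n' "$any_line" "$one_plus" "$anything"
rm -f "$sample"
```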

Find a range of consecutive characters "{}"
The "." and "*" metacharacters allow anywhere from zero to an unlimited number of repetitions. What if you want to restrict the repetition to a range? For example, to find three to five consecutive "o" characters, use the interval expression "{}" of basic regular expressions. In a basic regular expression an unescaped "{" or "}" is an ordinary character, so each brace must be preceded by the escape character "\", written "\{" and "\}", for it to act as a repetition count.

To find lines that contain at least two consecutive "o" characters:

[root@localhost ~]# grep -n 'o\{2\}' chen.txt
11:# Run the Setup Agent on first boot.
12:firstboot --enable
14:wood
17:woooooooood
18:dfsjdjoooooof
19:Foofddd

To find strings that begin with "w" and end with "d", with two to five "o" characters in between:

[root@localhost ~]# grep -n 'wo\{2,5\}d' chen.txt
14:wood

To find strings that begin with "w" and end with "d", with two or more "o" characters in between:

[root@localhost ~]# grep -n 'wo\{2,\}d' chen.txt
14:wood
17:woooooooood
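
The three interval forms can be compared directly. The following sketch assumes a small invented sample (a temporary file, not chen.txt) whose last word has more than five "o" characters.

```shell
# Illustrative sample: one "o", two "o", a few "o", and more than five "o".
sample=$(mktemp)
printf '%s\n' 'wod' 'wood' 'wooood' 'woooooood' > "$sample"

exact=$(grep -n 'wo\{2\}d' "$sample")      # exactly two "o" between "w" and "d"
range=$(grep -n 'wo\{2,5\}d' "$sample")    # between two and five "o"
at_least=$(grep -n 'wo\{2,\}d' "$sample")  # two or more "o", no upper limit

printf '%s\n' 'exact:' "$exact" 'range:' "$range" 'at least:' "$at_least"
rm -f "$sample"
```

Because "w" and "d" surround the interval, the count applies to the whole run of "o" characters; without them, "o\{2\}" would match inside any longer run.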

II. Extended Regular Expressions

Extended regular expressions offer a wider set of metacharacters that can shorten a whole command. For example, to show every line of a file except blank lines and lines beginning with "#" (a common way to view the effective settings in a configuration file), basic regular expressions need two grep commands joined by a pipe: "grep -v '^$' chen.txt | grep -v '^#'". With extended regular expressions this becomes the single command "egrep -v '^$|^#' chen.txt", where the pipe symbol inside the single quotes means "or".
In addition, grep supports only basic regular expressions by default; to use extended regular expressions you need egrep (equivalent to "grep -E") or awk. The awk command is explained in a later section, so egrep is used directly here. egrep is used in much the same way as grep: it searches one or more files for a pattern, and the pattern can be a single character, a string, a word, or a sentence.
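
The two approaches can be checked against each other. This is a minimal sketch with an invented five-line sample in a temporary file; it uses "grep -E", which is equivalent to the egrep command used in this article.

```shell
# Illustrative sample: comments, a blank line, and two effective settings.
sample=$(mktemp)
printf '%s\n' '# a comment' '' 'cdrom' '#version=DEVEL' 'graphical' > "$sample"

# Basic regular expressions: two grep passes joined by a pipe.
two_pass=$(grep -v '^$' "$sample" | grep -v '^#')
# Extended regular expressions: one pass, with "|" meaning "or".
one_pass=$(grep -Ev '^$|^#' "$sample")

printf '%s\n' 'two passes:' "$two_pass" 'one pass:' "$one_pass"
rm -f "$sample"
```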
Common metacharacters of extended regular expressions mainly include the following:

"+" matches one or more of the preceding character. Example: running "egrep -n 'wo+d' chen.txt" finds strings such as "wood", "woood", and "woooooood".

[root@localhost ~]# egrep -n 'wo+d' chen.txt
14:wood
16:wod
17:woooooooood

"?" matches zero or one of the preceding character. Example: running "egrep -n 'bes?t' chen.txt" finds the two strings "bet" and "best".

[root@localhost ~]# egrep -n 'bes?t' chen.txt
11:best
12:bet

"|" matches either alternative. Example: running "egrep -n 'of|is|on' chen.txt" finds lines containing "of", "is", or "on".

[root@localhost ~]# egrep -n 'of|is|on' chen.txt
1:#version=DEVEL
2:# System authorization information
4:# Use CDROM installation media
13:# Run the Setup Agent on first boot.
15:ignoredisk --only-use=sda
20:dfsjdjoooooof
21:Foofddd

"()" groups part of a pattern. Example: "egrep -n 't(a|e)st' chen.txt". The strings "tast" and "test" share the "t" and "st" parts, so the differing letters "a" and "e" are listed inside "()" and separated by "|", which matches either "tast" or "test".

[root@localhost ~]# egrep -n 't(a|e)st' chen.txt
12:test
13:tast

"()+" matches one or more repetitions of a group. Example: "egrep -n 'A(xyz)+C' chen.txt". This finds strings that begin with "A", end with "C", and contain one or more "xyz" strings in between.

[root@localhost ~]# egrep -n 'A(xyz)+C' chen.txt
14:AxyzxyzxyzC
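
The extended metacharacters above can be exercised together, each with a near miss that should not match. This sketch is an illustration under assumptions: the word list and temporary file are invented, and "grep -E" is used as the equivalent of egrep.

```shell
# Illustrative sample: for each operator, words that match and one that does not.
sample=$(mktemp)
printf '%s\n' 'wd' 'wod' 'wood' 'bet' 'best' 'besst' \
    'test' 'tast' 'tost' 'AC' 'AxyzC' 'AxyzxyzC' > "$sample"

plus=$(grep -E 'wo+d' "$sample")        # one or more "o": "wd" is excluded
optional=$(grep -E 'bes?t' "$sample")   # zero or one "s": "besst" is excluded
group=$(grep -E 't(a|e)st' "$sample")   # "a" or "e" only: "tost" is excluded
repeat=$(grep -E 'A(xyz)+C' "$sample")  # one or more "xyz": "AC" is excluded

printf '%s\n' "$plus" "$optional" "$group" "$repeat"
rm -f "$sample"
```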

Posted on Fri, 11 Oct 2019 04:33:25 -0700 by DataRater