Nginx forwarding and matching rules that back-end programmers must know


1, Regular expression matching

  1. ~ matches a regular expression, case-sensitive
  2. ~* matches a regular expression, case-insensitive
  3. !~ and !~* are the negated forms: case-sensitive non-match and case-insensitive non-match respectively
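
The operators above can be sketched as follows. This is only an illustration meant to sit inside a server block; the /download/ path and the file extensions are assumptions chosen for the example, not part of any real site.

```nginx
# Case-sensitive: matches /photo.jpg but not /photo.JPG
location ~ \.jpg$ {
    expires 30d;
}

# Case-insensitive: matches both /logo.png and /logo.PNG
location ~* \.png$ {
    expires 30d;
}

location /download/ {
    # Negated, case-insensitive: refuse anything that does not end in .zip
    if ($uri !~* "\.zip$") {
        return 403;
    }
}
```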

2, File and directory matching

  1. -f and !-f test whether a file exists or does not exist

  2. -d and !-d test whether a directory exists or does not exist

  3. -e and !-e test whether a file, directory, or symbolic link exists or does not exist

  4. -x and !-x test whether a file is executable or not
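
A minimal sketch of two of these tests in use, assuming a hypothetical maintenance.html flag file in the document root and a hypothetical /uploads/ location:

```nginx
location / {
    # -f: answer 503 for as long as the flag file exists
    if (-f $document_root/maintenance.html) {
        return 503;
    }
}

location /uploads/ {
    # !-f: the requested path is not an existing regular file
    if (!-f $request_filename) {
        return 404;
    }
}
```

Removing the flag file switches the first check off again without a reload, which is the usual reason for testing a file rather than hard-coding the response.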

3, The last parameter of the rewrite directive is the flag, which can be one of

  1. last: roughly equivalent to the [L] flag in Apache. Processing of the current set of rewrite rules stops, and Nginx searches for a location matching the rewritten URI.
  2. break: once this rule has matched, processing stops and the rewrite rules that follow are not applied.
  3. redirect: returns a 302 temporary redirect; the browser address bar shows the URL after the jump.
  4. permanent: returns a 301 permanent redirect; the browser address bar shows the URL after the jump.

last and break rewrite the URI internally, so the browser address bar is unchanged.

There are subtle differences between the two. With the alias directive you must use the last flag; with the proxy_pass directive you need the break flag. After a rule marked last completes, the request is processed again against the enclosing server {...} block using the rewritten URI, whereas break ends matching once the rule completes and processing continues in the current location.
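
The difference between the two flags can be sketched like this. The paths, the document root, and the back-end address 127.0.0.1:8080 are all assumptions made for the illustration.

```nginx
server {
    listen 80;
    root /var/www/example;   # hypothetical document root

    location /old/ {
        # last: the URI becomes /new/..., and Nginx searches the
        # locations again, so the request ends up in "location /new/"
        rewrite ^/old/(.*)$ /new/$1 last;
    }

    location /new/ {
        # handles direct requests and those rewritten from /old/
    }

    location /api/ {
        # break: the URI is rewritten, but processing stays in this
        # location, so the request is proxied with the new URI
        rewrite ^/api/(.*)$ /$1 break;
        proxy_pass http://127.0.0.1:8080;   # hypothetical back end
    }
}
```

If the rule in /api/ used last instead, the rewritten URI would leave that location and the proxy_pass line would never apply to it.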

For example, to rewrite a URL such as /photo/123456 to /path/to/photo/12/1234/123456.png (the regular expression is quoted because it contains curly braces):

rewrite "/photo/([0-9]{2})([0-9]{2})([0-9]{2})" "/path/to/photo/$1/$1$2/$1$2$3.png";

4, Nginx Rewrite related directives

1. break directive

Context: server, location, if

This directive stops processing of the current rule set; no further rewrite directives are processed.

2. if directive

Context: server, location

This directive checks whether a condition is met and, if so, executes the statements in braces. The if directive cannot be nested, and combining multiple conditions with && or || is not supported.
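
Since conditions cannot be combined directly, a common workaround is to accumulate a flag variable across several if blocks. The sketch below assumes a hypothetical variable $deny_flag and two arbitrary conditions (a POST request from a client whose User-Agent contains "curl"); it is one possible approach, not the only one.

```nginx
# Emulating "A and B" with a flag variable (hypothetical $deny_flag)
set $deny_flag "";

if ($request_method = POST) {
    set $deny_flag "${deny_flag}P";
}
if ($http_user_agent ~* "curl") {
    set $deny_flag "${deny_flag}C";
}

# The flag equals "PC" only when both conditions held
if ($deny_flag = "PC") {
    return 403;
}
```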

3. return directive

Syntax: return code;

Context: server, location, if

This directive stops rule processing and returns the given status code to the client.

Example: if the requested URL ends with ".sh" or ".bash", a 403 status code is returned


location ~ .*\.(sh|bash)$
{
   return 403;
}

4. rewrite directive

Syntax: rewrite regex replacement [flag];

Context: server, location, if

This directive rewrites the URI according to the regular expression, or modifies the string. Directives are executed in the order in which they appear in the configuration file. Note that the rewrite regex is matched only against the URI path, not the host name. To match the host name, use an if statement, for example:


if ($host ~* www\.(.*))
{
   set $host_without_www $1;
   # $1 below is the path captured by this rewrite, not the host
   rewrite ^(.*)$ http://$host_without_www$1 permanent;
}

5. set directive

Syntax: set $variable value;

Default: none

Context: server, location, if

This directive defines a variable and assigns it a value. The value can be text, another variable, or a combination of text and variables.


   set $varname "hello world";

6. uninitialized_variable_warn directive

Syntax: uninitialized_variable_warn on | off;

Context: http, server, location, if

This directive turns warnings about uninitialized variables on or off. The default value is on.

5, Examples of Nginx Rewrite rules

1. Rewrite to an html file when the requested file or directory does not exist


if (!-e $request_filename)
{
    rewrite ^/(.*)$ /index.html last;
}

2. Directory swapping: /123456/xxxx ====> /xxxx?id=123456

    rewrite ^/(\d+)/(.+)/  /$2?id=$1 last;

3. If the client uses the IE browser, rewrite to the /ie directory


if ($http_user_agent ~ MSIE)
{
    rewrite ^(.*)$ /ie/$1 break;
}

4. Prohibit access to multiple directories


location ~ ^/(cron|templates)/
{
    deny all;
    break;
}

5. Prohibit access to files starting with /data


location ~ ^/data
{
    deny all;
}

6. Prohibit access to files with the .sh, .flv, and .mp3 suffixes


location ~ .*\.(sh|flv|mp3)$
{
    return 403;
}

7. Set browser cache time for some types of files


location ~ .*\.(gif|jpg|jpeg|png|bmp|swf)$
{
    expires 30d;
}
location ~ .*\.(js|css)$
{
    expires 1h;
}

8. Set expiration time for favicon.ico and robots.txt

Here favicon.ico expires after 99 days and robots.txt after 7 days, and 404 errors for them are not logged


location ~ (favicon\.ico) {
   log_not_found off;
   expires 99d;
   break;
}
location ~ (robots\.txt) {
   log_not_found off;
   expires 7d;
   break;
}

9. Set the expiration time of a file; here it is 600 seconds, and the access log is not recorded


location ^~ /html/scripts/loadhead_1.js {
    access_log  off;
    root /opt/lampp/htdocs/web;
    expires 600;
    break;
}

10. Anti-hotlinking for files, with an expiration time

The return 412 here is a custom HTTP status code (the default would be 403), which makes hotlinking requests easy to identify. The key lines are:


rewrite ^/ http://img.linuxidc.net/leech.gif;   # show an anti-hotlinking image
access_log off;   # reduce load by not logging access
expires 3d;       # browsers cache all matched files for 3 days

location ~* ^.+\.(jpg|jpeg|gif|png|swf|rar|zip|css|js)$ {
    valid_referers none blocked *.linuxidc.com *.linuxidc.net localhost 208.97.167.194;
    if ($invalid_referer) {
        # a replacement starting with http:// sends the redirect at once;
        # drop this rewrite if the client should receive the 412 below instead
        rewrite ^/ http://img.linuxidc.net/leech.gif;
        return 412;
        break;
    }
    access_log  off;
    root /opt/lampp/htdocs/web;
    expires 3d;
    break;
}

11. Allow access only from fixed IPs, and add password protection


root /opt/htdocs/www;
allow  208.97.167.194; 
allow  222.33.1.2; 
allow  231.152.49.4;
deny  all;
auth_basic "C1G_ADMIN";
auth_basic_user_file htpasswd;

12. Present files under a multi-level directory as a single file to improve SEO


/job-123-456-789.html points to /job/123/456/jobshow_789.html

rewrite ^/job-([0-9]+)-([0-9]+)-([0-9]+)\.html$ /job/$1/$2/jobshow_$3.html last;

13. Proxy the request when the file or directory does not exist:


if (!-e $request_filename) {
    proxy_pass http://127.0.0.1;
}

14. Point a folder under the root directory to a second-level directory

For example, /shanghaijob/ points to /area/shanghai/
If you change last to permanent, the browser address bar will show /area/shanghai/

rewrite ^/([0-9a-z]+)job/(.*)$ /area/$1/$2 last;

The problem with the above example is that it does not match when /shanghaijob is requested without the trailing slash. Adding a second rule covers that case:

rewrite ^/([0-9a-z]+)job$ /area/$1/ last;
rewrite ^/([0-9a-z]+)job/(.*)$ /area/$1/$2 last;

Now /shanghaijob can also be accessed, but relative links in the page break.
For example, ./list_1.html, whose real address is /area/shanghai/list_1.html, becomes /list_1.html, which cannot be accessed.

An automatic trailing-slash redirect does not help either. The (-d $request_filename) test requires a real directory on disk, and the rewritten path is not one, so the following has no effect:

if (-d $request_filename){
rewrite ^/(.*)([^/])$ http://$host/$1$2/ permanent;
}

Once the cause is known the fix is simple: redirect manually.

rewrite ^/([0-9a-z]+)job$ /$1job/ permanent;
rewrite ^/([0-9a-z]+)job/(.*)$ /area/$1/$2 last;

15. Domain name jump


server{

  listen      80;
  server_name  jump.linuxidc.com;
  index index.html index.htm index.php;
  root  /opt/lampp/htdocs/www;
  rewrite ^/ http://www.linuxidc.com/;
  access_log  off;
}

16. Multi-domain redirect


server_name  www.linuxidc.com www.linuxidc.net;
index index.html index.htm index.php;
root  /opt/lampp/htdocs;
if ($host ~ "linuxidc\.net") {
    rewrite ^(.*) http://www.linuxidc.com$1 permanent;
}

6, Nginx global variables


arg_PARAMETER      #The value of the query-string parameter PARAMETER in the request.
args               #The query string of the request line, such as: foo=123&bar=blahblah
binary_remote_addr #The client address in binary form.
body_bytes_sent    #The number of body bytes sent in the response. The value is accurate even if the connection is broken.
content_length     #The Content-Length field in the request header.
content_type       #The Content-Type field in the request header.
cookie_COOKIE      #The value of the cookie named COOKIE.
document_root      #The value of the root directive for the current request.
document_uri       #Same as uri.
host               #The Host request header field, or the server name if the header is absent.
hostname           #The machine's host name as returned by gethostname.
http_HEADER        #The value of the request header HEADER (lowercase, dashes replaced by underscores).
is_args            #"?" if the request has arguments, otherwise an empty string.
http_user_agent    #Client user agent information.
http_cookie        #Client cookie information.
limit_rate         #Limits the connection rate.
query_string       #Same as args.
request_body_file  #The name of the temporary file holding the client request body.
request_method     #The request method, usually GET or POST.
remote_addr        #The IP address of the client.
remote_port        #The port of the client.
remote_user        #The user name authenticated by the Auth Basic module.
request_completion #Set to OK if the request has completed; empty otherwise, or if the request is not the last in the request chain.
request_filename   #The file path for the current request, derived from the root or alias directive and the request URI.
request_uri        #The original URI including the request arguments but not the host name, such as /foo/bar.php?arg=baz. Cannot be modified.
scheme             #The request scheme (http or https).
server_protocol    #The request protocol, usually HTTP/1.0 or HTTP/1.1.
server_addr        #The server address; determining it may require a system call.
server_name        #The server name.
server_port        #The port of the server that accepted the request.
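
As a rough sketch of how these variables are used in a configuration, each is written with a leading $. The log format name brief, the log path, the domain example.com, and the /whoami location are assumptions made for this example only.

```nginx
# http context: a compact (hypothetical) log format built from variables
log_format brief '$remote_addr $request_method $scheme://$host$request_uri '
                 '$body_bytes_sent "$http_user_agent"';

server {
    listen 80;
    server_name example.com;            # hypothetical domain
    access_log /var/log/nginx/brief.log brief;

    location /whoami {
        # $arg_name reads ?name=... from the query string
        return 200 "client=$remote_addr name=$arg_name";
    }
}
```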

7, Correspondence between Apache and Nginx rules

Apache's RewriteCond corresponds to Nginx's if
Apache's RewriteRule corresponds to Nginx's rewrite
Apache's [R] corresponds to Nginx's redirect
Apache's [P] corresponds to Nginx's last
Apache's [R,L] corresponds to Nginx's redirect
Apache's [P,L] corresponds to Nginx's last
Apache's [PT,L] corresponds to Nginx's last

For example: allow only the specified host names to reach the site, and redirect all other host names to www.linuxidc.net


Apache:
RewriteCond %{HTTP_HOST} !^(.*?)\.aaa\.com$ [NC]
RewriteCond %{HTTP_HOST} !^localhost$
RewriteCond %{HTTP_HOST} !^192\.168\.0\.(.*?)$
RewriteRule ^/(.*)$ http://www.linuxidc.net [R,L]

Nginx filtering example:

if ($host ~* ^(.*)\.aaa\.com$)
{
   set $allowHost '1';
}
if ($host ~* ^localhost)
{
   set $allowHost '1';
}
if ($host ~* ^192\.168\.0\.(.*?)$)
{
   set $allowHost '1';
}
if ($allowHost !~ '1')
{
   rewrite ^/(.*)$ http://www.linuxidc.net redirect;
}

Summary

Back-end development is the role closest to full stack. When front-end hands are short, the back end steps in to write pages and JS; when there is no operations team, the back end maintains the servers. In short, a good back-end developer can cover every base.


Tags: Programming Nginx Apache Java IE

Posted on Mon, 23 Mar 2020 03:30:03 -0700 by dtdetu