Ali question the location of the artifact Arthas Sao operation, positioning online BUG, super power

background
The company has a channel system, which is specially used to connect three-party channels. There is no business logic. It is mainly used to convert messages and verify parameters, which plays a connecting role.

Recently, after optimizing the response time of the interface and optimizing the code, the time still can't meet the requirements; there is a strange time-consuming problem of about 100ms. After printing the request processing time in the interface, the response time is still about 100ms less than that of the caller. For example, 150 ms is recorded in the program, but the waiting time of the caller is about 250 ms.

Next, record the detailed positioning & solution process at that time (in fact, the solution is very simple, the key is how to locate and find a solution to the problem)

Positioning process
Analysis code
Channel system is a common spring boot web project, which uses integrated tomcat. After analyzing the code, it is found that there is no special place, no special filter or interceptor, so the preliminary elimination is the business code problem

Analysis call process
After this problem occurs, first confirm the calling process of the next interface. Because it is an internal test, there are fewer calling processes.

Nginx - reverse agent - > channel system

The company is a cloud server, and the network is also the internal network of the cloud. Because the reason of the problem is not clear, so use the exclusion method to first confirm whether there is a problem in the server network.

First, confirm whether there is a problem from the sender to Nginx Host:

[jboss@VM_0_139_centos ~]$ ping 10.0.0.139
PING 10.0.0.139 (10.0.0.139) 56(84) bytes of data.
64 bytes from 10.0.0.139: icmp_seq=1 ttl=64 time=0.029 ms
64 bytes from 10.0.0.139: icmp_seq=2 ttl=64 time=0.041 ms
64 bytes from 10.0.0.139: icmp_seq=3 ttl=64 time=0.040 ms
64 bytes from 10.0.0.139: icmp_seq=4 ttl=64 time=0.040 ms

From the ping result, the delay from the sending end to the Nginx host is no problem. Next, check the network from Nginx to the channel system.

# Since the log is OK, you can directly copy the above Log here
[jboss@VM_0_139_centos ~]$ ping 10.0.0.139
PING 10.0.0.139 (10.0.0.139) 56(84) bytes of data.
64 bytes from 10.0.0.139: icmp_seq=1 ttl=64 time=0.029 ms
64 bytes from 10.0.0.139: icmp_seq=2 ttl=64 time=0.041 ms
64 bytes from 10.0.0.139: icmp_seq=3 ttl=64 time=0.040 ms
64 bytes from 10.0.0.139: icmp_seq=4 ttl=64 time=0.040 ms

From the ping result, the network delay from Nginx to the channel system server is no problem

Since the network seems to have no problems, we can continue to eliminate it. Cut off Nginx. The client directly connects to the server of the channel system through the local host. Avoid passing through the network card / dns. Narrow the scope of the problem to see if it can be reproduced (this application and address are simulated in my later stage, and the test is an empty interface):

[jboss@VM_10_91_centos tmp]$ curl -w "@curl-time.txt" http://127.0.0.1:7744/send
success
              http: 200
               dns: 0.001s
          redirect: 0.000s
      time_connect: 0.001s
   time_appconnect: 0.000s
  time_pretransfer: 0.001s
time_starttransfer: 0.073s
     size_download: 7bytes
    speed_download: 95.000B/s
                  ----------
        time_total: 0.073s Total request time

From the curl log, it takes 73 MS to call an empty interface through the loopback address. This is strange. Skipping all the calling nodes (including filters & interceptors) in the middle and directly requesting to apply an empty interface will take 73ms. Please ask again:

[jboss@VM_10_91_centos tmp]$ curl -w "@curl-time.txt" http://127.0.0.1:7744/send
success
              http: 200
               dns: 0.001s
          redirect: 0.000s
      time_connect: 0.001s
   time_appconnect: 0.000s
  time_pretransfer: 0.001s
time_starttransfer: 0.003s
     size_download: 7bytes
    speed_download: 2611.000B/s
                  ----------
        time_total: 0.003s

What's more, the second request time-consuming is normal, which becomes 3ms. After consulting the data, linux curl is enabled by default at http keep alive (for the introduction of keep alive, please refer to another article of mine). Even if you do not turn on keep alive, you will not need 70ms each time you handshake again.

After continuous analysis and testing, it is found that the time of continuous requests is very short, each request only takes a few milliseconds, but if you request again after a period of time, it will take more than 70ms.

It is speculated from this phenomenon that it may be caused by some caching mechanisms. Because of the cache, continuous requests are fast and take a long time after the cache fails.

So where is the problem? What about tomcat layer or spring webmvc?

We can't locate the problem just by guessing, but we have to actually test whether the code of the channel system can be reproduced in the local ide

However, after importing the local Ide, the problem cannot be repeated after starting in the Ide, and there is no 70+ms delay problem. It's a headache. It can't be reproduced locally or debugged. Because the problem is not in the business code, it can't be debugged by adding logs

At this time, you can sacrifice the artifact Arthas

Arthas analysis problems
Arthas is an open-source Java diagnostic tool of Alibaba, which is deeply loved by developers.
Arthas can help you solve the following problems when you are at a loss:

Which jar package does this class load from? Why are all kinds of exceptions reported?
Why didn't the code I changed execute? Didn't I commit? Wrong branch?
If there is a problem, you can't debug online. Can you only add logs and republish them?
There is a problem in the data processing of a user online, but it can't be debug ged online, and it can't be reproduced offline!
Is there a global perspective to see the health of the system?
What is the way to monitor the real-time running state of the JVM?

Above is the official profile of Arthas. This time I only need to use one of his small functions, trace. Dynamically calculate the method call path and time, so I can locate where the time is consumed.

trace method calls the path internally, and outputs the time consumption of each node on the method path
trace command can actively search class pattern / method pattern
The corresponding method call path renders and counts all the performance overhead and trace call links on the whole call link.

With artifact, what method should we follow? Because I'm not familiar with the Tomcat source code, I can only start with spring mvc. First, I'll trace the entry of spring mvc:

[arthas@24851]$ trace org.springframework.web.servlet.DispatcherServlet *
Press Q or Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:44) cost in 508 ms.
`---ts=2019-09-14 21:07:44;thread_name=http-nio-7744-exec-2;id=11;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@7c136917
    `---[2.952142ms] org.springframework.web.servlet.DispatcherServlet:buildLocaleContext()

`---ts=2019-09-14 21:07:44;thread_name=http-nio-7744-exec-2;id=11;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@7c136917
    `---[18.08903ms] org.springframework.web.servlet.DispatcherServlet:doService()
        +---[0.041346ms] org.apache.commons.logging.Log:isDebugEnabled() #889
        +---[0.022398ms] org.springframework.web.util.WebUtils:isIncludeRequest() #898
        +---[0.014904ms] org.springframework.web.servlet.DispatcherServlet:getWebApplicationContext() #910
        +---[1.071879ms] javax.servlet.http.HttpServletRequest:setAttribute() #910
        +---[0.020977ms] javax.servlet.http.HttpServletRequest:setAttribute() #911
        +---[0.017073ms] javax.servlet.http.HttpServletRequest:setAttribute() #912
        +---[0.218277ms] org.springframework.web.servlet.DispatcherServlet:getThemeSource() #913
        |   `---[0.137568ms] org.springframework.web.servlet.DispatcherServlet:getThemeSource()
        |       `---[min=0.00783ms,max=0.014251ms,total=0.022081ms,count=2] org.springframework.web.servlet.DispatcherServlet:getWebApplicationContext() #782
        +---[0.019363ms] javax.servlet.http.HttpServletRequest:setAttribute() #913
        +---[0.070694ms] org.springframework.web.servlet.FlashMapManager:retrieveAndUpdate() #916
        +---[0.01839ms] org.springframework.web.servlet.FlashMap:<init>() #920
        +---[0.016943ms] javax.servlet.http.HttpServletRequest:setAttribute() #920
        +---[0.015268ms] javax.servlet.http.HttpServletRequest:setAttribute() #921
        +---[15.050124ms] org.springframework.web.servlet.DispatcherServlet:doDispatch() #925
        |   `---[14.943477ms] org.springframework.web.servlet.DispatcherServlet:doDispatch()
        |       +---[0.019135ms] org.springframework.web.context.request.async.WebAsyncUtils:getAsyncManager() #953
        |       +---[2.108373ms] org.springframework.web.servlet.DispatcherServlet:checkMultipart() #960
        |       |   `---[2.004436ms] org.springframework.web.servlet.DispatcherServlet:checkMultipart()
        |       |       `---[1.890845ms] org.springframework.web.multipart.MultipartResolver:isMultipart() #1117
        |       +---[2.054361ms] org.springframework.web.servlet.DispatcherServlet:getHandler() #964
        |       |   `---[1.961963ms] org.springframework.web.servlet.DispatcherServlet:getHandler()
        |       |       +---[0.02051ms] java.util.List:iterator() #1183
        |       |       +---[min=0.003805ms,max=0.009641ms,total=0.013446ms,count=2] java.util.Iterator:hasNext() #1183
        |       |       +---[min=0.003181ms,max=0.009751ms,total=0.012932ms,count=2] java.util.Iterator:next() #1183
        |       |       +---[min=0.005841ms,max=0.015308ms,total=0.021149ms,count=2] org.apache.commons.logging.Log:isTraceEnabled() #1184
        |       |       `---[min=0.474739ms,max=1.19145ms,total=1.666189ms,count=2] org.springframework.web.servlet.HandlerMapping:getHandler() #1188
        |       +---[0.013071ms] org.springframework.web.servlet.HandlerExecutionChain:getHandler() #971
        |       +---[0.372236ms] org.springframework.web.servlet.DispatcherServlet:getHandlerAdapter() #971
        |       |   `---[0.280073ms] org.springframework.web.servlet.DispatcherServlet:getHandlerAdapter()
        |       |       +---[0.004804ms] java.util.List:iterator() #1224
        |       |       +---[0.003668ms] java.util.Iterator:hasNext() #1224
        |       |       +---[0.003038ms] java.util.Iterator:next() #1224
        |       |       +---[0.006451ms] org.apache.commons.logging.Log:isTraceEnabled() #1225
        |       |       `---[0.012683ms] org.springframework.web.servlet.HandlerAdapter:supports() #1228
        |       +---[0.012848ms] javax.servlet.http.HttpServletRequest:getMethod() #974
        |       +---[0.013132ms] java.lang.String:equals() #975
        |       +---[0.003025ms] org.springframework.web.servlet.HandlerExecutionChain:getHandler() #977
        |       +---[0.008095ms] org.springframework.web.servlet.HandlerAdapter:getLastModified() #977
        |       +---[0.006596ms] org.apache.commons.logging.Log:isDebugEnabled() #978
        |       +---[0.018024ms] org.springframework.web.context.request.ServletWebRequest:<init>() #981
        |       +---[0.017869ms] org.springframework.web.context.request.ServletWebRequest:checkNotModified() #981
        |       +---[0.038542ms] org.springframework.web.servlet.HandlerExecutionChain:applyPreHandle() #986
        |       +---[0.00431ms] org.springframework.web.servlet.HandlerExecutionChain:getHandler() #991
        |       +---[4.248493ms] org.springframework.web.servlet.HandlerAdapter:handle() #991
        |       +---[0.014805ms] org.springframework.web.context.request.async.WebAsyncManager:isConcurrentHandlingStarted() #993
        |       +---[1.444994ms] org.springframework.web.servlet.DispatcherServlet:applyDefaultViewName() #997
        |       |   `---[0.067631ms] org.springframework.web.servlet.DispatcherServlet:applyDefaultViewName()
        |       +---[0.012027ms] org.springframework.web.servlet.HandlerExecutionChain:applyPostHandle() #998
        |       +---[0.373997ms] org.springframework.web.servlet.DispatcherServlet:processDispatchResult() #1008
        |       |   `---[0.197004ms] org.springframework.web.servlet.DispatcherServlet:processDispatchResult()
        |       |       +---[0.007074ms] org.apache.commons.logging.Log:isDebugEnabled() #1075
        |       |       +---[0.005467ms] org.springframework.web.context.request.async.WebAsyncUtils:getAsyncManager() #1081
        |       |       +---[0.004054ms] org.springframework.web.context.request.async.WebAsyncManager:isConcurrentHandlingStarted() #1081
        |       |       `---[0.011988ms] org.springframework.web.servlet.HandlerExecutionChain:triggerAfterCompletion() #1087
        |       `---[0.004015ms] org.springframework.web.context.request.async.WebAsyncManager:isConcurrentHandlingStarted() #1018
        +---[0.005055ms] org.springframework.web.context.request.async.WebAsyncUtils:getAsyncManager() #928
        `---[0.003422ms] org.springframework.web.context.request.async.WebAsyncManager:isConcurrentHandlingStarted() #928
[jboss@VM_10_91_centos tmp]$ curl -w "@curl-time.txt" http://127.0.0.1:7744/send
success
              http: 200
               dns: 0.001s
          redirect: 0.000s
      time_connect: 0.001s
   time_appconnect: 0.000s
  time_pretransfer: 0.001s
time_starttransfer: 0.115s
     size_download: 7bytes
    speed_download: 60.000B/s
                  ----------
        time_total: 0.115s

In this call, the call end time is 115ms, but in terms of arthas trace, spring mvc only consumes 18ms, so where is the remaining 97ms?

After the local test, the problem of spring mvc can be eliminated. In the end, the only possible problem is tomcat

But I am not familiar with the source code in tomcat, even the request entry is not clear, and the classes that need trace in Tomcat are not easy to find...

However, it doesn't matter. There is a artifact called Arthas. You can use the stack command to find the calling path in reverse. Take org.springframework.web.servlet.DispatcherServlet as the parameter:

stack output the call path of the current method being called
A lot of times we all know that a method is executed, but there are many paths for the method to be executed, or you don't know where the method is executed at all. At this time, you need the stack command.

[arthas@24851]$ stack org.springframework.web.servlet.DispatcherServlet *
Press Q or Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:44) cost in 495 ms.
ts=2019-09-14 21:15:19;thread_name=http-nio-7744-exec-5;id=14;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@7c136917
    @org.springframework.web.servlet.FrameworkServlet.proce***equest()
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:866)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:635)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:851)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:109)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:81)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:496)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:803)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:790)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1468)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)

ts=2019-09-14 21:15:19;thread_name=http-nio-7744-exec-5;id=14;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@7c136917
    @org.springframework.web.servlet.DispatcherServlet.doService()
        at org.springframework.web.servlet.FrameworkServlet.proce***equest(FrameworkServlet.java:974)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:866)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:635)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:851)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:109)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:81)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:496)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:803)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:790)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1468)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)

From the stack log, you can see the call stack of the DispatchServlet intuitively. What kind of trace should be used for such a long path (skip the trace process of the filter in spring mvc here, and trace it once in the actual troubleshooting, but this weird time consumption is not generated by the filter here)? Experienced drivers can probably guess where to start from by name, which is org.apache.coyote.http11.Http11Processor.service. From the name, http1.1 processor, which may be a better starting point. Let's trace:

[arthas@24851]$ trace org.apache.coyote.http11.Http11Processor service
Press Q or Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 269 ms.
`---ts=2019-09-14 21:22:51;thread_name=http-nio-7744-exec-8;id=17;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@20ad9418
    `---[131.650285ms] org.apache.coyote.http11.Http11Processor:service()
        +---[0.036851ms] org.apache.coyote.Request:getRequestProcessor() #667
        +---[0.009986ms] org.apache.coyote.RequestInfo:setStage() #668
        +---[0.008928ms] org.apache.coyote.http11.Http11Processor:setSocketWrapper() #671
        +---[0.013236ms] org.apache.coyote.http11.Http11InputBuffer:init() #672
        +---[0.00981ms] org.apache.coyote.http11.Http11OutputBuffer:init() #673
        +---[min=0.00213ms,max=0.007317ms,total=0.009447ms,count=2] org.apache.coyote.http11.Http11Processor:getErrorState() #683
        +---[min=0.002098ms,max=0.008888ms,total=0.010986ms,count=2] org.apache.coyote.ErrorState:isError() #683
        +---[min=0.002448ms,max=0.007149ms,total=0.009597ms,count=2] org.apache.coyote.http11.Http11Processor:isAsync() #683
        +---[min=0.002399ms,max=0.00852ms,total=0.010919ms,count=2] org.apache.tomcat.util.net.AbstractEndpoint:isPaused() #683
        +---[min=0.033587ms,max=0.11832ms,total=0.151907ms,count=2] org.apache.coyote.http11.Http11InputBuffer:parseRequestLine() #687
        +---[0.005384ms] org.apache.tomcat.util.net.AbstractEndpoint:isPaused() #695
        +---[0.007924ms] org.apache.coyote.Request:getMimeHeaders() #702
        +---[0.006744ms] org.apache.tomcat.util.net.AbstractEndpoint:getMaxHeaderCount() #702
        +---[0.012574ms] org.apache.tomcat.util.http.MimeHeaders:setLimit() #702
        +---[0.14319ms] org.apache.coyote.http11.Http11InputBuffer:parseHeaders() #703
        +---[0.003997ms] org.apache.coyote.Request:getMimeHeaders() #743
        +---[0.026561ms] org.apache.tomcat.util.http.MimeHeaders:values() #743
        +---[min=0.002869ms,max=0.01203ms,total=0.014899ms,count=2] java.util.Enumeration:hasMoreElements() #745
        +---[0.070114ms] java.util.Enumeration:nextElement() #746
        +---[0.010921ms] java.lang.String:toLowerCase() #746   +---[0.008453ms] java.lang.String:contains() #746
        +---[0.002698ms] org.apache.coyote.http11.Http11Processor:getErrorState() #775
        +---[0.00307ms] org.apache.coyote.ErrorState:isError() #775
        +---[0.002708ms] org.apache.coyote.RequestInfo:setStage() #777
        +---[0.171139ms] org.apache.coyote.http11.Http11Processor:prepareRequest() #779
        +---[0.009349ms] org.apache.tomcat.util.net.SocketWrapperBase:decrementKeepAlive() #794
        +---[0.002574ms] org.apache.coyote.http11.Http11Processor:getErrorState() #800
        +---[0.002696ms] org.apache.coyote.ErrorState:isError() #800
        +---[0.002499ms] org.apache.coyote.RequestInfo:setStage() #802
        +---[0.005641ms] org.apache.coyote.http11.Http11Processor:getAdapter() #803
        +---[129.868916ms] org.apache.coyote.Adapter:service() #803
        +---[0.003859ms] org.apache.coyote.http11.Http11Processor:getErrorState() #809
        +---[0.002365ms] org.apache.coyote.ErrorState:isError() #809
        +---[0.003844ms] org.apache.coyote.http11.Http11Processor:isAsync() #809
        +---[0.002382ms] org.apache.coyote.Response:getStatus() #809
        +---[0.002476ms] org.apache.coyote.http11.Http11Processor:statusDropsConnection() #809
        +---[0.002284ms] org.apache.coyote.RequestInfo:setStage() #838
        +---[0.00222ms] org.apache.coyote.http11.Http11Processor:isAsync() #839
        +---[0.037873ms] org.apache.coyote.http11.Http11Processor:endRequest() #843
        +---[0.002188ms] org.apache.coyote.RequestInfo:setStage() #845
        +---[0.002112ms] org.apache.coyote.http11.Http11Processor:getErrorState() #849
        +---[0.002063ms] org.apache.coyote.ErrorState:isError() #849
        +---[0.002504ms] org.apache.coyote.http11.Http11Processor:isAsync() #853
        +---[0.009808ms] org.apache.coyote.Request:updateCounters() #854
        +---[0.002008ms] org.apache.coyote.http11.Http11Processor:getErrorState() #855
        +---[0.002192ms] org.apache.coyote.ErrorState:isIoAllowed() #855
        +---[0.01968ms] org.apache.coyote.http11.Http11InputBuffer:nextRequest() #856
        +---[0.010065ms] org.apache.coyote.http11.Http11OutputBuffer:nextRequest() #857
        +---[0.002576ms] org.apache.coyote.RequestInfo:setStage() #870
        +---[0.016599ms] org.apache.coyote.http11.Http11Processor:processSendfile() #872
        +---[0.008182ms] org.apache.coyote.http11.Http11InputBuffer:getParsingRequestLinePhase() #688
        +---[0.0075ms] org.apache.coyote.http11.Http11Processor:handleIncompleteRequestLineRead() #690
        +---[0.001979ms] org.apache.coyote.RequestInfo:setStage() #875
        +---[0.001981ms] org.apache.coyote.http11.Http11Processor:getErrorState() #877
        +---[0.001934ms] org.apache.coyote.ErrorState:isError() #877
        +---[0.001995ms] org.apache.tomcat.util.net.AbstractEndpoint:isPaused() #877
        +---[0.002403ms] org.apache.coyote.http11.Http11Processor:isAsync() #879
        `---[0.006176ms] org.apache.coyote.http11.Http11Processor:isUpgrade() #881

There is a 129ms time-consuming point in the log (the time is longer than the time when the arthas are not opened because of the performance consumption brought by the arthas themselves, so the production environment should be used carefully). This is the problem point to be found.

After finding the problem point, how to locate the problem caused by what and how to solve it?

Continue to trace, and refine to specific code block or content. Due to performance considerations, trace does not show all call paths. If the call path is too deep, only trace can be manually deepened. The principle is that trace takes a long time:

[arthas@24851]$ trace org.apache.coyote.Adapter service
Press Q or Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 608 ms.
`---ts=2019-09-14 21:34:33;thread_name=http-nio-7744-exec-1;id=10;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@20ad9418
    `---[81.70999ms] org.apache.catalina.connector.CoyoteAdapter:service()
        +---[0.032546ms] org.apache.coyote.Request:getNote() #302
        +---[0.007148ms] org.apache.coyote.Response:getNote() #303
        +---[0.007475ms] org.apache.catalina.connector.Connector:getXpoweredBy() #324
        +---[0.00447ms] org.apache.coyote.Request:getRequestProcessor() #331
        +---[0.007902ms] java.lang.ThreadLocal:get() #331
        +---[0.006522ms] org.apache.coyote.RequestInfo:setWorkerThreadName() #331
        +---[73.793798ms] org.apache.catalina.connector.CoyoteAdapter:postParseRequest() #336
        +---[0.001536ms] org.apache.catalina.connector.Connector:getService() #339
        +---[0.004469ms] org.apache.catalina.Service:getContainer() #339
        +---[0.007074ms] org.apache.catalina.Engine:getPipeline() #339
        +---[0.004334ms] org.apache.catalina.Pipeline:isAsyncSupported() #339
        +---[0.002466ms] org.apache.catalina.connector.Request:setAsyncSupported() #339
        +---[6.01E-4ms] org.apache.catalina.connector.Connector:getService() #342
        +---[0.001859ms] org.apache.catalina.Service:getContainer() #342
        +---[9.65E-4ms] org.apache.catalina.Engine:getPipeline() #342
        +---[0.005231ms] org.apache.catalina.Pipeline:getFirst() #342
        +---[7.239154ms] org.apache.catalina.Valve:invoke() #342
        +---[0.006904ms] org.apache.catalina.connector.Request:isAsync() #345
        +---[0.00509ms] org.apache.catalina.connector.Request:finishRequest() #372
        +---[0.051461ms] org.apache.catalina.connector.Response:finishResponse() #373
        +---[0.007244ms] java.util.concurrent.atomic.AtomicBoolean:<init>() #379
        +---[0.007314ms] org.apache.coyote.Response:action() #380
        +---[0.004518ms] org.apache.catalina.connector.Request:isAsyncCompleting() #382
        +---[0.001072ms] org.apache.catalina.connector.Request:getContext() #394
        +---[0.007166ms] java.lang.System:currentTimeMillis() #401
        +---[0.004367ms] org.apache.coyote.Request:getStartTime() #401
        +---[0.011483ms] org.apache.catalina.Context:logAccess() #401
        +---[0.0014ms] org.apache.coyote.Request:getRequestProcessor() #406
        +---[min=8.0E-4ms,max=9.22E-4ms,total=0.001722ms,count=2] java.lang.Integer:<init>() #406
        +---[0.001082ms] java.lang.reflect.Method:invoke() #406
        +---[0.001851ms] org.apache.coyote.RequestInfo:setWorkerThreadName() #406
        +---[0.035805ms] org.apache.catalina.connector.Request:recycle() #410
        `---[0.007849ms] org.apache.catalina.connector.Response:recycle() #411

After a boring manual drill into trace

[arthas@24851]$ trace org.apache.catalina.webresources.AbstractArchiveResourceSet getArchiveEntries
Press Q or Ctrl+C to abort.
Affect(class-cnt:4 , method-cnt:2) cost in 150 ms.
`---ts=2019-09-14 21:36:26;thread_name=http-nio-7744-exec-3;id=12;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@20ad9418
    `---[75.743681ms] org.apache.catalina.webresources.JarWarResourceSet:getArchiveEntries()
        +---[0.025731ms] java.util.HashMap:<init>() #106
        +---[0.097729ms] org.apache.catalina.webresources.JarWarResourceSet:openJarFile() #109
        +---[0.091037ms] java.util.jar.JarFile:getJarEntry() #110
        +---[0.096325ms] java.util.jar.JarFile:getInputStream() #111
        +---[0.451916ms] org.apache.catalina.webresources.TomcatJarInputStream:<init>() #113
        +---[min=0.001175ms,max=0.001176ms,total=0.002351ms,count=2] java.lang.Integer:<init>() #114
        +---[0.00104ms] java.lang.reflect.Method:invoke() #114
        +---[0.045105ms] org.apache.catalina.webresources.TomcatJarInputStream:getNextJarEntry() #114
        +---[min=5.02E-4ms,max=0.008531ms,total=0.028864ms,count=31] java.util.jar.JarEntry:getName() #116
        +---[min=5.39E-4ms,max=0.022805ms,total=0.054647ms,count=31] java.util.HashMap:put() #116
        +---[min=0.004452ms,max=34.479307ms,total=74.206249ms,count=31] org.apache.catalina.webresources.TomcatJarInputStream:getNextJarEntry() #117
        +---[0.018358ms] org.apache.catalina.webresources.TomcatJarInputStream:getManifest() #119
        +---[0.006429ms] org.apache.catalina.webresources.JarWarResourceSet:setManifest() #120
        +---[0.010904ms] org.apache.tomcat.util.compat.JreCompat:isJre9Available() #121
        +---[0.003307ms] org.apache.catalina.webresources.TomcatJarInputStream:getMetaInfEntry() #133
        +---[5.5E-4ms] java.util.jar.JarEntry:getName() #135
        +---[6.42E-4ms] java.util.HashMap:put() #135
        +---[0.001981ms] org.apache.catalina.webresources.TomcatJarInputStream:getManifestEntry() #137
        +---[0.064484ms] org.apache.catalina.webresources.TomcatJarInputStream:close() #141
        +---[0.007961ms] org.apache.catalina.webresources.JarWarResourceSet:closeJarFile() #151
        `---[0.004643ms] java.io.InputStream:close() #155

A point worthy of pause is found:

 +---[min=0.004452ms,max=34.479307ms,total=74.206249ms,count=31] org.apache.catalina.webresources.TomcatJarInputStream:getNextJarEntry() #117

This line of code has been loaded 31 times, taking a total of 74ms; from the name point of view, it should be the time taken by tomcat to load the jar package, so it is the time taken by tomcat to load 31 jar packages, or 31 times to load some resources in the jar package?

The notes of TomcatJarInputStream class source code write:

The purpose of this sub-class is to obtain references to the JarEntry objects
for META-INF/ and META-INF/MANIFEST.MF that are otherwise swallowed by the
JarInputStream implementation.
This is a subclass. More functions are in the parent class jar inputstream.

In fact, you can probably guess the problem here. The time-consuming caused by tomcat loading META-INF / and META-INF/MANIFEST resources in the jar package. As for why continuous requests are not time-consuming, it should be tomcat's caching mechanism (the source code analysis is introduced below)

Don't worry about locating the problem. Try to locate the details of the problem through Arthas and continue to manually drill into trace

[arthas@24851]$ trace org.apache.catalina.webresources.TomcatJarInputStream *
Press Q or Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:4) cost in 44 ms.
`---ts=2019-09-14 21:37:47;thread_name=http-nio-7744-exec-5;id=14;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@20ad9418
    `---[0.234952ms] org.apache.catalina.webresources.TomcatJarInputStream:createZipEntry()
        +---[0.039455ms] java.util.jar.JarInputStream:createZipEntry() #43
        `---[0.007827ms] java.lang.String:equals() #44

`---ts=2019-09-14 21:37:47;thread_name=http-nio-7744-exec-5;id=14;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@20ad9418
    `---[0.050222ms] org.apache.catalina.webresources.TomcatJarInputStream:createZipEntry()
        +---[0.001889ms] java.util.jar.JarInputStream:createZipEntry() #43
        `---[0.001643ms] java.lang.String:equals() #46
#There are 31 trace logs in total, and the rest are deleted

In terms of method name, it means loading resources. The jdk source code has been reached. Now let's look at the source code of TomcatJarInputStream class:

/**
 * Creates a new <code>JarEntry</code> (<code>ZipEntry</code>) for the
 * specified JAR file entry name. The manifest attributes of
 * the specified JAR file entry name will be copied to the new
 * <CODE>JarEntry</CODE>.
 *
 * @param name the name of the JAR/ZIP file entry
 * @return the <code>JarEntry</code> object just created
 */
protected ZipEntry createZipEntry(String name) {
    JarEntry e = new JarEntry(name);
    if (man != null) {
        e.attr = man.getAttributes(name);
    }
    return e;
}

The createZipEntry has a name parameter. From the annotation point of view, it is a jar/zip file name. If you can get the key information such as the file name, you can directly locate the problem. Or through Arthas, use the watch command to dynamically monitor the method call data

watch method performs data observation
It is convenient for you to observe the call of the specified method. The scope that can be observed is: return value, throw exception, enter parameter, and view the corresponding variable by writing OGNL expression.

watch the input parameter of this method

[arthas@24851]$ watch  org.apache.catalina.webresources.TomcatJarInputStream createZipEntry "{params[0]}"
Press Q or Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 27 ms.
ts=2019-09-14 21:51:14; [cost=0.14547ms] result=@ArrayList[
    @String[META-INF/],
]
ts=2019-09-14 21:51:14; [cost=0.048028ms] result=@ArrayList[
    @String[META-INF/MANIFEST.MF],
]
ts=2019-09-14 21:51:14; [cost=0.046071ms] result=@ArrayList[
    @String[META-INF/resources/],
]
ts=2019-09-14 21:51:14; [cost=0.033855ms] result=@ArrayList[
    @String[META-INF/resources/swagger-ui.html],
]
ts=2019-09-14 21:51:14; [cost=0.039138ms] result=@ArrayList[
    @String[META-INF/resources/webjars/],
]
ts=2019-09-14 21:51:14; [cost=0.033701ms] result=@ArrayList[
    @String[META-INF/resources/webjars/springfox-swagger-ui/],
]
ts=2019-09-14 21:51:14; [cost=0.033644ms] result=@ArrayList[
    @String[META-INF/resources/webjars/springfox-swagger-ui/favicon-16x16.png],
]
ts=2019-09-14 21:51:14; [cost=0.033976ms] result=@ArrayList[
    @String[META-INF/resources/webjars/springfox-swagger-ui/springfox.css],
]
ts=2019-09-14 21:51:14; [cost=0.032818ms] result=@ArrayList[
    @String[META-INF/resources/webjars/springfox-swagger-ui/swagger-ui-standalone-preset.js.map],
]
ts=2019-09-14 21:51:14; [cost=0.04651ms] result=@ArrayList[
    @String[META-INF/resources/webjars/springfox-swagger-ui/swagger-ui.css],
]
ts=2019-09-14 21:51:14; [cost=0.034793ms] result=@ArrayList[
    @String[META-INF/resources/webjars/springfox-swagger-ui/swagger-ui.js.map],
@String[META-INF/resources/webjars/springfox-swagger-ui/swagger-ui.js.map],

We can directly see the specific loaded resource name, so familiar name: swagger UI, a rest interface document tool from abroad, and a set of spring mvc integration tool based on swagger UI developed by domestic developers. Through annotation, we can automatically generate the interface definition json file needed by swagger UI, which is more convenient to use, but has strong invasion.

After deleting the jar package of swagger, the weird 70+ms disappears

<! -- delete these two references in POM. These two packages are encapsulated by domestic developers. swagger UI does not provide Java spring MVC support package. swagger is just a browser ui+editor

<!--pom Delete these two references. These two packages are packaged by domestic developers, swagger-ui Not provided java spring-mvc Support package for, swagger It's just a browser side ui+editor
<dependency>
    <groupId>io.springfox</groupId>
    <artifactId>springfox-swagger2</artifactId>
    <version>2.9.2</version>
</dependency>
<dependency>
    <groupId>io.springfox</groupId>
    <artifactId>springfox-swagger-ui</artifactId>
    <version>2.9.2</version>
</dependency>

So why does swagger lead to time-consuming requests? Why does every request even read load static resources inside swagger?

In fact, this is a bug of Tomcat embed. Let's introduce this bug in detail

Tomcat embedded bug analysis & solution
The source code analysis process is too long, and it is not the focus of this article, so I will not introduce it. Next, I will introduce the analysis results directly

By the way, post a core class diagram of tomcat processing request

Why static resources in Jar package are loaded every time
The key lies in org.apache.catalina.mapper.mapper ා internalmapwrapper. There is a problem with the way requests are processed in this version, resulting in checking static resources every time.

Why continuous requests don't cause problems
Because Tomcat has a cache for this static resource resolution, it takes priority to search from the cache, and then re resolve after the cache expires. Please refer to org.apache.catalina.webresources.Cache for details. The default expiration time is 5000ms.

Why can't it be reproduced locally
In fact, it can't be reproduced after the plug-in is packaged through spring boot. Because of the different startup methods, tomcat uses different classes to handle static resources, so it's OK

How to solve
Upgrade Tomcat embed version

The current version of the Bug is:

Spring boot: 2.0.2.release, built-in tomcat embed version 8.5.31

Upgrade tomcat embed to 8.5.40 +, which has been fixed

By replacing the springboot pom properties mode
If the project is maven, it is inherited from spring boot, that is, parent is configured as springboot, or dependency management is used to import the spring boot package

<parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.0.2.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>

You can directly override properties in pom:

<properties>
    <tomcat.version>8.5.40</tomcat.version>
</properties>

Upgrade spring boot version
The Tomcat embedded version in springboot 2.1.0.RELEASE is greater than 8.5.31, so you can directly upgrade springboot to this version or above to solve this problem

Tags: Java Apache Tomcat Spring

Posted on Tue, 14 Jan 2020 03:37:46 -0800 by Afser