JVM in-depth understanding - memory tuning and GC logging

CPU Tall Analysis

Generally available

  • ps -Lfp pid
  • ps -mp pid -o THREAD, tid, time
  • top -Hp pid
[root@redis webapps]# top -Hp 22272
top - 10:09:30 up 9 days, 22:10,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  30 total,   0 running,  30 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3923196k total,  3795588k used,   127608k free,   153056k buffers
Swap:  6160376k total,        0k used,  6160376k free,  3079244k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                
22272 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.00 java                                                                                                                                                    
22278 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.00 java                                                                                    
22279 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.00 java                                                                                   
22282 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.00 java                                                                                    
22283 root      20   0 2286m 122m  11m S  0.0  3.2   3:01.48 java                                                                                    
22287 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.00 java                                                                                    
22288 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.00 java                                                                                    
22289 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.00 java                                                                                    
22290 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.03 java                                                                                    
22291 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.34 java                                                                                    
22292 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.06 java                                                                            
22299 root      20   0 2286m 122m  11m S  0.0  3.2   0:06.37 java                                                                                    
22301 root      20   0 2286m 122m  11m S  0.0  3.2   1:09.73 java                                                                                    
17034 root      20   0 2286m 122m  11m S  0.0  3.2   0:00.00 java

The TIME column is the CPU time consumed by each Java thread, with the longest CPU time being 22233 threads with thread ID.

With printf'%xn'22283

[root@redis webapps]# printf '%x\n' 22283
570b The hexadecimal value of 22283 is 570. b. 

The next step is jstack, which outputs stack information for process 22272, then grep based on the hexadecimal value of the thread ID, as follows:

[root@redis webapps]# jstack 22272 | grep 570b
"SchedulerThread" prio=10 tid=0x00007f950043e000 nid=0x54ee in Object.wait()

You can see that CPU consumption is in the SchdulerThread class Object.wait(), navigate to the following code:

// Idle wait
synchronized(sigLock) {
  try {
    if(!halted.get()) {
      sigLock.wait(timeUntilContinue);
    }
  } 
  catch (InterruptedException ignore) {
  }
}

It is the idle wait code for the polling task, which isSigLock.wait(timeUntilContinue) corresponds to the previous Object.wait().

Summary: You can find the corresponding thread through the PID, and then through the JVM jstack to find the corresponding thread information in the stack.By finding the corresponding code, you can generally analyze the reason for the high CPU usage.

Analysis with JVM commands

jstat -gcutil
[root@xxxx-nccz8-b57dd64fc-nt9dj logs]# jstat -gcutil 1 2000 20
S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT   
  0.00   8.64   2.62  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   2.81  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   2.81  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   2.81  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   3.06  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   3.10  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   3.10  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   3.21  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   3.22  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   3.61  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   6.47  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   6.76  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   6.81  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   7.07  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   7.08  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   7.38  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   7.38  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   7.62  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   7.62  27.32  80.59  73.86   1350   22.705     5    1.449   24.154
  0.00   8.64   7.76  27.32  80.59  73.86   1350   22.705     5    1.449   24.154

Percentage of space used in Survivor space 0 area on S0:Heap S1: Percentage of space used in Survivor space 1 area on Heap Percentage of space used in the Eden space area on E:Heap Percentage of used space in Old space area on O:Heap M:Metaspace area percentage of used space YGC: Number of Young GC occurrences from application startup to sampling YGCT: Time taken (in seconds) from application startup to Young GC when sampling FGC: Number of Full GC occurrences from application startup to sampling FGCT: Time taken (in seconds) from application startup to Full GC sampling GCT: Total time from application startup to garbage collection at sampling time (in seconds)

jmap -heap
[root@xxxx-nccz8-b57dd64fc-nt9dj startup]# jmap -heap 1
Attaching to process ID 1, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.191-b12

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 2147483648 (2048.0MB)
   NewSize                  = 805306368 (768.0MB)
   MaxNewSize               = 805306368 (768.0MB)
   OldSize                  = 1342177280 (1280.0MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 268435456 (256.0MB)
   CompressedClassSpaceSize = 260046848 (248.0MB)
   MaxMetaspaceSize         = 268435456 (256.0MB)
   G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 724828160 (691.25MB)
   used     = 284988360 (271.7860794067383MB)
   free     = 439839800 (419.4639205932617MB)
   39.318058503687276% used
Eden Space:
   capacity = 644349952 (614.5MB)
   used     = 275398000 (262.63999938964844MB)
   free     = 368951952 (351.86000061035156MB)
   42.74043928228616% used
From Space:
   capacity = 80478208 (76.75MB)
   used     = 9590360 (9.146080017089844MB)
   free     = 70887848 (67.60391998291016MB)
   11.916716634644748% used
To Space:
   capacity = 80478208 (76.75MB)
   used     = 0 (0.0MB)
   free     = 80478208 (76.75MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 1342177280 (1280.0MB)

The heap command allows you to see one usage of the current entire heap, a real usage ratio of user to free.

jmap -dump
jmap -dump:live,format=b,file=xxxxxx.20200707.hprof 1

We usually configure it in Dockerfile to keep the scene if OOM happens.-XX:HeapDumpPath=/alidata1/admin/xxxx/logs

Using JProfiler analysis

Large objects were found here because we used Jeager link tracking, but too many threads resulted in ThreadLocal not being released.

Find out that char[] takes up a lot of space, and where did it come from?

Find specific instances through Outgoing references.

What are outgoing references and incoming references?Let's take an example to learn more about Incoming references and Outgoing references.For example, the source code for an application is as follows:

public class A {     
  private C c1 = C.getInstance();
}

public class B {     
	private C c2 = C.getInstance();
}

public class C {     
	private static C myC = new C();     

	public static C getInstance() {             
	return myC;     
	}   

	private D d1 = new D();     
	private E e1 = new E();
}

public class D {
}

public class E {
}

public class SimpleExample {     
	public static void main (String argsp[]) throws Exception {
           A a = new A();
           B b = new B();     
    }
}
  • Object A and Object B hold references to object C
  • Object C holds references to object D and E

In this example project, let's specifically analyze Incoming references and Outgoing references for object C.

Incoming References for object C

All objects that have references to object C are called Incoming references.In this example, the "Incoming references" of object C are class objects of object A, object B, and C.

Outgoing References for object C

All objects referenced by object C are called Outgoing References.In this example, the "outgoing references" of object C are class objects of object D, object E, and object C.

Then click one layer at a time from the Show in graph menu.Until you see the class you are most familiar with.

How to view GC logs

Set gc log configuration

-XX:+PrintGC Output Brief GC Log 
-XX:+PrintGCDetails Output Detailed GC Log 
-Xloggc:gc.log Output GC log to file
 -XX:+PrintGCTimeStamps Output GC Timestamp (in the form of total time stamp of JVM startup to current period) 
-XX:+PrintGCDateStamps Output GC Timestamp (in the form of date, such as 2013-05-04T21:53:59.234+0800) 
-XX:+PrintHeapAtGC Prints out heap information before and after GC
-verbose:gc
 -XX:+PrintReferenceGC Prints the number and length of references for young generations
-XX:+PrintGC

If only -XX:+PrintGC is set, the printed logs are as follows:

[GC (Allocation Failure)  61805K->9849K(256000K), 0.0041139 secs]

1. GC means a YGC (Young GC)
2. Allocation Failure indicates failure type
 3. 61805K->9849K means that the younger generation decreased from 61805K to 9849K
 4. 256000K represents the size of the entire heap
 5, 0.0041139 secs represents the total time taken for this GC
 In JDK 8, -Verbose:gcIs -XX:+PrintGC an alias, the log format is equivalent to: -XX:+PrintGC,.
-XX:+PrintGCDetails
[GC (Allocation Failure) [PSYoungGen: 53248K->2176K(59392K)] 58161K->7161K(256000K), 0.0039189 secs] [Times: user=0.02 sys=0.01, real=0.00 secs]
1. GC means a YGC (Young GC)
2. Allocation Failure indicates failure type
 3. PSYoungGen indicates the size of the younger generation
 4. 53248K->2176K indicates that the younger generation occupancy decreased from 53248K to 2176K
 5. 59392K indicates the size of the younger band
 6. 58161K->7161K means the total heap occupancy has decreased from 53248K to 2176K
 7, 256000K represents the size of the entire heap
 8, 0.0039189 secs represents the total time taken for this GC
 9, [Times: user=0.02 sys=0.01, real=0.00 secs] respectively indicates that the user state takes up a long time, the kernel time consumes, and the real time consumes.
Time is rounded by two decimal places.
-XX:+PrintGCTimeStamps
1.963: [GC (Allocation Failure)  61805K->9849K(256000K), 0.0041139 secs]

If -XX:+PrintGCTimeStamps is added, the log only has one more time stamp than the first one introduced in 1.1: 1.963, which means it took 1.963 seconds from JVM startup to GC printing.

-XX:+PrintGCDateStamps
2019-03-05T16:56:15.108+0800: [GC (Allocation Failure)  61805K->9849K(256000K), 0.0041139 secs]

If -XX:+PrintGCDateStamps is added, the log is only one more date time than the first one introduced in 1.1: 2019-03-05T16:56:15.108+0800, which means the time at which the GC is printed is 2019-03-05T16:56:15.108+0800.+0800 means East 8.

Detailed analysis of CMS GC logs
[GC (CMS Initial Mark) [1 CMS-initial-mark: 19498K(32768K)] 36184K(62272K), 0.0018083 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
[CMS-concurrent-mark-start]
[CMS-concurrent-mark: 0.011/0.011 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
[CMS-concurrent-preclean-start]
[CMS-concurrent-preclean: 0.001/0.001 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] 
[CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time [CMS-concurrent-abortable-preclean: 0.558/5.093 secs] [Times: user=0.57 sys=0.00, real=5.09 secs] 
[GC (CMS Final Remark) [YG occupancy: 16817 K (29504 K)][Rescan (parallel) , 0.0021918 secs][weak refs processing, 0.0000245 secs][class unloading, 0.0044098 secs][scrub symbol table, 0.0029752 secs][scrub string table, 0.0006820 secs][1 CMS-remark: 19498K(32768K)] 36316K(62272K), 0.0104997 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] 
[CMS-concurrent-sweep-start]
[CMS-concurrent-sweep: 0.007/0.007 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
[CMS-concurrent-reset-start]
[CMS-concurrent-reset: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 

The CMS log is divided into two STW(stop the world)

The two stages are init remark (1) and remark (7).Usually it takes about 10 times longer than YGC (personal experience).

(1),[GC (CMS Initial Mark) [1 CMS-initial-mark: 19498K(32768K)] 36184K(62272K), 0.0018083 secs][Times: user=0.01 sys=0.00, real=0.01 secs]

Will STW(Stop The World), which is 32768K in old age capacity and will start initializing the marker when it is used to 498K.Short time consuming.

(2),[CMS-concurrent-mark-start]

Start of concurrent markup phase

(3),[CMS-concurrent-mark: 0.011/0.011 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]

Concurrent markup phase takes time.

(4),[CMS-concurrent-preclean-start]

The concurrent pre-cleanup phase is also executed concurrently with the user thread.The virtual machine looks for objects that are new to the older generation during the concurrent markup phase (there may be some objects from Cenozoic era Promote to older generations, or some objects are assigned to older generations.Reduce the next stage of "Remarking" work by rescanning, as the next stage stops The World.

(5),[CMS-concurrent-preclean: 0.001/0.001 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]

Concurrent pre-cleanup phase takes time.

(6),[CMS-concurrent-abortable-preclean-start] CMS: abort preclean due to time [CMS-concurrent-abortable-preclean: 0.558/5.093 secs][Times: user=0.57 sys=0.00, real=5.09 secs]

Concurrency aborts the pre-cleanup phase and runs between parallel pre-cleanup and re-marking until the desired eden space occupancy is achieved.This stage was added to avoid a garbage clean-up immediately after the relabeling stage

(7),[GC (CMS Final Remark) [YG occupancy: 16817 K (29504 K)][Rescan (parallel) , 0.0021918 secs][weak refs processing, 0.0000245 secs][class unloading, 0.0044098 secs][scrub symbol table, 0.0029752 secs][scrub string table, 0.0006820 secs][1 CMS-remark: 19498K(32768K)] 36316K(62272K), 0.0104997 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]

Will STW(Stop The World), the collection phase, which marks all living objects of older generations, including those that have changed or newly created during the concurrent markup phase

(8),[CMS-concurrent-sweep-start]

The concurrent cleanup phase begins, executed concurrently with the user thread.

(9),[CMS-concurrent-sweep: 0.007/0.007 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]

The time taken for the concurrent cleanup phase to end.

(10),[CMS-concurrent-reset-start]

Start concurrent reset.At this stage, the data structures associated with CMS are reinitialized so that the next cycle can proceed normally.

(11),[CMS-concurrent-reset: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

End of concurrent reset, time taken.

Reference Address

If you like my article, you can focus on your personal subscription number.Welcome to leave a message and exchange at any time.If you want to join the WeChat group and discuss it together, add lastpass4u, the administrator's culture assistant, who will pull you into the group.

Tags: Programming Java jvm Redis JDK

Posted on Sun, 24 May 2020 17:24:29 -0700 by hubbardude