Scripts for quickly locating the cause when a Java process drives CPU usage or load high

Many of you have probably run into this with a Java server in production: everything is normal at release, but after a while CPU usage or load becomes high. In the better case, load or CPU climbs steadily day by day. In the worse case it spikes at random and then returns to normal, which causes a lot of trouble for colleagues in operations and development. Of course, once the problem has appeared there is also the question of how to prevent it in the future, for example: code review before going online, isolation/degradation of strongly depended-on services, unit testing, regression testing, auditing SQL before it goes online, basic and business monitoring, and the related process rules.

If CPU usage or load soars and stays high for a long time, there are plenty of troubleshooting steps documented on the Internet.

Method 1

1. Use top to find the PID of the process with high CPU usage

top

2. List the threads of that process, sorted by CPU usage

ps -mp PID -o THREAD,tid,time | sort -rn

3. Convert the ID of the busy thread to hexadecimal

printf "%x\n" TID

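4. Print the thread's stack information

jstack PID | grep -A 30 TID_HEX

Here TID_HEX is the hexadecimal thread ID produced in step 3. jstack shows it as nid=0x... in each thread's header line, and -A 30 also prints the 30 lines after the match so that the stack itself is visible.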
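
Steps 3 and 4 can also be sketched in Python. This is only an illustration under stated assumptions: the function names and the shortened SAMPLE_JSTACK text are hypothetical, and the code works on jstack output that has already been captured as a string rather than calling jstack itself.

```python
import re


def tid_to_nid(tid):
    """Convert a decimal thread ID (as shown by ps/top) to the hex form of jstack's nid field."""
    return "0x%x" % int(tid)


def extract_thread_stack(jstack_output, tid):
    """Return the lines from the thread's header (matched by nid) up to the next blank line."""
    # \b stops 0x1a2b from also matching a longer ID such as 0x1a2bc
    header = re.compile(r"nid=" + re.escape(tid_to_nid(tid)) + r"\b")
    block = []
    for line in jstack_output.splitlines():
        if block:
            if not line.strip():
                break  # a blank line ends the thread's block
            block.append(line)
        elif header.search(line):
            block.append(line)
    return "\n".join(block)


# Hypothetical, shortened jstack output used only for illustration.
SAMPLE_JSTACK = '''"main" #1 prio=5 os_prio=0 tid=0x00007f1 nid=0x1a2b runnable [0x00007f2]
   java.lang.Thread.State: RUNNABLE
        at com.example.Busy.loop(Busy.java:10)

"GC task thread#0" os_prio=0 tid=0x00007f3 nid=0x1a2c runnable
'''

if __name__ == "__main__":
    # 6699 is the decimal form of 0x1a2b
    print(extract_thread_stack(SAMPLE_JSTACK, 6699))
```

Matching on the whole nid field rather than on the bare hex digits avoids accidental hits on other hexadecimal values in the dump, such as the tid= addresses.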


Method 2 (Recommended)

A script that quickly finds the threads using the most CPU and prints their stacks

#!/bin/bash
# @Function
# Find the Java threads that consume the most CPU and print their stack traces.
#
# @Usage
#   $ ./javacpu -h
#
PROG=`basename $0`
usage(){
    cat <<EOF
Usage: ${PROG} [OPTION] ...
Find the Java threads that consume the most CPU and print their stacks.
Example: ${PROG} -c 10
Options:
    -p, --pid     find the busiest threads in the specified java process,
                  default is all java processes.
    -c, --count   set the number of threads to show, default is 10
    -h, --help    display this help and exit
EOF
    exit $1
}
ARGS=`getopt -n "$PROG" -a -o c:p:h -l count:,pid:,help -- "$@"`
[ $? -ne 0 ] && usage 1
eval set -- "${ARGS}"
while true; do
    case "$1" in
    -c|--count)
        count="$2"
        shift 2
        ;;
    -p|--pid)
        pid="$2"
        shift 2
        ;;
    -h|--help)
        usage
        ;;
    --)
        shift
        break
        ;;
    esac
done
count=${count:-10}
redEcho(){
    [ -c /dev/stdout ] && {
        # if stdout is a console, turn on color output.
        echo -ne "\033[1;31m"
        echo -n "$@"
        echo -e "\033[0m"
    } || echo "$@"
}

## check that the jstack command is available
if ! which jstack &> /dev/null; then
    [ -n "$JAVA_HOME" ] && [ -f "$JAVA_HOME/bin/jstack" ] && [ -x "$JAVA_HOME/bin/jstack" ] && {
        export PATH="$JAVA_HOME/bin:$PATH"
    } || {
        redEcho "Error: jstack not found on PATH or in JAVA_HOME!"
        exit 1
    }
fi
	
uuid=`date +%s`_${RANDOM}_$$

cleanupWhenExit(){
    rm /tmp/${uuid}_* &> /dev/null
}
trap "cleanupWhenExit" EXIT
	
printStackOfThread(){
    while read threadLine; do
        pid=`echo ${threadLine} | awk '{print $1}'`
        threadId=`echo ${threadLine} | awk '{print $2}'`
        threadId0x=`printf %x ${threadId}`
        user=`echo ${threadLine} | awk '{print $3}'`
        pcpu=`echo ${threadLine} | awk '{print $5}'`
        jstackFile=/tmp/${uuid}_${pid}
        # run jstack only once per process and reuse the dump for its other threads
        [ ! -f "${jstackFile}" ] && {
            jstack ${pid} > ${jstackFile} || {
                redEcho "Fail to jstack java process ${pid}!"
                rm ${jstackFile}
                continue
            }
        }

        redEcho "The stack of busy(${pcpu}%) thread(${threadId}/0x${threadId0x}) of java process(${pid}) of user(${user}):"
        # print from the thread's header line (matched by its hex nid) to the next blank line;
        # the trailing space keeps 0x1a from also matching 0x1ab
        sed "/nid=0x${threadId0x} /,/^$/p" -n ${jstackFile}
    done
}
	
# without -p, look at the threads of all java processes; otherwise only at the given PID
[ -z "${pid}" ] && {
    ps -Leo pid,lwp,user,comm,pcpu --no-headers | awk '$4=="java"{print $0}' | sort -k5 -r -n | head --lines "${count}" | printStackOfThread
} || {
    ps -Leo pid,lwp,user,comm,pcpu --no-headers | awk -v "pid=${pid}" '$1==pid && $4=="java"{print $0}' | sort -k5 -r -n | head --lines "${count}" | printStackOfThread
}
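
The selection step at the end of the script (keep only java threads, sort by CPU percentage, take the top few) can be sketched in Python as follows. This is a minimal illustration, not the script's own implementation: top_java_threads and SAMPLE_PS are hypothetical names, and the input is assumed to be text in the five-column layout produced by ps -Leo pid,lwp,user,comm,pcpu --no-headers.

```python
def top_java_threads(ps_output, count=10, pid=None):
    """Return up to `count` (pcpu, pid, lwp, user) tuples for java threads, busiest first."""
    rows = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        row_pid, lwp, user, comm, pcpu = fields
        if comm != "java":
            continue
        # both conditions must hold for a line to be kept when a PID is given
        if pid is not None and row_pid != str(pid):
            continue
        rows.append((float(pcpu), int(row_pid), int(lwp), user))
    rows.sort(key=lambda r: r[0], reverse=True)
    return rows[:count]


# Hypothetical ps output used only for illustration.
SAMPLE_PS = """\
 1200  1200 app      java             0.0
 1200  1234 app      java            93.5
 1200  1235 app      java            12.1
 3400  3400 root     sshd             0.3
 5600  5601 app      java            40.0
"""

if __name__ == "__main__":
    for pcpu, p, lwp, user in top_java_threads(SAMPLE_PS, count=2):
        print("%5.1f%%  pid=%d  tid=%d (0x%x)  user=%s" % (pcpu, p, lwp, lwp, user))
```

The lwp column is the thread ID that is later converted to hexadecimal and looked up as nid=0x... in the jstack dump.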
	


Method 3 (for random load spikes on Java servers)

#!/usr/bin/env python
import datetime
import os
import threading
import time

# desc: when the 1-minute system load average is >= LOAD_THRESHOLD, dump the
# jstack output and the per-thread CPU usage of every running Java process.
LOAD_THRESHOLD = 11
CHECK_INTERVAL = 5  # seconds between two checks

# Monitoring window: from 00:00 today until 23:59 two days from now.
# It is computed once at start-up so that the script stops when the window ends.
_today = datetime.date.today()
START_TIME = datetime.datetime.combine(_today, datetime.time(0, 0))
END_TIME = datetime.datetime.combine(_today + datetime.timedelta(days=2), datetime.time(23, 59))


def load_stat():
    with open("/proc/loadavg") as f:
        info = f.read().split()
    loadavg = {'lavg_1': info[0], 'lavg_5': info[1], 'lavg_15': info[2]}

    if not (START_TIME <= datetime.datetime.now() <= END_TIME):
        return  # outside the window: do not reschedule, so the script exits

    if float(loadavg['lavg_1']) >= LOAD_THRESHOLD:
        # jps may list several JVMs, one PID per line, so handle each one separately
        pids = os.popen("jps | grep -v Jps | awk '{print $1}'").read().split()
        tm = time.strftime("%Y-%m-%d_%H-%M-%S", time.localtime())
        for pid in pids:
            stack = os.popen("jstack " + pid).read()
            with open('java_stack_' + pid + '_' + tm + '.txt', 'w') as log_f:
                log_f.write(stack)
            top_tid_info = os.popen("ps -mp " + pid + " -o THREAD,tid,time | sort -rn").read()
            with open('tid_cpu_' + pid + '_' + tm + '.txt', 'w') as log_f2:
                log_f2.write(top_tid_info)

    # check again after CHECK_INTERVAL seconds
    threading.Timer(CHECK_INTERVAL, load_stat).start()


load_stat()
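
The two decisions in this script, whether the load is high enough to dump and which PIDs to dump, can be separated into small functions that are easy to check without a loaded machine. The sketch below is an assumption-labelled illustration: the function names and sample strings are hypothetical, and the threshold of 11 is taken from the script above.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into the 1-, 5- and 15-minute averages."""
    one, five, fifteen = (float(x) for x in text.split()[:3])
    return {"lavg_1": one, "lavg_5": five, "lavg_15": fifteen}


def should_dump(loadavg_text, threshold=11.0):
    """True when the 1-minute load average has reached the threshold."""
    return parse_loadavg(loadavg_text)["lavg_1"] >= threshold


def java_pids_from_jps(jps_output):
    """Return the PIDs listed by jps, skipping the jps process itself."""
    pids = []
    for line in jps_output.splitlines():
        parts = line.split()
        if not parts or (len(parts) > 1 and parts[1] == "Jps"):
            continue
        pids.append(parts[0])
    return pids


if __name__ == "__main__":
    # Hypothetical sample values used only for illustration.
    print(should_dump("12.34 8.01 5.20 3/512 12345"))
    print(java_pids_from_jps("6789 Bootstrap\n12345 Jps\n4321 app.jar\n"))
```

Returning a list of PIDs matters when more than one JVM runs on the host, because jstack has to be called once per process.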



Tags: Operation & Maintenance Java SQL Python

Posted on Thu, 10 Oct 2019 07:16:25 -0700 by sincspecv