It is a small instance, so the load is not unexpected, but it can be reduced with configuration and software changes. Upgrading to a larger EC2 instance type would be the last resort. One of the culprits, I believe, is the Quartz scheduler. If we fix the issue on the staging instance, we can avoid it on the production instances.
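As an example of the kind of configuration change meant here: if Quartz does turn out to be the culprit, its worker pool can be tuned down without touching code. This is a minimal sketch, assuming the application reads a standard quartz.properties file; the values are illustrative, not measured on this instance:
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 3
org.quartz.threadPool.threadPriority = 4
org.quartz.jobStore.misfireThreshold = 60000
Fewer, lower-priority worker threads make Quartz compete less aggressively for a single CPU, and a more tolerant misfire threshold avoids bursts of catch-up firings.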
Check CPU load:
> uptime
21:53:23 up 24 days, 7:26, 2 users, load average: 5.69, 3.62, 4.07
Load averages of 5.69, 3.62, and 4.07 for the past 1, 5, and 15 minutes are pretty high on a single-CPU machine. Here is an article on understanding CPU load: http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages
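Since a load average is relative to the number of CPU cores, it is worth confirming the core count first. A quick check with standard Linux tools (on this single-CPU instance, both should report 1):
> nproc
> grep -c ^processor /proc/cpuinfo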
Check CPU load by process:
> ps -e --format pid,pcpu,comm --sort=-pcpu | head
10686 60.3 java
8825 21.3 mysqld
15356 0.9 php
15302 0.0 top
76 0.0 kswapd0
935 0.0 memcached
926 0.0 snmpd
12170 0.0 bash
904 0.0 rsyslogd
Unfortunately, this does not say much: the Java process is the Tomcat process, which is running a bunch of threads, including threads for the Quartz scheduler. We need to dive down to the Tomcat thread level, and also see what MySQL is doing; a sketch of both checks follows the note below.
Note: The CPU load went down after certain changes were performed a few days after this post was written. I will not continue with Part II of this post at this time.
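A sketch of both checks, assuming the JDK tools are installed and using the Java PID from the ps output above; the thread ID is a placeholder to fill in from top's output. First, list the busiest threads inside the Tomcat JVM:
> top -H -p 10686
Then convert the hottest thread's ID to hex and find it in a thread dump (jstack ships with the JDK; thread dumps report the native thread ID as nid in hex):
> printf '%x\n' <thread-id>
> jstack 10686 | grep -A 20 'nid=0x<hex-thread-id>'
And check what MySQL is working on:
> mysql -u root -p -e 'SHOW FULL PROCESSLIST;'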