It is a small instance, so the load is not unexpected, but it can be reduced with configuration and software changes. Upgrading to a larger EC2 instance type would be the last resort. One of the culprits, I believe, is the Quartz scheduler. If we fix the issue on the staging instance, we can avoid it on the production instances.
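As an example of the kind of configuration change meant here: if Quartz does turn out to be the culprit, its worker pool can be tuned down without touching code. This is a minimal sketch, assuming the application reads a standard quartz.properties file; the values are illustrative, not measured on this instance:
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 3
org.quartz.threadPool.threadPriority = 4
org.quartz.jobStore.misfireThreshold = 60000
Fewer, lower-priority worker threads make Quartz compete less aggressively for a single CPU, and a more tolerant misfire threshold avoids bursts of catch-up firings.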
Check CPU load:
> uptime
21:53:23 up 24 days, 7:26, 2 users, load average: 5.69, 3.62, 4.07
Load averages of 5.69, 3.62, and 4.07 for the past 1, 5, and 15 minutes are pretty high on a single-CPU machine. Here is an article on understanding CPU load: http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages
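Since a load average is relative to the number of CPU cores, it is worth confirming the core count first. A quick check with standard Linux tools (on this single-CPU instance, both should report 1):
> nproc
> grep -c ^processor /proc/cpuinfo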
Check CPU load by process:
> ps -e --format pid,pcpu,comm --sort=-pcpu | head
10686 60.3 java
8825 21.3 mysqld
15356 0.9 php
15302 0.0 top
76 0.0 kswapd0
935 0.0 memcached
926 0.0 snmpd
12170 0.0 bash
904 0.0 rsyslogd
Unfortunately, this does not say much: the Java process is the Tomcat process, which is running a bunch of threads, including threads for the Quartz scheduler. We need to dive down to the Tomcat thread level, and also see what MySQL is doing; a sketch of both checks follows the note below.
Note: The CPU load went down after certain changes were performed a few days after this post was written. I will not continue with Part II of this post at this time.
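A sketch of both checks, assuming the JDK tools are installed and using the Java PID from the ps output above; the thread ID is a placeholder to fill in from top's output. First, list the busiest threads inside the Tomcat JVM:
> top -H -p 10686
Then convert the hottest thread's ID to hex and find it in a thread dump (jstack ships with the JDK; thread dumps report the native thread ID as nid in hex):
> printf '%x\n' <thread-id>
> jstack 10686 | grep -A 20 'nid=0x<hex-thread-id>'
And check what MySQL is working on:
> mysql -u root -p -e 'SHOW FULL PROCESSLIST;'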