This guide explains what server load really is, how to watch it, how to keep it under control, and how to spot the signs of server trouble.
Server Load Explanation
The load average tries to measure the number of active processes at any time. As a measure of CPU utilization, the load average is simplistic and poorly defined, but far from useless. High load averages usually mean that the system is being used heavily and the response time is correspondingly slow. What’s high? Ideally, you’d like a load average under, say, 3. Ultimately, ‘high’ means high enough that you don’t need uptime to tell you that the system is overloaded. The load average is reported as three figures, covering the past 1, 5, and 15 minutes.
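If you would rather read the three figures from a script than from a shell command, a minimal Python sketch could look like the following. It assumes a Unix-like server where Python's standard os.getloadavg() is available; the function name read_load_averages and the dictionary keys are made up for this illustration.

```python
import os


def read_load_averages():
    """Return the 1-, 5-, and 15-minute load averages as a dict.

    os.getloadavg() raises OSError if the platform cannot report them.
    """
    one, five, fifteen = os.getloadavg()
    # The key names are illustrative; they are not part of any standard.
    return {"1min": one, "5min": five, "15min": fifteen}


if __name__ == "__main__":
    for window, value in read_load_averages().items():
        print(f"{window}: {value:.2f}")
```

Comparing the 1-minute figure with the 15-minute figure gives a rough idea of whether the load is rising or falling.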
Checking the server's load
There are a few different ways to keep an eye on your server's load. The first thing you need to do is log in to your server over SSH.
Method 1 using the uptime command:
The uptime shell command produces the following output:
09:53:04 up 34 days, 14:40, 1 user, load average: 0.01, 0.03, 0.00
It shows the current time, how long the system has been running since it was last booted, the number of users currently logged in, and the load average for the past 1, 5, and 15 minutes.
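If you collect uptime output in a log or a monitoring script, the three load figures can be pulled out with a small parser. The sketch below is one possible approach in Python; the function name parse_load_average and the regular expression are assumptions for this example, and the pattern only covers the output format shown above (plus the "load averages:" spelling some systems use).

```python
import re

# Matches "load average: 0.01, 0.03, 0.00"; commas are optional because
# some systems separate the three values with spaces only.
LOAD_RE = re.compile(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)")


def parse_load_average(uptime_line):
    """Return (1min, 5min, 15min) load averages from a line of uptime output."""
    match = LOAD_RE.search(uptime_line)
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(group) for group in match.groups())


if __name__ == "__main__":
    sample = "09:53:04 up 34 days, 14:40, 1 user, load average: 0.01, 0.03, 0.00"
    one, five, fifteen = parse_load_average(sample)
    print(one, five, fifteen)
```

The same parser works on the first line printed by w and top, since all three commands report the load average in the same way.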
Method 2 using the w command:
The w command produces the following output:
09:52:14 up 34 days, 14:39, 1 user, load average: 0.02, 0.04, 0.01
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
mir ttyp0 :0.0 Fri10pm 3days 0.09s 0.09s bash
giles pts/0 18.104.22.168 9:40am 0.00s 0.29s 0.15s w
Notice that the first line of the output is the same line that the uptime command prints. The lines below it list each logged-in user and what they are currently running.
Method 3 using the top command (preferred):
The top command is a more recent addition to the UNIX command set that ranks processes according to the amount of CPU time they consume. It produces the following output:
top - 09:54:47 up 34 days, 14:42, 1 user, load average: 0.07, 0.05, 0.01
Tasks: 371 total, 1 running, 370 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 0.1%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 37043972k total, 24892516k used, 12151456k free, 284460k buffers
Swap: 39092216k total, 0k used, 39092216k free, 3825204k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31963 root 15 0 12868 1212 720 R 1.9 0.0 0:00.02 top
1 root 15 0 10348 692 580 S 0.0 0.0 0:05.16 init
2 root RT -5 0 0 0 S 0.0 0.0 0:10.52 migration/0
3 root 34 19 0 0 0 S 0.0 0.0 0:00.04 ksoftirqd/0
We like to use the top command because, alongside the load average, it shows server uptime, memory information, and a live list of processes that you can sort by CPU usage, among other things.
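The Cpu(s) line in the top output above is worth a second look, because it breaks CPU time down into user (us), system (sy), idle (id), I/O wait (wa), and so on. As a rough sketch, the line can be turned into numbers with a few lines of Python. The function name parse_cpu_line is hypothetical, and the pattern assumes the older "0.1%us" layout shown above; newer versions of top print the line differently and would need a different pattern.

```python
import re


def parse_cpu_line(line):
    """Parse a top 'Cpu(s):' summary line into {field_name: percent}.

    Assumes the "<number>%<name>" layout, e.g. "99.8%id".
    """
    fields = re.findall(r"([\d.]+)%\s*(\w+)", line)
    return {name: float(value) for value, name in fields}


if __name__ == "__main__":
    sample = ("Cpu(s): 0.1%us, 0.1%sy, 0.0%ni, 99.8%id, "
              "0.0%wa, 0.0%hi, 0.0%si, 0.0%st")
    cpu = parse_cpu_line(sample)
    busy = 100.0 - cpu["id"]  # everything that is not idle time
    print("CPU busy: %.1f%%, waiting on I/O: %.1f%%" % (busy, cpu["wa"]))
```

A high load average together with a high idle percentage usually points at something other than raw CPU work, such as processes waiting on disk.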
So what is a good load, bad load and in between?
I know you’re thinking: so what is a good system load, and what is a bad one? Anything around 1.0 and below is fine; try to stay under 1.0 for regular load averages. Keep in mind that these figures assume a single CPU core. On a multi-core server, compare the load against the number of cores. If you notice your server slowing down, check the load first. We recently hosted a site that was mentioned in the media (TV, news, radio), and the server load skyrocketed because of the huge wave of traffic. The load went from 0.25 to 37.00 just because the server was getting hammered.
When your regular average starts to creep up toward 2.0, your server is very busy and you should consider getting another machine or upgrading your hardware. By regular average, I mean the load during normal daytime operation, when the server isn’t busy processing all your logs or backing up data.
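The rules of thumb above can be written down as a small check. The Python sketch below is only an illustration: the function name classify_load and the label "elevated" for the range between 1.0 and 2.0 are made up for this example, and dividing the load by the number of CPU cores is an added assumption so that the same thresholds can be applied to multi-core servers.

```python
def classify_load(load, cores=1):
    """Classify a load average using the rough thresholds from this guide.

    load  -- a load average figure, e.g. the 15-minute value
    cores -- number of CPU cores (assumption: load is judged per core)
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    per_core = load / cores
    if per_core <= 1.0:
        return "fine"        # around 1.0 and below
    if per_core < 2.0:
        return "elevated"    # creeping up, keep an eye on it
    return "very busy"       # 2.0 or more: consider more hardware


if __name__ == "__main__":
    for value in (0.25, 1.5, 37.0):
        print(value, "->", classify_load(value))
```

Feed it your regular average rather than a one-off spike, since a short burst during log processing or a backup says little about whether the hardware is too small.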
Having an overloaded server can lead to many problems and should always be avoided. I hope this guide was helpful and gave you some more insight into server loads, what to use to monitor them, and what makes a good or bad load average.