I expect a difference between the CPU usage recorded by SAR and that recorded by UNIX process accounting, but how big should it reasonably be?
If you have been measuring the system over a
reasonably long period (say 1 hour), the
discrepancy between SAR and System V process
accounting should be no more than 5%. This
assumes that short-duration jobs have been
captured by process accounting (enabled with
the 'accton' command), and that long-running
jobs have been measured using 'ps' snapshots.
If the difference is larger than that, there are
two likely causes. The first is a large kernel
time element: CPU consumed by kernel activity
that is not attributed to any visible UNIX or
user-loggable process. Processor management
in a large multi-processor machine is one
example. The second, and on a very fast machine
the usual culprit, is very short-lived processes.
UNIX accounting only reports the CPU usage
of a process to two decimal places, i.e.
hundredths of a second. On today's high-speed
CPUs, processes can run and terminate
in a few milliseconds, and these will show up
as 0.00 seconds in the accounting. We saw
one site where, in one day, the same process
was executed 930,000 times, with each execution
averaging 5 milliseconds. This produced
a loss of recorded CPU usage of 4,650 seconds
(77.5 minutes) over the day.
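
The arithmetic behind that figure can be checked with a short sketch.
It assumes that accounting truncates each process's CPU time to whole
hundredths of a second, which is one way a 5 millisecond process would
appear as 0.00. The function names are hypothetical and are not part
of any accounting tool.

```python
def recorded_cpu_seconds(actual_ms: int) -> float:
    """CPU time as accounting would show it.

    Assumption: the value is truncated to whole hundredths of a second.
    """
    return (actual_ms // 10) / 100


def lost_cpu_seconds(executions: int, actual_ms: int) -> float:
    """Total CPU seconds missing from the accounting totals."""
    # Milliseconds per execution that fall below the 0.01 s resolution.
    unrecorded_ms = actual_ms - (actual_ms // 10) * 10
    return executions * unrecorded_ms / 1000


if __name__ == "__main__":
    # Figures from the site described above: 930,000 runs of 5 ms each.
    print(recorded_cpu_seconds(5))
    print(lost_cpu_seconds(930_000, 5))
```

With the figures quoted above, each execution is recorded as zero and
the whole 4,650 seconds goes unrecorded. If you know the run count and
typical CPU time of a suspect process on your own system, the same
calculation gives a rough idea of how much of the gap it explains.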