"I've got two workloads which use the same device, one reading randomly, the other serially. The disk service times will be very different, but I only get one average service time from the measurements. How can I represent this in a model?"
Let's pick some nice easy
numbers (to make the calculation simple,
but the technique works even with awkward
ones). Workload 1 does 20 I/Os per second
reading serially. Workload 2 does 5 per
second reading randomly. You estimate that
each serial I/O will take 5ms and each random
one 15ms. What you actually get from the
measurements is a single average of about 7ms,
which is a useless figure for either workload.
What to do?
Set the service time of the device to 5ms.
This is right for the serial-read component
but too low for the random one by a factor
of three. To compensate, multiply the I/O
rate for the random component by three, so the
model sees 15 I/Os per second at 5ms each.
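The scaling step generalizes beyond a factor of three: the I/O rate entered in the model is the real rate multiplied by the ratio of the real service time to the service time set on the device. Below is a minimal Python sketch of that idea. The function name `scaled_io_rate` and the constant `MODEL_SERVICE_MS` are made up for illustration and do not come from any modeling tool.

```python
def scaled_io_rate(io_rate, actual_service_ms, model_service_ms):
    """Return the I/O rate to enter in the model (hypothetical helper).

    The rate is scaled so that rate x service time, and therefore device
    utilization, stays the same as in the real workload.
    """
    return io_rate * actual_service_ms / model_service_ms


# Assumption: the device in the model is given the serial service time.
MODEL_SERVICE_MS = 5.0

serial_rate = scaled_io_rate(20, 5.0, MODEL_SERVICE_MS)   # unchanged
random_rate = scaled_io_rate(5, 15.0, MODEL_SERVICE_MS)   # tripled

print(serial_rate, random_rate)  # 20.0 15.0
```

The serial workload keeps its 20 I/Os per second, while the random workload is entered as 15 I/Os per second instead of 5.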
Proof?
Workload 1 - (20 x 5) per second = 100ms in a second - 10% device utilization
Workload 2 - (5 x 15) per second = 75ms in a second - 7.5% utilization
Total = 175ms for 25 I/Os - calculated service time will be 7ms - utilization = 17.5%
Applying that (wrong) service time in a model will give:
Workload 1 - (20 x 7) per second = 140ms in a second - 14% utilization - too high
Workload 2 - (5 x 7) per second = 35ms in a second - 3.5% utilization - too low
Now do the multiplication:
Workload 1 - (20 x 5) per second = 100ms in a second - 10% device utilization - OK
Workload 2 - (15 x 5) per second = 75ms in a second - 7.5% utilization - OK
Both workloads are now right, and the total
device utilization comes out right too.
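For anyone who wants to check the arithmetic, here is a short Python sketch that reproduces the three cases of the proof: the true figures, the figures using the averaged service time, and the figures using the scaled I/O rate. All names in it (`workloads`, `utilization_pct` and so on) are illustrative only.

```python
def busy_ms_per_second(io_rate, service_ms):
    """Milliseconds of device time consumed in each second."""
    return io_rate * service_ms


def utilization_pct(io_rate, service_ms):
    """Device utilization as a percentage (busy ms / 1000 ms x 100)."""
    return busy_ms_per_second(io_rate, service_ms) / 10


# (I/Os per second, estimated service time in ms) from the example
workloads = {"serial": (20, 5), "random": (5, 15)}

# Case 1: the true per-workload utilization
true_util = {n: utilization_pct(r, s) for n, (r, s) in workloads.items()}

# Case 2: the single averaged service time the measurements report
total_busy = sum(busy_ms_per_second(r, s) for r, s in workloads.values())
total_rate = sum(r for r, _ in workloads.values())
avg_service_ms = total_busy / total_rate
avg_util = {n: utilization_pct(r, avg_service_ms)
            for n, (r, _) in workloads.items()}

# Case 3: device set to 5ms, I/O rates scaled to compensate
model_service_ms = 5
scaled_util = {n: utilization_pct(r * s / model_service_ms, model_service_ms)
               for n, (r, s) in workloads.items()}

print(avg_service_ms)  # 7.0
print(true_util)       # {'serial': 10.0, 'random': 7.5}
print(avg_util)        # {'serial': 14.0, 'random': 3.5}
print(scaled_util)     # {'serial': 10.0, 'random': 7.5}
```

The scaled case matches the true case for each workload, whereas the averaged case only matches in total.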
Motto - Always look both ways before crossing
the road.