profile_blockio
The profile_blockio gadget gathers information about block device I/O (disk I/O) usage and, when it is stopped, prints a histogram of the I/O latency distribution.
Note that the disk I/O latency is measured from when the request is issued to the device until its completion; it does not include time spent in the kernel queue. This means the histogram reflects only the performance of the device, not the effective latency experienced by applications.
The histogram shows the number of I/O operations (count column) that fall within each latency range interval-start -> interval-end (µs column), which, as the column name indicates, is given in microseconds.
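For example, in the first histogram below, the row 8 -> 16 : 21 means that 21 I/O operations completed with a latency between 8 and 16 microseconds.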
For this guide, we will use the stress tool, which allows us to load and stress the system in many different ways. In particular, we will use the --io flag, which spawns a given number of workers spinning on the sync() syscall. In this way, we will generate disk I/O that we can then analyse using the profile_blockio gadget.
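As a rough sketch (not the actual stress implementation), each --io worker behaves approximately like a shell loop that keeps flushing filesystem buffers:
# Rough approximation of one stress --io worker: the sync command issues
# the sync() syscall, forcing dirty buffers to be written to disk.
$ while true; do sync; done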
Getting started
Running the gadget:
With kubectl gadget:
$ kubectl gadget run ghcr.io/inspektor-gadget/gadget/profile_blockio:latest [flags]
With ig:
$ sudo ig run ghcr.io/inspektor-gadget/gadget/profile_blockio:latest [flags]
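The rest of this guide references the gadget by its short name, which by default resolves to the same ghcr.io/inspektor-gadget/gadget/ image, so the following commands are equivalent to the ones above:
$ kubectl gadget run profile_blockio:latest [flags]
$ sudo ig run profile_blockio:latest [flags]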
Guide
With kubectl gadget, run the gadget in a terminal:
$ kubectl gadget run profile_blockio:latest --node minikube-docker
Once you stop the gadget (e.g. with Ctrl+C), it will display the I/O latency distribution as follows:
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 1 |* |
8 -> 16 : 21 |*************************** |
16 -> 32 : 17 |********************* |
32 -> 64 : 12 |*************** |
64 -> 128 : 0 | |
128 -> 256 : 0 | |
256 -> 512 : 0 | |
512 -> 1024 : 21 |*************************** |
1024 -> 2048 : 3 |*** |
2048 -> 4096 : 31 |****************************************|
4096 -> 8192 : 0 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
Now, let's increase the I/O operations using the stress tool:
# Start by creating our testing namespace
$ kubectl create ns test-biolatency
# Run stress with 1 worker that will generate I/O operations
$ kubectl run --restart=Never --image=polinux/stress stress-io -n test-biolatency -- stress --io 1
$ kubectl wait --timeout=-1s -n test-biolatency --for=condition=ready pod/stress-io
pod/stress-io condition met
$ kubectl get pod -n test-biolatency -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
stress-io 1/1 Running 0 2s 10.244.1.7 worker-node <none> <none>
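Since block I/O is profiled per node, run the gadget on the node where the stress pod was scheduled (the NODE column above). If you need to query it, one way among others is:
$ kubectl get pod stress-io -n test-biolatency -o jsonpath='{.spec.nodeName}'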
Using the profile_blockio gadget, we can generate another histogram to analyse the disk I/O under this load:
# Run the gadget again
$ kubectl gadget run profile_blockio:latest --node minikube-docker
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 786 | |
8 -> 16 : 57788 |****************************************|
16 -> 32 : 39060 |*************************** |
32 -> 64 : 1513 |* |
64 -> 128 : 36 | |
128 -> 256 : 16 | |
256 -> 512 : 260 | |
512 -> 1024 : 2045 |* |
1024 -> 2048 : 986 | |
2048 -> 4096 : 57 | |
4096 -> 8192 : 1 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
The new histogram shows how the number of I/O operations increased significantly.
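If you want to return to the baseline before moving on, you can simply delete the stress pod; the test namespace itself is removed in the cleanup step at the end of this guide:
$ kubectl delete pod stress-io -n test-biolatency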
With ig, run the gadget in a terminal:
$ sudo ig run profile_blockio:latest
Once you stop the gadget (e.g. with Ctrl+C), it will display the I/O latency distribution as follows:
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 4 | |
8 -> 16 : 11 |** |
16 -> 32 : 1 | |
32 -> 64 : 10 |* |
64 -> 128 : 0 | |
128 -> 256 : 0 | |
256 -> 512 : 2 | |
512 -> 1024 : 11 |** |
1024 -> 2048 : 1 | |
2048 -> 4096 : 213 |****************************************|
4096 -> 8192 : 0 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
Now, let's increase the I/O operations using the stress tool:
$ docker run -d --rm --name stresstest polinux/stress stress --io 10
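You can verify that the container is running and stressing the system with the usual docker commands, for example:
$ docker ps --filter name=stresstest
$ docker logs stresstest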
Run the gadget to observe the I/O latency distribution:
$ sudo ig run profile_blockio:latest
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 202 | |
8 -> 16 : 13027 |************************************* |
16 -> 32 : 13833 |****************************************|
32 -> 64 : 4272 |************ |
64 -> 128 : 876 |** |
128 -> 256 : 13 | |
256 -> 512 : 529 |* |
512 -> 1024 : 2913 |******** |
1024 -> 2048 : 720 |** |
2048 -> 4096 : 86 | |
4096 -> 8192 : 5 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
Congratulations! You reached the end of this guide! You can clean up the resources created along the way by running the following commands:
With kubectl gadget:
$ kubectl delete ns test-biolatency
With ig:
$ docker rm -f stresstest