profile_blockio
The profile_blockio gadget gathers information about block device I/O (disk I/O) usage, generating a histogram distribution of I/O latency when the gadget is stopped.
Note that disk I/O latency is measured from when the request is issued to the device until its completion; it does not include time spent in the kernel queue. This means that the histogram reflects only the performance of the device itself, not the effective latency experienced by applications.
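To illustrate this measurement window, here is a minimal bpftrace sketch (not the gadget's actual implementation; field names follow the kernel's block tracepoint format) that takes a timestamp when a request is issued to the device and records the latency in a histogram on completion:
$ sudo bpftrace -e '
tracepoint:block:block_rq_issue { @start[args->dev, args->sector] = nsecs; }
tracepoint:block:block_rq_complete /@start[args->dev, args->sector]/ {
    // latency from device issue to completion, in microseconds
    @usecs = hist((nsecs - @start[args->dev, args->sector]) / 1000);
    delete(@start[args->dev, args->sector]);
}'
Because the timestamp is taken at block_rq_issue rather than when the request is first created, any time the request spends queued inside the kernel is excluded, just as described above.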
The histogram shows the number of I/O operations (the count column) that fall within each latency range interval-start -> interval-end (the µs column), which, as the column name indicates, is given in microseconds. For example, a count of 21 in the 8 -> 16 row means that 21 operations took between 8 and 16 microseconds to complete.
For this guide, we will use the stress tool, which allows us to load and stress the system in many different ways. In particular, we will use the --io flag, which spawns a given number of workers spinning on the sync() syscall. In this way, we will generate disk I/O that we will then analyse using the profile_blockio gadget.
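If you are curious about what these workers actually do, you can watch the sync() calls they issue; this step is optional and assumes strace and stress are installed locally:
# Trace only the sync() syscall across the forked workers for 2 seconds
$ strace -f -e trace=sync stress --io 1 --timeout 2s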
Getting started
Running the gadget:
- kubectl gadget
- ig
$ kubectl gadget run ghcr.io/inspektor-gadget/gadget/profile_blockio:latest [flags]
$ sudo ig run ghcr.io/inspektor-gadget/gadget/profile_blockio:latest [flags]
Guide
Run the gadget in a terminal:
- kubectl gadget
- ig
$ kubectl gadget run profile_blockio:latest --node minikube-docker
It will start to display the I/O latency distribution as follows:
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 1 |* |
8 -> 16 : 21 |*************************** |
16 -> 32 : 17 |********************* |
32 -> 64 : 12 |*************** |
64 -> 128 : 0 | |
128 -> 256 : 0 | |
256 -> 512 : 0 | |
512 -> 1024 : 21 |*************************** |
1024 -> 2048 : 3 |*** |
2048 -> 4096 : 31 |****************************************|
4096 -> 8192 : 0 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
Now, let's increase the I/O operations using the stress tool:
# Start by creating our testing namespace
$ kubectl create ns test-biolatency
# Run stress with 1 worker that will generate I/O operations
$ kubectl run --restart=Never --image=polinux/stress stress-io -n test-biolatency -- stress --io 1
$ kubectl wait --timeout=-1s -n test-biolatency --for=condition=ready pod/stress-io
pod/stress-io condition met
$ kubectl get pod -n test-biolatency -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
stress-io 1/1 Running 0 2s 10.244.1.7 worker-node <none> <none>
Using the profile_blockio gadget, we can generate another histogram to analyse the disk I/O under this load:
# Run the gadget again
$ kubectl gadget run profile_blockio:latest --node minikube-docker
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 786 | |
8 -> 16 : 57788 |****************************************|
16 -> 32 : 39060 |*************************** |
32 -> 64 : 1513 |* |
64 -> 128 : 36 | |
128 -> 256 : 16 | |
256 -> 512 : 260 | |
512 -> 1024 : 2045 |* |
1024 -> 2048 : 986 | |
2048 -> 4096 : 57 | |
4096 -> 8192 : 1 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
The new histogram shows how the number of I/O operations increased significantly.
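To compare runs numerically rather than by eyeballing the bars, you can total the count column of a saved histogram; histogram.txt is a hypothetical file containing the gadget's output copied from the terminal:
# Every bucket row has the form "start -> end : count |bar|"
$ awk '$2 == "->" { total += $5 } END { print total " I/O operations" }' histogram.txt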
$ sudo ig run profile_blockio:latest
It will start to display the I/O latency distribution as follows:
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 4 | |
8 -> 16 : 11 |** |
16 -> 32 : 1 | |
32 -> 64 : 10 |* |
64 -> 128 : 0 | |
128 -> 256 : 0 | |
256 -> 512 : 2 | |
512 -> 1024 : 11 |** |
1024 -> 2048 : 1 | |
2048 -> 4096 : 213 |****************************************|
4096 -> 8192 : 0 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
Now, let's increase the I/O operations using the stress tool:
$ docker run -d --rm --name stresstest polinux/stress stress --io 10
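You can verify that the container is up before profiling:
$ docker ps --filter name=stresstest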
Run the gadget to observe the I/O latency distribution:
$ sudo ig run profile_blockio:latest
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 202 | |
8 -> 16 : 13027 |************************************* |
16 -> 32 : 13833 |****************************************|
32 -> 64 : 4272 |************ |
64 -> 128 : 876 |** |
128 -> 256 : 13 | |
256 -> 512 : 529 |* |
512 -> 1024 : 2913 |******** |
1024 -> 2048 : 720 |** |
2048 -> 4096 : 86 | |
4096 -> 8192 : 5 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
You can clean up the resources created during this guide by running the following commands:
- kubectl gadget
- ig
$ kubectl delete ns test-biolatency
$ docker rm -f stresstest
Exporting metrics
The profile_blockio gadget can expose the histograms it generates to a Prometheus endpoint. To do so, you need to activate both the metrics listener and the gadget collector. To enable the metrics listener, check the Exporting Metrics documentation. To enable the collector for the profile_blockio gadget with the metrics name blockio-metrics, run the following command:
- kubectl gadget
- ig
WIP: Headless mode for kubectl gadget is under development
$ gadgetctl run ghcr.io/inspektor-gadget/gadget/profile_blockio:latest \
    --annotate=blockio:metrics.collect=true \
    --otel-metrics-name=blockio:blockio-metrics \
    --detach
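Here, --annotate=blockio:metrics.collect=true enables metrics collection for the gadget's blockio data source, --otel-metrics-name=blockio:blockio-metrics maps that data source to the blockio-metrics metrics scope, and --detach runs the gadget headless so it keeps collecting in the background.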
Unless you configured the metrics listener differently, the metrics will be available at http://localhost:2224/metrics on the server side. For the profile_blockio gadget we ran above, the metrics will be available under the blockio-metrics scope:
$ curl http://localhost:2224/metrics -s | grep blockio-metrics
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="1"} 0
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="2"} 0
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="4"} 0
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="8"} 10
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="16"} 193
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="32"} 374
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="64"} 943
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="128"} 1825
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="256"} 2829
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="512"} 3905
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="1024"} 4280
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="2048"} 4351
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="4096"} 4351
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="8192"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="16384"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="32768"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="65536"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="131072"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="262144"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="524288"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="1.048576e+06"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="2.097152e+06"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="4.194304e+06"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="8.388608e+06"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="1.6777216e+07"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="3.3554432e+07"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="6.7108864e+07"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="+Inf"} 4354
latency_sum{otel_scope_name="blockio-metrics",otel_scope_version=""} 1.520032e+06
latency_count{otel_scope_name="blockio-metrics",otel_scope_version=""} 4354
otel_scope_info{otel_scope_name="blockio-metrics",otel_scope_version=""} 1
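As a quick sanity check, the average latency in µs can be derived from the exported sum and count; this one-liner assumes the endpoint and scope shown above:
$ curl -s http://localhost:2224/metrics | awk '
    /^latency_sum/   && /blockio-metrics/ { sum = $2 }
    /^latency_count/ && /blockio-metrics/ { cnt = $2 }
    END { if (cnt) printf "average latency: %.1f µs\n", sum / cnt }'
For the output above, this works out to roughly 349 µs (1.520032e+06 / 4354).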
Finally, stop metrics collection:
- kubectl gadget
- ig
WIP: Headless mode for kubectl gadget is under development
$ gadgetctl list
ID NAME TAGS GADGET
3e68634c4c28 amazing_payne ghcr.io/inspektor-gadget/gadget/profile_blockio:latest
$ gadgetctl delete 3e68634c4c28
3e68634c4c28a981a60fe96a29d24a99