profile_blockio
The profile_blockio gadget gathers information about block device I/O (disk I/O) usage, generating a histogram distribution of I/O latency when the gadget is stopped.
Note that disk I/O latency is measured from when the request is issued to the device until its completion; it does not include time spent in the kernel queue. This means that the histogram reflects only the performance of the device itself, not the effective latency experienced by applications.
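To illustrate this measurement window, here is a minimal bpftrace sketch (not the gadget's actual implementation; field names follow the kernel's block tracepoint format) that takes a timestamp when a request is issued to the device and records the latency in a histogram on completion:
$ sudo bpftrace -e '
tracepoint:block:block_rq_issue { @start[args->dev, args->sector] = nsecs; }
tracepoint:block:block_rq_complete /@start[args->dev, args->sector]/ {
    // latency from device issue to completion, in microseconds
    @usecs = hist((nsecs - @start[args->dev, args->sector]) / 1000);
    delete(@start[args->dev, args->sector]);
}'
Because the timestamp is taken at block_rq_issue rather than when the request is first created, any time the request spends queued inside the kernel is excluded, just as described above.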
The histogram shows the number of I/O operations (the count column) that fall within each latency range interval-start -> interval-end (the µs column), which, as the column name indicates, is given in microseconds. For example, a count of 21 in the 8 -> 16 row means that 21 operations took between 8 and 16 microseconds to complete.
For this guide, we will use the stress tool, which allows us to load and stress the system in many different ways. In particular, we will use the --io flag, which spawns a given number of workers spinning on the sync() syscall. In this way, we will generate disk I/O that we will then analyse using the profile_blockio gadget.
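If you are curious about what these workers actually do, you can watch the sync() calls they issue; this step is optional and assumes strace and stress are installed locally:
# Trace only the sync() syscall across the forked workers for 2 seconds
$ strace -f -e trace=sync stress --io 1 --timeout 2s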
Getting started
Running the gadget:
- kubectl gadget
- ig
$ kubectl gadget run ghcr.io/inspektor-gadget/gadget/profile_blockio:latest [flags]
$ sudo ig run ghcr.io/inspektor-gadget/gadget/profile_blockio:latest [flags]
Guide
Run the gadget in a terminal:
- kubectl gadget
- ig
$ kubectl gadget run profile_blockio:latest --node minikube-docker
It will start to display the I/O latency distribution as follows:
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 1 |* |
8 -> 16 : 21 |*************************** |
16 -> 32 : 17 |********************* |
32 -> 64 : 12 |*************** |
64 -> 128 : 0 | |
128 -> 256 : 0 | |
256 -> 512 : 0 | |
512 -> 1024 : 21 |*************************** |
1024 -> 2048 : 3 |*** |
2048 -> 4096 : 31 |****************************************|
4096 -> 8192 : 0 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
Now, let's increase the I/O operations using the stress tool:
# Start by creating our testing namespace
$ kubectl create ns test-biolatency
# Run stress with 1 worker that will generate I/O operations
$ kubectl run --restart=Never --image=polinux/stress stress-io -n test-biolatency -- stress --io 1
$ kubectl wait --timeout=-1s -n test-biolatency --for=condition=ready pod/stress-io
pod/stress-io condition met
$ kubectl get pod -n test-biolatency -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
stress-io 1/1 Running 0 2s 10.244.1.7 worker-node <none> <none>
Using the profile_blockio gadget, we can generate another histogram to analyse the disk I/O under this load:
# Run the gadget again
$ kubectl gadget run profile_blockio:latest --node minikube-docker
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 786 | |
8 -> 16 : 57788 |****************************************|
16 -> 32 : 39060 |*************************** |
32 -> 64 : 1513 |* |
64 -> 128 : 36 | |
128 -> 256 : 16 | |
256 -> 512 : 260 | |
512 -> 1024 : 2045 |* |
1024 -> 2048 : 986 | |
2048 -> 4096 : 57 | |
4096 -> 8192 : 1 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
The new histogram shows how the number of I/O operations increased significantly.
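To compare runs numerically rather than by eyeballing the bars, you can total the count column of a saved histogram; histogram.txt is a hypothetical file containing the gadget's output copied from the terminal:
# Every bucket row has the form "start -> end : count |bar|"
$ awk '$2 == "->" { total += $5 } END { print total " I/O operations" }' histogram.txt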
$ sudo ig run profile_blockio:latest
It will start to display the I/O latency distribution as follows:
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 4 | |
8 -> 16 : 11 |** |
16 -> 32 : 1 | |
32 -> 64 : 10 |* |
64 -> 128 : 0 | |
128 -> 256 : 0 | |
256 -> 512 : 2 | |
512 -> 1024 : 11 |** |
1024 -> 2048 : 1 | |
2048 -> 4096 : 213 |****************************************|
4096 -> 8192 : 0 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
Now, let's increase the I/O operations using the stress tool:
$ docker run -d --rm --name stresstest polinux/stress stress --io 10
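You can verify that the container is up before profiling:
$ docker ps --filter name=stresstest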
Run the gadget to observe the I/O latency distribution:
$ sudo ig run profile_blockio:latest
latency
µs : count distribution
0 -> 1 : 0 | |
1 -> 2 : 0 | |
2 -> 4 : 0 | |
4 -> 8 : 202 | |
8 -> 16 : 13027 |************************************* |
16 -> 32 : 13833 |****************************************|
32 -> 64 : 4272 |************ |
64 -> 128 : 876 |** |
128 -> 256 : 13 | |
256 -> 512 : 529 |* |
512 -> 1024 : 2913 |******** |
1024 -> 2048 : 720 |** |
2048 -> 4096 : 86 | |
4096 -> 8192 : 5 | |
8192 -> 16384 : 0 | |
16384 -> 32768 : 0 | |
32768 -> 65536 : 0 | |
65536 -> 131072 : 0 | |
131072 -> 262144 : 0 | |
262144 -> 524288 : 0 | |
524288 -> 1048576 : 0 | |
1048576 -> 2097152 : 0 | |
2097152 -> 4194304 : 0 | |
4194304 -> 8388608 : 0 | |
8388608 -> 16777216 : 0 | |
16777216 -> 33554432 : 0 | |
33554432 -> 67108864 : 0 | |
You can clean up the resources created during this guide by running the following commands:
- kubectl gadget
- ig
$ kubectl delete ns test-biolatency
$ docker rm -f stresstest
Exporting metrics
The profile_blockio gadget can expose the histograms it generates to a Prometheus endpoint. To do so, you need to activate both the metrics listener and the gadget collector. To enable the metrics listener, check the Exporting Metrics documentation. To enable the collector for the profile_blockio gadget with the metrics name blockio-metrics, run the following command:
- kubectl gadget
- ig
WIP: Headless mode for kubectl gadget is under development
$ gadgetctl run ghcr.io/inspektor-gadget/gadget/profile_blockio:latest \
    --annotate=blockio:metrics.collect=true \
    --otel-metrics-name=blockio:blockio-metrics \
    --detach
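Here, --annotate=blockio:metrics.collect=true enables metrics collection for the gadget's blockio data source, --otel-metrics-name=blockio:blockio-metrics maps that data source to the blockio-metrics metrics scope, and --detach runs the gadget headless so it keeps collecting in the background.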
Unless you configured the metrics listener differently, the metrics will be available at http://localhost:2224/metrics on the server side. For the profile_blockio gadget we ran above, the metrics will be available under the blockio-metrics scope:
$ curl http://localhost:2224/metrics -s | grep blockio-metrics
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="1"} 0
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="2"} 0
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="4"} 0
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="8"} 10
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="16"} 193
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="32"} 374
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="64"} 943
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="128"} 1825
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="256"} 2829
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="512"} 3905
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="1024"} 4280
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="2048"} 4351
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="4096"} 4351
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="8192"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="16384"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="32768"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="65536"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="131072"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="262144"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="524288"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="1.048576e+06"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="2.097152e+06"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="4.194304e+06"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="8.388608e+06"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="1.6777216e+07"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="3.3554432e+07"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="6.7108864e+07"} 4354
latency_bucket{otel_scope_name="blockio-metrics",otel_scope_version="",le="+Inf"} 4354
latency_sum{otel_scope_name="blockio-metrics",otel_scope_version=""} 1.520032e+06
latency_count{otel_scope_name="blockio-metrics",otel_scope_version=""} 4354
otel_scope_info{otel_scope_name="blockio-metrics",otel_scope_version=""} 1
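As a quick sanity check, the average latency in µs can be derived from the exported sum and count; this one-liner assumes the endpoint and scope shown above:
$ curl -s http://localhost:2224/metrics | awk '
    /^latency_sum/   && /blockio-metrics/ { sum = $2 }
    /^latency_count/ && /blockio-metrics/ { cnt = $2 }
    END { if (cnt) printf "average latency: %.1f µs\n", sum / cnt }'
For the output above, this works out to roughly 349 µs (1.520032e+06 / 4354).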
Finally, stop metrics collection:
- kubectl gadget
- ig
WIP: Headless mode for kubectl gadget is under development
$ gadgetctl list
ID NAME TAGS GADGET
3e68634c4c28 amazing_payne ghcr.io/inspektor-gadget/gadget/profile_blockio:latest
$ gadgetctl delete 3e68634c4c28
3e68634c4c28a981a60fe96a29d24a99