Version: main

top_cpu_throttle

The top_cpu_throttle gadget periodically reports cgroup v2 CPU throttling statistics, making it easy to identify which containers, pods, or namespaces are the worst CFS (Completely Fair Scheduler) throttle offenders.

This is particularly useful when PSI (Pressure Stall Information) reports high CPU pressure on a node while top or htop still shows idle CPU time. In that case, the pressure is likely caused by cgroup CPU limits that are too tight rather than by genuine CPU overutilization.

Getting started

Running the gadget:

$ kubectl gadget run ghcr.io/inspektor-gadget/gadget/top_cpu_throttle:latest

Guide

The gadget reads the following cgroup v2 files for every cgroup that has a CPU bandwidth limit:

File          Data
cpu.max       Quota and period (the CPU limit)
cpu.stat      nr_periods, nr_throttled, throttled_usec (cumulative throttle counters)
cpu.pressure  PSI averages (optional; gracefully skipped if unavailable)
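
As an illustration of what these files contain, the following is a minimal Python sketch of parsing cpu.max and cpu.stat. It is not the gadget's own code; the function names parse_cpu_max and parse_cpu_stat are hypothetical, and the sample strings are made-up values in the kernel's documented format (quota and period in microseconds, with the literal "max" meaning no limit).

```python
from typing import Dict, Optional, Tuple


def parse_cpu_max(text: str) -> Tuple[Optional[int], int]:
    """Parse cpu.max ("<quota> <period>", microseconds).

    Returns (quota, period); quota is None when the file says "max",
    i.e. the cgroup has no CPU bandwidth limit.
    """
    quota_s, period_s = text.split()
    quota = None if quota_s == "max" else int(quota_s)
    return quota, int(period_s)


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse cpu.stat, a flat list of "<key> <value>" lines."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    print(parse_cpu_max("5000 100000\n"))
    print(parse_cpu_stat("nr_periods 50\nnr_throttled 8\nthrottled_usec 62720\n"))
```

On a real host the inputs would come from files such as /sys/fs/cgroup/<path>/cpu.max, where the mount point is an assumption about a typical cgroup v2 setup.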

Reported fields

Column          Description
CGROUP          Cgroup v2 path (hidden by default)
PERIODS         Number of CFS enforcement intervals elapsed in the reporting period
THROTTLED       Number of times the cgroup was throttled
THROTTLED_TIME  Total time spent throttled (human-readable duration)
%THROT          Throttle percentage (THROTTLED / PERIODS × 100, 0–100)
QUOTA           CPU quota per CFS period (human-readable duration)
LIMIT           Effective CPU core limit (quota / period)
%PSI10          CPU pressure % over 10 s (PSI "some" avg)
%PSI60          CPU pressure % over 60 s (PSI "some" avg)
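
The two derived columns, %THROT and LIMIT, follow directly from the formulas in the table. The Python sketch below shows the arithmetic; the function names are hypothetical, and the handling of zero elapsed periods is an assumption rather than documented gadget behavior.

```python
def throttle_percent(periods: int, throttled: int) -> float:
    """%THROT = THROTTLED / PERIODS x 100, kept within 0-100.

    Assumption: report 0 when no enforcement period elapsed, to avoid
    dividing by zero.
    """
    if periods <= 0:
        return 0.0
    return min(100.0, throttled / periods * 100.0)


def core_limit(quota_usec: int, period_usec: int) -> float:
    """LIMIT = quota / period, expressed in CPU cores."""
    return quota_usec / period_usec


if __name__ == "__main__":
    # 8 throttled periods out of 50, and a 5 ms quota per 100 ms period.
    print(f"{throttle_percent(50, 8):.2f}")
    print(f"{core_limit(5_000, 100_000):.2f}")
```

For example, a 5 ms quota with a 100 ms period corresponds to a limit of 0.05 cores, and 8 throttled periods out of 50 give a throttle percentage of 16.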

Only cgroups with an explicit CPU bandwidth limit are shown — cgroups without a cpu.max quota (i.e., unlimited) are filtered out.
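
The filtering rule can be sketched as a walk over the cgroup tree that skips every cgroup whose cpu.max quota is the literal "max". This is an illustrative Python sketch, not the gadget's implementation; the function name limited_cgroups and the default mount point /sys/fs/cgroup are assumptions.

```python
import os
from typing import Iterator, Tuple


def limited_cgroups(root: str = "/sys/fs/cgroup") -> Iterator[Tuple[str, int, int]]:
    """Yield (relative path, quota_usec, period_usec) for limited cgroups.

    Cgroups without a cpu.max file, with unreadable or malformed contents,
    or with a quota of "max" (unlimited) are skipped.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        if "cpu.max" not in filenames:
            continue
        try:
            with open(os.path.join(dirpath, "cpu.max")) as f:
                quota_s, period_s = f.read().split()
            if quota_s == "max":
                continue  # unlimited: filtered out
            quota, period = int(quota_s), int(period_s)
        except (OSError, ValueError):
            continue  # cgroup vanished or file content was unexpected
        yield os.path.relpath(dirpath, root), quota, period


if __name__ == "__main__":
    for path, quota, period in limited_cgroups():
        print(path, quota, period)
```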

PSI "full" is omitted because CFS throttling is all-or-nothing at the cgroup level — "full" is always identical to "some" for CPU pressure.
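
For reference, a cpu.pressure file holds one line per PSI kind, each with avg10, avg60, avg300, and total fields. The Python sketch below extracts the "some" line and falls back to zeros when the file cannot be read, mirroring the documented behavior of the PSI columns. The function names and the sample values are hypothetical.

```python
from typing import Dict

ZERO_PSI = {"avg10": 0.0, "avg60": 0.0, "avg300": 0.0, "total": 0.0}


def parse_psi_some(text: str) -> Dict[str, float]:
    """Return the fields of the "some" line of a PSI file as floats."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "some":
            return {k: float(v) for k, v in (f.split("=", 1) for f in fields[1:])}
    return dict(ZERO_PSI)


def read_psi_some(path: str) -> Dict[str, float]:
    """Read a cpu.pressure file; report zeros if PSI is unavailable."""
    try:
        with open(path) as f:
            return parse_psi_some(f.read())
    except OSError:
        return dict(ZERO_PSI)


if __name__ == "__main__":
    sample = (
        "some avg10=79.43 avg60=29.55 avg300=8.10 total=123456\n"
        "full avg10=79.43 avg60=29.55 avg300=8.10 total=123456\n"
    )
    psi = parse_psi_some(sample)
    print(psi["avg10"], psi["avg60"])
```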

All throttle counters are per-interval deltas, not cumulative totals — they reflect what happened since the last report.
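
Because cpu.stat counters are cumulative, a per-interval value is the difference between two consecutive samples. The Python sketch below shows that subtraction. The ThrottleCounters type and the delta function are hypothetical names, and clamping negative differences to zero (for instance after a cgroup is recreated) is an assumption, not documented gadget behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThrottleCounters:
    nr_periods: int
    nr_throttled: int
    throttled_usec: int


def delta(prev: ThrottleCounters, curr: ThrottleCounters) -> ThrottleCounters:
    """Per-interval counters: current sample minus previous sample.

    Assumption: a negative difference means the counters were reset,
    so it is clamped to zero.
    """
    return ThrottleCounters(
        nr_periods=max(0, curr.nr_periods - prev.nr_periods),
        nr_throttled=max(0, curr.nr_throttled - prev.nr_throttled),
        throttled_usec=max(0, curr.throttled_usec - prev.throttled_usec),
    )


if __name__ == "__main__":
    before = ThrottleCounters(nr_periods=100, nr_throttled=20, throttled_usec=500_000)
    after = ThrottleCounters(nr_periods=150, nr_throttled=28, throttled_usec=562_720)
    print(delta(before, after))
```

With these made-up samples, the interval covers 50 periods, 8 of them throttled, for 62,720 µs (62.72 ms) of throttled time.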

Sorting by worst offenders

$ kubectl gadget run ghcr.io/inspektor-gadget/gadget/top_cpu_throttle:latest --sort -throttleRatio
K8S.NODE K8S.NAMESPACE K8S.PODNAME K8S.CONTAINER PERIODS THROTTLED THROTTLED_TIME %THROT QUOTA   LIMIT %PSI10 %PSI60
minikube default       heavy-web   heavy-web     50      50        7.24511s       100.00 5.00ms  0.05  79.43  29.55
minikube default       api-server  api-server    50      50        2.87386s       100.00 20.00ms 0.20  60.54  24.27
minikube default       light-load  light-load    50      8         62.72ms        16.00  10.00ms 0.10  0.27   0.08
minikube default       batch-job   batch-job     50      0         0ns            0.00   100.00m 1.00  0.00   0.00

Adjusting the interval

$ sudo ig run top_cpu_throttle --interval 3s

Requirements

  • cgroup v2 — the host must use cgroup v2 (unified hierarchy). Cgroup v1 is not supported.
  • PSI — PSI metrics require CONFIG_PSI=y in the kernel (Linux 4.20+). If PSI is not available, the PSI columns will report 0 and the gadget will continue to work normally.
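
A host can be checked against both requirements before running the gadget. The Python sketch below is one possible check, not part of the gadget: it assumes cgroup v2 is mounted at /sys/fs/cgroup (where a cgroup.controllers file marks the unified hierarchy) and that system-wide PSI is exposed at /proc/pressure/cpu. The function names are hypothetical.

```python
import os


def has_cgroup_v2(cgroup_root: str = "/sys/fs/cgroup") -> bool:
    """True if the root of the hierarchy has a cgroup.controllers file,
    which exists only on the cgroup v2 unified hierarchy."""
    return os.path.isfile(os.path.join(cgroup_root, "cgroup.controllers"))


def has_psi(proc_root: str = "/proc") -> bool:
    """True if the CPU pressure file can be opened and read.

    Reading (not just checking existence) also covers kernels where the
    file exists but PSI is disabled and reads fail.
    """
    try:
        with open(os.path.join(proc_root, "pressure", "cpu")) as f:
            return f.readline().startswith("some")
    except OSError:
        return False


if __name__ == "__main__":
    print("cgroup v2:", has_cgroup_v2())
    print("PSI:", has_psi())
```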