
trace_capabilities

The trace_capabilities gadget allows us to see what capability security checks are triggered by applications running in a container.

Linux capabilities allow finer-grained privilege control: they can give root-like privileges to a process without granting it full root access, and they can also be taken away from root processes. If a pod directly executes programs as root, we can lock it down further by dropping capabilities. Sometimes we need to add capabilities that are not granted by default. You can see the list of default and available capabilities in the Docker documentation. In particular, if our pod runs as a non-root user (runAsUser: ID), we can grant it a few extra capabilities (think of it as being partly root) and still drop all unused capabilities to really lock it down.
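To make the idea of a capability set concrete, here is a minimal sketch of how we could check whether a process holds a given capability. It assumes a Linux system that exposes the CapEff hexadecimal bitmask in /proc/PID/status, and it relies on CAP_SYS_NICE being capability number 23 in linux/capability.h. The helper names mask_has_cap and pid_has_cap are made up for this illustration; they are not part of Inspektor Gadget.

```shell
#!/bin/sh
# CAP_SYS_NICE is capability number 23 in <linux/capability.h>.
CAP_SYS_NICE=23

# mask_has_cap MASK BIT
# Succeeds (exit 0) if bit BIT is set in the hexadecimal capability MASK.
mask_has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# pid_has_cap PID BIT
# Reads the effective set (CapEff) of PID from /proc and tests one bit.
# Returns 2 if the status file cannot be read (e.g. not on Linux).
pid_has_cap() {
    [ -r "/proc/$1/status" ] || return 2
    mask=$(awk '/^CapEff:/ { print $2 }' "/proc/$1/status")
    [ -n "$mask" ] || return 2
    mask_has_cap "$mask" "$2"
}
```

Inside a container, running pid_has_cap "$$" "$CAP_SYS_NICE" && echo present would tell us whether the current shell is allowed to raise its scheduling priority. This only inspects the capability set; it does not show which capabilities an application actually needs, which is what the gadget is for.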

Getting started

Running the gadget:

$ kubectl gadget run ghcr.io/inspektor-gadget/gadget/trace_capabilities:latest [flags]

Flags

--audit-only

Only show audit checks

Default value: "false"

--print-stack

Controls whether the gadget sends the kernel stack to userspace

Default value: "true"

--unique

Only show a capability once on the same container

Default value: "false"

Guide

Here we have a small demo app that logs failures caused by missing capabilities. Since none of the default capabilities has been dropped, we need to find out which non-default capability to add.

$ kubectl run set-priority-0 --image=busybox --labels=name=set-priority-0 --restart=Never -- /bin/sh -c "while /bin/true ; do nice -n -20 echo ; sleep 5; done"
pod/set-priority-0 created
$ kubectl logs -lname=set-priority-0
nice: setpriority(-20): Permission denied
nice: setpriority(-20): Permission denied

We can see the error messages in the pod's log. Let's use Inspektor Gadget to watch the capability checks:

$ kubectl gadget run trace_capabilities --selector name=set-priority-0
K8S.NODE K8S.NAMESPACE K8S.PODNAME K8S.CONTAINERNAME COMM PID TID UID GID CAPABLE AUDIT CAP SYSCALL
minikube-docker default set-priority-0 set-priority-0 nice 169528 169528 0 0 false 1 CAP_SYS_NICE SYS_SETPRIORITY
minikube-docker default set-priority-0 set-priority-0 nice 169594 169594 0 0 false 1 CAP_SYS_NICE SYS_SETPRIORITY
minikube-docker default set-priority-0 set-priority-0 nice 169661 169661 0 0 false 1 CAP_SYS_NICE SYS_SETPRIORITY
minikube-docker default set-priority-0 set-priority-0 nice 169736 169736 0 0 false 1 CAP_SYS_NICE SYS_SETPRIORITY
^C

We can stop the gadget with Ctrl-C. The output shows that the CAP_SYS_NICE capability was checked when nice ran, and that the check failed. We should add this capability to our pod template for nice to work. We can also drop all other capabilities from the default list (see the Docker list mentioned above), since nice did not use them.

The meaning of the columns is:

  • SYSCALL: the system call that caused the capability to be exercised
  • CAP: capability name in a human friendly format
  • AUDIT: whether the kernel should audit the security request or not
  • CAPABLE: whether the process has the capability or not
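With the columns described above, a saved trace can be condensed into the list of capabilities a container was missing. The following is a sketch only: it assumes the whitespace-separated column layout shown in this guide, with a header line and no spaces inside field values, and the function name denied_caps is made up for this illustration.

```shell
#!/bin/sh
# denied_caps: read trace_capabilities column output on stdin and print each
# capability whose check failed (CAPABLE == false), once per pod/container.
denied_caps() {
    awk '
        NR == 1 {
            # Locate the columns by header name rather than by position.
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $(col["CAPABLE"]) == "false" {
            print $(col["K8S.PODNAME"]), $(col["K8S.CONTAINERNAME"]), $(col["CAP"])
        }
    ' | sort -u
}
```

For example, after saving the gadget output to a file (here the hypothetical trace.txt), denied_caps < trace.txt prints one line per pod, container and missing capability. The --unique flag described earlier already removes repeated capabilities at the source; this filter additionally keeps only the failed checks.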

Let's create a new pod with the missing capability:

$ kubectl run set-priority-1 --image=busybox --labels=name=set-priority-1 --restart=Always --overrides='{"spec":{"containers":[{"name":"set-priority-1","command":["sh", "-c", "while /bin/true ; do nice -n -20 echo ; sleep 5; done"],"image":"busybox","securityContext":{"capabilities":{"add":["SYS_NICE"],"drop":["ALL"]}}}]}}'
pod/set-priority-1 created
$ kubectl logs -lname=set-priority-1

The logs are clean, so everything works!
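The --overrides JSON above is compact but hard to read. As a sketch, the same pod could be described as a manifest; the values below are copied from the command above, while the function name pod_manifest is an arbitrary choice for this illustration.

```shell
#!/bin/sh
# pod_manifest: print a Pod manifest equivalent to the --overrides JSON
# used with "kubectl run set-priority-1" in this guide.
pod_manifest() {
    cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: set-priority-1
  labels:
    name: set-priority-1
spec:
  containers:
  - name: set-priority-1
    image: busybox
    command: ["sh", "-c", "while /bin/true ; do nice -n -20 echo ; sleep 5; done"]
    securityContext:
      capabilities:
        # Start from an empty set and add back only what the trace showed.
        add: ["SYS_NICE"]
        drop: ["ALL"]
EOF
}
```

Piping the output into kubectl, as in pod_manifest | kubectl apply -f -, should create the same pod as the kubectl run command. Keeping the securityContext in a manifest makes the added and dropped capabilities easier to review.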

We can see the same checks but this time with the CAPABLE column set to true:

$ kubectl gadget run trace_capabilities:latest --selector name=set-priority-1
K8S.NODE K8S.NAMESPACE K8S.PODNAME K8S.CONTAINERNAME COMM PID TID UID GID CAPABLE AUDIT CAP SYSCALL
minikube-docker default set-priority-1 set-priority-1 nice 225549 225549 0 0 true 1 CAP_SYS_NICE SYS_SETPRIORITY
minikube-docker default set-priority-1 set-priority-1 nice 225615 225615 0 0 true 1 CAP_SYS_NICE SYS_SETPRIORITY
minikube-docker default set-priority-1 set-priority-1 nice 225688 225688 0 0 true 1 CAP_SYS_NICE SYS_SETPRIORITY
^C

Finally, clean up the pods:

$ kubectl delete pod set-priority-0 set-priority-1