
Deep Kubernetes Troubleshooting with HolmesGPT and Inspektor Gadget

Qasim Sarfraz
· 3 min read

Troubleshooting production incidents in Kubernetes is hard. Logs and metrics tell you what happened, but not always why. Sometimes you need to look deeper — at which processes are running, what files are being opened, which DNS queries are failing, or what TCP connections are being made. That's the kind of low-level visibility that Inspektor Gadget provides using eBPF.

Now, with a new integration, HolmesGPT — a CNCF Sandbox project for AI-powered incident investigation — can use Inspektor Gadget's eBPF gadgets as part of its troubleshooting workflow.

See It in Action

Instead of learning troubleshooting commands, you can now ask HolmesGPT directly:

holmes ask "Capture network traffic for the order-service pod and summarize the activity"

Behind the scenes, HolmesGPT runs Inspektor Gadget's tcpdump gadget, parses the results and summarizes the next steps:

[Image: tcpdump gadget output]

Tip: Gadgets are invoked via kubectl debug and run as ephemeral pods, so no DaemonSet installation is required. HolmesGPT handles the complexity of running gadgets on the right nodes with the right filters.

What Does the Integration Add?

The Inspektor Gadget toolset gives HolmesGPT access to eight tools running at the node level via kubectl debug:

Tool              Purpose
trace_dns         DNS queries and responses
trace_tcp         TCP connections (connect, accept, close)
tcpdump           Network packet capture with filters
trace_exec        Process execution events
trace_open        File open events
snapshot_process  Running processes on a node
snapshot_socket   Open sockets on a node
traceloop         System calls (flight recorder)
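
As a rough cheat sheet, the table above can be read as a lookup from the kind of question you have to the tool most likely to answer it. The helper below is purely illustrative: the category labels (dns, tcp, packets, and so on) are hypothetical names chosen for this sketch, only the tool names come from the table, and HolmesGPT selects tools on its own without anything like this.

```shell
#!/usr/bin/env bash
# Illustrative lookup from a question category to a tool in the table above.
# The category labels are hypothetical; the tool names come from the table.
gadget_for() {
  case "$1" in
    dns)       echo "trace_dns" ;;
    tcp)       echo "trace_tcp" ;;
    packets)   echo "tcpdump" ;;
    exec)      echo "trace_exec" ;;
    files)     echo "trace_open" ;;
    processes) echo "snapshot_process" ;;
    sockets)   echo "snapshot_socket" ;;
    syscalls)  echo "traceloop" ;;
    *)         echo "unknown category: $1" >&2; return 1 ;;
  esac
}
```

For example, `gadget_for files` prints `trace_open`, which is the tool to expect behind a question about which files a process is opening.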

Enabling the Integration

Assuming you have HolmesGPT installed and configured, you can enable the Inspektor Gadget toolset with a single environment variable:

export ENABLE_INSPEKTOR_GADGET=true

Verify the toolset is loaded:

holmes toolset list
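
If you want to script this check, for instance in a setup script, a minimal sketch could look like the following. It assumes the listing mentions the toolset with a name containing "inspektor" (case-insensitive); that pattern is a guess, so adjust it to whatever your `holmes toolset list` output actually shows. The helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: confirm the Inspektor Gadget toolset appears in a toolset listing.
# Assumption: the listing contains the word "inspektor" somewhere in the
# toolset name; change the pattern if your output differs.

# Reads a toolset listing on stdin; succeeds if any line matches the pattern.
has_inspektor_gadget_toolset() {
  grep -qi 'inspektor'
}

# Only call the real CLI when it is installed.
if command -v holmes >/dev/null 2>&1; then
  export ENABLE_INSPEKTOR_GADGET=true
  if holmes toolset list | has_inspektor_gadget_toolset; then
    echo "Inspektor Gadget toolset is listed"
  else
    echo "Inspektor Gadget toolset not found in the listing" >&2
  fi
fi
```

Keeping the match in a small function that reads stdin means the pattern can be tried against saved output before it is wired to the live command.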

Trace DNS for a pod:

holmes ask "Trace DNS for the payments pod in the production namespace for 30 seconds and summarize the results"
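
If you find yourself asking the same question for different pods, the prompt can be templated. The function below is a hypothetical convenience wrapper, not part of HolmesGPT or Inspektor Gadget: it only builds the prompt text used above from a pod name, a namespace, and a duration, which you then pass to `holmes ask`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the DNS-tracing prompt shown above.
# Arguments: pod name, namespace, duration in seconds (defaults to 30).
build_dns_prompt() {
  local pod="$1" namespace="$2" seconds="${3:-30}"
  printf 'Trace DNS for the %s pod in the %s namespace for %s seconds and summarize the results\n' \
    "$pod" "$namespace" "$seconds"
}

# Usage, assuming holmes is installed and the toolset is enabled:
#   holmes ask "$(build_dns_prompt payments production 30)"
```

Since the prompt is plain natural language, the wording is only a starting point; vary it to match the question you actually want answered.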

Tip: Refer to HolmesGPT's documentation for installation instructions and details on the Inspektor Gadget toolset.

What's Next

This initial integration covers node-level gadgets using kubectl debug. Future work includes:

  • Cluster-level toolset using kubectl gadget for broader cluster-wide observability
  • More gadgets — network policies, capabilities, OOM kills, and more

Give it a try and let us know what you think! We'd love to hear your feedback and ideas for new gadgets to integrate.

Special thanks to the HolmesGPT maintainers for their reviews and feedback on this integration.

Get Involved

Both projects are CNCF Sandbox projects and welcome contributions — whether that's new toolsets, gadgets, runbooks, or documentation.