Troubleshooting

Common issues and solutions for Garnet deployment and operation.

Installation Issues

Kubernetes: Pods Stuck in Pending

Symptoms:

kubectl get pods -n garnet
# NAME           READY   STATUS    RESTARTS   AGE
# jibril-xyz     0/1     Pending   0          5m

Diagnosis:

kubectl describe pod -n garnet
kubectl get events -n garnet --sort-by='.lastTimestamp'

Common causes:

Insufficient node resources

Error: 0/10 nodes available: insufficient cpu/memory

Fix: Lower the agent's resource requests:

resources:
  requests:
    cpu: 50m      # Reduce from 100m
    memory: 64Mi  # Reduce from 128Mi

Pod Security Policy blocking privileged pods

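Error: pods "jibril-xyz" is forbidden: unable to validate against any pod security policy

Fix: On clusters that still support PodSecurityPolicy (removed in Kubernetes 1.25), enable privileged pods in your PSP. On newer clusters, use Pod Security Standards instead. Example PSP: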

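apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: garnet-privileged
spec:
  privileged: true
  allowedCapabilities:
    - SYS_ADMIN
    - NET_ADMIN
  # Required PSP fields; RunAsAny leaves them unrestricted
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny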

Node selectors not matching

Error: 0/10 nodes available: node(s) didn't match node selector

Fix: Check your node labels:

kubectl get nodes --show-labels

Update nodeSelector in values.yaml or remove it:

nodeSelector: {}  # Allow all nodes
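
The three causes above can be told apart from the scheduler message alone. The following is a minimal sketch, assuming the message text resembles the errors quoted in this section; the function name classify_pending_reason and its output labels are hypothetical, and real scheduler wording varies between Kubernetes versions.

```sh
# Map a scheduler/admission message to one of the causes listed above.
# The patterns are based on the error strings quoted in this section
# and are matched case-insensitively.
classify_pending_reason() {
  msg="$1"
  if printf '%s\n' "$msg" | grep -qiE 'insufficient (cpu|memory)'; then
    echo "insufficient-resources"
  elif printf '%s\n' "$msg" | grep -qi 'pod security policy'; then
    echo "pod-security-policy"
  elif printf '%s\n' "$msg" | grep -qiE "didn't match.*selector"; then
    echo "node-selector-mismatch"
  else
    echo "unknown"
  fi
}

# Usage (paste the message shown by `kubectl describe pod -n garnet`):
#   classify_pending_reason "0/10 nodes available: insufficient cpu/memory"
```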

GitHub Actions: Step Fails

Symptoms:

Error: Garnet authentication failed (401 Unauthorized)

Diagnosis:

Verify Secret Exists

# Check if secret is set
gh secret list | grep GARNET

Should show GARNET_API_TOKEN.

Test Token

curl -H "Authorization: Bearer YOUR_TOKEN" \
  https://api.garnet.ai/v1/health

Should return {"status":"ok"}.

Regenerate Token

If the token is invalid, generate a new one at dashboard.garnet.ai and update the secret.
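
The token test above can be scripted so that the HTTP status decides the next step. This is a sketch only: it assumes the health endpoint answers an invalid token with 401 and an under-privileged token with 403, mirroring the errors quoted elsewhere on this page. The function names explain_status and check_garnet_token are hypothetical, and the token is read from a GARNET_API_TOKEN environment variable.

```sh
# Translate an HTTP status code into a suggested next step.
explain_status() {
  case "$1" in
    200) echo "token accepted" ;;
    401) echo "token invalid or missing: regenerate it and update the secret" ;;
    403) echo "token valid but lacks permissions" ;;
    000) echo "no HTTP response: check network egress" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Call the health endpoint and report only the status code.
# -s silences progress output, -o /dev/null discards the body,
# -w '%{http_code}' prints the status (000 when no response was received).
check_garnet_token() {
  status="$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${GARNET_API_TOKEN}" \
    https://api.garnet.ai/v1/health)"
  explain_status "$status"
}

# Usage:
#   GARNET_API_TOKEN=YOUR_TOKEN check_garnet_token
```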

Connectivity Issues

No Events Appearing in Dashboard

Symptoms:

Agent shows “Connected” in logs
But no events in Dashboard → Events

Diagnosis:

Kubernetes:

# Check agent logs
kubectl logs -l app=jibril -n garnet --tail=100

# Look for:
# ✓ Good: "Connected to Garnet Platform"
# ✓ Good: "Event batch sent successfully"
# ✗ Bad: "Failed to send events: 403 Forbidden"
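
Scanning the logs by eye works for a single pod; for several pods a small filter may help. The sketch below assumes the agent emits exactly the three messages listed above. The function name scan_agent_logs and its output labels are hypothetical.

```sh
# Read agent logs on stdin and print one summary label.
# A send failure takes priority because it explains missing events
# even when the agent also reports a successful connection.
scan_agent_logs() {
  logs="$(cat)"
  if printf '%s\n' "$logs" | grep -q 'Failed to send events'; then
    echo "send-failure"
  elif printf '%s\n' "$logs" | grep -q 'Event batch sent successfully'; then
    echo "sending"
  elif printf '%s\n' "$logs" | grep -q 'Connected to Garnet Platform'; then
    echo "connected-no-batches"
  else
    echo "no-known-messages"
  fi
}

# Usage:
#   kubectl logs -l app=jibril -n garnet --tail=100 | scan_agent_logs
```

A result of connected-no-batches matches the symptom described here (connected, but nothing sent) and points toward the "No actual network activity" cause.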

Common causes:

API token has no permissions

Fix: Regenerate the token with the correct project permissions.

Dashboard → Settings → API Tokens → Create new token

Network egress blocked

Test connectivity:

kubectl run test-curl --image=curlimages/curl --rm -it -- \
  curl -v https://api.garnet.ai/v1/health

Fix: Update firewall rules to allow outbound HTTPS to api.garnet.ai.

No actual network activity

Cause: The workload isn’t making outbound connections.

Fix: Trigger test traffic:

kubectl run test --image=curlimages/curl --rm -it -- \
  curl https://example.com

Check Dashboard → Events for the curl request.

Detection Issues

Too Many False Positives

Symptoms:

Many Issues for legitimate traffic
False positive rate >5%

Diagnosis: Dashboard → Issues → Review recent Issues

Look for patterns:

Same domain flagged repeatedly?
Specific micro-context generating noise?

Solutions:

Extend Baseline Period

Dashboard → Settings → Baselining → Period: 14 days (from 7)

A longer period gives the agent more time to learn normal behavior.

Create Allow Policy

For known-good domains that vary frequently:

name: "CDN allowlist"
type: allow
rules:
  - pattern: "*.cloudfront.net"
  - pattern: "*.fastly.net"

Review Micro-Context Granularity

For GitHub Actions, very granular contexts (workflow+job+step) can cause false positives if steps vary between runs.

Workaround: Use broader policies for CI/CD environments.

No Detections (Expected Malicious Traffic Not Caught)

Symptoms:

Manually triggered unknown egress not appearing as Issue

Diagnosis:

Check if Domain is in Baseline

Dashboard → Events → Filter by domain

If the domain appears in past events, it is already part of the baseline.

Verify Enforce Mode

If the agent is in detect-only mode, connections are allowed but should still create Issues.

Check: Dashboard → Agents → Mode column

Check Policy Overrides

Dashboard → Policies → Check if an allow policy matches the domain.

Policies override the baseline.

Test with guaranteed-unknown domain:

curl https://test-garnet-unknown-$(date +%s).com

This creates a unique domain each time, guaranteed to be unknown.
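
Because the domain changes on every run, it helps to capture it before making the request so you can search for it in the Dashboard afterwards. A minimal sketch follows; the function name unique_test_domain is hypothetical, and the request itself is likely to fail because the generated domain is probably unregistered.

```sh
# Build the same timestamped hostname as the one-liner above.
unique_test_domain() {
  printf 'test-garnet-unknown-%s.com\n' "$(date +%s)"
}

# Usage:
#   domain="$(unique_test_domain)"
#   echo "Testing egress to ${domain}"
#   curl --max-time 10 "https://${domain}" || true
#   # Then filter Dashboard → Issues by the printed domain.
```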

Performance Issues

High CPU Usage

Symptoms:

kubectl top pods -n garnet
# NAME        CPU    MEMORY
# jibril-xyz  500m   256Mi   # >200m is high

Causes:

Very high network activity on node

Diagnosis:

kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<node-name>
# Lists every pod scheduled on this node; check how many there are

Fix: Increase CPU limits or reduce monitoring scope:

resources:
  limits:
    cpu: 500m  # Increase from 200m

# OR reduce scope
env:
  - name: GARNET_SAMPLE_RATE
    value: "0.5"  # Sample 50% of events

eBPF programs inefficient

Fix: Upgrade to latest agent version:

helm repo update
helm upgrade jibril garnet/jibril -n garnet --reuse-values

Newer agent versions may include eBPF optimizations.

High Memory Usage

Symptoms:

kubectl top pods -n garnet
# NAME        CPU   MEMORY
# jibril-xyz  100m  512Mi   # >256Mi is high
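
To apply the thresholds from this page (CPU above 200m, memory above 256Mi) across all agent pods at once, the output of kubectl top can be filtered. This is a sketch that assumes CPU is reported in millicores with an "m" suffix and memory in "Mi", as in the sample output above; the function name flag_high_usage is hypothetical.

```sh
# Read `kubectl top pods --no-headers` output on stdin and print pods that
# exceed the given limits. Arguments: CPU limit in millicores (default 200)
# and memory limit in Mi (default 256).
flag_high_usage() {
  awk -v cpu_max="${1:-200}" -v mem_max="${2:-256}" '
    {
      cpu = $2; sub(/m$/, "", cpu)    # "500m"  -> 500
      mem = $3; sub(/Mi$/, "", mem)   # "256Mi" -> 256
      if (cpu + 0 > cpu_max) print $1 ": high CPU (" $2 ")"
      if (mem + 0 > mem_max) print $1 ": high memory (" $3 ")"
    }'
}

# Usage:
#   kubectl top pods -n garnet --no-headers | flag_high_usage
```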

Causes:

Event buffer overflow

Fix: Increase memory limits and buffer size:

resources:
  limits:
    memory: 512Mi

env:
  - name: GARNET_BUFFER_SIZE
    value: "8192"  # Increase from default 4096

Memory leak (rare)

Diagnosis: Memory usage increases steadily over days.

Fix: Restart pods:

kubectl rollout restart daemonset/jibril -n garnet

If issue persists, contact support@garnet.ai with pod logs.

Enforce Mode Issues

Legitimate Traffic Being Blocked

Symptoms:

Application errors like ConnectionError or EPERM
Dashboard shows Issue with verdict=blocked

Immediate fix (< 1 minute):

# Create emergency allow policy
name: "Emergency: Unblock example.com"
type: allow
scope: global
rules:
  - pattern: "example.com"

Apply via Dashboard → Policies → Create.

Long-term fix:

Review why domain wasn’t in baseline
Add to corporate allowlist if recurring
Extend baseline period if too short

Enforce Mode Not Blocking

Symptoms:

Mode set to enforce
Unknown egress detected (Issue created)
But connection NOT blocked (verdict=detected)

Diagnosis:

Kubernetes:

kubectl get pods -n garnet -o yaml | grep -A5 "env:"

Check if GARNET_MODE: enforce is set.

Fix:

# Kubernetes
helm upgrade jibril garnet/jibril \
  --set mode=enforce \
  --namespace garnet \
  --reuse-values

# Verify
kubectl rollout status daemonset/jibril -n garnet

Common Error Messages

`eBPF program failed to load`

Error in logs:

ERROR: Failed to load eBPF program: operation not permitted

Causes:

Not running as privileged

Fix:

securityContext:
  privileged: true

Kernel too old

Check kernel version:

kubectl debug node/NODE_NAME -it --image=ubuntu -- uname -r

The kernel must be 5.8 or newer. If it is older, upgrade the node OS.
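
The version comparison can be automated when checking many nodes. The sketch below compares only the major and minor numbers of a `uname -r` style string against a required "major.minor" value; the function name kernel_at_least is hypothetical.

```sh
# Succeed (exit 0) if the kernel version in $2 is at least the
# "major.minor" version in $1. Distribution suffixes such as
# "-91-generic" are ignored. The body runs in a subshell so its
# variables do not leak into the caller.
kernel_at_least() (
  required="$1"
  actual="$2"
  req_major="${required%%.*}"
  req_rest="${required#*.}"
  req_minor="${req_rest%%[!0-9]*}"
  act_major="${actual%%.*}"
  act_rest="${actual#*.}"
  act_minor="${act_rest%%[!0-9]*}"
  if [ "$act_major" -ne "$req_major" ]; then
    [ "$act_major" -gt "$req_major" ]
  else
    [ "${act_minor:-0}" -ge "${req_minor:-0}" ]
  fi
)

# Usage (on the node, or with the output of the kubectl debug command above):
#   kernel_at_least 5.8 "$(uname -r)" && echo "kernel OK" || echo "kernel too old"
```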

SELinux blocking

Temporary fix:

# Run as root on the affected node; the change lasts until the next reboot
setenforce 0  # Set SELinux to permissive mode

Permanent fix: Create SELinux policy for Garnet (contact support).

`Connection refused to api.garnet.ai`

Error in logs:

ERROR: Failed to connect: connection refused

Causes:

Network policy blocking egress

Test:

kubectl run test --image=curlimages/curl --rm -it -- \
  curl -v https://api.garnet.ai

Fix: Update network policy:

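apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-garnet-egress
  namespace: garnet
spec:
  podSelector:
    matchLabels:
      app: jibril
  policyTypes:
    - Egress
  egress:
    # Allow traffic to pods in the same namespace
    - to:
        - podSelector: {}
    # Allow HTTPS to any destination (includes api.garnet.ai)
    - ports:
        - protocol: TCP
          port: 443  # HTTPS
    # Allow DNS lookups so api.garnet.ai can be resolved
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53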

Proxy not configured

Fix: Set proxy environment variables:

env:
  - name: HTTPS_PROXY
    value: "http://proxy.corp.example.com:8080"
  - name: NO_PROXY
    value: ".svc.cluster.local"

Getting Help

Still stuck? Contact support with:

Platform: GitHub Actions or Kubernetes
Agent version: helm list -n garnet or workflow logs
Error logs: Last 100 lines
Issue ID: If related to a specific Issue

Email: support@garnet.ai

Include:

# Kubernetes diagnostics
kubectl get pods -n garnet -o wide
kubectl logs -l app=jibril -n garnet --tail=200
kubectl describe pod <pod-name> -n garnet
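
The diagnostics above can be gathered into a single file before emailing support. This is a minimal sketch: the function name collect_garnet_diagnostics and the default file name are hypothetical, and it describes all pods with the app=jibril label rather than one named pod.

```sh
# Run the Kubernetes diagnostics listed above and write their combined
# output (stdout and stderr) to one file. Prints the file path when done.
collect_garnet_diagnostics() {
  out="${1:-garnet-diagnostics.txt}"
  {
    echo "== pods =="
    kubectl get pods -n garnet -o wide
    echo "== logs (last 200 lines) =="
    kubectl logs -l app=jibril -n garnet --tail=200
    echo "== describe =="
    kubectl describe pod -l app=jibril -n garnet
  } > "$out" 2>&1
  echo "$out"
}

# Usage:
#   collect_garnet_diagnostics /tmp/garnet-diagnostics.txt
```

Review the file for secrets or tokens before attaching it to the email.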
