mirror of
https://github.com/rcourtman/Pulse.git
synced 2026-05-21 18:46:08 +00:00
The chart's agent.image.repository defaulted to ghcr.io/rcourtman/pulse-agent,
an image that has never been published. publish-docker.yml only pushes
rcourtman/pulse; the Dockerfile defines an agent_runtime stage that
*could* be published but it isn't, and commit da7969fb4 from earlier in
this session removed the corresponding pulse-agent attestation
expectations — a clear signal the separate agent image was intentionally
dropped without updating the chart. Customers running
`helm install pulse pulse/pulse --set agent.enabled=true` were silently
hitting ImagePullBackOff on the agent DaemonSet.
Route the chart through the main rcourtman/pulse image instead. To make
that work without per-arch chart overrides, the runtime stage in the
Dockerfile now creates an arch-resolved /usr/local/bin/pulse-agent
symlink to the right /opt/pulse/bin/pulse-agent-linux-{amd64,arm64,armv7}
binary. The chart's agent.command default is /usr/local/bin/pulse-agent,
which overrides the server ENTRYPOINT and runs the pod as a unified
agent on whichever arch the node provides. agent.yaml renders the
command via toYaml so list values pass through cleanly.
KUBERNETES.md's DaemonSet example switches from the arch-hardcoded
/opt/pulse/bin/pulse-agent-linux-amd64 to the new arch-resolved path,
restoring multi-arch portability of the docs example.
validate-release.sh asserts the symlink exists, points at one of the
three supported Linux arch binaries, and is executable in the published
image. A new TestHelmAgentRuntimePointsAtRealImage pins the chart
defaults, the template wiring, the Dockerfile symlink, and the
validate-release.sh guard so the regression class can't quietly
resurface.
Governance: extend the helm-chart-release-runtime verification policy's
exact_files to include scripts/installtests/build_release_assets_test.go
(matching its existing pin set for related deployment-installability
policies); update the subsystem_lookup_test.py fixture that pins the
exact_files list; document the agent-image and pulse-agent symlink
contract in deployment-installability.md Extension Point 7.
Verified locally: `helm lint` passes; `helm template --set agent.enabled=true`
renders a DaemonSet with image rcourtman/pulse:6.0.0,
command ["/usr/local/bin/pulse-agent"], args ["--enable-docker", "--enable-host=false"].
End-to-end image build + agent DaemonSet smoke will run via helm_smoke
on the next release once rcourtman/pulse:6.0.0 is published.
274 lines
8.7 KiB
Markdown
# Pulse on Kubernetes

This guide explains how to deploy the Pulse Server (Hub) and Pulse Agents on Kubernetes clusters, including immutable distributions like Talos Linux.

> **Navigation note (v6):** Kubernetes cluster and node resources appear on the **Infrastructure** page, while pods appear on the **Workloads** page. The legacy `/kubernetes` URL redirects to `/workloads?type=k8s`.

## Prerequisites

- A Kubernetes cluster (v1.19+)
- `helm` (v3+) installed locally
- `kubectl` configured to talk to your cluster

## 1. Deploying the Pulse Server

The Pulse Server is the central hub that collects metrics and manages agents.

### Option A: Using Helm (Recommended)

1. Add the Pulse Helm repository:
    ```bash
    helm repo add pulse https://rcourtman.github.io/Pulse
    helm repo update
    ```

2. Install the chart:
    ```bash
    helm upgrade --install pulse pulse/pulse \
      --namespace pulse \
      --create-namespace \
      --set persistence.enabled=true \
      --set persistence.size=10Gi
    ```

> **Note**: For production, ensure you configure a proper `persistence.storageClass` or `strategy.type=Recreate` if using ReadWriteOnce (RWO) volumes. The chart's default `strategy.type` is `RollingUpdate`, which can hit Multi-Attach errors with RWO PVCs during upgrade.
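
The production note above can be applied at install time. The following is a minimal sketch, not a definitive recipe: `install_pulse_rwo` is a hypothetical helper name, and `standard` is only a placeholder storage class that may not exist in your cluster. It uses only the `persistence.storageClass` and `strategy.type` values named in the note.

```bash
# Hypothetical helper: install Pulse with an RWO-friendly rollout strategy.
# $1 is the storage class name; "standard" is a placeholder default (assumption).
install_pulse_rwo() {
  local storage_class="${1:-standard}"
  helm upgrade --install pulse pulse/pulse \
    --namespace pulse \
    --create-namespace \
    --set persistence.enabled=true \
    --set persistence.size=10Gi \
    --set persistence.storageClass="${storage_class}" \
    --set strategy.type=Recreate
}
```

Call it as `install_pulse_rwo <your-storage-class>`. With `Recreate`, Kubernetes terminates the old pod before starting the new one, which avoids the Multi-Attach error at the cost of a short interruption during upgrades.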

### Option B: Generating Static Manifests (For Talos / GitOps)

If you cannot use Helm directly on the cluster (e.g., restricted Talos environment), you can generate standard Kubernetes YAML manifests:

```bash
helm repo add pulse https://rcourtman.github.io/Pulse
helm repo update
helm template pulse pulse/pulse \
  --namespace pulse \
  --set persistence.enabled=true \
  > pulse-server.yaml
```

You can then apply this file:

```bash
kubectl apply -f pulse-server.yaml
```
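
`helm template` only renders YAML locally, so nothing creates the `pulse` namespace for you. The sketch below shows one way to create it idempotently before applying the rendered file. `apply_pulse_manifests` is a hypothetical helper name, and passing `--namespace pulse` to `kubectl apply` assumes the rendered manifests either omit a namespace or already target `pulse`.

```bash
# Hypothetical helper: make sure the namespace exists, then apply the manifests.
# $1 is the rendered manifest file (defaults to the file name used above).
apply_pulse_manifests() {
  local manifest="${1:-pulse-server.yaml}"
  # Render the Namespace client-side and apply it, so re-runs do not fail
  # when the namespace already exists.
  kubectl create namespace pulse --dry-run=client -o yaml | kubectl apply -f -
  kubectl apply --namespace pulse -f "${manifest}"
}
```

For GitOps setups, you would typically commit the namespace manifest alongside `pulse-server.yaml` instead of creating it imperatively.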

## 2. Deploying the Pulse Agent

### Helm Chart Agent Mode

The Helm chart includes an optional `agent` section that deploys the unified `pulse-agent`.
By default, this workload runs in container-monitoring mode (`--enable-docker --enable-host=false`).
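
As a rough illustration, the chart's agent workload is switched on with the `agent.enabled` value. The helper names below are hypothetical, and the sketch deliberately leaves out any agent connection settings (URL, token), since their value names are chart-specific; check the chart's `values.yaml` for those.

```bash
# Hypothetical helper: preview the manifests the chart renders with the agent enabled.
render_pulse_agent() {
  helm template pulse pulse/pulse \
    --namespace pulse \
    --set agent.enabled=true
}

# Hypothetical helper: install or upgrade the release with the agent enabled.
install_pulse_with_agent() {
  helm upgrade --install pulse pulse/pulse \
    --namespace pulse \
    --create-namespace \
    --set persistence.enabled=true \
    --set agent.enabled=true
}
```

Running `render_pulse_agent` first lets you inspect the rendered DaemonSet (image, `command`, and `args`) before anything is applied to the cluster.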

For Kubernetes monitoring, use a custom DaemonSet as shown below.

### Unified Agent on Kubernetes (DaemonSet)

To monitor Kubernetes resources, run the unified agent as a DaemonSet and enable the Kubernetes module.

**Recommended options:**
- **Kubernetes-only monitoring**: `PULSE_ENABLE_KUBERNETES=true` and `PULSE_ENABLE_HOST=false` (no host mounts required).
- **Kubernetes + node metrics**: `PULSE_ENABLE_KUBERNETES=true` and `PULSE_ENABLE_HOST=true` (requires host mounts and privileged mode).

#### Minimal DaemonSet Example

This uses the main `rcourtman/pulse` image but runs the `pulse-agent` binary directly.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pulse-agent
  namespace: pulse
spec:
  selector:
    matchLabels:
      app: pulse-agent
  template:
    metadata:
      labels:
        app: pulse-agent
    spec:
      serviceAccountName: pulse-agent
      containers:
        - name: pulse-agent
          image: rcourtman/pulse:latest
          # /usr/local/bin/pulse-agent is an arch-resolved symlink in the
          # main Pulse image, so this manifest works on both amd64 and
          # arm64 nodes without changes.
          command: ["/usr/local/bin/pulse-agent"]
          args:
            - --enable-kubernetes
          env:
            - name: PULSE_URL
              value: "http://pulse-server.pulse.svc.cluster.local:7655"
            - name: PULSE_TOKEN
              value: "YOUR_API_TOKEN_HERE"
            - name: PULSE_AGENT_ID
              value: "my-k8s-cluster"
            - name: PULSE_ENABLE_HOST
              value: "false"
            - name: PULSE_KUBE_INCLUDE_ALL_PODS
              value: "true"
            - name: PULSE_KUBE_INCLUDE_ALL_DEPLOYMENTS
              value: "true"
          securityContext:
            readOnlyRootFilesystem: true
            allowPrivilegeEscalation: false
          resources:
            requests:
              cpu: 50m
              memory: 128Mi
            limits:
              memory: 512Mi
      tolerations:
        - operator: Exists
```

> **Note for ARM64 clusters**: No changes are needed. `/usr/local/bin/pulse-agent` is an arch-resolved symlink in the main Pulse image, so the same manifest runs on both amd64 and arm64 nodes.

Use a token scoped for the agent:
- `kubernetes:report` for Kubernetes reporting
- `agent:report` if you enable host metrics

#### Important DaemonSet Configuration

##### PULSE_AGENT_ID (Required for DaemonSets)

When running as a DaemonSet, all pods share the same API token but need a unified identity. Without `PULSE_AGENT_ID`, each pod auto-generates a unique ID (e.g., `mac-xxxxx`), causing token conflicts:

```text
API token is already in use by agent "mac-aa5496fed726". Each Kubernetes agent must use a unique API token.
```

Set `PULSE_AGENT_ID` to a shared cluster name so all pods report as one logical agent:

```yaml
- name: PULSE_AGENT_ID
  value: "my-k8s-cluster"
```

##### Resource Visibility Flags

By default, Pulse only shows resources with problems (unhealthy pods, failing deployments). To see all resources:

| Environment Variable | Description | Default |
|---------------------|-------------|---------|
| `PULSE_KUBE_INCLUDE_ALL_PODS` | Show all non-succeeded pods, not just problematic ones | `false` |
| `PULSE_KUBE_INCLUDE_ALL_DEPLOYMENTS` | Show all deployments, not just those with issues | `false` |

For most monitoring use cases, set both to `true`:

```yaml
- name: PULSE_KUBE_INCLUDE_ALL_PODS
  value: "true"
- name: PULSE_KUBE_INCLUDE_ALL_DEPLOYMENTS
  value: "true"
```

See [UNIFIED_AGENT.md](UNIFIED_AGENT.md) for all available configuration options.

#### Add Host Metrics (Optional)

If you want node CPU/memory/disk metrics, add privileged mode plus host mounts. Replace the `securityContext` from the minimal example rather than merging the two: Kubernetes rejects a container that sets both `privileged: true` and `allowPrivilegeEscalation: false`.

```yaml
# Container-level fields (spec.template.spec.containers[0])
env:
  - name: PULSE_ENABLE_HOST
    value: "true"
  - name: HOST_PROC
    value: "/host/proc"
  - name: HOST_SYS
    value: "/host/sys"
  - name: HOST_ETC
    value: "/host/etc"
securityContext:
  privileged: true
volumeMounts:
  - name: host-proc
    mountPath: /host/proc
    readOnly: true
  - name: host-sys
    mountPath: /host/sys
    readOnly: true
  - name: host-root
    mountPath: /host/root
    readOnly: true
# Pod-level field (spec.template.spec)
volumes:
  - name: host-proc
    hostPath:
      path: /proc
  - name: host-sys
    hostPath:
      path: /sys
  - name: host-root
    hostPath:
      path: /
```

#### RBAC

The Kubernetes agent uses the in-cluster API and needs read access to cluster resources (nodes, pods, deployments, etc.). Create a read-only `ClusterRole` and bind it to the `pulse-agent` service account.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pulse-agent
  namespace: pulse
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pulse-agent-read
rules:
  - apiGroups: [""]
    resources: ["nodes", "pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch"]
  # Optional (Recovery): VolumeSnapshots and Velero backups.
  # These rules are safe to include even if the APIs are not installed; the agent will
  # feature-detect and ignore 404/403 responses.
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["velero.io"]
    resources: ["backups"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pulse-agent-read
subjects:
  - kind: ServiceAccount
    name: pulse-agent
    namespace: pulse
roleRef:
  kind: ClusterRole
  name: pulse-agent-read
  apiGroup: rbac.authorization.k8s.io
```
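
Once the RBAC objects are applied, you can sanity-check the binding by impersonating the service account. This is a minimal sketch: `check_pulse_agent_rbac` is a hypothetical helper name, and it assumes your own kubeconfig user is allowed to impersonate service accounts (cluster admins usually are).

```bash
# Hypothetical helper: ask the API server whether the pulse-agent service
# account may list each core resource granted by the ClusterRole above.
check_pulse_agent_rbac() {
  local sa="system:serviceaccount:pulse:pulse-agent"
  local resource
  for resource in nodes pods deployments.apps; do
    printf '%s: ' "${resource}"
    # Prints "yes" or "no" for the impersonated service account.
    kubectl auth can-i list "${resource}" --as="${sa}" --all-namespaces
  done
}
```

Each line should end in `yes`. A `no` usually means the `ClusterRoleBinding` subject (name or namespace) does not match the service account the DaemonSet actually uses.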

## 3. Talos Linux Specifics

Talos Linux is immutable, so you cannot install the agent via the shell script. Use the DaemonSet approach above.

### Agent Configuration for Talos
- **Storage**: Talos mounts an ephemeral OS on `/`, and persistent data usually lives under `/var`. The Pulse agent generally does not store state; if you ever need it to, map that path to persistent storage.
- **Network**: The agent reports the Pod IP by default. To report the Node IP, set `PULSE_REPORT_IP` using the Downward API:

  Add this to the DaemonSet `env` section:
  ```yaml
  - name: PULSE_REPORT_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  ```

## 4. Troubleshooting

- **Agent not showing in UI**: Check logs for the DaemonSet pods, for example: `kubectl logs -l app=pulse-agent -n pulse`.
- **"Permission Denied" on metrics**: Ensure `securityContext.privileged: true` is set or proper capabilities are added.
- **Connection Refused**: Ensure `PULSE_URL` is correct and reachable from the agent pods.
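
The checks above can be combined into a quick first pass. The sketch below assumes the `pulse` namespace and the `app=pulse-agent` label from the DaemonSet example in this guide; `pulse_agent_diagnose` is a hypothetical helper name.

```bash
# Hypothetical helper: first-pass diagnostics for the agent DaemonSet.
pulse_agent_diagnose() {
  # Desired vs. ready pod counts for the DaemonSet.
  kubectl get daemonset pulse-agent --namespace pulse
  # Per-pod status and node placement; image pull failures such as
  # ImagePullBackOff show up in the STATUS column.
  kubectl get pods --namespace pulse -l app=pulse-agent -o wide
  # Recent log lines from the agent pods (connection and token errors appear here).
  kubectl logs --namespace pulse -l app=pulse-agent --tail=50
}
```

If the pods are stuck before the container starts, `kubectl describe pod <pod-name> --namespace pulse` shows the scheduling and image pull events in more detail.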