chore: snapshot current changes

This commit is contained in:
rcourtman 2025-11-02 22:47:55 +00:00
parent fb22469eb0
commit 5c4be1921c
38 changed files with 11149 additions and 4102 deletions

View file

@ -113,6 +113,8 @@ RUN if [ "$TARGETARCH" = "arm64" ]; then \
COPY --from=backend-builder /app/VERSION /VERSION
ENV PULSE_NO_AUTO_UPDATE=true
ENTRYPOINT ["/usr/local/bin/pulse-docker-agent"]
# Final stage (Pulse server runtime)

View file

@ -1,6 +1,6 @@
# Docker Monitoring Agent
Pulse is focused on Proxmox VE and PBS, but many homelabs also run application stacks in Docker. The optional Pulse Docker agent turns container health and resource usage into first-class metrics that show up alongside your hypervisor data.
Pulse is focused on Proxmox VE and PBS, but many homelabs also run application stacks in Docker. The optional Pulse Docker agent turns container health and resource usage into first-class metrics that show up alongside your hypervisor data. The recommended deployment is the bundled, least-privilege systemd service that runs the static `pulse-docker-agent` binary directly on the host. That path lets the installer lock down permissions, manage upgrades automatically, and integrate with the native init system. Containerising the agent is still available for orchestrated environments, but it trades away some of those controls (and still needs the Docker socket), so treat that option as advanced.
## What the agent reports
@ -12,6 +12,7 @@ Every check interval (30s by default) the agent collects:
- CPU usage, memory consumption and limits
- Images, port mappings, network addresses, and start times
- Writable layer size, root filesystem size, block I/O totals, and mount metadata (shown in the Docker table drawer)
- Read/write throughput derived from Docker block I/O counters so you can spot noisy workloads at a glance
- Health-check failures, restart-loop windows, and recent exit codes (displayed in the UI under each container drawer)
Data is pushed to Pulse over HTTPS using your existing API token; no inbound firewall rules are required.
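The read/write throughput mentioned above is derived from Docker's cumulative block I/O counters by comparing two consecutive samples. A minimal sketch of that derivation (types and names here are illustrative, not the agent's actual internals):

```typescript
interface BlockIoSample {
  readBytes: number;   // cumulative bytes read, from Docker stats
  writeBytes: number;  // cumulative bytes written, from Docker stats
  timestampMs: number; // when the sample was taken
}

// Derive bytes/second from two cumulative counter samples.
// Returns null for an empty interval or when a counter went
// backwards (e.g. the container restarted and counters reset).
function ioRate(prev: BlockIoSample, curr: BlockIoSample) {
  const elapsedSec = (curr.timestampMs - prev.timestampMs) / 1000;
  if (elapsedSec <= 0) return null;
  const readDelta = curr.readBytes - prev.readBytes;
  const writeDelta = curr.writeBytes - prev.writeBytes;
  if (readDelta < 0 || writeDelta < 0) return null; // counter reset
  return {
    readRateBytesPerSecond: readDelta / elapsedSec,
    writeRateBytesPerSecond: writeDelta / elapsedSec,
  };
}
```

This is why a restarted container briefly shows no throughput: the first sample after a counter reset cannot produce a meaningful delta.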
@ -38,7 +39,7 @@ Copy the binary to your Docker host (e.g. `/usr/local/bin/pulse-docker-agent`) a
> **Why `CGO_ENABLED=0`?** Building a fully static binary ensures the agent runs on hosts still using older glibc releases (for example Debian 11 with glibc 2.31).
### Quick install from your Pulse server
### Quick install from your Pulse server (recommended)
Use the bundled installation script (ships with Pulse v4.22.0+) to deploy and manage the agent. Replace the token placeholder with an API token generated in **Settings → Security**. Create a dedicated token for each Docker host so you can revoke individual credentials without touching others—sharing one token across many hosts makes incident response much harder. Tokens used here should include the `docker:report` scope so the agent can submit telemetry (add `docker:manage` only if you plan to issue lifecycle commands remotely).
@ -49,6 +50,10 @@ curl -fsSL http://pulse.example.com/install-docker-agent.sh \
> **Why sudo?** The installer needs to drop binaries under `/usr/local/bin`, create a systemd service, and start it—actions that require root privileges. Piping to `sudo bash …` saves you from retrying if you run the command as an unprivileged user.
The script stores credentials in `/etc/pulse/pulse-docker-agent.env` (mode `600`) and creates a locked-down `pulse-docker` service account that only needs access to the Docker socket. Rotate tokens by editing that env file and running `sudo systemctl restart pulse-docker-agent`.
To keep remote stop/remove commands working from Pulse, the installer also drops a small polkit rule that lets the `pulse-docker` service account run `systemctl stop/disable pulse-docker-agent` without password prompts. If you remove that rule, expect to acknowledge stop requests manually with `sudo systemctl disable --now pulse-docker-agent`.
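Polkit rules are written in JavaScript, so the rule the installer drops looks roughly like the sketch below. The filename and the exact action-detail checks are assumptions for illustration, not the installer's literal output:

```javascript
// Hypothetical /etc/polkit-1/rules.d/50-pulse-docker-agent.rules
polkit.addRule(function (action, subject) {
  if (
    action.id === "org.freedesktop.systemd1.manage-units" &&
    action.lookup("unit") === "pulse-docker-agent.service" &&
    subject.user === "pulse-docker"
  ) {
    // Only the stop/disable verbs are needed for remote removal;
    // everything else falls through to the normal policy.
    var verb = action.lookup("verb");
    if (verb === "stop" || verb === "disable") {
      return polkit.Result.YES;
    }
  }
  return polkit.Result.NOT_HANDLED;
});
```

Scoping the rule to a single unit and user keeps the service account from managing any other systemd units.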
Running the one-liner again from another Pulse server (with its own URL/token) will merge that server into the same agent automatically—no extra flags required.
To report to more than one Pulse instance from the same Docker host, repeat the `--target` flag (format: `https://pulse.example.com|<api-token>`) or export `PULSE_TARGETS` before running the script:
@ -112,32 +117,58 @@ A single `pulse-docker-agent` process can now serve any number of Pulse backends
```ini
[Unit]
Description=Pulse Docker Agent
After=network.target docker.service
Requires=docker.service
After=network-online.target docker.socket docker.service
Wants=network-online.target docker.socket
[Service]
Type=simple
Environment=PULSE_URL=https://pulse.example.com
Environment=PULSE_TOKEN=replace-me
Environment=PULSE_TARGETS=https://pulse.example.com|replace-me;https://pulse-dr.example.com|replace-me-dr
EnvironmentFile=-/etc/pulse/pulse-docker-agent.env
ExecStart=/usr/local/bin/pulse-docker-agent --interval 30s
Restart=always
RestartSec=5
Restart=on-failure
RestartSec=5s
StartLimitIntervalSec=120
StartLimitBurst=5
User=pulse-docker
Group=pulse-docker
SupplementaryGroups=docker
UMask=0077
NoNewPrivileges=yes
RestrictSUIDSGID=yes
RestrictRealtime=yes
PrivateTmp=yes
ProtectSystem=full
ProtectHome=read-only
ProtectControlGroups=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
ProtectKernelLogs=yes
LockPersonality=yes
MemoryDenyWriteExecute=yes
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
ReadWritePaths=/var/run/docker.sock
ProtectHostname=yes
ProtectClock=yes
[Install]
WantedBy=multi-user.target
```
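The `PULSE_TARGETS` value in the unit above packs one `https://url|token` pair per entry, separated by semicolons. A small illustrative parser for that format (the agent itself is Go; this TypeScript sketch only demonstrates the expected syntax):

```typescript
interface PulseTarget {
  url: string;
  token: string;
}

// Split a PULSE_TARGETS string such as
// "https://a.example.com|tok1;https://b.example.com|tok2"
// into individual targets. Entries without a "|" are skipped.
function parsePulseTargets(raw: string): PulseTarget[] {
  return raw
    .split(';')
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
    .flatMap((entry) => {
      const sep = entry.indexOf('|');
      if (sep < 0) return []; // malformed entry, ignore
      return [{ url: entry.slice(0, sep), token: entry.slice(sep + 1) }];
    });
}
```

Because only the first `|` is treated as the separator, tokens containing `|` characters would survive intact under this reading.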
### Containerised agent (advanced)
Rotate credentials or add additional Pulse targets by editing `/etc/pulse/pulse-docker-agent.env` and reloading the service with `sudo systemctl restart pulse-docker-agent`.
### Containerised agent (advanced / optional)
If you prefer to run the agent inside a container, mount the Docker socket and supply the same environment variables:
```bash
docker run -d \
--name pulse-docker-agent \
--pid=host \
--uts=host \
-e PULSE_URL="https://pulse.example.com" \
-e PULSE_TOKEN="<token>" \
-e PULSE_TARGETS="https://pulse.example.com|<token>;https://pulse-dr.example.com|<token-dr>" \
-e PULSE_NO_AUTO_UPDATE=true \
-v /etc/machine-id:/etc/machine-id:ro \
-v /var/run/docker.sock:/var/run/docker.sock \
--restart unless-stopped \
ghcr.io/rcourtman/pulse-docker-agent:latest
@ -145,6 +176,8 @@ docker run -d \
> **Note**: Official images for `linux/amd64` and `linux/arm64` are published to `ghcr.io/rcourtman/pulse-docker-agent`. To test local changes, run `docker build --target agent_runtime -t pulse-docker-agent:test .` from the repository root.
`--pid=host`, `--uts=host`, and the `/etc/machine-id` bind keep host metadata stable so Pulse doesn't think the container itself is the Docker host. Auto-update is disabled in the image by default; rebuild or override `PULSE_NO_AUTO_UPDATE=false` only if you manage upgrades outside of your orchestrator. Expect to grant the container the same level of Docker socket access as the systemd service—running inside Docker doesn't sandbox the agent from the host.
## Configuration reference
| Flag / Env var | Description | Default |

View file

@ -173,6 +173,13 @@ The auto-setup script (Settings → Nodes → Setup Script) will prompt you to c
If the node is part of a Proxmox cluster, the script will now detect the other members and offer to configure the same SSH/lm-sensors setup on each of them automatically—confirm when prompted to roll it out cluster-wide.
### Host-side responsibilities
- Run the host installer (`install-sensor-proxy.sh`) on the Proxmox machine that hosts Pulse to install and maintain the `pulse-sensor-proxy` service. The node setup script does not create this service.
- Re-run the host installer if the service or socket disappears after a host upgrade or configuration cleanup; the installer is idempotent.
- The installer now ships a self-heal timer (`pulse-sensor-proxy-selfheal.timer`) that restarts or reinstalls the proxy if it ever goes missing; leave it enabled for automatic recovery.
- Hot dev builds now warn when only a container-local proxy socket is present, signalling that the host proxy needs to be reinstalled before temperatures will flow back into Pulse.
### Turnkey Setup for Standalone Nodes (v4.25.0+)
**For standalone nodes** (not in a Proxmox cluster) running **containerized Pulse**, the setup script now automatically configures temperature monitoring with zero manual steps:

frontend-modern/pnpm-lock.yaml generated Normal file

File diff suppressed because it is too large

View file

@ -148,6 +148,76 @@ export class MonitoringAPI {
}
}
static async allowDockerHostReenroll(hostId: string): Promise<void> {
const url = `${this.baseUrl}/agents/docker/hosts/${encodeURIComponent(hostId)}/allow-reenroll`;
const response = await apiFetch(url, {
method: 'POST',
});
if (!response.ok) {
let message = `Failed with status ${response.status}`;
try {
const text = await response.text();
if (text?.trim()) {
message = text.trim();
try {
const parsed = JSON.parse(text);
if (typeof parsed?.error === 'string' && parsed.error.trim()) {
message = parsed.error.trim();
}
} catch (_err) {
// ignore parse error, use raw text
}
}
} catch (_err) {
// ignore read error
}
throw new Error(message);
}
}
static async deleteHostAgent(hostId: string): Promise<void> {
if (!hostId) {
throw new Error('Host ID is required to remove a host agent.');
}
const url = `${this.baseUrl}/agents/host/${encodeURIComponent(hostId)}`;
const response = await apiFetch(url, { method: 'DELETE' });
if (!response.ok) {
let message = `Failed with status ${response.status}`;
try {
const text = await response.text();
if (text?.trim()) {
message = text.trim();
try {
const parsed = JSON.parse(text);
if (typeof parsed?.error === 'string' && parsed.error.trim()) {
message = parsed.error.trim();
} else if (typeof parsed?.message === 'string' && parsed.message.trim()) {
message = parsed.message.trim();
}
} catch (_err) {
// Ignore JSON parse errors, fallback to raw text.
}
}
} catch (_err) {
// Ignore body read errors, keep default message.
}
throw new Error(message);
}
// Consume and ignore the body so the fetch can be reused by the connection pool.
try {
await response.text();
} catch (_err) {
// Swallow body read errors; the deletion already succeeded.
}
}
static async lookupHost(params: { id?: string; hostname?: string }): Promise<HostLookupResponse | null> {
const search = new URLSearchParams();
if (params.id) search.set('id', params.id);

View file

@ -4,6 +4,7 @@ import { StatusBadge } from '@/components/shared/StatusBadge';
import type { Alert } from '@/types/api';
import { Card } from '@/components/shared/Card';
import { SectionHeader } from '@/components/shared/SectionHeader';
import { ThresholdSlider } from '@/components/Dashboard/ThresholdSlider';
const COLUMN_TOOLTIP_LOOKUP: Record<string, string> = {
'cpu %': 'Percent CPU utilization allowed before an alert fires.',
@ -48,6 +49,8 @@ const COLUMN_TOOLTIP_LOOKUP: Record<string, string> = {
const OFFLINE_ALERTS_TOOLTIP =
'Toggle default behavior for powered-off or connectivity alerts for this resource type.';
const SLIDER_METRICS = new Set(['cpu', 'memory', 'disk', 'temperature']);
export interface Resource {
id: string;
name: string;
@ -79,6 +82,7 @@ export interface Resource {
toggleTitleEnabled?: string;
toggleTitleDisabled?: string;
editable?: boolean;
note?: string;
[key: string]: unknown;
}
@ -103,6 +107,7 @@ interface ResourceTableProps {
resourceId: string,
thresholds: Record<string, number | undefined>,
defaults: Record<string, number | undefined>,
note: string | undefined,
) => void;
onSaveEdit: (resourceId: string) => void;
onCancelEdit: () => void;
@ -136,6 +141,8 @@ interface ResourceTableProps {
groupHeaderMeta?: Record<string, GroupHeaderMeta>;
factoryDefaults?: Record<string, number | undefined>;
onResetDefaults?: () => void;
editingNote: () => string;
setEditingNote: (value: string) => void;
}
type OfflineState = 'off' | 'warning' | 'critical';
@ -513,7 +520,7 @@ export function ResourceTable(props: ResourceTableProps) {
</thead>
<tbody class="divide-y divide-gray-200 dark:divide-gray-700">
{/* Global Defaults Row */}
<Show
<Show
when={props.globalDefaults && props.setGlobalDefaults && props.setHasUnsavedChanges}
>
<tr
@ -961,6 +968,26 @@ export function ResourceTable(props: ResourceTableProps) {
Custom
</span>
</Show>
<Show when={isEditing()}>
<div class="mt-2 w-full">
<label class="sr-only" for={`note-${resource.id}`}>
Override note
</label>
<textarea
id={`note-${resource.id}`}
class="w-full rounded border border-gray-300 bg-white px-2 py-1 text-xs text-gray-700 shadow-sm focus:border-blue-500 focus:outline-none focus:ring-1 focus:ring-blue-500 dark:border-gray-600 dark:bg-gray-800 dark:text-gray-200"
rows={2}
placeholder="Add a note about this override (optional)"
value={props.editingNote()}
onInput={(e) => props.setEditingNote(e.currentTarget.value)}
/>
</div>
</Show>
<Show when={!isEditing() && resource.note}>
<p class="mt-2 text-xs italic text-gray-500 dark:text-gray-400 break-words">
{resource.note as string}
</p>
</Show>
</div>
</td>
{/* Metric columns - dynamically rendered based on resource type */}
@ -979,6 +1006,7 @@ export function ResourceTable(props: ResourceTableProps) {
resource.id,
resource.thresholds ? { ...resource.thresholds } : {},
resource.defaults ? { ...resource.defaults } : {},
typeof resource.note === 'string' ? resource.note : undefined,
);
};
@ -1011,56 +1039,121 @@ export function ResourceTable(props: ResourceTableProps) {
</div>
}
>
<div class="flex items-center justify-center">
<input
type="number"
min={bounds.min}
max={bounds.max}
step={metricStep(metric)}
value={thresholds()?.[metric] ?? ''}
placeholder={isDisabled() ? 'Off' : ''}
title="Set to -1 to disable alerts for this metric"
ref={(el) => {
if (
isEditing() &&
activeMetricInput()?.resourceId === resource.id &&
activeMetricInput()?.metric === metric
) {
queueMicrotask(() => {
el.focus();
el.select();
});
}
}}
onInput={(e) => {
const raw = e.currentTarget.value;
if (raw === '') {
props.setEditingThresholds({
...props.editingThresholds(),
[metric]: undefined,
});
return;
}
const val = parseFloat(raw);
if (!Number.isNaN(val)) {
props.setEditingThresholds({
...props.editingThresholds(),
[metric]: val,
});
}
}}
onBlur={() => {
if (props.editingId() === resource.id) {
props.onSaveEdit(resource.id);
}
setActiveMetricInput(null);
}}
class={`w-16 px-2 py-0.5 text-sm text-center border rounded ${
isDisabled()
? 'bg-gray-100 dark:bg-gray-800 text-gray-400 dark:text-gray-600 border-gray-300 dark:border-gray-600'
: 'bg-white dark:bg-gray-700 text-gray-900 dark:text-gray-100 border-gray-300 dark:border-gray-600'
}`}
/>
<div class="flex w-full items-center gap-3">
<Show when={SLIDER_METRICS.has(metric)}>
{(() => {
const sliderMin = Math.max(0, bounds.min);
const sliderMax =
metric === 'temperature'
? Math.max(
sliderMin,
bounds.max > 0 ? bounds.max : 120,
)
: bounds.max;
const defaultSliderValue = () => {
if (metric === 'disk') return 90;
if (metric === 'memory') return 85;
return 80; // cpu and temperature share the same default
};
const currentSliderValue = () => {
const editingVal =
props.editingThresholds()?.[metric];
if (
typeof editingVal === 'number' &&
editingVal >= 0
) {
return Math.round(editingVal);
}
const displayVal = displayValue(metric);
if (
typeof displayVal === 'number' &&
displayVal >= 0
) {
return Math.round(displayVal);
}
return defaultSliderValue();
};
return (
<div class="w-36">
<ThresholdSlider
value={Math.max(
sliderMin,
Math.min(sliderMax, currentSliderValue()),
)}
onChange={(val) => {
props.setEditingThresholds({
...props.editingThresholds(),
[metric]: val,
});
}}
type={
metric === 'temperature'
? 'temperature'
: (metric as 'cpu' | 'memory' | 'disk')
}
min={sliderMin}
max={sliderMax}
/>
</div>
);
})()}
</Show>
<div class="flex items-center justify-center">
<input
type="number"
min={bounds.min}
max={bounds.max}
step={metricStep(metric)}
value={thresholds()?.[metric] ?? ''}
placeholder={isDisabled() ? 'Off' : ''}
title="Set to -1 to disable alerts for this metric"
ref={(el) => {
if (
isEditing() &&
activeMetricInput()?.resourceId ===
resource.id &&
activeMetricInput()?.metric === metric
) {
queueMicrotask(() => {
el.focus();
el.select();
});
}
}}
onInput={(e) => {
const raw = e.currentTarget.value;
if (raw === '') {
props.setEditingThresholds({
...props.editingThresholds(),
[metric]: undefined,
});
return;
}
const val = parseFloat(raw);
if (!Number.isNaN(val)) {
props.setEditingThresholds({
...props.editingThresholds(),
[metric]: val,
});
}
}}
onBlur={() => {
if (props.editingId() === resource.id) {
props.onSaveEdit(resource.id);
}
setActiveMetricInput(null);
}}
class={`w-16 px-2 py-0.5 text-sm text-center border rounded ${
isDisabled()
? 'bg-gray-100 dark:bg-gray-800 text-gray-400 dark:text-gray-600 border-gray-300 dark:border-gray-600'
: 'bg-white dark:bg-gray-700 text-gray-900 dark:text-gray-100 border-gray-300 dark:border-gray-600'
}`}
/>
</div>
</div>
</Show>
</Show>
@ -1171,6 +1264,7 @@ export function ResourceTable(props: ResourceTableProps) {
resource.id,
resource.thresholds ? { ...resource.thresholds } : {},
resource.defaults ? { ...resource.defaults } : {},
typeof resource.note === 'string' ? resource.note : undefined,
)
}
class="p-1 text-blue-600 hover:text-blue-700 dark:text-blue-400 dark:hover:text-blue-300"
@ -1421,11 +1515,12 @@ export function ResourceTable(props: ResourceTableProps) {
return;
}
setActiveMetricInput({ resourceId: resource.id, metric });
props.onEdit(
resource.id,
resource.thresholds ? { ...resource.thresholds } : {},
resource.defaults ? { ...resource.defaults } : {},
);
props.onEdit(
resource.id,
resource.thresholds ? { ...resource.thresholds } : {},
resource.defaults ? { ...resource.defaults } : {},
typeof resource.note === 'string' ? resource.note : undefined,
);
};
return (
@ -1597,11 +1692,12 @@ export function ResourceTable(props: ResourceTableProps) {
<div class="flex items-center gap-1">
<button
type="button"
onClick={() =>
onClick={() =>
props.onEdit(
resource.id,
resource.thresholds ? { ...resource.thresholds } : {},
resource.defaults ? { ...resource.defaults } : {},
typeof resource.note === 'string' ? resource.note : undefined,
)
}
class="p-1 text-blue-600 hover:text-blue-700 dark:text-blue-400 dark:hover:text-blue-300"

View file

@ -52,6 +52,7 @@ interface Override {
disabled?: boolean;
disableConnectivity?: boolean; // For nodes only - disable offline alerts
poweredOffSeverity?: 'warning' | 'critical';
note?: string;
thresholds: {
cpu?: number;
memory?: number;
@ -307,9 +308,10 @@ export function ThresholdsTable(props: ThresholdsTableProps) {
const [searchTerm, setSearchTerm] = createSignal('');
const [editingId, setEditingId] = createSignal<string | null>(null);
const [editingThresholds, setEditingThresholds] = createSignal<
Record<string, number | undefined>
>({});
const [editingThresholds, setEditingThresholds] = createSignal<
Record<string, number | undefined>
>({});
const [editingNote, setEditingNote] = createSignal('');
const [activeTab, setActiveTab] = createSignal<'proxmox' | 'pmg' | 'hosts' | 'docker'>('proxmox');
let searchInputRef: HTMLInputElement | undefined;
const [dockerIgnoredInput, setDockerIgnoredInput] = createSignal(
@ -585,6 +587,9 @@ export function ThresholdsTable(props: ThresholdsTableProps) {
);
});
const note = typeof override?.note === 'string' ? override.note : undefined;
const hasNote = Boolean(note && note.trim().length > 0);
const originalDisplayName = node.displayName?.trim() || node.name;
const friendlyName = getFriendlyNodeName(originalDisplayName, node.clusterName);
const rawName = node.name;
@ -608,7 +613,8 @@ export function ThresholdsTable(props: ThresholdsTableProps) {
uptime: node.uptime,
cpu: node.cpu,
memory: node.memory?.usage,
hasOverride: hasCustomThresholds || false,
hasOverride:
hasCustomThresholds || hasNote || Boolean(override?.disableConnectivity) || false,
disabled: false,
disableConnectivity: override?.disableConnectivity || false,
thresholds: override?.thresholds || {},
@ -616,6 +622,7 @@ export function ThresholdsTable(props: ThresholdsTableProps) {
clusterName: node.isClusterMember ? node.clusterName?.trim() : undefined,
isClusterMember: node.isClusterMember ?? false,
instance: node.instance,
note,
} satisfies Resource;
});
@ -1518,11 +1525,13 @@ export function ThresholdsTable(props: ThresholdsTableProps) {
resourceId: string,
currentThresholds: Record<string, number | undefined>,
defaults: Record<string, number | undefined>,
note?: string,
) => {
setEditingId(resourceId);
// Merge defaults with overrides for editing
const mergedThresholds = { ...defaults, ...currentThresholds };
setEditingThresholds(mergedThresholds);
setEditingNote(note ?? '');
};
const saveEdit = (resourceId: string) => {
@ -1542,6 +1551,8 @@ export function ThresholdsTable(props: ThresholdsTableProps) {
if (!resource) return;
const editedThresholds = editingThresholds();
const trimmedNote = editingNote().trim();
const noteForOverride = trimmedNote.length > 0 ? trimmedNote : undefined;
if (resource.editScope === 'backup') {
const currentBackupDefaults = props.backupDefaults();
@ -1610,7 +1621,8 @@ export function ThresholdsTable(props: ThresholdsTableProps) {
const hasStateOnlyOverride = Boolean(
resource.disabled ||
resource.disableConnectivity ||
resource.poweredOffSeverity !== undefined,
resource.poweredOffSeverity !== undefined ||
noteForOverride !== undefined,
);
// If no threshold overrides or state flags remain, remove the override entirely
@ -1642,6 +1654,7 @@ export function ThresholdsTable(props: ThresholdsTableProps) {
disabled: resource.disabled,
disableConnectivity: resource.disableConnectivity,
poweredOffSeverity: resource.poweredOffSeverity,
note: noteForOverride,
thresholds: overrideThresholds,
};
@ -1701,18 +1714,25 @@ export function ThresholdsTable(props: ThresholdsTableProps) {
delete hysteresisThresholds.poweredOffSeverity;
}
}
if (noteForOverride) {
hysteresisThresholds.note = noteForOverride;
} else {
delete hysteresisThresholds.note;
}
newRawConfig[resourceId] = hysteresisThresholds;
props.setRawOverridesConfig(newRawConfig);
props.setHasUnsavedChanges(true);
setEditingId(null);
setEditingThresholds({});
setEditingNote('');
};
const cancelEdit = () => {
setEditingId(null);
setEditingThresholds({});
};
const cancelEdit = () => {
setEditingId(null);
setEditingThresholds({});
setEditingNote('');
};
const updateMetricDelay = (
typeKey: 'guest' | 'node' | 'storage' | 'pbs',

View file

@ -3,7 +3,7 @@ import { createSignal, createEffect, onMount } from 'solid-js';
interface ThresholdSliderProps {
value: number;
onChange: (value: number) => void;
type: 'cpu' | 'memory' | 'disk';
type: 'cpu' | 'memory' | 'disk' | 'temperature';
min?: number;
max?: number;
}
@ -15,10 +15,11 @@ export function ThresholdSlider(props: ThresholdSliderProps) {
const [isDragging, setIsDragging] = createSignal(false);
// Color mapping
const colorMap = {
const colorMap: Record<ThresholdSliderProps['type'], string> = {
cpu: 'text-blue-500',
memory: 'text-green-500',
disk: 'text-amber-500',
temperature: 'text-rose-500',
};
// Calculate visual position - allow full range 0-100%
@ -81,7 +82,9 @@ export function ThresholdSlider(props: ThresholdSliderProps) {
? 'bg-blue-500/30'
: props.type === 'memory'
? 'bg-green-500/30'
: 'bg-amber-500/30'
: props.type === 'disk'
? 'bg-amber-500/30'
: 'bg-rose-500/30'
}`}
style={{ width: `${calculateVisualPosition(props.value)}%` }}
></div>
@ -98,7 +101,11 @@ export function ThresholdSlider(props: ThresholdSliderProps) {
onWheel={(e) => e.preventDefault()}
class="absolute inset-0 w-full h-3.5 opacity-0 cursor-pointer z-20"
style={{ 'touch-action': 'none' }}
title={`${props.type.toUpperCase()}: ${props.value}%`}
title={
props.type === 'temperature'
? `Temperature: ${props.value}°C`
: `${props.type.toUpperCase()}: ${props.value}%`
}
/>
{/* Custom thumb with value */}
@ -118,7 +125,9 @@ export function ThresholdSlider(props: ThresholdSliderProps) {
>
<div class="relative">
<div class="w-9 h-4 bg-white dark:bg-gray-800 rounded-full shadow-md border-2 border-current flex items-center justify-center">
<span class="text-[9px] font-semibold">{props.value}%</span>
<span class="text-[9px] font-semibold">
{props.type === 'temperature' ? `${props.value}°` : `${props.value}%`}
</span>
</div>
</div>
</div>

View file

@ -374,13 +374,26 @@ const DockerContainerRow: Component<{
const blockIo = createMemo(() => container.blockIo);
const blockIoReadBytes = createMemo(() => blockIo()?.readBytes ?? 0);
const blockIoWriteBytes = createMemo(() => blockIo()?.writeBytes ?? 0);
const blockIoReadRate = createMemo(() => blockIo()?.readRateBytesPerSecond ?? null);
const blockIoWriteRate = createMemo(() => blockIo()?.writeRateBytesPerSecond ?? null);
const formatIoRate = (value?: number | null) => {
if (value === undefined || value === null) return undefined;
if (value <= 0) return undefined;
const decimals = value >= 1024 ? 1 : 0;
return `${formatBytes(value, decimals)}/s`;
};
const blockIoReadRateLabel = createMemo(() => formatIoRate(blockIoReadRate()));
const blockIoWriteRateLabel = createMemo(() => formatIoRate(blockIoWriteRate()));
const hasBlockIo = createMemo(() => {
const stats = blockIo();
if (!stats) return false;
const read = stats.readBytes ?? 0;
const write = stats.writeBytes ?? 0;
return read > 0 || write > 0;
const readRate = stats.readRateBytesPerSecond ?? 0;
const writeRate = stats.writeRateBytesPerSecond ?? 0;
return read > 0 || write > 0 || readRate > 0 || writeRate > 0;
});
const hasBlockIoRates = createMemo(() => !!blockIoReadRateLabel() || !!blockIoWriteRateLabel());
const hasDrawerContent = createMemo(() => {
return (
@ -765,6 +778,19 @@ const DockerContainerRow: Component<{
/>
</Show>
</Show>
<Show when={hasBlockIoRates()}>
<div class="mt-1 text-[11px] text-gray-500 dark:text-gray-400">
<Show when={blockIoReadRateLabel()}>
<span>R {blockIoReadRateLabel()}</span>
</Show>
<Show when={blockIoReadRateLabel() && blockIoWriteRateLabel()}>
<span class="mx-1 text-gray-300 dark:text-gray-600">·</span>
</Show>
<Show when={blockIoWriteRateLabel()}>
<span>W {blockIoWriteRateLabel()}</span>
</Show>
</div>
</Show>
</td>
<td class="px-2 py-0.5 text-xs text-gray-700 dark:text-gray-300">
<Show when={isRunning()} fallback={<span class="text-gray-400">—</span>}>
@ -838,15 +864,29 @@ const DockerContainerRow: Component<{
<div class="mt-1 space-y-1 text-[11px] text-gray-600 dark:text-gray-300">
<div class="flex items-center justify-between">
<span>Read</span>
<span class="font-semibold text-gray-900 dark:text-gray-100">
{formatBytes(blockIoReadBytes())}
</span>
<div class="text-right">
<div class="font-semibold text-gray-900 dark:text-gray-100">
{formatBytes(blockIoReadBytes())}
</div>
<Show when={blockIoReadRateLabel()}>
<div class="text-[10px] text-gray-500 dark:text-gray-400">
{blockIoReadRateLabel()}
</div>
</Show>
</div>
</div>
<div class="flex items-center justify-between">
<span>Write</span>
<span class="font-semibold text-gray-900 dark:text-gray-100">
{formatBytes(blockIoWriteBytes())}
</span>
<div class="text-right">
<div class="font-semibold text-gray-900 dark:text-gray-100">
{formatBytes(blockIoWriteBytes())}
</div>
<Show when={blockIoWriteRateLabel()}>
<div class="text-[10px] text-gray-500 dark:text-gray-400">
{blockIoWriteRateLabel()}
</div>
</Show>
</div>
</div>
</div>
</div>

View file

@ -17,25 +17,46 @@ export const DockerAgents: Component = () => {
const [showHidden, setShowHidden] = createSignal(false);
const dockerHosts = () => {
const all = state.dockerHosts || [];
return showHidden() ? all : all.filter(host => !host.hidden);
};
const pendingHosts = () =>
dockerHosts().filter(host => {
const status = host.command?.status;
if (status === 'queued' || status === 'dispatched' || status === 'acknowledged') return true;
return Boolean(host.pendingUninstall);
});
const hiddenCount = () => (state.dockerHosts || []).filter(host => host.hidden).length;
const allDockerHosts = () => state.dockerHosts || [];
const dockerHosts = createMemo(() => {
const all = allDockerHosts();
const includeHidden = showHidden();
let filtered = includeHidden ? all : all.filter(host => !host.hidden);
if (!includeHidden) {
filtered = filtered.filter(host => {
if (!host.pendingUninstall) {
return true;
}
const status = host.command?.status;
return status === 'failed' || status === 'expired';
});
}
return filtered;
});
const hiddenCount = () => allDockerHosts().filter(host => host.hidden).length;
const pendingHosts = createMemo(() =>
allDockerHosts().filter(host => {
if (host.pendingUninstall) return true;
const status = host.command?.status;
return status === 'queued' || status === 'dispatched' || status === 'acknowledged' || status === 'completed';
}),
);
const removedHosts = () => state.removedDockerHosts ?? [];
const hasRemovedHosts = () => removedHosts().length > 0;
const [removingHostId, setRemovingHostId] = createSignal<string | null>(null);
const [showRemoveModal, setShowRemoveModal] = createSignal(false);
const [hostToRemoveId, setHostToRemoveId] = createSignal<string | null>(null);
const [uninstallCommandCopied, setUninstallCommandCopied] = createSignal(false);
const [removeActionLoading, setRemoveActionLoading] = createSignal<'queue' | 'force' | 'hide' | null>(null);
const [removeActionLoading, setRemoveActionLoading] = createSignal<
'queue' | 'force' | 'hide' | 'awaitingCommand' | null
>(null);
const [showAdvancedOptions, setShowAdvancedOptions] = createSignal(false);
const [securityStatus, setSecurityStatus] = createSignal<SecurityStatus | null>(null);
const [isGeneratingToken, setIsGeneratingToken] = createSignal(false);
@ -131,6 +152,10 @@ const modalLastHeartbeat = createMemo(() => {
return host?.lastReportTime ? formatRelativeTime(new Date(host.lastReportTime)) : null;
});
const modalHostPendingUninstall = createMemo(() => Boolean(hostToRemove()?.pendingUninstall));
const modalHasCommand = createMemo(() => Boolean(modalCommand()));
const [hasShownCommandCompletion, setHasShownCommandCompletion] = createSignal(false);
const formatElapsedTime = (seconds: number) => {
if (seconds < 60) {
return `${seconds}s`;
@ -140,6 +165,55 @@ const formatElapsedTime = (seconds: number) => {
return `${mins}m ${secs}s`;
};
type RemovalStatusTone = 'info' | 'success' | 'danger';
const removalBadgeClassMap: Record<RemovalStatusTone, string> = {
info: 'inline-flex items-center gap-1 rounded-full bg-blue-100 px-2 py-0.5 text-[11px] font-semibold uppercase tracking-wide text-blue-700 dark:bg-blue-900/40 dark:text-blue-200',
success:
'inline-flex items-center gap-1 rounded-full bg-emerald-100 px-2 py-0.5 text-[11px] font-semibold uppercase tracking-wide text-emerald-700 dark:bg-emerald-900/40 dark:text-emerald-200',
danger:
'inline-flex items-center gap-1 rounded-full bg-red-100 px-2 py-0.5 text-[11px] font-semibold uppercase tracking-wide text-red-600 dark:bg-red-900/40 dark:text-red-200',
};
const removalTextClassMap: Record<RemovalStatusTone, string> = {
info: 'text-blue-700 dark:text-blue-300',
success: 'text-emerald-700 dark:text-emerald-300',
danger: 'text-red-600 dark:text-red-300',
};
const getRemovalStatusInfo = (host: DockerHost): { label: string; tone: RemovalStatusTone } | null => {
const status = host.command?.status ?? null;
switch (status) {
case 'failed':
return {
label: host.command?.failureReason || 'Pulse could not stop the agent automatically.',
tone: 'danger',
};
case 'expired':
return {
label: 'Stop command expired before the agent responded.',
tone: 'danger',
};
case 'completed':
return {
label: 'Agent stopped. Pulse will hide this host after the next missed heartbeat.',
tone: 'success',
};
case 'acknowledged':
return { label: 'Agent acknowledged the stop command—waiting for shutdown.', tone: 'info' };
case 'dispatched':
return { label: 'Instruction delivered to the agent.', tone: 'info' };
case 'queued':
return { label: 'Stop command queued; waiting to reach the agent.', tone: 'info' };
default:
if (host.pendingUninstall) {
return { label: 'Marked for uninstall; waiting for agent confirmation.', tone: 'info' };
}
return null;
}
};
createEffect(() => {
if (!showRemoveModal()) return;
const id = hostToRemoveId();
@@ -149,6 +223,17 @@ const formatElapsedTime = (seconds: number) => {
}
});
createEffect(() => {
if (!showRemoveModal()) {
return;
}
if (removeActionLoading() === 'awaitingCommand') {
if (modalHasCommand() || modalHostPendingUninstall() || modalCommandFailed()) {
setRemoveActionLoading(null);
}
}
});
// Track elapsed time for command execution
createEffect(() => {
const cmd = modalCommand();
@@ -178,6 +263,23 @@ const formatElapsedTime = (seconds: number) => {
}
});
createEffect(() => {
if (!showRemoveModal()) {
return;
}
if (modalCommandCompleted() && !hasShownCommandCompletion()) {
setHasShownCommandCompletion(true);
notificationStore.success('Agent stopped. Pulse will hide this host after the next heartbeat.', 5000);
if (typeof window !== 'undefined') {
window.setTimeout(() => {
closeRemoveModal();
}, 1200);
} else {
closeRemoveModal();
}
}
});
onMount(() => {
if (typeof window === 'undefined') {
return;
@@ -293,6 +395,32 @@ User=root
WantedBy=multi-user.target`;
};
const getAllowReenrollCommand = (hostId: string) => {
const url = pulseUrl();
return `curl -X POST -H "X-API-Token: <token-with-docker:manage>" ${url}/api/agents/docker/hosts/${hostId}/allow-reenroll`;
};
const handleAllowReenroll = async (hostId: string, label: string) => {
try {
await MonitoringAPI.allowDockerHostReenroll(hostId);
notificationStore.success(`Allowed ${label} to report again`, 4000);
} catch (error) {
console.error('Failed to allow Docker host re-enroll', error);
const message = error instanceof Error ? error.message : 'Failed to clear the removal block. Confirm your account has docker:manage access.';
notificationStore.error(message, 8000);
}
};
const handleCopyAllowCommand = async (hostId: string, label: string) => {
const command = getAllowReenrollCommand(hostId);
const copied = await copyToClipboard(command);
if (copied) {
notificationStore.success(`Command copied for ${label}`, 3500);
} else {
notificationStore.error('Copy failed. You can still manually copy the snippet.', 4000);
}
};
const isRemovingHost = (hostId: string) => removingHostId() === hostId;
const openRemoveModal = (host: DockerHost) => {
@@ -301,6 +429,7 @@ WantedBy=multi-user.target`;
setRemoveActionLoading(null);
setShowAdvancedOptions(false);
setShowRemoveModal(true);
setHasShownCommandCompletion(false);
};
const closeRemoveModal = () => {
@@ -309,6 +438,7 @@ WantedBy=multi-user.target`;
setUninstallCommandCopied(false);
setRemoveActionLoading(null);
setShowAdvancedOptions(false);
setHasShownCommandCompletion(false);
};
const handleQueueStopCommand = async () => {
@@ -322,19 +452,20 @@ WantedBy=multi-user.target`;
try {
await MonitoringAPI.deleteDockerHost(host.id);
notificationStore.success(`Stop command sent to ${displayName}`, 3500);
setRemoveActionLoading('awaitingCommand');
} catch (error) {
console.error('Failed to queue Docker host stop command', error);
const message = error instanceof Error ? error.message : 'Failed to send stop command';
notificationStore.error(message, 8000);
setRemoveActionLoading(null);
} finally {
setRemovingHostId(null);
setRemoveActionLoading(null);
}
};
const handleHideHostFromModal = async () => {
const host = hostToRemove();
if (!host || removeActionLoading()) return;
if (!host || (removeActionLoading() && removeActionLoading() !== 'awaitingCommand')) return;
const displayName = getDisplayName(host);
setRemovingHostId(host.id);
@@ -356,7 +487,7 @@ WantedBy=multi-user.target`;
const handleRemoveHostNow = async () => {
const host = hostToRemove();
if (!host || removeActionLoading()) return;
if (!host || (removeActionLoading() && removeActionLoading() !== 'awaitingCommand')) return;
const displayName = getDisplayName(host);
setRemovingHostId(host.id);
@@ -415,6 +546,62 @@ WantedBy=multi-user.target`;
return (
<div class="space-y-6">
<Show when={hasRemovedHosts()}>
<Card
padding="lg"
class="space-y-4 border border-amber-300 bg-amber-50 text-amber-900 shadow-sm dark:border-amber-500/40 dark:bg-amber-500/10 dark:text-amber-100"
>
<div class="space-y-1">
<h3 class="text-sm font-semibold">Recently removed Docker hosts</h3>
<p class="text-sm text-amber-800 dark:text-amber-200">
Pulse is currently blocking these hosts because they were explicitly removed. Allow them to re-enroll or
copy the command below and run it with a token that includes the <code>docker:manage</code> scope.
</p>
</div>
<div class="space-y-3">
<For each={removedHosts()}>
{(entry) => {
const label = entry.displayName || entry.hostname || entry.id;
return (
<div class="rounded-lg border border-amber-200 bg-white/80 p-4 shadow-sm dark:border-amber-500/40 dark:bg-amber-950/20">
<div class="flex flex-col gap-1 sm:flex-row sm:items-center sm:justify-between">
<div>
<p class="text-sm font-semibold text-gray-900 dark:text-gray-100">{label}</p>
<p class="text-xs text-gray-500 dark:text-gray-400">Host ID: {entry.id}</p>
</div>
<div class="text-xs text-gray-500 dark:text-gray-400">
Removed {formatRelativeTime(entry.removedAt)}
</div>
</div>
<div class="mt-3 flex flex-wrap gap-2">
<button
type="button"
onClick={() => handleAllowReenroll(entry.id, label)}
class="inline-flex items-center justify-center gap-2 rounded-md bg-emerald-600 px-3 py-1.5 text-xs font-medium text-white transition hover:bg-emerald-700 focus:outline-none focus:ring-2 focus:ring-emerald-500 focus:ring-offset-1 dark:focus:ring-offset-gray-900"
>
Allow re-enroll
</button>
<button
type="button"
onClick={() => handleCopyAllowCommand(entry.id, label)}
class="inline-flex items-center justify-center gap-2 rounded-md border border-emerald-600/50 px-3 py-1.5 text-xs font-medium text-emerald-700 transition hover:bg-emerald-50 focus:outline-none focus:ring-2 focus:ring-emerald-500 focus:ring-offset-1 dark:border-emerald-400/60 dark:text-emerald-200 dark:hover:bg-emerald-500/20 dark:focus:ring-offset-gray-900"
>
Copy curl command
</button>
</div>
</div>
);
}}
</For>
<p class="text-xs text-amber-800 dark:text-amber-200">
                If you removed a host intentionally, you can simply ignore it; entries expire automatically after 24 hours.
</p>
</div>
</Card>
</Show>
<Card padding="lg" class="space-y-5">
<div class="space-y-1">
<h3 class="text-base font-semibold text-gray-900 dark:text-gray-100">Add a Docker host</h3>
@@ -631,7 +818,12 @@ WantedBy=multi-user.target`;
<button
type="button"
onClick={handleQueueStopCommand}
disabled={removeActionLoading() !== null || modalCommandInProgress() || modalCommandStatus() === 'completed'}
disabled={
removeActionLoading() !== null ||
modalCommandInProgress() ||
modalCommandStatus() === 'completed' ||
(modalHostPendingUninstall() && !modalHasCommand())
}
class={`inline-flex items-center justify-center rounded px-4 py-2 text-sm font-medium text-white transition-colors ${
modalCommandStatus() === 'completed'
? 'bg-emerald-600 dark:bg-emerald-500'
@@ -640,61 +832,110 @@ WantedBy=multi-user.target`;
>
{(() => {
if (removeActionLoading() === 'queue') return 'Sending…';
if (removeActionLoading() === 'awaitingCommand') return 'Waiting for agent…';
if (modalCommandInProgress()) return 'Waiting for agent…';
if (modalCommandStatus() === 'completed') return 'Agent stopped';
if (!modalHasCommand() && modalHostPendingUninstall()) return 'Waiting for host…';
if (modalCommandFailed()) return 'Retry stop command';
return 'Stop agent now';
})()}
</button>
<Show when={modalCommandInProgress()}>
<div class="space-y-3">
{/* Progress steps */}
<div class="rounded border border-blue-200 bg-white p-3 dark:border-blue-700 dark:bg-blue-800/20">
<div class="mb-2 flex items-center justify-between">
<span class="text-xs font-semibold uppercase tracking-wide text-blue-700 dark:text-blue-300">Progress</span>
<span class="text-xs text-blue-600 dark:text-blue-400">{formatElapsedTime(elapsedSeconds())} elapsed</span>
</div>
<ul class="space-y-1.5">
<For each={modalCommandProgress()}>
{(step) => (
<li
class={`${step.done || step.active ? 'text-blue-700 dark:text-blue-200' : 'text-gray-500 dark:text-gray-400'} flex items-center gap-2 text-xs`}
>
<span
class={`h-2 w-2 flex-shrink-0 rounded-full ${
step.done
? 'bg-blue-500'
: step.active
? 'bg-blue-400 animate-pulse'
: 'bg-gray-300 dark:bg-gray-600'
}`}
/>
{step.label}
</li>
)}
</For>
</ul>
</div>
{/* Expected time and last heartbeat */}
<div class="flex items-start gap-2 text-xs text-blue-700 dark:text-blue-300">
<svg class="w-4 h-4 mt-0.5 flex-shrink-0" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<Show
when={
removeActionLoading() === 'awaitingCommand' &&
!modalHasCommand() &&
!modalHostPendingUninstall()
}
>
<div class="rounded border border-blue-200 bg-white p-3 dark:border-blue-700 dark:bg-blue-800/20">
<div class="flex items-start gap-2 text-xs text-blue-700 dark:text-blue-200">
<svg class="mt-0.5 h-4 w-4 flex-shrink-0" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 8v4l3 3m6-3a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
<div>
<p>
<Show when={!modalCommandTimedOut()} fallback="This is taking longer than expected.">
This usually takes 30-60 seconds.
</Show>
<Show when={modalLastHeartbeat()}>
{' '}Last heartbeat: {modalLastHeartbeat()}.
</Show>
<p class="font-semibold">Stop command sent.</p>
<p class="mt-1 leading-snug">
Pulse is waiting for <span class="font-medium">{modalHostname()}</span> to pick up the shutdown instruction. This usually finishes within 30-60 seconds.
</p>
</div>
</div>
</div>
</Show>
<Show
when={
modalCommandInProgress() ||
modalCommandCompleted() ||
(!modalHasCommand() && modalHostPendingUninstall())
}
>
<div class="space-y-3">
<Show when={modalHasCommand()}>
<div class="rounded border border-blue-200 bg-white p-3 dark:border-blue-700 dark:bg-blue-800/20">
<div class="mb-2 flex items-center justify-between">
<span class="text-xs font-semibold uppercase tracking-wide text-blue-700 dark:text-blue-300">Progress</span>
<Show
when={!modalCommandCompleted()}
fallback={<span class="text-xs font-semibold text-emerald-600 dark:text-emerald-300">Completed</span>}
>
<span class="text-xs text-blue-600 dark:text-blue-400">{formatElapsedTime(elapsedSeconds())} elapsed</span>
</Show>
</div>
<ul class="space-y-1.5">
<For each={modalCommandProgress()}>
{(step) => (
<li
class={`${step.done || step.active ? 'text-blue-700 dark:text-blue-200' : 'text-gray-500 dark:text-gray-400'} flex items-center gap-2 text-xs`}
>
<span
class={`relative h-2 w-2 flex-shrink-0 rounded-full ${
step.done
? 'bg-blue-500'
: step.active
? 'bg-blue-400 animate-pulse'
: 'bg-gray-300 dark:bg-gray-600'
} ${modalCommandCompleted() && step.done ? 'after:absolute after:-inset-1 after:rounded-full after:border after:border-emerald-400/40 after:animate-pulse' : ''}`}
/>
{step.label}
</li>
)}
</For>
</ul>
</div>
</Show>
{/* Timeout warning */}
<Show when={modalCommandTimedOut()}>
<Show when={!modalHasCommand() && modalHostPendingUninstall()}>
<div class="rounded border border-blue-200 bg-white p-3 text-xs text-blue-700 dark:border-blue-700 dark:bg-blue-800/20 dark:text-blue-200">
<p class="font-semibold">Agent already stopped.</p>
<p class="mt-1 leading-snug">
                      Pulse is waiting for <span class="font-medium">{modalHostname()}</span> to miss its next heartbeat so the host can be removed automatically. No further action is required; this usually finishes within 60 seconds.
</p>
<Show when={modalLastHeartbeat()}>
<p class="mt-2 text-[11px]">
Last heartbeat: {modalLastHeartbeat()}. Pulse will clear the entry after the next missed report.
</p>
</Show>
</div>
</Show>
<Show when={modalHasCommand() && !modalCommandCompleted()}>
<div class="flex items-start gap-2 text-xs text-blue-700 dark:text-blue-300">
<svg class="w-4 h-4 mt-0.5 flex-shrink-0" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 8v4l3 3m6-3a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
<div>
<p>
<Show when={!modalCommandTimedOut()} fallback="This is taking longer than expected.">
This usually takes 30-60 seconds.
</Show>
<Show when={modalLastHeartbeat()}>
{' '}Last heartbeat: {modalLastHeartbeat()}.
</Show>
</p>
</div>
</div>
</Show>
<Show when={modalHasCommand() && modalCommandTimedOut() && !modalCommandCompleted()}>
<div class="rounded border border-yellow-200 bg-yellow-50 p-3 dark:border-yellow-700 dark:bg-yellow-900/20">
<div class="flex items-start gap-2">
<svg class="w-4 h-4 mt-0.5 flex-shrink-0 text-yellow-600 dark:text-yellow-400" fill="none" viewBox="0 0 24 24" stroke="currentColor">
@@ -747,13 +988,13 @@ WantedBy=multi-user.target`;
class={`${step.done || step.active ? 'text-blue-700 dark:text-blue-200' : 'text-gray-500 dark:text-gray-400'} flex items-center gap-2`}
>
<span
class={`h-2 w-2 rounded-full ${
class={`relative h-2 w-2 rounded-full ${
step.done
? 'bg-blue-500'
: step.active
? 'bg-blue-400 animate-pulse'
: 'bg-gray-300 dark:bg-gray-600'
}`}
} ${modalCommandCompleted() && step.done ? 'after:absolute after:-inset-1 after:rounded-full after:border after:border-emerald-400/40 after:animate-pulse' : ''}`}
/>
{step.label}
</li>
@@ -790,7 +1031,7 @@ WantedBy=multi-user.target`;
<button
type="button"
onClick={handleRemoveHostNow}
disabled={removeActionLoading() !== null}
disabled={removeActionLoading() !== null && removeActionLoading() !== 'awaitingCommand'}
class="self-start rounded bg-orange-600 px-4 py-2 text-sm font-semibold text-white transition-colors hover:bg-orange-700 disabled:cursor-not-allowed disabled:opacity-60 dark:bg-orange-500 dark:hover:bg-orange-400 whitespace-nowrap"
>
{removeActionLoading() === 'force' ? 'Removing…' : 'Force remove now'}
@@ -811,7 +1052,7 @@ WantedBy=multi-user.target`;
<button
type="button"
onClick={handleRemoveHostNow}
disabled={removeActionLoading() !== null}
disabled={removeActionLoading() !== null && removeActionLoading() !== 'awaitingCommand'}
class="self-start rounded bg-emerald-600 px-4 py-2 text-sm font-semibold text-white transition-colors hover:bg-emerald-700 disabled:cursor-not-allowed disabled:opacity-60 dark:bg-emerald-500 dark:hover:bg-emerald-400"
>
{removeActionLoading() === 'force' ? 'Removing…' : 'Remove host'}
@@ -877,7 +1118,7 @@ WantedBy=multi-user.target`;
<button
type="button"
onClick={handleRemoveHostNow}
disabled={removeActionLoading() !== null}
disabled={removeActionLoading() !== null && removeActionLoading() !== 'awaitingCommand'}
class="self-start rounded bg-red-600 px-3 py-1.5 text-xs font-semibold text-white transition-colors hover:bg-red-700 disabled:cursor-not-allowed disabled:opacity-60 dark:bg-red-500 dark:hover:bg-red-400"
>
{removeActionLoading() === 'force' ? 'Removing…' : 'Force remove now'}
@@ -896,7 +1137,7 @@ WantedBy=multi-user.target`;
<button
type="button"
onClick={handleHideHostFromModal}
disabled={removeActionLoading() !== null || modalHostHidden()}
disabled={(removeActionLoading() !== null && removeActionLoading() !== 'awaitingCommand') || modalHostHidden()}
class="self-start rounded bg-gray-200 px-3 py-1.5 text-xs font-medium text-gray-800 transition-colors hover:bg-gray-300 disabled:cursor-not-allowed disabled:opacity-60 dark:bg-gray-700 dark:text-gray-200 dark:hover:bg-gray-600"
>
{removeActionLoading() === 'hide' ? 'Hiding...' : modalHostHidden() ? 'Already hidden' : 'Hide host'}
@@ -930,13 +1171,75 @@ WantedBy=multi-user.target`;
<svg class="w-5 h-5 text-yellow-600 dark:text-yellow-400 mt-0.5 flex-shrink-0" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 8v4l3 3m6-3a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
<div class="flex-1">
<h4 class="text-sm font-semibold text-yellow-900 dark:text-yellow-100">
Stopping {pendingHosts().length} host{pendingHosts().length !== 1 ? 's' : ''}
</h4>
<p class="mt-1 text-sm text-yellow-800 dark:text-yellow-200">
Pulse has sent the stop command. Once an agent acknowledges (or goes offline), the entry will disappear automatically.
</p>
<div class="flex-1 space-y-3">
<div>
<h4 class="text-sm font-semibold text-yellow-900 dark:text-yellow-100">
Stopping {pendingHosts().length} host{pendingHosts().length !== 1 ? 's' : ''}
</h4>
<p class="mt-1 text-sm text-yellow-800 dark:text-yellow-200">
                        Pulse has the stop command in flight. You can keep working; these hosts will disappear automatically once the agent shuts down or misses its next heartbeat.
</p>
</div>
<div class="space-y-2">
<For each={pendingHosts()}>
{(host) => {
const label = getDisplayName(host);
const statusInfo = getRemovalStatusInfo(host) ?? {
label: 'Marked for uninstall; waiting for agent confirmation.',
tone: 'info' as RemovalStatusTone,
};
const status = host.command?.status ?? (host.pendingUninstall ? 'pending' : 'unknown');
const isOnline = host.status?.toLowerCase() === 'online';
const lastSeenLabel = host.lastSeen ? formatRelativeTime(host.lastSeen) : 'Awaiting first report';
const badgeText =
status === 'completed'
? 'Agent stopped'
: status === 'acknowledged'
? 'Acknowledged'
: status === 'dispatched'
? 'Dispatched'
: status === 'queued'
? 'Queued'
: 'Pending';
return (
<div class="rounded-lg border border-yellow-200 bg-white/80 p-3 shadow-sm dark:border-yellow-700/40 dark:bg-yellow-900/20">
<div class="flex flex-col gap-2 sm:flex-row sm:items-start sm:justify-between">
<div>
<p class="text-sm font-semibold text-gray-900 dark:text-gray-100">{label}</p>
<p class="text-xs text-gray-500 dark:text-gray-400">{host.hostname || host.id}</p>
<span class={removalBadgeClassMap[statusInfo.tone]}>{badgeText}</span>
</div>
<div class="text-[11px] text-gray-500 dark:text-gray-400 sm:text-right">
Last seen {lastSeenLabel}
</div>
</div>
<p class={`mt-2 text-xs ${removalTextClassMap[statusInfo.tone]}`}>{statusInfo.label}</p>
<div class="mt-3 flex flex-col gap-2 sm:flex-row sm:items-center sm:gap-3">
<button
type="button"
class="inline-flex items-center justify-center gap-2 rounded bg-blue-600 px-3 py-1.5 text-xs font-semibold text-white transition-colors hover:bg-blue-700 disabled:cursor-not-allowed disabled:opacity-60 dark:bg-blue-500 dark:hover:bg-blue-400"
onClick={() => openRemoveModal(host)}
>
View progress
</button>
<Show when={!isOnline || status === 'failed' || status === 'expired'}>
<button
type="button"
class="inline-flex items-center justify-center gap-2 rounded border border-blue-500/40 px-3 py-1.5 text-xs font-medium text-blue-700 transition-colors hover:bg-blue-50 dark:border-blue-400/40 dark:text-blue-200 dark:hover:bg-blue-500/20"
onClick={() => handleCleanupOfflineHost(host.id, label)}
disabled={isRemovingHost(host.id)}
>
{isRemovingHost(host.id) ? 'Cleaning up…' : 'Force remove now'}
</button>
</Show>
</div>
</div>
);
}}
</For>
</div>
</div>
</div>
</div>
@@ -1013,17 +1316,21 @@ WantedBy=multi-user.target`;
const isOnline = host.status?.toLowerCase() === 'online';
const displayName = getDisplayName(host);
const commandStatus = host.command?.status ?? null;
const removalStatusInfo = getRemovalStatusInfo(host);
const commandInProgress =
commandStatus === 'queued' ||
commandStatus === 'dispatched' ||
commandStatus === 'acknowledged';
const commandFailed = commandStatus === 'failed';
const commandCompleted = commandStatus === 'completed';
const offlineActionLabel = commandFailed
? 'Force remove host'
: host.pendingUninstall
? 'Clean up pending host'
: 'Remove offline host';
const offlineActionLabel =
commandFailed || commandStatus === 'expired'
? 'Force remove host'
: removalStatusInfo?.tone === 'success'
? 'Remove host'
: host.pendingUninstall
? 'Skip wait and remove now'
: 'Remove offline host';
const tokenRevoked = typeof host.tokenRevokedAt === 'number';
const tokenRevokedRelative = tokenRevoked ? formatRelativeTime(host.tokenRevokedAt!) : '';
@@ -1088,12 +1395,14 @@ WantedBy=multi-user.target`;
</span>
</Show>
</div>
<Show when={host.pendingUninstall}>
<div class="mt-2 text-xs text-gray-500 dark:text-gray-400">
Stop command queued; waiting for acknowledgement.
</div>
                        <Show when={removalStatusInfo} keyed>
                          {(info) => (
                            <div class={`mt-2 text-xs ${removalTextClassMap[info.tone]}`}>
                              {info.label}
                            </div>
                          )}
                        </Show>
</td>
</td>
<td class="py-3 px-4 align-top">
<div class="text-sm font-medium text-gray-900 dark:text-gray-100">
{host.dockerVersion ? `Docker ${host.dockerVersion}` : 'Docker version unavailable'}


@@ -94,6 +94,25 @@ const commandsByPlatform: Record<
},
};
const computeHostStaleness = (host: Host) => {
const intervalSeconds = host.intervalSeconds && host.intervalSeconds > 0 ? host.intervalSeconds : 30;
const staleThresholdMs = Math.max(intervalSeconds * 1000 * 3, 60_000);
const lastSeenValue =
typeof host.lastSeen === 'number'
? host.lastSeen
: Number.isFinite(Number(host.lastSeen))
? Number(host.lastSeen)
: NaN;
const lastSeenMs = Number.isFinite(lastSeenValue) ? lastSeenValue : null;
const isStale = lastSeenMs === null ? true : Date.now() - lastSeenMs >= staleThresholdMs;
return {
isStale,
lastSeenMs,
staleThresholdMs,
};
};
export const HostAgents: Component = () => {
const { state } = useWebSocket();
@@ -111,6 +130,13 @@ export const HostAgents: Component = () => {
const [lookupLoading, setLookupLoading] = createSignal(false);
const [highlightedHostId, setHighlightedHostId] = createSignal<string | null>(null);
let highlightTimer: ReturnType<typeof setTimeout> | null = null;
const [showRemoveModal, setShowRemoveModal] = createSignal(false);
const [hostToRemoveId, setHostToRemoveId] = createSignal<string | null>(null);
const [removeActionLoading, setRemoveActionLoading] = createSignal<'remove' | null>(null);
const [uninstallCommandCopied, setUninstallCommandCopied] = createSignal(false);
const [uninstallCommandCopiedAt, setUninstallCommandCopiedAt] = createSignal<number | null>(null);
const [hostRemovalCountdownSeconds, setHostRemovalCountdownSeconds] = createSignal<number | null>(null);
const [uninstallConfirmed, setUninstallConfirmed] = createSignal(false);
createEffect(() => {
if (requiresToken()) {
@@ -146,6 +172,129 @@ export const HostAgents: Component = () => {
return value === 'online' || value === 'running' || value === 'healthy';
};
const hostToRemove = createMemo(() => {
const id = hostToRemoveId();
if (!id) return null;
return allHosts().find((host) => host.id === id) ?? null;
});
const hostRemovalDisplayName = () => {
const host = hostToRemove();
return host ? host.displayName || host.hostname || host.id : '';
};
const hostRemovalPlatform = createMemo(() => hostToRemove()?.platform?.toLowerCase() || '');
const hostRemovalStatus = createMemo(() => {
const host = hostToRemove();
return (host?.status || 'unknown').toLowerCase();
});
const hostRemovalStatusLabel = () => hostToRemove()?.status || 'unknown';
const hostRemovalIsOnline = createMemo(() => {
const status = hostRemovalStatus();
return status === 'online' || status === 'running' || status === 'healthy';
});
const hostRemovalStaleness = createMemo(() => {
const host = hostToRemove();
if (!host) {
return { isStale: false, lastSeenMs: null as number | null, staleThresholdMs: 90_000 };
}
return computeHostStaleness(host);
});
const hostRemovalIsStale = createMemo(() => hostRemovalStaleness().isStale);
const hostRemovalLastSeen = createMemo(() => {
const { lastSeenMs } = hostRemovalStaleness();
if (!lastSeenMs) return null;
return {
relative: formatRelativeTime(lastSeenMs),
absolute: formatAbsoluteTime(lastSeenMs),
};
});
const hostRemovalStaleThresholdSeconds = createMemo(() => {
const threshold = hostRemovalStaleness().staleThresholdMs;
return Math.max(Math.round(threshold / 1000), 0);
});
const hostRemovalUninstallCommand = createMemo(() => getHostUninstallCommand(hostToRemove()));
const hostRemovalUninstallNote = () => {
const platform = hostRemovalPlatform();
if (platform === 'macos' || platform === 'darwin' || platform === 'mac') {
return 'Unloads the launch agent, removes the plist, deletes the binary, and clears the local log.';
}
if (platform === 'windows' || platform === 'win32' || platform === 'windows_nt') {
return 'Stops the Windows service, removes it, and deletes the installed binary and log. Run from an elevated PowerShell window.';
}
return 'Stops the agent, removes the systemd unit, deletes the binary, and reloads systemd.';
};
const formatCountdown = (seconds: number | null) => {
if (seconds === null || seconds < 0) return null;
if (seconds === 0) return 'any moment now';
const mins = Math.floor(seconds / 60);
const secs = seconds % 60;
if (mins <= 0) {
return `${secs}s`;
}
if (secs === 0) {
return `${mins}m`;
}
return `${mins}m ${secs}s`;
};
const countdownLabel = createMemo(() => formatCountdown(hostRemovalCountdownSeconds()));
const canRemoveHost = createMemo(() => uninstallCommandCopied() && uninstallConfirmed());
const openRemoveModal = (host: Host) => {
setHostToRemoveId(host.id);
setShowRemoveModal(true);
setRemoveActionLoading(null);
setUninstallCommandCopied(false);
setUninstallCommandCopiedAt(null);
setHostRemovalCountdownSeconds(null);
setUninstallConfirmed(false);
};
const closeRemoveModal = () => {
setShowRemoveModal(false);
setHostToRemoveId(null);
setRemoveActionLoading(null);
setUninstallCommandCopied(false);
setUninstallCommandCopiedAt(null);
setHostRemovalCountdownSeconds(null);
setUninstallConfirmed(false);
};
const performHostRemoval = async () => {
const host = hostToRemove();
if (!host || removeActionLoading()) return;
setRemoveActionLoading('remove');
const displayName = host.displayName || host.hostname || host.id;
try {
await MonitoringAPI.deleteHostAgent(host.id);
notificationStore.success(`Host "${displayName}" removed`, 4000);
closeRemoveModal();
} catch (error) {
console.error('Failed to remove host agent', error);
const message = error instanceof Error ? error.message : 'Failed to remove host. Please try again.';
notificationStore.error(message, 6000);
} finally {
setRemoveActionLoading(null);
}
};
const handleRemoveHost = () => {
void performHostRemoval();
};
createEffect(() => {
const current = lookupResult();
if (!current) return;
@@ -196,6 +345,48 @@ export const HostAgents: Component = () => {
setLookupError(null);
});
createEffect(() => {
if (!showRemoveModal()) return;
const id = hostToRemoveId();
const host = hostToRemove();
if (id && !host) {
closeRemoveModal();
}
});
createEffect(() => {
if (!showRemoveModal()) {
setHostRemovalCountdownSeconds(null);
return;
}
const updateCountdown = () => {
const host = hostToRemove();
if (!host) {
setHostRemovalCountdownSeconds(null);
return;
}
const { lastSeenMs, staleThresholdMs } = computeHostStaleness(host);
if (!lastSeenMs) {
setHostRemovalCountdownSeconds(null);
return;
}
const elapsed = Date.now() - lastSeenMs;
const remaining = staleThresholdMs - elapsed;
setHostRemovalCountdownSeconds(remaining > 0 ? Math.ceil(remaining / 1000) : 0);
};
updateCountdown();
const interval = setInterval(updateCountdown, 1000);
return () => {
clearInterval(interval);
setHostRemovalCountdownSeconds(null);
};
});
onMount(() => {
if (typeof window === 'undefined') {
return;
@@ -348,12 +539,30 @@ User=root
[Install]
WantedBy=multi-user.target`;
const getManualUninstallCommand = () =>
`sudo systemctl stop pulse-host-agent && \\
function getManualUninstallCommand(): string {
return `sudo systemctl stop pulse-host-agent && \\
sudo systemctl disable pulse-host-agent && \\
sudo rm -f /etc/systemd/system/pulse-host-agent.service && \\
sudo rm -f /usr/local/bin/pulse-host-agent && \\
sudo systemctl daemon-reload`;
}
function getHostUninstallCommand(host: Host | null): string {
const platform = host?.platform?.toLowerCase();
if (platform === 'macos' || platform === 'darwin' || platform === 'mac') {
return `launchctl unload ~/Library/LaunchAgents/com.pulse.host-agent.plist >/dev/null 2>&1 || true && \\
rm -f ~/Library/LaunchAgents/com.pulse.host-agent.plist && \\
sudo rm -f /usr/local/bin/pulse-host-agent && \\
rm -f ~/Library/Logs/pulse-host-agent.log`;
}
if (platform === 'windows' || platform === 'win32' || platform === 'windows_nt') {
    return `Stop-Service -Name PulseHostAgent -ErrorAction SilentlyContinue;
sc.exe delete PulseHostAgent;
Remove-Item 'C:\\\\Program Files\\\\Pulse\\\\pulse-host-agent.exe' -Force -ErrorAction SilentlyContinue;
Remove-Item "$env:ProgramData\\\\Pulse\\\\pulse-host-agent.log" -Force -ErrorAction SilentlyContinue`;
}
return getManualUninstallCommand();
}
return (
<div class="space-y-6">
@@ -697,22 +906,15 @@ sudo systemctl daemon-reload`;
<tbody class="divide-y divide-gray-200 dark:divide-gray-700">
<For each={allHosts()}>
{(host) => {
const [isDeleting, setIsDeleting] = createSignal(false);
const staleness = computeHostStaleness(host);
const isStale = staleness.isStale;
const tokenRevokedAt = host.tokenRevokedAt;
const tokenRevoked = typeof tokenRevokedAt === 'number';
const lastSeenMs = host.lastSeen ? new Date(host.lastSeen).getTime() : null;
const expectedIntervalMs =
(host.intervalSeconds && host.intervalSeconds > 0 ? host.intervalSeconds : 30) * 1000;
const staleThresholdMs = Math.max(expectedIntervalMs * 3, 60_000);
const isStale =
lastSeenMs === null || Date.now() - lastSeenMs >= staleThresholdMs;
const status = (host.status || 'unknown').toLowerCase();
const isOnline =
status === 'online' ||
status === 'running' ||
status === 'healthy';
status === 'online' || status === 'running' || status === 'healthy';
const isHighlighted = highlightedHostId() === host.id;
const isRemovingThisHost = hostToRemoveId() === host.id && removeActionLoading() !== null;
const baseRowClass = isStale
? 'bg-gray-50 dark:bg-gray-800/50 opacity-60'
@@ -724,33 +926,8 @@ sudo systemctl daemon-reload`;
? 'ring-2 ring-blue-500/70 dark:ring-blue-400/70 shadow-md'
: '';
const handleDelete = async () => {
if (!confirm(`Remove host "${host.displayName || host.hostname || host.id}"?\n\nThis will remove the host from Pulse monitoring. The host agent will re-register if it continues to report.`)) {
return;
}
setIsDeleting(true);
try {
const response = await fetch(`/api/agents/host/${host.id}`, {
method: 'DELETE',
credentials: 'include',
});
if (!response.ok) {
const errorData = await response.json();
throw new Error(errorData.message || 'Failed to delete host');
}
notificationStore.success(`Host "${host.displayName || host.hostname}" removed`, 4000);
} catch (err) {
console.error('Failed to delete host:', err);
notificationStore.error(
err instanceof Error ? err.message : 'Failed to delete host. Please try again.',
6000,
);
} finally {
setIsDeleting(false);
}
const handleRemoveClick = () => {
openRemoveModal(host);
};
return (
@@ -828,16 +1005,16 @@ sudo systemctl daemon-reload`;
<td class="py-3 px-4 text-right">
<button
type="button"
onClick={handleDelete}
disabled={isDeleting() || !isStale}
onClick={handleRemoveClick}
disabled={isRemovingThisHost}
class="inline-flex items-center gap-1 px-2 py-1 text-xs font-medium text-red-600 dark:text-red-400 hover:bg-red-50 dark:hover:bg-red-900/20 rounded transition-colors disabled:opacity-50 disabled:cursor-not-allowed"
title={
isStale
? 'Remove this stale host entry from the inventory'
: 'Host is still reporting — stop the agent before removing'
: 'Host is still reporting — review the removal steps first'
}
>
{isDeleting() ? (
{isRemovingThisHost ? (
<>
<svg class="animate-spin h-3 w-3" fill="none" viewBox="0 0 24 24">
<circle class="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" stroke-width="4" />
@@ -848,7 +1025,7 @@ sudo systemctl daemon-reload`;
) : (
<>
<svg class="h-3 w-3" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
                            <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 7l-.867 12.142A2 2 0 0116.138 21H7.862a2 2 0 01-1.995-1.858L5 7m5 4v6m4-6v6m1-10V4a1 1 0 00-1-1h-4a1 1 0 00-1 1v3M4 7h16" />
</svg>
<span>Remove</span>
</>
@@ -865,6 +1042,181 @@ sudo systemctl daemon-reload`;
</Show>
</div>
</Card>
<Show when={showRemoveModal()}>
<div class="fixed inset-0 z-50 flex items-center justify-center bg-black/50 p-4">
<div class="w-full max-w-2xl rounded-lg bg-white p-6 shadow-xl dark:bg-gray-800">
<div class="space-y-2">
<h3 class="text-lg font-semibold text-gray-900 dark:text-gray-100">
Remove host "{hostRemovalDisplayName()}"
</h3>
<p class="text-sm text-gray-600 dark:text-gray-400">
Walk through uninstalling the agent and cleaning up the entry in Pulse.
</p>
</div>
<div class="mt-4 space-y-4">
<div class="space-y-3 rounded-lg border border-blue-200 bg-blue-50 p-4 dark:border-blue-800 dark:bg-blue-900/20">
<div class="flex items-start gap-3">
<svg class="w-5 h-5 text-blue-600 dark:text-blue-400 mt-0.5 flex-shrink-0" fill="none" viewBox="0 0 24 24" stroke="currentColor">
<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
</svg>
<div class="flex-1 space-y-2">
<h4 class="text-sm font-semibold text-blue-900 dark:text-blue-100">Step 1 · Stop the agent on {hostRemovalDisplayName()}</h4>
<p class="text-sm text-blue-800 dark:text-blue-200">
Copy the tailored uninstall script below, run it on the host, then confirm once the command finishes. It runs silently, so no terminal output is expected.
</p>
</div>
</div>
<div class="space-y-2 rounded border border-blue-200 bg-white p-3 text-xs text-blue-800 dark:border-blue-700 dark:bg-blue-800/20 dark:text-blue-200">
<div class="flex items-center justify-between gap-2">
<span class="font-semibold uppercase tracking-wide text-[11px] text-blue-600 dark:text-blue-300">Manual uninstall</span>
<button
type="button"
onClick={async () => {
const command = hostRemovalUninstallCommand();
if (!command) return;
const success = await copyToClipboard(command);
if (success) {
setUninstallCommandCopied(true);
setUninstallCommandCopiedAt(Date.now());
}
if (typeof window !== 'undefined' && window.showToast) {
window.showToast(success ? 'success' : 'error', success ? 'Copied!' : 'Failed to copy');
}
}}
class="inline-flex items-center gap-2 rounded bg-blue-600 px-3 py-1.5 text-[11px] font-semibold text-white transition-colors hover:bg-blue-700 dark:bg-blue-500 dark:hover:bg-blue-400"
>
<svg class="h-3 w-3" viewBox="0 0 20 20" fill="currentColor" aria-hidden="true">
<path d="M4 4a2 2 0 012-2h5a2 2 0 012 2v2h-2V4H6v10h5v-2h2v2a2 2 0 01-2 2H6a2 2 0 01-2-2V4z" />
<path d="M14.293 7.293a1 1 0 011.414 0L18 9.586l-2.293 2.293a1 1 0 01-1.414-1.415L14.586 10H9a1 1 0 110-2h5.586l-.293-.293a1 1 0 010-1.414z" />
</svg>
{uninstallCommandCopied() ? 'Copied' : 'Copy command'}
</button>
</div>
<code class="block overflow-x-auto rounded bg-gray-900 px-3 py-2 font-mono text-xs text-gray-100 dark:bg-gray-950 whitespace-pre-wrap">
{hostRemovalUninstallCommand()}
</code>
<p class="text-[11px] leading-snug">{hostRemovalUninstallNote()}</p>
<Show when={uninstallCommandCopied()}>
<div class="space-y-2 rounded border border-blue-200 bg-white p-3 text-[11px] text-blue-800 dark:border-blue-700 dark:bg-blue-800/20 dark:text-blue-200">
<p class="font-medium">Command copied.</p>
<p>This script runs silently; no CLI output is expected. Once it completes, mark it finished below.</p>
<Show when={uninstallCommandCopiedAt()}>
<p class="text-blue-700/80 dark:text-blue-200/80">Copied {formatRelativeTime(uninstallCommandCopiedAt()!)}.</p>
</Show>
<button
type="button"
onClick={() => setUninstallConfirmed(true)}
disabled={uninstallConfirmed()}
class={`inline-flex items-center gap-2 rounded px-3 py-1.5 text-[11px] font-semibold transition-colors ${uninstallConfirmed() ? 'bg-blue-100 text-blue-500 cursor-default dark:bg-blue-900/30 dark:text-blue-200' : 'bg-blue-600 text-white hover:bg-blue-700 dark:bg-blue-500 dark:hover:bg-blue-400'}`}
>
<svg class="h-3.5 w-3.5" viewBox="0 0 20 20" fill="currentColor" aria-hidden="true">
<path fill-rule="evenodd" d="M10 18a8 8 0 100-16 8 8 0 000 16zm3.707-9.707a1 1 0 00-1.414-1.414L9 10.172 7.707 8.879A1 1 0 006.293 10.293l2 2a1 1 0 001.414 0l3-3z" clip-rule="evenodd" />
</svg>
{uninstallConfirmed() ? 'Marked complete' : 'I ran this command'}
</button>
<Show when={!uninstallConfirmed()}>
<p>Click once you have run the script on {hostRemovalDisplayName()}.</p>
</Show>
</div>
</Show>
</div>
</div>
<div class="space-y-3 rounded-lg border border-gray-200 bg-white p-4 shadow-sm dark:border-gray-700 dark:bg-gray-900">
<div class="flex items-center justify-between text-xs text-gray-600 dark:text-gray-300">
<span class="font-semibold uppercase tracking-wide text-[11px] text-gray-500 dark:text-gray-400">Host status</span>
<span
class={`inline-flex items-center gap-1 rounded-full px-2 py-0.5 text-[11px] font-semibold uppercase ${
hostRemovalIsOnline()
? 'bg-emerald-100 text-emerald-700 dark:bg-emerald-900/40 dark:text-emerald-200'
: 'bg-gray-200 text-gray-700 dark:bg-gray-700 dark:text-gray-200'
}`}
>
{hostRemovalStatusLabel()}
</span>
</div>
<div class="text-xs text-gray-600 dark:text-gray-300">
<div class="flex items-center justify-between">
<span class="font-medium">Last heartbeat</span>
<span class="text-gray-700 dark:text-gray-200">
{hostRemovalLastSeen()?.relative ?? 'No reports yet'}
</span>
</div>
<Show when={hostRemovalLastSeen()}>
<div class="text-[11px] text-gray-500 dark:text-gray-400 text-right">
{hostRemovalLastSeen()?.absolute}
</div>
</Show>
</div>
<Show when={!hostRemovalIsStale()}>
<div class="rounded border border-yellow-200 bg-yellow-50 p-3 text-xs text-yellow-800 dark:border-yellow-700 dark:bg-yellow-900/20 dark:text-yellow-200">
<p class="font-semibold">Host still reporting</p>
<p class="mt-1 leading-snug">
Pulse revokes the host's API token as soon as you remove it. After the uninstall script stops the service, the next heartbeat will fail and the agent will disappear within about {hostRemovalStaleThresholdSeconds()} seconds.
</p>
<Show when={countdownLabel()}>
<p class="mt-2 rounded bg-yellow-100/60 px-2 py-1 text-[11px] font-medium text-yellow-800 dark:bg-yellow-900/40 dark:text-yellow-100">
Waiting for the next missed heartbeat ({countdownLabel()}).
</p>
</Show>
</div>
</Show>
<Show when={hostRemovalIsStale()}>
<div class="rounded border border-emerald-200 bg-emerald-50 p-3 text-xs text-emerald-800 dark:border-emerald-700 dark:bg-emerald-900/20 dark:text-emerald-200">
<p class="flex items-center gap-1 font-semibold">
<svg class="h-3.5 w-3.5" viewBox="0 0 20 20" fill="currentColor" aria-hidden="true">
<path fill-rule="evenodd" d="M10 18a8 8 0 100-16 8 8 0 000 16zm3.707-9.707a1 1 0 00-1.414-1.414L9 10.172 7.707 8.879A1 1 0 006.293 10.293l2 2a1 1 0 001.414 0l3-3z" clip-rule="evenodd" />
</svg>
Host offline
</p>
<p class="mt-1 leading-snug">
Pulse no longer receives heartbeats from {hostRemovalDisplayName()}. It's safe to remove the entry now.
</p>
</div>
</Show>
</div>
</div>
<div class="mt-6 flex flex-col gap-3 sm:flex-row sm:items-center sm:justify-between">
<button
type="button"
onClick={closeRemoveModal}
class="self-start rounded-lg px-4 py-2 text-sm font-medium text-gray-700 transition-colors hover:bg-gray-100 dark:text-gray-300 dark:hover:bg-gray-700"
>
Close
</button>
<Show
when={canRemoveHost()}
fallback={
<div class="max-w-sm text-xs text-gray-500 dark:text-gray-400">
<p class="font-semibold text-gray-600 dark:text-gray-200">Run the uninstall script and mark it complete above.</p>
<p class="mt-1">Once confirmed, Pulse will enable the final removal step.</p>
</div>
}
>
<div class="flex flex-col gap-2 sm:flex-row sm:items-center sm:gap-3">
<button
type="button"
onClick={handleRemoveHost}
disabled={removeActionLoading() !== null}
class="rounded bg-red-600 px-4 py-2 text-sm font-semibold text-white transition-colors hover:bg-red-700 disabled:cursor-not-allowed disabled:opacity-60 dark:bg-red-500 dark:hover:bg-red-400"
>
{removeActionLoading() === 'remove' ? 'Removing…' : 'Remove host'}
</button>
<div class="text-xs text-gray-500 dark:text-gray-400">
<p>
Removing a host revokes its API token so it cannot register again. With the service already uninstalled, Pulse just needs one more missed heartbeat to clear the row automatically.
</p>
</div>
</div>
</Show>
</div>
</div>
</div>
</Show>
</div>
);
};

File diff suppressed because it is too large


@ -0,0 +1,218 @@
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { render, fireEvent, screen, waitFor, cleanup } from '@solidjs/testing-library';
import { createStore } from 'solid-js/store';
import { DockerAgents } from '../DockerAgents';
import type { DockerHost, RemovedDockerHost } from '@/types/api';
let mockWsStore: {
state: { dockerHosts: DockerHost[]; removedDockerHosts: RemovedDockerHost[] };
connected: () => boolean;
reconnecting: () => boolean;
activeAlerts: unknown[];
};
const allowDockerHostReenrollMock = vi.fn();
const deleteDockerHostMock = vi.fn();
const unhideDockerHostMock = vi.fn();
const notificationSuccessMock = vi.fn();
const notificationErrorMock = vi.fn();
const fetchMock = vi.fn();
vi.mock('@/App', () => ({
useWebSocket: () => mockWsStore,
}));
vi.mock('@/api/monitoring', () => ({
MonitoringAPI: {
allowDockerHostReenroll: (...args: unknown[]) => allowDockerHostReenrollMock(...args),
deleteDockerHost: (...args: unknown[]) => deleteDockerHostMock(...args),
unhideDockerHost: (...args: unknown[]) => unhideDockerHostMock(...args),
},
}));
vi.mock('@/api/security', () => ({
SecurityAPI: {
createToken: vi.fn(),
},
}));
vi.mock('@/stores/notifications', () => ({
notificationStore: {
success: (...args: unknown[]) => notificationSuccessMock(...args),
error: (...args: unknown[]) => notificationErrorMock(...args),
},
}));
const createDockerHost = (overrides?: Partial<DockerHost>): DockerHost => ({
id: 'host-1',
agentId: 'agent-1',
hostname: 'host-1.local',
displayName: 'Host One',
cpus: 4,
totalMemoryBytes: 8 * 1024 * 1024 * 1024,
uptimeSeconds: 12_345,
status: 'online',
lastSeen: Date.now(),
intervalSeconds: 30,
containers: [],
...overrides,
});
const createRemovedHost = (overrides?: Partial<RemovedDockerHost>): RemovedDockerHost => ({
id: 'host-removed',
hostname: 'retired-node.local',
displayName: 'Retired Node',
removedAt: Date.now() - 60_000,
...overrides,
});
const setupComponent = (hosts: DockerHost[], removedHosts: RemovedDockerHost[] = []) => {
const [state] = createStore({
dockerHosts: hosts,
removedDockerHosts: removedHosts,
});
mockWsStore = {
state,
connected: () => true,
reconnecting: () => false,
activeAlerts: [],
};
return render(() => <DockerAgents />);
};
beforeEach(() => {
allowDockerHostReenrollMock.mockReset();
deleteDockerHostMock.mockReset();
unhideDockerHostMock.mockReset();
notificationSuccessMock.mockReset();
notificationErrorMock.mockReset();
fetchMock.mockReset();
fetchMock.mockResolvedValue(
new Response(JSON.stringify({ requiresAuth: true, apiTokenConfigured: false }), {
status: 200,
headers: { 'Content-Type': 'application/json' },
}),
);
vi.stubGlobal('fetch', fetchMock);
});
afterEach(() => {
cleanup();
vi.unstubAllGlobals();
});
describe('DockerAgents removed hosts', () => {
it('renders removed hosts card and triggers allow/copy actions', async () => {
allowDockerHostReenrollMock.mockResolvedValue(undefined);
const clipboardSpy = vi.fn().mockResolvedValue(undefined);
    vi.stubGlobal('navigator', { clipboard: { writeText: clipboardSpy } } as unknown as Navigator);
setupComponent([], [createRemovedHost()]);
expect(screen.getByText('Recently removed Docker hosts')).toBeInTheDocument();
const allowButton = screen.getByRole('button', { name: 'Allow re-enroll' });
fireEvent.click(allowButton);
await waitFor(() => expect(allowDockerHostReenrollMock).toHaveBeenCalledWith('host-removed'), { interval: 0 });
const copyButton = screen.getByRole('button', { name: 'Copy curl command' });
fireEvent.click(copyButton);
await waitFor(() => expect(clipboardSpy).toHaveBeenCalledTimes(1), { interval: 0 });
const copiedCommand = clipboardSpy.mock.calls.at(-1)?.[0];
expect(typeof copiedCommand).toBe('string');
expect(copiedCommand).toContain('/api/agents/docker/hosts/host-removed/allow-reenroll');
expect(notificationSuccessMock).toHaveBeenCalled();
expect(notificationErrorMock).not.toHaveBeenCalled();
});
it('shows progress stepper for an in-progress removal command', async () => {
const now = Date.now();
const host = createDockerHost({
command: {
id: 'cmd-1',
type: 'stop',
status: 'acknowledged',
createdAt: now - 60_000,
updatedAt: now - 20_000,
dispatchedAt: now - 40_000,
acknowledgedAt: now - 10_000,
},
});
setupComponent([host]);
const removeButton = screen.getByRole('button', { name: 'Remove' });
fireEvent.click(removeButton);
await screen.findByText('Remove Docker host "Host One"');
expect(screen.getByText('acknowledged')).toBeInTheDocument();
const progressHeading = screen.getByText('Progress');
const progressCard = progressHeading.parentElement?.parentElement;
if (!progressCard) {
throw new Error('Progress card not found');
}
const steps = Array.from(progressCard.querySelectorAll('li'));
expect(steps).toHaveLength(4);
expect(steps[0]).toHaveTextContent('Stop command queued');
expect(steps[1]).toHaveTextContent('Instruction delivered to the agent');
expect(steps[2]).toHaveTextContent('Agent acknowledged the stop request');
expect(steps[3]).toHaveTextContent('Agent disabled the service and removed autostart');
const indicators = steps.map((step) => step.querySelector('span'));
expect(indicators[0]?.className).toContain('bg-blue-500');
expect(indicators[1]?.className).toContain('bg-blue-500');
expect(indicators[2]?.className).toContain('animate-pulse');
expect(indicators[3]?.className).toContain('bg-gray-300');
});
it('shows waiting message once the agent has already stopped', async () => {
const host = createDockerHost({
status: 'online',
pendingUninstall: true,
command: undefined,
lastSeen: Date.now() - 45_000,
});
setupComponent([host]);
const viewButton = await screen.findByRole('button', { name: 'View progress' });
fireEvent.click(viewButton);
await screen.findByText('Pulse is waiting for', { exact: false });
const stopButton = screen.getByRole('button', { name: 'Waiting for host…' });
expect(stopButton).toBeDisabled();
expect(screen.queryByText('Progress')).not.toBeInTheDocument();
});
it('shows confirmation while waiting for the agent to acknowledge the stop command', async () => {
deleteDockerHostMock.mockResolvedValue({});
const host = createDockerHost();
setupComponent([host]);
const removeButton = screen.getByRole('button', { name: 'Remove' });
fireEvent.click(removeButton);
await screen.findByText('Remove Docker host "Host One"');
const stopButton = screen.getByRole('button', { name: 'Stop agent now' });
fireEvent.click(stopButton);
await waitFor(() => expect(deleteDockerHostMock).toHaveBeenCalledWith('host-1'), { interval: 0 });
const waitingButton = await screen.findByRole('button', { name: 'Waiting for agent…' });
expect(waitingButton).toBeDisabled();
expect(screen.getByText('Stop command sent.')).toBeInTheDocument();
expect(notificationSuccessMock).toHaveBeenCalledWith('Stop command sent to Host One', 3500);
});
});


@ -13,6 +13,11 @@ let mockWsStore: {
const lookupMock = vi.fn();
const createTokenMock = vi.fn();
const deleteHostAgentMock = vi.fn();
const notificationSuccessMock = vi.fn();
const notificationErrorMock = vi.fn();
const notificationInfoMock = vi.fn();
const clipboardSpy = vi.fn();
vi.mock('@/App', () => ({
useWebSocket: () => mockWsStore,
@ -21,6 +26,7 @@ vi.mock('@/App', () => ({
vi.mock('@/api/monitoring', () => ({
MonitoringAPI: {
lookupHost: (...args: unknown[]) => lookupMock(...args),
deleteHostAgent: (...args: unknown[]) => deleteHostAgentMock(...args),
},
}));
@ -30,6 +36,14 @@ vi.mock('@/api/security', () => ({
},
}));
vi.mock('@/stores/notifications', () => ({
notificationStore: {
success: (...args: unknown[]) => notificationSuccessMock(...args),
error: (...args: unknown[]) => notificationErrorMock(...args),
info: (...args: unknown[]) => notificationInfoMock(...args),
},
}));
const createHost = (overrides?: Partial<Host>): Host => ({
id: 'host-1',
hostname: 'host-1.local',
@ -69,9 +83,29 @@ const createHost = (overrides?: Partial<Host>): Host => ({
const stubFetchSuccess = vi.fn();
const setupComponent = (hosts: Host[]) => {
const [state] = createStore({
hosts,
connectionHealth: {},
});
mockWsStore = {
state,
connected: () => true,
reconnecting: () => false,
activeAlerts: [],
};
return render(() => <HostAgents />);
};
beforeEach(() => {
lookupMock.mockReset();
createTokenMock.mockReset();
deleteHostAgentMock.mockReset();
notificationSuccessMock.mockReset();
notificationErrorMock.mockReset();
notificationInfoMock.mockReset();
stubFetchSuccess.mockImplementation(
async () =>
new Response(JSON.stringify({ requiresAuth: true, apiTokenConfigured: false }), {
@ -80,6 +114,8 @@ beforeEach(() => {
}),
);
vi.stubGlobal('fetch', stubFetchSuccess);
clipboardSpy.mockReset();
vi.stubGlobal('navigator', { clipboard: { writeText: clipboardSpy } } as unknown as Navigator);
});
afterEach(() => {
@ -88,22 +124,6 @@ afterEach(() => {
});
describe('HostAgents lookup flow', () => {
const setupComponent = (hosts: Host[]) => {
const [state] = createStore({
hosts,
connectionHealth: {},
});
mockWsStore = {
state,
connected: () => true,
reconnecting: () => false,
activeAlerts: [],
};
return render(() => <HostAgents />);
};
it('highlights a host after a successful lookup and clears highlight after timeout', async () => {
const host = createHost();
setupComponent([host]);
@ -204,3 +224,93 @@ describe('HostAgents lookup flow', () => {
expect(row.classList.contains('ring-2')).toBe(false);
});
});
describe('Host removal modal', () => {
it('removes a host while it is still reporting and explains the impact', async () => {
deleteHostAgentMock.mockResolvedValue(undefined);
const host = createHost({
lastSeen: Date.now(),
status: 'online',
});
setupComponent([host]);
const removeButton = screen.getByRole('button', { name: 'Remove' });
fireEvent.click(removeButton);
await screen.findByText('Remove host "Host One"');
const copyButton = screen.getByRole('button', { name: 'Copy command' });
fireEvent.click(copyButton);
await waitFor(() => expect(clipboardSpy).toHaveBeenCalled(), { interval: 0 });
const confirmButton = await screen.findByRole('button', { name: 'I ran this command' });
fireEvent.click(confirmButton);
expect(
screen.getByText("Pulse revokes the host's API token", {
exact: false,
}),
).toBeInTheDocument();
const removeHostButton = screen.getByRole('button', { name: 'Remove host' });
expect(removeHostButton).toBeEnabled();
fireEvent.click(removeHostButton);
await waitFor(() => expect(deleteHostAgentMock).toHaveBeenCalledWith('host-1'), { interval: 0 });
await waitFor(() => expect(notificationSuccessMock).toHaveBeenCalledWith('Host "Host One" removed', 4000), {
interval: 0,
});
await waitFor(() => expect(screen.queryByText('Remove host "Host One"')).not.toBeInTheDocument(), {
interval: 0,
});
});
it('removes a stale host without forcing', async () => {
deleteHostAgentMock.mockResolvedValue(undefined);
const host = createHost({
lastSeen: Date.now() - 5 * 60_000,
status: 'offline',
});
setupComponent([host]);
const removeButton = screen.getByRole('button', { name: 'Remove' });
fireEvent.click(removeButton);
await screen.findByText('Remove host "Host One"');
const copyButton = screen.getByRole('button', { name: 'Copy command' });
fireEvent.click(copyButton);
await waitFor(() => expect(clipboardSpy).toHaveBeenCalled(), { interval: 0 });
const confirmButton = await screen.findByRole('button', { name: 'I ran this command' });
fireEvent.click(confirmButton);
const removeHostButton = screen.getByRole('button', { name: 'Remove host' });
fireEvent.click(removeHostButton);
await waitFor(() => expect(deleteHostAgentMock).toHaveBeenCalledWith('host-1'), { interval: 0 });
await waitFor(() => expect(notificationSuccessMock).toHaveBeenCalledWith('Host "Host One" removed', 4000), {
interval: 0,
});
await waitFor(() => expect(screen.queryByText('Remove host "Host One"')).not.toBeInTheDocument(), {
interval: 0,
});
});
it('shows macOS-specific uninstall guidance', async () => {
const host = createHost({
platform: 'macos',
});
setupComponent([host]);
const removeButton = screen.getByRole('button', { name: 'Remove' });
fireEvent.click(removeButton);
await screen.findByText('Remove host "Host One"');
expect(screen.getByText('launchctl unload', { exact: false })).toBeInTheDocument();
expect(screen.getByText('Unloads the launch agent, removes the plist, deletes the binary, and clears the local log.')).toBeInTheDocument();
});
});


@ -320,7 +320,7 @@ export const extractTriggerValues = (
const result: Record<string, number> = {};
Object.entries(thresholds).forEach(([key, value]) => {
// Skip non-threshold fields
if (key === 'disabled' || key === 'disableConnectivity' || key === 'poweredOffSeverity') return;
if (key === 'disabled' || key === 'disableConnectivity' || key === 'poweredOffSeverity' || key === 'note') return;
if (typeof value === 'string') return;
result[key] = getTriggerValue(value);
});


@ -32,6 +32,7 @@ export type RawOverrideConfig = AlertThresholds & {
disabled?: boolean;
disableConnectivity?: boolean;
poweredOffSeverity?: 'warning' | 'critical';
note?: string;
// NOTE: To disable individual metrics, set threshold to -1
};


@ -5,6 +5,7 @@ export interface State {
vms: VM[];
containers: Container[];
dockerHosts: DockerHost[];
removedDockerHosts?: RemovedDockerHost[];
hosts: Host[];
replicationJobs: ReplicationJob[];
storage: Storage[];
@ -25,6 +26,13 @@ export interface State {
lastUpdate: string;
}
export interface RemovedDockerHost {
id: string;
hostname?: string;
displayName?: string;
removedAt: number;
}
export interface Node {
id: string;
name: string;
@ -271,6 +279,8 @@ export interface DockerContainerNetwork {
export interface DockerContainerBlockIO {
readBytes?: number;
writeBytes?: number;
readRateBytesPerSecond?: number;
writeRateBytesPerSecond?: number;
}
export interface DockerContainerMount {

go.mod

@ -42,6 +42,7 @@ require (
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-ole/go-ole v1.2.6 // indirect
github.com/inconshreveable/mousetrap v1.1.0 // indirect
github.com/kylelemons/godebug v1.1.0 // indirect
github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect
github.com/mattn/go-colorable v0.1.14 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect


@ -167,6 +167,7 @@ type ThresholdConfig struct {
NetworkOut *HysteresisThreshold `json:"networkOut,omitempty"`
Usage *HysteresisThreshold `json:"usage,omitempty"` // For storage devices
Temperature *HysteresisThreshold `json:"temperature,omitempty"` // For node CPU temperature
Note *string `json:"note,omitempty"`
// Legacy fields for backward compatibility
CPULegacy *float64 `json:"cpuLegacy,omitempty"`
MemoryLegacy *float64 `json:"memoryLegacy,omitempty"`
@ -7015,6 +7016,14 @@ func cloneThreshold(threshold *HysteresisThreshold) *HysteresisThreshold {
return &clone
}
func cloneStringPtr(value *string) *string {
if value == nil {
return nil
}
cloned := *value
return &cloned
}
func cloneThresholdConfig(cfg ThresholdConfig) ThresholdConfig {
clone := cfg
clone.CPU = cloneThreshold(cfg.CPU)
@ -7026,6 +7035,7 @@ func cloneThresholdConfig(cfg ThresholdConfig) ThresholdConfig {
clone.NetworkOut = cloneThreshold(cfg.NetworkOut)
clone.Temperature = cloneThreshold(cfg.Temperature)
clone.Usage = cloneThreshold(cfg.Usage)
clone.Note = cloneStringPtr(cfg.Note)
return clone
}
@ -7089,6 +7099,16 @@ func (m *Manager) applyThresholdOverride(base ThresholdConfig, override Threshol
result.Usage = ensureHysteresisThreshold(cloneThreshold(override.Usage))
}
if override.Note != nil {
note := strings.TrimSpace(*override.Note)
if note == "" {
result.Note = nil
} else {
noteCopy := note
result.Note = &noteCopy
}
}
return result
}


@ -38,6 +38,7 @@ type DiagnosticsInfo struct {
Nodes []NodeDiagnostic `json:"nodes"`
PBS []PBSDiagnostic `json:"pbs"`
System SystemDiagnostic `json:"system"`
Discovery *DiscoveryDiagnostic `json:"discovery,omitempty"`
TemperatureProxy *TemperatureProxyDiagnostic `json:"temperatureProxy,omitempty"`
APITokens *APITokenDiagnostic `json:"apiTokens,omitempty"`
DockerAgents *DockerAgentDiagnostic `json:"dockerAgents,omitempty"`
@ -51,6 +52,36 @@ type DiagnosticsInfo struct {
MemorySources []MemorySourceStat `json:"memorySources,omitempty"`
}
// DiscoveryDiagnostic summarizes discovery configuration and recent activity.
type DiscoveryDiagnostic struct {
Enabled bool `json:"enabled"`
ConfiguredSubnet string `json:"configuredSubnet,omitempty"`
ActiveSubnet string `json:"activeSubnet,omitempty"`
EnvironmentOverride string `json:"environmentOverride,omitempty"`
SubnetAllowlist []string `json:"subnetAllowlist"`
SubnetBlocklist []string `json:"subnetBlocklist"`
Scanning bool `json:"scanning"`
ScanInterval string `json:"scanInterval,omitempty"`
LastScanStartedAt string `json:"lastScanStartedAt,omitempty"`
LastResultTimestamp string `json:"lastResultTimestamp,omitempty"`
LastResultServers int `json:"lastResultServers,omitempty"`
LastResultErrors int `json:"lastResultErrors,omitempty"`
History []DiscoveryHistoryItem `json:"history,omitempty"`
}
// DiscoveryHistoryItem summarizes the outcome of a recent discovery scan.
type DiscoveryHistoryItem struct {
StartedAt string `json:"startedAt"`
CompletedAt string `json:"completedAt"`
Duration string `json:"duration"`
DurationMs int64 `json:"durationMs"`
Subnet string `json:"subnet"`
ServerCount int `json:"serverCount"`
ErrorCount int `json:"errorCount"`
BlocklistLength int `json:"blocklistLength"`
Status string `json:"status"`
}
// MemorySourceStat aggregates memory-source usage per instance.
type MemorySourceStat struct {
Instance string `json:"instance"`
@ -475,6 +506,8 @@ func (r *Router) computeDiagnostics(ctx context.Context) DiagnosticsInfo {
diag.DockerAgents = buildDockerAgentDiagnostic(r.monitor, diag.Version)
diag.Alerts = buildAlertsDiagnostic(r.monitor)
diag.Discovery = buildDiscoveryDiagnostic(r.config, r.monitor)
if r.monitor != nil {
snapshots := r.monitor.GetDiagnosticSnapshots()
if len(snapshots.Nodes) > 0 {
@ -536,6 +569,90 @@ func (r *Router) computeDiagnostics(ctx context.Context) DiagnosticsInfo {
return diag
}
func copyStringSlice(values []string) []string {
if len(values) == 0 {
return []string{}
}
return append([]string(nil), values...)
}
func buildDiscoveryDiagnostic(cfg *config.Config, monitor *monitoring.Monitor) *DiscoveryDiagnostic {
if cfg == nil {
return nil
}
discovery := &DiscoveryDiagnostic{
Enabled: cfg.DiscoveryEnabled,
ConfiguredSubnet: strings.TrimSpace(cfg.DiscoverySubnet),
EnvironmentOverride: strings.TrimSpace(cfg.Discovery.EnvironmentOverride),
SubnetAllowlist: copyStringSlice(cfg.Discovery.SubnetAllowlist),
SubnetBlocklist: copyStringSlice(cfg.Discovery.SubnetBlocklist),
}
if discovery.ConfiguredSubnet == "" {
discovery.ConfiguredSubnet = "auto"
}
if discovery.SubnetAllowlist == nil {
discovery.SubnetAllowlist = []string{}
}
if discovery.SubnetBlocklist == nil {
discovery.SubnetBlocklist = []string{}
}
if monitor != nil {
if svc := monitor.GetDiscoveryService(); svc != nil {
status := svc.GetStatus()
if val, ok := status["subnet"].(string); ok {
discovery.ActiveSubnet = val
}
if val, ok := status["is_scanning"].(bool); ok {
discovery.Scanning = val
}
if val, ok := status["interval"].(string); ok {
discovery.ScanInterval = val
}
if val, ok := status["last_scan"].(time.Time); ok && !val.IsZero() {
discovery.LastScanStartedAt = val.UTC().Format(time.RFC3339)
}
if result, updated := svc.GetCachedResult(); result != nil {
discovery.LastResultServers = len(result.Servers)
if len(result.StructuredErrors) > 0 {
discovery.LastResultErrors = len(result.StructuredErrors)
} else if len(result.Errors) > 0 {
discovery.LastResultErrors = len(result.Errors)
}
if !updated.IsZero() {
discovery.LastResultTimestamp = updated.UTC().Format(time.RFC3339)
}
}
history := svc.GetHistory(10)
if len(history) > 0 {
items := make([]DiscoveryHistoryItem, 0, len(history))
for _, entry := range history {
item := DiscoveryHistoryItem{
StartedAt: entry.startedAt.UTC().Format(time.RFC3339),
CompletedAt: entry.completedAt.UTC().Format(time.RFC3339),
Duration: entry.duration.Truncate(time.Millisecond).String(),
DurationMs: entry.duration.Milliseconds(),
Subnet: entry.subnet,
ServerCount: entry.serverCount,
ErrorCount: entry.errorCount,
BlocklistLength: entry.blocklistLength,
Status: entry.status,
}
items = append(items, item)
}
discovery.History = items
}
}
}
return discovery
}
func buildTemperatureProxyDiagnostic(cfg *config.Config, legacyDetected, recommendProxy bool) *TemperatureProxyDiagnostic {
diag := &TemperatureProxyDiagnostic{
LegacySSHDetected: legacyDetected,


@ -3197,10 +3197,35 @@ func (r *Router) handleDownloadAgent(w http.ResponseWriter, req *http.Request) {
if candidate == "" {
continue
}
if info, err := os.Stat(candidate); err == nil && !info.IsDir() {
http.ServeFile(w, req, candidate)
return
info, err := os.Stat(candidate)
if err != nil || info.IsDir() {
continue
}
file, err := os.Open(candidate)
if err != nil {
log.Error().Err(err).Str("path", candidate).Msg("Failed to open docker agent binary for download")
continue
}
hasher := sha256.New()
if _, err := io.Copy(hasher, file); err != nil {
file.Close()
log.Error().Err(err).Str("path", candidate).Msg("Failed to hash docker agent binary")
continue
}
if _, err := file.Seek(0, io.SeekStart); err != nil {
file.Close()
log.Error().Err(err).Str("path", candidate).Msg("Failed to rewind docker agent binary")
continue
}
w.Header().Set("X-Checksum-Sha256", hex.EncodeToString(hasher.Sum(nil)))
http.ServeContent(w, req, filepath.Base(candidate), info.ModTime(), file)
file.Close()
return
}
http.Error(w, "Agent binary not found", http.StatusNotFound)


@ -6,6 +6,7 @@ import (
"sync"
"time"
"github.com/prometheus/client_golang/prometheus"
"github.com/rcourtman/pulse-go-rewrite/internal/config"
"github.com/rcourtman/pulse-go-rewrite/internal/websocket"
pkgdiscovery "github.com/rcourtman/pulse-go-rewrite/pkg/discovery"
@ -14,17 +15,20 @@ import (
// Service handles background network discovery
type Service struct {
scanner *pkgdiscovery.Scanner
wsHub *websocket.Hub
cache *DiscoveryCache
interval time.Duration
subnet string
mu sync.RWMutex
lastScan time.Time
isScanning bool
stopChan chan struct{}
ctx context.Context
cfgProvider func() config.DiscoveryConfig
scanner discoveryScanner
wsHub *websocket.Hub
cache *DiscoveryCache
interval time.Duration
subnet string
mu sync.RWMutex
lastScan time.Time
isScanning bool
stopChan chan struct{}
ctx context.Context
cfgProvider func() config.DiscoveryConfig
history []historyEntry
historyLimit int
scannerFactory scannerFactory
}
// DiscoveryCache stores the latest discovery results
@ -34,6 +38,66 @@ type DiscoveryCache struct {
updated time.Time
}
type historyEntry struct {
startedAt time.Time
completedAt time.Time
subnet string
serverCount int
errorCount int
duration time.Duration
blocklistLength int
status string
}
const defaultHistoryLimit = 20
type discoveryScanner interface {
DiscoverServersWithCallbacks(ctx context.Context, subnet string, serverCallback pkgdiscovery.ServerCallback, progressCallback pkgdiscovery.ProgressCallback) (*pkgdiscovery.DiscoveryResult, error)
}
type scannerFactory func(config.DiscoveryConfig) (discoveryScanner, error)
var (
discoveryScanResults = prometheus.NewCounterVec(
prometheus.CounterOpts{
Namespace: "pulse",
Subsystem: "discovery",
Name: "scans_total",
Help: "Total number of discovery scans by result status.",
},
[]string{"result"},
)
discoveryScanDuration = prometheus.NewHistogram(
prometheus.HistogramOpts{
Namespace: "pulse",
Subsystem: "discovery",
Name: "scan_duration_seconds",
Help: "Duration of discovery scans in seconds.",
Buckets: []float64{5, 10, 20, 30, 45, 60, 90, 120, 180},
},
)
discoveryScanServers = prometheus.NewGauge(
prometheus.GaugeOpts{
Namespace: "pulse",
Subsystem: "discovery",
Name: "last_scan_servers",
Help: "Number of servers found in the most recent discovery scan.",
},
)
discoveryScanErrors = prometheus.NewGauge(
prometheus.GaugeOpts{
Namespace: "pulse",
Subsystem: "discovery",
Name: "last_scan_errors",
Help: "Number of errors encountered in the most recent discovery scan.",
},
)
)
func init() {
prometheus.MustRegister(discoveryScanResults, discoveryScanDuration, discoveryScanServers, discoveryScanErrors)
}
// NewService creates a new discovery service
func NewService(wsHub *websocket.Hub, interval time.Duration, subnet string, cfgProvider func() config.DiscoveryConfig) *Service {
if interval == 0 {
@ -48,13 +112,18 @@ func NewService(wsHub *websocket.Hub, interval time.Duration, subnet string, cfg
}
return &Service{
scanner: pkgdiscovery.NewScanner(),
wsHub: wsHub,
cache: &DiscoveryCache{},
interval: interval,
subnet: subnet,
stopChan: make(chan struct{}),
cfgProvider: cfgProvider,
scanner: pkgdiscovery.NewScanner(),
wsHub: wsHub,
cache: &DiscoveryCache{},
interval: interval,
subnet: subnet,
stopChan: make(chan struct{}),
cfgProvider: cfgProvider,
history: make([]historyEntry, 0, defaultHistoryLimit),
historyLimit: defaultHistoryLimit,
scannerFactory: func(cfg config.DiscoveryConfig) (discoveryScanner, error) {
return BuildScanner(cfg)
},
}
}
@ -97,8 +166,46 @@ func (s *Service) scanLoop() {
}
}
func (s *Service) appendHistory(entry historyEntry) {
s.mu.Lock()
defer s.mu.Unlock()
if s.historyLimit <= 0 {
s.historyLimit = defaultHistoryLimit
}
s.history = append(s.history, entry)
if len(s.history) > s.historyLimit {
offset := len(s.history) - s.historyLimit
s.history = append([]historyEntry(nil), s.history[offset:]...)
}
}
// GetHistory returns up to limit recent discovery history entries (most recent first).
func (s *Service) GetHistory(limit int) []historyEntry {
s.mu.RLock()
defer s.mu.RUnlock()
if limit <= 0 || limit > len(s.history) {
limit = len(s.history)
}
if limit == 0 {
return nil
}
result := make([]historyEntry, 0, limit)
for i := len(s.history) - 1; i >= 0 && len(result) < limit; i-- {
result = append(result, s.history[i])
}
return result
}
// performScan executes a network scan
func (s *Service) performScan() {
startTime := time.Now()
var scanErr error
var blocklistLength int
s.mu.Lock()
if s.isScanning {
s.mu.Unlock()
@ -111,16 +218,54 @@ func (s *Service) performScan() {
var result *pkgdiscovery.DiscoveryResult
defer func() {
duration := time.Since(startTime)
completedAt := time.Now()
serverCount := 0
errorCount := 0
status := "success"
if result != nil {
serverCount = len(result.Servers)
if len(result.StructuredErrors) > 0 {
errorCount = len(result.StructuredErrors)
} else {
errorCount = len(result.Errors)
}
}
if scanErr != nil {
if result == nil || serverCount == 0 {
status = "failure"
} else {
status = "partial"
}
}
discoveryScanDuration.Observe(duration.Seconds())
discoveryScanServers.Set(float64(serverCount))
discoveryScanErrors.Set(float64(errorCount))
discoveryScanResults.WithLabelValues(status).Inc()
s.appendHistory(historyEntry{
startedAt: startTime,
completedAt: completedAt,
subnet: s.subnet,
serverCount: serverCount,
errorCount: errorCount,
duration: duration,
blocklistLength: blocklistLength,
status: status,
})
s.mu.Lock()
s.isScanning = false
s.lastScan = time.Now()
s.lastScan = completedAt
s.mu.Unlock()
// Send scan complete notification
if s.wsHub != nil {
data := map[string]interface{}{
"scanning": false,
"timestamp": time.Now().Unix(),
"timestamp": completedAt.Unix(),
}
if result != nil && result.Environment != nil {
data["environment"] = result.Environment
@ -155,12 +300,25 @@ func (s *Service) performScan() {
if s.cfgProvider != nil {
cfg = config.NormalizeDiscoveryConfig(config.CloneDiscoveryConfig(s.cfgProvider()))
}
blocklistLength = len(cfg.SubnetBlocklist)
newScanner, err := BuildScanner(cfg)
var (
newScanner discoveryScanner
err error
)
if s.scannerFactory != nil {
newScanner, err = s.scannerFactory(cfg)
} else {
newScanner, err = BuildScanner(cfg)
}
if err != nil {
log.Warn().Err(err).Msg("Environment detection failed during discovery; falling back to default scanner configuration")
newScanner = pkgdiscovery.NewScanner()
}
if newScanner == nil {
log.Warn().Msg("Discovery scanner factory returned nil; using default scanner configuration")
newScanner = pkgdiscovery.NewScanner()
}
s.mu.Lock()
s.scanner = newScanner
s.mu.Unlock()
@ -199,6 +357,7 @@ func (s *Service) performScan() {
}
result, err = newScanner.DiscoverServersWithCallbacks(scanCtx, s.subnet, serverCallback, progressCallback)
scanErr = err
if err != nil {
// Even if scan timed out, we might have partial results
if result == nil || (len(result.Servers) == 0 && !errors.Is(err, context.DeadlineExceeded)) {
@ -235,7 +394,7 @@ func (s *Service) performScan() {
if s.wsHub != nil {
data := map[string]interface{}{
"servers": result.Servers,
"errors": result.Errors, // Legacy format (deprecated)
"errors": result.Errors, // Legacy format (deprecated)
"structured_errors": result.StructuredErrors, // New structured format
"scanning": false,
"timestamp": time.Now().Unix(),
@ -256,11 +415,11 @@ func (s *Service) GetCachedResult() (*pkgdiscovery.DiscoveryResult, time.Time) {
s.cache.mu.RLock()
defer s.cache.mu.RUnlock()
if s.cache.result == nil {
return &pkgdiscovery.DiscoveryResult{
Servers: []pkgdiscovery.DiscoveredServer{},
Errors: []string{},
}, time.Time{}
if s.cache.result == nil {
return &pkgdiscovery.DiscoveryResult{
Servers: []pkgdiscovery.DiscoveredServer{},
Errors: []string{},
}, time.Time{}
}
return s.cache.result, s.cache.updated

View file

@ -0,0 +1,155 @@
package discovery
import (
"context"
"errors"
"testing"
"time"
"github.com/prometheus/client_golang/prometheus/testutil"
"github.com/rcourtman/pulse-go-rewrite/internal/config"
pkgdiscovery "github.com/rcourtman/pulse-go-rewrite/pkg/discovery"
)
type fakeScanner struct {
result *pkgdiscovery.DiscoveryResult
err error
}
func (f *fakeScanner) DiscoverServersWithCallbacks(ctx context.Context, subnet string, serverCallback pkgdiscovery.ServerCallback, progressCallback pkgdiscovery.ProgressCallback) (*pkgdiscovery.DiscoveryResult, error) {
if serverCallback != nil && f.result != nil {
for _, server := range f.result.Servers {
serverCallback(server, "test-phase")
}
}
if progressCallback != nil {
progressCallback(pkgdiscovery.ScanProgress{
CurrentPhase: "test-phase",
PhaseNumber: 1,
TotalPhases: 1,
})
}
return f.result, f.err
}
func TestPerformScanRecordsHistoryAndMetrics(t *testing.T) {
// Not parallel: this test asserts on process-global Prometheus collectors
// that other tests in this package also write to.
service := NewService(nil, time.Minute, "192.168.1.0/24", func() config.DiscoveryConfig {
cfg := config.DefaultDiscoveryConfig()
cfg.SubnetBlocklist = []string{"10.0.0.0/24", "172.16.0.0/24"}
return cfg
})
service.ctx = context.Background()
scanner := &fakeScanner{
result: &pkgdiscovery.DiscoveryResult{
Servers: []pkgdiscovery.DiscoveredServer{
{IP: "192.168.1.10", Port: 8006, Type: "pve"},
{IP: "192.168.1.11", Port: 8007, Type: "pbs"},
},
StructuredErrors: []pkgdiscovery.DiscoveryError{
{Phase: "test-phase", ErrorType: "timeout"},
},
},
}
beforeSuccess := testutil.ToFloat64(discoveryScanResults.WithLabelValues("success"))
service.scannerFactory = func(config.DiscoveryConfig) (discoveryScanner, error) {
return scanner, nil
}
service.performScan()
afterSuccess := testutil.ToFloat64(discoveryScanResults.WithLabelValues("success"))
if afterSuccess != beforeSuccess+1 {
t.Fatalf("expected success counter to increment by 1; before=%f after=%f", beforeSuccess, afterSuccess)
}
if got := testutil.ToFloat64(discoveryScanServers); got != float64(len(scanner.result.Servers)) {
t.Fatalf("expected discoveryScanServers gauge to equal %d, got %f", len(scanner.result.Servers), got)
}
if got := testutil.ToFloat64(discoveryScanErrors); got != float64(len(scanner.result.StructuredErrors)) {
t.Fatalf("expected discoveryScanErrors gauge to equal %d, got %f", len(scanner.result.StructuredErrors), got)
}
history := service.GetHistory(5)
if len(history) != 1 {
t.Fatalf("expected 1 history entry, got %d", len(history))
}
entry := history[0]
if entry.status != "success" {
t.Fatalf("expected history status success, got %s", entry.status)
}
if entry.serverCount != len(scanner.result.Servers) {
t.Fatalf("expected serverCount %d, got %d", len(scanner.result.Servers), entry.serverCount)
}
if entry.errorCount != len(scanner.result.StructuredErrors) {
t.Fatalf("expected errorCount %d, got %d", len(scanner.result.StructuredErrors), entry.errorCount)
}
if entry.blocklistLength != 2 {
t.Fatalf("expected blocklist length 2, got %d", entry.blocklistLength)
}
if entry.duration <= 0 {
t.Fatalf("expected positive duration, got %v", entry.duration)
}
if entry.startedAt.IsZero() || entry.completedAt.IsZero() {
t.Fatalf("expected timestamps to be populated, got startedAt=%v completedAt=%v", entry.startedAt, entry.completedAt)
}
}
func TestPerformScanRecordsPartialFailure(t *testing.T) {
// Not parallel: performScan updates the same process-global Prometheus
// collectors asserted on elsewhere in this package.
service := NewService(nil, time.Minute, "auto", func() config.DiscoveryConfig {
cfg := config.DefaultDiscoveryConfig()
return cfg
})
service.ctx = context.Background()
scanner := &fakeScanner{
result: &pkgdiscovery.DiscoveryResult{
Servers: []pkgdiscovery.DiscoveredServer{
{IP: "192.168.2.20", Port: 8006, Type: "pve"},
},
StructuredErrors: []pkgdiscovery.DiscoveryError{
{Phase: "phase-one", ErrorType: "timeout"},
{Phase: "phase-two", ErrorType: "connection_refused"},
},
},
err: errors.New("scan timeout"),
}
beforePartial := testutil.ToFloat64(discoveryScanResults.WithLabelValues("partial"))
service.scannerFactory = func(config.DiscoveryConfig) (discoveryScanner, error) {
return scanner, nil
}
service.performScan()
afterPartial := testutil.ToFloat64(discoveryScanResults.WithLabelValues("partial"))
if afterPartial != beforePartial+1 {
t.Fatalf("expected partial counter to increment by 1; before=%f after=%f", beforePartial, afterPartial)
}
history := service.GetHistory(5)
if len(history) == 0 {
t.Fatalf("expected history entry to be recorded")
}
entry := history[0]
if entry.status != "partial" {
t.Fatalf("expected status partial, got %s", entry.status)
}
if entry.serverCount != len(scanner.result.Servers) {
t.Fatalf("expected serverCount %d, got %d", len(scanner.result.Servers), entry.serverCount)
}
if entry.errorCount != len(scanner.result.StructuredErrors) {
t.Fatalf("expected errorCount %d, got %d", len(scanner.result.StructuredErrors), entry.errorCount)
}
}

View file

@ -3,12 +3,16 @@ package dockeragent
import (
"bytes"
"context"
"crypto/rand"
"crypto/sha256"
"crypto/tls"
"encoding/hex"
"encoding/json"
"errors"
"fmt"
"io"
"math"
"math/big"
"net/http"
"os"
"os/exec"
@ -267,12 +271,22 @@ func (a *Agent) Run(ctx context.Context) error {
ticker := time.NewTicker(interval)
defer ticker.Stop()
// Check for updates on startup
go a.checkForUpdates(ctx)
const (
updateInterval = 24 * time.Hour
startupJitterWindow = 2 * time.Minute
recurringJitterWindow = 5 * time.Minute
)
// Check for updates daily
updateTicker := time.NewTicker(24 * time.Hour)
defer updateTicker.Stop()
initialDelay := 5*time.Second + randomDuration(startupJitterWindow)
updateTimer := time.NewTimer(initialDelay)
defer func() {
if !updateTimer.Stop() {
select {
case <-updateTimer.C:
default:
}
}
}()
if err := a.collectOnce(ctx); err != nil {
if errors.Is(err, ErrStopRequested) {
@ -284,6 +298,12 @@ func (a *Agent) Run(ctx context.Context) error {
for {
select {
case <-ctx.Done():
if !updateTimer.Stop() {
select {
case <-updateTimer.C:
default:
}
}
return ctx.Err()
case <-ticker.C:
if err := a.collectOnce(ctx); err != nil {
@ -292,8 +312,13 @@ func (a *Agent) Run(ctx context.Context) error {
}
a.logger.Error().Err(err).Msg("Failed to send docker report")
}
case <-updateTicker.C:
case <-updateTimer.C:
go a.checkForUpdates(ctx)
nextDelay := updateInterval + randomDuration(recurringJitterWindow)
if nextDelay <= 0 {
nextDelay = updateInterval
}
updateTimer.Reset(nextDelay)
}
}
}
@ -644,7 +669,20 @@ func (a *Agent) sendReportToTarget(ctx context.Context, target TargetConfig, pay
defer resp.Body.Close()
if resp.StatusCode >= 300 {
return fmt.Errorf("target %s: pulse responded with status %s", target.URL, resp.Status)
bodyBytes, _ := io.ReadAll(resp.Body)
if hostRemoved := detectHostRemovedError(bodyBytes); hostRemoved != "" {
a.logger.Warn().
Str("hostID", a.hostID).
Str("pulseURL", target.URL).
Str("detail", hostRemoved).
Msg("Pulse rejected docker report because this host was previously removed. Allow the host to re-enroll from the Pulse UI or rerun the installer with a docker:manage token.")
return ErrStopRequested
}
errMsg := strings.TrimSpace(string(bodyBytes))
if errMsg == "" {
errMsg = resp.Status
}
return fmt.Errorf("target %s: pulse responded %s: %s", target.URL, resp.Status, errMsg)
}
body, err := io.ReadAll(resp.Body)
@ -745,10 +783,14 @@ func disableSystemdService(ctx context.Context, service string) error {
if err != nil {
if exitErr, ok := err.(*exec.ExitError); ok {
exitCode := exitErr.ExitCode()
lowerOutput := strings.ToLower(string(output))
trimmedOutput := strings.TrimSpace(string(output))
lowerOutput := strings.ToLower(trimmedOutput)
if exitCode == 5 || strings.Contains(lowerOutput, "could not be found") || strings.Contains(lowerOutput, "not-found") {
return nil
}
if strings.Contains(lowerOutput, "access denied") || strings.Contains(lowerOutput, "permission denied") {
return fmt.Errorf("systemctl disable %s: access denied. Run 'sudo systemctl disable --now %s' or rerun the installer with sudo so it can install the polkit rule (systemctl output: %s)", service, service, trimmedOutput)
}
}
return fmt.Errorf("systemctl disable %s: %w (%s)", service, err, strings.TrimSpace(string(output)))
}
@ -770,11 +812,15 @@ func stopSystemdService(ctx context.Context, service string) error {
if err != nil {
if exitErr, ok := err.(*exec.ExitError); ok {
exitCode := exitErr.ExitCode()
lowerOutput := strings.ToLower(string(output))
trimmedOutput := strings.TrimSpace(string(output))
lowerOutput := strings.ToLower(trimmedOutput)
// Ignore "not found" errors since the service might already be stopped
if exitCode == 5 || strings.Contains(lowerOutput, "could not be found") || strings.Contains(lowerOutput, "not-found") {
return nil
}
if strings.Contains(lowerOutput, "access denied") || strings.Contains(lowerOutput, "permission denied") {
return fmt.Errorf("systemctl stop %s: access denied. Run 'sudo systemctl stop %s' or rerun the installer with sudo so it can install the polkit rule (systemctl output: %s)", service, service, trimmedOutput)
}
}
return fmt.Errorf("systemctl stop %s: %w (%s)", service, err, strings.TrimSpace(string(output)))
}
@ -982,6 +1028,40 @@ func readSystemUptime() int64 {
return int64(seconds)
}
func randomDuration(max time.Duration) time.Duration {
if max <= 0 {
return 0
}
n, err := rand.Int(rand.Reader, big.NewInt(int64(max)))
if err != nil {
return 0
}
return time.Duration(n.Int64())
}
func detectHostRemovedError(body []byte) string {
if len(body) == 0 {
return ""
}
var payload struct {
Error string `json:"error"`
Code string `json:"code"`
}
if err := json.Unmarshal(body, &payload); err != nil {
return ""
}
if strings.ToLower(payload.Code) != "invalid_report" {
return ""
}
if !strings.Contains(strings.ToLower(payload.Error), "was removed") {
return ""
}
return payload.Error
}
// checkForUpdates checks if a newer version is available and performs self-update if needed
func (a *Agent) checkForUpdates(ctx context.Context) {
// Skip updates if disabled via config
@ -1170,6 +1250,8 @@ func (a *Agent) selfUpdate(ctx context.Context) error {
}
defer resp.Body.Close()
checksumHeader := strings.TrimSpace(resp.Header.Get("X-Checksum-Sha256"))
// Create temporary file
tmpFile, err := os.CreateTemp("", "pulse-docker-agent-*.tmp")
if err != nil {
@ -1179,11 +1261,26 @@ func (a *Agent) selfUpdate(ctx context.Context) error {
defer os.Remove(tmpPath) // Clean up if something goes wrong
// Write downloaded binary to temp file
if _, err := tmpFile.ReadFrom(resp.Body); err != nil {
hasher := sha256.New()
if _, err := io.Copy(tmpFile, io.TeeReader(resp.Body, hasher)); err != nil {
tmpFile.Close()
return fmt.Errorf("failed to write downloaded binary: %w", err)
}
tmpFile.Close()
if err := tmpFile.Close(); err != nil {
return fmt.Errorf("failed to close temp file: %w", err)
}
downloadChecksum := hex.EncodeToString(hasher.Sum(nil))
if checksumHeader != "" {
expected := strings.ToLower(strings.TrimSpace(checksumHeader))
actual := strings.ToLower(downloadChecksum)
if expected != actual {
return fmt.Errorf("checksum verification failed: expected %s, got %s", expected, actual)
}
a.logger.Debug().Str("checksum", downloadChecksum).Msg("Self-update: checksum verified")
} else {
a.logger.Warn().Msg("Self-update: checksum header missing; skipping verification")
}
// Make temp file executable
if err := os.Chmod(tmpPath, 0755); err != nil {

View file

@ -0,0 +1,162 @@
package dockeragent
import (
"context"
"encoding/json"
"errors"
"io"
"net/http"
"net/http/httptest"
"strings"
"sync"
"testing"
agentsdocker "github.com/rcourtman/pulse-go-rewrite/pkg/agents/docker"
"github.com/rs/zerolog"
)
func TestSendReportIntegration(t *testing.T) {
t.Parallel()
var (
mu sync.Mutex
requests []agentsdocker.Report
tokenValues []string
userAgents []string
)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
t.Fatalf("unexpected method %s", r.Method)
}
if r.URL.Path != "/api/agents/docker/report" {
t.Fatalf("unexpected path %s", r.URL.Path)
}
body, err := io.ReadAll(r.Body)
if err != nil {
t.Fatalf("failed to read request body: %v", err)
}
_ = r.Body.Close()
var report agentsdocker.Report
if err := json.Unmarshal(body, &report); err != nil {
t.Fatalf("failed to unmarshal report: %v", err)
}
mu.Lock()
requests = append(requests, report)
tokenValues = append(tokenValues, r.Header.Get("X-API-Token"))
userAgents = append(userAgents, r.Header.Get("User-Agent"))
mu.Unlock()
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte(`{"commands":[]}`))
}))
defer server.Close()
client := server.Client()
agent := &Agent{
cfg: Config{
Targets: []TargetConfig{{
URL: server.URL,
Token: "secret-token",
}},
},
httpClients: map[bool]*http.Client{
false: client,
},
logger: zerolog.New(io.Discard),
targets: []TargetConfig{{
URL: server.URL,
Token: "secret-token",
}},
}
report := agentsdocker.Report{
Agent: agentsdocker.AgentInfo{
IntervalSeconds: 30,
},
Host: agentsdocker.HostInfo{
Hostname: "stub-host",
},
Containers: []agentsdocker.Container{
{ID: "container-1"},
},
}
if err := agent.sendReport(context.Background(), report); err != nil {
t.Fatalf("sendReport returned error: %v", err)
}
mu.Lock()
defer mu.Unlock()
if got := len(requests); got != 1 {
t.Fatalf("expected 1 request, got %d", got)
}
if tokenValues[0] != "secret-token" {
t.Fatalf("expected token %q, got %q", "secret-token", tokenValues[0])
}
if requests[0].Host.Hostname != "stub-host" {
t.Fatalf("expected hostname stub-host, got %s", requests[0].Host.Hostname)
}
if len(requests[0].Containers) != 1 {
t.Fatalf("expected 1 container reported, got %d", len(requests[0].Containers))
}
if userAgents[0] == "" {
t.Fatalf("missing user-agent header")
}
if !strings.HasPrefix(userAgents[0], "pulse-docker-agent/") {
t.Fatalf("unexpected user-agent header: %s", userAgents[0])
}
}
func TestSendReportHostRemoved(t *testing.T) {
t.Parallel()
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusBadRequest)
_, _ = w.Write([]byte(`{"error":"docker host \"5316f5e1\" was removed at 2025-11-02T13:45:15Z and cannot report again","code":"invalid_report"}`))
}))
defer server.Close()
agent := &Agent{
cfg: Config{
Targets: []TargetConfig{{
URL: server.URL,
Token: "secret-token",
}},
},
httpClients: map[bool]*http.Client{
false: server.Client(),
},
logger: zerolog.New(io.Discard),
targets: []TargetConfig{{
URL: server.URL,
Token: "secret-token",
}},
hostID: "5316f5e1",
}
report := agentsdocker.Report{
Agent: agentsdocker.AgentInfo{
IntervalSeconds: 30,
},
Host: agentsdocker.HostInfo{
Hostname: "homeassistant",
},
}
err := agent.sendReport(context.Background(), report)
if !errors.Is(err, ErrStopRequested) {
t.Fatalf("expected ErrStopRequested, got %v", err)
}
}

View file

@ -280,6 +280,16 @@ func (d DockerHost) ToFrontend() DockerHostFrontend {
return h
}
// ToFrontend converts a RemovedDockerHost to its frontend representation.
func (r RemovedDockerHost) ToFrontend() RemovedDockerHostFrontend {
return RemovedDockerHostFrontend{
ID: r.ID,
Hostname: r.Hostname,
DisplayName: r.DisplayName,
RemovedAt: r.RemovedAt.Unix() * 1000,
}
}
// ToFrontend converts a Host to HostFrontend.
func (h Host) ToFrontend() HostFrontend {
host := HostFrontend{
@ -401,8 +411,10 @@ func (c DockerContainer) ToFrontend() DockerContainerFrontend {
if c.BlockIO != nil {
container.BlockIO = &DockerContainerBlockIOFrontend{
ReadBytes: c.BlockIO.ReadBytes,
WriteBytes: c.BlockIO.WriteBytes,
ReadBytes: c.BlockIO.ReadBytes,
WriteBytes: c.BlockIO.WriteBytes,
ReadRateBytesPerSecond: c.BlockIO.ReadRateBytesPerSecond,
WriteRateBytesPerSecond: c.BlockIO.WriteRateBytesPerSecond,
}
}

View file

@ -9,29 +9,30 @@ import (
// State represents the current state of all monitored resources
type State struct {
mu sync.RWMutex
Nodes []Node `json:"nodes"`
VMs []VM `json:"vms"`
Containers []Container `json:"containers"`
DockerHosts []DockerHost `json:"dockerHosts"`
Hosts []Host `json:"hosts"`
Storage []Storage `json:"storage"`
CephClusters []CephCluster `json:"cephClusters"`
PhysicalDisks []PhysicalDisk `json:"physicalDisks"`
PBSInstances []PBSInstance `json:"pbs"`
PMGInstances []PMGInstance `json:"pmg"`
PBSBackups []PBSBackup `json:"pbsBackups"`
PMGBackups []PMGBackup `json:"pmgBackups"`
Backups Backups `json:"backups"`
ReplicationJobs []ReplicationJob `json:"replicationJobs"`
Metrics []Metric `json:"metrics"`
PVEBackups PVEBackups `json:"pveBackups"`
Performance Performance `json:"performance"`
ConnectionHealth map[string]bool `json:"connectionHealth"`
Stats Stats `json:"stats"`
ActiveAlerts []Alert `json:"activeAlerts"`
RecentlyResolved []ResolvedAlert `json:"recentlyResolved"`
LastUpdate time.Time `json:"lastUpdate"`
mu sync.RWMutex
Nodes []Node `json:"nodes"`
VMs []VM `json:"vms"`
Containers []Container `json:"containers"`
DockerHosts []DockerHost `json:"dockerHosts"`
RemovedDockerHosts []RemovedDockerHost `json:"removedDockerHosts"`
Hosts []Host `json:"hosts"`
Storage []Storage `json:"storage"`
CephClusters []CephCluster `json:"cephClusters"`
PhysicalDisks []PhysicalDisk `json:"physicalDisks"`
PBSInstances []PBSInstance `json:"pbs"`
PMGInstances []PMGInstance `json:"pmg"`
PBSBackups []PBSBackup `json:"pbsBackups"`
PMGBackups []PMGBackup `json:"pmgBackups"`
Backups Backups `json:"backups"`
ReplicationJobs []ReplicationJob `json:"replicationJobs"`
Metrics []Metric `json:"metrics"`
PVEBackups PVEBackups `json:"pveBackups"`
Performance Performance `json:"performance"`
ConnectionHealth map[string]bool `json:"connectionHealth"`
Stats Stats `json:"stats"`
ActiveAlerts []Alert `json:"activeAlerts"`
RecentlyResolved []ResolvedAlert `json:"recentlyResolved"`
LastUpdate time.Time `json:"lastUpdate"`
}
// Alert represents an active alert (simplified for State)
@ -224,6 +225,14 @@ type DockerHost struct {
Command *DockerHostCommandStatus `json:"command,omitempty"`
}
// RemovedDockerHost tracks a docker host that was deliberately removed and blocked from reporting.
type RemovedDockerHost struct {
ID string `json:"id"`
Hostname string `json:"hostname,omitempty"`
DisplayName string `json:"displayName,omitempty"`
RemovedAt time.Time `json:"removedAt"`
}
// DockerContainer represents the state of a Docker container on a monitored host.
type DockerContainer struct {
ID string `json:"id"`
@ -268,8 +277,10 @@ type DockerContainerNetworkLink struct {
// DockerContainerBlockIO captures aggregate block IO usage for a container.
type DockerContainerBlockIO struct {
ReadBytes uint64 `json:"readBytes,omitempty"`
WriteBytes uint64 `json:"writeBytes,omitempty"`
ReadBytes uint64 `json:"readBytes,omitempty"`
WriteBytes uint64 `json:"writeBytes,omitempty"`
ReadRateBytesPerSecond *float64 `json:"readRateBytesPerSecond,omitempty"`
WriteRateBytesPerSecond *float64 `json:"writeRateBytesPerSecond,omitempty"`
}
// DockerContainerMount describes a mount exposed to a container.
@ -1231,6 +1242,52 @@ func (s *State) GetDockerHosts() []DockerHost {
return hosts
}
// AddRemovedDockerHost records a removed docker host entry.
func (s *State) AddRemovedDockerHost(entry RemovedDockerHost) {
s.mu.Lock()
defer s.mu.Unlock()
replaced := false
for i, existing := range s.RemovedDockerHosts {
if existing.ID == entry.ID {
s.RemovedDockerHosts[i] = entry
replaced = true
break
}
}
if !replaced {
s.RemovedDockerHosts = append(s.RemovedDockerHosts, entry)
}
sort.Slice(s.RemovedDockerHosts, func(i, j int) bool {
return s.RemovedDockerHosts[i].RemovedAt.After(s.RemovedDockerHosts[j].RemovedAt)
})
s.LastUpdate = time.Now()
}
// RemoveRemovedDockerHost deletes a removed docker host entry by ID.
func (s *State) RemoveRemovedDockerHost(hostID string) {
s.mu.Lock()
defer s.mu.Unlock()
for i, entry := range s.RemovedDockerHosts {
if entry.ID == hostID {
s.RemovedDockerHosts = append(s.RemovedDockerHosts[:i], s.RemovedDockerHosts[i+1:]...)
s.LastUpdate = time.Now()
break
}
}
}
// GetRemovedDockerHosts returns a copy of removed docker host entries.
func (s *State) GetRemovedDockerHosts() []RemovedDockerHost {
s.mu.RLock()
defer s.mu.RUnlock()
entries := make([]RemovedDockerHost, len(s.RemovedDockerHosts))
copy(entries, s.RemovedDockerHosts)
return entries
}
// UpsertHost inserts or updates a generic host in state.
func (s *State) UpsertHost(host Host) {
s.mu.Lock()

View file

@ -130,6 +130,14 @@ type DockerHostFrontend struct {
Command *DockerHostCommandFrontend `json:"command,omitempty"`
}
// RemovedDockerHostFrontend represents a blocked docker host entry for the frontend.
type RemovedDockerHostFrontend struct {
ID string `json:"id"`
Hostname string `json:"hostname,omitempty"`
DisplayName string `json:"displayName,omitempty"`
RemovedAt int64 `json:"removedAt"`
}
// DockerContainerFrontend represents a Docker container for the frontend
type DockerContainerFrontend struct {
ID string `json:"id"`
@ -174,8 +182,10 @@ type DockerContainerNetworkFrontend struct {
// DockerContainerBlockIOFrontend exposes aggregate block IO counters.
type DockerContainerBlockIOFrontend struct {
ReadBytes uint64 `json:"readBytes,omitempty"`
WriteBytes uint64 `json:"writeBytes,omitempty"`
ReadBytes uint64 `json:"readBytes,omitempty"`
WriteBytes uint64 `json:"writeBytes,omitempty"`
ReadRateBytesPerSecond *float64 `json:"readRateBytesPerSecond,omitempty"`
WriteRateBytesPerSecond *float64 `json:"writeRateBytesPerSecond,omitempty"`
}
// DockerContainerMountFrontend represents a container mount for the UI.
@ -392,25 +402,26 @@ type ReplicationJobFrontend struct {
// StateFrontend represents the state with frontend-friendly field names
type StateFrontend struct {
Nodes []NodeFrontend `json:"nodes"`
VMs []VMFrontend `json:"vms"`
Containers []ContainerFrontend `json:"containers"`
DockerHosts []DockerHostFrontend `json:"dockerHosts"`
Hosts []HostFrontend `json:"hosts"`
Storage []StorageFrontend `json:"storage"`
CephClusters []CephClusterFrontend `json:"cephClusters"`
PhysicalDisks []PhysicalDisk `json:"physicalDisks"`
PBS []PBSInstance `json:"pbs"` // Keep as is
PMG []PMGInstance `json:"pmg"`
PBSBackups []PBSBackup `json:"pbsBackups"`
PMGBackups []PMGBackup `json:"pmgBackups"`
Backups Backups `json:"backups"`
ReplicationJobs []ReplicationJobFrontend `json:"replicationJobs"`
ActiveAlerts []Alert `json:"activeAlerts"` // Active alerts
Metrics map[string]any `json:"metrics"` // Empty object for now
PVEBackups PVEBackups `json:"pveBackups"` // Keep as is
Performance map[string]any `json:"performance"` // Empty object for now
ConnectionHealth map[string]bool `json:"connectionHealth"` // Keep as is
Stats map[string]any `json:"stats"` // Empty object for now
LastUpdate int64 `json:"lastUpdate"` // Unix timestamp
Nodes []NodeFrontend `json:"nodes"`
VMs []VMFrontend `json:"vms"`
Containers []ContainerFrontend `json:"containers"`
DockerHosts []DockerHostFrontend `json:"dockerHosts"`
RemovedDockerHosts []RemovedDockerHostFrontend `json:"removedDockerHosts"`
Hosts []HostFrontend `json:"hosts"`
Storage []StorageFrontend `json:"storage"`
CephClusters []CephClusterFrontend `json:"cephClusters"`
PhysicalDisks []PhysicalDisk `json:"physicalDisks"`
PBS []PBSInstance `json:"pbs"` // Keep as is
PMG []PMGInstance `json:"pmg"`
PBSBackups []PBSBackup `json:"pbsBackups"`
PMGBackups []PMGBackup `json:"pmgBackups"`
Backups Backups `json:"backups"`
ReplicationJobs []ReplicationJobFrontend `json:"replicationJobs"`
ActiveAlerts []Alert `json:"activeAlerts"` // Active alerts
Metrics map[string]any `json:"metrics"` // Empty object for now
PVEBackups PVEBackups `json:"pveBackups"` // Keep as is
Performance map[string]any `json:"performance"` // Empty object for now
ConnectionHealth map[string]bool `json:"connectionHealth"` // Keep as is
Stats map[string]any `json:"stats"` // Empty object for now
LastUpdate int64 `json:"lastUpdate"` // Unix timestamp
}

View file

@ -4,28 +4,29 @@ import "time"
// StateSnapshot represents a snapshot of the state without mutex
type StateSnapshot struct {
Nodes []Node `json:"nodes"`
VMs []VM `json:"vms"`
Containers []Container `json:"containers"`
DockerHosts []DockerHost `json:"dockerHosts"`
Hosts []Host `json:"hosts"`
Storage []Storage `json:"storage"`
CephClusters []CephCluster `json:"cephClusters"`
PhysicalDisks []PhysicalDisk `json:"physicalDisks"`
PBSInstances []PBSInstance `json:"pbs"`
PMGInstances []PMGInstance `json:"pmg"`
PBSBackups []PBSBackup `json:"pbsBackups"`
PMGBackups []PMGBackup `json:"pmgBackups"`
Backups Backups `json:"backups"`
ReplicationJobs []ReplicationJob `json:"replicationJobs"`
Metrics []Metric `json:"metrics"`
PVEBackups PVEBackups `json:"pveBackups"`
Performance Performance `json:"performance"`
ConnectionHealth map[string]bool `json:"connectionHealth"`
Stats Stats `json:"stats"`
ActiveAlerts []Alert `json:"activeAlerts"`
RecentlyResolved []ResolvedAlert `json:"recentlyResolved"`
LastUpdate time.Time `json:"lastUpdate"`
Nodes []Node `json:"nodes"`
VMs []VM `json:"vms"`
Containers []Container `json:"containers"`
DockerHosts []DockerHost `json:"dockerHosts"`
RemovedDockerHosts []RemovedDockerHost `json:"removedDockerHosts"`
Hosts []Host `json:"hosts"`
Storage []Storage `json:"storage"`
CephClusters []CephCluster `json:"cephClusters"`
PhysicalDisks []PhysicalDisk `json:"physicalDisks"`
PBSInstances []PBSInstance `json:"pbs"`
PMGInstances []PMGInstance `json:"pmg"`
PBSBackups []PBSBackup `json:"pbsBackups"`
PMGBackups []PMGBackup `json:"pmgBackups"`
Backups Backups `json:"backups"`
ReplicationJobs []ReplicationJob `json:"replicationJobs"`
Metrics []Metric `json:"metrics"`
PVEBackups PVEBackups `json:"pveBackups"`
Performance Performance `json:"performance"`
ConnectionHealth map[string]bool `json:"connectionHealth"`
Stats Stats `json:"stats"`
ActiveAlerts []Alert `json:"activeAlerts"`
RecentlyResolved []ResolvedAlert `json:"recentlyResolved"`
LastUpdate time.Time `json:"lastUpdate"`
}
// GetSnapshot returns a snapshot of the current state without mutex
@ -43,18 +44,19 @@ func (s *State) GetSnapshot() StateSnapshot {
// Create a snapshot without mutex
snapshot := StateSnapshot{
Nodes: append([]Node{}, s.Nodes...),
VMs: append([]VM{}, s.VMs...),
Containers: append([]Container{}, s.Containers...),
DockerHosts: append([]DockerHost{}, s.DockerHosts...),
Hosts: append([]Host{}, s.Hosts...),
Storage: append([]Storage{}, s.Storage...),
CephClusters: append([]CephCluster{}, s.CephClusters...),
PhysicalDisks: append([]PhysicalDisk{}, s.PhysicalDisks...),
PBSInstances: append([]PBSInstance{}, s.PBSInstances...),
PMGInstances: append([]PMGInstance{}, s.PMGInstances...),
PBSBackups: pbsBackups,
PMGBackups: pmgBackups,
Nodes: append([]Node{}, s.Nodes...),
VMs: append([]VM{}, s.VMs...),
Containers: append([]Container{}, s.Containers...),
DockerHosts: append([]DockerHost{}, s.DockerHosts...),
RemovedDockerHosts: append([]RemovedDockerHost{}, s.RemovedDockerHosts...),
Hosts: append([]Host{}, s.Hosts...),
Storage: append([]Storage{}, s.Storage...),
CephClusters: append([]CephCluster{}, s.CephClusters...),
PhysicalDisks: append([]PhysicalDisk{}, s.PhysicalDisks...),
PBSInstances: append([]PBSInstance{}, s.PBSInstances...),
PMGInstances: append([]PMGInstance{}, s.PMGInstances...),
PBSBackups: pbsBackups,
PMGBackups: pmgBackups,
Backups: Backups{
PVE: pveBackups,
PBS: pbsBackups,
@ -104,6 +106,11 @@ func (s StateSnapshot) ToFrontend() StateFrontend {
dockerHosts[i] = host.ToFrontend()
}
removedDockerHosts := make([]RemovedDockerHostFrontend, len(s.RemovedDockerHosts))
for i, entry := range s.RemovedDockerHosts {
removedDockerHosts[i] = entry.ToFrontend()
}
hosts := make([]HostFrontend, len(s.Hosts))
for i, host := range s.Hosts {
hosts[i] = host.ToFrontend()
@ -127,26 +134,27 @@ func (s StateSnapshot) ToFrontend() StateFrontend {
}
return StateFrontend{
Nodes: nodes,
VMs: vms,
Containers: containers,
DockerHosts: dockerHosts,
Hosts: hosts,
Storage: storage,
CephClusters: cephClusters,
PhysicalDisks: s.PhysicalDisks,
PBS: s.PBSInstances,
PMG: s.PMGInstances,
PBSBackups: s.PBSBackups,
PMGBackups: s.PMGBackups,
Backups: s.Backups,
ReplicationJobs: replicationJobs,
ActiveAlerts: s.ActiveAlerts,
Metrics: make(map[string]any),
PVEBackups: s.PVEBackups,
Performance: make(map[string]any),
ConnectionHealth: s.ConnectionHealth,
Stats: make(map[string]any),
LastUpdate: s.LastUpdate.Unix() * 1000, // JavaScript timestamp
Nodes: nodes,
VMs: vms,
Containers: containers,
DockerHosts: dockerHosts,
RemovedDockerHosts: removedDockerHosts,
Hosts: hosts,
Storage: storage,
CephClusters: cephClusters,
PhysicalDisks: s.PhysicalDisks,
PBS: s.PBSInstances,
PMG: s.PMGInstances,
PBSBackups: s.PBSBackups,
PMGBackups: s.PMGBackups,
Backups: s.Backups,
ReplicationJobs: replicationJobs,
ActiveAlerts: s.ActiveAlerts,
Metrics: make(map[string]any),
PVEBackups: s.PVEBackups,
Performance: make(map[string]any),
ConnectionHealth: s.ConnectionHealth,
Stats: make(map[string]any),
LastUpdate: s.LastUpdate.Unix() * 1000, // JavaScript timestamp
}
}


@ -205,3 +205,30 @@ func TestCleanupRemovedDockerHosts(t *testing.T) {
t.Fatalf("expected fresh host removal entry to remain")
}
}
func TestAllowDockerHostReenrollNoopWhenHostNotBlocked(t *testing.T) {
t.Parallel()
monitor := newTestMonitorForCommands(t)
host := models.DockerHost{
ID: "host-not-blocked",
Hostname: "not-blocked",
DisplayName: "Not Blocked",
Status: "online",
}
monitor.state.UpsertDockerHost(host)
if err := monitor.AllowDockerHostReenroll(host.ID); err != nil {
t.Fatalf("allow reenroll for non-blocked host returned error: %v", err)
}
if _, exists := monitor.removedDockerHosts[host.ID]; exists {
t.Fatalf("non-blocked host should not be added to removal map")
}
stateHost := findDockerHost(t, monitor, host.ID)
if stateHost.ID != host.ID {
t.Fatalf("expected host to remain in state; got %+v", stateHost)
}
}


@ -29,6 +29,7 @@ import (
"github.com/rcourtman/pulse-go-rewrite/internal/mock"
"github.com/rcourtman/pulse-go-rewrite/internal/models"
"github.com/rcourtman/pulse-go-rewrite/internal/notifications"
"github.com/rcourtman/pulse-go-rewrite/internal/types"
"github.com/rcourtman/pulse-go-rewrite/internal/websocket"
agentsdocker "github.com/rcourtman/pulse-go-rewrite/pkg/agents/docker"
agentshost "github.com/rcourtman/pulse-go-rewrite/pkg/agents/host"
@ -519,6 +520,13 @@ func safeFloat(val float64) float64 {
return val
}
func clampUint64ToInt64(val uint64) int64 {
if val > math.MaxInt64 {
return math.MaxInt64
}
return int64(val)
}
func cloneStringFloatMap(src map[string]float64) map[string]float64 {
if len(src) == 0 {
return nil
@ -852,14 +860,23 @@ func (m *Monitor) RemoveDockerHost(hostID string) (models.DockerHost, error) {
}
// Track removal to prevent resurrection from cached reports
removedAt := time.Now()
m.mu.Lock()
m.removedDockerHosts[hostID] = time.Now()
m.removedDockerHosts[hostID] = removedAt
if cmd, ok := m.dockerCommands[hostID]; ok {
delete(m.dockerCommandIndex, cmd.status.ID)
}
delete(m.dockerCommands, hostID)
m.mu.Unlock()
m.state.AddRemovedDockerHost(models.RemovedDockerHost{
ID: hostID,
Hostname: host.Hostname,
DisplayName: host.DisplayName,
RemovedAt: removedAt,
})
m.state.RemoveConnectionHealth(dockerConnectionPrefix + hostID)
if m.alertManager != nil {
m.alertManager.HandleDockerHostRemoved(host)
@ -1003,9 +1020,13 @@ func (m *Monitor) AllowDockerHostReenroll(hostID string) error {
defer m.mu.Unlock()
if _, exists := m.removedDockerHosts[hostID]; !exists {
log.Debug().
Str("dockerHostID", hostID).
Msg("Allow re-enroll requested for docker host that was not blocked")
host, found := m.GetDockerHost(hostID)
event := log.Info().
Str("dockerHostID", hostID)
if found {
event = event.Str("dockerHost", host.Hostname)
}
event.Msg("Allow re-enroll requested but host was not blocked; ignoring")
return nil
}
@ -1015,6 +1036,7 @@ func (m *Monitor) AllowDockerHostReenroll(hostID string) error {
delete(m.dockerCommands, hostID)
}
m.state.SetDockerHostCommand(hostID, nil)
m.state.RemoveRemovedDockerHost(hostID)
log.Info().
Str("dockerHostID", hostID).
@ -1437,6 +1459,27 @@ func (m *Monitor) ApplyDockerReport(report agentsdocker.Report, tokenRecord *con
ReadBytes: payload.BlockIO.ReadBytes,
WriteBytes: payload.BlockIO.WriteBytes,
}
containerIdentifier := payload.ID
if strings.TrimSpace(containerIdentifier) == "" {
containerIdentifier = payload.Name
}
if strings.TrimSpace(containerIdentifier) != "" {
metrics := types.IOMetrics{
DiskRead: clampUint64ToInt64(payload.BlockIO.ReadBytes),
DiskWrite: clampUint64ToInt64(payload.BlockIO.WriteBytes),
Timestamp: timestamp,
}
readRate, writeRate, _, _ := m.rateTracker.CalculateRates(fmt.Sprintf("docker:%s:%s", identifier, containerIdentifier), metrics)
if readRate >= 0 {
value := readRate
container.BlockIO.ReadRateBytesPerSecond = &value
}
if writeRate >= 0 {
value := writeRate
container.BlockIO.WriteRateBytesPerSecond = &value
}
}
}
if len(payload.Mounts) > 0 {
@ -1771,6 +1814,7 @@ func (m *Monitor) cleanupRemovedDockerHosts(now time.Time) {
for hostID, removedAt := range m.removedDockerHosts {
if now.Sub(removedAt) > removedDockerHostsTTL {
delete(m.removedDockerHosts, hostID)
m.state.RemoveRemovedDockerHost(hostID)
log.Debug().
Str("dockerHostID", hostID).
Time("removedAt", removedAt).


@ -17,6 +17,7 @@ func newTestMonitor(t *testing.T) *Monitor {
state: models.NewState(),
alertManager: alerts.NewManager(),
removedDockerHosts: make(map[string]time.Time),
rateTracker: NewRateTracker(),
}
}
@ -171,6 +172,9 @@ func TestApplyDockerReportIncludesContainerDiskDetails(t *testing.T) {
if container.BlockIO.ReadBytes != 123456 || container.BlockIO.WriteBytes != 654321 {
t.Fatalf("unexpected block IO values: %+v", container.BlockIO)
}
if container.BlockIO.ReadRateBytesPerSecond != nil || container.BlockIO.WriteRateBytesPerSecond != nil {
t.Fatalf("expected block IO rates to be unset on first sample: %+v", container.BlockIO)
}
if len(container.Mounts) != 1 {
t.Fatalf("expected mounts to be preserved, got %d", len(container.Mounts))


@ -600,8 +600,8 @@ func (tc *TemperatureCollector) shouldDisableProxy(err error) bool {
var proxyErr *tempproxy.ProxyError
if errors.As(err, &proxyErr) {
switch proxyErr.Type {
case tempproxy.ErrorTypeTransport, tempproxy.ErrorTypeTimeout, tempproxy.ErrorTypeSSH:
return true
case tempproxy.ErrorTypeTransport, tempproxy.ErrorTypeTimeout:
return true
default:
return false
}


@ -48,10 +48,25 @@ export FRONTEND_DEV_HOST FRONTEND_DEV_PORT
export PULSE_DEV_API_HOST PULSE_DEV_API_PORT PULSE_DEV_API_URL PULSE_DEV_WS_URL
# Auto-detect pulse-sensor-proxy socket if available
HOST_PROXY_SOCKET="/mnt/pulse-proxy/pulse-sensor-proxy.sock"
CONTAINER_PROXY_SOCKET="/run/pulse-sensor-proxy/pulse-sensor-proxy.sock"
if [[ -z ${PULSE_SENSOR_PROXY_SOCKET:-} ]]; then
if [[ -S /mnt/pulse-proxy/pulse-sensor-proxy.sock ]]; then
export PULSE_SENSOR_PROXY_SOCKET=/mnt/pulse-proxy/pulse-sensor-proxy.sock
if [[ -S "${HOST_PROXY_SOCKET}" ]]; then
export PULSE_SENSOR_PROXY_SOCKET="${HOST_PROXY_SOCKET}"
printf "[hot-dev] Detected pulse-sensor-proxy socket at %s\n" "${PULSE_SENSOR_PROXY_SOCKET}"
elif [[ -S "${CONTAINER_PROXY_SOCKET}" ]]; then
export PULSE_SENSOR_PROXY_SOCKET="${CONTAINER_PROXY_SOCKET}"
printf "[hot-dev] WARNING: Using container-local pulse-sensor-proxy socket at %s\n" "${PULSE_SENSOR_PROXY_SOCKET}"
printf "[hot-dev] WARNING: Host proxy is missing; temperatures will not reach Pulse until it is reinstalled.\n"
else
printf "[hot-dev] WARNING: No pulse-sensor-proxy socket detected. Temperatures will be unavailable.\n"
fi
else
if [[ ! -S "${PULSE_SENSOR_PROXY_SOCKET}" ]]; then
printf "[hot-dev] WARNING: Configured pulse-sensor-proxy socket not found at %s\n" "${PULSE_SENSOR_PROXY_SOCKET}"
elif [[ "${PULSE_SENSOR_PROXY_SOCKET}" == "${CONTAINER_PROXY_SOCKET}" && ! -S "${HOST_PROXY_SOCKET}" ]]; then
printf "[hot-dev] WARNING: Using container-local proxy socket; reinstall host pulse-sensor-proxy for real telemetry.\n"
fi
fi


@ -697,6 +697,7 @@ download_agent_binary() {
fi
if http::download "${download_args[@]}"; then
AGENT_DOWNLOAD_SOURCE="${primary_url}"
return 0
fi
@ -711,6 +712,7 @@ download_agent_binary() {
download_args+=(--insecure)
fi
if http::download "${download_args[@]}"; then
AGENT_DOWNLOAD_SOURCE="${fallback_url}"
return 0
fi
@ -722,6 +724,8 @@ download_agent_binary() {
return 1
}
unset AGENT_DOWNLOAD_SOURCE
if download_agent_binary "$DOWNLOAD_URL" "$DOWNLOAD_URL_BASE"; then
:
else
@ -730,6 +734,103 @@ else
exit 1
fi
fetch_checksum_header() {
local url="$1"
local header=""
if command -v curl &> /dev/null; then
local curl_args=(-fsSI "$url")
if [[ "$PRIMARY_INSECURE" == "true" ]]; then
curl_args=(-k "${curl_args[@]}")
fi
header=$(curl "${curl_args[@]}" 2>/dev/null || true)
elif command -v wget &> /dev/null; then
local tmp
tmp=$(mktemp)
if [[ "$PRIMARY_INSECURE" == "true" ]]; then
wget --spider --no-check-certificate --server-response "$url" >/dev/null 2>"$tmp" || true
else
wget --spider --server-response "$url" >/dev/null 2>"$tmp" || true
fi
header=$(cat "$tmp" 2>/dev/null || true)
rm -f "$tmp"
fi
if [[ -z "$header" ]]; then
return 1
fi
local checksum_line
checksum_line=$(printf '%s\n' "$header" | awk 'tolower($0) ~ /^[[:space:]]*x-checksum-sha256:/ {print; exit}')
if [[ -z "$checksum_line" ]]; then
return 1
fi
local value
value=$(printf '%s\n' "$checksum_line" | awk -F':' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $2); print $2}')
value=$(printf '%s' "$value" | tr '[:upper:]' '[:lower:]')
if [[ -z "$value" ]]; then
return 1
fi
FETCHED_CHECKSUM="$value"
return 0
}
calculate_sha256() {
local file="$1"
local hash=""
if command -v sha256sum &> /dev/null; then
hash=$(sha256sum "$file" | awk '{print $1}')
elif command -v shasum &> /dev/null; then
hash=$(shasum -a 256 "$file" | awk '{print $1}')
fi
if [[ -z "$hash" ]]; then
return 1
fi
CALCULATED_CHECKSUM=$(printf '%s' "$hash" | tr '[:upper:]' '[:lower:]')
return 0
}
verify_agent_checksum() {
local url="$1"
if common::is_dry_run; then
log_info '[dry-run] Skipping checksum verification'
return 0
fi
if ! fetch_checksum_header "$url"; then
log_warn 'Agent download did not include X-Checksum-Sha256 header; skipping verification'
return 0
fi
if ! calculate_sha256 "$AGENT_PATH"; then
log_warn 'Unable to calculate sha256 checksum locally; skipping verification'
return 0
fi
if [[ "$FETCHED_CHECKSUM" != "$CALCULATED_CHECKSUM" ]]; then
rm -f "$AGENT_PATH"
log_error "Checksum mismatch. Expected $FETCHED_CHECKSUM but downloaded $CALCULATED_CHECKSUM"
return 1
fi
log_success 'Checksum verified for agent binary'
unset FETCHED_CHECKSUM CALCULATED_CHECKSUM
return 0
}
if [[ -n "${AGENT_DOWNLOAD_SOURCE:-}" ]]; then
if ! verify_agent_checksum "$AGENT_DOWNLOAD_SOURCE"; then
log_error 'Agent download failed checksum verification'
exit 1
fi
fi
if ! common::is_dry_run; then
chmod +x "$AGENT_PATH"
fi
@ -841,8 +942,8 @@ SYSTEMD_ENV_INSECURE_LINE="Environment=\"PULSE_INSECURE_SKIP_VERIFY=$PRIMARY_INS
systemd::create_service "$SERVICE_PATH" <<EOF
[Unit]
Description=Pulse Docker Agent
After=network-online.target docker.service
Wants=network-online.target
After=network-online.target docker.socket docker.service
Wants=network-online.target docker.socket
[Service]
Type=simple
@ -854,6 +955,22 @@ ExecStart=$AGENT_PATH --url "$PRIMARY_URL" --interval "$INTERVAL"$NO_AUTO_UPDATE
Restart=on-failure
RestartSec=5s
User=root
ProtectSystem=full
ProtectHome=read-only
ProtectControlGroups=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
ProtectKernelLogs=yes
UMask=0077
NoNewPrivileges=yes
RestrictSUIDSGID=yes
RestrictRealtime=yes
PrivateTmp=yes
MemoryDenyWriteExecute=yes
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
ReadWritePaths=/var/run/docker.sock
ProtectHostname=yes
ProtectClock=yes
[Install]
WantedBy=multi-user.target


@ -343,6 +343,193 @@ resolve_agent_path_for_uninstall() {
printf '%s' "$DEFAULT_AGENT_PATH"
}
ensure_service_user() {
SERVICE_USER_ACTUAL="$SERVICE_USER"
SERVICE_GROUP_ACTUAL="$SERVICE_GROUP"
SERVICE_USER_AVAILABLE="true"
if [[ "$SERVICE_USER" == "root" ]]; then
SERVICE_GROUP_ACTUAL="root"
SERVICE_USER_AVAILABLE="false"
return
fi
if id -u "$SERVICE_USER" >/dev/null 2>&1; then
if getent group "$SERVICE_GROUP" >/dev/null 2>&1; then
SERVICE_GROUP_ACTUAL="$SERVICE_GROUP"
else
SERVICE_GROUP_ACTUAL="$(id -gn "$SERVICE_USER")"
fi
return
fi
if command -v useradd >/dev/null 2>&1; then
if useradd --system --home-dir "$SERVICE_HOME" --shell /usr/sbin/nologin "$SERVICE_USER" >/dev/null 2>&1; then
SERVICE_USER_CREATED="true"
fi
elif command -v adduser >/dev/null 2>&1; then
if adduser --system --home "$SERVICE_HOME" --shell /usr/sbin/nologin "$SERVICE_USER" >/dev/null 2>&1; then
SERVICE_USER_CREATED="true"
fi
else
log_warn "Unable to create dedicated service user; running agent as root"
SERVICE_USER_ACTUAL="root"
SERVICE_GROUP_ACTUAL="root"
SERVICE_USER_AVAILABLE="false"
return
fi
if id -u "$SERVICE_USER" >/dev/null 2>&1; then
SERVICE_USER_ACTUAL="$SERVICE_USER"
if getent group "$SERVICE_GROUP" >/dev/null 2>&1; then
SERVICE_GROUP_ACTUAL="$SERVICE_GROUP"
else
SERVICE_GROUP_ACTUAL="$(id -gn "$SERVICE_USER")"
fi
if [[ "$SERVICE_USER_CREATED" == "true" ]]; then
log_success "Created service user: $SERVICE_USER"
fi
return
fi
log_warn "Failed to create service user; falling back to root"
SERVICE_USER_ACTUAL="root"
SERVICE_GROUP_ACTUAL="root"
SERVICE_USER_AVAILABLE="false"
}
ensure_service_home() {
if [[ "$SERVICE_USER_ACTUAL" == "root" ]]; then
return
fi
if [[ -z "$SERVICE_HOME" ]]; then
return
fi
if [[ ! -d "$SERVICE_HOME" ]]; then
mkdir -p "$SERVICE_HOME"
fi
chown "$SERVICE_USER_ACTUAL":"$SERVICE_GROUP_ACTUAL" "$SERVICE_HOME" >/dev/null 2>&1 || true
chmod 750 "$SERVICE_HOME" >/dev/null 2>&1 || true
}
ensure_docker_group_membership() {
SYSTEMD_SUPPLEMENTARY_GROUPS_LINE=""
if [[ "$SERVICE_USER_ACTUAL" == "root" ]]; then
return
fi
if getent group docker >/dev/null 2>&1; then
DOCKER_GROUP_PRESENT="true"
if ! id -nG "$SERVICE_USER_ACTUAL" 2>/dev/null | tr ' ' '\n' | grep -Fxq "docker"; then
if command -v usermod >/dev/null 2>&1; then
usermod -a -G docker "$SERVICE_USER_ACTUAL" >/dev/null 2>&1 || log_warn "Failed to add $SERVICE_USER_ACTUAL to docker group; adjust socket permissions manually."
elif command -v adduser >/dev/null 2>&1; then
adduser "$SERVICE_USER_ACTUAL" docker >/dev/null 2>&1 || log_warn "Failed to add $SERVICE_USER_ACTUAL to docker group; adjust socket permissions manually."
else
log_warn "Unable to manage docker group membership; ensure $SERVICE_USER_ACTUAL can access /var/run/docker.sock"
fi
fi
if id -nG "$SERVICE_USER_ACTUAL" 2>/dev/null | tr ' ' '\n' | grep -Fxq "docker"; then
SYSTEMD_SUPPLEMENTARY_GROUPS_LINE="SupplementaryGroups=docker"
log_success "Ensured docker group access for $SERVICE_USER_ACTUAL"
else
log_warn "Service user $SERVICE_USER_ACTUAL is not in docker group; ensure the Docker socket ACL grants access."
fi
else
log_warn "docker group not found; ensure the agent user can access /var/run/docker.sock"
fi
}
write_env_file() {
local target="$ENV_FILE"
local dir
dir=$(dirname "$target")
mkdir -p "$dir"
local tmp
tmp=$(mktemp "${target}.XXXXXX")
chmod 600 "$tmp"
{
if [[ -n "$PRIMARY_URL" ]]; then
printf 'PULSE_URL=%q\n' "$PRIMARY_URL"
fi
if [[ -n "$PRIMARY_TOKEN" ]]; then
printf 'PULSE_TOKEN=%q\n' "$PRIMARY_TOKEN"
fi
if [[ -n "$JOINED_TARGETS" ]]; then
printf 'PULSE_TARGETS=%q\n' "$JOINED_TARGETS"
fi
if [[ -n "$PRIMARY_INSECURE" ]]; then
printf 'PULSE_INSECURE_SKIP_VERIFY=%q\n' "$PRIMARY_INSECURE"
fi
if [[ -n "$INTERVAL" ]]; then
printf 'PULSE_INTERVAL=%q\n' "$INTERVAL"
fi
if [[ -n "$NO_AUTO_UPDATE_FLAG" ]]; then
printf 'PULSE_NO_AUTO_UPDATE=true\n'
fi
} > "$tmp"
chown root:root "$tmp"
chmod 600 "$tmp"
mv "$tmp" "$target"
log_success "Wrote environment file: $target"
}
configure_polkit_rule() {
if [[ "$SERVICE_USER_ACTUAL" == "root" ]]; then
return
fi
local polkit_dir="/etc/polkit-1/rules.d"
if [[ ! -d "$polkit_dir" ]]; then
log_warn "polkit not detected; remote stop commands may require manual sudo access"
return
fi
local rule_path="${polkit_dir}/90-pulse-docker-agent.rules"
local tmp
tmp=$(mktemp "${rule_path}.XXXXXX")
cat > "$tmp" <<EOF
// Pulse Docker agent installer managed rule
polkit.addRule(function(action, subject) {
if ((action.id == "org.freedesktop.systemd1.manage-units" ||
action.id == "org.freedesktop.systemd1.manage-unit-files") &&
subject.user == "$SERVICE_USER_ACTUAL") {
return polkit.Result.YES;
}
});
EOF
chown root:root "$tmp" 2>/dev/null || true
chmod 0644 "$tmp"
if [[ -f "$rule_path" ]]; then
if command -v cmp >/dev/null 2>&1 && cmp -s "$tmp" "$rule_path" 2>/dev/null; then
rm -f "$tmp"
log_info "polkit rule already present for $SERVICE_USER_ACTUAL"
return
fi
fi
mv "$tmp" "$rule_path"
log_success "Configured polkit rule allowing $SERVICE_USER_ACTUAL to manage pulse-docker-agent service"
}
remove_polkit_rule() {
local polkit_dir="/etc/polkit-1/rules.d"
local rule_path="${polkit_dir}/90-pulse-docker-agent.rules"
if [[ -f "$rule_path" ]] && grep -q 'Pulse Docker agent installer managed rule' "$rule_path" 2>/dev/null; then
rm -f "$rule_path"
log_success "Removed polkit rule: $rule_path"
fi
}
# Pulse Docker Agent Installer/Uninstaller
# Install (single target):
# curl -fsSL http://pulse.example.com/install-docker-agent.sh | bash -s -- --url http://pulse.example.com --token <api-token>
@ -370,6 +557,7 @@ DEFAULT_AGENT_PATH_WRITABLE="unknown"
EXISTING_AGENT_PATH=""
AGENT_PATH=""
SERVICE_PATH="/etc/systemd/system/pulse-docker-agent.service"
ENV_FILE="/etc/pulse/pulse-docker-agent.env"
UNRAID_STARTUP="/boot/config/go.d/pulse-docker-agent.sh"
LOG_PATH="/var/log/pulse-docker-agent.log"
INTERVAL="30s"
@ -385,6 +573,15 @@ PRIMARY_TOKEN=""
PRIMARY_INSECURE="false"
JOINED_TARGETS=""
ORIGINAL_ARGS=("$@")
SERVICE_USER="pulse-docker"
SERVICE_GROUP="$SERVICE_USER"
SERVICE_HOME="/var/lib/pulse-docker-agent"
SERVICE_USER_ACTUAL="$SERVICE_USER"
SERVICE_GROUP_ACTUAL="$SERVICE_GROUP"
SERVICE_USER_CREATED="false"
SERVICE_USER_AVAILABLE="true"
DOCKER_GROUP_PRESENT="false"
SYSTEMD_SUPPLEMENTARY_GROUPS_LINE=""
# Parse arguments
while [[ $# -gt 0 ]]; do
@ -687,6 +884,11 @@ if [ "$UNINSTALL" = true ]; then
log_warn "systemctl not found; skipping service disable"
fi
if [ -f "$ENV_FILE" ]; then
rm -f "$ENV_FILE"
log_success "Removed environment file: $ENV_FILE"
fi
if pgrep -f pulse-docker-agent > /dev/null 2>&1; then
log_info "Stopping running agent processes"
pkill -f pulse-docker-agent 2>/dev/null || true
@ -705,6 +907,8 @@ if [ "$UNINSTALL" = true ]; then
log_success "Removed Unraid startup script: $UNRAID_STARTUP"
fi
remove_polkit_rule
if [ "$PURGE" = true ]; then
if [ -f "$LOG_PATH" ]; then
rm -f "$LOG_PATH"
@ -712,6 +916,10 @@ if [ "$UNINSTALL" = true ]; then
else
log_info "Agent log file already absent: $LOG_PATH"
fi
if [[ -d "$SERVICE_HOME" ]]; then
rm -rf "$SERVICE_HOME"
log_success "Removed service home directory: $SERVICE_HOME"
fi
elif [ -f "$LOG_PATH" ]; then
log_info "Preserving agent log file at $LOG_PATH (use --purge to remove)"
fi
@ -946,16 +1154,109 @@ download_agent_from_url() {
return 1
}
fetch_checksum_header() {
local url="$1"
local header=""
if command -v curl &> /dev/null; then
local curl_args=(-fsSI "$url")
if [[ "$PRIMARY_INSECURE" == "true" ]]; then
curl_args=(-k "${curl_args[@]}")
fi
header=$(curl "${curl_args[@]}" 2>/dev/null || true)
elif command -v wget &> /dev/null; then
local tmp
tmp=$(mktemp)
if [[ "$PRIMARY_INSECURE" == "true" ]]; then
wget --spider --no-check-certificate --server-response "$url" >/dev/null 2>"$tmp" || true
else
wget --spider --server-response "$url" >/dev/null 2>"$tmp" || true
fi
header=$(cat "$tmp" 2>/dev/null || true)
rm -f "$tmp"
fi
if [[ -z "$header" ]]; then
return 1
fi
local checksum_line
checksum_line=$(printf '%s\n' "$header" | awk 'tolower($0) ~ /^[[:space:]]*x-checksum-sha256:/ {print; exit}')
if [[ -z "$checksum_line" ]]; then
return 1
fi
local value
value=$(printf '%s\n' "$checksum_line" | awk -F':' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $2); print $2}')
value=$(printf '%s' "$value" | tr '[:upper:]' '[:lower:]')
if [[ -z "$value" ]]; then
return 1
fi
FETCHED_CHECKSUM="$value"
return 0
}
calculate_sha256() {
local file="$1"
local hash=""
if command -v sha256sum &> /dev/null; then
hash=$(sha256sum "$file" | awk '{print $1}')
elif command -v shasum &> /dev/null; then
hash=$(shasum -a 256 "$file" | awk '{print $1}')
fi
if [[ -z "$hash" ]]; then
return 1
fi
CALCULATED_CHECKSUM=$(printf '%s' "$hash" | tr '[:upper:]' '[:lower:]')
return 0
}
verify_agent_checksum() {
local url="$1"
if ! fetch_checksum_header "$url"; then
log_warn 'Agent download did not include X-Checksum-Sha256 header; skipping verification'
return 0
fi
if ! calculate_sha256 "$AGENT_PATH"; then
log_warn 'Unable to calculate sha256 checksum locally; skipping verification'
return 0
fi
if [[ "$FETCHED_CHECKSUM" != "$CALCULATED_CHECKSUM" ]]; then
rm -f "$AGENT_PATH"
log_error "Checksum mismatch. Expected $FETCHED_CHECKSUM but downloaded $CALCULATED_CHECKSUM"
return 1
fi
log_success 'Checksum verified for agent binary'
unset FETCHED_CHECKSUM CALCULATED_CHECKSUM
return 0
}
DOWNLOAD_SUCCESS_URL=""
if download_agent_from_url "$DOWNLOAD_URL"; then
:
DOWNLOAD_SUCCESS_URL="$DOWNLOAD_URL"
elif [[ "$DOWNLOAD_URL" != "$DOWNLOAD_URL_BASE" ]] && download_agent_from_url "$DOWNLOAD_URL_BASE"; then
log_info 'Falling back to server default agent binary'
DOWNLOAD_SUCCESS_URL="$DOWNLOAD_URL_BASE"
else
log_warn 'Failed to download agent binary'
log_warn "Ensure the Pulse server is reachable at $PRIMARY_URL"
exit 1
fi
if [[ -n "$DOWNLOAD_SUCCESS_URL" ]]; then
if ! verify_agent_checksum "$DOWNLOAD_SUCCESS_URL"; then
log_error 'Agent download failed checksum verification'
exit 1
fi
fi
chmod +x "$AGENT_PATH"
log_success "Agent binary installed"
@ -991,9 +1292,11 @@ allow_reenroll_if_needed() {
fi
if [[ "$success" == "true" ]]; then
log_success "Cleared any previous stop block for host"
log_success "Cleared previous removal block via Pulse API"
else
log_warn "Unable to confirm removal block clearance (continuing)"
log_warn 'Pulse still considers this host removed.'
log_warn 'If you just reinstalled, visit Pulse → Docker → Removed Hosts and allow re-enroll,'
log_warn 'or rerun this installer with an API token that includes the docker:manage scope.'
fi
return 0
@ -1070,31 +1373,49 @@ if [[ -n "$NO_AUTO_UPDATE_FLAG" ]]; then
fi
fi
log_header 'Preparing service environment'
ensure_service_user
ensure_service_home
ensure_docker_group_membership
write_env_file
configure_polkit_rule
# Create systemd service
log_header 'Configuring systemd service'
SYSTEMD_ENV_TARGETS_LINE=""
if [[ -n "$JOINED_TARGETS" ]]; then
SYSTEMD_ENV_TARGETS_LINE="Environment=\"PULSE_TARGETS=$JOINED_TARGETS\""
fi
SYSTEMD_ENV_URL_LINE="Environment=\"PULSE_URL=$PRIMARY_URL\""
SYSTEMD_ENV_TOKEN_LINE="Environment=\"PULSE_TOKEN=$PRIMARY_TOKEN\""
SYSTEMD_ENV_INSECURE_LINE="Environment=\"PULSE_INSECURE_SKIP_VERIFY=$PRIMARY_INSECURE\""
cat > "$SERVICE_PATH" << EOF
[Unit]
Description=Pulse Docker Agent
After=network-online.target docker.service
Wants=network-online.target
After=network-online.target docker.socket docker.service
Wants=network-online.target docker.socket
[Service]
Type=simple
$SYSTEMD_ENV_URL_LINE
$SYSTEMD_ENV_TOKEN_LINE
$SYSTEMD_ENV_TARGETS_LINE
$SYSTEMD_ENV_INSECURE_LINE
EnvironmentFile=-$ENV_FILE
ExecStart=$AGENT_PATH --url "$PRIMARY_URL" --interval "$INTERVAL"$NO_AUTO_UPDATE_FLAG
Restart=on-failure
RestartSec=5s
User=root
StartLimitIntervalSec=120
StartLimitBurst=5
User=$SERVICE_USER_ACTUAL
Group=$SERVICE_GROUP_ACTUAL
$SYSTEMD_SUPPLEMENTARY_GROUPS_LINE
UMask=0077
NoNewPrivileges=yes
RestrictSUIDSGID=yes
RestrictRealtime=yes
PrivateTmp=yes
ProtectSystem=full
ProtectHome=read-only
ProtectControlGroups=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
ProtectKernelLogs=yes
LockPersonality=yes
MemoryDenyWriteExecute=yes
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
ReadWritePaths=/var/run/docker.sock
ProtectHostname=yes
ProtectClock=yes
[Install]
WantedBy=multi-user.target


@ -67,6 +67,10 @@ INTERVAL="30s"
UNINSTALL="false"
PLATFORM=""
FORCE=false
KEYCHAIN_ENABLED=true
KEYCHAIN_OPT_OUT=false
KEYCHAIN_OPT_OUT_REASON=""
USE_KEYCHAIN=false
while [[ $# -gt 0 ]]; do
case "$1" in
@ -94,6 +98,12 @@ while [[ $# -gt 0 ]]; do
FORCE=true
shift
;;
--no-keychain)
KEYCHAIN_ENABLED=false
KEYCHAIN_OPT_OUT=true
KEYCHAIN_OPT_OUT_REASON="flag"
shift
;;
*)
echo "Unknown option: $1"
exit 1
@ -136,6 +146,10 @@ fi
print_header
if [[ "$FORCE" == true ]]; then
log_warn "--force enabled: skipping interactive confirmations and accepting secure defaults."
fi
# Interactive prompts if parameters not provided (unless --force is used)
if [[ -z "$PULSE_URL" ]]; then
if [[ "$FORCE" == false ]]; then
@ -148,7 +162,10 @@ fi
if [[ -z "$PULSE_URL" ]]; then
log_error "Pulse URL is required"
echo "Usage: $0 --url <pulse-url> --token <api-token> [--interval 30s] [--platform linux|darwin|windows] [--force]"
echo "Usage: $0 --url <pulse-url> --token <api-token> [--interval 30s] [--platform linux|darwin|windows] [--force] [--no-keychain]"
echo ""
echo " --force Skip interactive prompts and accept secure defaults (including Keychain storage)."
echo " --no-keychain Disable Keychain storage and embed the token in the launch agent plist instead."
exit 1
fi
@ -416,8 +433,32 @@ elif [[ "$PLATFORM" == "darwin" ]] && command -v launchctl &> /dev/null; then
mkdir -p "$MACOS_LOG_DIR"
mkdir -p "$HOME/Library/LaunchAgents"
if [[ -n "$PULSE_TOKEN" && "$KEYCHAIN_ENABLED" == true && "$FORCE" == false ]]; then
echo ""
log_info "It is recommended to store the token in your Keychain so it never lands on disk."
KEYCHAIN_PROMPTED=false
if [[ -t 0 ]]; then
read -r -p "Store the token in the macOS Keychain? [Y/n]: " KEYCHAIN_RESPONSE
KEYCHAIN_PROMPTED=true
elif [[ -r /dev/tty ]]; then
read -r -p "Store the token in the macOS Keychain? [Y/n]: " KEYCHAIN_RESPONSE </dev/tty
KEYCHAIN_PROMPTED=true
else
log_warn "No interactive terminal detected; defaulting to Keychain storage. Use --no-keychain to opt out."
fi
if [[ "$KEYCHAIN_PROMPTED" == true && "$KEYCHAIN_RESPONSE" =~ ^[Nn] ]]; then
KEYCHAIN_ENABLED=false
KEYCHAIN_OPT_OUT=true
KEYCHAIN_OPT_OUT_REASON="prompt"
fi
echo ""
fi
# Store token in macOS Keychain for better security
if [[ -n "$PULSE_TOKEN" ]]; then
if [[ -n "$PULSE_TOKEN" && "$KEYCHAIN_ENABLED" == true ]]; then
log_info "For security, the token is stored in your macOS Keychain so it never lands on disk."
log_info "macOS may ask to allow access the first time the agent runs."
log_info "Use --no-keychain to opt out (the token will be embedded in the launchd plist instead)."
log_info "Storing token in macOS Keychain..."
# Delete existing keychain entry if it exists
@ -429,12 +470,13 @@ elif [[ "$PLATFORM" == "darwin" ]] && command -v launchctl &> /dev/null; then
KEYCHAIN_APPS=(
"/usr/local/bin/pulse-host-agent"
"/usr/local/bin/pulse-host-agent-wrapper.sh"
"/usr/bin/security"
)
KEYCHAIN_ARGS=()
for app in "${KEYCHAIN_APPS[@]}"; do
KEYCHAIN_ARGS+=(-T "$app")
if [[ -e "$app" ]]; then
KEYCHAIN_ARGS+=(-T "$app")
fi
done
if security add-generic-password \
@ -456,6 +498,17 @@ elif [[ "$PLATFORM" == "darwin" ]] && command -v launchctl &> /dev/null; then
log_info "You may need to grant Keychain access permissions"
USE_KEYCHAIN=false
fi
elif [[ -n "$PULSE_TOKEN" ]]; then
if [[ "$KEYCHAIN_OPT_OUT" == true ]]; then
if [[ "$KEYCHAIN_OPT_OUT_REASON" == "flag" ]]; then
log_warn "Keychain storage disabled via --no-keychain; token will be embedded in the launchd plist."
elif [[ "$KEYCHAIN_OPT_OUT_REASON" == "prompt" ]]; then
log_warn "Keychain storage skipped at user prompt; token will be embedded in the launchd plist."
fi
else
log_warn "Keychain storage disabled; token will be embedded in the launchd plist."
fi
USE_KEYCHAIN=false
else
USE_KEYCHAIN=false
fi


@ -67,12 +67,19 @@ SOCKET_PATH="${RUNTIME_DIR}/pulse-sensor-proxy.sock"
WORK_DIR="/var/lib/pulse-sensor-proxy"
SSH_DIR="${WORK_DIR}/ssh"
CONFIG_DIR="/etc/pulse-sensor-proxy"
CTID_FILE="${CONFIG_DIR}/ctid"
CLEANUP_SCRIPT_PATH="/usr/local/bin/pulse-sensor-cleanup.sh"
CLEANUP_PATH_UNIT="/etc/systemd/system/pulse-sensor-cleanup.path"
CLEANUP_SERVICE_UNIT="/etc/systemd/system/pulse-sensor-cleanup.service"
CLEANUP_REQUEST_PATH="${WORK_DIR}/cleanup-request.json"
SERVICE_USER="pulse-sensor-proxy"
LOG_DIR="/var/log/pulse/sensor-proxy"
SHARE_DIR="/usr/local/share/pulse"
STORED_INSTALLER="${SHARE_DIR}/install-sensor-proxy.sh"
SELFHEAL_SCRIPT="/usr/local/bin/pulse-sensor-proxy-selfheal.sh"
SELFHEAL_SERVICE_UNIT="/etc/systemd/system/pulse-sensor-proxy-selfheal.service"
SELFHEAL_TIMER_UNIT="/etc/systemd/system/pulse-sensor-proxy-selfheal.timer"
SCRIPT_SOURCE="$(readlink -f "${BASH_SOURCE[0]:-$0}" 2>/dev/null || printf '%s' "${BASH_SOURCE[0]:-$0}")"
cleanup_local_authorized_keys() {
local auth_keys_file="/root/.ssh/authorized_keys"
@ -198,6 +205,34 @@ EOF
print_success "Removed cleanup service unit ${CLEANUP_SERVICE_UNIT}"
fi
if [[ -f "$SELFHEAL_TIMER_UNIT" ]]; then
systemctl stop pulse-sensor-proxy-selfheal.timer 2>/dev/null || true
systemctl disable pulse-sensor-proxy-selfheal.timer 2>/dev/null || true
rm -f "$SELFHEAL_TIMER_UNIT"
print_success "Removed self-heal timer ${SELFHEAL_TIMER_UNIT}"
fi
if [[ -f "$SELFHEAL_SERVICE_UNIT" ]]; then
systemctl stop pulse-sensor-proxy-selfheal.service 2>/dev/null || true
systemctl disable pulse-sensor-proxy-selfheal.service 2>/dev/null || true
rm -f "$SELFHEAL_SERVICE_UNIT"
print_success "Removed self-heal service ${SELFHEAL_SERVICE_UNIT}"
fi
if [[ -f "$SELFHEAL_SCRIPT" ]]; then
rm -f "$SELFHEAL_SCRIPT"
print_success "Removed self-heal helper ${SELFHEAL_SCRIPT}"
fi
if [[ -f "$STORED_INSTALLER" ]]; then
rm -f "$STORED_INSTALLER"
print_success "Removed cached installer ${STORED_INSTALLER}"
fi
if [[ -f "$CTID_FILE" ]]; then
rm -f "$CTID_FILE"
fi
if command -v systemctl >/dev/null 2>&1; then
systemctl daemon-reload 2>/dev/null || true
fi
@ -558,6 +593,11 @@ install -d -o pulse-sensor-proxy -g pulse-sensor-proxy -m 0700 "$SSH_DIR"
install -m 0600 -o pulse-sensor-proxy -g pulse-sensor-proxy /dev/null "$SSH_DIR/known_hosts"
install -d -o pulse-sensor-proxy -g pulse-sensor-proxy -m 0755 /etc/pulse-sensor-proxy
if [[ -n "$CTID" ]]; then
echo "$CTID" > "$CTID_FILE"
chmod 0644 "$CTID_FILE"
fi
# Create config file with ACL for Docker containers (standalone mode)
if [[ "$STANDALONE" == true ]]; then
print_info "Creating config file with Docker container ACL..."
@ -946,7 +986,7 @@ LimitNOFILE=1024
SERVICE_EOF
# Enable and start the path unit
systemctl daemon-reload
systemctl daemon-reload || true
systemctl enable pulse-sensor-cleanup.path
systemctl start pulse-sensor-cleanup.path
@ -1261,7 +1301,7 @@ for key_type in id_rsa id_dsa id_ecdsa id_ed25519; do
print_warn "Found legacy SSH key: /root/.ssh/$key_type"
fi
pct exec "$CTID" -- rm -f "/root/.ssh/$key_type" "/root/.ssh/${key_type}.pub"
print_info "  Removed /root/.ssh/$key_type"
fi
done
@@ -1272,6 +1312,88 @@ if [ "$LEGACY_KEYS_FOUND" = true ] && [ "$QUIET" != true ]; then
fi
fi # End of container-specific configuration
# Install self-heal safeguards to keep proxy available
print_info "Configuring self-heal safeguards..."
if [[ -n "$SCRIPT_SOURCE" && -f "$SCRIPT_SOURCE" ]]; then
install -d "$SHARE_DIR"
cp "$SCRIPT_SOURCE" "$STORED_INSTALLER"
chmod 0755 "$STORED_INSTALLER"
else
print_warn "Unable to cache installer script for self-heal (source path unavailable)"
fi
cat > "$SELFHEAL_SCRIPT" <<'EOF'
#!/bin/bash
set -euo pipefail
SERVICE="pulse-sensor-proxy"
INSTALLER="/usr/local/share/pulse/install-sensor-proxy.sh"
CTID_FILE="/etc/pulse-sensor-proxy/ctid"
LOG_TAG="pulse-sensor-proxy-selfheal"
log() {
logger -t "$LOG_TAG" "$1"
}
if ! command -v systemctl >/dev/null 2>&1; then
exit 0
fi
if ! systemctl list-unit-files | grep -q "^${SERVICE}\\.service"; then
if [[ -x "$INSTALLER" && -f "$CTID_FILE" ]]; then
log "Service unit missing; attempting reinstall"
bash "$INSTALLER" --ctid "$(cat "$CTID_FILE")" --skip-restart --quiet || log "Reinstall attempt failed"
fi
exit 0
fi
if ! systemctl is-active --quiet "${SERVICE}.service"; then
systemctl start "${SERVICE}.service" || true
sleep 2
fi
if ! systemctl is-active --quiet "${SERVICE}.service"; then
if [[ -x "$INSTALLER" && -f "$CTID_FILE" ]]; then
log "Service failed to start; attempting reinstall"
bash "$INSTALLER" --ctid "$(cat "$CTID_FILE")" --skip-restart --quiet || log "Reinstall attempt failed"
systemctl start "${SERVICE}.service" || true
fi
fi
EOF
chmod 0755 "$SELFHEAL_SCRIPT"
cat > "$SELFHEAL_SERVICE_UNIT" <<'EOF'
[Unit]
Description=Pulse Sensor Proxy Self-Heal
After=network-online.target
Wants=network-online.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/pulse-sensor-proxy-selfheal.sh
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
EOF
cat > "$SELFHEAL_TIMER_UNIT" <<'EOF'
[Unit]
Description=Ensure pulse-sensor-proxy stays installed and running
[Timer]
OnBootSec=5min
OnUnitActiveSec=30min
Unit=pulse-sensor-proxy-selfheal.service
[Install]
WantedBy=timers.target
EOF
systemctl daemon-reload
systemctl enable --now pulse-sensor-proxy-selfheal.timer >/dev/null 2>&1 || true
if [ "$QUIET" = true ]; then
print_success "pulse-sensor-proxy installed and running"
else