ci: add Docker image cleanup to prevent runner disk exhaustion (#333)

## Problem

The self-hosted runner (`meshcore-runner-2`) filled its 29GB disk to
100%, blocking all CI runs:

```
Filesystem  Size  Used Avail Use%
/dev/root    29G   29G  2.3M 100%

Docker Images: 67 total, 2 active, 18.83GB reclaimable (99%)
```

Root cause: no Docker image cleanup after builds. Each CI run builds a
new image but never prunes old ones.

## Fix

### 1. Docker image cleanup after deploy (`deploy` job)
- Runs with `if: always()` so it executes even if deploy fails
- `docker image prune -af --filter "until=24h"` — removes images older
than 24h (safe: current build is minutes old)
- `docker builder prune -f --keep-storage=1GB` — caps build cache
- Logs before/after `docker system df` for visibility

### 2. Runner log cleanup at start of E2E job
- Prunes runner diagnostic logs older than 3 days (was 53MB and growing)
- Reports `df -h` for disk visibility in CI output

## Impact

After manual cleanup today, disk went from 100% → 35% (19GB free). This
PR prevents recurrence.

## Test plan
- [x] Manual cleanup verified on runner via `az vm run-command`
- [ ] Next CI run should show cleanup step output in deploy job logs

Co-authored-by: Kpa-clawbot <259247574+Kpa-clawbot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
Kpa-clawbot
2026-03-31 16:00:31 -07:00
committed by GitHub
parent dc9d6ba8df
commit 38e5f02a00
+18
View File
@@ -129,6 +129,13 @@ jobs:
with:
fetch-depth: 0
- name: Free disk space
run: |
# Prune old runner diagnostic logs (can accumulate 50MB+)
find ~/actions-runner/_diag/ -name '*.log' -mtime +3 -delete 2>/dev/null || true
# Show available disk space
df -h / | tail -1
- name: Set up Node.js 22
uses: actions/setup-node@v5
with:
@@ -314,6 +321,17 @@ jobs:
exit 1
fi
- name: Clean up old Docker images
if: always()
run: |
# Remove dangling images and images older than 24h (keeps current build)
echo "--- Docker disk usage before cleanup ---"
docker system df
docker image prune -af --filter "until=24h" 2>/dev/null || true
docker builder prune -f --keep-storage=1GB 2>/dev/null || true
echo "--- Docker disk usage after cleanup ---"
docker system df
# ───────────────────────────────────────────────────────────────
# 5. Publish Badges & Summary (master only)
# ───────────────────────────────────────────────────────────────