Coolify: Fix managed service start (CoolifyTask failing) #442

Closed
opened 2026-02-22 07:21:39 +00:00 by jason.woltje · 1 comment
Owner

Root Cause

Coolify's restart operation (stop then start) combined with its periodic CleanupDocker action causes image pruning between the stop and start phases. When containers are stopped, images become 'unused' and get pruned. The subsequent start phase fails with 'No such image' errors.

Additionally, large images (400MB+) exceed CoolifyTask's ~40s timeout during pulls, so the start fails whenever an image actually needs to be downloaded rather than served from the local cache.

Resolution

Established a reliable start procedure:

  1. Pre-pull all 6 images via docker pull before Coolify operations
  2. Remove stale networks (ug0ssok4g44wocok8kws8gg8_internal) that block compose up
  3. Start via Coolify API — CoolifyTask completes in ~14s when images are cached
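The procedure above can be sketched as a shell script. The image list is a placeholder — the 6 actual image names are not recorded in this issue — and step 3 is left as a comment since the start itself goes through Coolify:

```shell
#!/usr/bin/env bash
# Sketch of the reliable start procedure. IMAGES is a placeholder list;
# substitute the service's 6 real images.
set -euo pipefail

IMAGES="registry.example.com/web:latest registry.example.com/api:latest"

# 1. Pre-pull so CoolifyTask never has to download inside its ~40s window
for img in $IMAGES; do
  docker pull "$img"
done

# 2. Drop the stale network that blocks compose up (ignore if absent)
docker network rm ug0ssok4g44wocok8kws8gg8_internal 2>/dev/null || true

# 3. Trigger the start via Coolify's UI/API; with all images cached,
#    CoolifyTask completes in ~14s instead of timing out
```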

Verified Lifecycle

  • Stop via API: all 6 containers removed
  • Start via API (with pre-pulled images): all 6 containers healthy in ~30s
  • Coolify dashboard shows running:healthy

Operational Note

Before any Coolify restart/start, always pre-pull images first. This is documented in docs/COOLIFY-DEPLOYMENT.md.

Author
Owner

Root Cause Found

The CoolifyTask failure was a red herring. The real issue was:

  1. Containers were started manually via docker compose up -d from the service directory, which created a work_internal network (from the docker compose project name "work")
  2. Containers ended up on TWO networks: ug0ssok4g44wocok8kws8gg8 (Coolify's) and work_internal (stale)
  3. Neither container had a traefik.docker.network label
  4. Traefik randomly picked which network to use per container — picked the wrong one for the web container
  5. Traefik tried to route to the web container via work_internal, which Traefik wasn't connected to → timeout
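The failure mode above can be confirmed with `docker inspect` (the container name `web` is illustrative): two network attachments plus an empty `traefik.docker.network` label means Traefik is choosing a network on its own.

```shell
# List every network the container is attached to; two entries here means
# Traefik may route over the wrong one
docker inspect web --format '{{range $name, $_ := .NetworkSettings.Networks}}{{$name}} {{end}}'

# Print the traefik.docker.network label; empty output means no pin is set
docker inspect web --format '{{index .Config.Labels "traefik.docker.network"}}'
```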

Resolution

  1. Force-removed all containers
  2. Removed stale work_internal network
  3. Recreated containers with correct project name: docker compose -p ug0ssok4g44wocok8kws8gg8 up -d
  4. Both web and API now accessible via HTTPS
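A sketch of the recovery commands, assuming the mis-started containers carry compose's `com.docker.compose.project=work` label and that this runs from the service's compose directory:

```shell
# 1. Force-remove the containers created under the wrong project name "work"
docker ps -aq --filter "label=com.docker.compose.project=work" | xargs -r docker rm -f

# 2. Remove the stale network left behind by that project
docker network rm work_internal

# 3. Recreate under Coolify's project name so containers land on the
#    ug0ssok4g44wocok8kws8gg8 network Traefik expects
docker compose -p ug0ssok4g44wocok8kws8gg8 up -d
```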

Remaining

Coolify's managed start (CoolifyTask) still needs to be verified — the containers were started via docker CLI, not through Coolify's UI/API. Coolify should be able to manage restarts/redeploys going forward.
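Once a Coolify-managed start has been tried, one quick sanity check is that every running container sits only on Coolify's network (a sketch; container names are taken from whatever is running):

```shell
# Print each running container with its network attachments; any line
# still mentioning work_internal indicates the stale network survived
for c in $(docker ps --format '{{.Names}}'); do
  nets=$(docker inspect "$c" --format '{{range $n, $_ := .NetworkSettings.Networks}}{{$n}} {{end}}')
  echo "$c: $nets"
done
```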


Reference: mosaic/stack#442