Deployment Guide¶

Comprehensive guide for deploying the Freeze Design webshop to the staging environment on a Hetzner Cloud VPS using Docker Compose, GitHub Actions CI/CD, and zero-downtime rolling deployments.

No production environment yet

There is currently no production VPS. Only staging (staging.freezedesign.eu) exists. The Deploy to Production workflow (.github/workflows/deploy-production.yml) has been disabled (June 2026) until a production server is provisioned. All production sections below are kept as a runbook for that moment.

1. Overview¶

Architecture¶

The application runs as seven Docker containers orchestrated by Docker Compose:

Container	Image	Purpose
nginx	`nginx:alpine`	Reverse proxy, SSL termination, static file serving
backend	GHCR (Django/Gunicorn)	REST API, admin panel
frontend	GHCR (Next.js)	Server-side rendered storefront
db	`postgres:15-alpine`	PostgreSQL 15 database with SSL
redis	`redis:7-alpine`	Celery broker and cache (NOT RabbitMQ)
celery	Same as backend	Async task processing (payments, emails)
celery-beat	Same as backend	Scheduled task scheduler

External services:

DigitalOcean Spaces -- media/image storage (S3-compatible)
AWS S3 -- database backup storage
Cloudflare -- CDN, DNS, DDoS protection
Let's Encrypt -- SSL certificates via certbot
Mollie -- payment processing
Sentry -- error tracking
Discord -- deployment and error notifications

Environment Comparison¶

Aspect	Staging	Production
Domain	`staging.freezedesign.eu`	`freezedesign.eu`
Public access	Basic auth protected	Public
VPS plan	CX23 or smaller (4GB RAM)	CX33 (8GB RAM)
Compose file	`docker-compose.staging.yml`	`docker-compose.prod.yml`
Env file	`backend/.env.staging`	`backend/.env.production`
CI/CD trigger	Auto: `deploy` job in `ci.yml` on push to `staging` (after tests pass)	None yet — workflow disabled (no production VPS)
Gunicorn workers	2	4
Celery concurrency	1	2
Redis memory	128MB	256MB
PostgreSQL tuning	512MB shared_buffers	1GB shared_buffers
Mollie API	Test key (`test_...`)	Live key (`live_...`)
Sentry environment	`staging`	`production`
CSP	Report-only	Enforced
Search indexing	`X-Robots-Tag: noindex`	Allowed
Backup retention	7 days	30 days

Deployment Flow¶

PR squash-merged into staging
       |
       v
CI runs (backend, frontend, build, E2E smoke + visual)
       |
       v (all green)
deploy job builds Docker images
       |
       v
Images pushed to GHCR (ghcr.io)
       |
       v
SSH into VPS, pull new images
       |
       v
Rolling deployment (scale up, health check, scale down)
       |
       v
Run database migrations
       |
       v
Restart nginx to refresh upstream DNS
       |
       v
Post-deploy health verification
       |
       v (if failed)
Automatic rollback to previous images

2. Server Setup¶

Initial VPS Configuration¶

Provision a Hetzner Cloud VPS (CX33 for production, CX23 or smaller for staging) with Ubuntu 22.04 LTS in the Nuremberg (eu-central) or Falkenstein region.

# Connect as root
ssh root@<SERVER_IP>

# Update system
apt update && apt upgrade -y

# Create deploy user
adduser deploy
usermod -aG sudo deploy

# Set up SSH key authentication
mkdir -p /home/deploy/.ssh
cp ~/.ssh/authorized_keys /home/deploy/.ssh/
chown -R deploy:deploy /home/deploy/.ssh
chmod 700 /home/deploy/.ssh
chmod 600 /home/deploy/.ssh/authorized_keys

# Disable password authentication (the pattern also matches the commented-out default line)
sed -i -E 's/^#?[[:space:]]*PasswordAuthentication[[:space:]]+(yes|no).*/PasswordAuthentication no/' /etc/ssh/sshd_config
systemctl restart sshd

Firewall¶

ufw allow OpenSSH
ufw allow 80/tcp
ufw allow 443/tcp
ufw enable

Install Docker¶

# Install Docker Engine
curl -fsSL https://get.docker.com | sh
usermod -aG docker deploy

# Configure DNS for Docker builds (required on some VPS providers)
echo '{"dns": ["8.8.8.8", "8.8.4.4"]}' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker

# Verify
docker --version
docker compose version

Install Certbot¶

sudo apt install -y certbot

Create Application Directory¶

CI/CD deploys to /opt/webshop, not the git clone directory:

sudo mkdir -p /opt/webshop
sudo chown deploy:deploy /opt/webshop

3. Staging Deployment¶

Prerequisites¶

Server setup completed (section 2)
DNS A record for staging.freezedesign.eu pointing to the server IP (via Cloudflare, proxied)
GHCR packages set to public visibility (or configure auth on VPS)

Step 1: Clone the Repository¶

su - deploy
git clone git@github.com:Voorman/webshop_freeze_design.git /opt/webshop
cd /opt/webshop

Note: Private repos require SSH deploy keys. Use the git@github.com: URL, not HTTPS. Add a deploy key in GitHub repo settings.

Step 2: Configure Environment¶

cp backend/.env.staging.example backend/.env.staging
nano backend/.env.staging

Fill in all secrets. Critical variables:

Variable	How to Generate / Where to Find
`SECRET_KEY`	`openssl rand -base64 32 | tr -dc 'a-zA-Z0-9'`
`REDIS_PASSWORD`	`openssl rand -base64 32 | tr -dc 'a-zA-Z0-9' | head -c 32`
`POSTGRES_PASSWORD`	Strong random password
`DB_PASSWORD`	Same value as `POSTGRES_PASSWORD`
`MOLLIE_API_KEY`	Mollie dashboard (use `test_...` key for staging)
`DO_SPACES_ACCESS_KEY`	DigitalOcean dashboard > API > Spaces Keys
`DO_SPACES_SECRET_KEY`	DigitalOcean dashboard > API > Spaces Keys
`SENTRY_DSN`	Sentry project settings

Important notes on secrets:

SECRET_KEY must be alphanumeric only -- avoid special characters that require shell escaping.
REDIS_PASSWORD must be alphanumeric only -- it is embedded in URLs (CELERY_BROKER_URL, REDIS_URL). Characters like / break URL parsing and cause "Port could not be cast to integer" errors.
BACKEND_URL must NOT include /api -- the payment service builds webhook URLs as {BACKEND_URL}/api/payments/webhook/. Setting it to https://staging.freezedesign.eu/api would produce a double /api/api/ path.
ALLOWED_HOSTS should include localhost for Docker health checks to work.
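The rules above can be checked before the first start. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes the env file uses plain KEY=value lines without quotes or inline comments.

```shell
# Pre-flight checks for an env file such as backend/.env.staging (illustrative only).

# Print the value of a key from a KEY=value file (last match wins).
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Succeed only if the value is non-empty and purely alphanumeric.
is_alnum() {
  case $1 in
    '' | *[!A-Za-z0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Succeed if a comma-separated host list contains the given host.
has_host() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Run all checks against one env file; returns non-zero if any check fails.
check_env_file() {
  file=$1
  status=0
  for key in SECRET_KEY REDIS_PASSWORD; do
    if ! is_alnum "$(env_value "$file" "$key")"; then
      echo "$key must be non-empty and alphanumeric" >&2
      status=1
    fi
  done
  if ! has_host "$(env_value "$file" ALLOWED_HOSTS)" localhost; then
    echo "ALLOWED_HOSTS should include localhost" >&2
    status=1
  fi
  return "$status"
}

# Usage on the VPS:
#   check_env_file /opt/webshop/backend/.env.staging && echo "env file looks sane"
```

The checks only cover the rules listed above; they do not validate that the secrets are correct or strong.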

Lock down the file:

chmod 600 backend/.env.staging

Step 3: Create the .env File for Docker Compose¶

The command: directives in docker-compose.staging.yml reference ${REDIS_PASSWORD}. Docker Compose interpolates that variable when it parses the compose file, taking the value from the shell environment or from the .env file next to the compose file, not from env_file:. Create a .env file next to the compose file:

cat > /opt/webshop/.env << 'EOF'
REDIS_PASSWORD=<same-password-as-in-backend/.env.staging>
IMAGE_TAG=staging
EOF

chmod 600 /opt/webshop/.env
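Because the password now lives in two files, a mismatch is easy to introduce. A small sketch of a consistency check follows; the function names are hypothetical, and it assumes both files use plain KEY=value lines without quotes.

```shell
# Print the value of a key from a KEY=value file (last match wins).
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Succeed only if REDIS_PASSWORD is set in both files and the values are identical.
# $1 = compose-level .env, $2 = backend env file
same_redis_password() {
  compose_pw=$(env_value "$1" REDIS_PASSWORD)
  backend_pw=$(env_value "$2" REDIS_PASSWORD)
  [ -n "$compose_pw" ] && [ "$compose_pw" = "$backend_pw" ]
}

# Usage on the VPS:
#   same_redis_password /opt/webshop/.env /opt/webshop/backend/.env.staging \
#     || echo "REDIS_PASSWORD differs or is missing"
```

To see what Compose actually interpolates, `docker compose -f docker-compose.staging.yml config` prints the resolved configuration. Note that the output contains the secrets in clear text.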

Step 4: Generate PostgreSQL SSL Certificates¶

cd /opt/webshop
mkdir -p certs

# Generate self-signed certs for PostgreSQL SSL
openssl req -new -x509 -days 3650 -nodes \
  -out certs/server.crt \
  -keyout certs/server.key \
  -subj "/CN=db"

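# Restrict the key first, then hand both files to the postgres user
# (UID 70 in the Alpine image); after the chown, chmod would need sudo
chmod 600 certs/server.key
sudo chown 70:70 certs/server.key certs/server.crt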

Step 5: Set Up Basic Auth for Staging¶

Staging is protected with HTTP basic authentication:

sudo apt install -y apache2-utils
mkdir -p nginx
htpasswd -c nginx/.htpasswd <USERNAME>

Step 6: Copy Nginx Configuration¶

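The staging nginx config is part of the repository. Because step 1 cloned the repository directly into /opt/webshop, nginx/nginx.staging.conf is already in place and no copy is needed (copying the file onto itself makes cp fail with "are the same file"). Confirm it exists:

ls -l /opt/webshop/nginx/nginx.staging.conf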

Step 7: Generate SSL Certificates¶

Verify DNS is properly configured before running certbot:

# Verify DNS from multiple sources
dig @8.8.8.8 staging.freezedesign.eu A +short
dig @8.8.8.8 staging.freezedesign.eu AAAA +short

# If AAAA record exists, ensure it points to the correct VPS IPv6
ip -6 addr show | grep "inet6.*global"

Warning: Let's Encrypt prefers IPv6. If the domain has an AAAA record pointing to the wrong server, SSL validation will fail. Delete the AAAA record or update it before proceeding.

Generate the certificate:

chmod +x nginx/certbot-init.sh

# Test with Let's Encrypt staging first
./nginx/certbot-init.sh --domain staging.freezedesign.eu --compose-file docker-compose.staging.yml --staging

# If staging works, generate real certificates
./nginx/certbot-init.sh --domain staging.freezedesign.eu --compose-file docker-compose.staging.yml

Step 8: Create the Database¶

The POSTGRES_DB env var tells Django which database to connect to, but does not auto-create it. After starting the db container:

# Start only the database service first
docker compose -f docker-compose.staging.yml up -d db

# Wait for it to be healthy
docker compose -f docker-compose.staging.yml ps db

# Create the database (connect to 'postgres' default database first)
docker compose -f docker-compose.staging.yml exec db \
  psql -U webshop_staging -d postgres -c "CREATE DATABASE webshop_staging;"

Step 9: Start All Services¶

docker compose -f docker-compose.staging.yml up -d

Verify all services are running and healthy:

docker compose -f docker-compose.staging.yml ps

All services should show "Up (healthy)" status.
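Health checks take a while to turn green after startup, so a single `ps` call can be misleading. The sketch below polls until a service reports healthy. The helper names are hypothetical, and it assumes each service defines a Docker healthcheck and runs as a single container.

```shell
# Retry a command until it succeeds or the attempts run out.
# usage: wait_until <attempts> <delay_seconds> <command...>
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Succeed if the container of a compose service reports health status "healthy".
# $1 = compose file, $2 = service name
service_healthy() {
  cid=$(docker compose -f "$1" ps -q "$2") || return 1
  [ "$(docker inspect --format '{{.State.Health.Status}}' "$cid")" = "healthy" ]
}

# Usage: wait up to 30 x 5 seconds for the backend
#   wait_until 30 5 service_healthy docker-compose.staging.yml backend \
#     || echo "backend did not become healthy"
```

Keeping the retry loop separate from the probe means the same loop can wrap other probes, for example the `pg_isready` call used for the database.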

Step 10: Run Database Migrations and Create Staff User¶

docker compose -f docker-compose.staging.yml exec backend python manage.py migrate
docker compose -f docker-compose.staging.yml exec backend python manage.py create_staff_user

Step 11: Verify Health¶

# Backend health (requires basic auth)
curl -u <USERNAME>:<PASSWORD> -f https://staging.freezedesign.eu/api/health/

# Nginx health (bypasses basic auth)
curl -f https://staging.freezedesign.eu/health

# Frontend loads
curl -u <USERNAME>:<PASSWORD> -sI https://staging.freezedesign.eu/ | head -5

# SSL certificate
openssl s_client -connect staging.freezedesign.eu:443 \
  -servername staging.freezedesign.eu < /dev/null 2>/dev/null | \
  openssl x509 -noout -dates

4. Production Deployment¶

Runbook only

This section describes the intended production setup. No production VPS exists yet; do not run these steps until one is provisioned.

Prerequisites¶

Server setup completed (section 2) on CX33 VPS (8GB RAM)
DNS A record for freezedesign.eu and www.freezedesign.eu pointing to the server
Cloudflare configured (SSL/TLS mode: Full strict)
GHCR packages set to public visibility

Step 1: Configure Environment¶

cd /opt/webshop
cp backend/.env.production.example backend/.env.production
nano backend/.env.production

Production-specific differences from staging:

Variable	Production Value
`ALLOWED_HOSTS`	`freezedesign.eu,www.freezedesign.eu,localhost`
`BACKEND_URL`	`https://freezedesign.eu` (no `/api` suffix)
`MOLLIE_API_KEY`	Live key (`live_...`)
`SENTRY_ENVIRONMENT`	`production`
`SENTRY_TRACES_SAMPLE_RATE`	`0.1`
`CSP_REPORT_ONLY`	`False`
`BACKUP_RETENTION_DAYS`	`30`

chmod 600 backend/.env.production

Create the compose-level .env:

cat > /opt/webshop/.env << 'EOF'
REDIS_PASSWORD=<your-alphanumeric-redis-password>
IMAGE_TAG=latest
NEXT_PUBLIC_API_URL=https://freezedesign.eu/api
EOF

chmod 600 /opt/webshop/.env

Step 2: PostgreSQL SSL Certificates¶

Same process as staging (see section 3, step 4).

Step 3: Generate SSL Certificates¶

# Verify DNS
dig @8.8.8.8 freezedesign.eu A +short
dig @8.8.8.8 www.freezedesign.eu A +short
curl -4 ifconfig.me  # confirm VPS IP matches

# Generate certificates (includes freezedesign.eu, www, and staging)
./nginx/certbot-init.sh

Step 4: Create Database and Start Services¶

# Start database
docker compose -f docker-compose.prod.yml up -d db
docker compose -f docker-compose.prod.yml exec db \
  psql -U webshop_prod -d postgres -c "CREATE DATABASE webshop_prod;"

# Start all services
docker compose -f docker-compose.prod.yml up -d

# Run migrations
docker compose -f docker-compose.prod.yml exec backend python manage.py migrate

# Create staff user
docker compose -f docker-compose.prod.yml exec backend python manage.py create_staff_user

Step 5: Verify Health¶

curl -f https://freezedesign.eu/api/health/
curl -f https://freezedesign.eu/health
curl -sI https://freezedesign.eu/ | head -5
openssl s_client -connect freezedesign.eu:443 \
  -servername freezedesign.eu < /dev/null 2>/dev/null | \
  openssl x509 -noout -dates

Step 6: Configure Cloudflare¶

DNS records: A record for freezedesign.eu and CNAME for www and staging
SSL/TLS mode: Full (strict)
Cache rules for Next.js static assets, Django static files, and media images
Enable Brotli compression

5. CI/CD Pipeline¶

Staging: Automatic on Push to `staging`¶

The staging deploy is the final deploy job inside .github/workflows/ci.yml. On every push to the staging branch (typically a squash-merged feature PR), CI runs all tests first; only when backend-test, frontend-test, frontend-build, e2e-smoke-test, and e2e-visual-test are green does the deploy job call .github/workflows/deploy-staging.yml via workflow_call.

deploy-staging.yml itself only has workflow_call and workflow_dispatch triggers -- the old standalone workflow_run trigger no longer exists. Use gh workflow run deploy-staging.yml for a manual (re)deploy of the current staging tip.

What it does:

Build phase -- Checks out the deployed SHA, builds Docker images for backend and frontend, pushes to GHCR with tags staging, sha-<commit>, and the branch name. Builds use the GitHub Actions cache with separate scopes (scope=backend / scope=frontend) so the two images don't evict each other's layers
Deploy phase -- Copies deployment scripts and compose file to /opt/webshop on the VPS via SCP, then executes scripts/deploy-rolling.sh via SSH (with immediate scripts/rollback.sh if it fails)
Health verification -- Checks Docker health status of backend and frontend containers. If either fails, triggers automatic rollback via scripts/rollback.sh
k6 performance gate -- Runs loadtests/homepage-load.js against the deployed staging site as a homepage performance gate
Notifications -- Sends Discord notifications on started/success/failure

Production: Disabled¶

The workflow .github/workflows/deploy-production.yml exists but is disabled (gh workflow disable) because there is no production VPS to deploy to. When a production server exists, it is designed to be triggered manually with a version tag input (e.g., v1.19): build images with semver tags, run scripts/deploy-rolling.sh with COMPOSE_FILE=docker-compose.prod.yml, verify health, and roll back automatically on failure.

Required GitHub Secrets¶

Secret	Staging	Production
`STAGING_HOST` / `PRODUCTION_HOST`	VPS IP	VPS IP
`STAGING_USER` / `PRODUCTION_USER`	`deploy`	`deploy`
`STAGING_SSH_KEY` / `PRODUCTION_SSH_KEY`	SSH private key	SSH private key
`STAGING_API_URL` / `PRODUCTION_API_URL`	`https://staging.freezedesign.eu/api`	`https://freezedesign.eu/api`
`STAGING_URL` / `PRODUCTION_URL`	`https://staging.freezedesign.eu`	`https://freezedesign.eu`
`STAGING_BASIC_AUTH_BASE64`	Base64 basic-auth credentials for the k6 gate	n/a
`DISCORD_WEBHOOK_URL`	Discord webhook URL	Discord webhook URL
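A value for STAGING_BASIC_AUTH_BASE64 can be produced as sketched below. That the k6 gate expects the standard HTTP Basic form, base64 of "user:password", is an assumption; check loadtests/homepage-load.js for how the secret is consumed. The function name is hypothetical.

```shell
# Encode credentials as base64("user:password"), the value that follows
# "Authorization: Basic " in an HTTP request header.
basic_auth_b64() {
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Usage (stores the value as a repository secret via the GitHub CLI):
#   gh secret set STAGING_BASIC_AUTH_BASE64 --body "$(basic_auth_b64 <USERNAME> <PASSWORD>)"
```

`printf` is used instead of `echo` so that no trailing newline ends up inside the encoded value.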

Compose File Requirements for CI/CD¶

The compose files used by CI/CD must use image: directives pointing to GHCR, not build:. Using build: causes "No image to be pulled" errors during docker compose pull.

# Correct (staging compose)
backend:
  image: ghcr.io/voorman/webshop_freeze_design/backend:${IMAGE_TAG:-staging}

# Wrong (will fail with CI/CD)
backend:
  build:
    context: ./backend

Use the IMAGE_TAG environment variable for versioning: set it in the .env file at the compose level or export it before running compose commands.
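The `${IMAGE_TAG:-staging}` syntax in the compose file follows the same rule as the shell's own parameter expansion: the fallback applies when the variable is unset or empty. A small sketch that mirrors the behaviour in plain shell (the function name is hypothetical):

```shell
# Resolve the backend image reference the way the compose file does.
backend_image() {
  printf 'ghcr.io/voorman/webshop_freeze_design/backend:%s\n' "${IMAGE_TAG:-staging}"
}

# Usage:
#   unset IMAGE_TAG;  backend_image   # falls back to the staging tag
#   IMAGE_TAG=v1.20;  backend_image   # uses the explicit tag
```

On the VPS, `docker compose -f docker-compose.staging.yml config --images` prints the image names Compose resolves. A variable exported in the shell takes precedence over the value in the compose-level .env file.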

GHCR Package Visibility¶

GHCR packages must be set to public so the VPS can pull images without authentication. Change visibility in GitHub: Package settings > Change visibility > Public.

6. SSL Certificate Management¶

Automatic Renewal¶

Let's Encrypt certificates are valid for 90 days. For staging, renewal is automated by the .github/workflows/ssl-renewal.yml workflow (weekly, Wednesday 04:00 UTC, via SSH), with a crontab on the VPS as safety net -- both run the same idempotent script.

To set up the crontab safety net manually:

# As the deploy user
crontab -e

Add the renewal job (runs daily at 3 AM):

# Staging
0 3 * * * certbot renew --quiet --deploy-hook "docker compose -f /opt/webshop/docker-compose.staging.yml restart nginx"

# Production (if on same server, one renewal covers all domains)
0 3 * * * certbot renew --quiet --deploy-hook "docker compose -f /opt/webshop/docker-compose.prod.yml restart nginx"

Manual Renewal¶

# Check certificate expiry
openssl s_client -connect freezedesign.eu:443 \
  -servername freezedesign.eu < /dev/null 2>/dev/null | \
  openssl x509 -noout -dates

# Force renewal
sudo certbot renew --force-renewal

# Restart nginx to load new certificates
docker compose -f docker-compose.prod.yml restart nginx
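The expiry date printed by openssl can be turned into a number of remaining days, which is easier to compare against a threshold. A minimal sketch, assuming GNU date (standard on Ubuntu); the function name is hypothetical.

```shell
# Print the whole days between now (or a given epoch) and a notAfter date string.
# $1 = date as printed by openssl, e.g. "Jan 31 00:00:00 2026 GMT"
# $2 = optional reference time as epoch seconds (defaults to the current time)
days_until_expiry() {
  not_after=$1
  now=${2:-$(date +%s)}
  expiry=$(date -u -d "$not_after" +%s) || return 1
  echo $(( (expiry - now) / 86400 ))
}

# Usage:
#   end=$(openssl s_client -connect staging.freezedesign.eu:443 \
#           -servername staging.freezedesign.eu < /dev/null 2>/dev/null |
#         openssl x509 -noout -enddate | cut -d= -f2)
#   [ "$(days_until_expiry "$end")" -gt 30 ] || echo "certificate expires within 30 days"
```

For a plain yes/no answer, `openssl x509 -checkend <seconds>` exits non-zero when the certificate expires within the given number of seconds.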

Troubleshooting SSL¶

If Let's Encrypt validation fails:

Verify DNS resolves correctly from external sources:

dig @8.8.8.8 freezedesign.eu A +short
dig @8.8.8.8 freezedesign.eu AAAA +short

If an AAAA record exists, ensure it points to the correct VPS IPv6 address or delete it. Let's Encrypt prefers IPv6 and will fail if the AAAA record points elsewhere.
Ensure port 80 is open (ufw allow 80/tcp).
Ensure the certbot_webroot volume is not mounted as read-only (:ro) in the nginx service. Certbot needs to write challenge files.

7. Updating / Deploying New Versions¶

Via CI/CD (Recommended)¶

Staging: Merge a PR into the staging branch (feature PRs always target staging, squash-merged). CI runs the full test suite and, when green, the deploy job deploys to staging automatically.

gh pr merge <pr-number> --squash
# CI runs on the staging push; the deploy job handles the rest

# Manual (re)deploy of the current staging tip:
gh workflow run deploy-staging.yml

Promotion to main: Run scripts/release.sh --promote-only --no-uat. Since June 2026 this is fully automated: it finds the green CI run with a successful deploy job for the staging HEAD, opens the promotion PR (base=main, head=staging), merges it (rebase), recreates staging if auto-delete removed it, and retargets open PRs back to staging. Note: promoting to main does not deploy anywhere — there is no production environment.

Production: Not possible yet. The "Deploy to Production" workflow is disabled until a production VPS exists.

Manual Deployment¶

If CI/CD is unavailable, deploy manually from the VPS:

cd /opt/webshop

# Pull latest images
export IMAGE_TAG=staging  # or a specific tag like v1.20
docker compose -f docker-compose.staging.yml pull

# Run rolling deployment
export COMPOSE_FILE=docker-compose.staging.yml
export DEPLOYMENT_DIR=/opt/webshop
./scripts/deploy-rolling.sh

# Or for a simpler (non-rolling) update:
docker compose -f docker-compose.staging.yml up -d
docker compose -f docker-compose.staging.yml exec backend python manage.py migrate
docker compose -f docker-compose.staging.yml exec backend python manage.py collectstatic --noinput

After Changing Environment Variables¶

When you change values in .env.staging or .env.production, restarting containers is not enough. Containers cache environment variables at creation time. You must recreate them:

# Stop, remove, and recreate the affected service
docker compose -f docker-compose.staging.yml stop backend
docker compose -f docker-compose.staging.yml rm -f backend
docker compose -f docker-compose.staging.yml up -d backend

8. Rollback Procedures¶

Automatic Rollback¶

The CI/CD pipeline automatically rolls back when health checks fail after deployment. The flow:

deploy-rolling.sh runs -- if it fails, rollback.sh is called immediately
Post-deploy health verification checks Docker health status of backend and frontend
If either service is unhealthy after multiple retries, rollback.sh pulls the previous tag from GHCR and restarts services
A final fallback step runs if the health check step itself fails

Manual Rollback¶

To manually roll back to a previous version:

cd /opt/webshop

# Option 1: Roll back to a specific image tag
export IMAGE_TAG=v1.18  # the known-good version
docker compose -f docker-compose.prod.yml pull
docker compose -f docker-compose.prod.yml up -d

# Option 2: Use the rollback script
export COMPOSE_FILE=docker-compose.prod.yml
export DEPLOYMENT_DIR=/opt/webshop
export GITHUB_REPOSITORY=Voorman/webshop_freeze_design
export REGISTRY=ghcr.io
export BACKUP_TAG=previous  # or a specific tag
./scripts/rollback.sh

# Verify services are healthy
docker compose -f docker-compose.prod.yml ps

9. Database Migration Rollback¶

Django migrations can be reversed if needed. This is a manual process.

Identify the Migration to Roll Back¶

# List applied migrations for an app
docker compose -f docker-compose.prod.yml exec backend \
  python manage.py showmigrations <app_name>

Roll Back a Migration¶

# Roll back to a specific migration (the one BEFORE the problematic one)
docker compose -f docker-compose.prod.yml exec backend \
  python manage.py migrate <app_name> <migration_number>

# Example: roll back products app to migration 0005
docker compose -f docker-compose.prod.yml exec backend \
  python manage.py migrate products 0005

Full Database Restore¶

If migration rollback is insufficient, restore from a backup:

# Stop services that use the database
docker compose -f docker-compose.prod.yml stop backend celery celery-beat

# Restore from backup
gunzip backup_20260131_120000.sql.gz
docker compose -f docker-compose.prod.yml exec -T db \
  psql -U webshop_prod webshop_prod < backup_20260131_120000.sql

# Restart services
docker compose -f docker-compose.prod.yml up -d

Manual Database Backup¶

# Create a backup before risky operations
# (-T disables TTY allocation so the redirected dump is not altered)
docker compose -f docker-compose.prod.yml exec -T db \
  pg_dump -U webshop_prod webshop_prod > backup_$(date +%Y%m%d_%H%M%S).sql
gzip backup_*.sql
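Manual dumps accumulate on the VPS disk. The sketch below prunes local dumps by age, using the retention periods from the environment comparison (7 days on staging, 30 on production). How the project's own backup job prunes its copies in AWS S3 is not shown in this guide, so this only covers files created by hand; the function name is hypothetical.

```shell
# Delete compressed dumps in a directory that are older than the given number of days.
# Only files named backup_*.sql.gz directly inside the directory are touched.
# $1 = directory, $2 = retention in days
prune_backups() {
  find "$1" -maxdepth 1 -type f -name 'backup_*.sql.gz' -mtime +"$2" -delete
}

# Usage:
#   prune_backups /opt/webshop 7     # staging retention
#   prune_backups /opt/webshop 30    # production retention
```

Replacing `-delete` with `-print` gives a dry run that lists the files that would be removed.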

10. Troubleshooting¶

Service Health Checks¶

# Check all service statuses
docker compose -f docker-compose.prod.yml ps

# Check logs for a specific service
docker compose -f docker-compose.prod.yml logs backend --tail=50
docker compose -f docker-compose.prod.yml logs frontend --tail=50
docker compose -f docker-compose.prod.yml logs nginx --tail=50
docker compose -f docker-compose.prod.yml logs celery --tail=50

Health check details by container:

Container	Health check tool	Command
backend	`curl` (Python image has curl)	`curl -f http://localhost:8000/api/health/`
frontend	`wget` (Node Alpine has wget, NOT curl)	`wget -q --spider http://127.0.0.1:3000/`
nginx	`curl` (Alpine has curl)	`curl -f http://127.0.0.1/health`
db	`pg_isready`	`pg_isready -U $POSTGRES_USER`
redis	`redis-cli`	`redis-cli -a $REDIS_PASSWORD ping`
celery	`celery inspect`	`celery -A config inspect ping`

Nginx 502 Bad Gateway After Deployment¶

Nginx caches upstream DNS at startup. When containers are recreated during deployment, they get new Docker network IPs but nginx still routes to the old IPs.

Fix: Restart nginx after deploying services:

docker compose -f docker-compose.prod.yml restart nginx

The rolling deploy script (deploy-rolling.sh) already does this as its final step.

Container Environment Variable Changes Not Taking Effect¶

Docker fixes a container's environment when the container is created. Restarting keeps the same container, so it is not enough:

# Wrong: restart reuses the existing container and does not re-read env_file
docker compose -f docker-compose.prod.yml restart backend

# Correct: stop, remove, and recreate
docker compose -f docker-compose.prod.yml stop backend
docker compose -f docker-compose.prod.yml rm -f backend
docker compose -f docker-compose.prod.yml up -d backend

Redis "Port Could Not Be Cast to Integer" Error¶

This means REDIS_PASSWORD contains special characters (like /) that break URL parsing in CELERY_BROKER_URL and REDIS_URL. Regenerate with alphanumeric characters only:

openssl rand -base64 32 | tr -dc 'a-zA-Z0-9' | head -c 32

Update both REDIS_PASSWORD in the backend env file and the compose-level .env file, then recreate all services that use Redis:

docker compose -f docker-compose.staging.yml stop redis backend celery celery-beat
docker compose -f docker-compose.staging.yml rm -f redis backend celery celery-beat
docker compose -f docker-compose.staging.yml up -d

SSL Certificate Errors¶

# Verify certificate files exist
ls -la certbot_certs/live/freezedesign.eu/

# Check certificate validity and expiry
openssl x509 -in certbot_certs/live/freezedesign.eu/fullchain.pem -noout -dates

# Re-generate certificates
./nginx/certbot-init.sh

Database Connection Issues¶

# Check PostgreSQL logs
docker compose -f docker-compose.prod.yml logs db --tail=50

# Verify database accepts connections
docker compose -f docker-compose.prod.yml exec db pg_isready -U webshop_prod

# Check connection count
docker compose -f docker-compose.prod.yml exec db \
  psql -U webshop_prod -c "SELECT count(*) FROM pg_stat_activity;"

Memory Issues¶

# System memory
free -h

# Docker container memory usage
docker stats --no-stream

# Redis memory usage ($REDIS_PASSWORD is expanded by your shell on the VPS; export it first)
docker compose -f docker-compose.prod.yml exec redis \
  redis-cli -a "$REDIS_PASSWORD" --no-auth-warning info memory | grep used_memory_human

Celery Task Issues¶

# Check active tasks
docker compose -f docker-compose.prod.yml exec celery \
  celery -A config inspect active

# Check Celery Beat schedule
docker compose -f docker-compose.prod.yml logs celery-beat --tail=20

# Check Redis queue length ($REDIS_PASSWORD must be set in your shell on the VPS)
docker compose -f docker-compose.prod.yml exec redis \
  redis-cli -a "$REDIS_PASSWORD" --no-auth-warning llen celery

Disk Space¶

# Check disk usage
df -h

# Check Docker disk usage
docker system df

# Clean up unused Docker resources
docker system prune -f

# Remove unused images (use with caution in production)
docker image prune -a -f

Resetting Staging Data¶

To reset the staging database to a clean state:

docker compose -f docker-compose.staging.yml down
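# Volume names are prefixed with the Compose project name; confirm the exact name with: docker volume ls
docker volume rm webshop_freeze_design_postgres_data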
docker compose -f docker-compose.staging.yml up -d
docker compose -f docker-compose.staging.yml exec backend python manage.py migrate
docker compose -f docker-compose.staging.yml exec backend python manage.py create_staff_user

Debugging Webhooks (Mollie)¶

If payment webhooks return 404, check that BACKEND_URL does not include /api. The webhook URL is constructed as {BACKEND_URL}/api/payments/webhook/, so a BACKEND_URL of https://freezedesign.eu/api would result in the path /api/api/payments/webhook/.
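The effect is easy to reproduce in the shell. The sketch below builds the URL exactly as described above and adds a guard for the /api suffix; both function names are hypothetical and not taken from the payment service.

```shell
# Build the webhook URL as {BACKEND_URL}/api/payments/webhook/
webhook_url() {
  printf '%s/api/payments/webhook/\n' "$1"
}

# Succeed only if BACKEND_URL has no /api suffix (a trailing slash is ignored).
backend_url_ok() {
  case ${1%/} in
    */api) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage:
#   webhook_url https://freezedesign.eu        # correct path
#   webhook_url https://freezedesign.eu/api    # shows the doubled /api/api/ path
#   backend_url_ok "$BACKEND_URL" || echo "BACKEND_URL must not end in /api"
```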

Shell Escaping in Remote Debugging¶

When debugging via SSH into Docker containers, avoid embedding JSON in nested shell commands. Multi-layer escaping (local -> ssh -> docker exec -> bash -c -> curl -d) silently corrupts request bodies. Instead, write JSON to a file first:

# Wrong: body gets silently corrupted
docker compose exec backend bash -c 'curl -d "{\"key\":\"value\"}" ...'

# Correct: write JSON to file first
docker compose exec backend bash -c 'echo "{\"key\":\"value\"}" > /tmp/body.json && curl -d @/tmp/body.json ...'
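One way to avoid the escaping layers entirely is to write the body on the VPS with a quoted heredoc and copy the file into the container. This is a sketch under assumptions: the JSON content is a placeholder, the function name is hypothetical, and the request endpoint stays elided as in the examples above.

```shell
# Write a JSON body to a file. Inside <<'JSON' the shell performs no expansion
# and no quote removal, so the text is written unchanged.
write_body() {
  cat > "$1" <<'JSON'
{"key": "value", "note": "quotes \"stay\" intact, $HOME is not expanded"}
JSON
}

# Usage:
#   write_body body.json
#   docker compose cp body.json backend:/tmp/body.json
#   docker compose exec backend curl -d @/tmp/body.json ...
```

Because the JSON never passes through `bash -c`, only one shell ever parses it.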

Monitoring Checklist¶

After deployment, verify these items regularly:

All services show "healthy" in docker compose ps
SSL certificate expiry > 30 days
Disk usage < 80%
Memory usage < 80%
Database connections < 40 (max 50 configured)
Redis memory < 200MB (256MB limit on production)
No errors in docker compose logs backend --tail=100
Sentry error count at acceptable level
Cloudflare cache hit ratio > 80% for static assets
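The disk and memory items can be checked with a few lines of shell. This is a minimal sketch with hypothetical helper names, assuming the df, free, and awk versions shipped with Ubuntu; it is not an existing script in the repository.

```shell
# Print current usage as whole percentages.
disk_pct() { df -P / | awk 'NR == 2 { sub("%", "", $5); print $5 }'; }
mem_pct() { free | awk '/^Mem:/ { printf "%d\n", $3 / $2 * 100 }'; }

# Report one metric against its limit; returns non-zero when the limit is reached.
# $1 = label, $2 = current value, $3 = limit
check_below() {
  if [ "$2" -lt "$3" ]; then
    echo "OK   $1: $2 (limit $3)"
  else
    echo "WARN $1: $2 (limit $3)"
    return 1
  fi
}

# Usage:
#   check_below "disk %" "$(disk_pct)" 80
#   check_below "memory %" "$(mem_pct)" 80
```

The remaining checklist items (Sentry, Cloudflare, database connections) need their own data sources and are not covered here.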
