A tested PyTorch framework for computational pathology research with working benchmarks on PatchCamelyon and CAMELYON16.
View on GitHub: matthewvaishnav/computational-pathology-research
This guide covers containerized deployment of the computational pathology API using Docker and Docker Compose.
```bash
# Build and start the API service
docker-compose up -d api

# Check service status
docker-compose ps

# View logs
docker-compose logs -f api

# Test the API
curl http://localhost:8000/health
```
The API will be available at http://localhost:8000.
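The same smoke test can be scripted from Python. This is a minimal sketch using only the standard library; the assumption that `/health` returns a JSON body with a `status` field is illustrative, not confirmed by the API spec:

```python
import json
import urllib.request


def health_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the health-check URL for a given host/port mapping."""
    return f"http://{host}:{port}/health"


def is_healthy(payload: dict) -> bool:
    """Interpret a health payload; the 'status' key is an assumed field name."""
    return payload.get("status", "").lower() in {"ok", "healthy"}


def check_api(host: str = "localhost", port: int = 8000, timeout: float = 5.0) -> bool:
    """GET /health and report whether the service looks up."""
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            return is_healthy(json.load(resp))
    except OSError:
        # Connection refused, DNS failure, timeout, etc.
        return False
```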
```bash
# Start both API and notebook services
docker-compose up -d

# Access Jupyter at http://localhost:8888
# Access API at http://localhost:8000
```
```bash
# Build the image
docker build -t pathology-api:latest .

# Build with a specific Python version
docker build --build-arg PYTHON_VERSION=3.11 -t pathology-api:py311 .

# Build without cache (clean build)
docker build --no-cache -t pathology-api:latest .
```
The Dockerfile uses multi-stage builds and the slim base image `python:3.10-slim` (~150 MB) to minimize image size. Further optimization options:
```dockerfile
# Use Alpine for a smaller base (more complex builds)
FROM python:3.10-alpine

# Use distroless for security
FROM gcr.io/distroless/python3

# Remove unnecessary files
RUN pip install --no-cache-dir -r requirements.txt && \
    rm -rf /root/.cache/pip
```
The docker-compose.yml defines two services:
```yaml
# docker-compose.yml
services:
  api:
    build: .
    ports:
      - "8000:8000"  # Map host:container
    volumes:
      - ./models:/app/models:ro  # Mount trained models (read-only)
    environment:
      - LOG_LEVEL=info
    restart: unless-stopped
```
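A compose port mapping like `"8000:8000"` is `HOST:CONTAINER`; a bare port means Docker assigns an ephemeral host port. A small sketch of that parsing logic, for scripts that need to discover where a service was published:

```python
from typing import Optional, Tuple


def parse_port_mapping(spec: str) -> Tuple[Optional[int], int]:
    """Split a compose-style port mapping.

    'HOST:CONTAINER' -> (host, container).
    A bare 'CONTAINER' means Docker picks an ephemeral host port -> (None, container).
    """
    parts = spec.split(":")
    if len(parts) == 1:
        return None, int(parts[0])
    return int(parts[0]), int(parts[1])
```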
Models directory (required for trained weights):

```yaml
volumes:
  - ./models:/app/models:ro
```

Data directory (optional, for batch processing):

```yaml
volumes:
  - ./data:/app/data:ro
```

Results directory (for saving outputs):

```yaml
volumes:
  - ./results:/app/results:rw
```
Configure the container via environment variables:
```yaml
environment:
  - PYTHONUNBUFFERED=1                     # Real-time logging
  - LOG_LEVEL=info                         # Logging level
  - MODEL_PATH=/app/models/best_model.pth  # Model checkpoint
  - DEVICE=cuda                            # cpu or cuda
  - MAX_BATCH_SIZE=32                      # Maximum batch size
```
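Inside the service, the variables above can be collected into one typed settings object at startup. A minimal stdlib sketch, with defaults matching the table (the `Settings` class itself is illustrative, not part of the project's code):

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    log_level: str
    model_path: str
    device: str
    max_batch_size: int


def load_settings(env=os.environ) -> Settings:
    """Read configuration from environment variables, with sane defaults."""
    return Settings(
        log_level=env.get("LOG_LEVEL", "info"),
        model_path=env.get("MODEL_PATH", "/app/models/best_model.pth"),
        device=env.get("DEVICE", "cpu"),
        max_batch_size=int(env.get("MAX_BATCH_SIZE", "32")),
    )
```

Passing `env` as a parameter keeps the loader testable without mutating the real process environment.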
1. Install the NVIDIA Container Toolkit (nvidia-docker2):

```bash
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
```
2. Update `docker-compose.yml`:
```yaml
services:
  api:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
Alternatively, pass the GPU flag to `docker run` directly:

```bash
docker run --gpus all -p 8000:8000 pathology-api:latest
```
```bash
# Inside the container, verify CUDA is visible to PyTorch
docker exec -it pathology-api python -c "import torch; print(torch.cuda.is_available())"
```
```dockerfile
# Add to Dockerfile: run as a non-root user
RUN useradd -m -u 1000 appuser
USER appuser
```
```yaml
services:
  api:
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G
        reservations:
          cpus: '2'
          memory: 4G
```
```yaml
networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true
```
The Dockerfile includes a health check:
```dockerfile
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import requests; requests.get('http://localhost:8000/health').raise_for_status()" || exit 1
```
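Docker only flips a container to `unhealthy` after `--retries` consecutive probe failures; a single failed check is tolerated. That logic can be sketched as:

```python
from typing import Iterable


def health_status(probes: Iterable[bool], retries: int = 3) -> str:
    """Replay a sequence of probe results the way Docker's HEALTHCHECK does:
    'unhealthy' after `retries` consecutive failures, 'healthy' otherwise.
    (Simplified: the real state machine also has a 'starting' grace period.)"""
    consecutive_failures = 0
    for ok in probes:
        if ok:
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= retries:
                return "unhealthy"
    return "healthy"
```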
Monitor health:

```bash
docker inspect --format='{{.State.Health.Status}}' pathology-api
```
View logs:

```bash
# Follow logs
docker-compose logs -f api

# Last 100 lines
docker-compose logs --tail=100 api

# Since timestamp
docker-compose logs --since 2024-01-01T00:00:00 api
```
Configure logging driver:
```yaml
services:
  api:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
```
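With `max-size: "10m"` and `max-file: "3"`, the json-file driver caps one container's logs at roughly 30 MB. A sketch of that arithmetic, using Docker's `k`/`m`/`g` size suffixes:

```python
def parse_size(spec: str) -> int:
    """Convert a Docker size string like '10m' to bytes."""
    units = {"k": 1024, "m": 1024**2, "g": 1024**3}
    spec = spec.strip().lower()
    if spec[-1] in units:
        return int(spec[:-1]) * units[spec[-1]]
    return int(spec)


def max_log_bytes(max_size: str, max_file: int) -> int:
    """Upper bound on disk used by one container's rotated log files."""
    return parse_size(max_size) * max_file
```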
Prometheus metrics (add to deploy/api.py):

```python
from prometheus_client import Counter, Histogram, make_asgi_app

# Example metrics to instrument request handlers with
REQUESTS = Counter("api_requests_total", "Total API requests")
LATENCY = Histogram("api_request_seconds", "Request latency in seconds")

# Expose a /metrics endpoint on the FastAPI app
metrics_app = make_asgi_app()
app.mount("/metrics", metrics_app)
```
Container stats:
```bash
docker stats pathology-api
```
```yaml
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pathology-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pathology-api
  template:
    metadata:
      labels:
        app: pathology-api
    spec:
      containers:
        - name: api
          image: pathology-api:latest
          ports:
            - containerPort: 8000
          resources:
            requests:
              memory: "4Gi"
              cpu: "2"
            limits:
              memory: "8Gi"
              cpu: "4"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 5
```
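Kubernetes rejects specs where a resource request exceeds its limit. A sketch that parses the quantity formats used above (CPU as `'2'` or millicores like `'500m'`, memory as `'4Gi'`/`'512Mi'`) and checks that invariant:

```python
def parse_cpu(q: str) -> float:
    """Parse a k8s CPU quantity: '2' -> 2.0 cores, '500m' -> 0.5 cores."""
    return int(q[:-1]) / 1000 if q.endswith("m") else float(q)


def parse_memory(q: str) -> int:
    """Parse a k8s binary memory quantity like '4Gi' into bytes."""
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
    for suffix, mult in units.items():
        if q.endswith(suffix):
            return int(q[:-2]) * mult
    return int(q)  # plain bytes


def requests_within_limits(requests: dict, limits: dict) -> bool:
    """True if every resource request fits inside its limit."""
    return (parse_cpu(requests["cpu"]) <= parse_cpu(limits["cpu"])
            and parse_memory(requests["memory"]) <= parse_memory(limits["memory"]))
```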
```yaml
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: pathology-api
spec:
  selector:
    app: pathology-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: LoadBalancer
```
```bash
# Apply configurations
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml

# Check status
kubectl get pods
kubectl get services

# View logs
kubectl logs -f deployment/pathology-api

# Scale replicas
kubectl scale deployment pathology-api --replicas=5
```
```bash
aws ecr create-repository --repository-name pathology-api
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com
docker tag pathology-api:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/pathology-api:latest
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/pathology-api:latest
```
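The ECR image reference follows a fixed pattern, which makes it easy to assemble in deployment scripts. A small helper (account ID, region, and repository name are all parameters; nothing here is specific to this project):

```python
def ecr_image_uri(account_id: str, region: str, repo: str, tag: str = "latest") -> str:
    """Compose an ECR image URI: <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"
```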
(See `deploy/README.md` for details.)

```bash
# Build and push to GCR
gcloud builds submit --tag gcr.io/PROJECT_ID/pathology-api

# Deploy to Cloud Run
gcloud run deploy pathology-api \
    --image gcr.io/PROJECT_ID/pathology-api \
    --platform managed \
    --region us-central1 \
    --memory 8Gi \
    --cpu 4 \
    --max-instances 10
```
```bash
# Push to ACR
az acr create --resource-group myResourceGroup --name myregistry --sku Basic
az acr login --name myregistry
docker tag pathology-api:latest myregistry.azurecr.io/pathology-api:latest
docker push myregistry.azurecr.io/pathology-api:latest

# Deploy to ACI
az container create \
    --resource-group myResourceGroup \
    --name pathology-api \
    --image myregistry.azurecr.io/pathology-api:latest \
    --cpu 4 \
    --memory 8 \
    --ports 8000
```
```bash
# Check logs
docker-compose logs api

# Common issues:
# 1. Port already in use
sudo lsof -i :8000
#    Kill the process or change the port in docker-compose.yml

# 2. Missing model file
#    Ensure models/best_model.pth exists or update MODEL_PATH

# 3. Out of memory
#    Increase the Docker memory limit in Docker Desktop settings
```
```bash
# Check resource usage
docker stats pathology-api

# Solutions:
# 1. Increase CPU/memory limits in docker-compose.yml
# 2. Enable GPU support
# 3. Use model quantization
# 4. Reduce batch size
```
```bash
# Check if the container is running
docker ps

# Check if the port is exposed
docker port pathology-api

# Test from inside the container
docker exec -it pathology-api curl http://localhost:8000/health

# Check firewall rules
sudo ufw status
```
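The "is something listening on this port" check can also be done from Python with the standard library, which is handy on hosts where `lsof` or `curl` aren't installed:

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Attempt a TCP connection; True means something is accepting on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```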
```bash
# Verify NVIDIA Docker runtime
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

# Check Docker daemon configuration
cat /etc/docker/daemon.json
# Should include: {"runtimes": {"nvidia": {"path": "nvidia-container-runtime"}}}

# Restart Docker
sudo systemctl restart docker
```
```bash
# Mount source code for live reloading
docker run -it --rm \
    -v $(pwd)/src:/app/src \
    -v $(pwd)/deploy:/app/deploy \
    -p 8000:8000 \
    pathology-api:latest \
    uvicorn deploy.api:app --host 0.0.0.0 --port 8000 --reload
```
```bash
# Run tests
docker-compose run --rm api pytest tests/ -v

# With coverage
docker-compose run --rm api pytest tests/ --cov=src --cov-report=html
```
```bash
# Open a bash shell in the running container
docker exec -it pathology-api bash

# Or start a new container with a shell
docker run -it --rm pathology-api:latest bash
```
Avoid the `latest` tag in production; pin explicit versions instead:

```bash
docker tag pathology-api:latest pathology-api:v1.0.0
```
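Pinned tags can then be compared to pick the newest release programmatically. A minimal sketch for `vMAJOR.MINOR.PATCH`-style tags (plain tuple comparison; pre-release suffixes are out of scope):

```python
def parse_tag(tag: str) -> tuple:
    """Parse a 'vMAJOR.MINOR.PATCH' tag into a sortable tuple of ints."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def newest(tags):
    """Return the highest version tag from an iterable of tags."""
    return max(tags, key=parse_tag)
```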
- Multi-stage builds: Separate build and runtime dependencies
- Layer caching: Order Dockerfile commands from least to most frequently changed
- Security scanning: Scan images for vulnerabilities, e.g. `docker scan pathology-api:latest`
- Resource limits: Always set CPU and memory limits in production
- Health checks: Implement proper health check endpoints
- Logging: Use structured logging and centralized log aggregation
- Secrets: Never bake secrets into images; pass them at runtime, e.g. `docker run -e API_KEY=$(cat api_key.txt) pathology-api:latest`
```bash
# Check image size
docker images pathology-api

# Analyze layers
docker history pathology-api:latest

# Use dive for detailed analysis
dive pathology-api:latest
```
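`docker history` and `docker images` print human-readable sizes. When auditing image bloat in a script, it helps to parse those strings back into bytes and total them; a small sketch (decimal units, as Docker prints them):

```python
def parse_human_size(s: str) -> float:
    """Convert Docker-style sizes ('150MB', '1.2GB', '0B') to bytes."""
    units = {"B": 1, "KB": 1000, "MB": 1000**2, "GB": 1000**3}
    for suffix in ("GB", "MB", "KB", "B"):  # longest suffixes first
        if s.endswith(suffix):
            return float(s[: -len(suffix)]) * units[suffix]
    raise ValueError(f"unrecognized size: {s}")


def total_size(layer_sizes):
    """Sum per-layer sizes as reported by `docker history`."""
    return sum(parse_human_size(s) for s in layer_sizes)
```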
```bash
# Use BuildKit for better caching
DOCKER_BUILDKIT=1 docker build -t pathology-api:latest .

# Cache from registry
docker build --cache-from pathology-api:latest -t pathology-api:latest .
```
Mount scratch directories as tmpfs so temporary files never hit the container's writable layer:

```yaml
tmpfs:
  - /tmp
  - /app/cache
```
Runtime optimizations in the Dockerfile:

```dockerfile
ENV PYTHONOPTIMIZE=1
ENV PYTHONDONTWRITEBYTECODE=1
CMD ["gunicorn", "deploy.api:app", "-w", "4", "-k", "uvicorn.workers.UvicornWorker", "--bind", "0.0.0.0:8000"]
```