# Design — Task Manager on Kubernetes (zkt25 / z2)

**Date:** 2026-04-29
**Author:** Brazing Technology (bvilborg@brazing-technology.com), with Claude Opus 4.7
**Course assignment:** Task 2: Kubernetes (due 2026-03-31, late submission)
**Source application:** First assignment, `cloud assiagment/` (Docker-based 3-tier task manager)

## 1. Goal

Migrate the existing 3-tier Task Manager application from Docker Compose to Kubernetes, satisfying every requirement of the course assignment:

- ≥ 1 `Namespace`; all other objects belong to it.
- ≥ 1 `Deployment`.
- ≥ 1 `StatefulSet` with its `PersistentVolume` and `PersistentVolumeClaim`.
- ≥ 1 `Service`.
- Mandatory files at repo root: `start-app.sh`, `stop-app.sh`, `prepare-app.sh`, `deployment.yaml`, `service.yaml`, `statefulset.yaml`, `Dockerfile(s)`, `README.md`.
- Documentation covering: app description, containers, K8s objects, networks/volumes, container configuration, lifecycle instructions, web access instructions.

## 2. Target environment

- **Cluster:** minikube (local, Windows host)
- **kubectl:** assumed installed and configured (matches assignment wording)
- **Docker:** images built directly into minikube's Docker daemon via `eval $(minikube -p minikube docker-env)`; no registry and no `minikube image load` step are needed

This choice is portable: the same scripts and YAML run unchanged on any minikube install (Linux/Mac/Windows), provided a Bash-compatible shell is available (for example Git Bash or WSL on Windows). The graders need only `minikube start && ./prepare-app.sh && ./start-app.sh`.

## 3. Application (unchanged from assignment 1)

A Task Manager web app:

- **Frontend:** Nginx serving static HTML/CSS/JS, reverse-proxying `/api/*` to the backend.
- **Backend:** Flask (Python) REST API on Gunicorn, CRUD on `tasks` (id, title, completed, created_at). Auto-creates the table on startup.
- **Database:** PostgreSQL 15.

Endpoints: `GET/POST /api/tasks`, `PUT /api/tasks/:id`, `DELETE /api/tasks/:id`. **New:** `GET /api/health` (returns 200 if the DB is reachable), used by the `api` liveness and readiness probes.

## 4. Architecture

```
Browser ─► minikube service ─► NodePort 30080
            │
            ▼
┌─────────────────────────┐
│ Service: web (NodePort) │
└───────────┬─────────────┘
            │
    ┌───────▼────────┐
    │ Deployment:    │
    │ web (nginx, 2) │
    └───────┬────────┘
            │ /api/* (proxy_pass http://api:5000)
    ┌───────▼────────┐
    │ Service: api   │ (ClusterIP)
    └───────┬────────┘
            │
    ┌───────▼────────┐
    │ Deployment:    │
    │ api (flask, 2) │
    └───────┬────────┘
            │ TCP 5432 (db.taskapp.svc:5432)
    ┌───────▼────────┐
    │ Service: db    │ (headless, ClusterIP None)
    └───────┬────────┘
            │
    ┌───────▼────────┐
    │ StatefulSet:   │
    │ db (postgres,1)│── PVC ◄── PV (hostPath, 1Gi)
    └────────────────┘
```

## 5. Kubernetes object inventory

| # | Object | Name | Purpose |
|---|--------|------|---------|
| 1 | Namespace | `taskapp` | Isolates all resources |
| 2 | Secret | `db-credentials` | `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB` |
| 3 | ConfigMap | `nginx-config` | Holds `nginx.conf` (mounted at `/etc/nginx/nginx.conf`) |
| 4 | PersistentVolume | `db-pv` | 1 Gi, `hostPath: /mnt/data/taskapp-db`, `Retain` reclaim, `manual` storageClass |
| 5 | PersistentVolumeClaim | `db-pvc` | Binds to `db-pv`; consumed by StatefulSet pod |
| 6 | StatefulSet | `db` | 1 replica, `postgres:15`, mounts PVC at `/var/lib/postgresql/data`, env from Secret, `pg_isready` probes |
| 7 | Service | `db` | Headless (`clusterIP: None`), TCP 5432 |
| 8 | Deployment | `api` | 2 replicas, `taskapp-api:v1`, env from Secret, `GET /api/health` probes, resource requests/limits |
| 9 | Service | `api` | ClusterIP, TCP 5000 |
| 10 | Deployment | `web` | 2 replicas, `taskapp-web:v1`, ConfigMap-mounted nginx.conf, `GET /` readiness, resource requests/limits |
| 11 | Service | `web` | NodePort 30080 → 80 |

**Required by assignment:** Namespace ✓, Deployment ✓ (web, api), StatefulSet+PV+PVC ✓, Service ✓ (web, api, db).

**Engineering polish (item B from brainstorming):** Secret, ConfigMap, probes, resource requests/limits, multi-replica stateless tier.

## 6. Networking

- **Cluster DNS:** every Service is reachable inside the namespace by its short name: `web`, `api`, `db`. From another namespace the name would be `<svc>.taskapp.svc.cluster.local`.
- **Pod-to-pod:** handled by the CNI; no manual config.
- **Headless Service for `db`:** pairs with the StatefulSet so `db-0.db.taskapp.svc.cluster.local` is a stable DNS name. kube-proxy does not load-balance headless Services; clients connect directly to a pod.
- **External access:** only `web` is exposed (NodePort 30080). `api` (ClusterIP) and `db` (headless) are reachable only from inside the cluster.
- **Web access for the user:** `minikube service web -n taskapp` opens the browser at the right URL automatically; the alternative is `minikube ip` + `:30080`.

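
The three Services described in this section could be declared roughly as follows. This is a sketch under assumptions, not the project's actual `service.yaml`: the `app: db`, `app: api`, and `app: web` label selectors are hypothetical and would have to match the labels on the real pod templates.

```yaml
# Sketch only: selectors (app: db / api / web) are assumed labels.
apiVersion: v1
kind: Service
metadata:
  name: db
  namespace: taskapp
spec:
  clusterIP: None          # headless: DNS resolves straight to the pod IP
  selector:
    app: db
  ports:
    - name: postgres
      port: 5432
      targetPort: 5432
---
apiVersion: v1
kind: Service
metadata:
  name: api
  namespace: taskapp
spec:
  type: ClusterIP          # reachable only from inside the cluster
  selector:
    app: api
  ports:
    - port: 5000
      targetPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: taskapp
spec:
  type: NodePort           # the only externally exposed Service
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080
```

For the stable name `db-0.db.taskapp.svc.cluster.local` to resolve, the StatefulSet's `serviceName` field would need to be set to `db`.
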
## 7. Storage

- **PV:** `db-pv`, 1 Gi, `hostPath: /mnt/data/taskapp-db` on the minikube node. Reclaim policy `Retain`, so deleting the PVC does **not** wipe the underlying directory. StorageClass `manual` (matches the PVC's `storageClassName`).
- **PVC:** `db-pvc`, requests 1 Gi `ReadWriteOnce`, storageClassName `manual`. The StatefulSet pod's volume references the PVC by name (a claim reference, not a `volumeClaimTemplate`, since we want a pre-bound static PV).
- **Why hostPath, not a StorageClass-driven dynamic PV:** the assignment explicitly requires PV and PVC objects, and static provisioning is the textbook fit. (In production, dynamic provisioning is preferred.)
- **Initialization:** `prepare-app.sh` runs `minikube ssh -- sudo mkdir -p /mnt/data/taskapp-db && sudo chmod 777 …` so the directory exists with the right permissions before the PV binds.

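
A minimal sketch of the static PV/PVC pair and the claim reference in the pod template, built from the values listed above. The `volumeName` pin, the container name `postgres`, and the volume name `data` are assumptions added for illustration.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: db-pv                      # cluster-scoped: no namespace
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /mnt/data/taskapp-db
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-pvc
  namespace: taskapp
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual         # must match the PV for static binding
  volumeName: db-pv                # assumption: pin the claim to this PV explicitly
  resources:
    requests:
      storage: 1Gi
---
# Excerpt from the StatefulSet pod template (not a complete manifest)
spec:
  containers:
    - name: postgres               # container name assumed
      image: postgres:15
      volumeMounts:
        - name: data               # volume name assumed
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: db-pvc          # claim reference instead of a volumeClaimTemplate
```

Without `volumeName`, the control plane would still bind the claim to any available PV whose storage class, size, and access mode satisfy the request; pinning just makes the pairing explicit.
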
## 8. Configuration & secrets

- **DB password:** stored in Secret `db-credentials` (base64-encoded in YAML). Postgres reads `POSTGRES_PASSWORD` via `envFrom: secretRef`. Because `envFrom` exposes each key under its Secret key name, Flask's `DB_PASSWORD` must either be mapped explicitly with `valueFrom: secretKeyRef` or the app must read `POSTGRES_PASSWORD` directly.
- **DB host/port for Flask:** plain env vars in the Deployment manifest (`DB_HOST=db`, `DB_PORT=5432`); not secret.
- **`nginx.conf`:** held in ConfigMap `nginx-config`, mounted into the web container at `/etc/nginx/nginx.conf` (single-file mount via `subPath`). Tweaking the proxy block does not require rebuilding the image, only a pod restart, since `subPath` mounts do not pick up ConfigMap updates automatically.

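
A sketch of the Secret and of one way the `api` container could consume it. The values are placeholders, `stringData` is used here for readability instead of base64-encoded `data`, and the explicit `secretKeyRef` mapping is an assumption about how `DB_PASSWORD` gets its value.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: taskapp
type: Opaque
stringData:                        # written as plain text, stored base64-encoded
  POSTGRES_USER: taskuser          # placeholder value
  POSTGRES_PASSWORD: change-me     # placeholder value
  POSTGRES_DB: tasks               # placeholder value
---
# Excerpt from the api container spec (not a complete manifest)
env:
  - name: DB_HOST
    value: db                      # short Service name, resolved by cluster DNS
  - name: DB_PORT
    value: "5432"                  # env values must be strings
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-credentials
        key: POSTGRES_PASSWORD     # maps the Secret key to the name Flask reads
```

The Postgres container can use `envFrom` with a `secretRef` to `db-credentials` directly, because the official image already expects the `POSTGRES_*` names.
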
## 9. Container configuration

- **`taskapp-web` (Nginx):** built from `nginx:alpine` + the static frontend files. The default `nginx.conf` is replaced at runtime by the ConfigMap mount. Listens on 80.
- **`taskapp-api` (Flask):** built from `python:3.12-slim` + Flask + Gunicorn (2 workers). Reads DB credentials from env (Secret-sourced), creates the `tasks` table on startup if missing. Listens on 5000.
- **`postgres:15`:** official image, unmodified. Env from Secret. Volume mount at `/var/lib/postgresql/data`. Liveness and readiness use `pg_isready -U $POSTGRES_USER` (run through `sh -c`, because exec probes are not executed in a shell and would not expand the variable otherwise).

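
The ConfigMap-backed `nginx.conf` and its single-file mount might look like the sketch below. The `nginx.conf` body is a hypothetical minimal configuration (the real file is not reproduced in this document), and the static-file root `/usr/share/nginx/html` and the container name `nginx` are assumptions.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
  namespace: taskapp
data:
  nginx.conf: |
    events {}
    http {
      include /etc/nginx/mime.types;
      server {
        listen 80;
        root /usr/share/nginx/html;        # assumed location of the static files
        location /api/ {
          proxy_pass http://api:5000;      # Service name resolved by cluster DNS
        }
        location / {
          try_files $uri $uri/ /index.html;
        }
      }
    }
---
# Excerpt from the web Deployment pod template (not a complete manifest)
spec:
  containers:
    - name: nginx                          # container name assumed
      image: taskapp-web:v1
      ports:
        - containerPort: 80
      volumeMounts:
        - name: nginx-config
          mountPath: /etc/nginx/nginx.conf
          subPath: nginx.conf              # mount one file, not the whole directory
  volumes:
    - name: nginx-config
      configMap:
        name: nginx-config
```

Because `proxy_pass` has no URI part here, the original `/api/...` path is forwarded unchanged, which matches a backend that serves its routes under `/api/`.
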
## 10. File layout (repo root)

```
qubernetees/
├── README.md           # documentation (assignment-required)
├── prepare-app.sh      # build images into minikube; create PV directory; apply namespace + PV/PVC/StatefulSet
├── start-app.sh        # kubectl apply in dependency order; wait for rollouts; open browser
├── stop-app.sh         # kubectl delete (full teardown, per assignment wording)
├── namespace.yaml      # Namespace + Secret + ConfigMap
├── statefulset.yaml    # PV + PVC + StatefulSet (no Service; see note below)
├── deployment.yaml     # api + web Deployments
├── service.yaml        # ALL Services: web (NodePort) + api (ClusterIP) + db (headless)
├── nginx.conf          # source for the ConfigMap
├── backend/
│   ├── Dockerfile
│   ├── requirements.txt
│   └── app.py          # + new /api/health endpoint
└── frontend/
    ├── Dockerfile
    ├── index.html
    ├── style.css
    └── app.js
```

**Note on `service.yaml`:** the assignment maps file → object type strictly (`service.yaml` is "configuration file for object type Service"), so **all three Services** live in `service.yaml`: `web` (NodePort), `api` (ClusterIP), `db` (headless ClusterIP). Keeping the headless Service next to its StatefulSet is the idiomatic pattern, but it is set aside here in favor of the assignment's prescribed file layout.

**Note on `statefulset.yaml`:** holds PV + PVC + StatefulSet only. The `db` Service is in `service.yaml`. The StatefulSet pod does not need the Service in order to start; only its clients (`api`) need the `db` DNS name to resolve, so it is enough that `start-app.sh` applies `service.yaml` before `deployment.yaml`.

## 11. Lifecycle scripts

### `prepare-app.sh`

Per the assignment, this script "compiles images **and creates permanent volumes**", so both image builds **and** PV creation happen here. The PV is cluster-scoped (no namespace prerequisite), so it can be applied before `start-app.sh` runs.

First draft:

```bash
# First draft, superseded by the final version further down.
minikube status >/dev/null 2>&1 || minikube start   # start the cluster if it is not running
eval "$(minikube -p minikube docker-env)"           # build against minikube's Docker daemon
docker build -t taskapp-api:v1 backend/
docker build -t taskapp-web:v1 frontend/
minikube ssh -- "sudo mkdir -p /mnt/data/taskapp-db && sudo chmod 777 /mnt/data/taskapp-db"
# Rejected: kubectl apply -f statefulset.yaml --dry-run=client -o yaml | kubectl apply -f -
kubectl apply -f statefulset.yaml   # creates the cluster-scoped PV, but the namespaced PVC and
                                    # StatefulSet fail while the namespace is missing; kept here
                                    # for "creating permanent volumes" per assignment wording
echo "App prepared."
```

**Apply-strategy decision:** `statefulset.yaml` contains the cluster-scoped PV plus the namespaced PVC and StatefulSet. Running `kubectl apply -f statefulset.yaml` before the namespace exists would fail on the namespaced objects. Two clean options:

- **(a)** Split the PV into its own file (e.g., `pv.yaml`) so `prepare-app.sh` applies only the PV. Cleaner, but adds an extra file beyond the assignment's mandatory set.
- **(b)** Apply `namespace.yaml` first inside `prepare-app.sh`, then apply `statefulset.yaml`. The namespace, PV, PVC, and StatefulSet all exist after prepare; `start-app.sh` then only has to add the objects from `deployment.yaml` and `service.yaml`.

We pick **(b)**. It satisfies the assignment wording ("creating permanent volumes": PV and PVC are both created in prepare), and `start-app.sh` still satisfies "create all Kubernetes objects" because `kubectl apply` is idempotent: re-running it on the already-existing PV/PVC/StatefulSet is a no-op (resources are reconciled, not duplicated).

Final `prepare-app.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

minikube status >/dev/null 2>&1 || minikube start   # start the cluster if it is not running
eval "$(minikube -p minikube docker-env)"           # build against minikube's Docker daemon
docker build -t taskapp-api:v1 backend/
docker build -t taskapp-web:v1 frontend/
minikube ssh -- "sudo mkdir -p /mnt/data/taskapp-db && sudo chmod 777 /mnt/data/taskapp-db"
kubectl apply -f namespace.yaml     # Namespace + Secret + ConfigMap (prereq for PVC)
kubectl apply -f statefulset.yaml   # PV + PVC + StatefulSet
echo "App prepared."
```

### `start-app.sh`

Per assignment: "commands for kubectl to create all Kubernetes objects". Re-applies everything (idempotent). Resources already created by `prepare-app.sh` are reconciled with no side effect.

```bash
#!/usr/bin/env bash
set -euo pipefail

kubectl apply -f namespace.yaml     # idempotent
kubectl apply -f statefulset.yaml   # idempotent
kubectl apply -f service.yaml       # web + api + db Services
kubectl apply -f deployment.yaml    # web + api Deployments
kubectl -n taskapp rollout status statefulset/db
kubectl -n taskapp rollout status deployment/api
kubectl -n taskapp rollout status deployment/web
echo "App is running."
minikube service web -n taskapp     # opens browser at http://<minikube-ip>:30080
```

Note: `service.yaml` is applied *before* `deployment.yaml` so the api and db Services exist before pods try to resolve them.

### `stop-app.sh`

Full teardown (matches assignment wording "drop the created Kubernetes objects"):

```bash
#!/usr/bin/env bash
set -euo pipefail

kubectl delete -f service.yaml --ignore-not-found       # web + api + db Services
kubectl delete -f deployment.yaml --ignore-not-found    # web + api Deployments
kubectl delete -f statefulset.yaml --ignore-not-found   # StatefulSet + PVC + PV
kubectl delete -f namespace.yaml --ignore-not-found     # also deletes Secret + ConfigMap
echo "App stopped and removed."
```

The hostPath data on the node remains, because deleting the PV object does not touch the directory. It can be wiped manually with `minikube ssh -- sudo rm -rf /mnt/data/taskapp-db` if desired; that command is documented in the README, not in any script.

## 12. Health checks

| Container | Liveness | Readiness | Initial delay |
|-----------|----------|-----------|---------------|
| `db` | `exec: pg_isready -U $POSTGRES_USER` | same | 10 s / 5 s |
| `api` | `httpGet: /api/health :5000` | same | 5 s / 5 s |
| `web` | `httpGet: / :80` | same | 2 s / 1 s |

The `api` `/api/health` endpoint executes `SELECT 1` against the DB and returns 200 only if it succeeds. Rolling updates therefore wait for real DB reachability, not just Flask startup.

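
As a rough illustration, the `api` and `db` probes could be written as below. Two things are assumptions: the "Initial delay" pairs in the table are read as liveness / readiness, and the `periodSeconds` values are invented for the example.

```yaml
# Excerpt from the api container spec (not a complete manifest)
livenessProbe:
  httpGet:
    path: /api/health
    port: 5000
  initialDelaySeconds: 5
  periodSeconds: 10                # assumed value
readinessProbe:
  httpGet:
    path: /api/health
    port: 5000
  initialDelaySeconds: 5
  periodSeconds: 5                 # assumed value
---
# Excerpt from the db container spec (not a complete manifest).
# exec probes are not run through a shell, so sh -c is needed to expand $POSTGRES_USER.
livenessProbe:
  exec:
    command: ["sh", "-c", "pg_isready -U \"$POSTGRES_USER\""]
  initialDelaySeconds: 10
readinessProbe:
  exec:
    command: ["sh", "-c", "pg_isready -U \"$POSTGRES_USER\""]
  initialDelaySeconds: 5
```

One trade-off worth keeping in mind: because the liveness probe also depends on the DB, a prolonged DB outage would cause the `api` containers to be restarted, not merely removed from the Service endpoints.
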
## 13. Resource limits

| Container | requests cpu / mem | limits cpu / mem |
|-----------|--------------------|------------------|
| `db` | 100m / 128Mi | 500m / 512Mi |
| `api` | 50m / 64Mi | 250m / 256Mi |
| `web` | 25m / 32Mi | 100m / 128Mi |

Conservative: with one `db`, two `api`, and two `web` pods, requests total 250m CPU / 320Mi memory and limits total 1200m CPU / 1280Mi memory, which fits comfortably in a 4 GiB minikube VM with overhead for system pods.

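
In a manifest, each row of the table becomes a `resources` block on the corresponding container. A sketch for the `api` container, using the values from the table:

```yaml
# Excerpt from the api container spec (not a complete manifest)
resources:
  requests:
    cpu: 50m          # what the scheduler reserves on the node
    memory: 64Mi
  limits:
    cpu: 250m         # CPU above this is throttled
    memory: 256Mi     # exceeding this gets the container OOM-killed
```

The `db` and `web` containers would follow the same shape with their own values.
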
## 14. Documentation (README.md)

The README covers, in order:

1. What the application does (one paragraph; screenshots optional).
2. Containers used (web/api/db), with a short description of each.
3. Kubernetes objects (the table from §5, with one-line "what it does" for each).
4. Virtual networks: cluster DNS, the three Services, headless Service rationale.
5. Named volumes: the PV, the PVC, hostPath path, reclaim policy.
6. Container configuration performed (env vars, ConfigMap mount, image build context).
7. Instructions: prepare → start → web access → stop.
8. How to view in browser (`minikube service web -n taskapp`).
9. Sources (assignment 1 + Kubernetes docs).
10. Use of AI (Claude Opus 4.7, Anthropic; disclosed per the academic-integrity convention from assignment 1).

## 15. Out of scope (YAGNI)

- Ingress, HPA, NetworkPolicy, PodDisruptionBudget: not required, no demo value.
- TLS: the assignment doesn't ask for it; browser access is over plain HTTP on a NodePort.
- Multiple DB replicas / streaming replication: out of scope for the assignment.
- CI/CD, Helm chart, Kustomize overlays: over-engineering for a single-environment school project.

## 16. Risks & mitigations

| Risk | Mitigation |
|------|-----------|
| Image not visible to minikube | `prepare-app.sh` runs `eval $(minikube docker-env)` before `docker build` so the image lands in minikube's daemon. |
| PV directory missing on node | `prepare-app.sh` creates `/mnt/data/taskapp-db` via `minikube ssh`. |
| `api` starts before `db` is reachable | Readiness probe on `/api/health` includes a DB ping; rolling update waits. App also tolerates and retries on connect failure at boot. |
| User runs `start-app.sh` twice | All operations are idempotent (`kubectl apply`); script is safe to re-run. |
| User stops then starts → data lost? | Deleting the PV and PVC objects does not touch the hostPath directory (reclaim policy `Retain`), so the data survives. On the next start the recreated PVC binds to the recreated PV (same `manual` storageClass, size, and access mode), which points at the same directory. |

## 17. Oral evaluation talking points

(For grading; not for the README.)

1. *Why StatefulSet for Postgres, not Deployment?* Stable pod identity (`db-0`), stable storage, ordered startup; Postgres can't tolerate two pods racing on the same data dir.
2. *Why headless Service for `db`?* It gives StatefulSet pods stable DNS; kube-proxy does not load-balance, which is what stateful clients want.
3. *Why a Secret instead of plain env in YAML?* It is a separate object, base64-encoded (encoding, not encryption), and can be replaced with sealed-secrets / vault; the YAML can then be committed without the password (production direction).
4. *Why a ConfigMap for `nginx.conf`?* It separates config from image; tweaking the proxy block does not require a rebuild.
5. *Why two replicas on `web` and `api`?* Stateless = horizontally scalable; demonstrates that Deployment ≠ "one pod".
6. *Why `Retain` on the PV?* Deleting the PVC won't wipe the underlying directory; safer default; allows operator review before reuse.
7. *Why static PV (not StorageClass dynamic provisioning)?* The assignment explicitly asks for PV+PVC objects; static is the textbook match. In production we'd use a StorageClass.