62 Commits

Author SHA1 Message Date
d3a8cfe3ae feat(paas): create tenant t1 (small) 2026-02-24 22:03:52 +01:00
claude
5f6a909910 cleanup: remove tenant-t1 files (tenant deleted)
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 9s
2026-02-24 21:47:33 +01:00
claude
b9a84c674f feat: expose Gitea externally at git.georgepet.duckdns.org
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 3s
PR Checks / Validate & Security Scan (pull_request) Successful in 10s
Service+Endpoints pointing to 10.10.10.1:3000, Ingress with TLS.
Phase 22: Git-based PaaS deploy pipeline.
2026-02-24 20:09:28 +01:00
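Note on b9a84c674f: the "Service+Endpoints pointing to 10.10.10.1:3000, Ingress with TLS" pattern can be sketched roughly as follows. This is a minimal illustration, not the committed manifests; the resource names, namespace, ingress class, and issuer annotation are assumptions. Only the IP, port, and hostname come from the commit message.

```yaml
# Selector-less Service: Kubernetes does not populate its endpoints automatically.
apiVersion: v1
kind: Service
metadata:
  name: gitea-external        # hypothetical name
  namespace: gitea            # hypothetical namespace
spec:
  ports:
    - name: http
      port: 3000
      targetPort: 3000
---
# The Endpoints object must share the Service's name and namespace.
apiVersion: v1
kind: Endpoints
metadata:
  name: gitea-external
  namespace: gitea
subsets:
  - addresses:
      - ip: 10.10.10.1        # host address from the commit message
    ports:
      - name: http            # must match the Service port name
        port: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gitea-external
  namespace: gitea
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # assumed issuer name
spec:
  ingressClassName: nginx     # assumed ingress class
  tls:
    - hosts:
        - git.georgepet.duckdns.org
      secretName: gitea-external-tls                   # hypothetical Secret name
  rules:
    - host: git.georgepet.duckdns.org
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: gitea-external
                port:
                  number: 3000
```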
3058bf59c0 feat(paas): create tenant t1 (small) 2026-02-24 19:25:08 +01:00
claude
0599b4c3ee chore: remove test tenant-t1 ArgoCD app
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 13s
Test tenant created during PaaS Portal testing. User and namespace already cleaned up.
2026-02-24 18:55:54 +01:00
claude
ddc3def7c4 feat: rename naas-portal to paas-portal across all resources
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 2s
PR Checks / Validate & Security Scan (pull_request) Successful in 13s
- Helm chart: charts/naas-portal → charts/paas-portal
- ArgoCD app: naas-portal → paas-portal
- Environment values: naas-portal → paas-portal
- ClusterRole: naas-manager → paas-manager (operational-rbac)
- Tenant labels: naas.georgepet.duckdns.org → paas.georgepet.duckdns.org
- Secret: naas-portal-secrets → paas-portal-secrets
- Image: claude/naas-portal → claude/paas-portal
2026-02-24 18:24:21 +01:00
root
455250ee79 Add naas-portal Helm chart for K8s deployment
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 2s
PR Checks / Validate & Security Scan (pull_request) Successful in 12s
Migrate PaaS portal from Docker control-plane to K8s with:
- Dedicated Helm chart (Deployment, Service, Ingress, PVC, RBAC, NetworkPolicy)
- Domain: georgepaas.duckdns.org with TLS via cert-manager
- In-cluster ServiceAccount bound to naas-manager ClusterRole
- Longhorn PVC for SQLite persistence
- ArgoCD auto-sync application
2026-02-24 16:47:58 +01:00
4b22483d57 feat(naas): create tenant t1 (small) 2026-02-24 15:39:55 +01:00
69bc2425ec feat(naas): delete tenant t1 2026-02-24 14:33:39 +01:00
e091267db7 feat(naas): create tenant t1 (small) 2026-02-24 13:25:18 +01:00
6b593e3d49 cleanup: delete tenant t1 ArgoCD app 2026-02-24 13:03:30 +01:00
Claude
3dc6b0dd68 phase19: cleanup — remove unused ArgoCD apps, convert arch-docs to Deployment
Remove components not needed for PaaS-focused infrastructure:
- argo-rollouts: only used by arch-docs canary, convert to plain Deployment
- oauth2-proxy: was for dev/staging auth (removed in Phase 18)
- nginx-test: test deployment, not needed
- kube-bench: CIS benchmark scanner, not needed for PaaS
- trivy-operator: vulnerability scanner, not needed for PaaS
- drift-check RBAC: drift-check service being removed

arch-docs-prod: rollout.enabled=false → Helm uses Deployment template
2026-02-24 10:40:13 +01:00
cf51494a08 feat(naas): create tenant t1 (small) 2026-02-24 09:05:25 +01:00
741cd6359b feat(naas): delete tenant t2 2026-02-24 06:52:42 +01:00
c93d6b0ca1 feat(naas): delete tenant t1 2026-02-24 06:52:41 +01:00
cf2c8ed107 chore: remove dev/staging environment (PaaS transition) 2026-02-24 06:50:36 +01:00
ed02e6f8e5 chore: remove dev/staging environment (PaaS transition) 2026-02-24 06:50:34 +01:00
17baef3e7e feat(naas): create tenant t2 (small) 2026-02-23 21:08:25 +01:00
cebd7793d1 feat(naas): create tenant t1 (small) 2026-02-23 14:43:43 +01:00
68e1312bb4 feat(naas): delete tenant t1 2026-02-23 14:43:41 +01:00
5dfff53db6 feat(naas): create tenant t1 (medium) 2026-02-23 14:42:46 +01:00
c28ae380b0 feat(naas): delete tenant t1 2026-02-23 14:30:24 +01:00
04fb358c23 feat(naas): create tenant t1 (small) 2026-02-23 14:03:56 +01:00
61786efd28 feat: add NaaS tenant-namespace Helm chart + test tenant t1 2026-02-23 13:32:29 +01:00
Claude
3b6195d698 fix: trivy node-collector toleration + argo-rollouts CRD sync
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 11s
- Add control-plane toleration to trivy nodeCollector so it can
  schedule on k8s-master (was stuck Pending indefinitely)
- Add ignoreDifferences for CRDs + ServerSideApply to argo-rollouts
  to resolve perpetual OutOfSync caused by Helm CRD management gap

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 09:20:40 +01:00
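Note on 3b6195d698: a control-plane toleration for the trivy node-collector might look like the Helm values fragment below. The `nodeCollector.tolerations` key is assumed from the trivy-operator chart and should be checked against the chart version in use. The taint key is the standard kubeadm control-plane taint, which the commit message does not state explicitly.

```yaml
# Helm values fragment for the trivy-operator chart (key path is an assumption).
nodeCollector:
  tolerations:
    # Allows the node-collector pod to schedule on tainted control-plane nodes.
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
      effect: NoSchedule
```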
root
b2b1d594e7 fix: remove ServerSideApply from argo-rollouts to resolve CRD drift
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 3s
PR Checks / Validate & Security Scan (pull_request) Successful in 11s
SSA causes perpetual OutOfSync on CRDs due to field manager conflicts.
Client-side apply works correctly for Helm charts with CRDs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 21:14:21 +01:00
root
1b353559ce fix: broaden CRD ignoreDifferences for argo-rollouts
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 10s
Use jqPathExpressions to ignore entire .metadata and .spec.versions
schema sections on CRDs, which drift due to ServerSideApply field
manager changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 21:04:03 +01:00
root
4dd21b1e99 fix: resolve argo-rollouts CRD OutOfSync with ignoreDifferences
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 3s
PR Checks / Validate & Security Scan (pull_request) Successful in 14s
Add ignoreDifferences for CRDs (metadata labels/annotations drift
caused by ServerSideApply field managers) and RespectIgnoreDifferences
sync option.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 21:01:24 +01:00
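Note on 4dd21b1e99: ignoring CRD metadata drift together with the RespectIgnoreDifferences sync option can be sketched as an ArgoCD Application like the one below. The application name, project, destination namespace, and repository URL are assumptions. The chart version is the one mentioned in 465a9859b7.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argo-rollouts           # assumed application name
  namespace: argocd
spec:
  project: default              # assumed project
  source:
    repoURL: https://argoproj.github.io/argo-helm   # assumed chart repository
    chart: argo-rollouts
    targetRevision: 2.39.1
  destination:
    server: https://kubernetes.default.svc
    namespace: argo-rollouts    # assumed namespace
  ignoreDifferences:
    # Do not report label/annotation drift on CRDs as OutOfSync.
    - group: apiextensions.k8s.io
      kind: CustomResourceDefinition
      jsonPointers:
        - /metadata/labels
        - /metadata/annotations
  syncPolicy:
    automated: {}
    syncOptions:
      # Also leave the ignored fields alone during sync, not only in the diff.
      - RespectIgnoreDifferences=true
```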
root
465a9859b7 feat: add Argo Rollouts with canary strategy for prod
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 10s
- Install Argo Rollouts via ArgoCD (Helm chart 2.39.1)
- Add Rollout template with nginx traffic routing
- Add canary Service for traffic splitting
- Enable canary for prod arch-docs (20% → 60s → 50% → 60s → 100%)
- Dev/staging remain standard Deployment (1 replica, canary not useful)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 19:36:11 +01:00
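Note on 465a9859b7: the canary schedule "20% → 60s → 50% → 60s → 100%" maps onto Argo Rollouts steps roughly as in this sketch. Service and Ingress names, labels, replica count, and the image are placeholders, not values from the repository.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: arch-docs                      # assumed name
spec:
  replicas: 2                          # assumed replica count
  selector:
    matchLabels:
      app: arch-docs
  template:
    metadata:
      labels:
        app: arch-docs
    spec:
      containers:
        - name: arch-docs
          image: registry.example/arch-docs:1.0.0   # placeholder image
  strategy:
    canary:
      stableService: arch-docs         # assumed stable Service name
      canaryService: arch-docs-canary  # the extra Service used for traffic splitting
      trafficRouting:
        nginx:
          stableIngress: arch-docs     # assumed Ingress name
      steps:
        - setWeight: 20
        - pause: {duration: 60s}
        - setWeight: 50
        - pause: {duration: 60s}
        # After the last step the rollout is promoted to 100%.
```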
root
65930ceb1e sec: remove plaintext passwords from realm ConfigMap
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 10s
Use keycloak-config-cli env var substitution $(env:VAR_NAME) to inject
user passwords from K8s Secret instead of hardcoding them in ConfigMap.

- realm-configmap.yaml: passwords replaced with $(env:KC_INFRA_ADMIN_PASSWORD)
  and $(env:KC_INFRA_CLAUDE_PASSWORD)
- keycloak ArgoCD app: added keycloakConfigCli.extraEnvVarsSecret
- Secrets sourced from OpenBao via create-keycloak-secrets.sh

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 13:24:44 +01:00
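Note on 65930ceb1e: the `$(env:VAR_NAME)` substitution can be illustrated with the fragment below. The ConfigMap name, file name, and Secret name are hypothetical. The `IMPORT_VARSUBSTITUTION_ENABLED` variable is what keycloak-config-cli normally needs for substitution, but whether the chart already sets it is an assumption to verify.

```yaml
# Realm import file: the password is a placeholder resolved at import time.
apiVersion: v1
kind: ConfigMap
metadata:
  name: keycloak-realm                 # hypothetical name
data:
  infrastructure.json: |
    {
      "realm": "infrastructure",
      "users": [
        {
          "username": "admin",
          "enabled": true,
          "credentials": [
            {"type": "password", "value": "$(env:KC_INFRA_ADMIN_PASSWORD)"}
          ]
        }
      ]
    }
---
# Helm values fragment for the Bitnami Keycloak chart (a separate file in practice).
keycloakConfigCli:
  enabled: true
  # The Secret must contain the keys KC_INFRA_ADMIN_PASSWORD and KC_INFRA_CLAUDE_PASSWORD.
  extraEnvVarsSecret: keycloak-user-passwords   # hypothetical Secret name
  extraEnvVars:
    - name: IMPORT_VARSUBSTITUTION_ENABLED      # assumed to be required
      value: "true"
```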
root
9acb62e515 chore: remove report-generator from all environments
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 8s
Report-generator was a load testing application. Decommissioning:
- Remove ArgoCD app definitions (6 apps)
- Remove infra manifests (networkpolicy, secrets, seed-jobs)
- Remove Helm values (dev/staging/prod)

K8s resources already deleted via ArgoCD cascade delete.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 09:43:02 +01:00
Claude
247beaca76 feat: add report-generator app (Go + PostgreSQL + MinIO) for load testing
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 9s
- 6 ArgoCD apps (API + infra for dev/staging/prod)
- PostgreSQL StatefulSet + MinIO Deployment per namespace
- NetworkPolicies for app-to-db and app-to-minio
- Seed Job (5M orders, 100K customers, 10K products)
- HPA enabled in prod (2-5 replicas, 70% CPU target)
- Helm values with path-based ingress /reports on existing hosts
2026-02-19 23:40:34 +01:00
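Note on 247beaca76: "HPA enabled in prod (2-5 replicas, 70% CPU target)" corresponds to a HorizontalPodAutoscaler along these lines. The resource and target names are assumed; only the replica bounds and CPU target come from the commit message.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: report-generator          # assumed name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: report-generator        # assumed Deployment name
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # percent of the pods' CPU requests
```

Utilization is measured against CPU requests, so the target Deployment needs `resources.requests.cpu` set for this to work.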
Claude
8230257299 fix: override KC_HOSTNAME="" to clear Bitnami chart default
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 8s
Bitnami Keycloak chart auto-sets KC_HOSTNAME from ingress.hostname.
Override with empty string via extraEnvVars so Keycloak derives URLs
from request headers (X-Forwarded-* via ingress, Host via NodePort).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 17:17:06 +01:00
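Note on 8230257299: clearing the chart-derived hostname might look like the Bitnami Keycloak values fragment below. Setting `proxyHeaders` this way follows ab7805eada. Whether the repository places these keys exactly like this is an assumption.

```yaml
# Helm values fragment (sketch).
proxyHeaders: xforwarded     # trust X-Forwarded-* headers from ingress-nginx
extraEnvVars:
  # An explicit empty value overrides the KC_HOSTNAME the chart derives from
  # ingress.hostname, so Keycloak builds URLs from the request headers.
  - name: KC_HOSTNAME
    value: ""
```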
Claude
4fd6dfb3b9 fix: remove KC_HOSTNAME to fix NodePort OAuth flow
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 8s
With KC_HOSTNAME set, Keycloak always redirects to the configured
hostname in login form actions, breaking OAuth when accessed via
NodePort (127.0.0.1:30880). Without KC_HOSTNAME, Keycloak derives
URLs from request headers:
- Ingress: X-Forwarded-Host/Proto → https://keycloak.georgepet...
- NodePort: Host header → http://127.0.0.1:30880
KC_PROXY_HEADERS=xforwarded is kept to trust ingress-nginx headers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 17:09:11 +01:00
Claude
0eeae350cf fix: set KC_HOSTNAME_STRICT=false for NodePort access
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 8s
When accessing Keycloak via NodePort (127.0.0.1:30880), strict hostname
forces redirects to keycloak.georgepet.duckdns.org which is unreachable
from local browser. With strict=false, Keycloak uses the request's host
header for redirects when accessed via NodePort, while still using the
configured hostname for ingress access.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 17:01:11 +01:00
Claude
f71c583d69 Phase 16: fine-grained RBAC (infra-operators) + DB rotation prep
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 8s
- Add infra-operators group to Keycloak realm
- Add K8s RBAC: operators get full CRUD in dev/staging, readonly in prod,
  cluster-level readonly for nodes/namespaces/storage, no infra ns access
- Update ArgoCD RBAC: operators → role:readonly
- Update oauth2-proxy: allow infra-operators group
- Add PostgreSQL NodePort (35432) for OpenBao Database engine access
- Update NetworkPolicy: allow NodePort traffic from node CIDR
- Extend keycloak-secrets-manager Role: statefulset get/patch for rotation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 15:33:23 +01:00
root
d629bc5ef7 fix: add ServerSideApply for Kyverno CRDs (annotation too long)
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 7s
2026-02-18 06:14:08 +01:00
root
4188d1dd6f feat: add Kyverno admission controller + cosign image verification
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 7s
- Deploy Kyverno v1.13.4 (chart 3.3.4) via ArgoCD Helm chart
- Add ClusterPolicy to verify cosign signatures on registry images (Audit mode)
- Add NetworkPolicy for kyverno namespace (default-deny + selective allow)
- Extend keycloak-secrets-manager RBAC to kyverno namespace for cosign key sync
- ArgoCD Application for kyverno-policies directory
2026-02-18 06:06:07 +01:00
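Note on 4188d1dd6f: a cosign verification policy in Audit mode could be sketched as follows. The policy name, image pattern, and public key are placeholders, and the field layout should be checked against the Kyverno version deployed.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures        # hypothetical name
spec:
  validationFailureAction: Audit       # report violations without blocking pods
  webhookTimeoutSeconds: 30
  rules:
    - name: verify-cosign-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "registry.example/*"     # placeholder registry pattern
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      <cosign public key goes here>
                      -----END PUBLIC KEY-----
```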
root
b7ee0875b8 feat: add NetworkPolicy for cert-manager and ingress-nginx
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 2s
PR Checks / Validate & Security Scan (pull_request) Successful in 8s
Default-deny + selective allow policies:
- cert-manager: DNS, K8s API, ACME HTTPS, webhook ingress, Prometheus scrape
- ingress-nginx: DNS, K8s API, external HTTP/HTTPS, backend forwarding

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 21:47:50 +01:00
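Note on b7ee0875b8: the "default-deny + selective allow" pattern can be illustrated with two policies, shown here for the cert-manager namespace. This is a minimal sketch covering only the DNS rule. The policy names are hypothetical, and the remaining allows (K8s API, ACME HTTPS, webhook ingress, Prometheus scrape) would be added in the same way.

```yaml
# 1. Deny all ingress and egress for every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all          # hypothetical name
  namespace: cert-manager
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# 2. Selectively re-allow DNS lookups to kube-system.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress          # hypothetical name
  namespace: cert-manager
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```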
root
1ab66afa5f feat: add operational RBAC — scoped ServiceAccounts for scripts
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 4s
PR Checks / Validate & Security Scan (pull_request) Successful in 8s
Create least-privilege ServiceAccounts for k8s-audit, drift-check,
and keycloak-secrets-manager instead of sharing admin kubeconfig.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 21:12:02 +01:00
root
59ef02b6d6 fix: accept group claim with and without leading slash
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 6s
Keycloak sends groups as 'infra-admins' but oauth2-proxy config
expected '/infra-admins'. Accept both formats.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 08:37:12 +01:00
root
2b8dce25fc fix: use public Keycloak URLs for token exchange
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 6s
Internal redeem_url caused context canceled errors because
KC_HOSTNAME_STRICT rejects mismatched hostnames. Use public
HTTPS URLs for all OIDC endpoints.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 08:30:36 +01:00
root
f8120c251a fix: remove server-snippet (disabled by ingress admin)
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 6s
Admin console is protected by Keycloak's own auth. No need for
external path blocking — brute-force protection is built-in.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 08:21:56 +01:00
root
43fee223a0 feat: expose Keycloak externally via ingress
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 2s
PR Checks / Validate & Security Scan (pull_request) Successful in 6s
- Enable Keycloak ingress at keycloak.georgepet.duckdns.org with TLS
- Update KC_HOSTNAME to public domain, add KC_PROXY_HEADERS=xforwarded
- Remove KC_HOSTNAME_PORT and KC_HTTP_ENABLED (standard HTTPS via ingress)
- Block /admin and /realms/master externally via server-snippet
- Update oauth2-proxy login_url and oidc_issuer_url to public HTTPS URL
- Keep redeem/jwks/profile URLs internal (keycloak.keycloak.svc:8080)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 08:12:21 +01:00
root
0aba0e7a87 feat(keycloak): move to localhost:30880 via SSH tunnel
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 8s
- Disable external ingress, add NodePort 30880
- Set KC_HOSTNAME=127.0.0.1:30880 (fixed issuer for OIDC)
- oauth2-proxy: skip-oidc-discovery + explicit K8s internal URLs
- ArgoCD: remove OIDC (already behind SSH tunnel, will add Dex later)
- Realm: sslRequired=none for HTTP access via tunnel

Access: user SSH tunnel → localhost:30880 → K8s NodePort → Keycloak

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 21:37:42 +01:00
root
e2c941bdd4 fix(keycloak): use bitnamilegacy images (Bitnami removed Docker Hub images Aug 2025)
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Image override - no code changes, validated manually
Override all image references to use bitnamilegacy/* since Bitnami
removed all free Docker images from docker.io/bitnami/* on 2025-08-28.

Images overridden:
- bitnamilegacy/keycloak:26.3.3-debian-12-r0
- bitnamilegacy/keycloak-config-cli:6.4.0-debian-12-r11
- bitnamilegacy/postgresql:17.6.0-debian-12-r0
- bitnamilegacy/os-shell:12-debian-12-r50

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 20:40:10 +01:00
root
ab7805eada fix(keycloak): replace deprecated proxy: edge with proxyHeaders: xforwarded
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 6s
Bitnami Keycloak chart 25.x (Keycloak 26.x) deprecated the 'proxy' parameter.
Production mode requires proxyHeaders to be set instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 20:01:57 +01:00
root
2277d3592d feat: add Keycloak SSO + oauth2-proxy + ArgoCD OIDC config
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 1s
PR Checks / Validate & Security Scan (pull_request) Successful in 7s
- Keycloak (Bitnami Helm chart) with PostgreSQL on Longhorn
- oauth2-proxy for arch-docs dev/staging auth
- ArgoCD OIDC integration via ConfigMap
- Realm 'infrastructure': users admin/claude, groups infra-admins/infra-bots
- 4 OIDC clients: grafana, argocd, gitea, oauth2-proxy
- NetworkPolicy: default-deny + selective allow
- oauth2-proxy ingress for dev/staging subdomains
2026-02-16 19:48:43 +01:00
root
21f5794851 feat: Longhorn S3 backup to MinIO (daily, retain 7)
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 2s
PR Checks / Validate & Security Scan (pull_request) Successful in 6s
2026-02-16 16:19:24 +01:00
root
7388a0b10a upgrade: Longhorn 1.10.1 → 1.11.0 (step 4/4, final)
All checks were successful
AI Review / AI Code Review (pull_request) Successful in 2s
PR Checks / Validate & Security Scan (pull_request) Successful in 6s
2026-02-16 13:06:00 +01:00