Kubernetes can look overwhelming because it has many objects, many YAML files, and many moving parts. The easiest way to understand it is to follow one story from start to finish:
- package an app in a container
- run one copy
- expose it to users
- scale it
- keep configuration separate
- store persistent data
- secure it
- observe it
- update it safely
- operate it across environments
This post follows exactly that flow.
Why Kubernetes exists
Containers solved packaging. Orchestration solves operation.
When teams moved from a few containers to dozens or hundreds, they needed answers for:
- where should each container run?
- what if a node dies?
- how do we scale during traffic spikes?
- how do we roll out updates without downtime?
- how do services find each other?
- how do we control secrets and permissions?
Kubernetes is the control system that answers these questions continuously.
The cluster mental model
A Kubernetes cluster has two major sides.
- Control plane: the brain that stores desired state and decides what should run where.
- Worker nodes: the machines that actually run your application workloads.
You declare the desired state, and Kubernetes keeps working until the actual state matches it.
Core building blocks you must know
Pod
A Pod is the smallest deployable unit. Usually one app container runs inside one Pod.
Think of a Pod as:
- one network identity (IP)
- one lifecycle boundary
- one or more tightly coupled containers
Deployment
A Deployment manages stateless Pods and supports safe rolling updates.
You declare:
- which image to run
- how many replicas
- update strategy
Kubernetes then works to keep that number of healthy Pods running.
Service
Pods are ephemeral. Their IPs change. A Service gives a stable virtual endpoint.
Common service types:
- ClusterIP: internal cluster access
- NodePort: exposes a port on each node
- LoadBalancer: cloud load balancer integration
Ingress
Ingress routes external HTTP/HTTPS traffic (for example /api to one service and / to another) using host/path rules.
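As a minimal sketch of such host/path rules, the manifest below sends /api to one Service and / to another. The hostname example.com, the namespace demo, the Service names api-svc and web-svc, and the nginx ingress class are illustrative assumptions. An Ingress only takes effect when an ingress controller is installed in the cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress            # hypothetical name
  namespace: demo              # assumed namespace
spec:
  ingressClassName: nginx      # assumes an NGINX ingress controller is installed
  rules:
    - host: example.com        # placeholder hostname
      http:
        paths:
          - path: /api         # requests under /api go to the API Service
            pathType: Prefix
            backend:
              service:
                name: api-svc  # hypothetical backend Service
                port:
                  number: 80
          - path: /            # everything else goes to the web Service
            pathType: Prefix
            backend:
              service:
                name: web-svc  # assumed Service name
                port:
                  number: 80
```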
ConfigMap and Secret
- ConfigMap stores non-sensitive configuration.
- Secret stores sensitive data such as tokens and passwords. Values are only base64-encoded by default, not encrypted, so enable encryption at rest for stronger protection.
StatefulSet and PersistentVolume
For stateful systems (databases, queues), use StatefulSet and persistent storage primitives:
- PersistentVolume (PV)
- PersistentVolumeClaim (PVC)
First deployment flow
1) Create namespace
Namespaces help isolate applications, teams, and environments.
kubectl create namespace demo
2) Deploy app
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
Apply it:
kubectl apply -f web-deployment.yaml
3) Expose app via Service
apiVersion: v1
kind: Service
metadata:
  name: web-svc
  namespace: demo
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
  type: ClusterIP
4) Test connectivity quickly
kubectl -n demo get pods
kubectl -n demo get svc
kubectl -n demo port-forward svc/web-svc 8080:80
Now browse http://localhost:8080.
How Kubernetes keeps apps alive
Kubernetes runs reconciliation loops. Controllers constantly compare actual state with desired state.
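If a Pod is lost (for example, it is deleted or its node dies):
- the Deployment's ReplicaSet sees the replica count below target and creates a replacement Pod
- the scheduler picks a node for it
- the kubelet on that node starts its containers
If only a container crashes, the kubelet restarts it in place according to the Pod's restart policy.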
This self-healing behavior is a major reason Kubernetes stays reliable at scale.
Scheduling fundamentals
The scheduler decides node placement using constraints and preferences.
Key concepts:
- resource requests and limits (cpu, memory)
- node selectors and affinity rules
- taints and tolerations
- topology spread constraints
Without requests/limits, noisy-neighbor problems are common.
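To make these concepts concrete, here is a sketch of a single Pod that combines them. The node label disktype=ssd, the taint dedicated=web:NoSchedule, the Pod name, and the resource numbers are all assumptions for illustration; the Pod stays Pending if no node carries that label.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-scheduled          # hypothetical name
  namespace: demo
  labels:
    app: web
spec:
  nodeSelector:
    disktype: ssd              # assumed node label; only matching nodes are eligible
  tolerations:
    - key: dedicated           # assumed taint on the target nodes
      operator: Equal
      value: web
      effect: NoSchedule
  topologySpreadConstraints:
    - maxSkew: 1               # keep app=web Pods roughly even across zones
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app: web
  containers:
    - name: web
      image: nginx:1.27
      resources:
        requests:              # what the scheduler reserves on the node
          cpu: 100m
          memory: 128Mi
        limits:                # hard ceiling enforced at runtime
          cpu: 500m
          memory: 256Mi
```

The scheduler places Pods based on requests, not limits, so requests are the values that most affect placement and bin-packing.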
Probes and production health
Use probes to tell Kubernetes the real state of your application.
- liveness probe: should Kubernetes restart this container?
- readiness probe: should this Pod receive traffic?
- startup probe: should long boot time be tolerated before liveness starts?
Good probes reduce false restarts and avoid routing traffic to not-ready Pods.
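The three probes can sit side by side on one container. The sketch below assumes the app answers HTTP 200 on / at port 80, which holds for the stock nginx image; the Pod name and the timing values are illustrative, not recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-probed             # hypothetical name
  namespace: demo
spec:
  containers:
    - name: web
      image: nginx:1.27
      ports:
        - containerPort: 80
      startupProbe:            # liveness/readiness checks wait until this succeeds
        httpGet:
          path: /
          port: 80
        periodSeconds: 5
        failureThreshold: 30   # allows up to 30 x 5s = 150s to start
      readinessProbe:          # failing removes the Pod from Service endpoints
        httpGet:
          path: /
          port: 80
        periodSeconds: 5
        failureThreshold: 3
      livenessProbe:           # failing causes the kubelet to restart the container
        httpGet:
          path: /
          port: 80
        periodSeconds: 10
        failureThreshold: 3
```

In a real service, readiness and liveness usually point at different endpoints, so that a slow dependency marks the Pod not-ready without triggering a restart.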
Scaling patterns
Horizontal Pod Autoscaler (HPA)
HPA increases/decreases Pod replicas from metrics (CPU, memory, custom metrics).
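A minimal HPA sketch for a Deployment named web in the demo namespace might look like this. The 70% target and the replica bounds are assumed values. It also assumes a metrics source such as metrics-server is installed and that the target containers declare CPU requests, since utilization is measured relative to requests.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                # hypothetical name
  namespace: demo
spec:
  scaleTargetRef:              # the workload whose replica count is adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # target average CPU as a percentage of requests
```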
Vertical Pod Autoscaler (VPA)
VPA suggests or applies better resource requests/limits.
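VPA is not part of core Kubernetes, so the sketch below assumes the Vertical Pod Autoscaler components and their CRD are installed. It targets a Deployment named web in recommendation-only mode; the object name is hypothetical.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa                # hypothetical name
  namespace: demo
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    updateMode: "Off"          # only publish recommendations; do not change Pods
```

Starting in "Off" mode lets you compare the recommendations against your current requests before allowing automatic changes.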
Cluster Autoscaler
If Pods cannot be scheduled due to node capacity, Cluster Autoscaler adds nodes.
Together:
- HPA scales workload copies
- Cluster Autoscaler scales infrastructure capacity
Stateless and stateful workloads
Stateless apps are easier to scale because any replica can serve any request.
Stateful apps need identity and durable storage. For these:
- StatefulSet gives stable naming (db-0, db-1)
- PVC gives persistent storage
- headless Service helps direct pod-level discovery
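The three pieces above fit together roughly as follows. This is a sketch, not a production database setup: the postgres:16 image, the Secret db-credentials, and the 1Gi size are assumptions, and the two replicas here are independent instances, since replication needs extra configuration or an operator. It also assumes the cluster has a default StorageClass.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: db
  namespace: demo
spec:
  clusterIP: None              # headless: DNS returns individual Pod addresses
  selector:
    app: db
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
  namespace: demo
spec:
  serviceName: db              # ties Pod DNS names to the headless Service
  replicas: 2                  # creates Pods db-0 and db-1
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials   # hypothetical Secret, created separately
                  key: password
            - name: PGDATA               # keep data in a subdirectory of the mount
              value: /var/lib/postgresql/data/pgdata
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:        # one PVC per Pod: data-db-0, data-db-1
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

With this in place, each Pod is reachable at a stable name such as db-0.db.demo.svc.cluster.local, and its PVC survives Pod restarts.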
Networking and traffic flow
Inside a cluster:
- every Pod gets an IP (via CNI plugin)
- Services provide stable discovery
- CoreDNS resolves service names
Outside a cluster:
- Ingress Controller handles external HTTP(S)
- TLS termination and routing rules are centralized
Example traffic path:
- user -> cloud load balancer -> ingress controller -> service -> pod
Configuration and secrets strategy
Use this rule:
- image contains code
- ConfigMap/Secret contains environment-specific values
Do not hardcode environment values in images.
Recommended practice:
- use separate namespaces for environments
- use separate Secret objects per environment
- rotate secrets regularly
- prefer external secret managers (Vault, AWS Secrets Manager, GCP Secret Manager) for stronger governance
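Here is a small sketch of that split: the image stays generic and the environment-specific values arrive as environment variables. All object names, keys, and values below are placeholders.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: web-config             # hypothetical name
  namespace: demo
data:
  LOG_LEVEL: info
  API_BASE_URL: http://api-svc.demo.svc.cluster.local   # assumed internal URL
---
apiVersion: v1
kind: Secret
metadata:
  name: web-secret             # hypothetical name
  namespace: demo
type: Opaque
stringData:                    # plain text here; stored base64-encoded by the API server
  API_TOKEN: replace-me        # placeholder; never commit real secrets to Git
---
apiVersion: v1
kind: Pod
metadata:
  name: web-configured         # hypothetical name
  namespace: demo
spec:
  containers:
    - name: web
      image: nginx:1.27
      envFrom:                 # every key becomes an environment variable
        - configMapRef:
            name: web-config
        - secretRef:
            name: web-secret
```

Promoting the same image to another environment then means pointing it at a different ConfigMap and Secret, not rebuilding it.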
Security model you should implement early
Identity and permissions
- ServiceAccount for workload identity
- RBAC for least privilege
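As a least-privilege sketch, the objects below give one ServiceAccount read-only access to ConfigMaps in a single namespace and nothing else. The names web-sa and configmap-reader are hypothetical.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: web-sa                 # hypothetical workload identity
  namespace: demo
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: configmap-reader       # hypothetical Role, scoped to the demo namespace
  namespace: demo
rules:
  - apiGroups: [""]            # "" is the core API group
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: web-sa-configmap-reader
  namespace: demo
subjects:
  - kind: ServiceAccount
    name: web-sa
    namespace: demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: configmap-reader
```

A workload picks up this identity by setting serviceAccountName: web-sa in its Pod spec.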
Workload hardening
- run as non-root
- read-only root filesystem where possible
- drop Linux capabilities not needed
- use seccomp/AppArmor profiles
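Most of these hardening settings live in securityContext. The sketch below uses a hypothetical image, registry.example.com/app:1.0.0, assumed to be built to run as a non-root user without writing to its root filesystem; many stock images need adjustments before they start under these constraints.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened               # hypothetical name
  namespace: demo
spec:
  securityContext:
    runAsNonRoot: true         # refuse to start containers that would run as UID 0
    runAsUser: 10001           # assumed non-root UID
    seccompProfile:
      type: RuntimeDefault     # the container runtime's default seccomp filter
  containers:
    - name: app
      image: registry.example.com/app:1.0.0   # hypothetical image
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]        # add back only the capabilities the app needs
```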
Supply-chain and image safety
- scan images for vulnerabilities
- pin image versions (avoid floating latest)
- sign and verify images where possible
Cluster policies
- enforce standards with policy engines (Kyverno or OPA Gatekeeper)
Observability stack
You need visibility across logs, metrics, and traces.
Common setup:
- Metrics: Prometheus + Grafana
- Logs: Loki/ELK/OpenSearch stack
- Traces: OpenTelemetry + Jaeger/Tempo
- Alerts: Alertmanager integrated with Slack/email/PagerDuty
Operational rule: if a service is critical, define SLI/SLO and alerts before production launch.
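As one possible starting point, here is a Prometheus alerting rule file (not a Kubernetes manifest) that fires when a Deployment has had fewer available replicas than desired for five minutes. It assumes kube-state-metrics is being scraped, and the alert name, label values, and thresholds are illustrative.

```yaml
groups:
  - name: web-availability     # hypothetical rule group
    rules:
      - alert: WebReplicasUnavailable
        # Metrics come from kube-state-metrics (assumed to be installed).
        expr: |
          kube_deployment_status_replicas_available{namespace="demo", deployment="web"}
            < kube_deployment_spec_replicas{namespace="demo", deployment="web"}
        for: 5m                # must hold for 5 minutes before firing
        labels:
          severity: page       # assumed routing label for Alertmanager
        annotations:
          summary: "web has fewer available replicas than desired"
```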
Release strategies without downtime
Kubernetes supports progressive delivery patterns:
- rolling update
- blue-green
- canary
For safer rollouts:
- set maxUnavailable and maxSurge
- use readiness probes correctly
- automate rollback triggers from health metrics
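A rolling update that never drops below the desired capacity could be sketched like this. The replica count, surge values, and minReadySeconds are assumptions to tune for your workload.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: demo
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0        # never remove an old Pod before a new one is ready
      maxSurge: 1              # allow at most one extra Pod during the rollout
  minReadySeconds: 10          # a new Pod must stay ready this long to count as available
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          readinessProbe:      # gates both traffic and rollout progress
            httpGet:
              path: /
              port: 80
```

You can follow the rollout with kubectl -n demo rollout status deployment/web and revert it with kubectl -n demo rollout undo deployment/web.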
CI/CD and GitOps
A mature delivery flow looks like this:
- code push triggers CI
- tests run
- image builds and scans
- image pushes to registry
- deployment manifest updates
- GitOps controller (Argo CD or Flux) reconciles cluster state from Git
Git becomes the source of truth for infrastructure and app deployments.
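With Argo CD, the last step is typically expressed as an Application object. The sketch below assumes Argo CD is installed in the argocd namespace; the repository URL and the path are hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web
  namespace: argocd            # assumes Argo CD runs in this namespace
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/deploy.git   # hypothetical repo
    targetRevision: main
    path: apps/web/overlays/prod                           # hypothetical path
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD itself runs in
    namespace: demo
  syncPolicy:
    automated:
      prune: true              # delete resources that were removed from Git
      selfHeal: true           # revert manual changes made in the cluster
```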
Multi-environment and multi-cluster strategy
As systems grow, you usually move to:
- separate clusters for dev/staging/prod
- possibly separate clusters per region
- standardized base manifests with overlays (Helm or Kustomize)
This helps reduce blast radius and improves change control.
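With Kustomize, an environment overlay can be a single small file. The sketch below assumes a directory layout with a shared base that contains a Deployment named web using the nginx image; the paths, namespace, and values are illustrative.

```yaml
# overlays/prod/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: prod                # applied to every resource from the base
resources:
  - ../../base                 # shared manifests (assumed layout)
images:
  - name: nginx                # matches the image name used in the base
    newTag: "1.27"             # pin the version for this environment
replicas:
  - name: web                  # Deployment defined in the base
    count: 4
```

Applying it with kubectl apply -k overlays/prod renders the base plus the overlay, so each environment differs only in the values its overlay lists.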
Cost and performance optimization
Much Kubernetes waste comes from over-provisioning: requests set far above what workloads actually use.
Improve cost by:
- right-sizing requests/limits
- enabling autoscaling
- using spot/preemptible nodes for tolerant workloads
- reducing idle namespaces/resources
- monitoring per-namespace and per-team cost allocation
Common mistakes and quick fixes
- Pods pending forever: check resource requests, node capacity, taints.
- CrashLoopBackOff: inspect container logs, command/args, missing config.
- Service not reachable: verify selectors and pod labels.
- Ingress 404/502: verify host/path rules, service port mapping, backend readiness.
- Storage issues: verify StorageClass, PVC binding, access modes.
Useful commands:
kubectl get all -n demo
kubectl describe pod <pod-name> -n demo
kubectl logs <pod-name> -n demo
kubectl get events -n demo --sort-by=.lastTimestamp
kubectl top pod -n demo
kubectl top node
End-to-end architecture map (Mermaid)
flowchart LR
U[User] --> LB[Cloud Load Balancer]
LB --> IC[Ingress Controller]
IC --> SVC[Service]
SVC --> P1[Pod Replica 1]
SVC --> P2[Pod Replica 2]
CP[Control Plane API Server Scheduler Controllers] --> N1[Worker Node 1]
CP --> N2[Worker Node 2]
P1 --> CM[ConfigMap]
P1 --> SEC[Secret]
DB[StatefulSet DB] --> PVC[PersistentVolumeClaim]
Decision framework: when Kubernetes is the right choice
Kubernetes fits best when you have:
- multiple services
- strict reliability requirements
- frequent deployments
- need for autoscaling and self-healing
- platform-level standardization goals
For small single-service apps with low traffic, a simpler platform can be enough.
Learning path after this post
Use this practical sequence:
- Run a local cluster (kind or minikube) and deploy two services.
- Add Ingress + TLS.
- Add HPA and observe behavior under load.
- Add Prometheus and one alert.
- Add GitOps (Argo CD/Flux) and move manifests to a dedicated repo.
- Add policy checks and image scanning in CI.
If you follow this order, Kubernetes becomes a predictable system instead of a confusing set of YAML files.