DevOps
10 min read

GitOps with Flux CD: A Practical Guide to Kubernetes Deployments

By Max van Anen
Real-world GitOps implementation using Flux CD for Kubernetes. Learn the benefits, challenges, and best practices from production experience.
GitOps, Kubernetes, Flux CD, CI/CD, Infrastructure as Code

After years of "kubectl apply" cowboys and fragile CI/CD pipelines pushing directly to production, we discovered GitOps. It transformed how we deploy to Kubernetes at scale. Here's what GitOps really means in practice, why it works, and the challenges nobody talks about.

What GitOps Actually Is (Without the Hype)

GitOps is simple: your Git repository becomes the single source of truth for what should be running in your Kubernetes clusters. Instead of CI pipelines pushing changes to clusters, specialized operators like Flux CD pull changes from Git and ensure your cluster matches what's declared.

Think of it as Infrastructure as Code, but with continuous enforcement. If someone manually changes something in the cluster, the operator reverts it to match Git on its next reconciliation. No more configuration drift, no more "who changed what in production?"

Our GitOps Architecture with Flux CD

Here's how we structure GitOps for our enterprise Kubernetes deployments:

repository-structure
# Application repository (e.g., atlas-resources-api)
.
├── src/                    # Application source code
├── helm/
│   ├── chart/              # Helm chart templates
│   └── values/
│       ├── dev.yaml        # Development values
│       ├── staging.yaml    # Staging values
│       └── prod.yaml       # Production values
└── .github/
    └── workflows/
        └── build.yaml      # CI pipeline

# GitOps repository (e.g., platform-gitops)
.
├── clusters/
│   ├── prod-eu-west/
│   │   ├── flux-system/    # Flux components
│   │   └── apps/           # Application deployments
│   └── staging-eu-west/
│       ├── flux-system/
│       └── apps/
└── infrastructure/
    ├── sources/            # Helm repositories
    └── configs/            # Shared configurations

The Deployment Flow

Here's what happens when a developer pushes code:

  1. Developer pushes to main branch: Code triggers CI pipeline
  2. CI builds and pushes container: Image tagged with Git SHA goes to registry
  3. CI updates GitOps repo: Updates image tag in Helm values or HelmRelease
  4. Flux detects change: Polls GitOps repo every minute (configurable)
  5. Flux applies changes: Updates cluster to match desired state
  6. Flux monitors health: Ensures deployment succeeds, can trigger alerts
flux-helmrelease.yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: atlas-resources-api
  namespace: flux-system
spec:
  interval: 5m
  targetNamespace: atlas
  chart:
    spec:
      chart: ./helm/chart
      sourceRef:
        kind: GitRepository
        name: atlas-resources-api
      interval: 1m
  values:
    image:
      repository: harbor.company.io/atlas/resources-api
      tag: ${GIT_SHA} # Updated by CI
    replicaCount: 3
    ingress:
      enabled: true
      hostname: api.atlas.company.io
    resources:
      requests:
        memory: "512Mi"
        cpu: "250m"
      limits:
        memory: "2Gi"
        cpu: "1000m"
  # Automated rollback on failure
  upgrade:
    remediation:
      retries: 3
      remediateLastFailure: true
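
Step 3 of the flow is the only point where CI writes to the GitOps repository: it replaces the `tag` value and commits. Here is a minimal sketch of that step, not a definitive implementation. The function name `update_image_tag` and the file path in the usage comment are hypothetical, and the sketch assumes the file contains exactly one `tag:` line. A YAML-aware tool may be a safer choice than `sed` in a real pipeline.

```shell
#!/bin/sh
# Hypothetical helper for the CI job (step 3): rewrite the image tag in a
# values file or HelmRelease so that Flux picks up the new version on its
# next poll.
# Assumption: the file contains exactly one line of the form "  tag: <value>".
update_image_tag() {
  values_file="$1"
  new_tag="$2"
  tmp_file="$(mktemp)"
  # Replace everything after "tag:" (including any trailing comment) while
  # preserving the original indentation.
  sed "s|^\([[:space:]]*tag:\).*|\1 ${new_tag}|" "$values_file" > "$tmp_file"
  mv "$tmp_file" "$values_file"
}

# In CI this would be followed by a commit and push, for example:
#   update_image_tag clusters/prod-eu-west/apps/atlas-resources-api.yaml "$GIT_SHA"
#   git commit -am "deploy atlas-resources-api ${GIT_SHA}" && git push
```

Because the change lands as an ordinary commit, it is also the commit that `git revert` undoes during a rollback.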

The Real Benefits We've Experienced

Complete Audit Trail

Every change to production is a Git commit. Need to know who deployed what at 3 AM last Tuesday? It's in the Git history. Need to understand why a service was scaled up? Check the commit message. This has saved us countless hours during incident investigations.

Rollbacks That Actually Work

Rolling back is literally git revert. No custom scripts, no remembering the previous version, no hoping the rollback procedure still works. We've reduced rollback time from 15-20 minutes to under 2 minutes.

bash
# Instant rollback to previous version
git revert HEAD --no-edit
git push
# Flux automatically applies the revert within minutes

Self-Healing Infrastructure

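Someone manually scaled a deployment? Flux scales it back. Accidentally deleted a ConfigMap? Flux recreates it. This drift prevention has eliminated entire categories of production issues. One caveat: for resources deployed through a HelmRelease, the helm-controller only corrects this kind of drift when drift detection is enabled on the release.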

Developer Experience

Developers don't need kubectl access. They don't need to learn Kubernetes intricacies. They push code, CI builds it, and GitOps deploys it. The abstraction is clean and familiar.

The Challenges Nobody Mentions

Secret Management Complexity

You can't store secrets in Git (obviously). This means integrating tools like Sealed Secrets, SOPS, or external secret operators. We use Sealed Secrets, but it adds complexity:

sealed-secret.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: database-credentials
  namespace: atlas
spec:
  encryptedData:
    username: AgBvA8kOp5... # Encrypted value
    password: AgCdX9mRt2... # Encrypted value
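
A manifest like this is generated rather than written by hand. The sketch below shows one way to produce it, assuming `kubectl` and `kubeseal` are installed and that `kubeseal` can reach the Sealed Secrets controller or has been given its public certificate. The wrapper name `seal_secret` is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: build a Secret locally and encrypt it with kubeseal.
# The plain Secret only exists inside the pipe; only the SealedSecret printed
# on stdout is meant to be committed to Git.
seal_secret() {
  name="$1"
  namespace="$2"
  shift 2
  # Remaining arguments go straight to kubectl, e.g. --from-literal=key=value
  # or --from-file=key=path.
  kubectl create secret generic "$name" \
    --namespace "$namespace" \
    --dry-run=client -o yaml "$@" \
    | kubeseal --format yaml
}

# Example (values are placeholders):
#   seal_secret database-credentials atlas \
#     --from-file=username=./username.txt \
#     --from-file=password=./password.txt > sealed-secret.yaml
```

Reading values from files instead of `--from-literal` keeps the plaintext out of shell history, which is part of the extra care this workflow demands.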

The Git Bottleneck

When your Git host is down, deployments stop: Flux keeps reconciling the last revision it fetched, but nothing new can ship. We've had GitHub outages block deployments for hours. You need contingency plans, like break-glass procedures for emergency changes.
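
A break-glass procedure could look like the sketch below: suspend reconciliation for the affected release, apply the emergency change by hand, and resume once the same change has been committed to Git. The function names and the manifest path are hypothetical, and this is one possible procedure rather than an established one.

```shell
#!/bin/sh
# Hypothetical break-glass helpers for a Git outage.

# Pause reconciliation of one HelmRelease, then apply an emergency manifest by
# hand. Suspending first keeps Flux from acting on the release in the meantime.
break_glass_apply() {
  release="$1"
  namespace="$2"
  manifest="$3"
  flux suspend helmrelease "$release" --namespace "$namespace" || return 1
  kubectl apply -f "$manifest"
}

# Once Git is reachable again and the same change has been committed there,
# hand control back to Flux.
break_glass_resume() {
  release="$1"
  namespace="$2"
  flux resume helmrelease "$release" --namespace "$namespace"
}

# Example:
#   break_glass_apply atlas-resources-api flux-system hotfix.yaml
#   ...commit the same change to the GitOps repo once Git is back...
#   break_glass_resume atlas-resources-api flux-system
```

The important part is the last step: if the manual change never makes it into Git, resuming reconciliation will eventually undo it.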

Debugging Becomes Indirect

When something goes wrong, you're debugging Flux logs, not your deployment directly. The abstraction layer helps until it doesn't. Common issues we've faced:

  • Flux gets stuck reconciling due to resource conflicts
  • Image pull errors aren't immediately obvious
  • Helm chart errors can be cryptic in Flux logs
  • Dependency ordering issues with CRDs
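
For issues like these, it helps to work from Flux's view of a release down to the workload itself. The sketch below strings that sequence together; the function name `flux_triage` is hypothetical, and the flags are those of recent Flux CLI releases, so they may differ on older versions.

```shell
#!/bin/sh
# Hypothetical triage helper: walk from Flux's view of a release down to the
# workload it manages.
flux_triage() {
  release="$1"           # HelmRelease name
  flux_namespace="$2"    # namespace of the HelmRelease object
  target_namespace="$3"  # namespace the workload runs in

  # 1. Which releases are currently not ready?
  flux get helmreleases --all-namespaces --status-selector ready=false

  # 2. Recent events for this release (stuck reconciliations, Helm errors).
  flux events --for "HelmRelease/${release}" --namespace "$flux_namespace"

  # 3. Controller log lines at error level for this release.
  flux logs --kind=HelmRelease --name="$release" \
    --namespace="$flux_namespace" --level=error

  # 4. Warning events in the workload namespace; image pull failures are
  #    reported here as pod events.
  kubectl get events --namespace "$target_namespace" \
    --field-selector type=Warning
}

# Example:
#   flux_triage atlas-resources-api flux-system atlas
```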

Initial Learning Curve

Teams comfortable with traditional CI/CD need time to adjust. "Why can't I just kubectl apply?" is a common question. The mental model shift from push to pull takes time.

GitOps vs Traditional CI/CD: The Real Comparison

| Aspect              | Traditional CI/CD      | GitOps                       |
|---------------------|------------------------|------------------------------|
| Deployment Method   | CI pushes to cluster   | Operator pulls from Git      |
| Cluster Credentials | Stored in CI system    | Never leave cluster          |
| Rollback Speed      | 10-30 minutes          | 1-2 minutes                  |
| Audit Trail         | CI logs (if retained)  | Complete Git history         |
| Drift Prevention    | Manual or scripted     | Automatic                    |
| Multi-cluster       | Complex pipeline logic | Different Git branches/paths |

Practical Flux CD Implementation

Bootstrap Flux in Your Cluster

bash
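# Install Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash

# Check prerequisites
flux check --pre

# Bootstrap Flux with GitHub (reads a personal access token from GITHUB_TOKEN)
# --personal is omitted because the owner here is an organization;
# --path matches the cluster directory in the GitOps repository layout
flux bootstrap github \
  --owner=your-org \
  --repository=platform-gitops \
  --branch=main \
  --path=clusters/prod-eu-west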

Structure Your Helm Releases

helm-repository.yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 1h
  url: https://kubernetes.github.io/ingress-nginx
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 5m
  chart:
    spec:
      chart: ingress-nginx
      version: '4.x'
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
  values:
    controller:
      service:
        type: LoadBalancer

Monitor Flux Operations

bash
# Check Flux component status
flux get all

# Watch Flux logs
flux logs --follow

# Get detailed reconciliation status
flux get helmreleases -A

# Force reconciliation (useful for testing)
flux reconcile source git flux-system

When GitOps Makes Sense (And When It Doesn't)

Perfect for GitOps

  • ✓ Multi-cluster deployments requiring consistency
  • ✓ Teams with strong audit and compliance requirements
  • ✓ Environments where configuration drift is problematic
  • ✓ Organizations with mature Git workflows
  • ✓ Stateless applications and services

Think Twice About GitOps

  • ✗ Rapid prototyping or experimental environments
  • ✗ Stateful applications requiring complex migrations
  • ✗ Teams without Kubernetes expertise
  • ✗ Environments requiring sub-minute deployment times
  • ✗ Applications with frequently changing secrets

Best Practices from Production

1. Separate Application and Infrastructure Repos

Keep application code separate from Kubernetes manifests. This allows different teams to own different parts and reduces merge conflicts.

2. Use Kustomize or Helm for Templating

Don't store raw YAML for every environment. Use Helm charts with environment-specific values or Kustomize overlays to reduce duplication.
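
One practical consequence of templating is that a broken values file only surfaces when Flux tries to render it. A hedged sketch of a CI check that renders the chart once per environment is shown below. The function name `render_all_environments` and the `atlas-<env>` release names are hypothetical; the paths follow the application repository layout shown earlier.

```shell
#!/bin/sh
# Hypothetical CI check: render the chart once per environment so that a
# broken values file fails the pipeline before Flux ever sees it.
render_all_environments() {
  chart_dir="$1"
  values_dir="$2"
  for env in dev staging prod; do
    echo "Rendering ${env}..." >&2
    # Discard the rendered manifests; only the exit status matters here.
    helm template "atlas-${env}" "$chart_dir" \
      --values "${values_dir}/${env}.yaml" > /dev/null || return 1
  done
}

# Example, run from the application repository root:
#   render_all_environments helm/chart helm/values
```

The function stops at the first environment that fails to render, so the pipeline log points directly at the offending values file.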

3. Implement Progressive Delivery

Combine GitOps with Flagger for canary deployments. Flux deploys, Flagger gradually shifts traffic:

canary.yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: atlas-api
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: atlas-api
  progressDeadlineSeconds: 60
  service:
    port: 8080
  analysis:
    interval: 30s
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 30s

4. Set Up Alerts

Configure Flux to send alerts to Slack or PagerDuty when reconciliation fails:

alert.yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta1
kind: Alert
metadata:
  name: on-call-webapp
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: error
  eventSources:
    # The HelmReleases in this guide live in flux-system, so watch that namespace
    - kind: HelmRelease
      namespace: flux-system
      name: '*'
    - kind: Kustomization
      namespace: flux-system
      name: '*'

The Verdict: Is GitOps Worth It?

After two years of GitOps in production across multiple clusters and teams, my answer is: absolutely yes, with caveats.

GitOps has eliminated entire categories of problems. No more configuration drift, no more mysterious production changes, no more failed rollbacks. The audit trail alone has justified the investment during compliance audits.

But it's not free. You need to invest in tooling, training, and new processes. Secret management becomes more complex. Debugging requires understanding an additional abstraction layer. And you're adding a dependency on Git availability.

For enterprises running Kubernetes at scale, GitOps is becoming the de facto standard. For smaller teams or simpler deployments, the overhead might not be worth it. Evaluate your specific needs, but don't dismiss GitOps as just another buzzword. It's a fundamental shift in how we think about deployment, and for many organizations, it's the right shift.

If you're considering GitOps for your organization, also check out our article on monorepo architectures, which explores another critical aspect of modern DevOps infrastructure organization.