Draft 2: Pluggable Backend for Cozystack AGL

Status: Draft
Author: Maxim Belyy (kitsunoff)
Date: 2026-05-20
Target project: cozystack/cozystack

Table of contents

  1. Summary
  2. Motivation
  3. Goals
  4. Non-goals
  5. Design
    1. Backend abstraction
    2. Updated ApplicationDefinition schema
    3. Go types
    4. API server changes
    5. Generic status
    6. Generic reconciler
    7. Backwards compatibility
    8. Packaging
  6. Implementation plan
  7. Risks
  8. Migration plan
  9. Open questions
  10. Example: side-by-side definitions

Summary

Refactor Cozystack’s Application Generation Layer to support multiple release backends through a single, generic ApplicationDefinition. The first two backends are Helm (existing behaviour) and Terraform/OpenTofu via tofu-controller; the design leaves room for ArgoCD Application, Flux Kustomization, or plain manifests later without further schema changes.

Existing packages continue to work with no manifest changes, thanks to a defaulted backend type.

Motivation

Today the AGL is structurally a Helm runtime dressed up as a generic abstraction: the CRD field is named release, types embed helmv2.CrossNamespaceSourceReference, the REST layer constructs HelmRelease directly, and a dedicated reconciler patches HelmRelease fields. Adding Terraform support by copying the stack (Draft 1) gets us there fast, but every additional backend pays the same duplication cost.

A pluggable backend turns AGL from “Helm generator” into a real abstraction layer: the user-facing kind, OpenAPI schema, dashboard wiring, secret/service inclusion, and config-hash restart logic are written once; the per-backend code is a small interface implementation.

Goals

Non-goals

Design

Backend abstraction

A new package pkg/agl/backend/ defines:

package backend

type Type string

const (
    TypeHelm      Type = "Helm"
    TypeTerraform Type = "Terraform"
)

// Backend translates between the user-facing Application and a concrete
// backing object (HelmRelease, Terraform CR, ...).
type Backend interface {
    // Type returns the discriminator value, e.g. "Helm" or "Terraform".
    Type() Type

    // TargetGVK is the GroupVersionKind of the backing object (HelmRelease,
    // Terraform CR, ArgoCD Application, ...). Used by the REST layer to
    // list/get/watch.
    TargetGVK() schema.GroupVersionKind

    // TargetName computes the name of the backing object from the
    // user-facing name and the definition (typically applies a prefix).
    TargetName(appName string, def *v1alpha1.ApplicationDefinition) string

    // Build produces the backing object from an Application.
    Build(
        ctx context.Context,
        app *appsv1alpha1.Application,
        def *v1alpha1.ApplicationDefinition,
    ) (client.Object, error)

    // ProjectStatus translates the backing object's status into the
    // generic Application.Status the user sees.
    ProjectStatus(target client.Object) (appsv1alpha1.ApplicationStatus, error)

    // Reconcile keeps an existing backing object aligned with the
    // definition when the definition changes (mirrors today's
    // ApplicationDefinitionHelmReconciler logic).
    Reconcile(
        ctx context.Context,
        c client.Client,
        target client.Object,
        def *v1alpha1.ApplicationDefinition,
    ) (updated bool, err error)
}

// Registry resolves a definition to its backend implementation.
type Registry interface {
    Get(def *v1alpha1.ApplicationDefinition) (Backend, error)
    All() []Backend
}
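
To make the Registry contract concrete, here is a minimal sketch of a map-backed registry keyed by the discriminator. It is an illustration under assumptions, not the proposed implementation: Definition, staticBackend, MapRegistry and NewMapRegistry are hypothetical names, Backend is trimmed to the one method the lookup needs, and an unset type is assumed to resolve to Helm.

```go
package main

import "fmt"

// Type mirrors the backend discriminator from the draft.
type Type string

const (
	TypeHelm      Type = "Helm"
	TypeTerraform Type = "Terraform"
)

// Definition is a hypothetical stand-in for v1alpha1.ApplicationDefinition;
// only the discriminator matters for lookup.
type Definition struct {
	BackendType Type
}

// Backend is trimmed to the single method the registry relies on.
type Backend interface {
	Type() Type
}

// staticBackend is a placeholder implementation used for illustration.
type staticBackend struct{ t Type }

func (s staticBackend) Type() Type { return s.t }

// MapRegistry resolves backends from a map keyed by discriminator.
type MapRegistry struct {
	byType map[Type]Backend
	order  []Type // registration order, so All() is deterministic
}

// NewMapRegistry registers the given backends and rejects duplicates.
func NewMapRegistry(backends ...Backend) (*MapRegistry, error) {
	r := &MapRegistry{byType: make(map[Type]Backend, len(backends))}
	for _, b := range backends {
		if _, dup := r.byType[b.Type()]; dup {
			return nil, fmt.Errorf("backend %q registered twice", b.Type())
		}
		r.byType[b.Type()] = b
		r.order = append(r.order, b.Type())
	}
	return r, nil
}

// Get returns the backend for a definition.
// Assumption: an unset type resolves to Helm, matching the defaulted backend.
func (r *MapRegistry) Get(def *Definition) (Backend, error) {
	if def == nil {
		return nil, fmt.Errorf("nil definition")
	}
	t := def.BackendType
	if t == "" {
		t = TypeHelm
	}
	b, ok := r.byType[t]
	if !ok {
		return nil, fmt.Errorf("no backend registered for type %q", t)
	}
	return b, nil
}

// All returns every registered backend in registration order.
func (r *MapRegistry) All() []Backend {
	out := make([]Backend, 0, len(r.order))
	for _, t := range r.order {
		out = append(out, r.byType[t])
	}
	return out
}
```

Rejecting duplicate registrations at construction time keeps a wiring mistake from surfacing later as a silently shadowed backend.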

Two implementations land in the same PR:

Updated ApplicationDefinition schema

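ApplicationDefinition (apps.cozystack.io/v1alpha1) gains a spec.backend field; the existing spec.release field is kept as a deprecated alias for spec.backend.helm. The two definitions below show a Helm-backed and a Terraform-backed package.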

apiVersion: apps.cozystack.io/v1alpha1
kind: ApplicationDefinition
metadata:
  name: postgres
spec:
  application:
    kind: Postgres
    singular: postgres
    plural: postgreses
    openAPISchema: |
      { ... }
  backend:
    type: Helm                # discriminator: Helm | Terraform
    helm:                     # required when type=Helm
      chartRef:
        kind: ExternalArtifact
        name: cozystack-postgres-chart
        namespace: cozy-system
      prefix: postgres-
      labels:
        sharding.fluxcd.io/key: tenants
      valuesFrom:
        - kind: Secret
          name: cozystack-values
  secrets: { ... }
  services: { ... }
  dashboard: { ... }
---
apiVersion: apps.cozystack.io/v1alpha1
kind: ApplicationDefinition
metadata:
  name: vpc
spec:
  application:
    kind: VPC
    singular: vpc
    plural: vpcs
    openAPISchema: |
      { ... }
  backend:
    type: Terraform
    terraform:               # required when type=Terraform
      sourceRef:
        kind: OCIRepository
        name: aws-vpc-module
        namespace: cozy-system
      path: ./
      prefix: vpc-
      approvePlan: auto
      destroyResourcesOnDeletion: true
      writeOutputsToSecret:
        name: "-outputs"
      runnerPodTemplate:
        spec:
          serviceAccountName: aws-tofu-runner
  secrets: { ... }
  dashboard: { ... }
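
The "required when type=..." comments in the definitions above imply a union check. A minimal sketch of such a check follows, using reduced stand-in types (ChartRefName and SourceRefName are hypothetical fields). It assumes the check runs on a populated spec.backend, and the rule that the non-selected member must be unset is an assumption of this sketch, not something the draft states.

```go
package main

import (
	"errors"
	"fmt"
)

type BackendType string

const (
	BackendHelm      BackendType = "Helm"
	BackendTerraform BackendType = "Terraform"
)

// HelmBackend and TerraformBackend are reduced stand-ins for the real types.
type HelmBackend struct{ ChartRefName string }
type TerraformBackend struct{ SourceRefName string }

// Backend is the discriminated union: Type selects which member must be set.
type Backend struct {
	Type      BackendType
	Helm      *HelmBackend
	Terraform *TerraformBackend
}

// ValidateBackend checks that the member matching Type is present.
// Assumption: the other member must be unset, so the union is unambiguous.
func ValidateBackend(b *Backend) error {
	if b == nil {
		return errors.New("spec.backend is required")
	}
	switch b.Type {
	case BackendHelm:
		if b.Helm == nil {
			return errors.New("spec.backend.helm is required when type=Helm")
		}
		if b.Terraform != nil {
			return errors.New("spec.backend.terraform must be unset when type=Helm")
		}
	case BackendTerraform:
		if b.Terraform == nil {
			return errors.New("spec.backend.terraform is required when type=Terraform")
		}
		if b.Helm != nil {
			return errors.New("spec.backend.helm must be unset when type=Terraform")
		}
	default:
		return fmt.Errorf("unsupported backend type %q", b.Type)
	}
	return nil
}
```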

Go types

type ApplicationDefinitionSpec struct {
    Application ApplicationDefinitionApplication `json:"application"`

    // Backend is the new discriminated union.
    Backend *Backend `json:"backend,omitempty"`

    // Release is the legacy field. Kept for backwards compatibility.
    // If Backend is unset and Release is set, treated as Backend{Type: Helm, Helm: from(Release)}.
    // +deprecated
    Release *ApplicationDefinitionRelease `json:"release,omitempty"`

    Secrets   *ApplicationDefinitionResources `json:"secrets,omitempty"`
    Services  *ApplicationDefinitionResources `json:"services,omitempty"`
    Ingresses *ApplicationDefinitionResources `json:"ingresses,omitempty"`
    Dashboard *ApplicationDefinitionDashboard `json:"dashboard,omitempty"`
}

type Backend struct {
    // +kubebuilder:validation:Enum=Helm;Terraform
    Type BackendType `json:"type"`

    Helm      *HelmBackend      `json:"helm,omitempty"`
    Terraform *TerraformBackend `json:"terraform,omitempty"`
}

type HelmBackend struct {
    ChartRef   *helmv2.CrossNamespaceSourceReference `json:"chartRef"`
    Prefix     string                                `json:"prefix,omitempty"`
    Labels     map[string]string                     `json:"labels,omitempty"`
    ValuesFrom []helmv2.ValuesReference              `json:"valuesFrom,omitempty"`
}

type TerraformBackend struct {
    SourceRef                  tfv1alpha2.CrossNamespaceSourceReference `json:"sourceRef"`
    Path                       string                                   `json:"path,omitempty"`
    Prefix                     string                                   `json:"prefix,omitempty"`
    Labels                     map[string]string                        `json:"labels,omitempty"`
    Interval                   metav1.Duration                          `json:"interval,omitempty"`
    ApprovePlan                string                                   `json:"approvePlan,omitempty"`
    DestroyResourcesOnDeletion bool                                     `json:"destroyResourcesOnDeletion,omitempty"`
    WriteOutputsToSecret       *tfv1alpha2.WriteOutputsToSecretSpec     `json:"writeOutputsToSecret,omitempty"`
    RunnerPodTemplate          *tfv1alpha2.RunnerPodTemplate            `json:"runnerPodTemplate,omitempty"`
}

A defaulting webhook (or in-process normalization at apiserver startup) projects spec.release onto spec.backend.helm when only the legacy field is present, so existing packages keep working.
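
The projection itself is small. The sketch below shows one way it could work, with reduced stand-in types (Release, HelmBackend, Backend and Spec carry only illustrative fields, and NormalizeSpec is a hypothetical name). It assumes that spec.backend wins when both fields are set and that the legacy field is left in place after projection.

```go
package main

// Release is a reduced stand-in for the legacy spec.release field.
type Release struct {
	ChartRefName string
	Prefix       string
	Labels       map[string]string
}

// HelmBackend is a reduced stand-in for spec.backend.helm.
type HelmBackend struct {
	ChartRefName string
	Prefix       string
	Labels       map[string]string
}

type Backend struct {
	Type string
	Helm *HelmBackend
}

type Spec struct {
	Backend *Backend
	Release *Release
}

// NormalizeSpec projects spec.release onto spec.backend.helm when only the
// legacy field is present. It reports whether the spec was changed.
// Assumption: an explicit spec.backend always takes precedence.
func NormalizeSpec(s *Spec) bool {
	if s == nil || s.Backend != nil || s.Release == nil {
		return false
	}
	// Copy the labels so later edits to one field do not leak into the other.
	labels := make(map[string]string, len(s.Release.Labels))
	for k, v := range s.Release.Labels {
		labels[k] = v
	}
	s.Backend = &Backend{
		Type: "Helm",
		Helm: &HelmBackend{
			ChartRefName: s.Release.ChartRefName,
			Prefix:       s.Release.Prefix,
			Labels:       labels,
		},
	}
	return true
}
```

Because the function is idempotent, it can run both in a webhook and at apiserver startup without the two paths disagreeing.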

API server changes

func (r *REST) Create(ctx context.Context, obj runtime.Object, ...) (..., error) {
    app, ok := obj.(*appsv1alpha1.Application)
    if !ok { return nil, fmt.Errorf("expected *Application, got %T", obj) }
    def := r.definitions.Get(r.kindName)
    b, err := r.backends.Get(def)
    if err != nil { return nil, err }

    target, err := b.Build(ctx, app, def)
    if err != nil { return nil, err }

    // Common label/annotation injection (extracted once).
    injectAGLLabels(target, app, r.kindName)

    if err := r.c.Create(ctx, target); err != nil { return nil, err }
    return app, nil
}

Get/List/Update/Delete follow the same shape: the REST layer stays backend-agnostic, and only the backend knows the type of the target object.

Generic status

Application.Status becomes a small generic envelope plus an opaque per-backend extension:

type ApplicationStatus struct {
    // Common, projected by every backend.
    Conditions []metav1.Condition `json:"conditions,omitempty"`
    Ready      bool               `json:"ready,omitempty"`
    Message    string             `json:"message,omitempty"`

    // Backend-specific, raw JSON. Schema documented per backend.
    Backend *runtime.RawExtension `json:"backend,omitempty"`
}

This avoids forcing every field of every backend into the top-level schema while keeping the common ready/conditions contract uniform.
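
As an illustration of what ProjectStatus might do, the sketch below derives the common fields from a Ready condition and attaches the backend-specific part as raw JSON. The types are local stand-ins for metav1.Condition and runtime.RawExtension, Project is a hypothetical helper, and the sketch assumes the backing object reports readiness through a condition of type "Ready".

```go
package main

import "encoding/json"

// Condition is a reduced stand-in for metav1.Condition.
type Condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"` // "True", "False" or "Unknown"
	Message string `json:"message,omitempty"`
}

// ApplicationStatus mirrors the generic envelope; json.RawMessage stands in
// for runtime.RawExtension.
type ApplicationStatus struct {
	Conditions []Condition     `json:"conditions,omitempty"`
	Ready      bool            `json:"ready,omitempty"`
	Message    string          `json:"message,omitempty"`
	Backend    json.RawMessage `json:"backend,omitempty"`
}

// Project copies the target's conditions, derives Ready and Message from the
// "Ready" condition if there is one, and serializes the backend-specific
// extension as opaque JSON.
func Project(conds []Condition, extension interface{}) (ApplicationStatus, error) {
	st := ApplicationStatus{Conditions: append([]Condition(nil), conds...)}
	for _, c := range conds {
		if c.Type == "Ready" {
			st.Ready = c.Status == "True"
			st.Message = c.Message
			break
		}
	}
	if extension != nil {
		raw, err := json.Marshal(extension)
		if err != nil {
			return ApplicationStatus{}, err
		}
		st.Backend = raw
	}
	return st, nil
}
```

A target without a Ready condition projects as not ready with an empty message, which keeps the common contract well defined for every backend.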

Generic reconciler

Today’s ApplicationDefinitionHelmReconciler is replaced by a single ApplicationDefinitionReconciler that:

  1. Watches ApplicationDefinition.
  2. Resolves backend via the registry.
  3. Lists existing target objects by label selector apps.cozystack.io/application.kind=<Kind>.
  4. For each target, calls backend.Reconcile(ctx, c, target, def).

The existing config-hash restart logic for the aggregation apiserver is untouched.
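
Steps 3 and 4 of the reconciler can be sketched with in-memory stand-ins. Definition, Target, listByKind, ReconcileDefinition and chartBackend are hypothetical; listByKind takes the place of a client List call with a label selector, and the Backend interface is trimmed to a simplified Reconcile without context or client.

```go
package main

import "fmt"

// kindLabel is the label the reconciler selects targets by.
const kindLabel = "apps.cozystack.io/application.kind"

// Definition and Target are reduced stand-ins for ApplicationDefinition and
// the backing object.
type Definition struct {
	Kind  string
	Chart string
}

type Target struct {
	Name   string
	Labels map[string]string
	Chart  string
}

// Backend is trimmed to a simplified Reconcile.
type Backend interface {
	Reconcile(t *Target, def *Definition) (updated bool, err error)
}

// listByKind filters targets by the kind label (stands in for a List call).
func listByKind(all []*Target, kind string) []*Target {
	var out []*Target
	for _, t := range all {
		if t.Labels[kindLabel] == kind {
			out = append(out, t)
		}
	}
	return out
}

// ReconcileDefinition reconciles every target of the definition's kind and
// returns how many were updated. It stops at the first error.
func ReconcileDefinition(b Backend, def *Definition, all []*Target) (int, error) {
	updated := 0
	for _, t := range listByKind(all, def.Kind) {
		changed, err := b.Reconcile(t, def)
		if err != nil {
			return updated, fmt.Errorf("reconcile %s: %w", t.Name, err)
		}
		if changed {
			updated++
		}
	}
	return updated, nil
}

// chartBackend is a toy backend that aligns the target's chart reference
// with the definition.
type chartBackend struct{}

func (chartBackend) Reconcile(t *Target, def *Definition) (bool, error) {
	if t.Chart == def.Chart {
		return false, nil
	}
	t.Chart = def.Chart
	return true, nil
}
```

Stopping at the first error is a simplification; a real controller would more likely record the failure and continue or requeue.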

Backwards compatibility

Required behaviour:

Mechanism:

Packaging

Implementation plan

  1. Extract Helm logic: move all Helm-specific code from rest.go and applicationdefinition_helmreconciler.go behind the new Backend interface. No behaviour change, no schema change. The PR is large but mechanical, and it is covered by existing e2e tests.
  2. Schema, add backend: introduce spec.backend as optional, defaulted from spec.release. Includes CRD validation and conversion logic.
  3. Generic reconciler: replace the Helm-specific reconciler with the registry-driven one. The Helm backend is wired through the registry.
  4. Terraform backend: implement pkg/agl/backend/terraform/. Includes vars marshalling (Application.Spec to Terraform.Spec.Vars), status projection, and reconcile diffing.
  5. Example package: one Terraform-backed package end-to-end.
  6. Deprecation notice: emit a warning when spec.release is used; plan its removal in v1alpha2.
  7. Docs: update the author guide and add a backend authoring guide.

Stages 1 and 3 carry the most risk because they touch the hot path of every existing package. Stages 4–7 are additive.
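
The vars marshalling in stage 4 could look roughly like the sketch below, which turns each top-level field of the Application spec into one named variable with a raw JSON value. Variable and VarsFromSpec are hypothetical, and the name/value shape is an assumption about the target variable list, not a confirmed schema.

```go
package main

import (
	"encoding/json"
	"sort"
)

// Variable is a hypothetical stand-in for one entry of the Terraform
// object's variable list: a name plus a raw JSON value.
type Variable struct {
	Name  string          `json:"name"`
	Value json.RawMessage `json:"value"`
}

// VarsFromSpec converts the JSON object of an Application spec into one
// variable per top-level key. Names are sorted so the output is stable
// across reconciles and does not produce spurious diffs.
func VarsFromSpec(spec []byte) ([]Variable, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(spec, &fields); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(fields))
	for name := range fields {
		names = append(names, name)
	}
	sort.Strings(names)

	vars := make([]Variable, 0, len(names))
	for _, name := range names {
		vars = append(vars, Variable{Name: name, Value: fields[name]})
	}
	return vars, nil
}
```

Keeping values as raw JSON avoids lossy conversions: numbers, strings, lists and nested objects pass through unchanged.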

Risks

Migration plan

Open questions

Example: side-by-side definitions

---
apiVersion: apps.cozystack.io/v1alpha1
kind: ApplicationDefinition
metadata:
  name: postgres
spec:
  application:
    kind: Postgres
    singular: postgres
    plural: postgreses
    openAPISchema: |
      { ... }
  backend:
    type: Helm
    helm:
      chartRef:
        kind: ExternalArtifact
        name: cozystack-postgres-chart
        namespace: cozy-system
      prefix: postgres-
      valuesFrom:
        - kind: Secret
          name: cozystack-values
  secrets:
    include:
      - resourceNames: ["postgres--credentials"]
---
apiVersion: apps.cozystack.io/v1alpha1
kind: ApplicationDefinition
metadata:
  name: dns-zone
spec:
  application:
    kind: DNSZone
    singular: dnszone
    plural: dnszones
    openAPISchema: |
      { ... }
  backend:
    type: Terraform
    terraform:
      sourceRef:
        kind: OCIRepository
        name: cloudflare-dns-module
        namespace: cozy-system
      path: ./
      prefix: dns-
      approvePlan: auto
      destroyResourcesOnDeletion: true
      writeOutputsToSecret:
        name: "-outputs"
      runnerPodTemplate:
        spec:
          serviceAccountName: cloudflare-tofu-runner
  secrets:
    include:
      - resourceNames: ["-outputs"]

End users in the tenant-acme namespace then create:

---
apiVersion: apps.cozystack.io/v1alpha1
kind: Postgres
metadata: { name: app-db, namespace: tenant-acme }
spec:
  size: 20Gi
  replicas: 3
---
apiVersion: apps.cozystack.io/v1alpha1
kind: DNSZone
metadata: { name: acme-prod, namespace: tenant-acme }
spec:
  zone: acme.example.com
  ttl: 300

One CRD, one apiserver, one reconciler. The backend boundary is the only place that knows whether the result is a HelmRelease or a Terraform CR.
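
For the two applications above, the backing object names follow from the prefixes in their definitions. The sketch below assumes TargetName simply prepends the prefix, which the draft describes as the typical case; the signature is simplified to plain strings, and AppName, the reverse mapping a REST list call would need, is a hypothetical helper.

```go
package main

import "strings"

// TargetName prepends the definition's prefix to the user-facing name.
func TargetName(prefix, appName string) string {
	return prefix + appName
}

// AppName recovers the user-facing name from a backing object's name.
// It reports false when the name does not carry the prefix or is only
// the prefix itself.
func AppName(prefix, targetName string) (string, bool) {
	if !strings.HasPrefix(targetName, prefix) || len(targetName) == len(prefix) {
		return "", false
	}
	return strings.TrimPrefix(targetName, prefix), true
}
```

Under these assumptions, app-db maps to a HelmRelease named postgres-app-db and acme-prod maps to a Terraform object named dns-acme-prod.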