Platform Engineer Guide

Complete guide for Platform Engineers: creating Golden Paths, modules, policies, and automations.

This guide walks you through creating and managing Argy platform capabilities.

Role Overview

As a Platform Engineer, you are responsible for:

  • Golden Paths: defining standard paths for development teams
  • Modules: creating reusable automations
  • Policies: defining governance rules
  • Integrations: connecting Argy to your existing tools

Golden Paths

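A Golden Path is a pre-approved, standardized way to build a service. It bundles a file template, user-supplied parameters, modules, and policies so that developers follow best practices by default.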

Creating a Golden Path

  1. Go to Studio → Golden Paths
  2. Click Create a Golden Path
  3. Fill in the basic information:
    • Name: e.g., nodejs-microservice
    • Description: what the Golden Path provides and when to use it
    • Tags: for search (e.g., backend, nodejs, api)
    • Icon: choose an icon

Golden Path Structure

A Golden Path is defined in YAML:

# golden-path.yaml
apiVersion: argy.cloud/v1
kind: GoldenPath
metadata:
  name: nodejs-microservice
  description: "Node.js microservice with Express, TypeScript, and tests"
  tags:
    - backend
    - nodejs
    - typescript
    - api

spec:
  # File template
  template:
    source: git
    repository: https://github.com/your-org/templates
    path: nodejs-microservice
    branch: main

  # Variables to be filled by the user
  parameters:
    - name: serviceName
      type: string
      description: "Service name"
      required: true
      pattern: "^[a-z][a-z0-9-]*$"
    
    - name: port
      type: number
      description: "Listening port"
      default: 3000
      min: 1024
      max: 65535
    
    - name: database
      type: select
      description: "Database type"
      options:
        - postgresql
        - mongodb
        - none
      default: postgresql
    
    - name: enableAuth
      type: boolean
      description: "Enable JWT authentication"
      default: true

  # Modules to apply automatically
  modules:
    - name: create-repo
      params:
        repoName: "{{ serviceName }}"
        template: nodejs-microservice
    
    - name: setup-ci
      params:
        type: github-actions
        tests: true
        lint: true
    
    - name: deploy-k8s
      when: "{{ database != 'none' }}"
      params:
        namespace: "{{ serviceName }}"
        replicas: 2

  # Policies to verify
  policies:
    - security-scan
    - code-coverage-80
    - no-secrets-in-code

  # Associated documentation
  docs:
    - title: "Getting Started Guide"
      url: "https://docs.internal/nodejs-microservice"
    - title: "Code Standards"
      url: "https://docs.internal/code-standards"
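
The parameter constraints above (required, pattern, min/max, options, default) can be sketched as a small validator. This is an illustration of how such constraints might be checked, not Argy's actual implementation: the `ParamSpec` type and the `validateParams` function are hypothetical names introduced here, and the fallback-to-default behavior is an assumption.

```typescript
// Hypothetical types mirroring the `parameters` entries in golden-path.yaml.
type ParamValue = string | number | boolean;

interface ParamSpec {
  name: string;
  type: "string" | "number" | "select" | "boolean";
  required?: boolean;
  default?: ParamValue;
  pattern?: string;   // string parameters only
  min?: number;       // number parameters only
  max?: number;       // number parameters only
  options?: string[]; // select parameters only
}

// Returns one message per problem; an empty array means the input is valid.
function validateParams(
  specs: ParamSpec[],
  input: { [key: string]: ParamValue | undefined }
): string[] {
  const errors: string[] = [];
  for (const spec of specs) {
    // Assumption: a missing value falls back to the declared default.
    const value = input[spec.name] ?? spec.default;
    if (value === undefined) {
      if (spec.required) errors.push(`${spec.name}: value is required`);
      continue;
    }
    if (spec.type === "string") {
      if (typeof value !== "string") {
        errors.push(`${spec.name}: expected a string`);
      } else if (spec.pattern !== undefined && !new RegExp(spec.pattern).test(value)) {
        errors.push(`${spec.name}: must match ${spec.pattern}`);
      }
    } else if (spec.type === "number") {
      if (typeof value !== "number" || isNaN(value)) {
        errors.push(`${spec.name}: expected a number`);
      } else {
        if (spec.min !== undefined && value < spec.min) errors.push(`${spec.name}: must be >= ${spec.min}`);
        if (spec.max !== undefined && value > spec.max) errors.push(`${spec.name}: must be <= ${spec.max}`);
      }
    } else if (spec.type === "select") {
      const options = spec.options ?? [];
      if (typeof value !== "string" || options.indexOf(value) === -1) {
        errors.push(`${spec.name}: must be one of ${options.join(", ")}`);
      }
    } else if (typeof value !== "boolean") {
      errors.push(`${spec.name}: expected a boolean`);
    }
  }
  return errors;
}

// The four parameters declared in the Golden Path above.
const goldenPathParams: ParamSpec[] = [
  { name: "serviceName", type: "string", required: true, pattern: "^[a-z][a-z0-9-]*$" },
  { name: "port", type: "number", default: 3000, min: 1024, max: 65535 },
  { name: "database", type: "select", options: ["postgresql", "mongodb", "none"], default: "postgresql" },
  { name: "enableAuth", type: "boolean", default: true },
];
```

With these definitions, `validateParams(goldenPathParams, { serviceName: "orders-api" })` returns an empty array, while a `serviceName` of `"Orders_API"` is rejected by the pattern because it contains an uppercase letter and an underscore.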

File Templates

Create a Git repository with your template structure:

nodejs-microservice/
├── .github/
│   └── workflows/
│       └── ci.yml.tpl
├── src/
│   ├── index.ts.tpl
│   ├── config/
│   │   └── index.ts.tpl
│   └── routes/
│       └── health.ts.tpl
├── tests/
│   └── health.test.ts.tpl
├── Dockerfile.tpl
├── package.json.tpl
├── tsconfig.json
└── README.md.tpl

.tpl files use templating syntax:

// src/index.ts.tpl
import express from 'express';
import { config } from './config';
{{#if enableAuth}}
import { authMiddleware } from './middleware/auth';
{{/if}}

const app = express();
const PORT = {{ port }};

// Registered before the auth middleware, so health checks stay unauthenticated
app.get('/health', (_req, res) => {
  res.json({ status: 'ok', service: '{{ serviceName }}' });
});

{{#if enableAuth}}
// Routes registered after this line pass through the auth middleware
app.use(authMiddleware);
{{/if}}

app.listen(PORT, () => {
  console.log(`{{ serviceName }} listening on port ${PORT}`);
});
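
To make the templating syntax concrete, here is a minimal sketch of how `{{ name }}` placeholders and `{{#if name}}...{{/if}}` blocks could be expanded. It is a simplified stand-in written for illustration, not the engine Argy uses: `renderTemplate` is a hypothetical function, and it deliberately ignores nested blocks, dotted names such as `inputs.namespaceName`, and helpers such as `eq` or `#each`.

```typescript
type TemplateValue = string | number | boolean;

// Expands non-nested `{{#if name}}...{{/if}}` blocks, then `{{ name }}` placeholders.
function renderTemplate(
  source: string,
  values: { [key: string]: TemplateValue | undefined }
): string {
  // Keep or drop each conditional block based on the truthiness of its variable.
  const withoutBlocks = source.replace(
    /\{\{#if\s+(\w+)\s*\}\}\n?([\s\S]*?)\{\{\/if\}\}\n?/g,
    (_match: string, key: string, body: string) => (values[key] ? body : "")
  );
  // Substitute placeholders; an unknown name fails loudly instead of rendering "undefined".
  return withoutBlocks.replace(
    /\{\{\s*(\w+)\s*\}\}/g,
    (_match: string, key: string) => {
      const value = values[key];
      if (value === undefined) throw new Error("Missing template value: " + key);
      return String(value);
    }
  );
}
```

JavaScript template-literal expressions such as `${PORT}` use a single brace pair, so this sketch leaves them untouched; only double-brace expressions are expanded.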

Publishing a Golden Path

  1. Validate your Golden Path locally:

    argy-code golden-path validate ./golden-path.yaml
    
  2. Publish it:

    argy-code golden-path publish ./golden-path.yaml
    
  3. Or via the interface:

    • Go to Studio → Golden Paths
    • Click on your Golden Path
    • Click Publish

Modules

Modules are reusable automations that encapsulate complex actions.

Module Types

Type           Description               Examples
Provisioning   Resource creation         Git Repo, K8s Namespace, Database
Deployment     Application deployment    Helm, Terraform, ArgoCD
Configuration  Service configuration     Secrets, ConfigMaps, Variables
Observability  Monitoring and alerting   Dashboards, Alerts, Logs
Security       Security and compliance   Scans, Policies, Certificates

Creating a Module

  1. Go to Studio → Modules
  2. Click Create a module
  3. Choose the module type

Module Structure

# module.yaml
apiVersion: argy.cloud/v1
kind: Module
metadata:
  name: create-k8s-namespace
  description: "Creates a Kubernetes namespace with base resources"
  category: provisioning
  tags:
    - kubernetes
    - namespace

spec:
  # Input parameters
  inputs:
    - name: namespaceName
      type: string
      required: true
      description: "Namespace name"
      validation:
        pattern: "^[a-z][a-z0-9-]*$"
        maxLength: 63
    
    - name: resourceQuota
      type: select
      description: "Resource quota"
      options:
        - small   # requests: 2 CPU, 4Gi RAM (limits: 4 CPU, 8Gi)
        - medium  # requests: 4 CPU, 8Gi RAM (limits: 8 CPU, 16Gi)
        - large   # requests: 8 CPU, 16Gi RAM (limits: 16 CPU, 32Gi)
      default: small
    
    - name: labels
      type: object
      description: "Additional labels"
      default: {}

  # Module outputs
  outputs:
    - name: namespace
      type: string
      description: "Created namespace name"
    - name: kubeconfig
      type: secret
      description: "Kubeconfig to access the namespace"

  # Execution steps
  steps:
    - name: create-namespace
      action: kubernetes/apply
      params:
        manifest: |
          apiVersion: v1
          kind: Namespace
          metadata:
            name: {{ inputs.namespaceName }}
            labels:
              managed-by: argy
              {{#each inputs.labels}}
              {{ @key }}: {{ this }}
              {{/each}}

    - name: create-resource-quota
      action: kubernetes/apply
      params:
        manifest: |
          apiVersion: v1
          kind: ResourceQuota
          metadata:
            name: default-quota
            namespace: {{ inputs.namespaceName }}
          spec:
            hard:
              {{#if (eq inputs.resourceQuota "small")}}
              requests.cpu: "2"
              requests.memory: 4Gi
              limits.cpu: "4"
              limits.memory: 8Gi
              {{/if}}
              {{#if (eq inputs.resourceQuota "medium")}}
              requests.cpu: "4"
              requests.memory: 8Gi
              limits.cpu: "8"
              limits.memory: 16Gi
              {{/if}}
              {{#if (eq inputs.resourceQuota "large")}}
              requests.cpu: "8"
              requests.memory: 16Gi
              limits.cpu: "16"
              limits.memory: 32Gi
              {{/if}}

    - name: create-network-policy
      action: kubernetes/apply
      params:
        manifest: |
          apiVersion: networking.k8s.io/v1
          kind: NetworkPolicy
          metadata:
            name: default-deny
            namespace: {{ inputs.namespaceName }}
          spec:
            podSelector: {}
            policyTypes:
              - Ingress
              - Egress

    - name: set-outputs
      action: core/set-outputs
      params:
        namespace: "{{ inputs.namespaceName }}"
        kubeconfig: "{{ steps.create-namespace.kubeconfig }}"

  # Rollback on error
  rollback:
    - name: delete-namespace
      action: kubernetes/delete
      params:
        kind: Namespace
        name: "{{ inputs.namespaceName }}"
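
The relationship between `steps` and `rollback` can be sketched as follows: steps run in order, and the first failure stops the run and triggers the rollback list. This is a sketch under stated assumptions rather than Argy's runner. `ModuleStep`, `RunResult`, and `runModule` are hypothetical names, each `run` callback stands in for an action such as kubernetes/apply, and continuing through the remaining rollback steps after one of them fails is an assumed behavior.

```typescript
// Hypothetical step shape: `run` stands in for an action such as kubernetes/apply.
interface ModuleStep {
  name: string;
  run: () => void;
}

interface RunResult {
  ok: boolean;
  executed: string[];   // steps that completed, in order
  rolledBack: string[]; // rollback steps that completed after a failure
  error?: string;       // message of the failure that triggered the rollback
}

function runModule(steps: ModuleStep[], rollback: ModuleStep[]): RunResult {
  const executed: string[] = [];
  for (const step of steps) {
    try {
      step.run();
      executed.push(step.name);
    } catch (err) {
      // First failure: stop running steps and attempt every rollback step.
      const rolledBack: string[] = [];
      for (const undo of rollback) {
        try {
          undo.run();
          rolledBack.push(undo.name);
        } catch {
          // Assumption: a failed rollback step does not prevent the remaining ones.
        }
      }
      const error = err instanceof Error ? err.message : String(err);
      return { ok: false, executed, rolledBack, error };
    }
  }
  return { ok: true, executed, rolledBack: [] };
}
```

In the module above, a failure in create-resource-quota would leave the namespace created by the first step behind, which is why the rollback section deletes the namespace as a whole.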

Available Actions

Category    Actions
kubernetes  apply, delete, patch, exec, logs
terraform   init, plan, apply, destroy
git         clone, commit, push, create-pr
helm        install, upgrade, uninstall, template
shell       exec (sandboxed)
http        request
vault       read, write, delete
core        set-outputs, wait, condition, loop

Testing a Module

# Validate syntax
argy-code module validate ./module.yaml

# Run in dry-run mode
argy-code module run ./module.yaml \
  --input namespaceName=test-ns \
  --input resourceQuota=small \
  --dry-run

# Run for real
argy-code module run ./module.yaml \
  --input namespaceName=test-ns \
  --input resourceQuota=small

Policies

Policies define governance rules that are automatically applied.

Policy Types

Type        Description
Validation  Verifies configurations before deployment
Mutation    Automatically modifies configurations
Audit       Generates compliance reports

Creating a Policy

# policy.yaml
apiVersion: argy.cloud/v1
kind: Policy
metadata:
  name: require-resource-limits
  description: "All containers must have resource limits"
  severity: high
  category: security

spec:
  # Policy targets
  targets:
    - kind: Deployment
    - kind: StatefulSet
    - kind: DaemonSet

  # Validation rules
  rules:
    - name: check-cpu-limits
      message: "Container {{ container.name }} has no CPU limit"
      expression: |
        spec.template.spec.containers.all(c, 
          has(c.resources) && 
          has(c.resources.limits) && 
          has(c.resources.limits.cpu)
        )

    - name: check-memory-limits
      message: "Container {{ container.name }} has no memory limit"
      expression: |
        spec.template.spec.containers.all(c, 
          has(c.resources) && 
          has(c.resources.limits) && 
          has(c.resources.limits.memory)
        )

  # Actions on violation
  enforcement:
    mode: deny  # deny, warn, audit
    
  # Exceptions
  exceptions:
    - namespaces:
        - kube-system
        - argy-system
    - labels:
        policy-exempt: "true"
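
As a rough illustration of what this policy checks, the two rules and the exceptions can be written as a plain function over a workload manifest. This is a sketch only and does not reproduce Argy's expression engine: `ContainerSpec`, `Workload`, and `findViolations` are hypothetical names, and it assumes that either exception (namespace or label) is enough to exempt a workload and that the label is read from the workload's own metadata.

```typescript
interface ContainerSpec {
  name: string;
  resources?: { limits?: { cpu?: string; memory?: string } };
}

interface Workload {
  kind: string;
  metadata: { name: string; namespace?: string; labels?: { [key: string]: string } };
  spec: { template: { spec: { containers: ContainerSpec[] } } };
}

const TARGET_KINDS = ["Deployment", "StatefulSet", "DaemonSet"];
const EXEMPT_NAMESPACES = ["kube-system", "argy-system"];

// Returns one message per missing limit; an empty array means the workload complies.
function findViolations(workload: Workload): string[] {
  // Kinds outside the policy targets are never evaluated.
  if (TARGET_KINDS.indexOf(workload.kind) === -1) return [];

  // Assumption: matching any single exception exempts the workload.
  const labels = workload.metadata.labels ?? {};
  const namespace = workload.metadata.namespace ?? "default";
  if (EXEMPT_NAMESPACES.indexOf(namespace) !== -1 || labels["policy-exempt"] === "true") {
    return [];
  }

  const violations: string[] = [];
  for (const container of workload.spec.template.spec.containers) {
    const limits = container.resources?.limits;
    if (limits?.cpu === undefined) violations.push(`Container ${container.name} has no CPU limit`);
    if (limits?.memory === undefined) violations.push(`Container ${container.name} has no memory limit`);
  }
  return violations;
}

// Example manifest: the sidecar container is missing its memory limit.
const ordersApi: Workload = {
  kind: "Deployment",
  metadata: { name: "orders-api", namespace: "production" },
  spec: {
    template: {
      spec: {
        containers: [
          { name: "app", resources: { limits: { cpu: "500m", memory: "256Mi" } } },
          { name: "sidecar", resources: { limits: { cpu: "100m" } } },
        ],
      },
    },
  },
};
```

Here `findViolations(ordersApi)` reports a single violation for the sidecar container. With `mode: deny`, a non-empty result would correspond to a rejected deployment.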

Mutation Policy

# policy-mutation.yaml
apiVersion: argy.cloud/v1
kind: Policy
metadata:
  name: inject-labels
  description: "Automatically adds standard labels"

spec:
  targets:
    - kind: Deployment
    - kind: Service

  mutations:
    - name: add-managed-by-label
      patch:
        metadata:
          labels:
            managed-by: argy
            team: "{{ product.team }}"
            environment: "{{ product.environment }}"

    - name: add-annotations
      patch:
        metadata:
          annotations:
            argy.cloud/product-id: "{{ product.id }}"
            argy.cloud/deployed-at: "{{ now }}"
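
One plausible reading of a `patch` is a recursive merge into the target manifest, so that existing labels and annotations are preserved and the patched keys are added. The sketch below illustrates that reading. `applyPatch` is a hypothetical function, and both the merge semantics and the rule that patch values win on conflict are assumptions rather than documented Argy behavior.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isJsonObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `patch` into a copy of `target`; the inputs are not modified.
function applyPatch(target: JsonObject, patch: JsonObject): JsonObject {
  const result: JsonObject = { ...target };
  for (const key of Object.keys(patch)) {
    const incoming = patch[key];
    if (incoming === undefined) continue;
    const current = result[key];
    // Objects are merged key by key; anything else is replaced by the patch value.
    result[key] =
      isJsonObject(current) && isJsonObject(incoming)
        ? applyPatch(current, incoming)
        : incoming;
  }
  return result;
}
```

Applied to a Deployment whose metadata already carries an `app` label, patching in `managed-by` and `team` yields all three labels, with the original `app` label kept.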

Applying a Policy

  1. Via the interface:

    • Go to Studio → Policies
    • Click Create a policy
    • Paste your YAML
    • Click Publish
  2. Via CLI:

    argy-code policy apply ./policy.yaml
    

Checking Compliance

# Check a manifest
argy-code policy check ./deployment.yaml

# Check an entire namespace
argy-code policy audit --namespace production

# Generate a report
argy-code policy report --format html --output report.html

Integrations

Configuring a Git Integration

  1. Go to Administration → Integrations → Git
  2. Click Add a connection
  3. Choose the provider:
    • GitHub
    • GitLab
    • Azure DevOps
    • Bitbucket

For GitHub:

# Required permissions for the GitHub App
permissions:
  contents: write
  pull_requests: write
  workflows: write
  actions: read

Configuring a Kubernetes Integration

  1. Go to Administration → Integrations → Kubernetes
  2. Click Add a cluster
  3. Choose the method:
    • Agent: deploy the Argy agent in the cluster
    • Kubeconfig: upload a kubeconfig file (fallback; the agent is the preferred method)

Via Agent (recommended):

# Install the agent in the cluster
helm repo add argy https://charts.argy.cloud
helm install argy-agent argy/agent \
  --namespace argy-system \
  --create-namespace \
  --set token="$ARGY_AGENT_TOKEN"

Configuring a Cloud Integration

AWS (trust policy for the IAM role that Argy assumes; replace ACCOUNT and YOUR_TENANT_ID):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/api.argy.cloud"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "api.argy.cloud:sub": "tenant:YOUR_TENANT_ID"
        }
      }
    }
  ]
}

Azure:

  1. Create a Service Principal
  2. Assign the necessary roles
  3. Configure in Argy with Client ID and Secret

GCP:

  1. Create a Service Account
  2. Configure Workload Identity Federation
  3. Upload the JSON key in Argy

Approval Workflows

Creating a Workflow

# approval-workflow.yaml
apiVersion: argy.cloud/v1
kind: ApprovalWorkflow
metadata:
  name: production-deployment
  description: "Approval for production deployments"

spec:
  # Triggers
  triggers:
    - event: deployment
      conditions:
        - environment: production
        - changeType: [create, update]

  # Approval stages
  stages:
    - name: tech-lead-approval
      description: "Tech Lead approval"
      approvers:
        type: role
        role: tech-lead
        minApprovals: 1
      timeout: 24h
      
    - name: security-review
      description: "Security review"
      approvers:
        type: team
        team: security
        minApprovals: 1
      timeout: 48h
      conditions:
        - securityScanResult: high

    - name: manager-approval
      description: "Manager approval"
      approvers:
        type: user
        users:
          - manager@company.com
      timeout: 24h

  # Post-approval actions
  onApproved:
    - action: notify
      params:
        channel: slack
        message: "Deployment {{ deployment.name }} approved"
    
  onRejected:
    - action: notify
      params:
        channel: slack
        message: "Deployment {{ deployment.name }} rejected: {{ rejection.reason }}"
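
The way stages, conditions, and minApprovals interact can be sketched as a function that finds the next stage still waiting for approval. This is an interpretation for illustration, not Argy's engine. `ApprovalStage`, `stageApplies`, and `nextPendingStage` are hypothetical names, and the sketch assumes that stages are evaluated in order, that a stage whose conditions do not match is skipped, and that minApprovals defaults to 1 when omitted (as in manager-approval).

```typescript
interface ApprovalStage {
  name: string;
  minApprovals?: number;                   // assumed to default to 1 when omitted
  conditions?: { [key: string]: string };  // all entries must match for the stage to apply
}

type ApprovalContext = { [key: string]: string | undefined };

// A stage without conditions always applies.
function stageApplies(stage: ApprovalStage, context: ApprovalContext): boolean {
  const conditions = stage.conditions ?? {};
  return Object.keys(conditions).every((key) => context[key] === conditions[key]);
}

// Returns the first applicable stage that lacks approvals, or null when all are satisfied.
function nextPendingStage(
  stages: ApprovalStage[],
  context: ApprovalContext,
  approvals: { [stage: string]: number | undefined }
): string | null {
  for (const stage of stages) {
    if (!stageApplies(stage, context)) continue;
    if ((approvals[stage.name] ?? 0) < (stage.minApprovals ?? 1)) return stage.name;
  }
  return null;
}

// The three stages of the production-deployment workflow above.
const productionStages: ApprovalStage[] = [
  { name: "tech-lead-approval", minApprovals: 1 },
  { name: "security-review", minApprovals: 1, conditions: { securityScanResult: "high" } },
  { name: "manager-approval" },
];
```

Under these assumptions, a deployment with a low-severity scan result skips security-review: once the tech lead has approved, the next pending stage is manager-approval.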

Observability

Configuring Metrics

# metrics-config.yaml
apiVersion: argy.cloud/v1
kind: MetricsConfig
metadata:
  name: platform-metrics

spec:
  # Export to Prometheus
  prometheus:
    enabled: true
    endpoint: /metrics
    
  # Custom metrics
  customMetrics:
    - name: deployments_total
      type: counter
      description: "Total number of deployments"
      labels:
        - product
        - environment
        - status
        
    - name: deployment_duration_seconds
      type: histogram
      description: "Deployment duration"
      buckets: [30, 60, 120, 300, 600]
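
In a Prometheus histogram, buckets are cumulative: each bucket counts the observations less than or equal to its upper bound, and an implicit +Inf bucket counts everything. The sketch below shows what the `buckets` list above would record for a handful of deployment durations. `cumulativeBucketCounts` is a hypothetical helper and the sample durations are made up for illustration.

```typescript
interface BucketRow {
  le: string;    // upper bound of the bucket ("less than or equal")
  count: number; // observations at or below that bound
}

// Computes cumulative bucket counts the way a Prometheus histogram exposes them.
function cumulativeBucketCounts(bounds: number[], observations: number[]): BucketRow[] {
  const rows: BucketRow[] = bounds.map((bound) => ({
    le: String(bound),
    count: observations.filter((value) => value <= bound).length,
  }));
  // The +Inf bucket always equals the total number of observations.
  rows.push({ le: "+Inf", count: observations.length });
  return rows;
}

// Illustrative deployment durations in seconds (not real data).
const sampleDurations = [12, 45, 45, 200, 900];
const bucketRows = cumulativeBucketCounts([30, 60, 120, 300, 600], sampleDurations);
// bucketRows: 30 → 1, 60 → 3, 120 → 3, 300 → 4, 600 → 4, +Inf → 5
```

The 900-second deployment only appears in the +Inf bucket, which suggests extending the bucket list if deployments regularly exceed ten minutes.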

Configuring Alerts

# alerts.yaml
apiVersion: argy.cloud/v1
kind: AlertRule
metadata:
  name: deployment-failure-rate

spec:
  expression: |
    sum(rate(deployments_total{status="failed"}[5m])) 
    / sum(rate(deployments_total[5m])) > 0.1
  
  for: 5m
  severity: critical
  
  annotations:
    summary: "High deployment failure rate"
    description: "More than 10% of deployments are failing"
    
  notifications:
    - type: slack
      channel: "#platform-alerts"
    - type: pagerduty
      service: platform-oncall
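
The alert logic combines two ideas: the failure ratio must exceed 0.1, and it must stay above that threshold for the whole `for` duration before the alert fires. The sketch below walks through that arithmetic on a series of evaluation samples. `RateSample`, `exceedsThreshold`, and `isFiring` are hypothetical names, and the samples are assumed to be sorted by time.

```typescript
interface RateSample {
  atSeconds: number;       // evaluation time
  failedPerSecond: number; // sum(rate(deployments_total{status="failed"}[5m]))
  totalPerSecond: number;  // sum(rate(deployments_total[5m]))
}

function exceedsThreshold(sample: RateSample, threshold: number): boolean {
  // With no deployments the ratio is 0/0, which never satisfies the comparison.
  if (sample.totalPerSecond === 0) return false;
  return sample.failedPerSecond / sample.totalPerSecond > threshold;
}

// True when the condition has held on every sample for at least `forSeconds`
// up to and including the last sample.
function isFiring(samples: RateSample[], threshold: number, forSeconds: number): boolean {
  let pendingSince: number | null = null;
  let firing = false;
  for (const sample of samples) {
    if (!exceedsThreshold(sample, threshold)) {
      // Any dip below the threshold resets the pending timer.
      pendingSince = null;
      firing = false;
      continue;
    }
    if (pendingSince === null) pendingSince = sample.atSeconds;
    firing = sample.atSeconds - pendingSince >= forSeconds;
  }
  return firing;
}
```

For example, 2 failed deployments out of 10 gives a ratio of 0.2, which exceeds 0.1, but the alert only fires if that stays true across 300 seconds of consecutive evaluations. A ratio of exactly 0.1 does not trigger it, because the comparison is strict.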

Best Practices

1. Version Golden Paths

# Tag each release with a semantic version
git tag -a v1.0.0 -m "Initial release"
git push origin v1.0.0

# Reference the version in Argy
spec:
  template:
    source: git
    repository: https://github.com/org/templates
    ref: v1.0.0  # Use a tag, not main

2. Test Modules

# module-test.yaml
apiVersion: argy.cloud/v1
kind: ModuleTest
metadata:
  name: test-create-namespace

spec:
  module: create-k8s-namespace
  
  testCases:
    - name: basic-creation
      inputs:
        namespaceName: test-ns
        resourceQuota: small
      assertions:
        - type: exists
          resource: Namespace/test-ns
        - type: exists
          resource: ResourceQuota/default-quota
          
    - name: invalid-name
      inputs:
        namespaceName: "Invalid_Name"
      expectError: true
      errorContains: "pattern"

3. Document Policies

Each policy should have:

  • A clear description
  • Examples of valid/invalid configurations
  • Documented exceptions
  • A contact for questions

4. Monitor Adoption

Track adoption metrics:

  • Number of projects using each Golden Path
  • Policy compliance rate
  • Average deployment time
  • Developer satisfaction (surveys)

Troubleshooting

A module fails

  1. Check detailed logs:

    argy-code module logs --run-id <run-id>
    
  2. Verify agent permissions

  3. Test in dry-run mode

  4. Check resource quotas

A policy blocks a legitimate deployment

  1. Check if an exception is needed
  2. Add the policy-exempt: "true" label temporarily
  3. Create a permanent exception if justified
  4. Document the reason for the exception

Golden Paths are not updating

  1. Verify the Git repository is accessible
  2. Check Git credentials
  3. Force a sync:
    argy-code golden-path sync --force