Platform Engineer Guide
Complete guide for Platform Engineers: creating Golden Paths, modules, policies, and automations.
This guide walks you through creating and managing Argy platform capabilities.
Role Overview
As a Platform Engineer, you are responsible for:
- Golden Paths: defining standard paths for development teams
- Modules: creating reusable automations
- Policies: defining governance rules
- Integrations: connecting Argy to your existing tools
Golden Paths
A Golden Path is a pre-approved, standardized way to build a service: it bundles a file template, modules, and policies so that developers follow best practices by default.
Creating a Golden Path
- Go to Studio → Golden Paths
- Click Create a Golden Path
- Fill in the basic information:
  - Name: e.g., nodejs-microservice
  - Description: detailed description
  - Tags: for search (e.g., backend, nodejs, api)
  - Icon: choose an icon
Golden Path Structure
A Golden Path is defined in YAML:
# golden-path.yaml
apiVersion: argy.cloud/v1
kind: GoldenPath
metadata:
  name: nodejs-microservice
  description: "Node.js microservice with Express, TypeScript, and tests"
  tags:
    - backend
    - nodejs
    - typescript
    - api
spec:
  # File template
  template:
    source: git
    repository: https://github.com/your-org/templates
    path: nodejs-microservice
    branch: main
  # Variables to be filled by the user
  parameters:
    - name: serviceName
      type: string
      description: "Service name"
      required: true
      pattern: "^[a-z][a-z0-9-]*$"
    - name: port
      type: number
      description: "Listening port"
      default: 3000
      min: 1024
      max: 65535
    - name: database
      type: select
      description: "Database type"
      options:
        - postgresql
        - mongodb
        - none
      default: postgresql
    - name: enableAuth
      type: boolean
      description: "Enable JWT authentication"
      default: true
  # Modules to apply automatically
  modules:
    - name: create-repo
      params:
        repoName: "{{ serviceName }}"
        template: nodejs-microservice
    - name: setup-ci
      params:
        type: github-actions
        tests: true
        lint: true
    - name: deploy-k8s
      when: "{{ database != 'none' }}"
      params:
        namespace: "{{ serviceName }}"
        replicas: 2
  # Policies to verify
  policies:
    - security-scan
    - code-coverage-80
    - no-secrets-in-code
  # Associated documentation
  docs:
    - title: "Getting Started Guide"
      url: "https://docs.internal/nodejs-microservice"
    - title: "Code Standards"
      url: "https://docs.internal/code-standards"
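To make the parameters section more concrete, here is a minimal Python sketch of how user-supplied values could be checked against those declarations. The `validate_params` helper and its error messages are hypothetical and are not Argy's actual validator; only the field names (`type`, `required`, `pattern`, `min`, `max`, `options`, `default`) come from the YAML above.

```python
import re

# Parameter declarations mirroring the `parameters` section of golden-path.yaml.
PARAMETERS = [
    {"name": "serviceName", "type": "string", "required": True,
     "pattern": r"^[a-z][a-z0-9-]*$"},
    {"name": "port", "type": "number", "default": 3000, "min": 1024, "max": 65535},
    {"name": "database", "type": "select",
     "options": ["postgresql", "mongodb", "none"], "default": "postgresql"},
    {"name": "enableAuth", "type": "boolean", "default": True},
]


def validate_params(specs, values):
    """Hypothetical validator: returns (resolved_values, errors)."""
    resolved, errors = {}, []
    for spec in specs:
        name, kind = spec["name"], spec["type"]
        if name not in values:
            # Missing value: report it if required, otherwise fall back to the default
            if spec.get("required"):
                errors.append(f"{name}: required")
            elif "default" in spec:
                resolved[name] = spec["default"]
            continue
        value = values[name]
        error = None
        if kind == "string":
            if not isinstance(value, str):
                error = "expected a string"
            elif "pattern" in spec and not re.fullmatch(spec["pattern"], value):
                error = f"does not match {spec['pattern']}"
        elif kind == "number":
            # bool is a subclass of int in Python, so exclude it explicitly
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                error = "expected a number"
            elif value < spec.get("min", value) or value > spec.get("max", value):
                error = f"must be between {spec.get('min')} and {spec.get('max')}"
        elif kind == "select":
            if value not in spec["options"]:
                error = f"must be one of {spec['options']}"
        elif kind == "boolean":
            if not isinstance(value, bool):
                error = "expected a boolean"
        if error:
            errors.append(f"{name}: {error}")
        else:
            resolved[name] = value
    return resolved, errors
```

Calling `validate_params(PARAMETERS, {"serviceName": "orders-api"})` fills in the defaults for `port`, `database`, and `enableAuth`, while a name such as `Orders_API` is rejected by the pattern.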
File Templates
Create a Git repository with your template structure:
nodejs-microservice/
├── .github/
│   └── workflows/
│       └── ci.yml.tpl
├── src/
│   ├── index.ts.tpl
│   ├── config/
│   │   └── index.ts.tpl
│   └── routes/
│       └── health.ts.tpl
├── tests/
│   └── health.test.ts.tpl
├── Dockerfile.tpl
├── package.json.tpl
├── tsconfig.json
└── README.md.tpl
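.tpl files use a templating syntax: {{ name }} placeholders are replaced with parameter values, and {{#if name}} ... {{/if}} blocks are kept only when the parameter is true: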
// src/index.ts.tpl
import express from 'express';
import { config } from './config';
{{#if enableAuth}}
import { authMiddleware } from './middleware/auth';
{{/if}}

const app = express();
const PORT = {{ port }};

// Registered before the auth middleware, so the health check stays unauthenticated
app.get('/health', (req, res) => {
  res.json({ status: 'ok', service: '{{ serviceName }}' });
});

{{#if enableAuth}}
app.use(authMiddleware);
{{/if}}

app.listen(PORT, () => {
  console.log(`{{ serviceName }} listening on port ${PORT}`);
});
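As a rough illustration of what rendering such a template involves, the sketch below handles only the two constructs shown above: `{{ name }}` placeholders and non-nested `{{#if name}}` blocks. It is a simplified stand-in written for this guide, not Argy's template engine, and the `render` function is a hypothetical name.

```python
import re

# {{#if name}} ... {{/if}} (non-nested); the optional newlines avoid leaving blank lines
IF_BLOCK = re.compile(r"\{\{#if (\w+)\}\}\n?(.*?)\{\{/if\}\}\n?", re.DOTALL)
# {{ name }} placeholders; JavaScript's ${PORT} uses single braces and is left alone
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, params):
    """Render a template string using the parameter values in `params`."""
    def resolve_if(match):
        # Keep the block body only when the named parameter is truthy
        return match.group(2) if params.get(match.group(1)) else ""

    text = IF_BLOCK.sub(resolve_if, template)
    # Unknown placeholder names raise KeyError rather than rendering silently
    return PLACEHOLDER.sub(lambda m: str(params[m.group(1)]), text)


TEMPLATE = """const PORT = {{ port }};
{{#if enableAuth}}
app.use(authMiddleware);
{{/if}}
console.log('{{ serviceName }}');
"""

if __name__ == "__main__":
    print(render(TEMPLATE, {"port": 3000, "enableAuth": True,
                            "serviceName": "orders-api"}))
```

With `enableAuth` set to false, the `app.use(authMiddleware);` line is dropped from the output.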
Publishing a Golden Path
- Test your Golden Path locally:
  argy-code golden-path validate ./golden-path.yaml
- Publish it:
  argy-code golden-path publish ./golden-path.yaml
- Or via the interface:
  - Go to Studio → Golden Paths
  - Click on your Golden Path
  - Click Publish
Modules
Modules are reusable automations that encapsulate complex actions.
Module Types
| Type | Description | Examples |
|---|---|---|
| Provisioning | Resource creation | Git Repo, K8s Namespace, Database |
| Deployment | Application deployment | Helm, Terraform, ArgoCD |
| Configuration | Service configuration | Secrets, ConfigMaps, Variables |
| Observability | Monitoring and alerting | Dashboards, Alerts, Logs |
| Security | Security and compliance | Scans, Policies, Certificates |
Creating a Module
- Go to Studio → Modules
- Click Create a module
- Choose the module type
Module Structure
# module.yaml
apiVersion: argy.cloud/v1
kind: Module
metadata:
  name: create-k8s-namespace
  description: "Creates a Kubernetes namespace with base resources"
  category: provisioning
  tags:
    - kubernetes
    - namespace
spec:
  # Input parameters
  inputs:
    - name: namespaceName
      type: string
      required: true
      description: "Namespace name"
      validation:
        pattern: "^[a-z][a-z0-9-]*$"
        maxLength: 63
    - name: resourceQuota
      type: select
      description: "Resource quota"
      options:
        - small   # 2 CPU, 4Gi RAM
        - medium  # 4 CPU, 8Gi RAM
        - large   # 8 CPU, 16Gi RAM
      default: small
    - name: labels
      type: object
      description: "Additional labels"
      default: {}
  # Module outputs
  outputs:
    - name: namespace
      type: string
      description: "Created namespace name"
    - name: kubeconfig
      type: secret
      description: "Kubeconfig to access the namespace"
  # Execution steps
  steps:
    - name: create-namespace
      action: kubernetes/apply
      params:
        manifest: |
          apiVersion: v1
          kind: Namespace
          metadata:
            name: {{ inputs.namespaceName }}
            labels:
              managed-by: argy
              {{#each inputs.labels}}
              {{ @key }}: {{ this }}
              {{/each}}
    - name: create-resource-quota
      action: kubernetes/apply
      params:
        manifest: |
          apiVersion: v1
          kind: ResourceQuota
          metadata:
            name: default-quota
            namespace: {{ inputs.namespaceName }}
          spec:
            hard:
              {{#if (eq inputs.resourceQuota "small")}}
              requests.cpu: "2"
              requests.memory: 4Gi
              limits.cpu: "4"
              limits.memory: 8Gi
              {{/if}}
              {{#if (eq inputs.resourceQuota "medium")}}
              requests.cpu: "4"
              requests.memory: 8Gi
              limits.cpu: "8"
              limits.memory: 16Gi
              {{/if}}
              {{#if (eq inputs.resourceQuota "large")}}
              requests.cpu: "8"
              requests.memory: 16Gi
              limits.cpu: "16"
              limits.memory: 32Gi
              {{/if}}
    - name: create-network-policy
      action: kubernetes/apply
      params:
        manifest: |
          apiVersion: networking.k8s.io/v1
          kind: NetworkPolicy
          metadata:
            name: default-deny
            namespace: {{ inputs.namespaceName }}
          spec:
            podSelector: {}
            policyTypes:
              - Ingress
              - Egress
    - name: set-outputs
      action: core/set-outputs
      params:
        namespace: "{{ inputs.namespaceName }}"
        kubeconfig: "{{ steps.create-namespace.kubeconfig }}"
  # Rollback on error
  rollback:
    - name: delete-namespace
      action: kubernetes/delete
      params:
        kind: Namespace
        name: "{{ inputs.namespaceName }}"
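The relationship between `steps` and `rollback` can be pictured with a small Python sketch. It assumes that steps run sequentially and that the rollback steps run as soon as any step fails; the guide does not spell out Argy's exact execution semantics, and `run_module` and the `execute` callback are hypothetical names used only for illustration.

```python
def run_module(steps, rollback, execute):
    """Run steps in order; on the first failure, run the rollback steps and re-raise.

    `execute` is a callable that performs one step (for example, applying a
    manifest) and raises an exception when the step fails.
    """
    completed = []
    try:
        for step in steps:
            execute(step)
            completed.append(step["name"])
    except Exception:
        # Assumption: every rollback step runs, regardless of which step failed
        for step in rollback:
            execute(step)
        raise
    return completed


# Step names taken from module.yaml above
STEPS = [
    {"name": "create-namespace"},
    {"name": "create-resource-quota"},
    {"name": "create-network-policy"},
    {"name": "set-outputs"},
]
ROLLBACK = [{"name": "delete-namespace"}]
```

In this model, a failure in `create-network-policy` triggers `delete-namespace`, and `set-outputs` never runs.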
Available Actions
| Category | Actions |
|---|---|
| kubernetes | apply, delete, patch, exec, logs |
| terraform | init, plan, apply, destroy |
| git | clone, commit, push, create-pr |
| helm | install, upgrade, uninstall, template |
| shell | exec (sandboxed) |
| http | request |
| vault | read, write, delete |
| core | set-outputs, wait, condition, loop |
Testing a Module
# Validate syntax
argy-code module validate ./module.yaml
# Run in dry-run mode
argy-code module run ./module.yaml \
  --input namespaceName=test-ns \
  --input resourceQuota=small \
  --dry-run
# Run for real
argy-code module run ./module.yaml \
  --input namespaceName=test-ns \
  --input resourceQuota=small
Policies
Policies define governance rules that are automatically applied.
Policy Types
| Type | Description |
|---|---|
| Validation | Verifies configurations before deployment |
| Mutation | Automatically modifies configurations |
| Audit | Generates compliance reports |
Creating a Policy
# policy.yaml
apiVersion: argy.cloud/v1
kind: Policy
metadata:
  name: require-resource-limits
  description: "All containers must have resource limits"
  severity: high
  category: security
spec:
  # Policy targets
  targets:
    - kind: Deployment
    - kind: StatefulSet
    - kind: DaemonSet
  # Validation rules
  rules:
    - name: check-cpu-limits
      message: "Container {{ container.name }} has no CPU limit"
      expression: |
        spec.template.spec.containers.all(c,
          has(c.resources) &&
          has(c.resources.limits) &&
          has(c.resources.limits.cpu)
        )
    - name: check-memory-limits
      message: "Container {{ container.name }} has no memory limit"
      expression: |
        spec.template.spec.containers.all(c,
          has(c.resources) &&
          has(c.resources.limits) &&
          has(c.resources.limits.memory)
        )
  # Actions on violation
  enforcement:
    mode: deny  # deny, warn, audit
  # Exceptions
  exceptions:
    - namespaces:
        - kube-system
        - argy-system
    - labels:
        policy-exempt: "true"
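The two rules and the exceptions above can be restated as a short Python sketch that operates on a manifest loaded as a dictionary. This is an illustrative equivalent of the rule logic, not how Argy evaluates expressions; `check_resource_limits` is a hypothetical name, and treating the two exception entries as alternatives (either one exempts the resource) is an assumption.

```python
EXEMPT_NAMESPACES = {"kube-system", "argy-system"}


def check_resource_limits(manifest):
    """Return violation messages for a Deployment-like manifest given as a dict."""
    meta = manifest.get("metadata", {})
    # Exceptions: an exempt namespace or the policy-exempt label skips the checks
    if meta.get("namespace") in EXEMPT_NAMESPACES:
        return []
    if meta.get("labels", {}).get("policy-exempt") == "true":
        return []

    violations = []
    for container in manifest["spec"]["template"]["spec"]["containers"]:
        limits = (container.get("resources") or {}).get("limits") or {}
        for resource, label in (("cpu", "CPU"), ("memory", "memory")):
            if resource not in limits:
                violations.append(
                    f"Container {container['name']} has no {label} limit"
                )
    return violations
```

With `enforcement.mode: deny`, a non-empty list would correspond to a rejected deployment; with `warn` or `audit`, the same messages would only be reported.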
Mutation Policy
# policy-mutation.yaml
apiVersion: argy.cloud/v1
kind: Policy
metadata:
  name: inject-labels
  description: "Automatically adds standard labels"
spec:
  targets:
    - kind: Deployment
    - kind: Service
  mutations:
    - name: add-managed-by-label
      patch:
        metadata:
          labels:
            managed-by: argy
            team: "{{ product.team }}"
            environment: "{{ product.environment }}"
    - name: add-annotations
      patch:
        metadata:
          annotations:
            argy.cloud/product-id: "{{ product.id }}"
            argy.cloud/deployed-at: "{{ now }}"
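One plausible reading of a `patch` entry is a recursive merge into the target resource, where existing keys are preserved and patch values win on conflict. The Python sketch below illustrates that reading only; the guide does not state Argy's merge semantics, and `apply_patch` is a hypothetical helper.

```python
import copy


def apply_patch(resource, patch):
    """Recursively merge `patch` into a copy of `resource` (patch values win)."""
    result = copy.deepcopy(resource)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are mappings: merge them key by key
            result[key] = apply_patch(result[key], value)
        else:
            # Scalars, lists, and new keys are taken from the patch as-is
            result[key] = copy.deepcopy(value)
    return result


if __name__ == "__main__":
    service = {"kind": "Service",
               "metadata": {"name": "api", "labels": {"app": "api"}}}
    patch = {"metadata": {"labels": {"managed-by": "argy", "team": "payments"}}}
    print(apply_patch(service, patch))
```

The existing `app` label survives the merge, and the original resource dictionary is left unmodified.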
Applying a Policy
- Via the interface:
  - Go to Studio → Policies
  - Click Create a policy
  - Paste your YAML
  - Click Publish
- Via CLI:
  argy-code policy apply ./policy.yaml
Checking Compliance
# Check a manifest
argy-code policy check ./deployment.yaml
# Check an entire namespace
argy-code policy audit --namespace production
# Generate a report
argy-code policy report --format html --output report.html
Integrations
Configuring a Git Integration
- Go to Administration → Integrations → Git
- Click Add a connection
- Choose the provider:
  - GitHub
  - GitLab
  - Azure DevOps
  - Bitbucket
For GitHub:
# Required permissions for the GitHub App
permissions:
  contents: write
  pull_requests: write
  workflows: write
  actions: read
Configuring a Kubernetes Integration
- Go to Administration → Integrations → Kubernetes
- Click Add a cluster
- Choose the method:
  - Agent: deploy the Argy agent in the cluster (recommended)
  - Kubeconfig: upload a kubeconfig (not the preferred method)
Via Agent (recommended):
# Install the agent in the cluster
helm repo add argy https://charts.argy.cloud
helm install argy-agent argy/agent \
  --namespace argy-system \
  --create-namespace \
  --set token="$ARGY_AGENT_TOKEN"
Configuring a Cloud Integration
AWS:
# Trust policy for the IAM role assumed by Argy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/api.argy.cloud"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "api.argy.cloud:sub": "tenant:YOUR_TENANT_ID"
        }
      }
    }
  ]
}
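If you manage several AWS accounts, it can help to generate this document rather than edit the ACCOUNT and YOUR_TENANT_ID placeholders by hand. The Python sketch below builds the same structure; `build_trust_policy` is a hypothetical helper, and the account and tenant values in the usage line are made-up examples.

```python
import json


def build_trust_policy(account_id, tenant_id, issuer="api.argy.cloud"):
    """Build the trust policy shown above for one AWS account and Argy tenant."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{issuer}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {f"{issuer}:sub": f"tenant:{tenant_id}"}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Example values only; substitute your own account ID and tenant ID
    print(json.dumps(build_trust_policy("123456789012", "example-tenant"), indent=2))
```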
Azure:
- Create a Service Principal
- Assign the necessary roles
- Configure in Argy with Client ID and Secret
GCP:
- Create a Service Account
- Configure Workload Identity Federation
- Upload the JSON key in Argy
Approval Workflows
Creating a Workflow
# approval-workflow.yaml
apiVersion: argy.cloud/v1
kind: ApprovalWorkflow
metadata:
  name: production-deployment
  description: "Approval for production deployments"
spec:
  # Triggers
  triggers:
    - event: deployment
      conditions:
        - environment: production
        - changeType: [create, update]
  # Approval stages
  stages:
    - name: tech-lead-approval
      description: "Tech Lead approval"
      approvers:
        type: role
        role: tech-lead
      minApprovals: 1
      timeout: 24h
    - name: security-review
      description: "Security review"
      approvers:
        type: team
        team: security
      minApprovals: 1
      timeout: 48h
      conditions:
        - securityScanResult: high
    - name: manager-approval
      description: "Manager approval"
      approvers:
        type: user
        users:
          - manager@company.com
      timeout: 24h
  # Post-approval actions
  onApproved:
    - action: notify
      params:
        channel: slack
        message: "Deployment {{ deployment.name }} approved"
  onRejected:
    - action: notify
      params:
        channel: slack
        message: "Deployment {{ deployment.name }} rejected: {{ rejection.reason }}"
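To show how `minApprovals` and `timeout` might interact within a single stage, here is a small Python sketch. It rests on several assumptions that the guide does not confirm: timeouts use `m`, `h`, or `d` suffixes, a stage without `minApprovals` needs one approval, and approvals are counted per distinct approver. `parse_timeout` and `stage_status` are hypothetical names.

```python
from datetime import timedelta

# Assumed duration suffixes; the workflow above only uses "h"
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_timeout(value):
    """Convert a string such as '24h' into a timedelta."""
    amount, unit = int(value[:-1]), value[-1]
    return timedelta(**{_UNITS[unit]: amount})


def stage_status(stage, approvals, elapsed):
    """Return 'approved', 'timed-out', or 'pending' for one approval stage.

    `approvals` lists the approvers who have approved so far and `elapsed`
    is the time since the stage started.
    """
    required = stage.get("minApprovals", 1)
    if len(set(approvals)) >= required:
        return "approved"
    if elapsed > parse_timeout(stage["timeout"]):
        return "timed-out"
    return "pending"
```

For the `security-review` stage above, no approvals after 49 hours would yield `timed-out`, while a single approval within the window yields `approved`.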
Observability
Configuring Metrics
# metrics-config.yaml
apiVersion: argy.cloud/v1
kind: MetricsConfig
metadata:
  name: platform-metrics
spec:
  # Export to Prometheus
  prometheus:
    enabled: true
    endpoint: /metrics
  # Custom metrics
  customMetrics:
    - name: deployments_total
      type: counter
      description: "Total number of deployments"
      labels:
        - product
        - environment
        - status
    - name: deployment_duration_seconds
      type: histogram
      description: "Deployment duration"
      buckets: [30, 60, 120, 300, 600]
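The `buckets` list defines upper bounds in seconds. In Prometheus-style histograms, each bucket counts every observation less than or equal to its bound, so the counts are cumulative, and an implicit `+Inf` bucket counts everything. The Python sketch below works through that arithmetic on made-up durations; `bucket_counts` is a hypothetical helper, not part of Argy.

```python
import bisect

# Upper bounds in seconds, as declared for deployment_duration_seconds
BUCKETS = [30, 60, 120, 300, 600]


def bucket_counts(durations, buckets=BUCKETS):
    """Return cumulative counts per upper bound, plus the +Inf bucket."""
    ordered = sorted(durations)
    counts = {}
    for bound in buckets:
        # bisect_right gives the number of observations <= bound
        counts[str(bound)] = bisect.bisect_right(ordered, bound)
    counts["+Inf"] = len(ordered)
    return counts


if __name__ == "__main__":
    # Five example deployments lasting 12s, 45s, 60s, 200s and 900s
    print(bucket_counts([12, 45, 60, 200, 900]))
```

A deployment that takes 900 seconds only appears in the `+Inf` bucket, which is a hint that the largest bound may be too low for slow deployments.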
Configuring Alerts
# alerts.yaml
apiVersion: argy.cloud/v1
kind: AlertRule
metadata:
  name: deployment-failure-rate
spec:
  expression: |
    sum(rate(deployments_total{status="failed"}[5m]))
      / sum(rate(deployments_total[5m])) > 0.1
  for: 5m
  severity: critical
  annotations:
    summary: "High deployment failure rate"
    description: "More than 10% of deployments are failing"
  notifications:
    - type: slack
      channel: "#platform-alerts"
    - type: pagerduty
      service: platform-oncall
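The alert expression is a ratio of failed deployments to all deployments, compared against 0.1, and `for: 5m` requires the condition to hold at every evaluation during that window. The Python sketch below reproduces the arithmetic on plain counts; the function names and sample numbers are illustrative assumptions, and real evaluation is done by the metrics backend.

```python
def failure_ratio(failed, total):
    """Share of failed deployments; 0.0 when there were no deployments."""
    return failed / total if total else 0.0


def should_fire(samples, threshold=0.1):
    """Decide whether the alert fires.

    `samples` holds (failed, total) pairs, one per evaluation inside the
    `for` window. The alert fires only if every evaluation exceeds the threshold.
    """
    return bool(samples) and all(
        failure_ratio(failed, total) > threshold for failed, total in samples
    )
```

For example, 3 failures out of 20 deployments is a ratio of 0.15, which exceeds the threshold; a single evaluation at 1 out of 20 (0.05) inside the window keeps the alert from firing.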
Best Practices
1. Version Golden Paths
# Use semantic tags
git tag -a v1.0.0 -m "Initial release"
git push origin v1.0.0
# Reference the version in Argy
spec:
  template:
    source: git
    repository: https://github.com/org/templates
    ref: v1.0.0  # Use a tag, not main
2. Test Modules
# module-test.yaml
apiVersion: argy.cloud/v1
kind: ModuleTest
metadata:
  name: test-create-namespace
spec:
  module: create-k8s-namespace
  testCases:
    - name: basic-creation
      inputs:
        namespaceName: test-ns
        resourceQuota: small
      assertions:
        - type: exists
          resource: Namespace/test-ns
        - type: exists
          resource: ResourceQuota/default-quota
    - name: invalid-name
      inputs:
        namespaceName: "Invalid_Name"
      expectError: true
      errorContains: "pattern"
3. Document Policies
Each policy should have:
- A clear description
- Examples of valid/invalid configurations
- Documented exceptions
- A contact for questions
4. Monitor Adoption
Track adoption metrics:
- Number of projects using each Golden Path
- Policy compliance rate
- Average deployment time
- Developer satisfaction (surveys)
Troubleshooting
A module fails
- Check detailed logs:
  argy-code module logs --run-id <run-id>
- Verify agent permissions
- Test in dry-run mode
- Check resource quotas
A policy blocks a legitimate deployment
- Check if an exception is needed
- Add the policy-exempt: "true" label temporarily
- Create a permanent exception if justified
- Document the reason for the exception
Golden Paths are not updating
- Verify the Git repository is accessible
- Check Git credentials
- Force a sync:
  argy-code golden-path sync --force