
Platform Engineering Overview

Platform engineering is the discipline of building and maintaining Internal Developer Platforms (IDPs) — self-service toolchains and workflows that enable software engineering teams to ship faster without waiting on operations teams. It is the evolution of DevOps: instead of expecting every developer to be an infrastructure expert, you build a platform team that abstracts away infrastructure complexity behind curated, self-service interfaces.

The core insight is simple. In most organizations, developers spend 30-40% of their time on infrastructure tasks — provisioning environments, configuring CI/CD, debugging deployments, managing secrets, setting up monitoring. Platform engineering reclaims that time by turning repetitive infrastructure work into self-service capabilities.

This is not about removing control from developers. It is about removing toil. A good platform gives developers the power to deploy a new service in 10 minutes without filing a ticket, while maintaining the guardrails (security, compliance, cost controls) that the organization requires.

The Problem Platform Engineering Solves

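Consider what happens when a developer at a typical company wants to deploy a new microservice. Without a platform, the work runs through tickets and hand-offs to operations teams. With a platform, the developer provisions the service through self-service interfaces.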

The difference is not just speed. It is developer satisfaction, operational consistency, and organizational scalability. When every service is provisioned through the platform, every service gets the same security baselines, the same monitoring, the same deployment patterns.

What Is an Internal Developer Platform (IDP)?

An IDP is not a single product. It is a curated set of tools, services, and workflows that together provide a self-service experience for developers. Think of it as an internal PaaS (Platform as a Service) tailored to your organization's specific needs.

Core Capabilities of an IDP

| Capability | What It Provides | Example Tools |
|---|---|---|
| Service Catalog | Central registry of all services, owners, dependencies | Backstage, Port, OpsLevel |
| Golden Path Templates | Pre-configured project scaffolds with best practices baked in | Cookiecutter, Yeoman, Backstage Templates |
| Infrastructure Provisioning | Self-service databases, caches, queues, storage | Terraform, Crossplane, Pulumi |
| CI/CD | Automated build, test, security scan, deploy pipelines | GitHub Actions, ArgoCD, Tekton |
| Observability | Logs, metrics, traces, dashboards, alerts | Grafana, Datadog, OpenTelemetry |
| Secrets Management | Secure storage and injection of credentials | Vault, AWS Secrets Manager, Doppler |
| Documentation | API references, runbooks, architecture docs | TechDocs, Swagger, Confluence |
| Cost Management | Per-team and per-service cloud cost visibility | Kubecost, Infracost, CloudHealth |
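To make the service catalog capability concrete, here is a minimal sketch of the data it holds and two questions it can answer. The type CatalogEntry, the helper functions, and every service other than order-service are hypothetical, chosen only for illustration; real catalogs such as Backstage use their own entity model.

```typescript
// Hypothetical in-memory service catalog: who owns what, and what calls what.
interface CatalogEntry {
  name: string;
  owner: string;       // owning team
  dependsOn: string[]; // names of services this one calls
}

const catalog: CatalogEntry[] = [
  { name: 'order-service', owner: 'commerce', dependsOn: ['payment-service', 'inventory-service'] },
  { name: 'payment-service', owner: 'payments', dependsOn: [] },
  { name: 'inventory-service', owner: 'logistics', dependsOn: [] },
  { name: 'storefront', owner: 'web', dependsOn: ['order-service'] },
];

// Services that directly depend on the given service.
function dependentsOf(entries: CatalogEntry[], serviceName: string): string[] {
  return entries
    .filter((e) => e.dependsOn.indexOf(serviceName) !== -1)
    .map((e) => e.name);
}

// Owning team of a service, or undefined if it is not registered.
function ownerOf(entries: CatalogEntry[], serviceName: string): string | undefined {
  const owners = entries.filter((e) => e.name === serviceName).map((e) => e.owner);
  return owners.length > 0 ? owners[0] : undefined;
}

console.log(dependentsOf(catalog, 'order-service'));
console.log(ownerOf(catalog, 'payment-service'));
```

Keeping ownership and dependencies in one registry is what lets the platform answer "who do I page?" and "what breaks if this goes down?" without asking around.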

Platform as a Product

The most important mindset shift in platform engineering is treating the platform as a product, not a project. Your internal developers are your customers. The platform team builds features, gathers feedback, iterates, and measures adoption — just like a product team building a SaaS product.

Product Thinking Applied to Platforms

| Product Principle | Platform Application |
|---|---|
| Know your customer | Interview developers about their biggest pain points |
| Minimum viable product | Start with one golden path, not ten |
| Measure adoption | Track how many teams use each platform capability |
| Self-service over tickets | If a developer has to ask someone, the platform failed |
| Documentation is part of the product | Undocumented capabilities do not exist |
| Backwards compatibility | Breaking changes erode trust |

Start With the Most Painful Problem

Do not try to build a complete IDP from day one. Find the single most painful infrastructure task your developers face — environment provisioning? secret management? CI/CD setup? — and solve that first. Then expand.

The Platform Team

Team Structure

A platform team is a dedicated engineering team responsible for building and maintaining the IDP. It is not the old ops team with a new name. Platform engineers write code, build products, and ship features — they just happen to build for internal developers instead of external customers.

| Role | Responsibility |
|---|---|
| Platform Product Manager | Prioritize capabilities based on developer needs |
| Platform Engineers | Build and maintain platform services and tooling |
| Site Reliability Engineers (SRE) | Ensure platform reliability, define SLOs |
| Developer Advocates (optional) | Drive adoption, write docs, run workshops |

Team Size Guidelines

| Organization Size | Platform Team Size | Focus |
|---|---|---|
| 20-50 engineers | 1-2 platform engineers | Automate the top 3 pain points |
| 50-200 engineers | 3-6 platform engineers | Build a basic IDP with golden paths |
| 200-1000 engineers | 6-15 platform engineers | Full IDP with self-service catalog |
| 1000+ engineers | 15+ (often multiple teams) | Dedicated teams per platform domain |

Self-Service Infrastructure

The defining feature of a good platform is self-service. Developers should be able to provision the resources they need without filing tickets or waiting for approvals (within predefined guardrails).

Example: Self-Service Database Provisioning

yaml
# developer-facing interface: platform.yaml in their repo
apiVersion: platform.company.io/v1
kind: ServiceResources
metadata:
  name: order-service
  team: commerce
spec:
  database:
    engine: postgresql
    version: "16"
    size: small       # small | medium | large | xlarge
    highAvailability: true
    backup:
      enabled: true
      retentionDays: 30

  cache:
    engine: redis
    version: "7"
    size: small
    evictionPolicy: allkeys-lru

  queue:
    engine: rabbitmq
    vhost: order-events
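
typescript
// Platform controller processes the resource request
import * as aws from '@pulumi/aws';

type DatabaseSize = 'small' | 'medium' | 'large' | 'xlarge';

// Mirrors spec.database in platform.yaml, plus the service metadata
interface DatabaseConfig {
  serviceName: string;
  team: string;
  version: string;
  size: DatabaseSize;
  highAvailability: boolean;
  // backup.enabled is carried over from the YAML but not consulted in this sketch
  backup?: { enabled: boolean; retentionDays?: number };
}

// T-shirt sizes from platform.yaml mapped to RDS instance classes
const SIZE_MAP: Record<DatabaseSize, string> = {
  small:  'db.t3.medium',
  medium: 'db.r6g.large',
  large:  'db.r6g.xlarge',
  xlarge: 'db.r6g.2xlarge',
};

export function provisionDatabase(config: DatabaseConfig): aws.rds.Instance {
  const instance = new aws.rds.Instance(`${config.serviceName}-db`, {
    engine: 'postgres',
    engineVersion: config.version,
    instanceClass: SIZE_MAP[config.size],
    allocatedStorage: 100, // GiB
    multiAz: config.highAvailability,
    // ?? falls back to 7 days only when retentionDays is unset
    backupRetentionPeriod: config.backup?.retentionDays ?? 7,

    // Assumption: RDS generates the master password and keeps it in
    // Secrets Manager; the username here is illustrative
    username: 'platform_admin',
    manageMasterUserPassword: true,

    // Platform guardrails: developers can't override these
    storageEncrypted: true,
    deletionProtection: true,
    performanceInsightsEnabled: true,

    tags: {
      Team: config.team,
      Service: config.serviceName,
      ManagedBy: 'platform',
    },
  });

  // Create the secret that will hold the connection string. This only
  // registers the secret; its value still has to be written separately
  // (for example with a SecretVersion) once the endpoint is known.
  new aws.secretsmanager.Secret(`${config.serviceName}-db-url`, {
    name: `/${config.team}/${config.serviceName}/DATABASE_URL`,
  });

  return instance;
}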
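A controller like this would normally validate the request before provisioning anything. The sketch below shows one hypothetical way to check the parsed spec.database block from platform.yaml. The names validateDatabaseSpec and ValidationResult are assumptions, and so are the rules themselves (PostgreSQL only, retention between 1 and 35 days, where 35 is the RDS maximum for automated backups).

```typescript
// Hypothetical validator for the parsed `spec.database` block of platform.yaml.
// Field names mirror the YAML example; the rules are assumptions for illustration.
const ALLOWED_SIZES = ['small', 'medium', 'large', 'xlarge'] as const;
type AllowedSize = (typeof ALLOWED_SIZES)[number];

interface DatabaseSpec {
  engine: string;
  version: string;
  size: string;
  highAvailability?: boolean;
  backup?: { enabled: boolean; retentionDays?: number };
}

interface ValidationResult {
  valid: boolean;
  errors: string[];
}

// Type guard: narrows an arbitrary string to one of the supported sizes.
function isAllowedSize(value: string): value is AllowedSize {
  return (ALLOWED_SIZES as readonly string[]).indexOf(value) !== -1;
}

function validateDatabaseSpec(spec: DatabaseSpec): ValidationResult {
  const errors: string[] = [];

  if (spec.engine !== 'postgresql') {
    errors.push(`unsupported engine "${spec.engine}" (this sketch only handles postgresql)`);
  }
  if (!isAllowedSize(spec.size)) {
    errors.push(`size must be one of ${ALLOWED_SIZES.join(' | ')}, got "${spec.size}"`);
  }
  const days = spec.backup?.retentionDays;
  if (days !== undefined && (Math.floor(days) !== days || days < 1 || days > 35)) {
    errors.push(`backup.retentionDays must be an integer from 1 to 35, got ${days}`);
  }

  return { valid: errors.length === 0, errors };
}

const okSpec = validateDatabaseSpec({
  engine: 'postgresql',
  version: '16',
  size: 'small',
  highAvailability: true,
  backup: { enabled: true, retentionDays: 30 },
});
const badSpec = validateDatabaseSpec({ engine: 'postgresql', version: '16', size: 'huge' });
console.log(okSpec.valid, badSpec.errors);
```

Collecting every error rather than failing on the first one lets a developer fix the whole file in a single pass, which fits the self-service goal.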

Guardrails, Not Gates

The platform should enforce organizational standards without blocking developers.

Examples of guardrails:

  • Cost limits: Any request under $500/month auto-approves. Above that, requires team lead sign-off.
  • Security baselines: All databases must have encryption at rest. No public endpoints without authentication. All secrets in Vault.
  • Compliance constraints: PII workloads must run in specific regions. Healthcare data requires HIPAA-compliant instances.
  • Naming conventions: All resources follow {team}-{service}-{resource}-{env} naming.
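The guardrails listed above can be sketched as a single policy check. This is a minimal illustration, not a real policy engine: the ResourceRequest shape, the Decision type, and the choice to send a request of exactly $500 for approval are all assumptions made for the example.

```typescript
// Hypothetical guardrail check. Thresholds and rules mirror the examples in
// the text; every name here is an assumption for illustration.
interface ResourceRequest {
  team: string;
  service: string;
  resource: string; // e.g. "db", "cache"
  env: string;      // e.g. "prod", "staging"
  name: string;     // proposed resource name
  estimatedMonthlyCostUsd: number;
  encryptedAtRest: boolean;
}

type Decision =
  | { status: 'auto-approved' }
  | { status: 'needs-approval'; approver: string }
  | { status: 'rejected'; reasons: string[] };

const AUTO_APPROVE_LIMIT_USD = 500;

// Naming convention: {team}-{service}-{resource}-{env}
function expectedName(req: ResourceRequest): string {
  return [req.team, req.service, req.resource, req.env].join('-');
}

function evaluateRequest(req: ResourceRequest): Decision {
  // Baselines are checked first and reported together so they can be fixed in one go.
  const reasons: string[] = [];
  if (!req.encryptedAtRest) {
    reasons.push('encryption at rest is required');
  }
  if (req.name !== expectedName(req)) {
    reasons.push(`name must be "${expectedName(req)}"`);
  }
  if (reasons.length > 0) {
    return { status: 'rejected', reasons };
  }

  // Cost guardrail: cheap requests pass automatically, others need sign-off.
  if (req.estimatedMonthlyCostUsd < AUTO_APPROVE_LIMIT_USD) {
    return { status: 'auto-approved' };
  }
  return { status: 'needs-approval', approver: 'team-lead' };
}

const request: ResourceRequest = {
  team: 'commerce',
  service: 'order-service',
  resource: 'db',
  env: 'prod',
  name: 'commerce-order-service-db-prod',
  estimatedMonthlyCostUsd: 120,
  encryptedAtRest: true,
};
console.log(evaluateRequest(request).status); // prints "auto-approved"
```

The common case returns immediately with no human in the loop. Only expensive or non-compliant requests slow down, and a rejection tells the developer exactly what to change.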

Golden Paths

A golden path is the organization's recommended, supported way to accomplish a common development task. It is not the only way — but it is the way that comes with full platform support, documentation, and guardrails.

Example Golden Paths

| Task | Golden Path | What It Includes |
|---|---|---|
| Create a new API service | platform new service --type api | Repo scaffold, CI/CD, Dockerfile, Terraform, monitoring |
| Add a PostgreSQL database | platform add database --engine postgres | RDS provisioning, secret injection, backup config |
| Set up a cron job | platform add cronjob --schedule "0 */6 * * *" | Kubernetes CronJob, monitoring, alerting |
| Deploy to production | git push (PR merge to main) | CI/CD pipeline with tests, security scan, canary deploy |

Golden Paths Must Be Optional

If developers feel forced to use the platform, they will route around it. The golden path must be genuinely better (faster, more reliable, less effort) than the alternative. Adoption should be driven by value, not mandates.

Measuring Platform Success

How do you know your platform is working? Track these metrics:

Platform Adoption Metrics

| Metric | What It Measures | Target |
|---|---|---|
| Services on platform | % of services using platform golden paths | >80% |
| Self-service ratio | % of infra requests handled without tickets | >90% |
| Time to first deploy | How long from "create service" to production | <30 minutes |
| Developer NPS | Developer satisfaction with the platform | >40 |
| Support ticket volume | Number of platform-related support requests | Decreasing trend |
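Three of these metrics are simple ratios and can be computed from data most organizations already have. The sketch below assumes a hypothetical PlatformSnapshot shape. The NPS calculation uses the standard definition: the percentage of promoters (scores 9-10) minus the percentage of detractors (scores 0-6).

```typescript
// Hypothetical inputs; field names are assumptions for illustration.
interface PlatformSnapshot {
  totalServices: number;
  servicesOnGoldenPath: number;
  infraRequests: number;
  infraRequestsViaTicket: number;
  npsScores: number[]; // 0-10 survey answers
}

// Percentage of `part` in `whole`; returns 0 for an empty population.
function percent(part: number, whole: number): number {
  return whole === 0 ? 0 : (part * 100) / whole;
}

// Standard NPS: % promoters (9-10) minus % detractors (0-6), rounded.
function netPromoterScore(scores: number[]): number {
  const promoters = scores.filter((s) => s >= 9).length;
  const detractors = scores.filter((s) => s <= 6).length;
  return Math.round(percent(promoters, scores.length) - percent(detractors, scores.length));
}

function adoptionReport(s: PlatformSnapshot) {
  return {
    servicesOnPlatformPct: percent(s.servicesOnGoldenPath, s.totalServices),
    selfServiceRatioPct: percent(s.infraRequests - s.infraRequestsViaTicket, s.infraRequests),
    developerNps: netPromoterScore(s.npsScores),
  };
}

const report = adoptionReport({
  totalServices: 50,
  servicesOnGoldenPath: 42,
  infraRequests: 200,
  infraRequestsViaTicket: 14,
  npsScores: [10, 9, 9, 8, 7, 10, 6, 9, 3, 10],
});
console.log(report);
```

With these sample numbers, 84% of services are on the platform and 93% of requests are self-service, both above target, while an NPS of 40 falls just short of the ">40" goal.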

DORA Metrics Impact

Platform engineering directly improves DORA metrics:

| DORA Metric | Without Platform | With Platform |
|---|---|---|
| Deployment Frequency | Weekly | Multiple per day |
| Lead Time for Changes | Days to weeks | Hours |
| Mean Time to Recovery | Hours | Minutes |
| Change Failure Rate | 15-30% | <5% |
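To track these four metrics yourself, you need little more than a log of deployments. The sketch below is a simplified illustration under stated assumptions: the Deployment record shape is hypothetical, lead time is measured from commit to production deploy, and averages are plain means (many teams prefer medians, which are less sensitive to outliers).

```typescript
// Hypothetical deployment record; field names are assumptions for illustration.
interface Deployment {
  committedAt: string; // ISO timestamp of the commit
  deployedAt: string;  // ISO timestamp of the production deploy
  failed: boolean;     // whether the change caused a production failure
  restoredAt?: string; // ISO timestamp when service was restored, if it failed
}

const HOUR_MS = 60 * 60 * 1000;

function hoursBetween(start: string, end: string): number {
  return (new Date(end).getTime() - new Date(start).getTime()) / HOUR_MS;
}

function mean(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((a, b) => a + b, 0) / values.length;
}

// periodDays is the length of the observation window and must be > 0.
function doraMetrics(deployments: Deployment[], periodDays: number) {
  const failures = deployments.filter((d) => d.failed);
  const recoveryHours: number[] = [];
  failures.forEach((d) => {
    if (d.restoredAt !== undefined) {
      recoveryHours.push(hoursBetween(d.deployedAt, d.restoredAt));
    }
  });

  return {
    deploymentsPerDay: deployments.length / periodDays,
    meanLeadTimeHours: mean(deployments.map((d) => hoursBetween(d.committedAt, d.deployedAt))),
    meanTimeToRecoveryHours: mean(recoveryHours),
    changeFailureRatePct:
      deployments.length === 0 ? 0 : (failures.length * 100) / deployments.length,
  };
}

const sample: Deployment[] = [
  { committedAt: '2024-01-01T08:00:00Z', deployedAt: '2024-01-01T10:00:00Z', failed: false },
  { committedAt: '2024-01-01T09:00:00Z', deployedAt: '2024-01-01T13:00:00Z', failed: true, restoredAt: '2024-01-01T13:30:00Z' },
  { committedAt: '2024-01-02T08:00:00Z', deployedAt: '2024-01-02T09:00:00Z', failed: false },
  { committedAt: '2024-01-02T10:00:00Z', deployedAt: '2024-01-02T11:00:00Z', failed: false },
];
const dora = doraMetrics(sample, 2);
console.log(dora);
```

Because every service deployed through the platform goes through the same pipeline, the platform is a natural place to collect these records consistently.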

The Platform Maturity Model

| Level | Characteristics | Platform Team Focus |
|---|---|---|
| Level 1: Ad Hoc | No platform. Developers manage their own infra. Lots of tickets. | Identify pain points, build the case for a platform team. |
| Level 2: Standardized | Shared scripts and templates. Documented procedures. Some automation. | Create golden path templates, basic CI/CD standardization. |
| Level 3: Self-Service | Developers provision resources on demand. Service catalog exists. | Build IDP, integrate observability, enforce guardrails. |
| Level 4: Optimized | Full IDP with metrics. Continuously improved based on developer feedback. | Optimize developer experience, reduce cognitive load. |

Common Anti-Patterns

Platform Engineering Anti-Patterns

  1. Building in isolation — The platform team builds what they think is cool, not what developers actually need. Always start with user research.
  2. Mandatory adoption — Forcing teams to use the platform breeds resentment. Make it so good they choose it.
  3. Too much abstraction — Hiding too much complexity makes debugging impossible. Developers need escape hatches.
  4. No documentation — If a platform capability is not documented, it does not exist for developers.
  5. One-size-fits-all — Different teams have different needs. The platform should be configurable, not rigid.
  6. Treating it as a project — Projects end. Platforms need continuous investment and maintenance.

Section Contents

| Page | What You'll Learn |
|---|---|
| Backstage & Developer Portals | Spotify's Backstage, software catalogs, templates, and plugins |
| Developer Experience (DX) | DORA metrics, SPACE framework, dev containers, and cognitive load |

Further Reading

  • CI/CD Pipelines — The deployment automation layer of your platform
  • Kubernetes — Container orchestration that platforms are often built on
  • Terraform — Infrastructure as Code foundations
  • Observability — Monitoring, logging, and tracing
  • "Team Topologies" by Matthew Skelton and Manuel Pais
  • "Platform Engineering on Kubernetes" by Mauricio Salatino
  • CNCF Platforms Working Group reference architecture

"What I cannot create, I do not understand." — Richard Feynman