Platform Engineering Overview

Platform engineering is the discipline of building and maintaining Internal Developer Platforms (IDPs) — self-service toolchains and workflows that enable software engineering teams to ship faster without waiting on operations teams. It is the evolution of DevOps: instead of expecting every developer to be an infrastructure expert, you build a platform team that abstracts away infrastructure complexity behind curated, self-service interfaces.

The core insight is simple. In most organizations, developers spend 30-40% of their time on infrastructure tasks — provisioning environments, configuring CI/CD, debugging deployments, managing secrets, setting up monitoring. Platform engineering reclaims that time by turning repetitive infrastructure work into self-service capabilities.

This is not about removing control from developers. It is about removing toil. A good platform gives developers the power to deploy a new service in 10 minutes without filing a ticket, while maintaining the guardrails (security, compliance, cost controls) that the organization requires.

The Problem Platform Engineering Solves

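Consider what happens when a developer at a typical company wants to deploy a new microservice. Without a platform, each step (provisioning an environment, configuring CI/CD, managing secrets, setting up monitoring) tends to mean a ticket and a wait on another team. With a platform, the developer completes the same steps through self-service.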

The difference is not just speed. It is developer satisfaction, operational consistency, and organizational scalability. When every service is provisioned through the platform, every service gets the same security baselines, the same monitoring, the same deployment patterns.

What Is an Internal Developer Platform (IDP)?

An IDP is not a single product. It is a curated set of tools, services, and workflows that together provide a self-service experience for developers. Think of it as an internal PaaS (Platform as a Service) tailored to your organization's specific needs.

Core Capabilities of an IDP

Capability	What It Provides	Example Tools
Service Catalog	Central registry of all services, owners, dependencies	Backstage, Port, OpsLevel
Golden Path Templates	Pre-configured project scaffolds with best practices baked in	Cookiecutter, Yeoman, Backstage Templates
Infrastructure Provisioning	Self-service databases, caches, queues, storage	Terraform, Crossplane, Pulumi
CI/CD	Automated build, test, security scan, deploy pipelines	GitHub Actions, ArgoCD, Tekton
Observability	Logs, metrics, traces, dashboards, alerts	Grafana, Datadog, OpenTelemetry
Secrets Management	Secure storage and injection of credentials	Vault, AWS Secrets Manager, Doppler
Documentation	API references, runbooks, architecture docs	TechDocs, Swagger, Confluence
Cost Management	Per-team and per-service cloud cost visibility	Kubecost, Infracost, CloudHealth

Platform as a Product

The most important mindset shift in platform engineering is treating the platform as a product, not a project. Your internal developers are your customers. The platform team builds features, gathers feedback, iterates, and measures adoption — just like a product team building a SaaS product.

Product Thinking Applied to Platforms

Product Principle	Platform Application
Know your customer	Interview developers about their biggest pain points
Minimum viable product	Start with one golden path, not ten
Measure adoption	Track how many teams use each platform capability
Self-service over tickets	If a developer has to ask someone, the platform failed
Documentation is part of the product	Undocumented capabilities do not exist
Backwards compatibility	Breaking changes erode trust

Start With the Most Painful Problem

Do not try to build a complete IDP from day one. Find the single most painful infrastructure task your developers face — environment provisioning? secret management? CI/CD setup? — and solve that first. Then expand.

The Platform Team

Team Structure

A platform team is a dedicated engineering team responsible for building and maintaining the IDP. It is not the old ops team with a new name. Platform engineers write code, build products, and ship features — they just happen to build for internal developers instead of external customers.

Role	Responsibility
Platform Product Manager	Prioritize capabilities based on developer needs
Platform Engineers	Build and maintain platform services and tooling
Site Reliability Engineers (SRE)	Ensure platform reliability, define SLOs
Developer Advocates (optional)	Drive adoption, write docs, run workshops

Team Size Guidelines

Organization Size	Platform Team Size	Focus
20-50 engineers	1-2 platform engineers	Automate the top 3 pain points
50-200 engineers	3-6 platform engineers	Build a basic IDP with golden paths
200-1000 engineers	6-15 platform engineers	Full IDP with self-service catalog
1000+ engineers	15+ (often multiple teams)	Dedicated teams per platform domain

Self-Service Infrastructure

The defining feature of a good platform is self-service. Developers should be able to provision the resources they need without filing tickets or waiting for approvals (within predefined guardrails).

Example: Self-Service Database Provisioning

yaml

# developer-facing interface: platform.yaml in their repo
apiVersion: platform.company.io/v1
kind: ServiceResources
metadata:
  name: order-service
  team: commerce
spec:
  database:
    engine: postgresql
    version: "16"
    size: small       # small | medium | large | xlarge
    highAvailability: true
    backup:
      enabled: true
      retentionDays: 30

  cache:
    engine: redis
    version: "7"
    size: small
    evictionPolicy: allkeys-lru

  queue:
    engine: rabbitmq
    vhost: order-events

typescript

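// Platform controller processes the resource request
import * as aws from '@pulumi/aws';

type DbSize = 'small' | 'medium' | 'large' | 'xlarge';

// Parsed form of the `database` block in platform.yaml.
// The text does not define this type; the shape is inferred from usage.
interface DatabaseConfig {
  serviceName: string;
  team: string;
  version: string;
  size: DbSize;
  highAvailability: boolean;
  backup?: { retentionDays?: number };
}

// T-shirt sizes exposed to developers, mapped to RDS instance classes
const SIZE_MAP: Record<DbSize, string> = {
  small:  'db.t3.medium',
  medium: 'db.r6g.large',
  large:  'db.r6g.xlarge',
  xlarge: 'db.r6g.2xlarge',
};

export function provisionDatabase(config: DatabaseConfig): aws.rds.Instance {
  const instance = new aws.rds.Instance(`${config.serviceName}-db`, {
    engine: 'postgres',
    engineVersion: config.version,
    instanceClass: SIZE_MAP[config.size],
    allocatedStorage: 100, // GiB
    multiAz: config.highAvailability,
    // Fall back to 7 days only when no retention period was requested
    backupRetentionPeriod: config.backup?.retentionDays ?? 7,

    // Assumption: a fixed admin user whose password RDS generates and
    // stores in Secrets Manager, so no credential appears in code
    username: 'platform_admin',
    manageMasterUserPassword: true,

    // Platform guardrails: developers can't override these
    storageEncrypted: true,
    deletionProtection: true,
    performanceInsightsEnabled: true,

    tags: {
      Team: config.team,
      Service: config.serviceName,
      ManagedBy: 'platform',
    },
  });

  // Create the secret that will hold the connection string. Writing the
  // value (an aws.secretsmanager.SecretVersion) is left to a later step.
  new aws.secretsmanager.Secret(`${config.serviceName}-db-url`, {
    name: `/${config.team}/${config.serviceName}/DATABASE_URL`,
  });

  return instance;
}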
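
The text does not show how the developer's platform.yaml becomes the config object that provisionDatabase receives. The following is a minimal sketch of that step, assuming the YAML has already been parsed into a plain object. The `ServiceResourcesRequest` shape, the `DatabaseConfig` shape, and the `toDatabaseConfig` helper are hypothetical names inferred from the fields used above, not a published schema.

```typescript
// Hypothetical shapes inferred from the platform.yaml example.
type DbSize = 'small' | 'medium' | 'large' | 'xlarge';

interface ServiceResourcesRequest {
  metadata: { name: string; team: string };
  spec: {
    database?: {
      engine: string;
      version: string;
      size: string;
      highAvailability?: boolean;
      backup?: { enabled: boolean; retentionDays?: number };
    };
  };
}

// Declared here so the sketch compiles on its own.
interface DatabaseConfig {
  serviceName: string;
  team: string;
  version: string;
  size: DbSize;
  highAvailability: boolean;
  backup?: { retentionDays?: number };
}

const ALLOWED_SIZES: DbSize[] = ['small', 'medium', 'large', 'xlarge'];

// Type guard: narrows an arbitrary string to one of the allowed sizes.
function isDbSize(value: string): value is DbSize {
  return (ALLOWED_SIZES as string[]).indexOf(value) !== -1;
}

// Validate the request and apply defaults before anything is provisioned,
// so the developer gets an actionable error instead of a cloud-provider one.
function toDatabaseConfig(req: ServiceResourcesRequest): DatabaseConfig {
  const db = req.spec.database;
  if (!db) {
    throw new Error(`${req.metadata.name}: spec.database is missing`);
  }
  if (db.engine !== 'postgresql') {
    throw new Error(`unsupported engine "${db.engine}" (this sketch only handles postgresql)`);
  }
  if (!isDbSize(db.size)) {
    throw new Error(`invalid size "${db.size}", expected one of: ${ALLOWED_SIZES.join(', ')}`);
  }

  const config: DatabaseConfig = {
    serviceName: req.metadata.name,
    team: req.metadata.team,
    version: db.version,
    size: db.size,
    highAvailability: db.highAvailability ?? false,
  };
  // Only carry a retention period when backups are enabled.
  if (db.backup && db.backup.enabled) {
    config.backup = { retentionDays: db.backup.retentionDays };
  }
  return config;
}
```

Rejecting an unknown size or engine at this boundary keeps the provisioning function free of input checks, which is one reasonable way to split the work; a real controller might instead validate against a JSON Schema or a Kubernetes CRD.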

Guardrails, Not Gates

The platform should enforce organizational standards without blocking developers. Examples of guardrails:

Cost limits: Any request under $500/month auto-approves. Above that, requires team lead sign-off.
Security baselines: All databases must have encryption at rest. No public endpoints without authentication. All secrets in Vault.
Compliance constraints: PII workloads must run in specific regions. Healthcare data requires HIPAA-compliant instances.
Naming conventions: All resources follow {team}-{service}-{resource}-{env} naming.
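
The guardrails above can be expressed as policy checks that run before anything is provisioned. The sketch below is one way to do that, assuming a hypothetical `ResourceRequest` shape and `evaluateGuardrails` function. The $500 threshold and the naming pattern come from the list above; treating a request of exactly $500 as needing sign-off is an assumption, since the list does not say.

```typescript
type Decision = 'auto-approve' | 'needs-team-lead' | 'rejected';

// Hypothetical request shape; field names are illustrative only.
interface ResourceRequest {
  name: string; // expected form: {team}-{service}-{resource}-{env}
  team: string;
  service: string;
  env: string;
  estimatedMonthlyCostUsd: number;
  encryptedAtRest: boolean;
  publicEndpoint: boolean;
  requiresAuth: boolean;
}

interface GuardrailResult {
  decision: Decision;
  violations: string[];
}

const AUTO_APPROVE_LIMIT_USD = 500;

function evaluateGuardrails(req: ResourceRequest): GuardrailResult {
  const violations: string[] = [];

  // Security baselines are hard requirements.
  if (!req.encryptedAtRest) {
    violations.push('encryption at rest is required');
  }
  if (req.publicEndpoint && !req.requiresAuth) {
    violations.push('public endpoints must require authentication');
  }

  // Naming convention: the name must start with "{team}-{service}-",
  // end with "-{env}", and have a non-empty resource segment in between.
  const prefix = `${req.team}-${req.service}-`;
  const suffix = `-${req.env}`;
  const startsOk = req.name.slice(0, prefix.length) === prefix;
  const endsOk = req.name.slice(-suffix.length) === suffix;
  const hasResource = req.name.length > prefix.length + suffix.length;
  if (!(startsOk && endsOk && hasResource)) {
    violations.push(`name must follow ${prefix}{resource}${suffix}`);
  }

  if (violations.length > 0) {
    return { decision: 'rejected', violations };
  }

  // Cost limit: cheap requests go straight through, expensive ones
  // are routed to a human instead of being blocked outright.
  const decision: Decision =
    req.estimatedMonthlyCostUsd < AUTO_APPROVE_LIMIT_USD
      ? 'auto-approve'
      : 'needs-team-lead';
  return { decision, violations };
}
```

Returning every violation at once, rather than failing on the first, lets a developer fix the whole request in one pass. That is in keeping with the idea that a guardrail should redirect rather than stall.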

Golden Paths

A golden path is the organization's recommended, supported way to accomplish a common development task. It is not the only way — but it is the way that comes with full platform support, documentation, and guardrails.

Example Golden Paths

Task	Golden Path	What It Includes
Create a new API service	`platform new service --type api`	Repo scaffold, CI/CD, Dockerfile, Terraform, monitoring
Add a PostgreSQL database	`platform add database --engine postgres`	RDS provisioning, secret injection, backup config
Set up a cron job	`platform add cronjob --schedule "0 */6 * * *"`	Kubernetes CronJob, monitoring, alerting
Deploy to production	`git push` (PR merge to main)	CI/CD pipeline with tests, security scan, canary deploy

Golden Paths Must Be Optional

If developers feel forced to use the platform, they will route around it. The golden path must be genuinely better (faster, more reliable, less effort) than the alternative. Adoption should be driven by value, not mandates.

Measuring Platform Success

How do you know your platform is working? Track these metrics:

Platform Adoption Metrics

Metric	What It Measures	Target
Services on platform	% of services using platform golden paths	>80%
Self-service ratio	% of infra requests handled without tickets	>90%
Time to first deploy	How long from "create service" to production	<30 minutes
Developer NPS	Developer satisfaction with the platform	>40
Support ticket volume	Number of platform-related support requests	Decreasing trend
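
The first, second, and fourth metrics in this table are simple ratios, and a small calculation makes their definitions concrete. The sketch below assumes a hypothetical `PlatformSnapshot` record assembled from the service catalog, the ticketing system, and a developer survey. The NPS calculation uses the standard definition (share of scores 9 to 10 minus share of scores 0 to 6), which the table itself does not spell out.

```typescript
// Hypothetical input; in practice these counts would come from the
// service catalog, the ticket tracker, and a periodic survey.
interface PlatformSnapshot {
  totalServices: number;
  servicesOnGoldenPath: number;
  infraRequests: number;
  infraRequestsViaTicket: number;
  surveyScores: number[]; // 0-10 answers to "would you recommend the platform?"
}

interface AdoptionReport {
  servicesOnPlatformPct: number;
  selfServiceRatioPct: number;
  developerNps: number;
}

function percent(part: number, total: number): number {
  return total === 0 ? 0 : (part * 100) / total;
}

// Net Promoter Score: % promoters (9-10) minus % detractors (0-6),
// reported as a whole number between -100 and 100.
function netPromoterScore(scores: number[]): number {
  if (scores.length === 0) {
    return 0;
  }
  const promoters = scores.filter((s) => s >= 9).length;
  const detractors = scores.filter((s) => s <= 6).length;
  return Math.round(((promoters - detractors) * 100) / scores.length);
}

function adoptionReport(s: PlatformSnapshot): AdoptionReport {
  return {
    servicesOnPlatformPct: percent(s.servicesOnGoldenPath, s.totalServices),
    selfServiceRatioPct: percent(
      s.infraRequests - s.infraRequestsViaTicket,
      s.infraRequests,
    ),
    developerNps: netPromoterScore(s.surveyScores),
  };
}

// Targets taken from the table above.
function meetsTargets(r: AdoptionReport): boolean {
  return (
    r.servicesOnPlatformPct > 80 &&
    r.selfServiceRatioPct > 90 &&
    r.developerNps > 40
  );
}
```

For example, 42 of 50 services on golden paths gives 84%, and 10 ticketed requests out of 200 gives a self-service ratio of 95%.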

DORA Metrics Impact

Platform engineering can improve all four DORA metrics. The figures below are indicative ranges rather than guarantees:

DORA Metric	Without Platform	With Platform
Deployment Frequency	Weekly	Multiple per day
Lead Time for Changes	Days to weeks	Hours
Mean Time to Recovery	Hours	Minutes
Change Failure Rate	15-30%	<5%
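
To track these four numbers for your own teams, they can be derived from deployment records. The following is a minimal sketch, assuming a hypothetical `Deployment` record with commit, deploy, and restore timestamps. Real pipelines would pull these from the CI/CD system and the incident tracker, and the choice of median for lead time is an assumption.

```typescript
// Hypothetical record; field names are illustrative only.
interface Deployment {
  commitAt: Date;    // when the change was committed
  deployedAt: Date;  // when it reached production
  failed: boolean;   // whether it caused a production failure
  restoredAt?: Date; // when service was restored, if it failed
}

interface DoraSummary {
  deploymentsPerDay: number;
  medianLeadTimeHours: number;
  changeFailureRatePct: number;
  meanTimeToRecoveryMinutes: number;
}

const MS_PER_MINUTE = 60 * 1000;
const MS_PER_HOUR = 60 * MS_PER_MINUTE;

function median(values: number[]): number {
  if (values.length === 0) {
    return 0;
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

function mean(values: number[]): number {
  if (values.length === 0) {
    return 0;
  }
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Summarize all deployments that happened within a window of `windowDays`.
function summarizeDora(deployments: Deployment[], windowDays: number): DoraSummary {
  const leadTimesHours = deployments.map(
    (d) => (d.deployedAt.getTime() - d.commitAt.getTime()) / MS_PER_HOUR,
  );

  // Recovery time is measured from the failing deploy to restoration;
  // failures that are still open are left out of the mean.
  const recoveryMinutes: number[] = [];
  let failures = 0;
  for (const d of deployments) {
    if (!d.failed) {
      continue;
    }
    failures += 1;
    if (d.restoredAt) {
      recoveryMinutes.push(
        (d.restoredAt.getTime() - d.deployedAt.getTime()) / MS_PER_MINUTE,
      );
    }
  }

  return {
    deploymentsPerDay: windowDays > 0 ? deployments.length / windowDays : 0,
    medianLeadTimeHours: median(leadTimesHours),
    changeFailureRatePct:
      deployments.length === 0 ? 0 : (failures * 100) / deployments.length,
    meanTimeToRecoveryMinutes: mean(recoveryMinutes),
  };
}
```

Computing the summary before and after a team moves onto the platform gives a like-for-like comparison for that team, which is more persuasive than generic ranges.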

The Platform Maturity Model

Level	Characteristics	Platform Team Focus
Level 1: Ad Hoc	No platform. Developers manage their own infra. Lots of tickets.	Identify pain points, build the case for a platform team.
Level 2: Standardized	Shared scripts and templates. Documented procedures. Some automation.	Create golden path templates, basic CI/CD standardization.
Level 3: Self-Service	Developers provision resources on demand. Service catalog exists.	Build IDP, integrate observability, enforce guardrails.
Level 4: Optimized	Full IDP with metrics. Continuously improved based on developer feedback.	Optimize developer experience, reduce cognitive load.

Common Anti-Patterns

Platform Engineering Anti-Patterns

Building in isolation — The platform team builds what they think is cool, not what developers actually need. Always start with user research.
Mandatory adoption — Forcing teams to use the platform breeds resentment. Make it so good they choose it.
Too much abstraction — Hiding too much complexity makes debugging impossible. Developers need escape hatches.
No documentation — If a platform capability is not documented, it does not exist for developers.
One-size-fits-all — Different teams have different needs. The platform should be configurable, not rigid.
Treating it as a project — Projects end. Platforms need continuous investment and maintenance.

Section Contents

Page	What You'll Learn
Backstage & Developer Portals	Spotify's Backstage, software catalogs, templates, and plugins
Developer Experience (DX)	DORA metrics, SPACE framework, dev containers, and cognitive load
