
Serverless Architecture

Serverless computing lets you run code without provisioning or managing servers. You upload a function, define a trigger, and the cloud provider handles scaling, patching, and availability. You pay only for the compute time you actually use, billed on AWS Lambda in 1 ms increments. "Serverless" does not mean there are no servers; it means you do not have to think about them. The tradeoff is control: you give up control over the execution environment in exchange for handing server operations to the provider.

The Execution Model

How Lambda Actually Works

The Lambda Lifecycle

What Happens During Init

typescript
// Everything OUTSIDE the handler runs during Init (cold start)
import type { APIGatewayProxyEvent } from 'aws-lambda';
import { DynamoDB } from '@aws-sdk/client-dynamodb';
import { SSMClient, GetParameterCommand } from '@aws-sdk/client-ssm';

// Shape of the JSON stored in the parameter (illustrative; adjust to your config)
type DatabaseConfig = Record<string, unknown>;

// These run once per cold start and are reused by later invocations
const dynamodb = new DynamoDB({ region: 'us-east-1' });
const ssm = new SSMClient({ region: 'us-east-1' });

// Pre-fetch config during init
let dbConfig: DatabaseConfig;
async function initConfig() {
  const param = await ssm.send(new GetParameterCommand({
    Name: '/myapp/db-config',
    WithDecryption: true,
  }));
  dbConfig = JSON.parse(param.Parameter!.Value!);
}
const configPromise = initConfig();

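// The handler runs on every invocation, so keep it fast
export async function handler(event: APIGatewayProxyEvent) {
  await configPromise; // Ensure init is complete

  const id = event.pathParameters?.id;
  if (!id) {
    return { statusCode: 400, body: JSON.stringify({ message: 'Missing order id' }) };
  }

  const result = await dynamodb.getItem({
    TableName: 'orders',
    Key: { id: { S: id } },
  });

  // getItem returns no Item when the key does not exist
  if (!result.Item) {
    return { statusCode: 404, body: JSON.stringify({ message: 'Order not found' }) };
  }

  return {
    statusCode: 200,
    body: JSON.stringify(result.Item),
  };
}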
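
One caveat with awaiting a module-scope promise: if the first fetch fails, the rejected promise stays cached and every later invocation in that execution environment fails as well. The sketch below shows one possible way around that. `memoizeWithRetry` is an illustrative helper (not an AWS API), and `loadConfig` is a hypothetical stand-in for the SSM call that fails once to simulate a transient error.

```typescript
// Caches the result of an async initializer, but forgets a failed attempt
// so the next caller retries instead of reusing the rejected promise.
function memoizeWithRetry<T>(init: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      cached = init().catch((err) => {
        cached = undefined; // allow a retry on the next invocation
        throw err;
      });
    }
    return cached;
  };
}

// Hypothetical loader standing in for the parameter fetch; fails on the first call.
let attempts = 0;
async function loadConfig(): Promise<{ host: string }> {
  attempts += 1;
  if (attempts === 1) throw new Error('transient failure');
  return { host: 'db.internal.example' };
}

const getConfig = memoizeWithRetry(loadConfig);

async function demo(): Promise<string> {
  try {
    await getConfig(); // first call fails
  } catch {
    // a real handler would return a 5xx response here
  }
  const config = await getConfig(); // second call retries and succeeds
  await getConfig(); // third call reuses the cached value, no new attempt
  return `${config.host} after ${attempts} attempts`;
}
```

In a handler, `await getConfig()` would replace `await configPromise`. The successful result is still fetched only once per execution environment.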

Cold Starts: Real Numbers

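Cold starts are the most debated serverless topic. The figures below are ballpark ranges: actual numbers vary with package size, memory setting, and region, so measure your own functions before making decisions.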

Cold Start Duration by Runtime

| Runtime | Cold Start (p50) | Cold Start (p99) | Notes |
| --- | --- | --- | --- |
| Node.js 20 | 150-300ms | 500-800ms | Fastest for most workloads |
| Python 3.12 | 200-400ms | 600-1000ms | Slightly slower than Node |
| Go | 80-150ms | 200-400ms | Compiled binary, fastest overall |
| Java 21 (plain) | 3-8s | 8-12s | JVM startup is brutal |
| Java 21 (SnapStart) | 200-400ms | 500-800ms | Snapshots JVM state pre-init |
| Rust | 50-100ms | 150-300ms | Compiled, no runtime overhead |
| .NET 8 | 400-800ms | 1-2s | AOT compilation helps |
| .NET 8 (AOT) | 150-300ms | 400-600ms | Ahead-of-time compilation |

What Affects Cold Start Duration

| Factor | Impact | Mitigation |
| --- | --- | --- |
| Package size | +100ms per 10MB | Tree-shake, use layers, exclude dev deps |
| Dependencies loaded | +50-200ms per heavy dependency | Lazy-load, use lighter alternatives |
| VPC attachment | +200-500ms (used to be 10s+) | Now uses Hyperplane ENI, much faster |
| Runtime | 50ms (Rust) to 8s (Java) | Choose runtime wisely for latency-sensitive functions |
| Memory allocated | More memory = proportionally more CPU = faster init | 512MB-1024MB is sweet spot for most |
| Init code complexity | DB connections, SDK init, config fetching | Move to init phase, cache in global scope |

Provisioned Concurrency

For latency-sensitive workloads, provisioned concurrency keeps instances warm.

yaml
# serverless.yml — provisioned concurrency config
functions:
  api:
    handler: src/handler.main
    runtime: nodejs20.x
    memorySize: 1024
    timeout: 30
    provisionedConcurrency: 10  # Always 10 warm instances

    events:
      - http:
          path: /orders/{id}
          method: get

# Auto-scaling provisioned concurrency
resources:
  Resources:
    ApiProvisionedConcurrencyTarget:
      Type: AWS::ApplicationAutoScaling::ScalableTarget
      Properties:
        MinCapacity: 5
        MaxCapacity: 50
        # The alias suffix must match the alias that carries the provisioned concurrency
        ResourceId: !Sub "function:${self:service}-${sls:stage}-api:current"
        ScalableDimension: lambda:function:ProvisionedConcurrency
        ServiceNamespace: lambda

    ApiProvisionedConcurrencyScaling:
      Type: AWS::ApplicationAutoScaling::ScalingPolicy
      Properties:
        PolicyName: utilization
        PolicyType: TargetTrackingScaling
        ScalingTargetId: !Ref ApiProvisionedConcurrencyTarget
        TargetTrackingScalingPolicyConfiguration:
          TargetValue: 0.7  # Scale when 70% of provisioned is in use
          PredefinedMetricSpecification:
            PredefinedMetricType: LambdaProvisionedConcurrencyUtilization

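Cost of provisioned concurrency (the charge for keeping instances warm only, at $0.0000041667 per GB-second over a 730-hour month; invocation and duration charges are extra):

| Setting | Monthly Cost (us-east-1) | Explanation |
| --- | --- | --- |
| 10 provisioned, 128MB | ~$14/month | Baseline for always-warm |
| 10 provisioned, 1024MB | ~$110/month | Higher memory = higher cost |
| 50 provisioned, 512MB | ~$274/month | Approaching EC2 cost territory |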

Event Sources

Lambda is triggered by events from dozens of AWS services. The event source determines the invocation pattern.

| Pattern | Retry Behavior | Error Handling |
| --- | --- | --- |
| Synchronous | Client retries | Return error to caller |
| Asynchronous | 2 automatic retries, then DLQ | Configure DLQ or on-failure destination |
| Stream/Queue | Retries until record expires or succeeds | Configure bisect on batch failure, max retries |
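
For the synchronous pattern the caller owns the retry logic. The following is a minimal sketch of client-side retries with exponential backoff and full jitter. `withRetries` and `backoffDelayMs` are illustrative names, the defaults (100 ms base, 5 s cap, 3 attempts) are assumptions, and retrying is only safe when the request is idempotent.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped,
// with full jitter (a random value between 0 and the computed ceiling).
function backoffDelayMs(
  attempt: number,
  baseMs = 100,
  capMs = 5000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Calls `call` up to `maxAttempts` times, sleeping between failed attempts.
async function withRetries<T>(
  call: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      // Do not sleep after the final attempt; just surface the error.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

A caller would wrap its HTTP request, for example `withRetries(() => fetchOrder(id))`, where `fetchOrder` is a hypothetical function that rejects on a retryable error.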

Event Source Examples

typescript
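// Event types come from @types/aws-lambda; getOrder, processFile, processMessage,
// indexInElasticsearch and updateElasticsearch are application helpers defined elsewhere.
import type {
  APIGatewayProxyEventV2,
  DynamoDBStreamEvent,
  S3Event,
  SQSEvent,
} from 'aws-lambda';

// API Gateway trigger: synchronous
export async function httpHandler(event: APIGatewayProxyEventV2) {
  const orderId = event.pathParameters?.id;
  if (!orderId) {
    return { statusCode: 400, body: JSON.stringify({ message: 'Missing order id' }) };
  }
  const order = await getOrder(orderId);

  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(order),
  };
}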

// S3 trigger — asynchronous
export async function s3Handler(event: S3Event) {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
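    // Object keys in S3 event notifications are URL-encoded (spaces arrive as "+")
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));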

    // Process uploaded file (resize image, parse CSV, etc.)
    await processFile(bucket, key);
  }
}

// SQS trigger — queue polling
export async function sqsHandler(event: SQSEvent) {
  const failedIds: string[] = [];

  for (const record of event.Records) {
    try {
      const body = JSON.parse(record.body);
      await processMessage(body);
    } catch (error) {
      failedIds.push(record.messageId);
    }
  }

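  // Partial batch failure reporting: only honored when the event source mapping
  // has ReportBatchItemFailures enabled in FunctionResponseTypes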
  return {
    batchItemFailures: failedIds.map(id => ({
      itemIdentifier: id,
    })),
  };
}

// DynamoDB Streams trigger — change data capture
export async function dynamoStreamHandler(event: DynamoDBStreamEvent) {
  for (const record of event.Records) {
    if (record.eventName === 'INSERT') {
      const newItem = record.dynamodb?.NewImage;
      await indexInElasticsearch(newItem);
    } else if (record.eventName === 'MODIFY') {
      const oldItem = record.dynamodb?.OldImage;
      const newItem = record.dynamodb?.NewImage;
      await updateElasticsearch(oldItem, newItem);
    }
  }
}

Cost Model

Lambda Pricing (us-east-1, 2026)

| Component | Price | Notes |
| --- | --- | --- |
| Invocations | $0.20 per 1M requests | First 1M free/month |
| Duration | $0.0000166667 per GB-second | Per 1ms granularity |
| Provisioned concurrency | $0.0000041667 per GB-second | For always-warm instances |
| Free tier | 1M invocations + 400,000 GB-seconds/month | Enough for small apps |

Cost Comparison: Lambda vs EC2 vs Fargate

For a REST API handling 10M requests/month, average 200ms execution at 512MB:

Lambda:
  Invocations:  10M × $0.20/1M         = $2.00
  Duration:     10M × 0.2s × 0.5GB × $0.0000166667 = $16.67
  Total:                                  $18.67/month

EC2 (t3.medium, reserved 1yr):
  Instance:     $30.37/month (reserved)
  Always running, even at 3am with zero traffic
  Total:                                  $30.37/month

Fargate (0.25 vCPU, 0.5GB, 2 tasks):
  CPU:          2 × 0.25 × $0.04048/hr × 730hrs = $14.78
  Memory:       2 × 0.5 × $0.004445/hr × 730hrs = $3.24
  Total:                                  $18.02/month
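
The arithmetic above is easy to parameterize so you can plug in your own traffic and sizing. This is a rough sketch using the prices quoted in this section as inputs. The function names are illustrative, and it deliberately ignores the free tier, API Gateway or load balancer charges, and data transfer.

```typescript
// Prices copied from this section (us-east-1); treat them as inputs, not constants.
const LAMBDA_PER_MILLION_REQUESTS = 0.2;
const LAMBDA_PER_GB_SECOND = 0.0000166667;
const FARGATE_PER_VCPU_HOUR = 0.04048;
const FARGATE_PER_GB_HOUR = 0.004445;
const HOURS_PER_MONTH = 730;

// Request charge plus duration charge (GB-seconds) for one month.
function lambdaMonthlyCost(requests: number, avgDurationSec: number, memoryGb: number): number {
  const requestCost = (requests / 1e6) * LAMBDA_PER_MILLION_REQUESTS;
  const durationCost = requests * avgDurationSec * memoryGb * LAMBDA_PER_GB_SECOND;
  return requestCost + durationCost;
}

// Always-on tasks billed per vCPU-hour and per GB-hour.
function fargateMonthlyCost(tasks: number, vcpu: number, memoryGb: number): number {
  const hourly = vcpu * FARGATE_PER_VCPU_HOUR + memoryGb * FARGATE_PER_GB_HOUR;
  return tasks * hourly * HOURS_PER_MONTH;
}

const lambdaCost = lambdaMonthlyCost(10e6, 0.2, 0.5);
const fargateCost = fargateMonthlyCost(2, 0.25, 0.5);
console.log(`Lambda:  $${lambdaCost.toFixed(2)}/month`); // Lambda:  $18.67/month
console.log(`Fargate: $${fargateCost.toFixed(2)}/month`);
```

Doubling the average duration or the memory setting doubles the Lambda duration charge, while the Fargate figure stays flat until you need more tasks.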

The Crossover Point

Lambda is cheapest when:

  • Traffic is spiky or unpredictable
  • There are long periods of zero traffic
  • Functions execute in under 1 second
  • You value zero ops over cost optimization

Containers are cheapest when:

  • Traffic is steady and predictable
  • Functions run for minutes (batch processing)
  • You have ops capacity to manage infrastructure
  • Traffic exceeds ~100M requests/month
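
The request volume at which the two options cost the same depends on what the container setup costs and on how long each invocation runs, so the ~100M figure is best read as a rule of thumb. Here is a small sketch of the break-even calculation, using the Lambda prices quoted in this section. The $200/month container cost is a hypothetical input, not a measured figure.

```typescript
// Lambda prices quoted in this section (us-east-1).
const PER_REQUEST = 0.2 / 1e6;
const PER_GB_SECOND = 0.0000166667;

// Cost of one invocation: request charge plus duration charge.
function lambdaCostPerRequest(avgDurationSec: number, memoryGb: number): number {
  return PER_REQUEST + avgDurationSec * memoryGb * PER_GB_SECOND;
}

// Monthly request volume at which Lambda costs as much as a fixed-price alternative.
function breakEvenRequests(
  fixedMonthlyCost: number,
  avgDurationSec: number,
  memoryGb: number,
): number {
  return fixedMonthlyCost / lambdaCostPerRequest(avgDurationSec, memoryGb);
}

// Hypothetical container setup costing $200/month; 200ms at 512MB per request.
const breakEven = breakEvenRequests(200, 0.2, 0.5);
console.log(`${(breakEven / 1e6).toFixed(0)}M requests/month`);
```

Under these assumptions the break-even lands at roughly 107M requests per month. Shorter or smaller invocations push it higher, and a cheaper container setup pulls it lower.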

Limitations You Must Know

| Limitation | Value | Impact |
| --- | --- | --- |
| Execution timeout | 15 minutes max | Cannot run long-running processes |
| Memory | 128MB - 10,240MB | CPU scales proportionally with memory |
| Package size | 50MB zipped / 250MB unzipped | Large ML models need container images |
| Container image | 10GB max | Enough for most workloads |
| Ephemeral storage | 512MB - 10GB /tmp | Not persistent between invocations |
| Concurrent executions | 1000 default (can increase) | Burst limit varies by region |
| Payload size | 6MB sync / 256KB async | Large payloads need S3 |
| Environment variables | 4KB total | Use SSM Parameter Store for more |
| Connections | No connection pool shared across execution environments | Each concurrent instance holds its own DB connection; pool via RDS Proxy |

Connection Pooling Problem and Solution

typescript
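// Two common ways around the connection problem:
//   1. RDS Proxy: keep a normal database driver and point it at the proxy endpoint,
//      which pools connections on behalf of all Lambda instances.
//   2. RDS Data API (shown here, Aurora only): HTTP-based, no connection management.
import { RDSDataClient, ExecuteStatementCommand } from '@aws-sdk/client-rds-data';

const rdsData = new RDSDataClient({ region: 'us-east-1' });

export async function handler(event: { orderId: string }) {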
  const result = await rdsData.send(new ExecuteStatementCommand({
    resourceArn: process.env.DB_CLUSTER_ARN!,
    secretArn: process.env.DB_SECRET_ARN!,
    database: 'mydb',
    sql: 'SELECT * FROM orders WHERE id = :id',
    parameters: [{ name: 'id', value: { stringValue: event.orderId } }],
  }));

  return result.records;
}
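
The size of the problem can be estimated before it bites. Each concurrent execution environment typically holds its own connection, so the number of connections tracks concurrency, which Little's law approximates as request rate times average duration. Below is a back-of-the-envelope sketch; the traffic figures and the 500-connection limit are hypothetical.

```typescript
// Little's law: average concurrency = arrival rate × average time in the system.
function estimateConcurrency(requestsPerSecond: number, avgDurationMs: number): number {
  return Math.ceil((requestsPerSecond * avgDurationMs) / 1000);
}

// Without a pooling layer, concurrency approximates the number of open
// database connections, so compare it against the database's limit.
function exceedsConnectionLimit(
  requestsPerSecond: number,
  avgDurationMs: number,
  maxDbConnections: number,
): boolean {
  return estimateConcurrency(requestsPerSecond, avgDurationMs) > maxDbConnections;
}

console.log(estimateConcurrency(5000, 200)); // 1000
console.log(exceedsConnectionLimit(5000, 200, 500)); // true
```

When the estimate approaches the database's connection limit, a pooling layer such as RDS Proxy or an HTTP-based API becomes necessary rather than optional.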

Step Functions: Orchestrating Serverless Workflows

For workflows that exceed a single Lambda's 15-minute limit or require complex branching, AWS Step Functions orchestrates multiple Lambda functions as a state machine.

json
{
  "Comment": "Order processing workflow",
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456:function:validate-order",
      "Next": "CheckInventory",
      "Catch": [{
        "ErrorEquals": ["ValidationError"],
        "Next": "RejectOrder"
      }]
    },
    "CheckInventory": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456:function:check-inventory",
      "Next": "ProcessPayment",
      "Catch": [{
        "ErrorEquals": ["OutOfStockError"],
        "Next": "BackorderNotify"
      }]
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456:function:process-payment",
      "Retry": [{
        "ErrorEquals": ["PaymentRetryableError"],
        "IntervalSeconds": 5,
        "MaxAttempts": 3,
        "BackoffRate": 2.0
      }],
      "Catch": [{
        "ErrorEquals": ["States.ALL"],
        "Next": "CancelOrder"
      }],
      "Next": "FulfillOrder"
    },
    "FulfillOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456:function:fulfill-order",
      "Next": "SendConfirmation"
    },
    "SendConfirmation": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456:function:send-confirmation",
      "End": true
    },
    "RejectOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456:function:reject-order",
      "End": true
    },
    "BackorderNotify": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456:function:backorder-notify",
      "End": true
    },
    "CancelOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456:function:cancel-order",
      "Next": "RefundPayment"
    },
    "RefundPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456:function:refund-payment",
      "End": true
    }
  }
}
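
The `Retry` block on `ProcessPayment` is worth reading closely: `IntervalSeconds` is the wait before the first retry, each later wait is multiplied by `BackoffRate`, and `MaxAttempts` counts retries rather than total attempts. The sketch below computes the resulting schedule. `retryWaits` is an illustrative helper, and it ignores the optional `MaxDelaySeconds` and `JitterStrategy` fields.

```typescript
interface RetryPolicy {
  intervalSeconds: number;
  maxAttempts: number;
  backoffRate: number;
}

// Wait before retry i (0-based) is intervalSeconds × backoffRate^i.
function retryWaits(policy: RetryPolicy): number[] {
  const waits: number[] = [];
  for (let i = 0; i < policy.maxAttempts; i++) {
    waits.push(policy.intervalSeconds * Math.pow(policy.backoffRate, i));
  }
  return waits;
}

const waits = retryWaits({ intervalSeconds: 5, maxAttempts: 3, backoffRate: 2.0 });
console.log(waits); // [ 5, 10, 20 ]
```

With the values above, a payment that keeps failing with `PaymentRetryableError` is retried after 5, 10, and 20 seconds before the `Catch` clause routes the workflow to `CancelOrder`.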

When Serverless Beats Containers

| Scenario | Serverless | Containers | Winner |
| --- | --- | --- | --- |
| Spiky traffic (0 to 10K req/s) | Auto-scales instantly, $0 at idle | Need min capacity running | Serverless |
| Steady high traffic (100K req/s) | Expensive at scale | Fixed cost, efficient | Containers |
| Event processing (S3, SQS) | Native integration | Need polling infrastructure | Serverless |
| Long-running jobs (> 15 min) | Not possible | No time limit | Containers |
| Startup/MVP | Zero ops, fast iteration | Need K8s/ECS expertise | Serverless |
| WebSocket connections | Limited support | Full control | Containers |
| ML inference (GPU) | No GPU support | GPU instances available | Containers |
| Cron jobs (run 5 min/day) | Pay for 5 min/day | Pay for 24 hours/day | Serverless |

Vendor Lock-In Mitigation

The Lock-In Spectrum

| Component | Lock-In Level | Mitigation |
| --- | --- | --- |
| Function handler interface | Low | Adapter pattern per cloud |
| Event sources (S3, SQS) | High | Abstract behind interfaces |
| Step Functions | Very High | Use Temporal/Conductor instead |
| DynamoDB in function | High | Use abstractions, consider portable DBs |
| IAM/permissions model | Very High | Accept it: security is always cloud-specific |

Portable Function Pattern

typescript
// Core business logic — zero cloud dependencies
export class OrderProcessor {
  constructor(
    private orderRepo: OrderRepository,
    private paymentGateway: PaymentGateway,
    private notifier: Notifier,
  ) {}

  async processOrder(input: ProcessOrderInput): Promise<ProcessOrderResult> {
    const order = await this.orderRepo.findById(input.orderId);
    const payment = await this.paymentGateway.charge(order.total);
    await this.notifier.sendConfirmation(order.userId, order.id);
    return { orderId: order.id, paymentId: payment.id };
  }
}

// AWS Lambda adapter (lives in its own module; it declares its own `processor` and `handler`)
import { APIGatewayProxyHandlerV2 } from 'aws-lambda';
const processor = new OrderProcessor(
  new DynamoDBOrderRepository(),
  new StripePaymentGateway(),
  new SESNotifier(),
);

export const handler: APIGatewayProxyHandlerV2 = async (event) => {
  const input = JSON.parse(event.body!);
  const result = await processor.processOrder(input);
  return { statusCode: 200, body: JSON.stringify(result) };
};

// Google Cloud Functions adapter (a separate module from the AWS adapter)
import { HttpFunction } from '@google-cloud/functions-framework';
const processor = new OrderProcessor(
  new FirestoreOrderRepository(),
  new StripePaymentGateway(),
  new SendGridNotifier(),
);

export const handler: HttpFunction = async (req, res) => {
  const result = await processor.processOrder(req.body);
  res.json(result);
};
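
A side benefit of this separation is that the core logic can be exercised with no cloud SDK at all. The sketch below wires `OrderProcessor` to in-memory adapters. The interface shapes (`Order`, `OrderRepository`, `PaymentGateway`, `Notifier`) are assumptions inferred from how the class uses them, and the fake values are purely illustrative.

```typescript
// Port interfaces: shapes assumed from how OrderProcessor uses them.
interface Order { id: string; userId: string; total: number }
interface OrderRepository { findById(id: string): Promise<Order> }
interface PaymentGateway { charge(amount: number): Promise<{ id: string }> }
interface Notifier { sendConfirmation(userId: string, orderId: string): Promise<void> }

// Same core logic as above, repeated so the example is self-contained.
class OrderProcessor {
  constructor(
    private orderRepo: OrderRepository,
    private paymentGateway: PaymentGateway,
    private notifier: Notifier,
  ) {}

  async processOrder(input: { orderId: string }): Promise<{ orderId: string; paymentId: string }> {
    const order = await this.orderRepo.findById(input.orderId);
    const payment = await this.paymentGateway.charge(order.total);
    await this.notifier.sendConfirmation(order.userId, order.id);
    return { orderId: order.id, paymentId: payment.id };
  }
}

// In-memory adapters: no DynamoDB, Stripe, or SES involved.
const sentConfirmations: string[] = [];
const testProcessor = new OrderProcessor(
  { findById: async (id) => ({ id, userId: 'user-1', total: 42 }) },
  { charge: async (amount) => ({ id: `pay-${amount}` }) },
  {
    sendConfirmation: async (userId, orderId) => {
      sentConfirmations.push(`${userId}:${orderId}`);
    },
  },
);

async function runDemo(): Promise<{ orderId: string; paymentId: string }> {
  return testProcessor.processOrder({ orderId: 'order-7' });
}
```

The same fakes work unchanged whichever cloud adapter ends up in production, which is the practical payoff of keeping the handler thin.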

Serverless Architecture Patterns

Pattern 1: API Backend

Pattern 2: Event Processing Pipeline

Pattern 3: Scheduled Jobs

yaml
functions:
  dailyReport:
    handler: src/reports/daily.handler
    timeout: 900  # 15 minutes
    memorySize: 2048
    events:
      - schedule:
          rate: cron(0 6 * * ? *)  # 6am UTC daily
          input:
            reportType: "daily-summary"

  cleanupExpired:
    handler: src/maintenance/cleanup.handler
    timeout: 300
    events:
      - schedule:
          rate: rate(1 hour)

Key Takeaways

  1. Cold starts matter — choose runtime wisely (Go/Rust < Node.js < Python < Java), use provisioned concurrency for latency-sensitive paths
  2. Cost model favors spiky workloads — at steady high traffic, containers are cheaper
  3. 15-minute limit is real — use Step Functions for longer workflows
  4. Connection pooling is critical — use RDS Proxy or DynamoDB, never open new DB connections per invocation
  5. Put initialization outside the handler — SDK clients, config fetching, DB connections go in module scope
  6. Serverless shines for event-driven architectures — native integration with S3, SQS, DynamoDB Streams, EventBridge
  7. Vendor lock-in is manageable — isolate business logic from cloud-specific adapters

Real-World Examples

BBC Online

BBC Online serves over 60 million weekly users using a serverless architecture on AWS Lambda. Their content delivery pipeline processes article metadata, generates page variants, and serves personalized content — all without managing a single server. Lambda handles their massive traffic spikes during breaking news events (10-50x normal traffic in minutes), scaling automatically from idle to thousands of concurrent executions.

Coca-Cola

Coca-Cola migrated their vending machine backend to AWS Lambda + API Gateway. Their 1 million+ smart vending machines send telemetry data sporadically — Lambda's pay-per-use model means they pay only for the milliseconds of processing each event requires, rather than maintaining servers 24/7. This reduced their infrastructure costs by over 65% compared to EC2.

iRobot

iRobot uses AWS Step Functions to orchestrate the entire lifecycle of Roomba robot commands. When a user presses "Clean" in the app, a Step Function workflow validates the command, sends it to the robot via IoT Core, monitors execution progress, and handles failures with automatic retries. The visual state machine makes the complex workflow auditable and debuggable without custom orchestration code.

Interview Tip

What to say

"Serverless is ideal for event-driven, spiky workloads where you'd otherwise pay for idle capacity. I'd use Lambda for API endpoints under 50M requests/month, S3-triggered file processing, and cron jobs that run minutes per day. The key constraints are: 15-minute execution limit, cold starts (150-300ms for Node.js, 3-8s for Java without SnapStart), and the connection pooling problem (1000 Lambda instances opening 1000 database connections). I'd mitigate cold starts with provisioned concurrency for latency-sensitive paths, use RDS Proxy for connection pooling, and isolate business logic from the Lambda handler for portability. Beyond 100M requests/month with steady traffic, containers become cheaper — the crossover point matters."
