AWS Lambda Deep Dive

AWS Lambda is the canonical serverless compute service — upload a function, and AWS handles provisioning, scaling, patching, and availability. But the devil is in the details. Cold starts, memory tuning, VPC latency, payload limits, timeout ceilings, and concurrency throttling are all production realities that catch teams off guard.

This guide goes from the execution model internals through production-hardened patterns. You will understand not just how to deploy Lambda functions, but how the runtime works under the hood and where it breaks.


1. Why Lambda Exists: The Problem It Solves

Before Lambda (launched November 2014), running a short-lived task on AWS required:

  1. Provisioning an EC2 instance (or ECS task)
  2. Installing a runtime
  3. Writing the request handler
  4. Managing auto-scaling policies
  5. Patching the OS
  6. Paying for idle time

For event-driven workloads — an image upload triggers a resize, an API Gateway request runs business logic, a DynamoDB stream triggers downstream processing — the operational overhead of managing servers dominated the actual business logic.

Lambda's promise: pay only for the compute time consumed, and never manage a server. The pricing model (per-invocation + per-millisecond of execution) made sub-second tasks economically viable in ways that EC2 or ECS never could.

Historical Context

  • 2014: Lambda launches with Node.js support only, 60s max timeout
  • 2015: Python, Java support; API Gateway integration
  • 2016: C#, VPC support, dead letter queues, Step Functions
  • 2018: Go, Layers, custom runtimes, ALB integration
  • 2019: Provisioned concurrency
  • 2020: EFS support, container image support (up to 10GB), 10GB memory
  • 2022: Lambda SnapStart (Java), function URLs
  • 2023: Response streaming, recursive loop detection (SQS/SNS), advanced logging controls
  • 2024: Recursive loop detection extended to S3, SnapStart for Python/.NET

2. First Principles: The Execution Model

The Firecracker MicroVM

Lambda functions run inside Firecracker micro-VMs — lightweight virtual machines that boot in ~125ms. Each execution environment is a dedicated Firecracker instance with:

  • A Linux kernel (Amazon Linux 2 or AL2023)
  • The language runtime (Node.js, Python, Java, etc.)
  • Your function code and dependencies
  • A writable /tmp directory (512 MB by default, configurable up to 10GB, ephemeral)

Invocation Lifecycle

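Every execution environment moves through three phases. INIT runs once per environment, INVOKE runs once per request, and SHUTDOWN runs when Lambda retires the environment.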

The Three Phases

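| Phase | What Happens | Duration | Billed? |
|---|---|---|---|
| INIT | Download code, start runtime and extensions, run top-level code | 100ms - 10s | Yes (for all function configurations since August 2025; before that, not for ZIP-packaged managed runtimes running on demand) |
| INVOKE | Execute the handler function | Your code's runtime | Yes |
| SHUTDOWN | Runtime and extension shutdown hooks fire, then the environment is removed | Up to 2s | No |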

Execution Environment Reuse

After an invocation completes, the execution environment is frozen — not destroyed. If another invocation arrives within ~5-15 minutes (exact time varies, not guaranteed), Lambda thaws the existing environment rather than creating a new one. This is why:

  • Database connections persist between invocations
  • Global/module-level variables retain their values
  • /tmp contents survive across warm invocations
  • Memory leaks accumulate across warm invocations

WARNING

Never rely on execution environment reuse for correctness. Treat each invocation as potentially running in a fresh sandbox. Use reuse only for performance optimization (connection pooling, caching).
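
A minimal sketch of treating reuse as an optimization only. State is cached at module scope, but the handler returns a correct result whether or not the cache is populated. `AppConfig`, `loadConfig`, and the simplified synchronous `handler` are hypothetical names for illustration, not part of any AWS API.

```typescript
// Hypothetical config shape and loader, used only for illustration.
interface AppConfig {
  tableName: string;
  loadedAt: number;
}

let loadCount = 0;

function loadConfig(): AppConfig {
  // Stands in for an expensive call such as fetching parameters or opening a connection.
  loadCount += 1;
  return { tableName: 'orders', loadedAt: Date.now() };
}

// Module scope survives between warm invocations, but may be empty at any time.
let cachedConfig: AppConfig | null = null;

function handler(event: { orderId: string }): { orderId: string; table: string; coldPath: boolean } {
  const coldPath = cachedConfig === null;
  // Populate the cache only when this environment has not done so yet.
  const config = cachedConfig ?? (cachedConfig = loadConfig());
  // Correctness never depends on the cache: the response is built per invocation.
  return { orderId: event.orderId, table: config.tableName, coldPath };
}

// Two invocations in the same environment: the second reuses the cached config.
const first = handler({ orderId: 'a-1' });
const second = handler({ orderId: 'a-2' });
```

If the environment is recycled between the two calls, the second call simply takes the cold path again and still returns a valid response.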


3. Cold Starts: The Core Challenge

Anatomy of a Cold Start

A cold start occurs when Lambda must create a new execution environment. The total cold start latency is:

$$T_{\text{cold}} = T_{\text{microvm}} + T_{\text{download}} + T_{\text{runtime}} + T_{\text{init}}$$

Where:

  • $T_{\text{microvm}}$: Firecracker VM creation (~50-100ms, internal to AWS)
  • $T_{\text{download}}$: Code and layers download (proportional to package size)
  • $T_{\text{runtime}}$: Language runtime startup
  • $T_{\text{init}}$: Your initialization code (top-level / constructor)

Cold Start Benchmarks by Runtime

| Runtime | Median Cold Start | P99 Cold Start | Package Size Impact |
|---|---|---|---|
| Python 3.12 | 150-300ms | 500-800ms | Low |
| Node.js 20 | 150-250ms | 400-700ms | Low |
| Go (AL2023) | 80-150ms | 200-400ms | Minimal (compiled) |
| Rust (AL2023) | 50-120ms | 150-300ms | Minimal (compiled) |
| Java 21 | 800-3000ms | 3-8s | Very High |
| Java 21 + SnapStart | 150-300ms | 400-800ms | Low (after snapshot) |
| .NET 8 | 400-800ms | 1-3s | Moderate |
| .NET 8 (NativeAOT) | 100-200ms | 300-500ms | Low |

The Math of Cold Start Frequency

For a function with steady traffic at λ requests/second and N concurrent execution environments, the probability of a cold start for any given request is approximately:

$$P_{\text{cold}} \approx \frac{1}{N} \cdot \frac{1}{\lambda \, T_{\text{warm}}}$$

Where $T_{\text{warm}}$ is the average time an environment stays warm (typically 5-15 minutes). In practice, cold starts are most frequent for:

  1. Low-traffic functions (minutes between invocations)
  2. Bursty functions (sudden spike from 0 to 1000 concurrent)
  3. After deployments (all existing environments are replaced)

Minimizing Cold Starts

typescript
// BAD: Heavy imports at top level that you might not need
import { S3Client, PutObjectCommand, GetObjectCommand,
         ListObjectsV2Command, DeleteObjectCommand } from '@aws-sdk/client-s3';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, PutCommand, GetCommand,
         QueryCommand, ScanCommand } from '@aws-sdk/lib-dynamodb';
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs';
import sharp from 'sharp';
import Joi from 'joi';

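// GOOD: Import only the clients this function uses, and create each one lazily on first use
import { S3Client } from '@aws-sdk/client-s3';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb';
import { NodeHttpHandler } from '@smithy/node-http-handler';

const getS3Client = (() => {
  let client: S3Client | null = null;
  return () => {
    if (!client) {
      client = new S3Client({
        region: process.env.AWS_REGION,
        // SDK v3 reuses TCP connections (keep-alive) by default; set timeouts so slow sockets fail fast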
        requestHandler: new NodeHttpHandler({
          connectionTimeout: 3000,
          socketTimeout: 3000,
        }),
      });
    }
    return client;
  };
})();

const getDynamoClient = (() => {
  let client: DynamoDBDocumentClient | null = null;
  return () => {
    if (!client) {
      const ddb = new DynamoDBClient({ region: process.env.AWS_REGION });
      client = DynamoDBDocumentClient.from(ddb, {
        marshallOptions: { removeUndefinedValues: true },
      });
    }
    return client;
  };
})();

TIP

For Node.js, use @aws-sdk/client-* (v3) instead of aws-sdk (v2). V3 is tree-shakeable — import only the commands you need, which reduces bundle size and cold start time significantly.
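
The two getter functions above repeat the same closure pattern. One way to factor it out is a small generic helper; this is a sketch, and `lazy` and `FakeClient` are hypothetical names (the fake class stands in for an SDK client so the example runs without any dependency).

```typescript
// Generic lazy initializer: runs the factory once, on first use, then reuses the result.
function lazy<T>(factory: () => T): () => T {
  let cached: { value: T } | null = null;
  return () => {
    if (cached === null) {
      cached = { value: factory() };
    }
    return cached.value;
  };
}

// Stand-in for an expensive client; counts how many times it is constructed.
let constructed = 0;
class FakeClient {
  region: string;
  constructor(region: string) {
    this.region = region;
    constructed += 1;
  }
}

const getClient = lazy(() => new FakeClient('us-east-1'));

// Nothing is constructed until the first call; later calls return the same instance.
const a = getClient();
const b = getClient();
```

Wrapping the value in an object lets the helper cache results that are themselves `null` or `undefined` without re-running the factory.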


4. Lambda Layers

What Layers Are

A Lambda Layer is a ZIP archive of code/data (up to 50MB compressed, 250MB uncompressed) that is extracted into the /opt directory of the execution environment. Layers serve several purposes:

  1. Shared dependencies — common libraries across multiple functions
  2. Custom runtimes — provide a runtime not natively supported
  3. Separation of concerns — keep function code small, dependencies in layers
  4. Independent versioning — update dependencies without redeploying function code

Layer Structure by Runtime

| Runtime | Layer Path | Include Path |
|---|---|---|
| Node.js | /opt/nodejs/node_modules | Auto-included |
| Python | /opt/python/lib/python3.x/site-packages | Auto-included |
| Java | /opt/java/lib | Classpath |
| Go / Rust | /opt/lib or /opt/bin | LD_LIBRARY_PATH |

Creating a Production Layer

typescript
// scripts/build-layer.ts — Build a shared utilities layer
import { execSync } from 'child_process';
import { mkdirSync, rmSync, writeFileSync } from 'fs';
import { join } from 'path';

const LAYER_DIR = join(__dirname, '../.layer');
const NODEJS_DIR = join(LAYER_DIR, 'nodejs');

// Clean previous build
rmSync(LAYER_DIR, { recursive: true, force: true });
mkdirSync(join(NODEJS_DIR, 'node_modules'), { recursive: true });

// Install only production dependencies
writeFileSync(
  join(NODEJS_DIR, 'package.json'),
  JSON.stringify({
    name: 'shared-layer',
    version: '1.0.0',
    dependencies: {
      '@aws-sdk/client-s3': '^3.500.0',
      '@aws-sdk/client-dynamodb': '^3.500.0',
      '@aws-sdk/lib-dynamodb': '^3.500.0',
      'zod': '^3.22.0',
      'pino': '^8.17.0',
    },
  })
);

execSync('npm install --omit=dev --prefix ' + NODEJS_DIR, { stdio: 'inherit' });

// Package the layer
execSync(`cd ${LAYER_DIR} && zip -r ../shared-layer.zip .`, { stdio: 'inherit' });

console.log('Layer built successfully.');

Layer Limits

| Limit | Value |
|---|---|
| Max layers per function | 5 |
| Max uncompressed layer size | 250 MB (total for all layers + function code) |
| Max compressed layer size | 50 MB per layer |
| Max layer versions | Unlimited |
| Layer download time | Proportional to size (impacts cold start) |

DANGER

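A function pins a specific layer version, and each execution environment keeps the layer contents it was created with. Publishing a new layer version changes nothing until you update the function configuration to reference it. After that update, existing warm environments keep serving with the old version until they are recycled, so a deployment can have mixed versions running simultaneously for minutes.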


5. VPC Integration

The Original Problem

When Lambda launched VPC support (2016), putting a function in a VPC added 10-15 seconds to cold starts because Lambda had to create an Elastic Network Interface (ENI) in your VPC for each execution environment.

The 2019 Fix: Hyperplane ENI

AWS re-architected VPC networking for Lambda using Hyperplane — the same technology behind NAT Gateway, Network Load Balancer, and EFS. Instead of creating one ENI per execution environment:

  1. Lambda creates a shared Hyperplane ENI for each unique subnet and security group combination
  2. All execution environments in that subnet share the ENI via NAT
  3. ENI creation happens at function create/update time, not at invocation time
  4. Cold start penalty dropped from 10-15s to <1s additional

When to Use VPC

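| Scenario | VPC Required? | Why |
|---|---|---|
| Access RDS/Aurora | Yes | RDS is VPC-only |
| Access ElastiCache | Yes | ElastiCache is VPC-only |
| Access internal services | Yes | Private subnet resources |
| Access DynamoDB | No | Public endpoint; add a VPC Gateway Endpoint if the function already runs in a VPC |
| Access S3 | No | Public endpoint; add a VPC Gateway Endpoint if the function already runs in a VPC |
| Access SQS/SNS | No | Public endpoints (or Interface Endpoint) |
| Access third-party APIs | No | Unless security policy requires |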

VPC Configuration in Terraform

hcl
resource "aws_lambda_function" "api" {
  function_name = "api-handler"
  role          = aws_iam_role.lambda_exec.arn
  handler       = "index.handler"
  runtime       = "nodejs20.x"
  timeout       = 30
  memory_size   = 512

  vpc_config {
    subnet_ids         = var.private_subnet_ids
    security_group_ids = [aws_security_group.lambda_sg.id]
  }

  environment {
    variables = {
      DB_HOST     = var.rds_endpoint
      REDIS_HOST  = var.elasticache_endpoint
    }
  }
}

resource "aws_security_group" "lambda_sg" {
  name_prefix = "lambda-api-"
  vpc_id      = var.vpc_id

  # Outbound: allow all (Lambda needs to reach AWS services)
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # No inbound rules needed — Lambda is invoked by AWS, not by inbound traffic
  tags = {
    Name = "lambda-api-sg"
  }
}

# Allow Lambda to access RDS
resource "aws_security_group_rule" "rds_from_lambda" {
  type                     = "ingress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = var.rds_security_group_id
  source_security_group_id = aws_security_group.lambda_sg.id
}

WARNING

VPC Lambda functions need a NAT Gateway to access the internet (including AWS service endpoints not available via VPC Endpoints). NAT Gateway costs ~$32/month + $0.045/GB processed. For high-throughput Lambda functions, this cost can be significant.


6. Provisioned Concurrency

The Problem It Solves

For latency-sensitive workloads (APIs, real-time processing), cold starts are unacceptable. Provisioned Concurrency keeps a specified number of execution environments pre-initialized and warm at all times.

How It Works

When you configure provisioned concurrency on a published version or alias, Lambda runs the INIT phase for that many execution environments ahead of time and keeps them initialized. Requests are served from those environments first; traffic beyond the configured number spills over to on-demand environments, which can still cold start.

Provisioned Concurrency Cost Model

The cost has two components:

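$$C_{\text{provisioned}} = C_{\text{provision}} + C_{\text{execution}}$$

Where:

  • $C_{\text{provision}} = N_{\text{provisioned}} \times M_{\text{GB}} \times T_{\text{seconds}} \times \$0.000004646$ per GB-second, where $M_{\text{GB}}$ is the configured memory in GB
  • $C_{\text{execution}}$ = per-request charges plus duration billed at a reduced per-GB-second rate

For a 512MB function with 100 provisioned concurrency running 24/7:

$$C_{\text{monthly}} = 100 \times 0.5\,\text{GB} \times 86{,}400 \times 30 \times \$0.000004646 \approx \$602.12\text{/month}$$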

Compare this to the cost of an equivalent Fargate task or EC2 instance. Provisioned concurrency makes sense when:

  1. Cold starts exceed SLA requirements
  2. Traffic is predictable enough to set the right number
  3. The function is invoked frequently enough to justify the cost
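
To compare against a Fargate or EC2 quote, the fixed component can be computed directly. The sketch below multiplies provisioned environments by memory in GB, seconds in the period, and the per-GB-second rate quoted in this section. `provisionedMonthlyCost` is a hypothetical helper, and the rate is an assumption to check against current pricing for your region.

```typescript
// Rate quoted in this section; treat it as an assumption and verify for your region.
const PROVISIONED_RATE_PER_GB_SECOND = 0.000004646;

// Fixed cost of keeping environments provisioned, excluding request and duration charges.
function provisionedMonthlyCost(
  provisionedEnvironments: number,
  memoryMb: number,
  days: number = 30,
): number {
  const memoryGb = memoryMb / 1024;
  const seconds = days * 24 * 60 * 60;
  return provisionedEnvironments * memoryGb * seconds * PROVISIONED_RATE_PER_GB_SECOND;
}

// 100 environments at 512MB, provisioned around the clock for a 30-day month.
const monthlyUsd = provisionedMonthlyCost(100, 512);

// Provisioning only during a 12-hour business day halves the fixed cost.
const businessHoursUsd = monthlyUsd * (12 / 24);
```

Because the cost is linear in every input, scheduled scaling (provisioning only during known peak hours) reduces it proportionally.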

Auto-Scaling Provisioned Concurrency

hcl
resource "aws_lambda_provisioned_concurrency_config" "api" {
  function_name                  = aws_lambda_function.api.function_name
  provisioned_concurrent_executions = 50
  qualifier                      = aws_lambda_alias.live.name
}

resource "aws_appautoscaling_target" "lambda" {
  max_capacity       = 200
  min_capacity       = 50
  resource_id        = "function:${aws_lambda_function.api.function_name}:${aws_lambda_alias.live.name}"
  scalable_dimension = "lambda:function:ProvisionedConcurrency"
  service_namespace  = "lambda"
}

resource "aws_appautoscaling_policy" "lambda" {
  name               = "lambda-provisioned-scaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.lambda.resource_id
  scalable_dimension = aws_appautoscaling_target.lambda.scalable_dimension
  service_namespace  = aws_appautoscaling_target.lambda.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "LambdaProvisionedConcurrencyUtilization"
    }
    target_value = 0.7  # Scale up when 70% of provisioned is in use
  }
}

7. Memory and CPU Tuning

The CPU-Memory Proportionality

Lambda allocates CPU proportionally to memory. At 1,769 MB, you get one full vCPU. At 10,240 MB (max), you get ~6 vCPUs.

| Memory | vCPU (approx.) | Network Bandwidth |
|---|---|---|
| 128 MB | 0.07 | Low |
| 512 MB | 0.29 | Low |
| 1,024 MB | 0.58 | Moderate |
| 1,769 MB | 1.0 | Moderate |
| 3,538 MB | 2.0 | High |
| 10,240 MB | 6.0 | Up to 10 Gbps |
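
The vCPU column is consistent with a simple linear rule anchored at 1,769 MB per vCPU. The sketch below encodes that rule; the linear interpolation between the documented points is an assumption, and `approxVcpus` is a hypothetical helper name.

```typescript
const MB_PER_VCPU = 1769; // memory setting at which a function gets one full vCPU
const MAX_VCPUS = 6;      // ceiling reached at the 10,240 MB maximum

// Approximate CPU share for a memory setting, assuming linear scaling.
function approxVcpus(memoryMb: number): number {
  return Math.min(memoryMb / MB_PER_VCPU, MAX_VCPUS);
}

const small = approxVcpus(128);   // a small fraction of one vCPU
const single = approxVcpus(1769); // exactly one vCPU
const double = approxVcpus(3538); // exactly two vCPUs
```

A practical consequence is that multi-threaded code gains nothing from extra threads until the memory setting exceeds 1,769 MB.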

The Cost Optimization Paradox

Increasing memory often reduces cost because the function runs faster. A 128MB function that takes 3 seconds costs more than a 512MB function that takes 0.5 seconds:

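$$C_{128} = \frac{128}{1024}\,\text{GB} \times 3\,\text{s} \times \$0.0000166667 = \$0.00000625$$

$$C_{512} = \frac{512}{1024}\,\text{GB} \times 0.5\,\text{s} \times \$0.0000166667 = \$0.00000417$$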

The 512MB function is 33% cheaper and 6x faster.
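
The same comparison can be scripted for any pair of measurements. This is a sketch using the per-GB-second rate from the calculation above; `durationCost` is a hypothetical helper, and it covers the duration charge only (no per-request fee).

```typescript
const PRICE_PER_GB_SECOND = 0.0000166667; // rate used in the calculation above

// Duration charge for a single invocation.
function durationCost(memoryMb: number, durationMs: number): number {
  return (memoryMb / 1024) * (durationMs / 1000) * PRICE_PER_GB_SECOND;
}

const slow = durationCost(128, 3000); // 128MB, 3 seconds
const fast = durationCost(512, 500);  // 512MB, 0.5 seconds

// Fraction saved by the larger memory setting (one third in this example).
const savings = 1 - fast / slow;
```

More memory is cheaper only while duration falls faster than memory rises; once the function stops being CPU-bound, further increases raise the cost.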

Power Tuning with AWS Lambda Power Tuning

Use the AWS Lambda Power Tuning tool to find the optimal memory setting:

json
{
  "lambdaARN": "arn:aws:lambda:us-east-1:123456789012:function:my-function",
  "powerValues": [128, 256, 512, 1024, 1536, 2048, 3008],
  "num": 50,
  "payload": "{\"test\": true}",
  "parallelInvocation": true,
  "strategy": "cost"
}

This invokes your function at each memory level 50 times and produces a cost-vs-duration chart showing the optimal configuration.


8. Concurrency Model and Throttling

Concurrency Limits

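| Limit | Default | Adjustable |
|---|---|---|
| Account concurrency (per region) | 1,000 | Yes (up to tens of thousands) |
| Reserved concurrency (per function) | None (function draws from the unreserved pool) | Yes |
| Scaling rate (per function) | 1,000 additional concurrent executions every 10 seconds (since late 2023; previously a region-dependent initial burst of 500-3,000, then 500 per minute) | No |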

Reserved vs. Unreserved Concurrency

DANGER

If a single function consumes all unreserved concurrency, every other function in the account gets throttled. Always set reserved concurrency for critical functions and use a separate AWS account for production workloads.
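
Reserved concurrency is carved out of the account limit, and Lambda keeps at least 100 concurrent executions unreserved. A minimal sketch of that bookkeeping, useful as a pre-deployment check; both function names are hypothetical.

```typescript
const MIN_UNRESERVED = 100; // Lambda requires at least this much to stay unreserved

// Concurrency left in the shared pool after all reservations.
function unreservedConcurrency(accountLimit: number, reserved: number[]): number {
  const totalReserved = reserved.reduce((sum, r) => sum + r, 0);
  return accountLimit - totalReserved;
}

// Whether a new reservation still leaves the required unreserved minimum.
function canReserve(accountLimit: number, reserved: number[], requested: number): boolean {
  return unreservedConcurrency(accountLimit, reserved) - requested >= MIN_UNRESERVED;
}

// Account limit 1,000 with two functions already reserving 200 and 300.
const pool = unreservedConcurrency(1000, [200, 300]);
const fits = canReserve(1000, [200, 300], 400);
const tooLarge = canReserve(1000, [200, 300], 401);
```

Every function without a reservation competes for `pool`, so a large reservation protects one function at the cost of shrinking the headroom for all the others.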

Throttling Behavior by Invocation Type

| Invocation Type | Throttle Behavior | Retry |
|---|---|---|
| Synchronous (API GW) | Returns 429 to caller | No auto-retry |
| Asynchronous (S3, SNS) | Retries up to 6 hours | Yes, with backoff |
| Stream (Kinesis, DynamoDB) | Retries entire batch | Yes, blocks shard |
| SQS | Returns to queue | Yes, visibility timeout |
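
Because synchronous callers get no automatic retry, the retry policy lives in the client. A common choice is exponential backoff with full jitter; the sketch below computes only the delay, and the function name, default base, and default cap are illustrative assumptions rather than AWS-recommended values.

```typescript
// Full-jitter exponential backoff: the delay is drawn uniformly from
// [0, min(capMs, baseMs * 2^attempt)).
function backoffDelayMs(
  attempt: number, // 0 for the first retry
  baseMs: number = 100,
  capMs: number = 5000,
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Only throttling responses should be retried with this policy.
function isThrottled(statusCode: number): boolean {
  return statusCode === 429;
}

// Delays for the first five retries of one throttled call.
const delays = [0, 1, 2, 3, 4].map((attempt) => backoffDelayMs(attempt));
```

The jitter matters as much as the exponent: without it, every throttled client retries at the same instant and recreates the spike that caused the throttling.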

Production Concurrency Configuration

typescript
// middleware/concurrency-protection.ts
import { CloudWatchClient, GetMetricStatisticsCommand } from '@aws-sdk/client-cloudwatch';
import { Logger } from '@aws-lambda-powertools/logger';

const logger = new Logger({ serviceName: 'api' });

interface ConcurrencyMetrics {
  activeInvocations: number;
  throttledCount: number;
  timestamp: number;
}

/**
 * Read the function's ConcurrentExecutions metric from CloudWatch and log
 * the most recent one-minute maximum, so it can feed dashboards or alarms.
 */
export async function emitConcurrencyMetrics(
  functionName: string,
  cloudwatch: CloudWatchClient,
): Promise<void> {
  const now = new Date();
  const fiveMinutesAgo = new Date(now.getTime() - 5 * 60 * 1000);

  const response = await cloudwatch.send(new GetMetricStatisticsCommand({
    Namespace: 'AWS/Lambda',
    MetricName: 'ConcurrentExecutions',
    Dimensions: [{ Name: 'FunctionName', Value: functionName }],
    StartTime: fiveMinutesAgo,
    EndTime: now,
    Period: 60,
    Statistics: ['Maximum'],
  }));

  const maxConcurrency = response.Datapoints
    ?.sort((a, b) => (b.Timestamp?.getTime() ?? 0) - (a.Timestamp?.getTime() ?? 0))
    ?.[0]?.Maximum ?? 0;

  logger.info('Concurrency metrics', {
    functionName,
    maxConcurrency,
    timestamp: now.toISOString(),
  });
}

9. Event Source Mappings and Invocation Patterns

Synchronous Invocation

The caller waits for the response. Used by API Gateway, ALB, and direct SDK calls.

typescript
// handler.ts — API Gateway Lambda handler with proper error handling
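import { APIGatewayProxyEventV2, APIGatewayProxyResultV2 } from 'aws-lambda';
// Powertools for AWS Lambda v2 exports its Middy middleware from the /middleware subpaths
import { Logger } from '@aws-lambda-powertools/logger';
import { injectLambdaContext } from '@aws-lambda-powertools/logger/middleware';
import { Tracer } from '@aws-lambda-powertools/tracer';
import { captureLambdaHandler } from '@aws-lambda-powertools/tracer/middleware';
import { Metrics, MetricUnit } from '@aws-lambda-powertools/metrics';
import { logMetrics } from '@aws-lambda-powertools/metrics/middleware';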
import middy from '@middy/core';
import httpErrorHandler from '@middy/http-error-handler';
import httpJsonBodyParser from '@middy/http-json-body-parser';

const logger = new Logger({ serviceName: 'order-api' });
const tracer = new Tracer({ serviceName: 'order-api' });
const metrics = new Metrics({ namespace: 'OrderService', serviceName: 'order-api' });

interface CreateOrderRequest {
  customerId: string;
  items: Array<{ productId: string; quantity: number }>;
}

const baseHandler = async (
  event: APIGatewayProxyEventV2 & { body: CreateOrderRequest },
): Promise<APIGatewayProxyResultV2> => {
  const { customerId, items } = event.body;

  // Validate input
  if (!customerId || !items?.length) {
    return {
      statusCode: 400,
      body: JSON.stringify({ error: 'customerId and items are required' }),
    };
  }

  try {
    const order = await createOrder(customerId, items);

    metrics.addMetric('OrderCreated', MetricUnit.Count, 1);
    metrics.addMetadata('orderId', order.id);

    return {
      statusCode: 201,
      body: JSON.stringify(order),
      headers: { 'Content-Type': 'application/json' },
    };
  } catch (error) {
    logger.error('Failed to create order', { error, customerId });
    metrics.addMetric('OrderCreationFailed', MetricUnit.Count, 1);

    return {
      statusCode: 500,
      body: JSON.stringify({ error: 'Internal server error' }),
    };
  }
};

export const handler = middy(baseHandler)
  .use(httpJsonBodyParser())
  .use(httpErrorHandler())
  .use(injectLambdaContext(logger, { logEvent: true }))
  .use(captureLambdaHandler(tracer))
  .use(logMetrics(metrics));

Asynchronous Invocation

The caller gets a 202 immediately. Lambda handles retries (2 retries with backoff) and dead letter queues.

typescript
// async-handler.ts — S3 event handler with idempotency
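import { S3Event } from 'aws-lambda';
import { Logger } from '@aws-lambda-powertools/logger';
import { IdempotencyConfig, makeIdempotent } from '@aws-lambda-powertools/idempotency';
import { DynamoDBPersistenceLayer } from '@aws-lambda-powertools/idempotency/dynamodb';

// processImage is assumed to be defined elsewhere in the service
const logger = new Logger({ serviceName: 'image-processor' });

const persistenceStore = new DynamoDBPersistenceLayer({
  tableName: process.env.IDEMPOTENCY_TABLE!,
});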

const processS3Event = async (event: S3Event): Promise<void> => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
    const size = record.s3.object.size;

    logger.info('Processing S3 object', { bucket, key, size });

    // Process the object (e.g., generate thumbnail)
    await processImage(bucket, key);
  }
};

// Wrap with idempotency to handle Lambda retries safely
export const handler = makeIdempotent(processS3Event, {
  persistenceStore,
  config: new IdempotencyConfig({
    eventKeyJmesPath: 'Records[0].s3.object.key',
    expiresAfterSeconds: 3600,
  }),
});

Stream-Based (Kinesis, DynamoDB Streams)

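Lambda polls the stream, batches records, and invokes your function. By default, if processing fails, the entire batch is retried until it succeeds or the records expire, blocking the shard. Reporting partial batch failures lets the retry start at the first failed record instead of the beginning of the batch.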

typescript
// stream-handler.ts — DynamoDB Stream processor with partial batch failure
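import { DynamoDBStreamEvent, DynamoDBRecord, DynamoDBBatchResponse } from 'aws-lambda';
import { Logger } from '@aws-lambda-powertools/logger';

const logger = new Logger({ serviceName: 'stream-processor' });

export const handler = async (event: DynamoDBStreamEvent): Promise<DynamoDBBatchResponse> => {
  const batchItemFailures: DynamoDBBatchResponse['batchItemFailures'] = [];

  for (const record of event.Records) {
    try {
      await processRecord(record);
    } catch (error) {
      logger.error('Failed to process record', {
        error,
        eventID: record.eventID,
        eventName: record.eventName,
      });

      // For streams the identifier is the record's sequence number, not its eventID.
      // With ReportBatchItemFailures enabled, Lambda checkpoints at the lowest
      // failed sequence number and retries from that record onward.
      batchItemFailures.push({
        itemIdentifier: record.dynamodb!.SequenceNumber!,
      });

      // Later records will be redelivered anyway; stop here to preserve ordering
      break;
    }
  }

  return { batchItemFailures };
};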

async function processRecord(record: DynamoDBRecord): Promise<void> {
  if (record.eventName === 'INSERT' || record.eventName === 'MODIFY') {
    const newImage = record.dynamodb?.NewImage;
    if (!newImage) return;

    // Process the change...
    await indexToElasticsearch(newImage);
  }
}

10. Lambda SnapStart (Java / Python / .NET)

The Problem with JVM Cold Starts

Java Lambda cold starts are brutal — 3-8 seconds is common due to:

  1. JVM startup and class loading
  2. JIT compilation warmup
  3. Framework initialization (Spring Boot: 5-10s)
  4. Dependency injection container setup

How SnapStart Works

When you publish a function version, SnapStart takes a Firecracker snapshot of the initialized execution environment (after the INIT phase) and stores it encrypted in a cache. On cold start, instead of running INIT from scratch, Lambda restores from the snapshot, much like resuming from hibernation.

SnapStart Caveats

WARNING

SnapStart restores from a frozen point in time. This means:

  1. Random number generators seeded during INIT may produce the same sequence in every restored environment; use java.security.SecureRandom or re-seed after restore
  2. Network connections opened during INIT are stale after restore; re-establish them in the handler or in an after-restore hook
  3. Timestamps captured during INIT are frozen; always read the clock (for example System.currentTimeMillis()) in the handler
  4. Uniqueness assumptions are violated if IDs such as UUIDs are generated in INIT; generate them per invocation

11. Production Patterns

The Middy Middleware Pattern (Node.js)

typescript
// middleware/index.ts — Reusable middleware stack
import middy from '@middy/core';
import httpJsonBodyParser from '@middy/http-json-body-parser';
import httpErrorHandler from '@middy/http-error-handler';
import httpHeaderNormalizer from '@middy/http-header-normalizer';
import httpCors from '@middy/http-cors';
import validator from '@middy/validator';
import warmup from '@middy/warmup';
import { transpileSchema } from '@middy/validator/transpile';
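// Powertools for AWS Lambda v2 exports its Middy middleware from the /middleware subpaths
import { injectLambdaContext } from '@aws-lambda-powertools/logger/middleware';
import { captureLambdaHandler } from '@aws-lambda-powertools/tracer/middleware';
import { logMetrics } from '@aws-lambda-powertools/metrics/middleware';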
import { logger, tracer, metrics } from './powertools';

export function createApiHandler<TEvent, TResult>(
  handler: (event: TEvent) => Promise<TResult>,
  options?: {
    inputSchema?: Record<string, unknown>;
    cors?: boolean;
  },
) {
  let middified = middy(handler)
    .use(warmup({ isWarmingUp: (e: any) => e?.source === 'serverless-warming' }))
    .use(httpHeaderNormalizer())
    .use(httpJsonBodyParser())
    .use(injectLambdaContext(logger, { logEvent: true }))
    .use(captureLambdaHandler(tracer))
    .use(logMetrics(metrics, { captureColdStartMetric: true }));

  if (options?.inputSchema) {
    middified = middified.use(
      validator({ eventSchema: transpileSchema(options.inputSchema) })
    );
  }

  if (options?.cors !== false) {
    middified = middified.use(httpCors({
      origins: [process.env.ALLOWED_ORIGIN ?? '*'],
      credentials: true,
    }));
  }

  return middified.use(httpErrorHandler({ logger: (error) => logger.error('Unhandled error', { error }) }));
}

Connection Pooling Pattern

typescript
// db/connection.ts — Reuse connections across warm invocations
import { Pool } from 'pg';
import { Signer } from '@aws-sdk/rds-signer';

let pool: Pool | null = null;

async function getPool(): Promise<Pool> {
  if (pool) {
    // Validate the pool is still healthy
    try {
      await pool.query('SELECT 1');
      return pool;
    } catch {
      // Pool is stale, recreate
      await pool.end().catch(() => {});
      pool = null;
    }
  }

  const signer = new Signer({
    hostname: process.env.DB_HOST!,
    port: 5432,
    username: process.env.DB_USER!,
    region: process.env.AWS_REGION!,
  });

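  pool = new Pool({
    host: process.env.DB_HOST,
    port: 5432,
    user: process.env.DB_USER,
    // IAM auth tokens expire after 15 minutes, so generate a fresh one each time
    // the pool opens a new connection instead of caching a single token
    password: () => signer.getAuthToken(),
    database: process.env.DB_NAME,
    ssl: { rejectUnauthorized: true },
    max: 1, // Each execution environment handles one invocation at a time, so one connection is enough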
    idleTimeoutMillis: 120000, // Close idle connections after 2 min
    connectionTimeoutMillis: 5000,
  });

  return pool;
}

export { getPool };

War Story

A startup migrated their Express.js API to Lambda without changing the connection pool size. Each Lambda instance opened 10 database connections (the default max in pg). At 500 concurrent Lambda invocations, they had 5,000 open database connections to a db.r5.large (max 1,000 connections). The database OOMed, cascading into a full outage. The fix: set max: 1 in the pool configuration (since Lambda processes one request at a time) and use RDS Proxy for connection pooling across instances.


12. Observability

Structured Logging

typescript
// powertools.ts — Centralized observability setup
import { Logger } from '@aws-lambda-powertools/logger';
import { Tracer } from '@aws-lambda-powertools/tracer';
import { Metrics } from '@aws-lambda-powertools/metrics';

export const logger = new Logger({
  serviceName: process.env.SERVICE_NAME ?? 'unknown',
  logLevel: process.env.LOG_LEVEL ?? 'INFO',
  persistentLogAttributes: {
    environment: process.env.ENVIRONMENT ?? 'unknown',
    version: process.env.APP_VERSION ?? 'unknown',
  },
});

export const tracer = new Tracer({
  serviceName: process.env.SERVICE_NAME ?? 'unknown',
  captureHTTPsRequests: true,
});

export const metrics = new Metrics({
  namespace: process.env.METRICS_NAMESPACE ?? 'Application',
  serviceName: process.env.SERVICE_NAME ?? 'unknown',
  defaultDimensions: {
    environment: process.env.ENVIRONMENT ?? 'unknown',
  },
});

Key CloudWatch Metrics to Monitor

| Metric | Alarm Threshold | Why |
|---|---|---|
| Errors | > 1% of invocations | Function failures |
| Throttles | > 0 sustained | Hitting concurrency limits |
| Duration (P99) | > 80% of timeout | Approaching timeout |
| ConcurrentExecutions | > 80% of account limit | Approaching throttle |
| IteratorAge (streams) | > 60s | Processing falling behind |
| DeadLetterErrors | > 0 | DLQ delivery failures |

13. Edge Cases and Failure Modes

Timeout Cascading

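If Function A (timeout: 30s) calls Function B (timeout: 30s) synchronously, Function A can time out while still waiting for B, because A's own work and the network round trip also count against its 30 seconds. Always set:

$$T_{\text{caller}} > T_{\text{callee}} + T_{\text{network\_overhead}}$$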
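
At runtime, the caller can enforce this inequality by deriving the downstream timeout from its own remaining time. A minimal sketch: in a real handler `remainingMs` would come from `context.getRemainingTimeInMillis()`, while `calleeTimeoutMs` and its default margins are hypothetical.

```typescript
// Time the caller can afford to wait on a downstream call, leaving room for
// network overhead and for its own error handling after the call returns.
function calleeTimeoutMs(
  remainingMs: number,
  networkOverheadMs: number = 500,
  safetyMarginMs: number = 1000,
): number {
  return Math.max(0, remainingMs - networkOverheadMs - safetyMarginMs);
}

// A caller with its full 30s left can wait at most 28.5s.
const fullBudget = calleeTimeoutMs(30000);

// A caller with only 1s left should fail fast instead of calling downstream.
const exhausted = calleeTimeoutMs(1000);
```

A result of 0 is the signal to return an error immediately, which keeps the failure in the caller's own logs rather than surfacing as an opaque timeout.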

Payload Size Limits

| Invocation Type | Request Payload | Response Payload |
|---|---|---|
| Synchronous | 6 MB | 6 MB |
| Asynchronous | 256 KB | N/A |
| Stream | 6 MB (batch) | N/A |
| Response Streaming | 6 MB request | 20 MB response |
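
Exceeding these limits fails the invocation, so it can be worth checking the serialized size before sending. The sketch below measures the UTF-8 size of a JSON payload against the synchronous and asynchronous limits from the table; the helper names are hypothetical, and the byte constants assume binary megabytes and kilobytes.

```typescript
const SYNC_LIMIT_BYTES = 6 * 1024 * 1024; // 6 MB synchronous request limit
const ASYNC_LIMIT_BYTES = 256 * 1024;     // 256 KB asynchronous request limit

// UTF-8 length without Node-specific APIs: encodeURIComponent emits one %XX
// escape per non-ASCII or reserved byte, so collapsing each escape counts bytes.
function utf8ByteLength(text: string): number {
  return encodeURIComponent(text).replace(/%[0-9A-F]{2}/g, 'x').length;
}

type InvocationType = 'sync' | 'async';

// Whether the JSON-serialized payload fits the limit for the invocation type.
function fitsPayloadLimit(payload: unknown, type: InvocationType): boolean {
  const bytes = utf8ByteLength(JSON.stringify(payload));
  return bytes <= (type === 'sync' ? SYNC_LIMIT_BYTES : ASYNC_LIMIT_BYTES);
}

// A 300 KB body fits a synchronous call but not an asynchronous one.
const body = { data: 'x'.repeat(300 * 1024) };
const okSync = fitsPayloadLimit(body, 'sync');
const okAsync = fitsPayloadLimit(body, 'async');
```

When a payload does not fit, a common workaround is to write it to S3 and pass only the object key in the event.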

The /tmp Trap

The /tmp directory persists across warm invocations and is limited to 512 MB by default (configurable up to 10GB). If you write files to /tmp without cleaning up, disk usage accumulates until writes fail with ENOSPC (no space left on device):

typescript
import * as fs from 'node:fs';

// DANGER: /tmp fills up across warm invocations
export const handler = async (event: S3Event) => {
  const tmpFile = `/tmp/${Date.now()}.json`;
  await downloadS3Object(event, tmpFile);
  const result = await processFile(tmpFile);
  // Missing: fs.unlinkSync(tmpFile);
  return result;
};

// SAFE: Always clean up /tmp
export const handler = async (event: S3Event) => {
  const tmpFile = `/tmp/${Date.now()}.json`;
  try {
    await downloadS3Object(event, tmpFile);
    return await processFile(tmpFile);
  } finally {
    try { fs.unlinkSync(tmpFile); } catch { /* ignore */ }
  }
};

Recursive Invocations

A Lambda writing to S3, which triggers the same Lambda, creates an infinite loop. AWS added recursive loop detection in 2023 (initially for SQS and SNS, extended to S3 in 2024), but you should still guard against it:

typescript
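// Guard against recursive invocations. Assumes callers forward a custom
// x-recursion-depth header; S3 events carry no headers, so for S3 triggers
// write output to a different bucket or prefix than the one that invokes the function.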
const MAX_RECURSION_DEPTH = 3;

export const handler = async (event: any, context: any) => {
  const depth = parseInt(event.headers?.['x-recursion-depth'] ?? '0');

  if (depth >= MAX_RECURSION_DEPTH) {
    logger.error('Max recursion depth exceeded', { depth });
    throw new Error('Recursive invocation detected');
  }

  // Pass incremented depth to downstream calls
  // ...
};

14. Performance Characteristics

Latency Breakdown (Typical API Gateway + Lambda)

| Component | Latency | Notes |
|---|---|---|
| API Gateway overhead | 10-30ms | REST API; HTTP API is faster (~5-15ms) |
| Cold start (Node.js) | 150-300ms | First invocation only |
| Warm invocation | 1-5ms | Lambda service overhead |
| Handler execution | Varies | Your code |
| Response serialization | <1ms | JSON.stringify |
| Total (warm) | 15-50ms | Excluding your code |
| Total (cold) | 200-500ms | Excluding your code |

Throughput Limits

$$\text{Max RPS} = \frac{\text{Account Concurrency Limit}}{\text{Avg Duration (seconds)}}$$

For 1,000 concurrent with 100ms average duration:

$$\text{Max RPS} = \frac{1000}{0.1} = 10{,}000 \text{ RPS}$$
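
The same relationship works in both directions: given a concurrency limit it yields the throughput ceiling, and given a target throughput it yields the concurrency to request. A small sketch with hypothetical helper names; durations are taken in milliseconds to keep the arithmetic exact.

```typescript
// Throughput ceiling for a concurrency limit and an average duration.
function maxRps(concurrencyLimit: number, avgDurationMs: number): number {
  return (concurrencyLimit * 1000) / avgDurationMs;
}

// Concurrency needed to sustain a target request rate (rounded up).
function requiredConcurrency(targetRps: number, avgDurationMs: number): number {
  return Math.ceil((targetRps * avgDurationMs) / 1000);
}

// The example above: 1,000 concurrent executions at 100ms each.
const ceilingRps = maxRps(1000, 100);

// Sustaining 250 RPS at 120ms per request needs 30 concurrent executions.
const needed = requiredConcurrency(250, 120);
```

Since duration sits in the denominator, halving average duration doubles the throughput ceiling without any quota increase.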

15. Decision Framework: When to Use Lambda

| Factor | Lambda Wins | Containers Win |
|---|---|---|
| Traffic pattern | Sporadic, bursty | Steady, predictable |
| Duration | < 15 minutes | Long-running |
| Startup latency | Acceptable cold starts | Sub-ms required |
| State | Stateless | Stateful (WebSockets, caches) |
| Cost at scale | < ~1M requests/day | > ~1M requests/day (usually) |
| Team size | Small (< 5) | Large with platform team |
| Vendor lock-in | Acceptable | Must be portable |

Cost Comparison at Scale

| Monthly Invocations | Avg Duration | Lambda Cost | Fargate (equivalent) |
|---|---|---|---|
| 1 million | 200ms | $3.54 | $36.50 |
| 10 million | 200ms | $35.40 | $73.00 |
| 100 million | 200ms | $354.00 | $146.00 |
| 1 billion | 200ms | $3,540.00 | $292.00 |

TIP

Lambda has a generous free tier (1M requests + 400,000 GB-seconds/month). For low-traffic services, Lambda is essentially free.


16. Advanced Topics

Lambda Extensions

Extensions run as separate processes in the execution environment, enabling:

  • Custom monitoring agents
  • Configuration management
  • Secret rotation
  • Log processing

Response Streaming

Introduced in 2023, response streaming lets you send partial responses as they become available — critical for LLM/AI workloads:

typescript
import { streamifyResponse, ResponseStream } from 'lambda-stream';

export const handler = streamifyResponse(
  async (event: any, responseStream: ResponseStream) => {
    responseStream.setContentType('text/plain');

    // Stream data as it becomes available
    for (let i = 0; i < 100; i++) {
      const chunk = await processChunk(i);
      responseStream.write(chunk);
    }

    responseStream.end();
  }
);

Lambda@Edge vs CloudFront Functions

| Feature | Lambda@Edge | CloudFront Functions |
|---|---|---|
| Runtime | Node.js, Python | JavaScript only |
| Duration | Up to 30s (origin) / 5s (viewer) | < 1ms |
| Memory | 128 MB (viewer) / up to 10GB (origin) | 2 MB |
| Network | Yes | No |
| Pricing | Lambda pricing | 1/6 of Lambda@Edge |
| Use case | Auth, A/B testing, dynamic routing | Header manipulation, URL rewrites |

17. Terraform Module: Production Lambda

hcl
# modules/lambda/main.tf — Reusable Lambda module
# Assumes aws_iam_role.this and data.archive_file.function are defined elsewhere in the module.
variable "function_name" { type = string }
variable "handler" { type = string }
variable "runtime" { type = string }

# HCL has no ";" separator, so blocks with more than one argument span multiple lines
variable "memory_size" {
  type    = number
  default = 512
}

variable "timeout" {
  type    = number
  default = 30
}

variable "environment_variables" {
  type    = map(string)
  default = {}
}

variable "vpc_config" {
  type = object({
    subnet_ids         = list(string)
    security_group_ids = list(string)
  })
  default = null
}

variable "reserved_concurrency" {
  type    = number
  default = -1
}

variable "layers" {
  type    = list(string)
  default = []
}

variable "tracing_mode" {
  type    = string
  default = "Active"
}

# Referenced by the CloudWatch alarms below
variable "sns_alarm_topic_arn" { type = string }

resource "aws_lambda_function" "this" {
  function_name = var.function_name
  role          = aws_iam_role.this.arn
  handler       = var.handler
  runtime       = var.runtime
  memory_size   = var.memory_size
  timeout       = var.timeout
  layers        = var.layers

  filename         = data.archive_file.function.output_path
  source_code_hash = data.archive_file.function.output_base64sha256

  reserved_concurrent_executions = var.reserved_concurrency

  tracing_config {
    mode = var.tracing_mode
  }

  dynamic "vpc_config" {
    for_each = var.vpc_config != null ? [var.vpc_config] : []
    content {
      subnet_ids         = vpc_config.value.subnet_ids
      security_group_ids = vpc_config.value.security_group_ids
    }
  }

  environment {
    variables = merge(var.environment_variables, {
      POWERTOOLS_SERVICE_NAME = var.function_name
      LOG_LEVEL               = "INFO"
    })
  }

  tags = {
    Service     = var.function_name
    ManagedBy   = "terraform"
  }
}

resource "aws_lambda_function_event_invoke_config" "this" {
  function_name          = aws_lambda_function.this.function_name
  maximum_retry_attempts = 2
  maximum_event_age_in_seconds = 3600

  destination_configuration {
    on_failure {
      destination = aws_sqs_queue.dlq.arn
    }
  }
}

resource "aws_sqs_queue" "dlq" {
  name                      = "${var.function_name}-dlq"
  message_retention_seconds = 1209600  # 14 days

  tags = {
    Service   = var.function_name
    ManagedBy = "terraform"
  }
}

resource "aws_cloudwatch_metric_alarm" "errors" {
  alarm_name          = "${var.function_name}-errors"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "Errors"
  namespace           = "AWS/Lambda"
  period              = 300
  statistic           = "Sum"
  threshold           = 5
  alarm_description   = "Lambda function error rate exceeded threshold"

  dimensions = {
    FunctionName = aws_lambda_function.this.function_name
  }

  alarm_actions = [var.sns_alarm_topic_arn]
  ok_actions    = [var.sns_alarm_topic_arn]
}

resource "aws_cloudwatch_metric_alarm" "throttles" {
  alarm_name          = "${var.function_name}-throttles"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "Throttles"
  namespace           = "AWS/Lambda"
  period              = 60
  statistic           = "Sum"
  threshold           = 0
  alarm_description   = "Lambda function is being throttled"

  dimensions = {
    FunctionName = aws_lambda_function.this.function_name
  }

  alarm_actions = [var.sns_alarm_topic_arn]
}

"What I cannot create, I do not understand." — Richard Feynman