
Auth Service Blueprint

Authentication is the front door to every product. It is the first thing users interact with, the most security-critical system you will build, and the one with the highest operational cost if you get it wrong. A compromised auth system does not just leak data — it destroys trust.

This blueprint covers a complete, production-grade authentication service that handles user registration, login, session management, token refresh, multi-factor authentication, password resets, OAuth2/OIDC social login, and full audit logging. It is designed for a SaaS platform serving tens of thousands to millions of users.

What This Blueprint Covers

This is not a tutorial on passport.js. This is a system design for authentication infrastructure that you can operate in production with confidence. The blueprint addresses six critical dimensions:

  1. Architecture — How the auth service fits into a broader platform, its internal components, and its communication patterns with other services.
  2. API Contracts — Every endpoint, request/response schema, error format, and edge case. A developer should be able to implement a client from this spec alone.
  3. Database Schema — Complete PostgreSQL schema with users, sessions, refresh tokens, password reset tokens, MFA secrets, and audit logs. Full DDL with indexes and constraints.
  4. Deployment — Docker, Kubernetes manifests, health checks, environment configuration, and CI/CD pipeline.
  5. Scaling — How the system behaves under load, connection pooling, Redis clustering, horizontal scaling, and capacity planning.
  6. Security — Threat model, attack surface analysis, rate limiting, brute-force protection, and compliance considerations.

System Overview

The auth service is a stateless microservice that manages identity, credentials, and sessions. It relies on PostgreSQL for persistent storage and Redis for session/token caching. It exposes a REST API consumed by client applications and other backend services.

Authentication Flows

The service supports multiple authentication flows, each optimized for different client types and security requirements.

Registration Flow

Login Flow

Token Refresh Flow

Token Architecture

The service uses a dual-token architecture with short-lived access tokens and long-lived refresh tokens, implementing refresh token rotation for security.

| Token Type | Format | Lifetime | Storage | Purpose |
|---|---|---|---|---|
| Access Token | JWT (RS256) | 15 minutes | Client only | API authorization |
| Refresh Token | Opaque (UUID v4) | 30 days | Redis + DB | Token renewal |
| MFA Token | Opaque (UUID v4) | 5 minutes | Redis | MFA flow continuation |
| Email Verification | Opaque (URL-safe) | 24 hours | PostgreSQL | Email confirmation |
| Password Reset | Opaque (URL-safe) | 1 hour | PostgreSQL | Password recovery |
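
As a rough illustration of the opaque token rows above, the sketch below issues a token value together with its expiry. The names `TOKEN_TTL_SECONDS`, `OpaqueToken`, and `issueOpaqueToken` are hypothetical, and the 32-byte length for URL-safe tokens is an assumption; the blueprint only specifies the format and lifetime.

```typescript
import { randomBytes, randomUUID } from "node:crypto";

// Lifetimes taken from the token table, expressed in seconds.
const TOKEN_TTL_SECONDS = {
  refresh: 30 * 24 * 60 * 60,
  mfa: 5 * 60,
  emailVerification: 24 * 60 * 60,
  passwordReset: 60 * 60,
} as const;

type OpaqueTokenKind = keyof typeof TOKEN_TTL_SECONDS;

interface OpaqueToken {
  kind: OpaqueTokenKind;
  value: string;
  expiresAt: Date;
}

// Refresh and MFA tokens are UUID v4 values; email and reset tokens are
// random bytes in base64url form (32 bytes is an assumed length).
function issueOpaqueToken(kind: OpaqueTokenKind, now: Date = new Date()): OpaqueToken {
  const value =
    kind === "refresh" || kind === "mfa"
      ? randomUUID()
      : randomBytes(32).toString("base64url");
  const expiresAt = new Date(now.getTime() + TOKEN_TTL_SECONDS[kind] * 1000);
  return { kind, value, expiresAt };
}
```

The caller would persist the value in the store listed in the table (Redis, PostgreSQL, or both) and reject it once `expiresAt` has passed.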

Access Token Claims

typescript
interface AccessTokenPayload {
  // Standard JWT claims
  iss: string;          // Issuer: "auth.yourplatform.com"
  sub: string;          // Subject: user ID (UUID)
  aud: string[];        // Audience: ["api.yourplatform.com"]
  exp: number;          // Expiration: 15 minutes from issuance
  iat: number;          // Issued at
  jti: string;          // JWT ID: unique token identifier

  // Custom claims
  email: string;
  email_verified: boolean;
  roles: string[];      // ["user", "admin"]
  permissions: string[];// ["read:projects", "write:projects"]
  org_id?: string;      // Organization context (multi-tenant)
  session_id: string;   // Links to session in Redis
}
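
To make the claim set concrete, here is a minimal sketch of issuing and verifying an RS256 access token carrying these claims. The blueprint names the jose library for JWT signing; this sketch deliberately uses Node's built-in `node:crypto` instead so that it has no dependencies. The function names `signAccessToken` and `verifyAccessToken`, the `kid` value, and the sample claim values are all hypothetical, and a real verifier would also check `iss` and resolve the key by `kid`.

```typescript
import { createSign, createVerify, generateKeyPairSync, randomUUID } from "node:crypto";
import type { KeyObject } from "node:crypto";

interface AccessTokenPayload {
  iss: string; sub: string; aud: string[]; exp: number; iat: number; jti: string;
  email: string; email_verified: boolean; roles: string[]; permissions: string[];
  org_id?: string; session_id: string;
}

const ACCESS_TOKEN_TTL_SECONDS = 15 * 60;
const encodeSegment = (value: object): string =>
  Buffer.from(JSON.stringify(value)).toString("base64url");

function signAccessToken(
  claims: Omit<AccessTokenPayload, "exp" | "iat" | "jti">,
  privateKey: KeyObject,
  kid: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): string {
  const payload: AccessTokenPayload = {
    ...claims,
    iat: nowSeconds,
    exp: nowSeconds + ACCESS_TOKEN_TTL_SECONDS,
    jti: randomUUID(),
  };
  const signingInput = `${encodeSegment({ alg: "RS256", typ: "JWT", kid })}.${encodeSegment(payload)}`;
  // "RSA-SHA256" with an RSA key yields an RSASSA-PKCS1-v1_5 signature, i.e. RS256.
  const signature = createSign("RSA-SHA256").update(signingInput).sign(privateKey);
  return `${signingInput}.${signature.toString("base64url")}`;
}

function verifyAccessToken(
  token: string,
  publicKey: KeyObject,
  audience: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): AccessTokenPayload {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("malformed token");
  const [header = "", body = "", signature = ""] = parts;
  // Pin the algorithm so a token cannot choose how it is verified.
  const parsedHeader = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
  if (parsedHeader.alg !== "RS256") throw new Error("unexpected alg");
  const valid = createVerify("RSA-SHA256")
    .update(`${header}.${body}`)
    .verify(publicKey, Buffer.from(signature, "base64url"));
  if (!valid) throw new Error("bad signature");
  const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as AccessTokenPayload;
  if (payload.exp <= nowSeconds) throw new Error("token expired");
  if (!payload.aud.includes(audience)) throw new Error("wrong audience");
  return payload;
}

// Usage: a throwaway key pair stands in for the service's real signing key.
const demoKeys = generateKeyPairSync("rsa", { modulusLength: 2048 });
const demoClaims = {
  iss: "auth.yourplatform.com",
  sub: randomUUID(),
  aud: ["api.yourplatform.com"],
  email: "user@example.com",
  email_verified: true,
  roles: ["user"],
  permissions: ["read:projects"],
  session_id: randomUUID(),
};
const demoToken = signAccessToken(demoClaims, demoKeys.privateKey, "demo-key-1");
const demoVerified = verifyAccessToken(demoToken, demoKeys.publicKey, "api.yourplatform.com");
console.log(demoVerified.roles);
```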

Key Rotation Strategy

Access tokens are signed with RS256 (RSASSA-PKCS1-v1_5 with SHA-256) using an asymmetric key pair. The public key is published at a JWKS endpoint, so any service can verify tokens without calling the auth service.

GET /.well-known/jwks.json

Key rotation happens on a 90-day cycle:

  1. Generate new key pair, assign new kid.
  2. Add new public key to JWKS endpoint, and give verifiers time to refresh any cached copy of the JWKS.
  3. Start signing new tokens with the new key.
  4. Keep old public key in JWKS for the lifetime of existing tokens (15 min).
  5. Remove old public key after grace period.
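
The five steps above can be sketched as a small in-memory key ring. This is an illustration under assumptions, not the service's actual implementation: the `KeyRing` class and its method names are hypothetical, the `kid` is an arbitrary UUID, and 2048-bit RSA is an assumed key size. Scheduling the 90-day cycle is left out.

```typescript
import { generateKeyPairSync, randomUUID } from "node:crypto";
import type { KeyObject } from "node:crypto";

interface SigningKey {
  kid: string;
  privateKey: KeyObject;
  publicKey: KeyObject;
  retiredAtMs?: number; // set when a newer key takes over signing
}

// Grace period equals the access token lifetime (15 minutes).
const GRACE_PERIOD_MS = 15 * 60 * 1000;

class KeyRing {
  private keys: SigningKey[] = [];
  private activeKid: string | undefined;

  // Steps 1 and 2: generate a key pair with a new kid and publish it.
  addKey(): string {
    const { publicKey, privateKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });
    const kid = randomUUID();
    this.keys.push({ kid, privateKey, publicKey });
    return kid;
  }

  // Step 3: switch signing to the given key and mark the previous one retired.
  activate(kid: string, nowMs: number): void {
    if (!this.keys.some((key) => key.kid === kid)) throw new Error(`unknown kid ${kid}`);
    if (kid === this.activeKid) return;
    for (const key of this.keys) {
      if (key.kid === this.activeKid) key.retiredAtMs = nowMs;
    }
    this.activeKid = kid;
  }

  // Steps 4 and 5: retired keys stay published until the grace period ends.
  prune(nowMs: number): void {
    this.keys = this.keys.filter(
      (key) => key.retiredAtMs === undefined || nowMs - key.retiredAtMs < GRACE_PERIOD_MS,
    );
  }

  // Body that GET /.well-known/jwks.json would return.
  jwks(): { keys: Array<Record<string, unknown>> } {
    return {
      keys: this.keys.map((key) => ({
        ...key.publicKey.export({ format: "jwk" }),
        kid: key.kid,
        use: "sig",
        alg: "RS256",
      })),
    };
  }
}
```

Publishing a key before activating it means a verifier that caches the JWKS has a chance to pick up the new `kid` before any token references it.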

Security Posture

Password Policy

| Rule | Requirement |
|---|---|
| Minimum length | 10 characters |
| Maximum length | 128 characters |
| Complexity | No explicit rules; use zxcvbn scoring |
| Minimum zxcvbn score | 3 out of 4 |
| Breach check | Check against Have I Been Pwned API (k-anonymity model) |
| Hashing | Argon2id (memory: 64 MB, iterations: 3, parallelism: 4) |
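
The length rules and the k-anonymity breach check can be sketched as follows. Under the k-anonymity model only the first five hex characters of the password's SHA-1 digest are sent to the Pwned Passwords range endpoint, and the returned `SUFFIX:COUNT` lines are matched locally. The function names here are hypothetical, the network call is omitted, and counting length in Unicode code points is an assumption; zxcvbn scoring and Argon2id hashing are left to their respective libraries.

```typescript
import { createHash } from "node:crypto";

const MIN_PASSWORD_LENGTH = 10;
const MAX_PASSWORD_LENGTH = 128;

// Length is counted in Unicode code points (an assumption; the policy says "characters").
function hasValidLength(password: string): boolean {
  const length = [...password].length;
  return length >= MIN_PASSWORD_LENGTH && length <= MAX_PASSWORD_LENGTH;
}

// Split the uppercase SHA-1 hex digest into the 5-character prefix that is
// sent to the range endpoint and the 35-character suffix that stays local.
function toRangeQuery(password: string): { prefix: string; suffix: string } {
  const digest = createHash("sha1").update(password, "utf8").digest("hex").toUpperCase();
  return { prefix: digest.slice(0, 5), suffix: digest.slice(5) };
}

// Scan a range response body ("SUFFIX:COUNT" per line) for our suffix.
// Returns the reported breach count, or 0 when the suffix is absent.
function breachCount(suffix: string, rangeResponseBody: string): number {
  for (const line of rangeResponseBody.split(/\r?\n/)) {
    const [candidate, count] = line.trim().split(":");
    if (candidate === suffix) return Number(count ?? "0");
  }
  return 0;
}
```

A caller would fetch `https://api.pwnedpasswords.com/range/<prefix>` and pass the response text to `breachCount`; the full hash and the password itself never leave the service.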

Brute Force Protection

| Layer | Mechanism | Threshold |
|---|---|---|
| API Gateway | Global rate limit | 1000 req/min per IP |
| Login endpoint | Per-IP rate limit | 10 attempts per 15 min |
| Login endpoint | Per-account rate limit | 5 failed attempts, then progressive delay |
| Account lockout | Temporary lockout | 10 failed attempts → 30 min lockout |
| CAPTCHA | Challenge after failures | After 3 failed attempts |
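
The per-account thresholds in the table can be combined into one decision function. The sketch below is an in-memory illustration only; a real deployment would keep the counters in Redis. The `LoginThrottle` class, the 1-second base delay, the doubling schedule, and the counter reset at lockout are assumptions, since the table only says "progressive delay".

```typescript
type LoginDecision =
  | { action: "allow"; requireCaptcha: boolean; delayMs: number }
  | { action: "locked"; retryAfterMs: number };

const CAPTCHA_AFTER_FAILURES = 3;
const DELAY_AFTER_FAILURES = 5;
const LOCKOUT_AFTER_FAILURES = 10;
const LOCKOUT_MS = 30 * 60 * 1000;
const BASE_DELAY_MS = 1000; // assumed starting delay, doubled per extra failure

interface AccountState {
  failures: number;
  lockedUntilMs: number;
}

class LoginThrottle {
  private accounts = new Map<string, AccountState>();

  // Decide how to treat the next login attempt for this account.
  check(accountId: string, nowMs: number): LoginDecision {
    const state = this.accounts.get(accountId);
    if (!state) return { action: "allow", requireCaptcha: false, delayMs: 0 };
    if (state.lockedUntilMs > nowMs) {
      return { action: "locked", retryAfterMs: state.lockedUntilMs - nowMs };
    }
    const failuresPastThreshold = state.failures - DELAY_AFTER_FAILURES;
    return {
      action: "allow",
      requireCaptcha: state.failures >= CAPTCHA_AFTER_FAILURES,
      delayMs: failuresPastThreshold >= 0 ? BASE_DELAY_MS * 2 ** failuresPastThreshold : 0,
    };
  }

  // Count a failed attempt; the tenth failure starts a 30-minute lockout
  // and resets the counter for the period after the lockout.
  recordFailure(accountId: string, nowMs: number): void {
    const state = this.accounts.get(accountId) ?? { failures: 0, lockedUntilMs: 0 };
    state.failures += 1;
    if (state.failures >= LOCKOUT_AFTER_FAILURES) {
      state.lockedUntilMs = nowMs + LOCKOUT_MS;
      state.failures = 0;
    }
    this.accounts.set(accountId, state);
  }

  // A successful login clears the account's failure history.
  recordSuccess(accountId: string): void {
    this.accounts.delete(accountId);
  }
}
```

The per-IP and gateway limits from the table sit in front of this check and are not modeled here.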

Session Security

  • HTTP-only, Secure, SameSite=Strict cookies for web clients.
  • Refresh token rotation: every refresh invalidates the old token.
  • Refresh token reuse detection: if a revoked token is presented, all tokens for that session are revoked (indicates token theft).
  • Session binding: tokens are bound to a device fingerprint (User-Agent + IP subnet).
  • Concurrent session limit: configurable, default 5 active sessions per user.
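
Refresh token rotation and reuse detection, as described in the list above, can be illustrated with a small in-memory store. The `RefreshTokenStore` class and its method names are hypothetical, and a `Map` stands in for the Redis and PostgreSQL storage the blueprint specifies; expiry, session binding, and the concurrent session limit are not modeled.

```typescript
import { randomUUID } from "node:crypto";

interface RefreshRecord {
  sessionId: string;
  revoked: boolean;
}

class RefreshTokenStore {
  private tokens = new Map<string, RefreshRecord>();

  // Issue a new opaque refresh token for a session.
  issue(sessionId: string): string {
    const token = randomUUID();
    this.tokens.set(token, { sessionId, revoked: false });
    return token;
  }

  // Exchange a refresh token for a new one. The old token is revoked.
  // Presenting an already revoked token is treated as theft: every token
  // in that session is revoked and the request is rejected.
  rotate(presentedToken: string): string {
    const record = this.tokens.get(presentedToken);
    if (!record) throw new Error("unknown refresh token");
    if (record.revoked) {
      this.revokeSession(record.sessionId);
      throw new Error("refresh token reuse detected; session revoked");
    }
    record.revoked = true;
    return this.issue(record.sessionId);
  }

  revokeSession(sessionId: string): void {
    for (const record of this.tokens.values()) {
      if (record.sessionId === sessionId) record.revoked = true;
    }
  }

  isActive(token: string): boolean {
    return this.tokens.get(token)?.revoked === false;
  }
}
```

Revoked records are kept rather than deleted so that a replayed token can still be recognized and traced back to its session.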

Technology Stack Justification

| Component | Choice | Rationale |
|---|---|---|
| Language | TypeScript (Node.js) | Strong ecosystem for web auth, type safety, team familiarity |
| Framework | Fastify | Higher throughput than Express in Fastify's published benchmarks, schema-based validation, plugin architecture |
| ORM | Drizzle ORM | Type-safe queries, zero-cost abstractions, migration support |
| Password hashing | Argon2id | OWASP recommended, memory-hard (resists GPU attacks) |
| JWT signing | jose library | RFC-compliant, supports RS256/ES256, JWKS |
| Database | PostgreSQL 16 | ACID compliance, row-level security, JSON support |
| Session store | Redis 7 (Cluster) | Sub-millisecond latency, TTL support, atomic operations |
| Email | SendGrid | Deliverability, templates, webhook tracking |
| MFA | otplib (TOTP) | RFC 6238 compliant, compatible with Google Authenticator |
| OAuth | openid-client | Certified OpenID Connect RP, supports PKCE |

Failure Mode Analysis

| Failure | Impact | Detection | Mitigation |
|---|---|---|---|
| PostgreSQL down | No new registrations, no password resets | Health check fails | Read replica failover, cached user data in Redis for login |
| Redis down | No session validation, no token refresh | Health check fails | Fall back to database session lookup (degraded performance) |
| Email service down | Verification/reset emails not sent | Delivery webhook monitoring | Queue emails, retry with exponential backoff |
| JWT signing key compromised | All tokens untrustworthy | Anomaly detection in audit logs | Immediate key rotation, revoke all sessions |
| DDoS on login | Service unavailable | Rate limit metrics spike | WAF rules, CAPTCHA, IP blocklist |
| Credential stuffing | Account takeovers | Failed login rate anomaly | Rate limiting, breach detection, forced MFA |
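
For the "retry with exponential backoff" mitigation in the email row, one possible delay schedule is sketched below. The base delay, the 5-minute cap, the use of full jitter, and the function name `emailRetryDelayMs` are all assumptions chosen for illustration; the blueprint does not specify them.

```typescript
const EMAIL_RETRY_BASE_MS = 1_000;           // assumed first-retry ceiling
const EMAIL_RETRY_MAX_MS = 5 * 60 * 1_000;   // assumed cap on any single delay

// Exponential backoff with full jitter: the ceiling doubles with each
// attempt (attempt 0 is the first retry) up to the cap, and the actual
// delay is drawn uniformly from [0, ceiling) so queued retries spread out.
function emailRetryDelayMs(attempt: number, random: () => number = Math.random): number {
  const ceiling = Math.min(EMAIL_RETRY_MAX_MS, EMAIL_RETRY_BASE_MS * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

The `random` parameter exists only so the schedule can be exercised deterministically in tests.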

Subsections

  • Architecture — Detailed component design, service boundaries, internal data flow, and technology stack deep dive.
  • API Contracts — Complete REST API specification with TypeScript types, request/response schemas, and error handling.
  • Database Schema — Full PostgreSQL DDL with tables, indexes, constraints, triggers, and migration files.
  • Deployment — Docker, Kubernetes manifests, environment configuration, health checks, and CI/CD pipeline.
  • Scaling Plan — Performance characteristics, connection pooling, Redis clustering, horizontal scaling, and capacity planning.

Key Design Decisions

Why Not Use Auth0/Clerk/Firebase Auth?

Third-party auth providers are excellent for getting started quickly, but this blueprint exists for teams that need:

  1. Full data ownership — User credentials never leave your infrastructure.
  2. Custom flows — Enterprise SSO, custom MFA, org-level auth policies.
  3. Cost control — At 100K+ MAU, managed auth services cost $5K-50K/month.
  4. Compliance — SOC 2, HIPAA, GDPR requirements that mandate data residency.
  5. Integration depth — Auth that deeply integrates with your domain model.

Why Argon2id Over bcrypt?

bcrypt was the gold standard for password hashing for 15 years. Argon2id supersedes it because:

  • Memory-hard: bcrypt uses ~4 KB of memory per hash; Argon2id, as configured here, uses 64 MB. Each parallel guess needs its own memory, which makes large-scale GPU/ASIC cracking far more expensive.
  • Configurable parallelism: Argon2id can use multiple CPU cores.
  • Side-channel resistance: The "id" variant combines Argon2i (side-channel resistant) and Argon2d (GPU-resistant).
  • Winner of the Password Hashing Competition (2015): Peer-reviewed by the cryptographic community.

Why RS256 Over HS256 for JWTs?

HS256 (symmetric) is simpler but requires every service that validates tokens to have the secret key. RS256 (asymmetric) means:

  • Only the auth service has the private key.
  • Any service can validate tokens using the public key (from JWKS endpoint).
  • A compromised downstream service cannot forge tokens, because it only ever holds the public key.
  • Standard OIDC/JWKS infrastructure works out of the box.

"Auth is the one system where 'good enough' is never good enough. Every shortcut you take is a vulnerability you will eventually have to patch under pressure."

"What I cannot create, I do not understand." — Richard Feynman