ClickHouse vs Druid vs Pinot
ClickHouse, Apache Druid, and Apache Pinot are the three dominant open-source real-time OLAP engines. All three can ingest millions of events per second and serve analytical queries in milliseconds — but they achieve this through fundamentally different architectures, and each excels in different scenarios.
This page is a deep technical comparison. For the general real-time analytics architecture, see the Real-Time Analytics Stack overview.
At a Glance
| Dimension | ClickHouse | Apache Druid | Apache Pinot |
|---|---|---|---|
| Origin | Yandex (2016) | Metamarkets → Imply (2011) | LinkedIn → Apache (2015) |
| Architecture | Shared-nothing MPP | Micro-services (separate ingest/query/storage) | Micro-services (similar to Druid) |
| Primary use | General analytics, log analytics | Time-series analytics, dashboards | User-facing analytics at scale |
| Query language | SQL (near-complete) | Druid SQL + native JSON queries | SQL (PQL legacy, multi-stage engine) |
| Storage format | Columnar (MergeTree) | Columnar (segments) | Columnar (segments) |
| Real-time ingestion | Kafka engine, async insert | Kafka supervisor (native) | Kafka consumer (native) |
| Used by | Cloudflare, GitLab, eBay, Uber | Airbnb, Netflix, Confluent, Salesforce | LinkedIn, Uber, Stripe, Microsoft |
| License | Apache 2.0 | Apache 2.0 | Apache 2.0 |
| Managed offerings | ClickHouse Cloud | Imply Cloud | StarTree Cloud |
Architecture Comparison
ClickHouse Architecture
ClickHouse uses a shared-nothing architecture. Every node is equal — each stores data and serves queries. Coordination is minimal.
Key characteristics:
- Every node does everything (ingest + query + storage)
- Sharding by hash or range; replication via Replicated*MergeTree
- ClickHouse Keeper (built-in Raft) replaces ZooKeeper for coordination
- No separation of compute and storage (disk-local)
Druid Architecture
Druid separates concerns into specialized node types:
Key characteristics:
- Middle Managers handle real-time ingestion
- Historicals serve completed (immutable) segments from deep storage
- Broker routes queries to the appropriate nodes
- Deep storage (S3/HDFS) decouples storage from compute
Pinot Architecture
Pinot's architecture is similar to Druid but designed for LinkedIn's scale:
Key characteristics:
- Realtime servers consume from Kafka and serve in-progress segments
- Offline servers serve completed, immutable segments
- Controller manages the "ideal state" — which segments each server should host
- Apache Helix for cluster management (or ZooKeeper directly)
Ingestion Comparison
| Feature | ClickHouse | Druid | Pinot |
|---|---|---|---|
| Kafka native | Yes (Kafka engine table) | Yes (Kafka supervisor) | Yes (realtime table) |
| Batch ingestion | INSERT, S3, local files | HDFS/S3 batch indexing tasks | Batch ingestion from HDFS/S3 |
| Ingestion transforms | Materialized views at ingest | Ingestion spec (JSON-based) | Transform functions |
| Schema evolution | ALTER TABLE (limited) | Auto-discover new dimensions | Schema config changes |
| Exactly-once | At-least-once (dedup via ReplacingMergeTree) | At-least-once (dedup via hash) | At-least-once (upsert mode) |
| Upsert support | ReplacingMergeTree (eventual) | Not supported | Native upsert tables |
| Ingestion rate | 1-2M events/s/node | 100-500K events/s/node | 200K-1M events/s/node |
ClickHouse Kafka Ingestion
-- Direct Kafka consumption
CREATE TABLE events_kafka (
event_id String,
user_id UInt64,
event_type LowCardinality(String),
timestamp DateTime64(3)
) ENGINE = Kafka
SETTINGS
kafka_broker_list = 'broker1:9092,broker2:9092',
kafka_topic_list = 'events',
kafka_group_name = 'ch_analytics',
kafka_format = 'JSONEachRow',
kafka_num_consumers = 8,
kafka_max_block_size = 65536;
-- Materialized view triggers on every Kafka batch
CREATE MATERIALIZED VIEW events_mv TO events_final AS
SELECT * FROM events_kafka;Druid Kafka Ingestion
{
"type": "kafka",
"spec": {
"dataSchema": {
"dataSource": "events",
"timestampSpec": {
"column": "timestamp",
"format": "iso"
},
"dimensionsSpec": {
"dimensions": ["user_id", "event_type", "page"]
},
"metricsSpec": [
{ "type": "count", "name": "event_count" },
{ "type": "longSum", "name": "total_amount", "fieldName": "amount" }
],
"granularitySpec": {
"segmentGranularity": "HOUR",
"queryGranularity": "MINUTE"
}
},
"ioConfig": {
"consumerProperties": {
"bootstrap.servers": "broker1:9092,broker2:9092"
},
"topic": "events",
"useEarliestOffset": true
}
}
}Pinot Realtime Table
{
"tableName": "events_REALTIME",
"tableType": "REALTIME",
"segmentsConfig": {
"timeColumnName": "timestamp",
"schemaName": "events",
"replicasPerPartition": "2",
"retentionTimeUnit": "DAYS",
"retentionTimeValue": "30"
},
"ingestionConfig": {
"streamIngestionConfig": {
"streamConfigMaps": [{
"stream.kafka.broker.list": "broker1:9092,broker2:9092",
"stream.kafka.topic.name": "events",
"stream.kafka.consumer.type": "lowlevel",
"stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder"
}]
}
}
}Query Performance Comparison
Benchmark: 1 Billion Events, 30 Days
| Query Type | ClickHouse | Druid | Pinot |
|---|---|---|---|
Simple count (SELECT count(*) WHERE date = today) | 15 ms | 50 ms | 45 ms |
| Group by + count (top 10 pages) | 35 ms | 80 ms | 70 ms |
| Time-series (hourly counts, last 24h) | 25 ms | 40 ms | 35 ms |
| High-cardinality group by (per user_id) | 200 ms | 800 ms | 500 ms |
| Complex join (events JOIN users) | 300 ms | Not supported | Limited |
| Approximate distinct (HLL uniques) | 20 ms | 30 ms | 25 ms |
| Percentile (P95 latency) | 50 ms | 120 ms (approx) | 80 ms (approx) |
| Concurrent queries (100 parallel) | Good | Good | Excellent |
WARNING
These benchmarks are illustrative and based on published community benchmarks. Real performance depends heavily on schema design, hardware, data distribution, and query patterns. Always benchmark with your own data.
Query Language Comparison
-- ClickHouse: full SQL with extensions
SELECT
toStartOfHour(timestamp) AS hour,
page,
count() AS views,
uniqHLL12(user_id) AS unique_users,
quantile(0.95)(duration_ms) AS p95_duration
FROM events
WHERE timestamp >= now() - INTERVAL 24 HOUR
AND country IN ('US', 'GB', 'DE')
GROUP BY hour, page
HAVING views > 100
ORDER BY views DESC
LIMIT 20;
-- Druid: SQL (via Calcite) — similar but some differences
SELECT
TIME_FLOOR(__time, 'PT1H') AS "hour",
page,
COUNT(*) AS views,
APPROX_COUNT_DISTINCT_DS_HLL(user_id) AS unique_users,
APPROX_QUANTILE_DS(duration_ms, 0.95) AS p95_duration
FROM events
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '24' HOUR
AND country IN ('US', 'GB', 'DE')
GROUP BY 1, 2
HAVING COUNT(*) > 100
ORDER BY views DESC
LIMIT 20;
-- Pinot: SQL (multi-stage engine for joins)
SELECT
DATETIMECONVERT(timestamp, '1:MILLISECONDS:EPOCH',
'1:HOURS:EPOCH', '1:HOURS') AS hour,
page,
COUNT(*) AS views,
DISTINCTCOUNTHLL(user_id) AS unique_users,
PERCENTILEEST(duration_ms, 95) AS p95_duration
FROM events
WHERE timestamp >= ago('PT24H')
AND country IN ('US', 'GB', 'DE')
GROUP BY hour, page
HAVING COUNT(*) > 100
ORDER BY views DESC
LIMIT 20Feature Comparison Matrix
| Feature | ClickHouse | Druid | Pinot |
|---|---|---|---|
| SQL support | Near-complete | Good (via Calcite) | Good (multi-stage engine) |
| JOINs | Full support (distributed) | Limited (lookup joins) | Multi-stage joins (newer) |
| Subqueries | Full support | Limited | Growing support |
| Window functions | Full support | Limited | Limited |
| UDFs | Yes (C++, SQL) | No | Yes (Groovy, limited) |
| Materialized views | Yes (powerful, at ingest) | No (pre-aggregation at ingest) | No |
| Pre-aggregation | Via AggregatingMergeTree | At ingestion (rollup) | Star-tree index |
| Tiered storage | Yes (hot/warm/cold) | Deep storage (S3) + cache | Deep storage + tiering |
| Multi-tenancy | Basic (quotas, profiles) | Limited | Good (quota, isolation) |
| Upsert | ReplacingMergeTree | No | Yes (native upserts) |
| Geospatial | Basic (H3, geo functions) | Limited | Basic |
| Text search | Experimental (FTS) | No | Text index (Lucene-based) |
| Exact dedup | ReplacingMergeTree | No | Dedup at ingestion |
Operational Complexity
| Dimension | ClickHouse | Druid | Pinot |
|---|---|---|---|
| Component count | 1 binary (+ Keeper) | 6+ node types | 4+ node types |
| ZooKeeper dependency | ClickHouse Keeper (built-in) or ZK | Required | Required (or Helix) |
| Minimum production cluster | 3 nodes | 8-10 nodes | 6-8 nodes |
| Configuration complexity | Medium (SQL DDL) | High (JSON specs) | High (JSON configs) |
| Upgrade process | Rolling restart | Node-by-node, complex | Node-by-node |
| Monitoring | system.* tables, Prometheus | Metrics endpoint | Metrics endpoint |
| Learning curve | Low (SQL) | High (Druid-specific) | Medium |
| Community size | Very large | Large | Large |
| Documentation | Excellent | Good | Good |
TIP
ClickHouse is dramatically simpler to operate. One binary, SQL configuration, and a built-in coordination layer. Druid and Pinot require multiple node types, external dependencies, and JSON-based configuration. If operational simplicity is a priority, ClickHouse has a significant advantage.
When to Use Which
Choose ClickHouse When
- You want SQL-first analytics with full JOIN, subquery, and window function support
- Your team values operational simplicity (single binary, familiar SQL)
- You need materialized views for complex aggregation pipelines
- Ad-hoc queries are common (data exploration, not just dashboards)
- Log analytics and observability (Grafana + ClickHouse is a popular stack)
- You are replacing Elasticsearch for log search or analytics
Choose Druid When
- Your workload is time-series heavy with pre-defined dimensions
- You need deep storage (S3/HDFS) with automatic tiering
- You are building real-time dashboards with Superset or Turnilo
- Event data has consistent schema and does not need complex JOINs
- You have the operational capacity to run a multi-service architecture
- You need high concurrent query throughput for many dashboard users
Choose Pinot When
- You are building user-facing analytics (each user queries their own data)
- Upsert support is required (e.g., updating order status, user profiles)
- Multi-tenant workloads with per-tenant query isolation
- You need to serve thousands of concurrent queries with guaranteed latency SLAs
- LinkedIn-scale requirements (Pinot was built for this exact use case)
- You need text search alongside analytics (Lucene-based text index)
Decision Matrix
Cost Comparison
| Factor | ClickHouse | Druid | Pinot |
|---|---|---|---|
| Compute (3-node min) | 3 x c6i.4xlarge | 8-10 mixed instances | 6-8 mixed instances |
| Storage | Local SSD (EBS) | S3 (deep) + SSD (cache) | S3 (deep) + SSD (cache) |
| Managed service | ClickHouse Cloud (~$0.30/GB) | Imply Cloud (~$0.50/GB) | StarTree (~$0.45/GB) |
| External dependencies | Keeper (built-in) | ZooKeeper + MySQL + S3 | ZooKeeper + S3 |
| Human ops cost | Low | High | Medium |
TIP
For teams under 50 engineers, ClickHouse is usually the best choice due to lower operational burden. Druid and Pinot shine at LinkedIn/Airbnb scale where dedicated platform teams can manage the complexity and the workload demands their specific strengths (deep storage tiering, user-facing analytics at extreme concurrency).
Migration Patterns
Moving Between Engines
| From | To | Strategy |
|---|---|---|
| Elasticsearch | ClickHouse | Export via Logstash/Kafka, recreate as ClickHouse tables |
| BigQuery | ClickHouse | Export to GCS, load with s3() table function |
| Druid | ClickHouse | Export segments to S3, parse and load |
| Druid | Pinot | Repoint Kafka consumers, parallel run, switch broker |
| ClickHouse | Pinot | Dual-write via Kafka during migration window |
Key Takeaways
- ClickHouse for SQL-first simplicity — best SQL support, easiest to operate, excellent for ad-hoc analytics and log search
- Druid for time-series dashboards — mature deep storage tiering, strong Kafka integration, battle-tested at Netflix and Airbnb
- Pinot for user-facing analytics — built for LinkedIn-scale multi-tenancy, native upserts, extreme concurrent query throughput
- All three are excellent — the "wrong" choice among these three is still much better than using PostgreSQL or Elasticsearch for OLAP workloads
- Operational cost matters — ClickHouse (1 binary) vs Druid/Pinot (6-10 components) is a significant operational difference
- Start simple, migrate if needed — ClickHouse is the easiest starting point; migrate to Druid or Pinot only when you hit their specific strengths
Key Takeaway
- ClickHouse is the SQL-first, operationally simplest choice (single binary); Druid excels at time-series dashboards with deep storage tiering; Pinot is built for LinkedIn-scale user-facing analytics with extreme concurrency.
- All three ingest millions of events per second and serve analytical queries in milliseconds, but through fundamentally different architectures.
- Operational cost (ClickHouse: 1 binary vs Druid/Pinot: 6-10 components) is often the deciding factor for teams without dedicated infrastructure engineers.
Exercise
Choose an OLAP Engine for These Scenarios
For each scenario, recommend ClickHouse, Druid, or Pinot, and explain why:
- A startup with 3 engineers needs sub-second dashboards over 10 TB of event data
- An ad-tech company needs to serve 50,000 concurrent analytical queries per second to end users
- A financial services company needs time-series analytics on market data with 6 months of hot data and 5 years of cold data
- A gaming company needs real-time leaderboards with upserts (player scores change constantly)
- An observability platform ingesting 5M log events/second that engineers query ad-hoc via SQL
Solution
ClickHouse. Simplest operations (single binary), best SQL support, and 3 engineers cannot afford to manage Druid/Pinot's multi-component architecture. 10 TB is well within ClickHouse's sweet spot.
Apache Pinot. Built for exactly this use case (LinkedIn's "who viewed your profile"). Native multi-tenancy, extreme concurrent query throughput, and query isolation between tenants. ClickHouse and Druid struggle with 50K+ QPS.
Apache Druid. Mature deep storage tiering (hot segments on SSD, cold on S3) is purpose-built for this. Druid's segment-based architecture makes tiering natural. ClickHouse can do this with tiered storage but it is less mature.
Apache Pinot. Native upsert support (MutableSegment) enables real-time score updates without rewriting entire segments. ClickHouse ReplacingMergeTree can approximate this but deduplication is eventual, not immediate.
ClickHouse. Best ad-hoc SQL support, fastest for exploratory queries by engineers. Log data is a natural fit for ClickHouse's MergeTree engine. Grafana and other tools have native ClickHouse integrations for observability.
Common Misconceptions
- "ClickHouse cannot handle real-time data." ClickHouse supports Kafka ingestion and MaterializedView for continuous aggregation. It is not streaming-native like Druid/Pinot, but it handles real-time well for most use cases.
- "Druid and Pinot are interchangeable." Druid is optimized for time-series analytics with deep storage; Pinot is optimized for user-facing analytics with multi-tenancy. They have different architectural strengths.
- "These OLAP engines replace a data warehouse." OLAP engines are optimized for specific query patterns (aggregations, time-series, high concurrency). They are not general-purpose warehouses -- they complement Snowflake/BigQuery, not replace them.
- "More nodes always means faster queries." These engines are designed for high-throughput on moderate cluster sizes. Poorly designed schemas, missing pre-aggregations, or wrong sort orders cause more performance issues than insufficient hardware.
- "You need all your data in the OLAP engine." Use these engines for hot data that needs sub-second query latency. Cold/archival data should stay in cheaper storage (S3, data lake) and be queried with Spark/Trino when needed.
In Production
- Uber uses Apache Pinot for their real-time analytics dashboard, processing millions of trip events per second and serving hundreds of thousands of concurrent dashboard queries for operations teams.
- Netflix uses Apache Druid for their streaming quality analytics, ingesting billions of playback events and serving real-time quality dashboards for content and infrastructure teams.
- Cloudflare uses ClickHouse to power their analytics products (Web Analytics, DNS Analytics), processing millions of queries per second with sub-100ms latency across petabytes of data.
- LinkedIn created Apache Pinot specifically for their "who viewed your profile" feature, serving 200K+ QPS with sub-millisecond latencies across billions of records.
- Spotify uses ClickHouse for their internal analytics platform, providing ad-hoc SQL access to hundreds of billions of streaming events for data scientists and engineers.
Quiz
1. What is the key architectural difference between ClickHouse and Druid/Pinot?
A) ClickHouse is open source; Druid and Pinot are not B) ClickHouse is a single-binary SQL database; Druid and Pinot are multi-component distributed systems with separate processes for ingestion, query serving, and coordination C) ClickHouse is faster for all workloads D) Druid and Pinot do not support SQL
Answer
B) ClickHouse is a single binary that handles everything. Druid has 6+ components (broker, coordinator, historical, middleManager, overlord, router) and Pinot has similar complexity. This makes ClickHouse significantly simpler to operate but potentially less flexible for specialized architectures.
2. When is Apache Pinot the strongest choice?
A) For ad-hoc SQL analytics by data engineers B) For user-facing analytics requiring extreme concurrent query throughput (50K+ QPS) and multi-tenancy C) For small datasets under 100 GB D) For log search and observability
Answer
B) Pinot was built at LinkedIn for exactly this: serving analytical queries to millions of end users simultaneously. Its architecture includes native multi-tenancy, query isolation, and optimizations for high-concurrency, low-latency query patterns.
3. What is "deep storage tiering" in Apache Druid?
A) Storing data in deep neural networks B) Automatically moving older data segments from fast local storage to cheaper object storage (S3/GCS) while keeping them queryable C) Compressing data more aggressively D) Encrypting data at rest
Answer
B) Druid separates hot segments (on local SSD for fast queries) from cold segments (on S3/GCS for cost-effective storage). Queries transparently read from both tiers. This makes Druid cost-effective for long-retention time-series data.
4. Why might ClickHouse be preferred for an observability platform?
A) It has the best visualization tools B) It offers the best SQL support for ad-hoc queries, fastest single-query performance, and simplest operations -- all critical for engineers investigating production issues C) It is the cheapest option D) It has the best alerting features
Answer
B) Engineers investigating production issues need fast, flexible SQL queries over log/metrics data. ClickHouse's full SQL support, fastest single-query latency, and single-binary operational simplicity make it ideal. Grafana, Kibana, and other observability tools have native ClickHouse integrations.
5. What is the relationship between these OLAP engines and a data warehouse like Snowflake?
A) They replace data warehouses entirely B) They complement data warehouses -- OLAP engines handle sub-second real-time queries while warehouses handle complex batch analytics, joins, and ad-hoc exploration C) Data warehouses are always faster D) They store the same data in the same format
Answer
B) OLAP engines are optimized for specific patterns (time-series, high-concurrency aggregations) with sub-second latency. Data warehouses handle complex multi-table joins, ad-hoc exploration, and batch analytics. Most production architectures use both: warehouse for deep analytics, OLAP engine for real-time dashboards.
:::
One-Liner Summary: ClickHouse for SQL simplicity and single-binary ops, Druid for time-series with deep storage, Pinot for user-facing analytics at LinkedIn scale -- all three are excellent, so choose based on your operational capacity and query pattern.