Real-Time Log Aggregator
A high-throughput log aggregation pipeline processing 500K events/sec with sub-second search latency.
Overview
A distributed log aggregation system, built primarily in Rust, that ingests, transforms, and indexes log events at scale. It was designed to replace a legacy ELK stack that could no longer keep up with growing traffic.
Problem
The existing logging infrastructure had several pain points:
- Ingestion lag — during traffic spikes, Logstash would fall behind by minutes, making real-time debugging impossible.
- High resource usage — the JVM-based pipeline consumed 64 GB of RAM per node.
- Schema drift — unstructured logs made it hard to build reliable dashboards.
Solution
A purpose-built pipeline with three stages:
- Collector — lightweight Rust agent (< 10 MB memory) that tails log files, parses structured fields, and publishes to Kafka.
- Transformer — stream processing layer that enriches events (geo-IP, user lookup), normalizes schemas, and routes to appropriate indices.
- Query API — TypeScript/Node.js service with a search UI supporting full-text search, field filtering, and time-range queries.
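To illustrate the collector's parsing stage, here is a minimal sketch of structured-field extraction from a logfmt-style line. The function name, the logfmt input format, and the simplified quoting rules are assumptions for illustration, not the production parser:

```rust
use std::collections::HashMap;

/// Parse a logfmt-style line (`key=value key2="quoted value"`) into fields.
/// Simplified sketch: a real collector would also handle escapes,
/// bare words without `=`, and malformed input.
fn parse_fields(line: &str) -> HashMap<String, String> {
    let mut fields = HashMap::new();
    let mut rest = line.trim();
    while let Some(eq) = rest.find('=') {
        let key = rest[..eq].trim().to_string();
        rest = &rest[eq + 1..];
        let value = if rest.starts_with('"') {
            // Quoted value: take everything up to the closing quote.
            match rest[1..].find('"') {
                Some(end) => {
                    let v = rest[1..1 + end].to_string();
                    rest = rest.get(end + 2..).unwrap_or("");
                    v
                }
                None => {
                    let v = rest[1..].to_string();
                    rest = "";
                    v
                }
            }
        } else {
            // Bare value: take everything up to the next space.
            match rest.find(' ') {
                Some(end) => {
                    let v = rest[..end].to_string();
                    rest = &rest[end + 1..];
                    v
                }
                None => {
                    let v = rest.to_string();
                    rest = "";
                    v
                }
            }
        };
        fields.insert(key, value);
    }
    fields
}

fn main() {
    let line = r#"level=info msg="user login" user_id=42"#;
    let fields = parse_fields(line);
    assert_eq!(fields["level"], "info");
    assert_eq!(fields["msg"], "user login");
    assert_eq!(fields["user_id"], "42");
}
```

Parsed fields like these are what the collector publishes to Kafka, so the transformer stage downstream never has to guess at log structure.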
Architecture
- Kafka provides durable, partitioned message transport
- Elasticsearch handles indexing and search
- The Rust collector uses zero-copy parsing for minimal overhead
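A minimal sketch of what zero-copy parsing means here: the parser returns `&str` slices that borrow from the input buffer rather than allocating owned strings per field. The three-part record layout is assumed for illustration:

```rust
/// Split a record into (timestamp, level, message) without copying:
/// the returned slices borrow from `line`, so no per-field allocation
/// occurs. Illustrative sketch, not the actual collector code.
fn split_record(line: &str) -> Option<(&str, &str, &str)> {
    // Assumed record layout: "<timestamp> <level> <message>"
    let mut parts = line.splitn(3, ' ');
    Some((parts.next()?, parts.next()?, parts.next()?))
}

fn main() {
    let buf = String::from("2024-05-01T12:00:00Z INFO connection accepted");
    let (ts, level, msg) = split_record(&buf).unwrap();
    // ts/level/msg are views into `buf`; nothing was copied.
    assert_eq!(ts, "2024-05-01T12:00:00Z");
    assert_eq!(level, "INFO");
    assert_eq!(msg, "connection accepted");
}
```

At hundreds of thousands of events per second, avoiding a heap allocation per field is where much of the collector's headroom over a JVM pipeline comes from.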
Key Learnings
- Rust's ownership model makes concurrent I/O safe and fast — zero data races in production over 18 months.
- Schema enforcement early in the pipeline prevents downstream chaos. We added a schema registry backed by Protobuf definitions.
- Sampling beats dropping. When overloaded, keeping a uniform 10% sample of debug logs is better than dropping them all — you retain statistical visibility into what the system was doing.
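The sampling idea above can be sketched as deterministic hash-based sampling: hash a stable event attribute and keep the event when the hash falls in the sample band. The `request_id` key and per-mille knob are hypothetical, not the production interface:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Keep roughly `sample_per_mille`/1000 of debug events, keyed on a
/// stable attribute so the decision is consistent for a given event
/// across collector replicas and restarts.
fn keep_debug_event(request_id: &str, sample_per_mille: u64) -> bool {
    let mut h = DefaultHasher::new();
    request_id.hash(&mut h);
    h.finish() % 1000 < sample_per_mille
}

fn main() {
    // With a uniform hash, roughly 10% of events survive a 100‰ sample.
    let kept = (0..10_000)
        .filter(|i| keep_debug_event(&format!("req-{i}"), 100))
        .count();
    println!("kept {kept} of 10000 debug events");
    assert!(kept > 700 && kept < 1300);
}
```

Hashing (rather than a per-node random draw) means all replicas make the same keep/drop decision for the same request, so a sampled request's debug logs survive end to end.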
Tech Stack
- Rust 1.75 (tokio async runtime)
- Apache Kafka 3.6
- Elasticsearch 8
- TypeScript, React (query dashboard)
- Docker, Kubernetes
- Prometheus, Grafana for monitoring