Real-Time Log Aggregator

A high-throughput log aggregation pipeline processing 500K events/sec with sub-second search latency.

Overview

A distributed log aggregation system built in Rust that ingests, transforms, and indexes log events at scale. Designed to replace a legacy ELK stack that couldn't keep up with growing traffic.

Problem

The existing logging infrastructure had several pain points:

  • Ingestion lag — during traffic spikes, Logstash would fall behind by minutes, making real-time debugging impossible.
  • High resource usage — the JVM-based pipeline consumed 64 GB of RAM per node.
  • Schema drift — unstructured logs made it hard to build reliable dashboards.

Solution

A purpose-built pipeline with three stages:

  1. Collector — lightweight Rust agent (< 10 MB memory) that tails log files, parses structured fields, and publishes to Kafka.
  2. Transformer — stream processing layer that enriches events (geo-IP, user lookup), normalizes schemas, and routes to appropriate indices.
  3. Query API — TypeScript/Node.js service with a search UI supporting full-text search, field filtering, and time-range queries.
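The three stages above can be sketched as independent tasks connected by channels. This is a minimal illustration, not the production code: std `mpsc` channels stand in for the Kafka topics between stages, and the `transform` logic is a placeholder for the real enrichment step.

```rust
use std::sync::mpsc;
use std::thread;

// Stage 2 logic: enrich/normalize one raw event. In production this is
// where geo-IP and user lookups would run; here it just wraps the event.
fn transform(raw: &str) -> String {
    format!("{{\"normalized\":true,\"raw\":\"{raw}\"}}")
}

fn main() {
    // std mpsc channels stand in for the Kafka topics between stages.
    let (collector_tx, transformer_rx) = mpsc::channel::<String>();
    let (transformer_tx, indexer_rx) = mpsc::channel::<String>();

    let transformer = thread::spawn(move || {
        for raw in transformer_rx {
            transformer_tx.send(transform(&raw)).unwrap();
        }
    });

    // Stage 1: collector -- in production it tails files and publishes to Kafka.
    for line in ["level=info msg=start", "level=debug msg=tick"] {
        collector_tx.send(line.to_string()).unwrap();
    }
    drop(collector_tx); // closing the channel lets the transformer thread exit

    transformer.join().unwrap();

    // Stage 3: indexer -- in production this bulk-writes to Elasticsearch.
    for doc in indexer_rx {
        println!("index: {doc}");
    }
}
```

Decoupling the stages through a durable queue is what lets each one scale and fail independently; the channel version only models the data flow.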

Architecture

  • Kafka provides durable, partitioned message transport
  • Elasticsearch handles indexing and search
  • The Rust collector uses zero-copy parsing for minimal overhead
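The zero-copy idea in the last bullet can be shown with lifetimes: the parsed event borrows slices of the raw read buffer instead of allocating owned `String`s. The `"<timestamp> <level> <message>"` line shape here is an assumption for illustration, not the collector's actual format.

```rust
/// An event whose fields borrow from the raw read buffer -- no
/// per-field allocation, which is the core of zero-copy parsing.
struct Event<'a> {
    timestamp: &'a str,
    level: &'a str,
    message: &'a str,
}

/// Assumed line shape: "<timestamp> <level> <message>" (illustrative only).
fn parse(raw: &str) -> Option<Event<'_>> {
    let mut parts = raw.splitn(3, ' ');
    Some(Event {
        timestamp: parts.next()?,
        level: parts.next()?,
        message: parts.next()?,
    })
}

fn main() {
    let buf = "2024-01-01T00:00:00Z info server started";
    let ev = parse(buf).expect("well-formed line");
    println!("{} [{}] {}", ev.timestamp, ev.level, ev.message);
}
```

The borrow checker guarantees the buffer outlives every `Event` sliced from it, so this pattern is safe without copies or reference counting.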

Key Learnings

  1. Rust's ownership model makes concurrent I/O safe and fast — zero data races in production over 18 months.
  2. Schema enforcement early in the pipeline prevents downstream chaos. We added a schema registry backed by Protobuf definitions.
  3. Sampling beats dropping. When overloaded, sampling 10% of debug logs is better than dropping them entirely — you keep statistical visibility.
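Learning 3 can be sketched with hash-based sampling. Hashing a stable event ID (the ID scheme and the 10% figure are illustrative assumptions) makes the keep/drop decision deterministic, so the same event gets the same answer on retry:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Keep roughly `percent`% of events by hashing a stable event ID.
/// Deterministic: the same ID always maps to the same decision.
fn keep_sampled(event_id: &str, percent: u64) -> bool {
    let mut hasher = DefaultHasher::new();
    event_id.hash(&mut hasher);
    hasher.finish() % 100 < percent
}

fn main() {
    let kept = (0..10_000)
        .filter(|i| keep_sampled(&format!("evt-{i}"), 10))
        .count();
    // Roughly 10% of debug events survive, preserving statistical visibility.
    println!("kept {kept} of 10000 debug events");
}
```

Because the sample is unbiased, counts over the kept events can be scaled back up (multiply by 10 at a 10% rate) to estimate true volumes, which is exactly the visibility that outright dropping destroys.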

Tech Stack

  • Rust 1.75 (tokio async runtime)
  • Apache Kafka 3.6
  • Elasticsearch 8
  • TypeScript, React (query dashboard)
  • Docker, Kubernetes
  • Prometheus, Grafana for monitoring