Ultra-Low Latency HFT Simulator

A production-grade High-Frequency Trading simulator built in modern C++20

1.6 μs Tick-to-Trade p50
21 ns Risk Check p99
656K/sec Book Updates
0 Hot-Path Allocations

About This Project

This is a production-grade High-Frequency Trading simulator built from scratch in modern C++20. It demonstrates the full pipeline of a real HFT system: market data ingestion, order book reconstruction, strategy signal generation, risk management, and order execution, all optimized for low-microsecond tick-to-trade latency.

C++20 · Lock-Free · SPSC Queues · Zero Allocation · Linux / POSIX · CMake · Google Test · Google Benchmark · Cache-Aligned · Fixed-Point Math

Benchmark Machine

Benchmarked on an AMD Ryzen 9 7900X (Zen 4, 12C/24T, 5.73 GHz boost) with 64 GB DDR5 RAM, running Ubuntu 22.04 LTS and compiled with GCC 11.4 (-O3 -march=native -flto). See the Performance page for full details.

Highlights

  • Six-stage pipeline connected by lock-free SPSC ring buffers, each pinnable to a dedicated CPU core
  • Zero heap allocations on the hot path: all memory is pre-allocated at startup via custom pool allocators
  • Tick-to-trade latency of 1.6 μs at p50, with pre-trade risk checks at 21 ns (p99)
  • Three trading strategies: market making with inventory skew, pairs trading with z-score signals, and momentum with EMA crossover
  • Smart order routing across multiple simulated exchanges with configurable latency profiles
  • 18 passing tests (unit + integration) and 10 benchmark suites
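
As an illustration of the first highlight, here is a minimal sketch of a single-producer/single-consumer ring buffer of the kind that could connect two pipeline stages. The class name SpscRing, the 64-byte cache-line size, and the power-of-two capacity are assumptions made for this example; the simulator's own queue may differ.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>

// Hypothetical SPSC ring buffer; name and interface are illustrative only.
// T must be default-constructible and copyable.
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer thread only. Returns false when the ring is full.
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;
        slots_[head & (Capacity - 1)] = value;
        // Release: the slot write becomes visible before the new head.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Consumer thread only. Returns std::nullopt when the ring is empty.
    std::optional<T> try_pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return std::nullopt;
        T value = slots_[tail & (Capacity - 1)];
        // Release: the slot read completes before the slot is handed back.
        tail_.store(tail + 1, std::memory_order_release);
        return value;
    }

private:
    // Separate cache lines (assumed 64 bytes) to avoid false sharing.
    alignas(64) std::atomic<std::size_t> head_{0};  // next slot to write
    alignas(64) std::atomic<std::size_t> tail_{0};  // next slot to read
    alignas(64) std::array<T, Capacity> slots_{};
};
```

Because the storage is a fixed std::array, pushing and popping never touch the heap, which is consistent with the zero-allocation goal stated above.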

System at a Glance

Market Data Handler

Zero-copy FIX protocol parser (~700 ns/msg). Feed simulator with random-walk pricing and CSV replay.
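
A rough sketch of what "zero-copy" can mean for FIX tag=value parsing: field values are returned as std::string_view slices of the receive buffer rather than copied into new strings. The function name find_fix_field is hypothetical, and this linear scan is only an illustration, not the parser measured above.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

constexpr char kSoh = '\x01';  // standard FIX field delimiter (SOH)

// Hypothetical helper: returns the value of the first field with the given
// tag. The view points into `msg`, so the buffer must outlive the result.
inline std::optional<std::string_view> find_fix_field(std::string_view msg,
                                                      int wanted_tag) {
    while (!msg.empty()) {
        const std::size_t end = msg.find(kSoh);
        const std::string_view field = msg.substr(0, end);
        const std::size_t eq = field.find('=');
        if (eq != std::string_view::npos) {
            int tag = 0;
            const char* first = field.data();
            const char* last = first + eq;
            // from_chars parses digits in place, with no allocation or locale.
            const auto [ptr, ec] = std::from_chars(first, last, tag);
            if (ec == std::errc{} && ptr == last && tag == wanted_tag) {
                return field.substr(eq + 1);
            }
        }
        if (end == std::string_view::npos) break;  // last field, no delimiter
        msg.remove_prefix(end + 1);
    }
    return std::nullopt;
}
```

For a new-order message, find_fix_field(msg, 55) would yield the symbol and find_fix_field(msg, 44) the price text, both still backed by the original buffer.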

Order Book Engine

Price-time priority matching. O(1) cancel via intrusive linked lists. Supports Limit, Market, IOC, FOK orders.
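
The O(1) cancel can be sketched as follows, assuming each order embeds its own prev/next links and that the order is located in constant time elsewhere (for example through a pre-allocated table indexed by order ID). The Order and PriceLevel types are hypothetical stand-ins, not the engine's real structures.

```cpp
#include <cstdint>

// Hypothetical order node: the list links live inside the order itself,
// so unlinking needs no search and no allocation.
struct Order {
    std::uint64_t id = 0;
    std::int64_t  price_ticks = 0;  // fixed-point price
    std::uint32_t quantity = 0;
    Order* prev = nullptr;
    Order* next = nullptr;
};

// FIFO queue of resting orders at one price (time priority within a level).
struct PriceLevel {
    Order* head = nullptr;  // oldest order, matched first
    Order* tail = nullptr;  // newest order
    std::uint64_t total_quantity = 0;

    void push_back(Order& o) {
        o.prev = tail;
        o.next = nullptr;
        (tail ? tail->next : head) = &o;
        tail = &o;
        total_quantity += o.quantity;
    }

    // O(1): only the neighbours' pointers are touched.
    // Precondition: `o` is currently linked into this level.
    void cancel(Order& o) {
        (o.prev ? o.prev->next : head) = o.next;
        (o.next ? o.next->prev : tail) = o.prev;
        o.prev = nullptr;
        o.next = nullptr;
        total_quantity -= o.quantity;
    }
};
```

The trade-off of the intrusive layout is that an order can belong to only one list at a time, which fits a resting order that sits at exactly one price level.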

Strategy Engine

Market making, pairs trading, and momentum strategies. Pre-allocated order buffers returned via std::span.

Risk Manager

Six pre-trade checks in ~20 ns, including a kill switch, position limits, capital limits, rate limiting, and fat finger detection.
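
A sketch of how the named checks could be chained as cheap integer comparisons. It covers only the five checks listed above; the field names, limit semantics, and check order are all assumptions for this example rather than the Risk Manager's actual logic.

```cpp
#include <atomic>
#include <cstdint>
#include <cstdlib>

enum class RiskResult : std::uint8_t {
    Ok, KillSwitch, PositionLimit, CapitalLimit, RateLimit, FatFinger
};

struct RiskLimits {                           // hypothetical limits
    std::int64_t  max_position;               // absolute net quantity
    std::int64_t  max_notional;               // in ticks * quantity
    std::uint32_t max_orders_per_window;
    std::int64_t  max_price_deviation_ticks;  // from a reference price
};

struct RiskState {                            // hypothetical running state
    std::atomic<bool> kill_switch{false};
    std::int64_t  position = 0;
    std::int64_t  notional_used = 0;
    std::uint32_t orders_in_window = 0;
};

// signed_qty is positive for buys and negative for sells.
// Integer-only comparisons; no allocation, locking, or I/O.
inline RiskResult check_order(const RiskLimits& lim, const RiskState& st,
                              std::int64_t price_ticks, std::int64_t signed_qty,
                              std::int64_t reference_ticks) {
    if (st.kill_switch.load(std::memory_order_relaxed))
        return RiskResult::KillSwitch;
    if (std::abs(st.position + signed_qty) > lim.max_position)
        return RiskResult::PositionLimit;
    if (st.notional_used + price_ticks * std::abs(signed_qty) > lim.max_notional)
        return RiskResult::CapitalLimit;
    if (st.orders_in_window >= lim.max_orders_per_window)
        return RiskResult::RateLimit;
    if (std::abs(price_ticks - reference_ticks) > lim.max_price_deviation_ticks)
        return RiskResult::FatFinger;
    return RiskResult::Ok;
}
```

Returning an enum instead of throwing keeps the rejection path as cheap as the acceptance path, which matters when the whole check budget is tens of nanoseconds.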

Execution Engine

Smart order routing across multiple exchanges. Token bucket rate limiting. Full order state machine.
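
Token bucket rate limiting can be sketched with integer nanosecond timestamps, as below. The TokenBucket class, its parameters, and the caller-supplied monotonic clock are assumptions for illustration; the Execution Engine's real limiter may be structured differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical token bucket: up to `capacity` orders may burst, then one
// more is allowed every `ns_per_token` nanoseconds.
class TokenBucket {
public:
    TokenBucket(std::uint64_t capacity, std::uint64_t ns_per_token)
        : capacity_(capacity), ns_per_token_(ns_per_token), tokens_(capacity) {
        assert(ns_per_token_ > 0);
    }

    // Returns true and consumes one token if an order may be sent now.
    // `now_ns` must come from a monotonic clock.
    bool try_acquire(std::uint64_t now_ns) {
        refill(now_ns);
        if (tokens_ == 0) return false;
        --tokens_;
        return true;
    }

private:
    void refill(std::uint64_t now_ns) {
        assert(now_ns >= last_refill_ns_);
        const std::uint64_t earned = (now_ns - last_refill_ns_) / ns_per_token_;
        if (earned == 0) return;
        tokens_ = std::min(capacity_, tokens_ + earned);
        // Advance by whole tokens only, so partial progress is not lost.
        last_refill_ns_ += earned * ns_per_token_;
    }

    std::uint64_t capacity_;
    std::uint64_t ns_per_token_;
    std::uint64_t tokens_;
    std::uint64_t last_refill_ns_ = 0;
};
```

Passing the timestamp in, rather than reading a clock inside the class, keeps the limiter deterministic under replay and easy to unit test.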

Performance Monitor

Latency histograms (p50 through p99.9). Throughput counters. Log-scale visualization.