Benchmarking

Hornbeam includes a benchmark suite for measuring performance and comparing against other servers like gunicorn.

Prerequisites

Install a benchmarking tool:

# macOS
brew install wrk

# Linux (Ubuntu/Debian)
sudo apt-get install wrk
# or
sudo apt-get install apache2-utils  # for ab

Quick Benchmark

Run the quick benchmark script:

./benchmarks/quick_bench.sh

This runs three test scenarios against the WSGI worker:

  1. Simple requests: 10,000 requests, 100 concurrent connections
  2. High concurrency: 5,000 requests, 500 concurrent connections
  3. Large response: 1,000 requests with 64KB response bodies, 50 concurrent connections

Example output (Apple M4 Pro, Python 3.13, OTP 28, February 2026):

=== Benchmark: Simple requests (10000 requests, 100 concurrent) ===
Requests per second:    66000.00 [#/sec] (mean)
Time per request:       1.52 [ms] (mean)
Failed requests:        0

=== Benchmark: High concurrency (5000 requests, 500 concurrent) ===
Requests per second:    71000.00 [#/sec] (mean)
Time per request:       7.04 [ms] (mean)
Failed requests:        0

=== Benchmark: Large response (1000 requests, 50 concurrent) ===
Requests per second:    58000.00 [#/sec] (mean)
Time per request:       0.86 [ms] (mean)
Failed requests:        0

Results Summary

Test                                 Requests/sec   Latency (mean)   Failed
Simple (100 concurrent)              66,000         1.52ms           0
High concurrency (500 concurrent)    71,000         7.04ms           0
Large response (64KB)                58,000         0.86ms           0
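
The latency column follows from the other two. The output format above matches ab, which reports mean time per request as the concurrency level divided by the throughput. A minimal Python sketch of that arithmetic, using the figures from the table (the helper name `mean_latency_ms` is illustrative and not part of the benchmark suite):

```python
def mean_latency_ms(concurrency: int, requests_per_sec: float) -> float:
    """Mean time per request in milliseconds, the way ab derives it."""
    return concurrency / requests_per_sec * 1000.0


# (concurrency, requests/sec) taken from the results table
scenarios = {
    "Simple": (100, 66_000),
    "High concurrency": (500, 71_000),
    "Large response": (50, 58_000),
}

for name, (concurrency, rps) in scenarios.items():
    print(f"{name}: {mean_latency_ms(concurrency, rps):.2f} ms")
```

This is why the high-concurrency run shows the highest latency despite the highest throughput: each request waits behind more in-flight requests.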

Note: These numbers reflect the 6-stage ASGI/WSGI optimizations in hornbeam 1.4.0 with erlang_python 1.8.0, including per-app execution mode caching, Erlang-native async timer support, and request-local queues via contextvars.

These numbers show that hornbeam maintains consistently high throughput even under heavy concurrency, thanks to Erlang’s lightweight process model.

ASGI Performance (1.4.0)

Hornbeam 1.4.0 includes Erlang-native async timer support, providing significant improvements for async applications:

Test                          Requests/sec   Description
Simple ASGI                   ~66,000        Basic async response
High concurrency (500 conn)   ~71,000        Concurrent connections
Async sleep (1ms)             ~8,600         With erlang_asyncio
Concurrent tasks              ~6,200         Multiple async operations

Erlang-Asyncio Optimization

The ASGI runner auto-detects asyncio.sleep() and uses Erlang’s native timer:

# This code automatically benefits from Erlang timers
import asyncio

async def handler():
    await asyncio.sleep(0.001)  # Uses _erlang_sleep internally

This provides an 86x improvement over standard asyncio timers for sleep operations.
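
To get a feel for the standard asyncio timer baseline on your own machine, a small script can time a series of 1 ms sleeps on the default event loop. This is only a sketch: it runs outside hornbeam, measures sequential sleeps rather than concurrent requests, and is not the benchmark behind the 86x figure. The function names are illustrative.

```python
import asyncio
import time


async def timed_sleeps(count: int, delay: float) -> float:
    """Await `count` sequential sleeps and return the elapsed wall time in seconds."""
    start = time.perf_counter()
    for _ in range(count):
        await asyncio.sleep(delay)
    return time.perf_counter() - start


def sleeps_per_second(count: int = 200, delay: float = 0.001) -> float:
    """Run the sleeps on a fresh default event loop and return sleeps per second."""
    elapsed = asyncio.run(timed_sleeps(count, delay))
    return count / elapsed


print(f"{sleeps_per_second():.0f} sleeps/sec with the default event loop")
```

The result depends heavily on the operating system's timer resolution, so compare runs only on the same machine.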

Comparison with Gunicorn

Direct comparison using an identical WSGI app (4 workers; gunicorn uses the gthread worker class with 4 threads):

Test                                 Hornbeam       Gunicorn gthread   Speedup
Simple (100 concurrent)              66,000 req/s   3,661 req/s        18.0x
High concurrency (500 concurrent)    71,000 req/s   3,631 req/s        19.6x
Large response (64KB)                58,000 req/s   3,599 req/s        16.1x
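
The speedup column is the ratio of the two throughput figures. A short Python sketch that reproduces it from the table values (the `speedup` helper is illustrative):

```python
def speedup(hornbeam_rps: float, gunicorn_rps: float) -> float:
    """How many times more requests per second hornbeam served than gunicorn."""
    return hornbeam_rps / gunicorn_rps


# (hornbeam req/s, gunicorn req/s) taken from the comparison table
results = {
    "Simple": (66_000, 3_661),
    "High concurrency": (71_000, 3_631),
    "Large response": (58_000, 3_599),
}

for name, (hornbeam, gunicorn) in results.items():
    print(f"{name}: {speedup(hornbeam, gunicorn):.1f}x")
```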

Latency Comparison

Test                                 Hornbeam   Gunicorn
Simple (100 concurrent)              1.52ms     27.3ms
High concurrency (500 concurrent)    7.04ms     137.7ms
Large response (64KB)                0.86ms     13.9ms

Why the Difference?

Gunicorn limitations:

  • GIL contention: Python’s Global Interpreter Lock limits true parallelism
  • Process model: Each worker is a separate OS process with overhead
  • Connection handling: Blocking I/O model limits concurrent connections

Hornbeam advantages:

  • BEAM scheduler: Millions of lightweight processes, no OS thread overhead
  • No GIL impact: Python runs on dirty schedulers, isolated from BEAM
  • Cowboy: Battle-tested HTTP server handling connections efficiently
  • Zero-copy ETS: Shared state without serialization overhead

Python Benchmark Script

For more control, use the Python benchmark script:

# WSGI benchmarks
python benchmarks/run_benchmark.py

# ASGI benchmarks
python benchmarks/run_benchmark.py --asgi

# Custom bind address
python benchmarks/run_benchmark.py --bind 0.0.0.0:9000

# Save results to JSON
python benchmarks/run_benchmark.py --output results.json

Comparing with Gunicorn

Compare hornbeam performance against gunicorn:

python benchmarks/compare_servers.py

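This runs identical benchmarks against both servers and prints a comparison in the following format (the figures below illustrate the layout and differ from the measurements reported above):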

======================================================================
COMPARISON: GUNICORN vs HORNBEAM
======================================================================
Test                 Gunicorn (req/s)     Hornbeam (req/s)     Diff
----------------------------------------------------------------------
simple               1842.3               12543.2              +580.7%
high_concurrency     1523.1               11892.3              +680.9%
large_response       1234.5               8234.6               +567.2%

Benchmark Apps

The benchmark suite uses minimal apps to measure raw server performance:

WSGI App (benchmarks/simple_app.py)

def application(environ, start_response):
    path = environ.get('PATH_INFO', '/')

    if path == '/large':
        body = b'X' * 65536  # 64KB
    else:
        body = b'Hello, World!'

    status = '200 OK'
    headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ]
    start_response(status, headers)
    return [body]
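
Before benchmarking, it can help to confirm that the app returns what you expect, since a mismatched Content-Length would distort the results. The sketch below calls a compact copy of the WSGI app directly, without any server, using the standard library's `wsgiref.util.setup_testing_defaults` to build the environ. The `call_wsgi` helper is illustrative and not part of the benchmark suite.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # Compact copy of the benchmark WSGI app shown above
    body = b'X' * 65536 if environ.get('PATH_INFO', '/') == '/large' else b'Hello, World!'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]


def call_wsgi(app, path):
    """Invoke a WSGI app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ['PATH_INFO'] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = dict(headers)

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_wsgi(application, '/large')
print(status, headers['Content-Length'], len(body))
```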

ASGI App (benchmarks/simple_asgi_app.py)

async def application(scope, receive, send):
    if scope['type'] != 'http':
        return

    path = scope.get('path', '/')

    if path == '/large':
        body = b'X' * 65536  # 64KB
    else:
        body = b'Hello, World!'

    await send({
        'type': 'http.response.start',
        'status': 200,
        'headers': [
            [b'content-type', b'text/plain'],
            [b'content-length', str(len(body)).encode()],
        ],
    })
    await send({
        'type': 'http.response.body',
        'body': body,
    })
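
The ASGI app can be checked the same way by driving it with stub `receive` and `send` callables and collecting the messages it emits. This is a minimal sketch with a compact copy of the app; the `call_asgi` helper and the reduced scope dictionary are illustrative, and a real server passes a fuller scope.

```python
import asyncio


async def application(scope, receive, send):
    # Compact copy of the benchmark ASGI app shown above
    if scope['type'] != 'http':
        return
    body = b'X' * 65536 if scope.get('path', '/') == '/large' else b'Hello, World!'
    await send({
        'type': 'http.response.start',
        'status': 200,
        'headers': [
            [b'content-type', b'text/plain'],
            [b'content-length', str(len(body)).encode()],
        ],
    })
    await send({'type': 'http.response.body', 'body': body})


async def call_asgi(app, path):
    """Run one HTTP request through an ASGI app and return the sent messages."""
    sent = []

    async def receive():
        return {'type': 'http.request', 'body': b'', 'more_body': False}

    async def send(message):
        sent.append(message)

    scope = {'type': 'http', 'method': 'GET', 'path': path, 'headers': []}
    await app(scope, receive, send)
    return sent


messages = asyncio.run(call_asgi(application, '/large'))
print(messages[0]['status'], len(messages[1]['body']))
```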

Running Manual Benchmarks

Using wrk

# Start hornbeam
rebar3 shell

# In Erlang shell:
hornbeam:start("myapp:application", #{
    bind => <<"127.0.0.1:8000">>,
    worker_class => wsgi
}).

# In another terminal, run wrk:
wrk -t4 -c100 -d30s --latency http://127.0.0.1:8000/

Using Apache Bench (ab)

# Simple benchmark
ab -n 10000 -c 100 -k http://127.0.0.1:8000/

# High concurrency
ab -n 5000 -c 500 -k http://127.0.0.1:8000/

Key wrk Flags

Flag        Description
-t          Number of threads
-c          Number of connections
-d          Duration (e.g., 30s, 1m)
--latency   Print latency statistics
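
When collecting results from several wrk runs, it is convenient to pull the throughput and the 99th percentile out of the text output. The sketch below parses the `Requests/sec:` line and the `99%` row that `--latency` adds. The sample text contains placeholder values to show the layout, not measurements, and `parse_wrk` is an illustrative helper.

```python
import re

# Placeholder excerpt in wrk's output layout; the numbers are not measurements.
SAMPLE = """\
  Latency Distribution
     50%    1.40ms
     99%    3.50ms
Requests/sec:  12000.00
"""

# wrk prints latencies with a unit suffix; normalise everything to milliseconds
_UNIT_TO_MS = {"us": 0.001, "ms": 1.0, "s": 1000.0, "m": 60000.0}


def parse_wrk(output: str) -> dict:
    """Extract requests/sec and p99 latency (ms) from wrk text output."""
    result = {}
    rps = re.search(r"^Requests/sec:\s+([\d.]+)", output, re.MULTILINE)
    if rps:
        result["requests_per_sec"] = float(rps.group(1))
    p99 = re.search(r"^\s*99%\s+([\d.]+)(us|ms|s|m)\s*$", output, re.MULTILINE)
    if p99:
        result["p99_ms"] = float(p99.group(1)) * _UNIT_TO_MS[p99.group(2)]
    return result


print(parse_wrk(SAMPLE))
```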

Key ab Flags

Flag   Description
-n     Number of requests
-c     Concurrency level
-k     Enable HTTP keepalive
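
The same idea works for ab. The sketch below extracts the three lines shown in the quick benchmark example output; the sample text is copied from that example, and `parse_ab` is an illustrative helper. The pattern for time per request matches only the `(mean)` line, not ab's second "across all concurrent requests" line.

```python
import re

# Lines copied from the quick benchmark example output
SAMPLE = """\
Requests per second:    66000.00 [#/sec] (mean)
Time per request:       1.52 [ms] (mean)
Failed requests:        0
"""


def parse_ab(output: str) -> dict:
    """Extract throughput, mean latency, and failure count from ab text output."""
    patterns = {
        "requests_per_sec": (r"^Requests per second:\s+([\d.]+)", float),
        "mean_latency_ms": (r"^Time per request:\s+([\d.]+) \[ms\] \(mean\)", float),
        "failed": (r"^Failed requests:\s+(\d+)", int),
    }
    result = {}
    for key, (pattern, convert) in patterns.items():
        match = re.search(pattern, output, re.MULTILINE)
        if match:
            result[key] = convert(match.group(1))
    return result


print(parse_ab(SAMPLE))
```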

Performance Tuning

Workers Configuration

Adjust the number of workers based on your CPU cores:

hornbeam:start("app:application", #{
    workers => 8,  % Number of Python workers
    worker_class => wsgi
}).
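
One way to pick a worker count is to benchmark a few values around the core count and keep the best. The sketch below only generates candidate values from `os.cpu_count()`; the half, equal, and double multipliers are a common starting heuristic and an assumption here, not a hornbeam recommendation.

```python
import os


def candidate_worker_counts(cores=None):
    """Return worker counts worth benchmarking: half, equal to, and double the cores."""
    cores = cores or os.cpu_count() or 1
    return sorted({max(1, cores // 2), cores, cores * 2})


print(candidate_worker_counts())
```

Run the same benchmark once per candidate and compare requests/sec and tail latency before settling on a value for `workers`.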

Connection Limits

For high-concurrency scenarios, set how many requests a worker serves before it restarts and how long a request may run:

hornbeam:start("app:application", #{
    max_requests => 10000,  % Requests per worker before restart
    timeout => 60000        % Request timeout in ms
}).

HTTP/2 for Lower Latency

Enable HTTP/2 for multiplexed connections:

hornbeam:start("app:application", #{
    http_version => ['HTTP/2', 'HTTP/1.1'],
    ssl => true,
    certfile => "server.crt",
    keyfile => "server.key"
}).

Understanding Results

Key Metrics

Metric            Description
Requests/sec      Throughput - higher is better
Latency (avg)     Mean response time - lower is better
Latency (p99)     99th percentile latency - measures tail latency
Failed requests   Should be 0 for valid benchmarks
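
The difference between average and p99 latency is easiest to see on a small sample. The sketch below computes both from a list of latencies using the nearest-rank method; the sample values are made up to illustrate the effect, and tools such as wrk may compute percentiles differently.

```python
import math
import statistics


def p99(samples):
    """99th percentile by the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(0.99 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Made-up latencies in ms: 98 fast requests and 2 slow outliers
latencies = [1.0] * 98 + [100.0] * 2

print(f"mean = {statistics.mean(latencies):.2f} ms")
print(f"p99  = {p99(latencies):.2f} ms")
```

Two slow requests barely move the mean but dominate the p99, which is why tail latency is reported separately.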

Why Hornbeam is Fast

  1. BEAM Concurrency: Erlang handles millions of lightweight processes
  2. No GIL Bottleneck: Python runs on dirty schedulers, not blocking the BEAM
  3. Efficient I/O: Cowboy is battle-tested for high-performance HTTP
  4. Connection Multiplexing: HTTP/2 support reduces connection overhead
  5. Zero-Copy: ETS shared state avoids serialization overhead

Realistic Benchmarks

The simple “Hello World” benchmarks measure raw server overhead. For realistic numbers, benchmark your actual application:

# Start your app
hornbeam:start("myapp:app", #{...}).

# Benchmark specific endpoints
wrk -t4 -c100 -d30s http://localhost:8000/api/users
wrk -t4 -c100 -d30s 'http://localhost:8000/api/search?q=test'

Consider endpoints that exercise:

  • Database queries
  • External API calls
  • ML inference
  • File uploads/downloads
  • WebSocket connections

CI Benchmarking

Add benchmarks to your CI pipeline:

# .github/workflows/benchmark.yml
name: Benchmark

on:
  pull_request:
    branches: [main]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Erlang
        uses: erlef/setup-beam@v1
        with:
          otp-version: '27.1'

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.13'

      - name: Install wrk
        run: sudo apt-get update && sudo apt-get install -y wrk

      - name: Compile
        run: rebar3 compile

      - name: Run benchmarks
        run: python benchmarks/run_benchmark.py --output results.json

      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results
          path: results.json
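
Uploading results is more useful when the pipeline also flags regressions. The Python sketch below compares current throughput against a stored baseline and reports tests that dropped by more than a tolerance. It assumes you have already reduced each run to a flat mapping of test name to requests/sec; the actual layout of results.json is not described here, so the loading step is left out. The `find_regressions` helper and the "current" numbers are hypothetical.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return {test: (baseline_rps, current_rps)} for tests slower than allowed."""
    regressions = {}
    for name, base_rps in baseline.items():
        cur_rps = current.get(name)
        if cur_rps is None:
            continue  # test missing from the current run; nothing to compare
        if cur_rps < base_rps * (1 - tolerance):
            regressions[name] = (base_rps, cur_rps)
    return regressions


# Baseline from the results summary; "current" values are hypothetical
baseline = {"simple": 66_000, "high_concurrency": 71_000, "large_response": 58_000}
current = {"simple": 65_000, "high_concurrency": 60_000, "large_response": 57_500}

for name, (base, cur) in find_regressions(baseline, current).items():
    print(f"REGRESSION {name}: {base} -> {cur} req/s")
```

Shared CI runners are noisy, so a generous tolerance or a comparison against a baseline measured on the same runner in the same job tends to produce fewer false alarms.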