# Benchmarking
Hornbeam includes a benchmark suite for measuring performance and comparing against other servers like gunicorn.
## Prerequisites

Install a benchmarking tool:

```shell
# macOS
brew install wrk

# Linux (Ubuntu/Debian)
apt-get install wrk
# or
apt-get install apache2-utils  # for ab
```
## Quick Benchmark

Run the quick benchmark script:

```shell
./benchmarks/quick_bench.sh
```
This runs three test scenarios against the WSGI worker:
- Simple requests: 10,000 requests, 100 concurrent connections
- High concurrency: 5,000 requests, 500 concurrent connections
- Large response: 1,000 requests, 50 concurrent connections, 64KB response bodies
Example output (Apple M4 Pro, Python 3.13, OTP 28, February 2026):

```text
=== Benchmark: Simple requests (10000 requests, 100 concurrent) ===
Requests per second:    66000.00 [#/sec] (mean)
Time per request:       1.52 [ms] (mean)
Failed requests:        0

=== Benchmark: High concurrency (5000 requests, 500 concurrent) ===
Requests per second:    71000.00 [#/sec] (mean)
Time per request:       7.04 [ms] (mean)
Failed requests:        0

=== Benchmark: Large response (1000 requests, 50 concurrent) ===
Requests per second:    58000.00 [#/sec] (mean)
Time per request:       0.86 [ms] (mean)
Failed requests:        0
```
## Results Summary
| Test | Requests/sec | Latency (mean) | Failed |
|---|---|---|---|
| Simple (100 concurrent) | 66,000 | 1.52ms | 0 |
| High concurrency (500 concurrent) | 71,000 | 7.04ms | 0 |
| Large response (64KB) | 58,000 | 0.86ms | 0 |
Note: These numbers reflect the 6-stage ASGI/WSGI optimizations in hornbeam 1.4.0 with erlang_python 1.8.0, including per-app execution mode caching, Erlang-native async timer support, and request-local queues via contextvars.
These numbers demonstrate that hornbeam maintains consistent high throughput even under heavy concurrency, thanks to Erlang’s lightweight process model.
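As a sanity check, the mean latencies above follow directly from Little's law: mean latency ≈ concurrent connections ÷ throughput. A quick sketch using the table's own numbers:

```python
# Little's law sanity check: mean latency ≈ concurrency / throughput.
table = [
    # (test, concurrent connections, requests/sec, reported mean latency in ms)
    ("Simple", 100, 66_000, 1.52),
    ("High concurrency", 500, 71_000, 7.04),
    ("Large response", 50, 58_000, 0.86),
]

for name, conns, rps, reported_ms in table:
    predicted_ms = conns / rps * 1000  # seconds -> milliseconds
    print(f"{name}: predicted {predicted_ms:.2f} ms, reported {reported_ms} ms")
```

All three rows agree to two decimal places, which is a good sign the benchmark tool's throughput and latency figures are internally consistent.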
## ASGI Performance (1.4.0)
Hornbeam 1.4.0 includes Erlang-native async timer support, providing significant improvements for async applications:
| Test | Requests/sec | Description |
|---|---|---|
| Simple ASGI | ~66,000 | Basic async response |
| High concurrency (500 conn) | ~71,000 | Concurrent connections |
| Async sleep (1ms) | ~8,600 | With erlang_asyncio |
| Concurrent tasks | ~6,200 | Multiple async operations |
### Erlang-Asyncio Optimization

The ASGI runner auto-detects `asyncio.sleep()` and uses Erlang's native timer:

```python
# This code automatically benefits from Erlang timers
async def handler():
    await asyncio.sleep(0.001)  # Uses _erlang_sleep internally
```
This provides 86x improvement over standard asyncio timers for sleep operations.
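For context, the plain-asyncio baseline that the Erlang timer path is compared against can be measured with a small harness like this one (a sketch only; the 86x figure above comes from the project's own benchmarks, not from this snippet):

```python
import asyncio
import time

async def handler():
    # Mimics the benchmark handler: each request awaits a 1 ms sleep.
    await asyncio.sleep(0.001)

async def timed_run(n: int) -> float:
    """Time n concurrent handlers. With efficient timers the total stays
    close to a few milliseconds rather than n milliseconds, because the
    sleeps overlap instead of running back to back."""
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - start

if __name__ == "__main__":
    elapsed = asyncio.run(timed_run(1000))
    print(f"1000 concurrent 1 ms sleeps: {elapsed * 1000:.1f} ms total")
```

Running the same handler under Hornbeam's ASGI worker exercises the Erlang timer path instead of the event loop's heap of timer handles.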
## Comparison with Gunicorn

Direct comparison using an identical WSGI app (4 workers each; gunicorn running the gthread worker class with 4 threads per worker):
| Test | Hornbeam | Gunicorn gthread | Speedup |
|---|---|---|---|
| Simple (100 concurrent) | 66,000 req/s | 3,661 req/s | 18.0x |
| High concurrency (500 concurrent) | 71,000 req/s | 3,631 req/s | 19.6x |
| Large response (64KB) | 58,000 req/s | 3,599 req/s | 16.1x |
### Latency Comparison
| Test | Hornbeam | Gunicorn |
|---|---|---|
| Simple (100 concurrent) | 1.52ms | 27.3ms |
| High concurrency (500 concurrent) | 7.04ms | 137.7ms |
| Large response (64KB) | 0.86ms | 13.9ms |
### Why the Difference?
Gunicorn limitations:
- GIL contention: Python’s Global Interpreter Lock limits true parallelism
- Process model: Each worker is a separate OS process with overhead
- Connection handling: Blocking I/O model limits concurrent connections
Hornbeam advantages:
- BEAM scheduler: Millions of lightweight processes, no OS thread overhead
- No GIL impact: Python runs on dirty schedulers, isolated from BEAM
- Cowboy: Battle-tested HTTP server handling connections efficiently
- Zero-copy ETS: Shared state without serialization overhead
## Python Benchmark Script

For more control, use the Python benchmark script:

```shell
# WSGI benchmarks
python benchmarks/run_benchmark.py

# ASGI benchmarks
python benchmarks/run_benchmark.py --asgi

# Custom bind address
python benchmarks/run_benchmark.py --bind 0.0.0.0:9000

# Save results to JSON
python benchmarks/run_benchmark.py --output results.json
```
## Comparing with Gunicorn

Compare hornbeam performance against gunicorn:

```shell
python benchmarks/compare_servers.py
```

This runs identical benchmarks against both servers and prints a comparison:

```text
======================================================================
COMPARISON: GUNICORN vs HORNBEAM
======================================================================
Test                  Gunicorn (req/s)   Hornbeam (req/s)   Diff
----------------------------------------------------------------------
simple                1842.3             12543.2            +580.7%
high_concurrency      1523.1             11892.3            +680.9%
large_response        1234.5             8234.6             +567.2%
```
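The Diff column is the throughput gain relative to gunicorn. It can be reproduced from the raw numbers with a one-liner (small rounding differences versus the printed table are expected):

```python
def relative_gain(baseline_rps: float, candidate_rps: float) -> float:
    """Percentage throughput difference relative to the baseline server."""
    return (candidate_rps / baseline_rps - 1.0) * 100.0

# Values taken from the comparison output above.
print(f"simple: {relative_gain(1842.3, 12543.2):+.1f}%")
```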
## Benchmark Apps

The benchmark suite uses minimal apps to measure raw server performance:

### WSGI App (benchmarks/simple_app.py)

```python
def application(environ, start_response):
    path = environ.get('PATH_INFO', '/')
    if path == '/large':
        body = b'X' * 65536  # 64KB
    else:
        body = b'Hello, World!'
    status = '200 OK'
    headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```
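You can exercise this app without any server by supplying a stub `start_response`. The harness below inlines the same app so the example is self-contained:

```python
def application(environ, start_response):
    # Same app as benchmarks/simple_app.py (inlined here so the
    # example runs standalone).
    path = environ.get('PATH_INFO', '/')
    body = b'X' * 65536 if path == '/large' else b'Hello, World!'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]

captured = {}

def start_response(status, headers):
    # Minimal WSGI start_response stub that records what the app sends.
    captured['status'] = status
    captured['headers'] = dict(headers)

body = b''.join(application({'PATH_INFO': '/'}, start_response))
print(captured['status'], len(body))  # → 200 OK 13
```

This is handy for confirming the app itself is correct before attributing any benchmark anomaly to the server.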
### ASGI App (benchmarks/simple_asgi_app.py)

```python
async def application(scope, receive, send):
    if scope['type'] != 'http':
        return
    path = scope.get('path', '/')
    if path == '/large':
        body = b'X' * 65536  # 64KB
    else:
        body = b'Hello, World!'
    await send({
        'type': 'http.response.start',
        'status': 200,
        'headers': [
            [b'content-type', b'text/plain'],
            [b'content-length', str(len(body)).encode()],
        ],
    })
    await send({
        'type': 'http.response.body',
        'body': body,
    })
```
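The ASGI app can likewise be driven directly from `asyncio` with stub `receive`/`send` callables (the app is inlined again so the sketch runs standalone):

```python
import asyncio

async def application(scope, receive, send):
    # Same app as benchmarks/simple_asgi_app.py, condensed.
    if scope['type'] != 'http':
        return
    body = b'X' * 65536 if scope.get('path', '/') == '/large' else b'Hello, World!'
    await send({'type': 'http.response.start', 'status': 200,
                'headers': [[b'content-type', b'text/plain'],
                            [b'content-length', str(len(body)).encode()]]})
    await send({'type': 'http.response.body', 'body': body})

async def drive(path):
    # Collect the ASGI messages the app emits for a single request.
    messages = []

    async def receive():
        return {'type': 'http.request', 'body': b'', 'more_body': False}

    async def send(message):
        messages.append(message)

    await application({'type': 'http', 'path': path}, receive, send)
    return messages

messages = asyncio.run(drive('/large'))
print(messages[0]['status'], len(messages[1]['body']))  # → 200 65536
```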
## Running Manual Benchmarks

### Using wrk

Start hornbeam:

```shell
rebar3 shell
```

In the Erlang shell:

```erlang
hornbeam:start("myapp:application", #{
    bind => <<"127.0.0.1:8000">>,
    worker_class => wsgi
}).
```

In another terminal, run wrk:

```shell
wrk -t4 -c100 -d30s --latency http://127.0.0.1:8000/
```

### Using Apache Bench (ab)

```shell
# Simple benchmark
ab -n 10000 -c 100 -k http://127.0.0.1:8000/

# High concurrency
ab -n 5000 -c 500 -k http://127.0.0.1:8000/
```
### Key wrk Flags

| Flag | Description |
|---|---|
| `-t` | Number of threads |
| `-c` | Number of connections |
| `-d` | Duration (e.g., 30s, 1m) |
| `--latency` | Print latency statistics |
### Key ab Flags

| Flag | Description |
|---|---|
| `-n` | Number of requests |
| `-c` | Concurrency level |
| `-k` | Enable HTTP keepalive |
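When scripting repeated wrk runs, the throughput can be pulled from the `Requests/sec:` summary line that wrk prints at the end of each run (the sample output below is illustrative):

```python
import re

# Abbreviated sample of wrk's stdout; real runs include latency and
# transfer statistics as well.
SAMPLE = """\
Running 30s test @ http://127.0.0.1:8000/
  4 threads and 100 connections
Requests/sec:  66000.12
Transfer/sec:      9.5MB
"""

def parse_rps(wrk_output: str) -> float:
    # wrk always emits a "Requests/sec:" summary line on success.
    match = re.search(r"Requests/sec:\s+([\d.]+)", wrk_output)
    if match is None:
        raise ValueError("no Requests/sec line found in wrk output")
    return float(match.group(1))

print(parse_rps(SAMPLE))  # → 66000.12
```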
## Performance Tuning

### Workers Configuration

Adjust the number of workers based on your CPU cores:

```erlang
hornbeam:start("app:application", #{
    workers => 8,           % Number of Python workers
    worker_class => wsgi
}).
```
### Connection Limits

For high-concurrency scenarios:

```erlang
hornbeam:start("app:application", #{
    max_requests => 10000,  % Requests per worker before restart
    timeout => 60000        % Request timeout in ms
}).
```
### HTTP/2 for Lower Latency

Enable HTTP/2 for multiplexed connections:

```erlang
hornbeam:start("app:application", #{
    http_version => ['HTTP/2', 'HTTP/1.1'],
    ssl => true,
    certfile => "server.crt",
    keyfile => "server.key"
}).
```
## Understanding Results

### Key Metrics
| Metric | Description |
|---|---|
| Requests/sec | Throughput - higher is better |
| Latency (avg) | Mean response time - lower is better |
| Latency (p99) | 99th percentile latency - measures tail latency |
| Failed requests | Should be 0 for valid benchmarks |
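wrk's `--latency` flag reports percentiles directly; if a tool only emits raw samples, p99 is the nearest-rank percentile and can be computed with a few lines:

```python
def percentile(samples, pct):
    # Nearest-rank percentile: the smallest sample such that at least
    # pct percent of all samples are <= it.
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * pct // 100))  # integer ceil
    return ordered[int(rank) - 1]

# Synthetic latencies (ms): mostly fast, with a slow tail that the
# mean hides but p99 exposes.
latencies = [1.5] * 98 + [40.0, 120.0]
print(percentile(latencies, 50), percentile(latencies, 99))  # → 1.5 40.0
```

Note how the median stays at 1.5 ms while p99 jumps to 40 ms; this is why the table above calls out tail latency separately from the mean.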
### Why Hornbeam is Fast
- BEAM Concurrency: Erlang handles millions of lightweight processes
- No GIL Bottleneck: Python runs on dirty schedulers, not blocking the BEAM
- Efficient I/O: Cowboy is battle-tested for high-performance HTTP
- Connection Multiplexing: HTTP/2 support reduces connection overhead
- Zero-Copy: ETS shared state avoids serialization overhead
## Realistic Benchmarks

The simple "Hello World" benchmarks measure raw server overhead. For realistic numbers, benchmark your actual application. Start your app:

```erlang
hornbeam:start("myapp:app", #{...}).
```

Then benchmark specific endpoints (quote URLs containing `?` so the shell does not interpret them):

```shell
wrk -t4 -c100 -d30s http://localhost:8000/api/users
wrk -t4 -c100 -d30s 'http://localhost:8000/api/search?q=test'
```
Consider:
- Database queries
- External API calls
- ML inference
- File uploads/downloads
- WebSocket connections
## CI Benchmarking

Add benchmarks to your CI pipeline:

```yaml
# .github/workflows/benchmark.yml
name: Benchmark

on:
  pull_request:
    branches: [main]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Erlang
        uses: erlef/setup-beam@v1
        with:
          otp-version: '27.1'

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.13'

      - name: Install wrk
        run: sudo apt-get install -y wrk

      - name: Compile
        run: rebar3 compile

      - name: Run benchmarks
        run: python benchmarks/run_benchmark.py --output results.json

      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results
          path: results.json
```
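To make the CI job actually fail on a slowdown rather than just archive numbers, a small gate can compare fresh results against a stored baseline. The dictionary layout below is a hypothetical stand-in; the real structure of `results.json` depends on what `run_benchmark.py` writes:

```python
# Hypothetical throughput baselines (req/s per test). In CI these would
# be loaded from a committed baseline file and the fresh results.json.
BASELINE = {"simple": 60000.0, "high_concurrency": 65000.0}
current = {"simple": 66000.0, "high_concurrency": 52000.0}

def regressions(baseline, current, tolerance=0.10):
    # Flag any test whose throughput dropped more than `tolerance`
    # (10% by default) below the recorded baseline.
    failed = []
    for test, base_rps in baseline.items():
        if current.get(test, 0.0) < base_rps * (1.0 - tolerance):
            failed.append(test)
    return failed

bad = regressions(BASELINE, current)
if bad:
    print("throughput regression in:", ", ".join(bad))
```

A generous tolerance matters here: shared CI runners are noisy, so a gate tighter than roughly 10% tends to produce flaky failures.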