Performance

Numbers Don't Lie.
Benchmarks Do.

Every claim backed by reproducible benchmarks. Same hardware, default configs, published Docker images. Run them yourself.

~116 ns

In-Memory p99

WeftKitMem point read

~204 ns

Point get p99

WeftKitKV (1K keys)

~2M

ops/sec

WeftKitKV throughput

~234 μs

Flat search

100 vectors top-10

Head-to-Head

WeftKit vs. Industry Leaders

Direct comparisons against the most widely deployed alternatives for each database type.

WeftKitRel

SQL parse (complex)

vs. SQLite

Lower is better

WeftKit~5.1 μs

SQLite~15 μs

WeftKitRel

SQL parse (simple SELECT)

vs. SQLite

Lower is better

WeftKit~1.6 μs

SQLite~5 μs

WeftKitKV

Point get p99

vs. RocksDB

Lower is better

WeftKit~204 ns

RocksDB~15 μs

WeftKitKV

Point write p99

vs. RocksDB

Lower is better

WeftKit~360 ns

RocksDB~12 μs

WeftKitKV

Throughput (single thread)

vs. RocksDB

Higher is better

WeftKit~2M ops/sec

RocksDB~800K ops/sec

WeftKitMem

Point read p99

vs. Redis (network)

Lower is better

WeftKit~116 ns

Redis (network)~5 μs (network)

WeftKitMem

Throughput (single thread)

vs. Redis (network)

Higher is better

WeftKit~8.6M ops/sec

Redis (network)~500K ops/sec

WeftKitVec

HNSW search 2K vectors

vs. Qdrant

Lower is better

WeftKit~31 ms

Qdrant~10 ms

WeftKitVec

Flat search 100 vectors

vs. Qdrant

Lower is better

WeftKit~234 μs

Qdrant~1.5 ms

WeftKitDoc

Insert one

vs. MongoDB (embedded)

Lower is better

WeftKit~3.1 μs

MongoDB (embedded)~50 μs

WeftKitDoc

Insert 1000 docs

vs. MongoDB (embedded)

Lower is better

WeftKit~15.2 ms

MongoDB (embedded)~50 ms

WeftKitDoc

Find all (10K docs)

vs. MongoDB (embedded)

Lower is better

WeftKit~37 ms

MongoDB (embedded)~80 ms

WeftKitGraph

BFS 1000 vertices

vs. Neo4j

Lower is better

WeftKit~160 μs

Neo4j~500 μs

WeftKitGraph

Vertex creation p99

vs. Neo4j

Lower is better

WeftKit~2.5 μs

Neo4j~50 μs

Memory Efficiency

Footprint Per Module

Base memory usage at idle. Minimum configuration (no indexes, no cache) to maximum typical production deployment.

Min → Max Memory (MB)

WeftKitRel

4–88 MB

WeftKitVec

8–256 MB

WeftKitKV

6–28 MB

WeftKitDoc

8–32 MB

WeftKitGraph

12–128 MB

WeftKitMem

8–64 MB

WeftKitMD

4–32 MB

WeftKitFile

6–16 MB

Pool Manager

Connection Proxy Performance

io_uring-powered proxy handling 1M+ concurrent connections with sub-microsecond acquisition.

1M+

Concurrent connections

per instance

~3.3 ns

Protocol detect (Redis)

p99 latency

~39 ns

Cache hit lookup

p99 latency

~86 ns

Queue push + pop

p99 latency

< 1 KB

Idle connection memory

per connection

~744 ns

Health record

per success

Under the Hood

Kernel-Level Performance

Raw engine, security, and utility layer measurements via Criterion bench harness. These are the building blocks powering every database module.

EngineBuffer pool pin (hit)

~23 ns

EngineMVCC begin + commit

~128 ns

EngineMVCC write + commit

~690 μs

EnginePage read (sequential 8K)

~476 ns

EngineWAL append (async)

~38 ns

SecurityAES-256-GCM encrypt 512 B

~5.1 μs

SecurityHMAC-SHA256 sign 4 KB

~12.4 μs

SecurityRBAC permission check

~28 ns

UtilityCRC32C 1 KB

~56 ns

UtilityxxHash64 64 KB

~4.4 μs

UtilityBLAKE3 hash 4 KB

~2.3 μs

UtilityLZ4 compress 8 KB

~1.1 μs

UtilityLZ4 decompress 8 KB

~1.8 μs

DocumentInsert one

~3.1 μs

DocumentInsert 1000 docs

~15.2 ms

DocumentAggregate (match + limit)

~686 μs

KeyValuePoint get (1K keys)

~204 ns

KeyValuePoint write

~360 ns

GraphBFS 1000 vertices

~160 μs

GraphDFS 500 vertices

~65 μs

GraphVertex creation

~2.5 μs

MarkdownFull document parse

~17 μs

MarkdownBM25 score (1K docs)

~18.6 μs

MarkdownRender with TOC

~22 μs

FileStoreUpload 4 KB chunk

~3.7 ms

FileStoreDownload 256 B

~59 μs

FileStoreGC cycle (1000 chunks)

~996 μs

VectorHNSW search (2K vectors)

~31 ms

VectorFlat search (100 vectors)

~234 μs

InMemoryPoint get (per op)

~116 ns

InMemoryPoint set (per op)

~151 ns

Docker Deployment

Direct vs. Containerized Performance

Docker overhead averages 4.3% across all operations. CPU-bound: ~3.5%, I/O-bound: ~6.2%. Maximum observed: 7.9%.

Docker Engine 24.x, debian:bookworm-slim, --cpus=2 --memory=4g2026-03-02

EngineBuffer pool pin (hit)

+4.3%

Direct~23 ns

Docker~24 ns

EngineMVCC begin + commit

+3.9%

Direct~128 ns

Docker~133 ns

EngineMVCC write + commit

+3.6%

Direct~690 μs

Docker~715 μs

EnginePage read (seq 8K)

+7.1%

Direct~476 ns

Docker~510 ns

EngineWAL append (async)

+7.9%

Direct~38 ns

Docker~41 ns

SecurityAES-256-GCM encrypt 512B

+2%

Direct~5.1 μs

Docker~5.2 μs

SecurityHMAC-SHA256 sign 4KB

+2.4%

Direct~12.4 μs

Docker~12.7 μs

UtilityCRC32C 1KB

+1.8%

Direct~56 ns

Docker~57 ns

UtilityLZ4 compress 8KB

+3.6%

Direct~1.1 μs

Docker~1.14 μs

KeyValuePoint get (1K keys)

+3.9%

Direct~204 ns

Docker~212 ns

KeyValuePoint write

+6.9%

Direct~360 ns

Docker~385 ns

InMemoryPoint get (per op)

+3.4%

Direct~116 ns

Docker~120 ns

InMemoryPoint set (per op)

+4.6%

Direct~151 ns

Docker~158 ns

DocumentInsert one

+6.5%

Direct~3.1 μs

Docker~3.3 μs

DocumentInsert 1000 docs

+5.9%

Direct~15.2 ms

Docker~16.1 ms

GraphBFS 1000 vertices

+4.4%

Direct~160 μs

Docker~167 μs

GraphVertex creation

+6%

Direct~2.5 μs

Docker~2.65 μs

MarkdownFull document parse

+2.9%

Direct~17 μs

Docker~17.5 μs

VectorFlat search (100 vec)

+3.4%

Direct~234 μs

Docker~242 μs

FileStoreUpload 4KB chunk

+6.8%

Direct~3.7 ms

Docker~3.95 ms

PoolProtocol detect (Redis)

+3%

Direct~3.3 ns

Docker~3.4 ns

PoolCache hit lookup

+2.6%

Direct~39 ns

Docker~40 ns

+3.5%

CPU-Bound Avg

+6.2%

I/O-Bound Avg

+4.3%

Overall Avg

+7.9%

Max Observed

Methodology

How We Benchmark

Transparency and reproducibility are non-negotiable. Every benchmark is fully reproducible and independently verifiable.

Environment

Measured inside the published Docker images on arm64 (Apple M-series) and x86-64 hosts with NVMe SSD storage. Images built with release profile + LTO.

Warmup

Criterion default: 3-second warmup, 100 iterations minimum. Cold-cache and warm-cache results reported separately.

Percentiles

p50, p95, p99, p999 latency reported. Charts show p99 (worst-case typical). Max latency excluded.

Competitors

Competitor numbers use published reference values under official default configurations. No detuning.

Reproducible

Every benchmark runs against the same published Docker tags listed on the Downloads page. Pull the exact same image and replay the workload to reproduce our numbers.

Disclaimer

All WeftKit numbers are measured against the published Docker images over the loopback wire protocol — the exact topology customers deploy. No in-process bench tricks.

Numbers Don't Lie.Benchmarks Do.

WeftKit vs. Industry Leaders

SQL parse (complex)

SQL parse (simple SELECT)

Point get p99

Point write p99

Throughput (single thread)

Point read p99

Throughput (single thread)

HNSW search 2K vectors

Flat search 100 vectors

Insert one

Insert 1000 docs

Find all (10K docs)

BFS 1000 vertices

Vertex creation p99

Footprint Per Module

Min → Max Memory (MB)

Connection Proxy Performance

Kernel-Level Performance

Direct vs. Containerized Performance

How We Benchmark

Environment

Warmup

Percentiles

Competitors

Reproducible

Disclaimer

Numbers Don't Lie.
Benchmarks Do.