Apple container vs Colima: local datastore benchmarks on an M4 Mac

Apple’s new container CLI made me curious about something practical: should I keep using Colima for local infrastructure, or is Apple’s runtime already good enough to use for real development services?

I did not want to answer that with a hello-world container. Most of my local container usage is not hello-world. It is databases, cache services, graph stores, and small analytical jobs. So I benchmarked Apple container against Colima across a few datastore shapes:

Redis for cache / key-value
Postgres for OLTP-style relational work
ClickHouse for OLAP server work
DuckDB for embedded analytics
Neo4j for graph queries

This is a personal benchmark on one machine, not a universal claim about container runtimes. Still, the results were useful because they were not one-dimensional.

The short version:

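Apple container was faster for the long-running services I tested over localhost TCP: roughly 1.7x to 1.8x the throughput for Redis, Postgres, and Neo4j at concurrency 4, and about 1.15x for ClickHouse.
Colima was easier operationally and much faster for the short-lived DuckDB workload: the full DuckDB command took 1.690 s under Colima versus 6.571 s under Apple container.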
The interesting difference was not just throughput. It was also startup behavior, volume behavior, and how much runtime-specific setup each image needed.

My takeaway:

Apple container is worth testing seriously for long-running local services on Apple Silicon. Colima is still the smoother Docker-compatible baseline. The right answer depends on the workload.

What I tested#

Test machine:

The local machine and runtime versions used for this run.

Item	Value
Mac	Apple M4
Memory	16 GiB
OS	macOS 26.5.1, build 25F80
Apple container	1.0.0
Colima	0.10.3
Docker via Colima	client 29.6.0, server 29.2.1

Docker Desktop was not installed and was not part of this test.

For Redis, Postgres, ClickHouse, and DuckDB, both runtimes were capped at:

2 CPUs
4 GiB memory

I used 2 CPUs because my Colima VM was capped at 2 CPUs. The first broad run failed when I tried to use 4 CPUs, so I normalized both runtimes to the lower available cap.

Neo4j was the first benchmark I ran, before the later 2 CPU normalization. I kept it in the article because it is a useful graph-database data point, but I treat it separately from the normalized Redis/Postgres/ClickHouse/DuckDB suite.

Workloads#

Workload matrix

Each datastore represented a different local development shape.

Store	Shape	Image	Workload
Redis	KV/cache	redis:7.4-alpine	mixed GET, SET, INCR, MGET, MSET over 50k keys
Postgres	OLTP	postgres:16-alpine	20k accounts, 200k events, mixed reads/writes/updates/aggregates
ClickHouse	OLAP server	clickhouse/clickhouse-server:latest	1M-row MergeTree, mixed count, rollup, filtered aggregate, top-N queries
DuckDB	Embedded OLAP	duckdb/duckdb:latest	1M-row DuckDB table plus Parquet scan in short-lived containers
Neo4j	Graph	neo4j:5-community	9,485 graph nodes, 35,332 relationships, mixed graph read queries

Redis, Postgres, ClickHouse, and Neo4j ran as services. The host benchmark client connected over localhost TCP.

DuckDB is different. I did not run DuckDB as a service. I ran the DuckDB CLI inside short-lived containers against a mounted workspace. That matters, because DuckDB ended up showing the opposite result from the long-running services.
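
To make the short-lived shape concrete, here is a minimal sketch of timing a one-shot command end to end, where every invocation pays the full start-and-exit cost. It is not the benchmark runner used for this article: `time_command`, `summarize`, and the stand-in command are hypothetical, and a real run would substitute the container CLI invocation.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, capture_output=True)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the p50 and p95 of the durations (needs at least two samples)."""
    cuts = statistics.quantiles(durations, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94]}


# Stand-in command so the sketch runs anywhere (assumption: a real run would
# wrap the container CLI call that launches the DuckDB job instead).
stand_in = [sys.executable, "-c", "pass"]
samples = time_command(stand_in, runs=5)
print(summarize(samples))
```

Because the timer wraps the whole process, container start and teardown are part of every sample, which is exactly what a long-running service benchmark hides.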

Headline results#

At concurrency 4, Apple container was faster for every long-running service workload in this run.

Headline service results at concurrency 4

Long-running services connected over localhost TCP.

Store	Metric	Colima	Apple container	Faster
Redis	ops/s	15,378.75	27,434.62	Apple
Postgres	ops/s	15,141.07	26,371.10	Apple
ClickHouse	ops/s	200.90	230.34	Apple
Neo4j	ops/s	372.09	627.69	Apple

DuckDB went the other direction.

DuckDB short-lived command results

DuckDB ran as a short-lived CLI workload against a mounted workspace.

DuckDB metric	Colima	Apple container	Faster
Setup	0.342 s	0.883 s	Colima
Query batch p50	0.180 s	0.806 s	Colima
Query batch p95	0.188 s	0.841 s	Colima
Full workload command	1.690 s	6.571 s	Colima

That split is the main point of the benchmark. If I had only tested services, Apple would look like the clear answer. If I had only tested DuckDB, Colima would look like the clear answer. Testing both made the result more useful.
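
The size of the split is easier to see as ratios. The sketch below only does arithmetic on the numbers already reported in the two tables above; the variable and function names are illustrative.

```python
# Store: (Colima ops/s, Apple container ops/s) at concurrency 4.
services = {
    "Redis": (15378.75, 27434.62),
    "Postgres": (15141.07, 26371.10),
    "ClickHouse": (200.90, 230.34),
    "Neo4j": (372.09, 627.69),
}

# DuckDB metric: (Colima seconds, Apple container seconds).
duckdb = {
    "Setup": (0.342, 0.883),
    "Query batch p50": (0.180, 0.806),
    "Query batch p95": (0.188, 0.841),
    "Full workload command": (1.690, 6.571),
}


def ratio(numerator, denominator):
    """Return numerator / denominator rounded to two decimals."""
    return round(numerator / denominator, 2)


# Throughput: higher is better, so Apple's advantage is apple / colima.
apple_speedup = {
    store: ratio(apple, colima) for store, (colima, apple) in services.items()
}

# Durations: lower is better, so Colima's advantage is also apple / colima.
colima_speedup = {
    metric: ratio(apple, colima) for metric, (colima, apple) in duckdb.items()
}

print(apple_speedup)
# {'Redis': 1.78, 'Postgres': 1.74, 'ClickHouse': 1.15, 'Neo4j': 1.69}
print(colima_speedup)
# {'Setup': 2.58, 'Query batch p50': 4.48, 'Query batch p95': 4.47,
#  'Full workload command': 3.89}
```

Keep in mind that the Neo4j ratio comes from the earlier run with a different resource cap, so it is not directly comparable to the other three.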

Service concurrency detail#

Each service workload ran at concurrency 1, 4, and 8. Each level ran for 10 seconds.

Store	Runtime	c=1 ops/s	c=4 ops/s	c=8 ops/s	c=4 p50	c=4 p95
Redis	Colima	5,062.47	15,378.75	19,687.21	0.253 ms	0.341 ms
Redis	Apple	10,555.34	27,434.62	22,310.45	0.139 ms	0.213 ms
Postgres	Colima	5,386.89	15,141.07	19,190.96	0.243 ms	0.409 ms
Postgres	Apple	11,370.94	26,371.10	27,644.91	0.120 ms	0.287 ms
ClickHouse	Colima	129.92	200.90	221.68	17.942 ms	42.390 ms
ClickHouse	Apple	135.80	230.34	238.54	14.030 ms	42.035 ms
Neo4j	Colima	194.62	372.09	401.92	7.787 ms	26.957 ms
Neo4j	Apple	251.75	627.69	499.03	4.623 ms	13.996 ms

All final workload summaries reported zero operation errors.
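
For readers who want to reproduce this kind of table, here is a minimal sketch of a fixed-duration, fixed-concurrency measurement loop that reports ops/s, p50, p95, and an error count. It is an assumption about the general shape of such a harness, not the runner used for these numbers; `run_closed_loop` is a hypothetical name and the stand-in operation is a short sleep rather than a real datastore call.

```python
import statistics
import threading
import time


def run_closed_loop(operation, concurrency, duration_s):
    """Call `operation` from `concurrency` threads for `duration_s` seconds."""
    latencies = []  # per-operation latency in seconds
    errors = 0
    lock = threading.Lock()
    deadline = time.perf_counter() + duration_s

    def worker():
        nonlocal errors
        local_latencies = []
        local_errors = 0
        while time.perf_counter() < deadline:
            start = time.perf_counter()
            try:
                operation()
            except Exception:
                local_errors += 1
                continue
            local_latencies.append(time.perf_counter() - start)
        # Merge once per worker to keep lock contention out of the hot loop.
        with lock:
            latencies.extend(local_latencies)
            errors += local_errors

    threads = [threading.Thread(target=worker) for _ in range(concurrency)]
    started = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    elapsed = time.perf_counter() - started

    cuts = statistics.quantiles(latencies, n=100, method="inclusive")
    return {
        "ops_per_s": len(latencies) / elapsed,
        "p50_ms": cuts[49] * 1000,
        "p95_ms": cuts[94] * 1000,
        "errors": errors,
    }


# Stand-in operation (assumption): a real run would issue a datastore request.
result = run_closed_loop(lambda: time.sleep(0.001), concurrency=4, duration_s=0.2)
print(result)
```

Python threads share the interpreter lock, so a client written this way can become the bottleneck before the datastore does; separate processes or async I/O are common alternatives.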

Redis and Postgres showed the strongest Apple wins in the normalized suite. ClickHouse also favored Apple, but by a smaller margin. Neo4j favored Apple as well, though again, it came from the earlier graph run with a different resource cap.

Startup and readiness#

Throughput was not the only thing I measured. I also tracked pull time, detached start-command time, service readiness, and full workload command duration.

Startup and readiness

Lifecycle timings collected alongside the workload runs.

Store	Runtime	Pull	Start command	Ready	Workload command
Redis	Colima	2.601 s	0.157 s	0.007 s	30.948 s
Redis	Apple	1.068 s	0.795 s	0.007 s	31.084 s
Postgres	Colima	1.945 s	0.145 s	1.056 s	31.610 s
Postgres	Apple	1.130 s	0.688 s	1.031 s	31.713 s
ClickHouse	Colima	1.937 s	0.125 s	4.096 s	30.346 s
ClickHouse	Apple	87.784 s	0.761 s	4.554 s	30.398 s
DuckDB	Colima	1.240 s	n/a	n/a	1.690 s
DuckDB	Apple	11.199 s	n/a	n/a	6.571 s

I do not treat pull time as the headline result. The pull timings include warm and partially warmed reruns, because I reran parts of the suite while fixing benchmark issues. The clearest cold-ish Apple pull/unpack measurement in this session was ClickHouse, which took 87.784 seconds.

The more consistent lifecycle observation was this:

Colima returned from detached docker run -d faster, in roughly 0.13 to 0.16 seconds.
Apple container run -d took roughly 0.7 to 0.8 seconds for these service containers.
Once the service process had started, readiness was similar for Redis, Postgres, and ClickHouse.
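
One way to measure a "Ready" time like the column above is to poll the service port after the start command returns. The sketch below is an assumption about how such a probe could look, not the check used in this run: `wait_until_ready` is a hypothetical helper, and a TCP accept is only a rough proxy for readiness (a protocol-level check such as a Redis PING or a trivial SQL query is stricter).

```python
import socket
import time


def wait_until_ready(host, port, timeout_s=30.0, interval_s=0.05):
    """Poll a TCP port until it accepts a connection; return seconds waited."""
    start = time.perf_counter()
    while True:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return time.perf_counter() - start
        except OSError:
            if time.perf_counter() - start > timeout_s:
                raise TimeoutError(f"{host}:{port} not ready after {timeout_s}s")
            time.sleep(interval_s)


# Self-contained demo: a local listening socket stands in for the service.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for a free port
listener.listen()
port = listener.getsockname()[1]

ready_after = wait_until_ready("127.0.0.1", port)
print(f"ready after {ready_after:.3f} s")
listener.close()
```

Timing the start command separately from this probe keeps the two lifecycle phases apart, which is what makes the "similar readiness, slower start command" observation visible.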

What broke#

The failures were useful. They showed the operational differences more clearly than the happy path.

CPU caps#

The first broad run requested 4 CPUs and failed under Colima:

range of CPUs is from 0.01 to 2.00, as there are only 2 CPUs available

I changed the suite to use 2 CPUs for both runtimes. That made the comparison fairer.

Apple `container` CPU argument parsing#

Apple container rejected --cpus 2.0:

The value '2.0' is invalid for '--cpus <cpus>'

Passing 2 fixed it.
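
A runner that targets both CLIs can avoid this by normalizing the CPU cap before building the command. The sketch below is illustrative only: `cpus_arg` and `start_command` are hypothetical helpers, and refusing every fractional value is a conservative assumption, since this run only showed that '2.0' was rejected.

```python
def cpus_arg(cpus):
    """Format a CPU cap as a whole-number string, for example 2.0 -> "2"."""
    value = float(cpus)
    if value < 1 or not value.is_integer():
        raise ValueError(f"CPU cap must be a positive whole number, got {cpus!r}")
    return str(int(value))


def start_command(runtime, image, cpus):
    """Build a detached run command for either CLI (hypothetical helper)."""
    binary = {"colima": "docker", "apple": "container"}[runtime]
    return [binary, "run", "-d", "--cpus", cpus_arg(cpus), image]


print(start_command("apple", "redis:7.4-alpine", 2.0))
# ['container', 'run', '-d', '--cpus', '2', 'redis:7.4-alpine']
print(start_command("colima", "redis:7.4-alpine", 2))
# ['docker', 'run', '-d', '--cpus', '2', 'redis:7.4-alpine']
```

Emitting the same integer string for both runtimes also keeps the resource cap identical across the comparison.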

Neo4j bind mounts#

Colima worked with direct host bind mounts for Neo4j data, logs, import, and plugins.

Apple container did not work with the same Neo4j bind-mounted setup. The official Neo4j image tried to change ownership of /logs and got:

Operation not permitted

The working Apple path used named volumes instead.

Postgres named volumes#

Postgres initially failed under Apple container:

initdb: error: directory "/var/lib/postgresql/data" exists but is not empty
initdb: detail: It contains a lost+found directory, perhaps due to it being a mount point.

Apple named volumes are ext4 images, so the root of a fresh volume contains a lost+found directory. Postgres refuses to initialize directly into a non-empty data directory.

The fix was to point PGDATA at a subdirectory of the mount:

PGDATA=/var/lib/postgresql/data/pgdata
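
The reason the subdirectory works can be shown with a small simulation. This is a simplified stand-in for initdb's directory check, not its actual logic (the real check distinguishes more cases); `initdb_would_accept` is a hypothetical name and the temporary directory plays the role of the mounted volume.

```python
import os
import tempfile


def initdb_would_accept(path):
    """Simplified rule: a missing or empty directory is acceptable."""
    return not os.path.exists(path) or not os.listdir(path)


with tempfile.TemporaryDirectory() as volume_root:
    # Simulate the root of a freshly formatted ext4 volume.
    os.mkdir(os.path.join(volume_root, "lost+found"))
    pgdata = os.path.join(volume_root, "pgdata")

    root_ok = initdb_would_accept(volume_root)  # lost+found makes it non-empty
    subdir_ok = initdb_would_accept(pgdata)     # subdirectory does not exist yet

print(root_ok, subdir_ok)
# False True
```

The mount point itself can never look empty on such a volume, while a child directory starts clean.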

ClickHouse auth#

ClickHouse required explicit credentials in this image. I set:

CLICKHOUSE_DB=bench
CLICKHOUSE_USER=bench
CLICKHOUSE_PASSWORD=bench
CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
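
These settings are environment variables read by the image at startup. A minimal sketch of turning them into command-line arguments follows; `env_flags` is a hypothetical helper, `-e KEY=VALUE` is the form docker run accepts, and using the same flag with Apple container is an assumption here.

```python
def env_flags(env):
    """Expand a mapping into repeated `-e KEY=VALUE` arguments."""
    flags = []
    for key, value in env.items():
        flags += ["-e", f"{key}={value}"]
    return flags


clickhouse_env = {
    "CLICKHOUSE_DB": "bench",
    "CLICKHOUSE_USER": "bench",
    "CLICKHOUSE_PASSWORD": "bench",
    "CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT": "1",
}

print(env_flags(clickhouse_env))
```

Keeping the values in one mapping makes it easy to hand the same credentials to the benchmark client.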

DuckDB invocation#

The DuckDB image needed the binary invoked explicitly:

duckdb /workspace/bench.duckdb

That was a benchmark-runner fix, not really a runtime finding.

What Apple `container` did well#

Apple’s strongest result was service throughput.

Redis, Postgres, ClickHouse, and Neo4j all had higher throughput under Apple container in this run. The advantage was largest for Redis, Postgres, and Neo4j. ClickHouse was closer but still favored Apple.

That makes Apple container interesting for local development workflows where:

the service runs for a while
the client connects over localhost TCP
the image works cleanly with Apple’s volume model
Docker CLI compatibility is not the main requirement

What Colima did well#

Colima was smoother.

It used the normal Docker CLI. It was easier to script. Detached service starts returned faster. Bind mounts behaved more like I expected from Docker-shaped workflows.

And DuckDB was not close: Colima ran the short-lived embedded analytics job roughly 4x faster, on both the query batches and the full workload command.

That makes Colima still attractive for:

Docker-compatible local workflows
tools that expect Docker behavior
scripts built around docker run
mounted-file workloads
short-lived containerized commands

Colima being boring is a feature.

What surprised me#

The interesting result is not that one runtime is faster than the other. The interesting result is that workload shape changed the answer.

If I had only tested Redis and Postgres, Apple container would look like the obvious choice.

If I had only tested DuckDB, Colima would look like the obvious choice.

If I had only tested Neo4j, I would have seen Apple’s graph-query throughput and missed the volume-model friction.

Testing several datastore shapes made the split clearer:

long-running services favored Apple
short-lived file-backed analytics favored Colima
operational simplicity favored Colima
service throughput favored Apple

That is the result I trust most from this session.

How I would use this today#

For my own local development, this does not replace Colima outright.

Colima remains my default compatibility runtime because it maps cleanly to Docker workflows and existing tooling.

Apple container is the runtime I would test selectively for long-running local services where:

the image is known to work
the volume setup is understood
the service benefits from the throughput profile
Docker CLI compatibility is not required

For the graph-database case specifically, both paths make sense:

Colima for the smoother Docker-compatible path
Apple container for faster query throughput, using named volumes by default

Caveats#

These are personal local tests on one Apple M4 machine with 16 GiB RAM.

This is not a production benchmark.

This is not a Docker Desktop benchmark.

This does not measure multi-day reliability, Compose workflows, Kubernetes behavior, backup/restore, memory pressure under larger datasets, or production durability.

Some pull timings were warm or partially warmed by reruns, so pull time is not the main result.

ClickHouse and DuckDB used latest images in this pass. That is fine for this personal test, but not ideal for a fully reproducible benchmark suite.

The Neo4j result was carried forward from the earlier graph-database run and used a different resource cap than the later Redis/Postgres/ClickHouse/DuckDB suite.

FAQ#

Is this Docker Desktop vs Apple `container`?#

No. Docker Desktop was not installed on this machine and was not part of the benchmark.

Did Apple `container` win?#

For long-running service workloads in this run, yes, Apple container had higher throughput.

For the short-lived DuckDB embedded workload, no. Colima was much faster.

For operational simplicity, Colima was smoother.

That is why I do not reduce the result to a single winner.

Why use 2 CPUs?#

Because my Colima VM was capped at 2 CPUs. The first broad-suite run failed when the runner requested 4 CPUs. I changed the suite to use 2 CPUs for both runtimes so Apple would not get a higher CPU cap than Colima.

Why is Neo4j treated differently?#

Neo4j was the original benchmark that started the session. It was run before the later 2 CPU normalization. I kept it in the article because it is a useful graph-database data point, but I keep it separate from the normalized datastore suite.

Why was DuckDB so different?#

DuckDB was measured as an embedded CLI workload inside short-lived containers using a mounted workspace. Redis, Postgres, ClickHouse, and Neo4j were long-running services over localhost TCP. Different shape, different result.

Would larger datasets change the result?#

Possibly. Larger datasets, longer runs, heavier write pressure, different volume modes, and memory pressure could all change the shape.

Final read#

Apple container looks genuinely strong for long-running local datastore services on Apple Silicon. In my tests, Redis, Postgres, ClickHouse, and Neo4j all had better throughput under Apple container.

Colima remains the easier operational baseline. It is Docker-compatible, predictable, and better for the short-lived DuckDB embedded workload I tested.

So I do not frame this as a replacement story.

I frame it this way:

Apple container is now worth testing seriously for local services. Colima is still the compatibility baseline. The right answer depends on the workload.

That is a more useful result than a single winner.

References#

Apple container: https://github.com/apple/container
Apple Open Source container: https://opensource.apple.com/projects/container
Colima: https://github.com/abiosoft/colima
Redis image: https://hub.docker.com/_/redis
Postgres image: https://hub.docker.com/_/postgres
ClickHouse image: https://hub.docker.com/r/clickhouse/clickhouse-server
DuckDB image: https://hub.docker.com/r/duckdb/duckdb
Neo4j image: https://hub.docker.com/_/neo4j

Disclosure#

This article was written with assistance from ChatGPT/Codex based on my local benchmark session, commands, outputs, and review direction.