Community performance

Real inference numbers from real Macs. The headline benchmarks on the homepage are measured on one M3 Ultra — this page is how everyone else fills in their own row. Submitted by users via rapid-mlx bench <alias> --submit, aggregated by (chip × model × Rapid-MLX version), median across submissions, IQR shown for groups with multiple rows.

Loading aggregate…
Chip Model Version Short decode tok/s Long decode tok/s Long TTFT (ms) Rows
Loading aggregate…

Submit yours

Got an M2 Air, M4 Max, M3 Ultra with different RAM? Run the standardized bench and open a PR — the schema validator + maintainer review run on GitHub. No data leaves your machine without an explicit y/N confirmation, and the runner only reads non-privileged macOS interfaces.

$ rapid-mlx bench <alias> --submit