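Conclusion first: it runs, but not for long stretches. The main problem is cooling, and an external fan stand doesn't really solve it. Under heavy load the temperature climbs quickly, and sustained high temperatures make the machine throttle. If you want portability plus productivity, I'd recommend going with a MacBook Pro.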
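I installed two platforms, ollama and oMLX. In my testing oMLX was somewhat faster. Given the machine's 32 GB of memory, the models you run should not exceed 22 GB.
Below are the download sizes of some mainstream models and the oMLX benchmark results, for reference.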
Qwen3.5-4B-MLX-4bit 2.85GB
gemma-4-26b-a4b-it-4bit 14.57GB
Qwen3.6-35B-A3B-4bit 15.13GB
GLM-4.7-Flash-4bit 15.71GB
gpt-oss-20b-MXFP4-Q8 11.27GB
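To make the 22 GB rule of thumb concrete, here is a minimal sketch that checks the download sizes listed above against that ceiling. The names `MODEL_SIZES_GB`, `MAX_MODEL_GB`, `fits` and `headroom_gb` are made up for illustration, and download size is only a rough proxy for runtime memory, so treat the result as a first filter rather than a guarantee.

```python
# Download sizes in GB, copied from the list above.
MODEL_SIZES_GB = {
    "Qwen3.5-4B-MLX-4bit": 2.85,
    "gemma-4-26b-a4b-it-4bit": 14.57,
    "Qwen3.6-35B-A3B-4bit": 15.13,
    "GLM-4.7-Flash-4bit": 15.71,
    "gpt-oss-20b-MXFP4-Q8": 11.27,
}

# Ceiling suggested in the post for a 32 GB machine (not an official limit).
MAX_MODEL_GB = 22.0


def fits(size_gb: float, limit_gb: float = MAX_MODEL_GB) -> bool:
    """Return True when the download size does not exceed the ceiling."""
    return size_gb <= limit_gb


def headroom_gb(size_gb: float, limit_gb: float = MAX_MODEL_GB) -> float:
    """Return how many GB remain under the ceiling (negative if over)."""
    return limit_gb - size_gb


if __name__ == "__main__":
    # Print the models from smallest to largest with their remaining headroom.
    for name, size in sorted(MODEL_SIZES_GB.items(), key=lambda kv: kv[1]):
        status = "ok" if fits(size) else "too large"
        print(f"{name:<26} {size:>6.2f} GB  headroom {headroom_gb(size):>6.2f} GB  {status}")
```

All five models listed here come in under the ceiling; the check is mainly useful when you are eyeing something larger.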
oMLX - LLM inference, optimized for your Mac

Benchmark Model: Qwen3.5-4B-MLX-4bit
================================================================================
Single Request Results
--------------------------------------------------------------------------------
Test          TTFT(ms)  TPOT(ms)  pp TPS        tg TPS      E2E(s)  Throughput   Peak Mem
pp1024/tg128  1001.6    22.74     1022.4 tok/s  44.3 tok/s  3.889   296.2 tok/s  3.29 GB
pp4096/tg128  3540.9    23.76     1156.8 tok/s  42.4 tok/s  6.558   644.1 tok/s  3.90 GB

Continuous Batching pp1024 / tg128
--------------------------------------------------------------------------------
Batch  tg TPS       Speedup  pp TPS        pp TPS/req    TTFT(ms)  E2E(s)
1x     44.3 tok/s   1.00x    1022.4 tok/s  1022.4 tok/s  1001.6    3.889
2x     88.3 tok/s   1.99x    407.6 tok/s   203.8 tok/s   3040.1    7.924
4x     175.1 tok/s  3.95x    322.7 tok/s   80.7 tok/s    6833.9    15.617

Benchmark Model: gemma-4-26b-a4b-it-4bit
================================================================================
Single Request Results
--------------------------------------------------------------------------------
Test          TTFT(ms)  TPOT(ms)  pp TPS        tg TPS      E2E(s)  Throughput   Peak Mem
pp1024/tg128  1500.5    24.21     682.4 tok/s   41.6 tok/s  4.575   251.8 tok/s  14.23 GB
pp4096/tg128  4863.4    25.14     842.2 tok/s   40.1 tok/s  8.056   524.3 tok/s  14.91 GB

Continuous Batching pp1024 / tg128
--------------------------------------------------------------------------------
Batch  tg TPS       Speedup  pp TPS        pp TPS/req    TTFT(ms)  E2E(s)
1x     41.6 tok/s   1.00x    682.4 tok/s   682.4 tok/s   1500.5    4.575
2x     82.5 tok/s   1.98x    361.6 tok/s   180.8 tok/s   3495.8    8.767
4x     166.1 tok/s  3.99x    283.4 tok/s   70.8 tok/s    7840.6    17.536

Benchmark Model: Qwen3.6-35B-A3B-4bit
================================================================================
Single Request Results
--------------------------------------------------------------------------------
Test          TTFT(ms)  TPOT(ms)  pp TPS        tg TPS      E2E(s)  Throughput   Peak Mem
pp1024/tg128  1676.1    17.20     610.9 tok/s   58.6 tok/s  3.860   298.4 tok/s  18.80 GB
pp4096/tg128  5046.3    17.93     811.7 tok/s   56.2 tok/s  7.323   576.8 tok/s  19.24 GB

Continuous Batching pp1024 / tg128
--------------------------------------------------------------------------------
Batch  tg TPS       Speedup  pp TPS        pp TPS/req    TTFT(ms)  E2E(s)
1x     58.6 tok/s   1.00x    610.9 tok/s   610.9 tok/s   1676.1    3.860
2x     116.2 tok/s  1.98x    435.5 tok/s   217.8 tok/s   2973.7    6.907
4x     230.7 tok/s  3.94x    352.0 tok/s   88.0 tok/s    6445.2    13.855

Benchmark Model: GLM-4.7-Flash-4bit
================================================================================
Single Request Results
--------------------------------------------------------------------------------
Test          TTFT(ms)  TPOT(ms)  pp TPS        tg TPS      E2E(s)  Throughput   Peak Mem
pp1024/tg128  1985.0    21.78     515.9 tok/s   46.3 tok/s  4.752   242.4 tok/s  16.27 GB
pp4096/tg128  6839.2    27.31     598.9 tok/s   36.9 tok/s  10.307  409.8 tok/s  17.34 GB

Continuous Batching pp1024 / tg128
--------------------------------------------------------------------------------
Batch  tg TPS       Speedup  pp TPS        pp TPS/req    TTFT(ms)  E2E(s)
1x     46.3 tok/s   1.00x    515.9 tok/s   515.9 tok/s   1985.0    4.752
2x     91.5 tok/s   1.98x    362.7 tok/s   181.3 tok/s   3549.9    8.445
4x     174.9 tok/s  3.78x    321.2 tok/s   80.3 tok/s    6393.9    15.679

Benchmark Model: gpt-oss-20b-MXFP4-Q8
================================================================================
Single Request Results
--------------------------------------------------------------------------------
Test          TTFT(ms)  TPOT(ms)  pp TPS        tg TPS      E2E(s)  Throughput   Peak Mem
pp1024/tg128  1687.6    24.70     606.8 tok/s   40.8 tok/s  4.824   238.8 tok/s  11.67 GB
pp4096/tg128  4088.8    26.44     1001.8 tok/s  38.1 tok/s  7.446   567.3 tok/s  11.75 GB

Continuous Batching pp1024 / tg128
--------------------------------------------------------------------------------
Batch  tg TPS       Speedup  pp TPS        pp TPS/req    TTFT(ms)  E2E(s)
1x     40.8 tok/s   1.00x    606.8 tok/s   606.8 tok/s   1687.6    4.824
2x     82.1 tok/s   2.01x    359.0 tok/s   179.5 tok/s   3489.1    8.822
4x     159.5 tok/s  3.91x    293.2 tok/s   73.3 tok/s    7335.0    17.180
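For anyone wondering how the columns in the single-request results relate to each other, the sketch below rebuilds pp TPS, tg TPS, E2E and Throughput from just TTFT and TPOT. The formulas are inferred from the numbers in this post, not taken from oMLX documentation, so treat them as an assumption: pp TPS looks like prompt tokens divided by TTFT, decode time looks like (generated tokens - 1) × TPOT, and Throughput looks like total tokens divided by E2E. The function name `derive_metrics` is made up for illustration.

```python
def derive_metrics(prompt_tokens: int, gen_tokens: int,
                   ttft_ms: float, tpot_ms: float) -> dict:
    """Rebuild the derived benchmark columns from TTFT and TPOT.

    Assumed relationships (inferred from the reported figures):
      pp TPS     = prompt_tokens / TTFT
      decode     = (gen_tokens - 1) * TPOT   # first token is covered by TTFT
      tg TPS     = gen_tokens / decode
      E2E        = TTFT + decode
      Throughput = (prompt_tokens + gen_tokens) / E2E
    """
    ttft_s = ttft_ms / 1000.0
    decode_s = (gen_tokens - 1) * tpot_ms / 1000.0
    e2e_s = ttft_s + decode_s
    return {
        "pp_tps": prompt_tokens / ttft_s,
        "tg_tps": gen_tokens / decode_s,
        "e2e_s": e2e_s,
        "throughput_tps": (prompt_tokens + gen_tokens) / e2e_s,
    }


if __name__ == "__main__":
    # Qwen3.5-4B-MLX-4bit, pp1024/tg128 row: TTFT 1001.6 ms, TPOT 22.74 ms.
    metrics = derive_metrics(1024, 128, ttft_ms=1001.6, tpot_ms=22.74)
    for key, value in metrics.items():
        print(f"{key}: {value:.3f}")
```

With the Qwen3.5-4B-MLX-4bit pp1024/tg128 inputs, the results land within rounding of the reported 1022.4 tok/s, 44.3 tok/s, 3.889 s and 296.2 tok/s. It also shows why Throughput is so much higher than tg TPS: it counts the prompt tokens too.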