If any V2EX members have the new MBP and are interested, could you run the numpy/scipy benchmark?

Test script: https://gist.github.com/markus-beuckelmann/8bc25531b11158431a5b09a45abd6276
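For anyone who does not want to open the gist first, here is a rough sketch of the kind of measurement behind a line like "Dotted two 4096x4096 matrices in 0.77 s." It is not the gist's actual code: the helper names (`average_seconds`, `format_dot_line`, `dot_benchmark`) and the choice to average over three repeats are assumptions made for illustration. The demo at the bottom uses a smaller 1024x1024 size so it finishes quickly; the numbers posted in this thread use 4096.

```python
import time


def average_seconds(fn, repeats=3):
    """Call fn() `repeats` times and return the mean wall-clock time per call."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def format_dot_line(size, seconds):
    """Format a result in the same shape as the lines posted in this thread."""
    return "Dotted two %dx%d matrices in %0.2f s." % (size, size, seconds)


def dot_benchmark(size=4096, repeats=3):
    """Time np.dot on two random size x size float64 matrices."""
    import numpy as np  # imported here so the helpers above work without numpy

    rng = np.random.default_rng(0)
    a = rng.random((size, size))
    b = rng.random((size, size))
    return average_seconds(lambda: np.dot(a, b), repeats)


if __name__ == "__main__":
    try:
        # Smaller than the 4096 used in the thread, to keep the demo short.
        print(format_dot_line(1024, dot_benchmark(size=1024)))
    except ImportError:
        print("numpy is not installed; skipping the matrix benchmark")
```

Because `np.dot` on large matrices is handed to whatever BLAS library NumPy was built against, a timing like this mostly measures that library rather than Python itself.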

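I'm curious how much this generation's M1 Pro / M1 Max improves on Python scientific computing. In earlier tests by V2EX members, the previous-generation M1, power consumption aside, roughly traded wins with an i5: https://v2ex.com/t/733777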

haogefeifei

2021-11-05 11:22:11 +08:00

For pure multi-core computation it probably doesn't gain much of an edge, but for actual use it is perfectly fine.

M1:
Dotted two 4096x4096 matrices in 0.77 s.
Dotted two vectors of length 524288 in 0.27 ms.
SVD of a 2048x1024 matrix in 0.90 s.
Cholesky decomposition of a 2048x2048 matrix in 0.11 s.
Eigendecomposition of a 2048x2048 matrix in 7.55 s.

AMD 3700X 4.1 GHz (virtual machine):
Dotted two 4096x4096 matrices in 0.44 s.
Dotted two vectors of length 524288 in 0.03 ms.
SVD of a 2048x1024 matrix in 0.58 s.
Cholesky decomposition of a 2048x2048 matrix in 0.10 s.
Eigendecomposition of a 2048x2048 matrix in 6.16 s.
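To compare two pasted result blocks without eyeballing them, something like the sketch below could work. It parses the output lines in the format shown above and prints the per-task time ratio. The function names (`parse_results`, `ratios`) are made up for this example, and the input is just the two blocks from this post; keep in mind the 3700X numbers come from a virtual machine.

```python
import re

# Matches lines such as "SVD of a 2048x1024 matrix in 0.90 s."
LINE = re.compile(r"^(?P<task>.+?) in (?P<value>[\d.]+) (?P<unit>ms|s)\.$")

M1_TEXT = """
Dotted two 4096x4096 matrices in 0.77 s.
Dotted two vectors of length 524288 in 0.27 ms.
SVD of a 2048x1024 matrix in 0.90 s.
Cholesky decomposition of a 2048x2048 matrix in 0.11 s.
Eigendecomposition of a 2048x2048 matrix in 7.55 s.
"""

VM_3700X_TEXT = """
Dotted two 4096x4096 matrices in 0.44 s.
Dotted two vectors of length 524288 in 0.03 ms.
SVD of a 2048x1024 matrix in 0.58 s.
Cholesky decomposition of a 2048x2048 matrix in 0.10 s.
Eigendecomposition of a 2048x2048 matrix in 6.16 s.
"""


def parse_results(text):
    """Return {task description: time in seconds} for every matching line."""
    results = {}
    for line in text.strip().splitlines():
        match = LINE.match(line.strip())
        if match:
            scale = 1e-3 if match["unit"] == "ms" else 1.0
            results[match["task"]] = float(match["value"]) * scale
    return results


def ratios(a, b):
    """Time of a divided by time of b, for tasks present in both (above 1 means a is slower)."""
    return {task: a[task] / b[task] for task in a if task in b}


if __name__ == "__main__":
    m1 = parse_results(M1_TEXT)
    vm = parse_results(VM_3700X_TEXT)
    for task, ratio in ratios(m1, vm).items():
        print(f"{task}: M1 takes {ratio:.2f}x the time of the 3700X VM")
```

Each result is a single run with two-digit rounding, so small differences between machines are probably within noise.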

dejavuwind

2021-11-05 11:22:24 +08:00

What would be a suitable value for size? I'll try it on a 10-core M1 Pro.

dejavuwind

2021-11-05 11:29:25 +08:00

Ran it as-is on the M1 Pro.
It doesn't look that great.

Dotted two 4096x4096 matrices in 0.67 s.
Dotted two vectors of length 524288 in 0.26 ms.
SVD of a 2048x1024 matrix in 1.04 s.
Cholesky decomposition of a 2048x2048 matrix in 0.09 s.
Eigendecomposition of a 2048x2048 matrix in 9.24 s.

wilhexm

2021-11-05 11:39:42 +08:00

16 inch M1 Max 24 Core GPU

Dotted two 4096x4096 matrices in 0.55 s.
Dotted two vectors of length 524288 in 0.25 ms.
SVD of a 2048x1024 matrix in 1.32 s.
Cholesky decomposition of a 2048x2048 matrix in 0.08 s.
Eigendecomposition of a 2048x2048 matrix in 6.79 s.

pb941129

2021-11-05 11:41:06 +08:00

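I posted scores for the 16-inch i9 in the earlier thread. I just reran it on Monterey and the speed is basically the same. Judging from the M1 Pro numbers above, it feels like the M1 Pro still can't do much if you want it for Python scientific computing...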

Aspector

2021-11-05 11:44:20 +08:00

Deprecated since version 1.20: The native libraries on macOS, provided by Accelerate, are not fit for use in NumPy since they have bugs that cause wrong output under easily reproducible conditions. If the vendor fixes those bugs, the library could be reinstated, but until then users compiling for themselves should use another linear algebra library or use the built-in (but slower) default, see the next section.

Does the current numpy use Accelerate? Has Apple just not dealt with these bugs?

icyalala

2021-11-05 12:08:36 +08:00

Not using Accelerate on M1 is like not using AVX2 on Intel.

EyreYoung

2021-11-05 12:14:32 +08:00

2018 model, i7-8750 (I think that's the one)
Dotted two 2048x2048 matrices in 0.07 s.
Dotted two vectors of length 262144 in 0.02 ms.
SVD of a 1024x512 matrix in 0.05 s.
Cholesky decomposition of a 1024x1024 matrix in 0.01 s.
Eigendecomposition of a 1024x1024 matrix in 0.63 s.

Dotted two 4096x4096 matrices in 0.63 s.
Dotted two vectors of length 524288 in 0.10 ms.
SVD of a 2048x1024 matrix in 0.35 s.
Cholesky decomposition of a 2048x2048 matrix in 0.09 s.
Eigendecomposition of a 2048x2048 matrix in 4.15 s.

boboliu

2021-11-05 12:42:33 +08:00

@Aspector #6
@icyalala #7

https://github.com/numpy/numpy/pull/18874

> This pull request is to add support for Accelerate back to NumPy

dbsquirrel

2021-11-05 12:46:25 +08:00

Dotted two 4096x4096 matrices in 1.85 s.
Dotted two vectors of length 524288 in 0.24 ms.
SVD of a 2048x1024 matrix in 0.68 s.
Cholesky decomposition of a 2048x2048 matrix in 0.15 s.
Eigendecomposition of a 2048x2048 matrix in 5.75 s.

The fans spun up to full blast right away. MBP 2016 (2.9 GHz i5)

Aspector

2021-11-05 12:54:55 +08:00

@boboliu So is it actually in use now? That commit is from this spring, so why do the M1 results in these two V2EX threads show no change?

0Vincent0Zhang0

2021-11-05 13:02:46 +08:00

Current results on an M1 Max with 64 GB:

Dotted two 4096x4096 matrices in 0.70 s.
Dotted two vectors of length 524288 in 0.25 ms.
SVD of a 2048x1024 matrix in 1.99 s.
Cholesky decomposition of a 2048x2048 matrix in 0.10 s.
Eigendecomposition of a 2048x2048 matrix in 10.36 s.

Still needs optimization.

dejavuwind

2021-11-05 13:24:26 +08:00

It seems to depend somewhat on the environment.
Do the two NOT_AVAILABLE entries affect the results? @astrophys @Aspector
Dotted two 4096x4096 matrices in 0.65 s.
Dotted two vectors of length 524288 in 0.26 ms.
SVD of a 2048x1024 matrix in 0.93 s.
Cholesky decomposition of a 2048x2048 matrix in 0.09 s.
Eigendecomposition of a 2048x2048 matrix in 9.90 s.

This was obtained using the following Numpy configuration:
blas_mkl_info:
NOT AVAILABLE
blis_info:
NOT AVAILABLE
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/opt/arm64-builds/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
runtime_library_dirs = ['/opt/arm64-builds/lib']
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/opt/arm64-builds/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
runtime_library_dirs = ['/opt/arm64-builds/lib']
lapack_mkl_info:
NOT AVAILABLE
openblas_lapack_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/opt/arm64-builds/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
runtime_library_dirs = ['/opt/arm64-builds/lib']
lapack_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/opt/arm64-builds/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
runtime_library_dirs = ['/opt/arm64-builds/lib']
Supported SIMD extensions in this NumPy install:
baseline = NEON,NEON_FP16,NEON_VFPV4,ASIMD
found = ASIMDHP
not found = ASIMDDP
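On the NOT AVAILABLE question: a dump like the one above is what `np.show_config()` prints, and it lists which BLAS/LAPACK libraries the NumPy build found. The sketch below reads such a dump and reports which sections are available and which libraries they name. The parser (`parse_config_dump`, `linked_libraries`) is a made-up helper tied to the flattened layout pasted in this post, and the sample text is only an excerpt of it; other NumPy versions may print this information in a different layout, in which case this would need adjusting.

```python
import ast
import re

# Section headers in the pasted dump look like "openblas_info:".
SECTION = re.compile(r"^(\w+_info):$")

SAMPLE_DUMP = """
blas_mkl_info:
NOT AVAILABLE
blis_info:
NOT AVAILABLE
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/opt/arm64-builds/lib']
language = c
lapack_mkl_info:
NOT AVAILABLE
Supported SIMD extensions in this NumPy install:
baseline = NEON,NEON_FP16,NEON_VFPV4,ASIMD
"""


def parse_config_dump(text):
    """Map each *_info section to a dict of its raw values, or None if NOT AVAILABLE."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = SECTION.match(line)
        if header:
            current = header.group(1)
            sections[current] = {}
        elif line.endswith(":"):
            current = None  # some other heading, e.g. the SIMD summary
        elif current is None or not line:
            continue
        elif line == "NOT AVAILABLE":
            sections[current] = None
        elif " = " in line and sections[current] is not None:
            key, value = line.split(" = ", 1)
            sections[current][key] = value
    return sections


def linked_libraries(sections):
    """Sorted unique library names from every available section."""
    names = set()
    for info in sections.values():
        if info and "libraries" in info:
            names.update(ast.literal_eval(info["libraries"]))
    return sorted(names)


if __name__ == "__main__":
    parsed = parse_config_dump(SAMPLE_DUMP)
    missing = [name for name, info in parsed.items() if info is None]
    print("not available:", missing)
    print("linked libraries:", linked_libraries(parsed))
```

Read this way, the dump suggests this build is linked against OpenBLAS, and the NOT AVAILABLE entries only say that MKL and BLIS were not found. That by itself should not make the run invalid, though it does mean the numbers are OpenBLAS numbers rather than MKL or Accelerate ones.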

astrophys

2021-11-05 13:26:52 +08:00

Here are the results from a 2019 16-inch i9 with 64 GB:

Dotted two 4096x4096 matrices in 0.45 s.
Dotted two vectors of length 524288 in 0.05 ms.
SVD of a 2048x1024 matrix in 0.29 s.
Cholesky decomposition of a 2048x2048 matrix in 0.07 s.
Eigendecomposition of a 2048x2048 matrix in 3.23 s.

astrophys

2021-11-05 13:29:40 +08:00

@dejavuwind Using MKL and multithreading will definitely be faster. The results I posted were with MKL.

tiramice

2021-11-05 18:21:44 +08:00

W-2175, virtual machine, 8 cores
Dotted two 4096x4096 matrices in 0.29 s.
Dotted two vectors of length 524288 in 0.03 ms.
SVD of a 2048x1024 matrix in 0.50 s.
Cholesky decomposition of a 2048x2048 matrix in 0.12 s.
Eigendecomposition of a 2048x2048 matrix in 4.47 s.

astrophys

2021-11-05 18:34:59 +08:00

@Aspector NumPy removed support for the Accelerate framework in version 1.20.0. Someone happened to ask about exactly this today: https://stackoverflow.com/questions/69848969/how-to-build-numpy-from-source-linked-to-apple-accelerate-framework#

sharpy

2021-11-05 19:15:41 +08:00

16-inch i9
Dotted two 4096x4096 matrices in 0.41 s.
Dotted two vectors of length 524288 in 0.04 ms.
SVD of a 2048x1024 matrix in 0.28 s.
Cholesky decomposition of a 2048x2048 matrix in 0.07 s.
Eigendecomposition of a 2048x2048 matrix in 2.89 s.

volvo007

2021-11-05 20:58:39 +08:00

2020 13-inch MBP, Intel, top configuration

Dotted two 4096x4096 matrices in 0.98 s.
Dotted two vectors of length 524288 in 0.20 ms.
SVD of a 2048x1024 matrix in 0.49 s.
Cholesky decomposition of a 2048x2048 matrix in 0.11 s.
Eigendecomposition of a 2048x2048 matrix in 4.16 s.

cxxlxx

2021-11-05 23:46:49 +08:00

@haogefeifei Why is my 5900X so much worse than yours, on both WSL and Windows?
Dotted two 4096x4096 matrices in 0.39 s.
Dotted two vectors of length 524288 in 0.14 ms.
SVD of a 2048x1024 matrix in 1.34 s.
Cholesky decomposition of a 2048x2048 matrix in 0.08 s.
Eigendecomposition of a 2048x2048 matrix in 4.80 s.

https://www.v2ex.com/t/813232
