2013 MBP with an external RX 580 GPU, running Keras (TensorFlow) + PlaidML

MacBook Pro specs

My main machine is an early-2013 15-inch MBP:

2.7 GHz Intel Core i7
Thunderbolt 1 ports
Intel HD 4000 integrated graphics
GT 650M discrete GPU
macOS Mojave 10.14.6

In short, it is a fairly old machine. The discrete GPU is weak, and under load the fans spin fast and the machine runs very hot.

eGPU setup

Sapphire RX 580 8GB
Razer Core X enclosure
USB-C to Thunderbolt 1/2 adapter
Thunderbolt 1/2 extension cable

To connect an eGPU to this generation of MBP, I need the following:

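purge-wrangler.sh, which lets older machines run an eGPU
purge-nvda.sh, one of the scripts needed to disable the discrete GPU
A modified Ubuntu GNU grub.cfg boot file, another step needed to disable the discrete GPU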

For the full walkthrough, see this eGPU.io thread.

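This setup caps the RX 580's data transfer at Thunderbolt 1 speeds, which is a heavy penalty. Even so, it has no trouble driving an LG 4K60p monitor (connected directly to the enclosure).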

PlaidML

To use PlaidML, I installed the following in a Python 3.8.6 virtual environment created with pyenv:

ipykernel
h5py<3.0.0 (type it quoted, as "h5py<3.0.0"; this actually installs 2.10.0, the version tensorflow requires)
plaidml-keras (only available on PyPI; keras is included in this package)
plaidbench (only available on PyPI)
tensorflow (installing plaidml-keras alone is not enough, so I had to install this too; I installed it after the packages above)

The notes in parentheses come straight from my personal wiki.
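A quick way to confirm that the virtual environment matches the list above is to ask Python which versions are installed. This is only a minimal sketch: the helper names installed_versions and h5py_is_pre_3 are made up for illustration, and it relies on importlib.metadata, which is in the standard library from Python 3.8.

```python
from importlib import metadata  # standard library in Python 3.8+

# Package names taken from the install list above.
PACKAGES = ["ipykernel", "h5py", "plaidml-keras", "plaidbench", "tensorflow"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def h5py_is_pre_3(version):
    """True if an h5py version string has a major version below 3."""
    return version is not None and int(version.split(".")[0]) < 3


if __name__ == "__main__":
    found = installed_versions(PACKAGES)
    for name, version in found.items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
    if not h5py_is_pre_3(found["h5py"]):
        print("warning: this setup expects h5py<3.0.0")
```

Run it with the interpreter inside the pyenv environment, otherwise it reports on whichever Python happens to be first on the PATH.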

VGG-19 benchmark results

On the CPU:

2020-12-05 15:00:42.457154: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fde5c7c9110 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-05 15:00:42.457188: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Running initial batch (compiling tile program)
Timing inference...
Ran in 35.34518790245056 seconds

On the RX 580 eGPU:

Using plaidml.keras.backend backend.
INFO:plaidml:Opening device "metal_amd_radeon_rx_580.0"
Running initial batch (compiling tile program)
Timing inference...
Ran in 2.5728609561920166 seconds
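To put the two "Ran in" figures side by side, the speedup is just the ratio of the two timings. The sketch below only does that arithmetic on the numbers from the logs above; the variable names are illustrative.

```python
# Timings copied from the "Ran in ... seconds" lines above.
cpu_seconds = 35.34518790245056
egpu_seconds = 2.5728609561920166

speedup = cpu_seconds / egpu_seconds
print(f"eGPU speedup over CPU: {speedup:.1f}x")  # prints 13.7x
```

This is a single run of each configuration, so treat the ratio as a rough indication rather than a careful measurement.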

I am not a machine learning professional. If anyone wants more detail on any of the steps above, I am happy to explain.

volvo007

2020-12-06 15:23:11 +08:00

I got sidetracked by something else, but I later found the relevant settings in the manual:
> export PLAIDML_EXPERIMENTAL=1
> export PLAIDML_DEVICE_IDS=opencl_intel_uhd_graphics_630.0

Setting these in your rc file is all it takes. The value after IDS is one of the device IDs that plaidml-setup lists.

So you can define a few aliases in your rc file and switch devices before running your code. Mine, for example:

alias tfcpu='export PLAIDML_EXPERIMENTAL=1 && export PLAIDML_DEVICE_IDS=llvm_cpu.0'

alias tfint='export PLAIDML_EXPERIMENTAL=1 && export PLAIDML_DEVICE_IDS="metal_intel(r)_iris(tm)_plus_graphics.0"'

alias tfgpu='export PLAIDML_EXPERIMENTAL=1 && export PLAIDML_DEVICE_IDS=metal_amd_radeon_rx_vega_64.0'
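The same switch can also be made from inside a script or notebook instead of the shell. The following is a sketch under a few assumptions: select_plaidml_device is a hypothetical helper, the device IDs are the ones from the aliases above (yours will differ, check plaidml-setup), and it also sets KERAS_BACKEND, which the aliases do not do. The variables have to be set before keras is imported.

```python
import os

# Device IDs copied from the aliases above; run plaidml-setup to find yours.
DEVICES = {
    "cpu": "llvm_cpu.0",
    "int": "metal_intel(r)_iris(tm)_plus_graphics.0",
    "gpu": "metal_amd_radeon_rx_vega_64.0",
}


def select_plaidml_device(name, environ=os.environ):
    """Set the variables from the aliases, plus KERAS_BACKEND (an addition)."""
    if name not in DEVICES:
        raise ValueError(f"unknown device {name!r}, expected one of {sorted(DEVICES)}")
    environ["PLAIDML_EXPERIMENTAL"] = "1"
    environ["PLAIDML_DEVICE_IDS"] = DEVICES[name]
    # Not part of the aliases: asks Keras to load the PlaidML backend.
    environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    return environ["PLAIDML_DEVICE_IDS"]


select_plaidml_device("gpu")
# import keras  # import only after the variables above are set
```

No shell quoting is involved here, so the parentheses in the Intel device ID need no special handling.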

Running the same plaidbench keras mobilenet benchmark command with each alias:

--------
tfcpu runs it on the CPU:
Example finished, elapsed: 2.923s (compile), 99.922s (execution)

tfint runs it on the integrated GPU:
Example finished, elapsed: 0.401s (compile), 18.213s (execution)

tfgpu runs it on the GPU (a Vega 56 flashed with the Vega 64 BIOS):
Example finished, elapsed: 0.413s (compile), 8.597s (execution)

The results are pretty good.
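For anyone who wants to compare the three runs numerically, the summary lines can be parsed and turned into speedups relative to the CPU run. This is a minimal sketch: parse_elapsed and the regular expression are illustrative and only assume the "Example finished, elapsed: ..." format shown above.

```python
import re

# Summary lines copied from the three runs above.
RESULTS = {
    "tfcpu": "Example finished, elapsed: 2.923s (compile), 99.922s (execution)",
    "tfint": "Example finished, elapsed: 0.401s (compile), 18.213s (execution)",
    "tfgpu": "Example finished, elapsed: 0.413s (compile), 8.597s (execution)",
}

PATTERN = re.compile(r"elapsed: ([\d.]+)s \(compile\), ([\d.]+)s \(execution\)")


def parse_elapsed(line):
    """Return (compile_seconds, execution_seconds) from one summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return float(match.group(1)), float(match.group(2))


# Execution-time speedup of each run relative to the CPU run.
baseline = parse_elapsed(RESULTS["tfcpu"])[1]
speedups = {name: baseline / parse_elapsed(line)[1] for name, line in RESULTS.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x vs CPU")
```

On the numbers above, the integrated GPU comes out at about 5.5x and the Vega at about 11.6x the CPU's execution time.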