AMD ROCm 3.8 Is Released
The latest AMD Radeon Open Compute graphics stack offers absolutely dismal OpenCL performance in the LuxMark benchmark compared to the Mesa Clover OpenCL library from the newly released Mesa 20.2.0 graphics stack. It does have one thing going for it: it supports OpenCL 2.0, something Mesa Clover does not.
ROCm 3.8 adds support for Vega Workstation (Vega20 GL-XE) cards. That's it in terms of new hardware support.
Fortran programmers may be happy to note that ROCm 3.8 includes a new Hipfort interface library for accessing GPU kernels from Fortran.
There is also a new Big Data tool AMD calls "The ROCm™ Data Center Tool™". Such Trademarks™, wow™. This new tool:
"simplifies the administration and addresses key infrastructure challenges in AMD GPUs in cluster and datacenter environments. The important features of this tool are:
- GPU telemetry
- GPU statistics for jobs
- Integration with third-party tools
- Open source"
The Data Center Tool™ is for the enterprise users who represent the big bucks AMD is hoping to eventually make from their GPU compute efforts.
There's also new support for building static ROCm libraries in this release. That's nice if you want to build some binary using OpenCL and deploy it without having to worry about installing ROCm on every single machine.
We have examined ROCm 3.8 performance using the LuxMark 3.1 benchmark suite and found it to be severely lacking in that particular benchmark.
| ROCm 3.7 | ROCm 3.8 | Mesa Clover 20.2 |
Mesa's Clover OpenCL driver from the newly released Mesa 20.2 library scores much higher in all the tests except for the LuxMark 3.1 "MICROPHONE" scene, where Mesa Clover fails and produces a complete and utter disaster - except, perhaps, for this very fine on-screen artwork:
The kernel ring buffer is filled with messages like these when it crashes:
[ 79.600454] amdgpu 0000:08:00.0: amdgpu: GPU fault detected: 147 0x07f80402 for process luxmark.bin pid 6282 thread luxmark.bi:cs0 pid 6283
[ 79.600458] amdgpu 0000:08:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x005000FF
[ 79.600460] amdgpu 0000:08:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x04004002
[ 79.600464] amdgpu 0000:08:00.0: amdgpu: VM fault (0x02, vmid 2, pasid 32769) at page 5243135, read from 'TC1' (0x54433100) (4)
[ 79.600522] amdgpu 0000:08:00.0: amdgpu: GPU fault detected: 147 0x0f4ac802 for process luxmark.bin pid 6282 thread luxmark.bi:cs0 pid 6283
[ 79.600524] amdgpu 0000:08:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x001E7DE9
[ 79.600526] amdgpu 0000:08:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x050C8002
We have filed a complaint with Mesa customer service regarding this issue.
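If you want to check whether your own machine is hitting the same kind of crash, the fault lines in the kernel ring buffer follow a regular enough pattern to be picked out with a small script. What follows is only a rough Python sketch: the regular expression is written against the handful of log lines quoted above, and the names find_vm_faults and SAMPLE are made up for this illustration. Other kernel versions may well format these messages differently.

```python
import re

# Pattern written against the "VM fault" lines quoted in this article.
# This is an assumption about the message format, not a stable kernel interface.
VM_FAULT = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+amdgpu\s+(?P<device>[0-9a-f:.]+):\s+amdgpu:\s+"
    r"VM fault \(0x[0-9a-fA-F]+, vmid (?P<vmid>\d+), pasid (?P<pasid>\d+)\) "
    r"at page (?P<page>\d+), (?P<access>read|write) from '(?P<client>[^']+)'"
)


def find_vm_faults(log_text):
    """Return one dict per amdgpu VM fault line found in log_text."""
    faults = []
    for m in VM_FAULT.finditer(log_text):
        faults.append({
            "time": float(m.group("time")),
            "device": m.group("device"),
            "vmid": int(m.group("vmid")),
            "pasid": int(m.group("pasid")),
            "page": int(m.group("page")),
            "access": m.group("access"),
            "client": m.group("client"),
        })
    return faults


# Two of the lines from the crash above, used as sample input.
SAMPLE = (
    "[ 79.600454] amdgpu 0000:08:00.0: amdgpu: GPU fault detected: 147 0x07f80402 "
    "for process luxmark.bin pid 6282 thread luxmark.bi:cs0 pid 6283\n"
    "[ 79.600464] amdgpu 0000:08:00.0: amdgpu: VM fault (0x02, vmid 2, pasid 32769) "
    "at page 5243135, read from 'TC1' (0x54433100) (4)\n"
)

if __name__ == "__main__":
    for fault in find_vm_faults(SAMPLE):
        print(fault)
```

To run it against a real system you would feed it the saved output of dmesg instead of SAMPLE. An empty result only means that no line matched this particular pattern, not that the GPU is healthy.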
ROCm 3.8 is way slower than Mesa Clover OpenCL in the tests Mesa Clover does manage to complete. It must be noted that neither of them passed LuxMark's "Image validation" of the images rendered during the benchmark, so both OpenCL implementations get a hard fail in that regard. Radeon Open Compute is the only one of the two that managed to complete all the LuxMark scenes without resulting in a major catastrophe, so it does have that going for it.
ROCm 3.8 is also the only one of the two with actual OpenCL 2.0 support. Mesa Clover 20.2.0 is limited to OpenCL 1.1 support, which means that it cannot be used for something as simple as OpenCL acceleration in LibreOffice. That's one application where Radeon Open Compute is useful for us little people who are not Big Data with Big Data-Centers where the new "The ROCm™ Data Center Tool™" is applicable.
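The version gap is easy to picture in code. The OpenCL specification requires a platform's version string to start with "OpenCL", a space, and major.minor, followed by vendor-specific text, so an application can compare that prefix against whatever minimum it needs. The Python sketch below shows the general idea only: the function names are invented for this example, the two sample strings merely imitate what the two stacks might report, and the 2.0 threshold is a placeholder rather than what LibreOffice actually checks.

```python
import re


def parse_opencl_version(version_string):
    """Extract (major, minor) from a string such as 'OpenCL 1.1 Mesa 20.2.0'."""
    m = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if m is None:
        raise ValueError(f"not an OpenCL version string: {version_string!r}")
    return int(m.group(1)), int(m.group(2))


def meets_minimum(version_string, required=(2, 0)):
    """True if the reported version is at least `required` (placeholder default)."""
    return parse_opencl_version(version_string) >= required


if __name__ == "__main__":
    # Illustrative strings only; the vendor suffixes are assumptions.
    for reported in ("OpenCL 1.1 Mesa 20.2.0", "OpenCL 2.0 AMD-APP"):
        print(reported, "->", meets_minimum(reported))
```

Comparing the versions as tuples of integers avoids the classic mistake of comparing them as strings or floats, which gets the ordering wrong once a minor version reaches two digits.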
You can find AMD's installation instructions for Radeon Open Compute at rocmdocs.amd.com /Installation_Guide/. Fedora 32 and 33 are not officially supported, but CentOS and RHEL are, and the packages for those seem to work just fine on Fedora 32 and 33. Distributions do not carry ROCm packages, so you will have to install them yourself if you want access to this technology.