HOWTO undervolt the AMD RX 4XX and RX 5XX GPUs

From LinuxReviews

You may or may not want to manually adjust the clock speeds on AMD GPUs. You can do it fairly easily if you are using the amdgpu driver on GNU/Linux by limiting the clock states the GPU is allowed to use. It used to be possible to adjust max clocks and voltages for each state, but that is no longer possible.

Requirements

There are two requirements: a kernel parameter and a single-monitor setup.

A "ppfeaturemask" Kernel Parameter Is Required

You need to add:

amdgpu.ppfeaturemask=0xfffd7fff

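to your kernel command line to enable "overdrive" so you can manually adjust the GPU clocks. This can be added to GRUB_CMDLINE_LINUX= in /etc/sysconfig/grub (/etc/default/grub on many other distributions). Regenerate the GRUB configuration and reboot for the change to take effect.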
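
After rebooting, it can be worth confirming that the parameter actually reached the kernel. The helper below is a minimal sketch of one way to check; the function name has_kernel_param is made up for this example, and the optional second argument only exists so the check can be pointed at a file other than /proc/cmdline.

```shell
#!/bin/sh
# has_kernel_param PARAM [FILE]
# Succeeds if PARAM appears as a whole word in the kernel command line.
# FILE defaults to /proc/cmdline; the function name is illustrative only.
has_kernel_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # Split the command line into one word per line and look for an exact match.
    tr ' ' '\n' < "$file" | grep -Fqx -- "$param"
}

if has_kernel_param "amdgpu.ppfeaturemask=0xfffd7fff"; then
    echo "ppfeaturemask is set"
else
    echo "ppfeaturemask is missing from the kernel command line"
fi
```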

It is also possible to use amdgpu.ppfeaturemask=0xffffffff but that higher setting causes artifacts on many cards (mainly RX 470/570). The actual meaning of the values you can set is explained in amd_shared.h:[1]

enum PP_FEATURE_MASK {
        PP_SCLK_DPM_MASK = 0x1,
        PP_MCLK_DPM_MASK = 0x2,
        PP_PCIE_DPM_MASK = 0x4,
        PP_SCLK_DEEP_SLEEP_MASK = 0x8,
        PP_POWER_CONTAINMENT_MASK = 0x10,
        PP_UVD_HANDSHAKE_MASK = 0x20,
        PP_SMC_VOLTAGE_CONTROL_MASK = 0x40,
        PP_VBI_TIME_SUPPORT_MASK = 0x80,
        PP_ULV_MASK = 0x100,
        PP_ENABLE_GFX_CG_THRU_SMU = 0x200,
        PP_CLOCK_STRETCH_MASK = 0x400,
        PP_OD_FUZZY_FAN_CONTROL_MASK = 0x800,
        PP_SOCCLK_DPM_MASK = 0x1000,
        PP_DCEFCLK_DPM_MASK = 0x2000,
        PP_OVERDRIVE_MASK = 0x4000,
        PP_GFXOFF_MASK = 0x8000,
        PP_ACG_MASK = 0x10000,
        PP_STUTTER_MODE = 0x20000,
        PP_AVFS_MASK = 0x40000,
};

amdgpu.ppfeaturemask=0xfffd7fff works if you want to adjust clocks and voltages. Booting with this will make a special file called /sys/class/drm/card0/device/pp_od_clk_voltage appear. You can write values to this file and then activate them.
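
The mask is just the bitwise OR of the enum values above, so shell arithmetic is enough to see what a given value enables. This is a small sketch; the function name mask_has is invented for the example.

```shell
#!/bin/sh
# mask_has MASK BITS: succeeds if every bit in BITS is set in MASK.
# Both arguments may be hexadecimal (0x...). The function name is illustrative.
mask_has() {
    [ $(( $1 & $2 )) -eq $(( $2 )) ]
}

MASK=0xfffd7fff

# PP_OVERDRIVE_MASK (0x4000) is the bit that enables overdrive.
mask_has "$MASK" 0x4000 && echo "PP_OVERDRIVE_MASK: set"

# Bits that 0xfffd7fff clears compared with 0xffffffff.
printf 'cleared: 0x%x\n' $(( 0xffffffff ^ $MASK ))
```

The second line prints cleared: 0x28000, which is 0x8000 plus 0x20000. In other words, 0xfffd7fff is "everything on" except PP_GFXOFF_MASK and PP_STUTTER_MODE. On a running system the value in effect can usually be read from /sys/module/amdgpu/parameters/ppfeaturemask, though we treat that path as an assumption here.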

A Single-Monitor Setup Is Required

amdgpu will not let you change settings on multi-monitor setups. That is very unfortunate but it is what it is. We had to disconnect all but one screen to write this guide.

Commands like echo "0 1 2 3 4 5" > /sys/class/drm/card0/device/pp_dpm_sclk return write error: Invalid argument if two or more screens are connected, while they work with just one (assuming you have the amdgpu.ppfeaturemask=0xfffd7fff kernel parameter and have written manual to /sys/class/drm/card0/device/power_dpm_force_performance_level).
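
A quick way to see how many screens the kernel thinks are attached is to count the connectors that report "connected". The sketch below assumes the connectors of the first card show up as card0-* directories with a status file under /sys/class/drm; count_connected is a name made up for this example.

```shell
#!/bin/sh
# count_connected [DRM_DIR]
# Prints how many connectors of card0 report "connected".
# DRM_DIR defaults to /sys/class/drm; the function name is illustrative.
count_connected() {
    drm_dir=${1:-/sys/class/drm}
    n=0
    for status in "$drm_dir"/card0-*/status; do
        # Skip the unexpanded pattern if no connector directories exist.
        [ -r "$status" ] || continue
        if [ "$(cat "$status")" = "connected" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

echo "Connected displays on card0: $(count_connected)"
```

If this reports more than one display, expect the write errors described above.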

The Quick And Easy Way To Manually "Undervolt" AMD GPUs

The first thing you need to do before you can change anything is to set /sys/class/drm/card0/device/power_dpm_force_performance_level to manual to enable manual control. You will get write error: Invalid argument errors when writing clock values if you don't.

Shell command:
echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level

You will need to run that command, and every other command that changes amdgpu module settings, as root (or using sudo).

The available power_dpm_force_performance_level settings other than manual are:

auto The driver chooses automatically
low Forces the lowest possible clock and locks the GPU there
high Forces the highest possible clock and locks the GPU there
profile_standard Sets the clocks to a fixed clock level which varies from ASIC to ASIC
profile_min_sclk Forces the sclk to the lowest level
profile_min_mclk Forces the mclk to the lowest level
profile_peak Sets all clocks (mclk, sclk, pcie) to the highest levels

The kernel documentation describes the four profile_ modes like this: "When the profiling modes are selected, clock and power gating are disabled and the clocks are set for different profiling cases. This mode is recommended for profiling specific work loads where you do not want clock or power gating for clock fluctuation to interfere with your results. profile_standard sets the clocks to a fixed clock level which varies from asic to asic. profile_min_sclk forces the sclk to the lowest level. profile_min_mclk forces the mclk to the lowest level. profile_peak sets all clocks (mclk, sclk, pcie) to the highest levels."[2]
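
Since a typo in the level name just produces a write error, a small wrapper that checks the name against the values listed above can save some head-scratching. This is only a sketch; is_valid_level and set_level are names we made up, and the optional file argument exists so the function can be tried against an ordinary file.

```shell
#!/bin/sh
# is_valid_level LEVEL: succeeds if LEVEL is one of the documented settings.
is_valid_level() {
    case "$1" in
        auto|low|high|manual|profile_standard|profile_min_sclk|profile_min_mclk|profile_peak)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# set_level LEVEL [FILE]: validate LEVEL, then write it to FILE (needs root
# when FILE is the real sysfs file, which is the default).
set_level() {
    level_file=${2:-/sys/class/drm/card0/device/power_dpm_force_performance_level}
    is_valid_level "$1" || { echo "unknown level: $1" >&2; return 1; }
    echo "$1" > "$level_file"
}
```

With those functions loaded, set_level manual (as root) does the same as the echo command above, and reading the file back with cat shows which level is currently active.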

Next, check what GPU and memory clock states are available:

HOWTO See Available GPU Clock And Memory States And Their Values

There are eight (0-7) GPU clock states and two (0-1) or three (0-2) memory states on AMD RX 4xx and 5xx GPUs. This will be different on Vega and Navi GPUs.

You can check what GPU clock states are available with:

Shell command:
cat /sys/class/drm/card0/device/pp_od_clk_voltage

This will show a list like:

OD_SCLK:
0:        300MHz        800mV
1:        466MHz        818mV
2:        751MHz        824mV
3:       1019MHz        950mV
4:       1074MHz       1000mV
5:       1126MHz       1050mV
6:       1169MHz       1093mV
7:       1242MHz       1150mV
OD_MCLK:
0:        300MHz        800mV
1:       1650MHz       1000mV
OD_RANGE:
SCLK:     300MHz       2000MHz
MCLK:     300MHz       2100MHz
VDDC:     800mV        1175mV

Here we see that there are eight (zero to seven) different GPU clock states (OD_SCLK) and two (zero and one) different GPU memory clock states (OD_MCLK).
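
If you want a script to find the highest state on its own instead of hard-coding it, the listing is easy to parse with awk. The function below is a rough sketch that assumes the file has the layout shown above; highest_state is a hypothetical name, and the optional file argument lets you test it on a saved copy of the output.

```shell
#!/bin/sh
# highest_state SECTION [FILE]
# Prints the highest state index listed under SECTION (OD_SCLK or OD_MCLK).
highest_state() {
    od_file=${2:-/sys/class/drm/card0/device/pp_od_clk_voltage}
    awk -v section="$1:" '
        # A line starting with OD_ opens a new section.
        /^OD_/     { in_section = ($1 == section); next }
        # Inside the wanted section, strip the colon: "7:" becomes 7.
        in_section { sub(/:$/, "", $1); max = $1 }
        END        { if (max != "") print max }
    ' "$od_file"
}
```

For the listing above, highest_state OD_SCLK prints 7 and highest_state OD_MCLK prints 1.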

HOWTO Limit The GPU To A Certain Set Of GPU Clock States

You can limit what GPU clock states are used by sending a series of numbers listing available clock states (OD_SCLK above) to /sys/class/drm/card0/device/pp_dpm_sclk.

If you want to use less power you can send:

Shell command:
echo "0 1 2 3 4 5" > /sys/class/drm/card0/device/pp_dpm_sclk

That ensures that the GPU will never use the two highest clock states.

Similarly, you can burn energy and make your GPU run hot with:

Shell command:
echo "5 6 7" > /sys/class/drm/card0/device/pp_dpm_sclk

That would make the GPU run at the three highest GPU clock levels at all times.
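
The string written to pp_dpm_sclk is just a space-separated run of state numbers, so it can be generated rather than typed. A minimal sketch, with state_list being a name invented for this example:

```shell
#!/bin/sh
# state_list LOW HIGH: prints the numbers LOW..HIGH separated by single spaces.
state_list() {
    i=$1
    out=""
    while [ "$i" -le "$2" ]; do
        # Add a separating space only when out is non-empty.
        out="${out:+$out }$i"
        i=$((i + 1))
    done
    echo "$out"
}

state_list 0 5   # prints: 0 1 2 3 4 5
state_list 5 7   # prints: 5 6 7
```

Running state_list 0 5 > /sys/class/drm/card0/device/pp_dpm_sclk as root is then equivalent to the first echo command above.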

Fine-Grained Per-State Clock Control

A previous version of this guide, which can be seen in the "history" tab, was all about manually setting power levels for each power state. That worked fine with whatever kernel we used to test and write this guide as it appeared in May 2019. This no longer works and you should ignore this entire section unless you are curious about how it used to work.

The documentation in amdgpu_pm.c in Linux 5.9[3] says:

"To manually adjust these settings, first select manual using power_dpm_force_performance_level. Enter a new value for each level by writing a string that contains "s/m level clock voltage" to the file. E.g., "s 1 500 820" will update sclk level 1 to be 500 MHz at 820 mV; "m 0 350 810" will update mclk level 0 to be 350 MHz at 810 mV. When you have edited all of the states as needed, write "c" (commit) to the file to commit your changes. If you want to reset to the default power levels, write "r" (reset) to the file to reset them."

We could not write any values to /sys/class/drm/card0/device/pp_od_clk_voltage on either an MSI RX 470 8 GiB GPU or an ASUS RX 570 8 GiB GPU using Linux 5.9. We did not try testing with 4.xx kernels to find out what changed because we are not historians.

Kernel 5.9.1 fixed this for me (Tested on Manjaro Testing).

Changing GPU clock speeds and voltages used to work on those GPUs. It was done by first sending values to /sys/class/drm/card0/device/pp_od_clk_voltage and then activating them by sending c to that same /sys/class/drm/card0/device/pp_od_clk_voltage sys file. The values would be set as:

  • s pstate GPU-clock voltage changes the GPU clock and voltage for a specific p-state.
  • m memory-pstate memory-clock memory-voltage changes the memory clock and voltage for a specific memory p-state.

If seven is the highest GPU clock state and one is the highest memory clock state, and you wanted to change the values for the highest GPU clock and GPU memory clock states, you could do:

Shell command:
echo "s 7 1209 900" > /sys/class/drm/card0/device/pp_od_clk_voltage

echo "m 1 1600 1000" > /sys/class/drm/card0/device/pp_od_clk_voltage

The first line would set GPU p-state 7 to 1209 MHz at 900 mV.

The second line would set memory p-state 1 to 1600 MHz at 1000 mV.

The settings were not applied as soon as they were written. You had to manually activate them by sending c to that same pp_od_clk_voltage /sys/ file:

Shell command:
echo "c" > /sys/class/drm/card0/device/pp_od_clk_voltage

You could re-set and revert back to default GPU settings by writing r to pp_od_clk_voltage:

Shell command:
echo "r" > /sys/class/drm/card0/device/pp_od_clk_voltage
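
On kernels where these writes are accepted, it makes sense to check a value against the OD_RANGE limits before sending it. The sketch below only builds the "s" line and never touches sysfs itself; sclk_cmd is a made-up name, and the limits are copied from the example listing earlier in this guide, not read from your card.

```shell
#!/bin/sh
# Limits copied from the OD_RANGE block of the example listing; read your own
# card's pp_od_clk_voltage for the real values.
SCLK_MIN=300
SCLK_MAX=2000
VDDC_MIN=800
VDDC_MAX=1175

# sclk_cmd STATE MHZ MV: prints "s STATE MHZ MV" if the values are in range.
sclk_cmd() {
    if [ "$2" -lt "$SCLK_MIN" ] || [ "$2" -gt "$SCLK_MAX" ]; then
        echo "clock $2 MHz is outside $SCLK_MIN-$SCLK_MAX MHz" >&2
        return 1
    fi
    if [ "$3" -lt "$VDDC_MIN" ] || [ "$3" -gt "$VDDC_MAX" ]; then
        echo "voltage $3 mV is outside $VDDC_MIN-$VDDC_MAX mV" >&2
        return 1
    fi
    echo "s $1 $2 $3"
}

sclk_cmd 7 1209 900   # prints: s 7 1209 900
```

The printed line is what would be written to pp_od_clk_voltage, followed by a separate write of c to commit it.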

We do not know when, or why, this stopped working on AMD RX 4xx and 5xx. What we do know is that we decided to rewrite this guide after one and a half years and the steps that used to work do not work with modern kernels. The kernel documentation indicates that it should work like it always did but that's irrelevant since it doesn't. Just limit the GPU to fewer clock states if you want to undervolt your GPU.

Undervolting As A Service

You can create a script and a systemd service file if you want to limit your GPU to lower power-states every time you boot. First, create a simple script that does what you want:

File: /usr/local/bin/Set_GPU_Settings.sh
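#!/bin/sh
# Manual control must be enabled before clock states can be restricted.
echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level
echo "0 1 2 3 4 5" > /sys/class/drm/card0/device/pp_dpm_sclk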

Make that script executable:

chmod +x /usr/local/bin/Set_GPU_Settings.sh
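
A slightly more defensive variant of such a script could check that the sysfs file is writable and switch to manual control before limiting the clock states. This is a sketch under those assumptions; apply_sclk_limit is a name invented here, and the optional second argument is only there so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# apply_sclk_limit STATES [DEVICE_DIR]
# Switches the card to manual control, then restricts the allowed sclk states.
# DEVICE_DIR defaults to /sys/class/drm/card0/device.
apply_sclk_limit() {
    dev=${2:-/sys/class/drm/card0/device}
    if [ ! -w "$dev/pp_dpm_sclk" ]; then
        echo "cannot write $dev/pp_dpm_sclk (not root, or no amdgpu card0?)" >&2
        return 1
    fi
    # Clock state writes are rejected unless manual control is enabled first.
    echo "manual" > "$dev/power_dpm_force_performance_level" || return 1
    echo "$1" > "$dev/pp_dpm_sclk"
}
```

The script would then end with a call such as apply_sclk_limit "0 1 2 3 4 5", and a non-zero exit status shows up as a failed unit in systemctl status.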

Next, create a simple systemd service that runs the script on each boot:

File: /etc/systemd/system/tunegpu.service
[Unit]
Description=Change GPU clocks
[Service]
Type=oneshot
ExecStart=/usr/local/bin/Set_GPU_Settings.sh
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

Run

systemctl daemon-reload

to make systemd aware of its existence, then

systemctl enable --now tunegpu.service

to both run it once and enable it as a start-up service (enable enables the service and --now makes it start now).
