Deep learning is all the rage now, so it is no surprise to find two or more GPUs in a single workstation. The problem is that heat accumulates between the cards. The default fan speed is quite low (22%), so it needs to be raised. In this post, I will show you how to increase the fan speed of multiple GPUs. All you need to do is modify /etc/X11/xorg.conf. I have successfully raised the fan speed of my four GPUs with this method.
The principle is very simple. A GPU needs an X screen attached to it before its fan speed can be changed through Coolbits [1]. First, we use nvidia-settings to attach a second monitor to each GPU in turn. The first monitor is always attached to the first GPU for display. The second monitor is then attached to the second GPU to create another X screen. See the image below for how to do this through the GUI.
The steps are as follows. First, generate an X configuration file with sudo privileges (only if you do not already have one).
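sudo nvidia-xconfig
Next, open the X configuration file /etc/X11/xorg.conf, create an X screen for each GPU, and add the Coolbits option to each X screen. An X screen takes a monitor and a device (GPU) as parameters, so what you need to do is replicate the Monitor sections, the Device sections, and their corresponding Screen sections. The Device sections differ only in their BusID, which we can look up with the nvidia-smi command. Note that nvidia-smi prints the bus ID in hexadecimal, while xorg.conf expects decimal values (0A becomes 10). Then, we add each X screen to the ServerLayout section. My complete file for four GPUs is given below.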
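Looking up and converting the bus IDs by hand is error-prone, so here is a minimal sketch of how it could be scripted. The function name to_xorg_busid is my own invention for illustration, and the sketch assumes nvidia-smi reports IDs in the usual domain:bus:device.function form (for example 00000000:0A:00.0).

```shell
#!/bin/bash
# Convert a bus ID as printed by nvidia-smi (hexadecimal, e.g. 00000000:0A:00.0)
# into the decimal "PCI:bus:device:function" form used in xorg.conf.
# to_xorg_busid is a hypothetical helper name, not part of any NVIDIA tool.
to_xorg_busid() {
    local id="$1" bus dev fn
    # Drop the leading PCI domain if present (two colons in the ID).
    case "$id" in
        *:*:*) id="${id#*:}" ;;
    esac
    bus="${id%%:*}"   # e.g. 0A
    id="${id#*:}"     # e.g. 00.0
    dev="${id%%.*}"   # e.g. 00
    fn="${id##*.}"    # e.g. 0
    # The 0x prefix makes printf read each field as hexadecimal.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

# List every GPU with its xorg.conf-style BusID (only if nvidia-smi is installed).
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,pci.bus_id --format=csv,noheader |
    while IFS=', ' read -r idx busid; do
        echo "GPU $idx: $(to_xorg_busid "$busid")"
    done
fi
```

For instance, to_xorg_busid 00000000:0A:00.0 prints PCI:10:0:0, which matches the BusID of the fourth GPU in my configuration.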
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 384.69 (buildmeister@swio-display-x86-rhel47-06) Wed Aug 16 20:57:01 PDT 2017

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    Screen      1  "Screen1" RightOf "Screen0"
    Screen      2  "Screen2" RightOf "Screen1"
    Screen      3  "Screen3" RightOf "Screen2"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    ModelName      "DELL U2410"
    HorizSync       0.0 - 0.0
    VertRefresh     0.0
EndSection

Section "Monitor"
    Identifier     "Monitor2"
    VendorName     "Unknown"
    ModelName      "DELL U2410"
    HorizSync       0.0 - 0.0
    VertRefresh     0.0
EndSection

Section "Monitor"
    Identifier     "Monitor3"
    VendorName     "Unknown"
    ModelName      "DELL U2410"
    HorizSync       0.0 - 0.0
    VertRefresh     0.0
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:5:0:0"
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:6:0:0"
EndSection

Section "Device"
    Identifier     "Device2"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:9:0:0"
EndSection

Section "Device"
    Identifier     "Device3"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:10:0:0"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "Coolbits" "4"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "Coolbits" "4"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen2"
    Device         "Device2"
    Monitor        "Monitor2"
    DefaultDepth    24
    Option         "Coolbits" "4"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen3"
    Device         "Device3"
    Monitor        "Monitor3"
    DefaultDepth    24
    Option         "Coolbits" "4"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
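Since the Monitor, Device, and Screen sections repeat for every GPU with only the index and BusID changing, they could also be generated instead of copied by hand. The following is only a sketch: gpu_sections is a hypothetical helper of mine, the Monitor section is reduced to a placeholder, and the output goes to the terminal so you can review it before pasting it into /etc/X11/xorg.conf.

```shell
#!/bin/bash
# Print the Monitor, Device and Screen sections for one GPU.
#   $1 = index (0, 1, 2, ...)
#   $2 = BusID in xorg.conf form, e.g. PCI:6:0:0
# gpu_sections is a hypothetical helper; it only prints text and does not
# touch /etc/X11/xorg.conf.
gpu_sections() {
    local i="$1" busid="$2"
    cat <<EOF
Section "Monitor"
    Identifier     "Monitor$i"
    VendorName     "Unknown"
    ModelName      "Unknown"
EndSection

Section "Device"
    Identifier     "Device$i"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "$busid"
EndSection

Section "Screen"
    Identifier     "Screen$i"
    Device         "Device$i"
    Monitor        "Monitor$i"
    DefaultDepth    24
    Option         "Coolbits" "4"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

EOF
}

# Example: the four bus IDs from my configuration.
i=0
for busid in PCI:5:0:0 PCI:6:0:0 PCI:9:0:0 PCI:10:0:0; do
    gpu_sections "$i" "$busid"
    i=$((i + 1))
done
```

The ServerLayout section is not generated here, so each new screen still has to be listed there by hand.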
Finally, log out and log in again, and voila, you can change your fan speed now. Please save a copy of the X configuration for future use:
sudo cp /etc/X11/xorg.conf /etc/X11/xorg.conf-backup-coolbit
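At this point it can be handy to see which GPUs still run their fans at the slow default. The sketch below reads the CSV output of nvidia-smi and lists the GPUs whose fan speed is under a threshold. The function name slow_fans and the 50% threshold are my own assumptions, and the sample numbers are made up for illustration.

```shell
#!/bin/bash
# Read "index, fan speed, temperature" lines from stdin, as produced by
#   nvidia-smi --query-gpu=index,fan.speed,temperature.gpu --format=csv,noheader,nounits
# and report every GPU whose fan runs below the given percentage.
# slow_fans is a hypothetical helper name.
slow_fans() {
    local threshold="$1" idx fan temp
    while IFS=', ' read -r idx fan temp; do
        # Skip empty lines and non-numeric readings (e.g. a fan that reports N/A).
        case "$fan" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$fan" -lt "$threshold" ]; then
            echo "GPU $idx: fan ${fan}%, ${temp} C"
        fi
    done
}

# Made-up sample data: GPU 0 at the default 22%, GPU 1 already at 95%.
printf '0, 22, 61\n1, 95, 48\n' | slow_fans 50
# prints: GPU 0: fan 22%, 61 C

# Live readings, only if nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,fan.speed,temperature.gpu \
        --format=csv,noheader,nounits | slow_fans 50
fi
```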
Then, we need to increase the fan speed of the GPUs through a script [2]. For example, I want to increase the fan speed to 95% for two GPUs. The bash script is as simple as follows:
#!/bin/bash
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=95" -a "[gpu:1]/GPUFanControlState=1" -a "[fan:1]/GPUTargetFanSpeed=95"
You should save it as set_gpu_fan.sh and add it to the startup applications on Ubuntu, so that the script is executed automatically when you log in to your workstation and the X server starts. Alternatively, you can enable persistence mode on the GPUs by adding the following command to root's crontab. The state of the GPUs (including fan speed) will then be preserved until reboot.
sudo crontab -e
Then add the following snippet to the crontab.
@reboot nvidia-smi --persistence-mode=ENABLED
Finally, note that this method only works when you have as many monitors as GPUs. The principle is the same for more than two GPUs: you just set up /etc/X11/xorg.conf to have an X screen for each GPU, and then you can manually adjust the fan speed of each GPU. The drawback of this method is that you need to log in after the workstation restarts, but it works for me on a commodity workstation used for research in a lab.
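For more than two GPUs, the nvidia-settings command gets long, so it could be built in a loop. This is a minimal sketch under two assumptions: fan i belongs to GPU i (as in the two-GPU command above), and fan_command is a hypothetical helper name of mine. It only prints the command so you can check it before running it.

```shell
#!/bin/bash
# Print one nvidia-settings command that enables manual fan control and
# sets the target speed for GPUs 0..N-1.
#   $1 = number of GPUs, $2 = target fan speed in percent
# Assumes fan i belongs to GPU i. fan_command is a hypothetical helper name.
fan_command() {
    local count="$1" speed="$2" i=0 cmd="nvidia-settings"
    while [ "$i" -lt "$count" ]; do
        cmd="$cmd -a '[gpu:$i]/GPUFanControlState=1'"
        cmd="$cmd -a '[fan:$i]/GPUTargetFanSpeed=$speed'"
        i=$((i + 1))
    done
    echo "$cmd"
}

# Example: print the command for four GPUs at 95%.
fan_command 4 95
```

Inside an X session, the printed command can be run with eval "$(fan_command 4 95)". The GPU count could be taken from nvidia-smi --list-gpus | wc -l instead of being hard-coded.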