nVidia is the best and worst graphics card for Linux. It is the worst because it is fraught with proprietary nonsense and it is the best, well, because it works pretty well.

If you need a system where you can audit all the source code, nVidia hardware may not be an option. But if you just need some simple Linux workstations for 3d graphics, it might be the simplest option.

I find that using nVidia’s automagical installer/driver just works. Usually.

I have separate notes for CUDA.

Also, for CentOS-specific packaging of nVidia drivers, see my CentOS notes.

Drivers

At the current time (late 2012) the Linux drivers live here. Note that "Linux x86/IA32" is for 32-bit systems. (Check yours with something like file /sbin/init.) These days, you probably want "Linux x86_64/AMD64/EM64T".
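
A minimal sketch of the same check using uname -m instead of file. The function name arch_label is made up for illustration; the two labels are the ones quoted above from the driver page.

```shell
#!/bin/bash
# arch_label [MACHINE]: map a machine name (as printed by `uname -m`)
# to the matching driver download label. Hypothetical helper.
arch_label() {
    local m="${1:-$(uname -m)}"
    case "$m" in
        x86_64) echo "Linux x86_64/AMD64/EM64T" ;;
        i?86)   echo "Linux x86/IA32" ;;          # i386, i586, i686, ...
        *)      echo "unrecognized: $m" ;;
    esac
}

arch_label
```

Passing an argument (arch_label i686) lets you check the mapping for a machine you are not logged in to.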

What version are you currently using? Check with this.

cat /proc/driver/nvidia/version
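
If the module is not loaded, that file does not exist and cat just complains. Here is a slightly more forgiving sketch that also prints the running kernel, which is handy when working out whether a kernel update broke things. The function name nvidia_version and its optional path argument are my own invention for illustration.

```shell
#!/bin/bash
# nvidia_version [FILE]: print the first line of the driver version file,
# or fail with a message if it cannot be read (module not loaded).
# The optional FILE argument is only there to allow testing.
nvidia_version() {
    local vfile="${1:-/proc/driver/nvidia/version}"
    if [ -r "$vfile" ]; then
        head -n 1 "$vfile"
    else
        echo "nvidia module not loaded (cannot read $vfile)" >&2
        return 1
    fi
}

echo "Running kernel: $(uname -r)"
nvidia_version || true
```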

Installing and Updates

It turns out that GPU drivers are deeply in touch with the kernel. The driver itself is a kernel module. This module must match the kernel and must be built to fit. The nVidia installer automagically takes care of all this (assuming you have a build environment with a compiler, etc).

The problem is that whenever you update your machine and there is a kernel update (which is about every two weeks in my experience), the graphics will stop working. You must reboot into the new kernel (you can’t fix it right after doing the update while running the previous kernel). Then you’ll be in some no-man’s-land text console with no prompt (CentOS 6). Use "Alt-F2" to go to a console with a getty login prompt. Log in and reinstall the nVidia driver. This is also the process after you first install CentOS.

I find that I do this so often that I have a tiny script to make it automatic so I don’t have to answer questions and generally hold its hand. My little script looks like:

#!/bin/bash
sh /pro/nvidia/current -a -q -X --ui=none -f -n

For the Debian style distributions this works.

#!/bin/bash

echo "Shutting down X server..."
sudo service lightdm stop

echo "Running NVIDIA kernel module installer..."
sudo sh ~/src/NVIDIA-Linux-x86_64-304.117.run -a -q -X --ui=none -f -n

And that lives in a directory with an assortment of drivers where current is a link to the one I need most often:

:->[host][~]$ ls /usr/local/nvidia/
NVIDIA-Linux-x86-304.64.run             NVIDIA-Linux-x86_64-304.64.run
NVIDIA-Linux-x86_64-173.14.22-pkg2.run  current
NVIDIA-Linux-x86_64-190.53-pkg2.run     nvfix
NVIDIA-Linux-x86_64-195.36.15-pkg2.run
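
Repointing the current link when a new installer arrives is a one-liner with ln. A small sketch follows; the function name set_current is hypothetical, and the directory and file names in the usage comment are just the ones from the listing above.

```shell
#!/bin/bash
# set_current DIR INSTALLER: make DIR/current a symlink to INSTALLER.
# Refuses to create a dangling link if the installer is not there.
set_current() {
    local dir="$1" installer="$2"
    if [ ! -f "$dir/$installer" ]; then
        echo "no such installer: $dir/$installer" >&2
        return 1
    fi
    # -s symbolic, -f replace an existing link, -n do not follow the old link
    ln -sfn "$installer" "$dir/current"
}

# Example (needs write access to the directory):
#   set_current /usr/local/nvidia NVIDIA-Linux-x86_64-304.64.run
```

The link target is relative, so the whole directory can be copied to another machine without the link breaking.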

Update Process

When I update I usually do it remotely. I log in and do sudo yum -y update. Then if a new kernel has been installed, I do sudo reboot. Then I wait a couple of minutes (sleep 111) and log in again. At this point everything seems fine and is updated, but the users sitting at the workstation will find a confusing text screen with no prompt. This is because graphics are actually dead. This is when you need to run the nvfix script shown above. That’s sudo /usr/local/nvidia/nvfix, of course, since it must be run as root. Then you must sudo reboot again. At that point everything should be cool. It’s a good idea to wait and log back in when it comes up. I’ve had machines mysteriously not wake up after the reboot.
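
When logged in remotely after the first reboot, it helps to confirm that the driver really is missing for the new kernel before running the fix. Here is a rough sketch. It assumes the installer drops a file named nvidia.ko (possibly compressed) somewhere under /lib/modules/<kernel>, which is what I would expect but have not checked on every distribution. The function name needs_driver is made up.

```shell
#!/bin/bash
# needs_driver [MODULE_DIR]: succeed if no nvidia.ko* file exists under
# MODULE_DIR (default: the module tree of the running kernel).
needs_driver() {
    local moddir="${1:-/lib/modules/$(uname -r)}"
    ! find "$moddir" -name 'nvidia.ko*' 2>/dev/null | grep -q .
}

if needs_driver; then
    echo "No nvidia module for kernel $(uname -r); run: sudo /usr/local/nvidia/nvfix"
else
    echo "nvidia module present for kernel $(uname -r)"
fi
```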

ElRepo

It might be smarter these days to try to use prepackaged proprietary drivers from the ElRepo repository.

One problem I had after upgrading from 7.x to 7.4 is that although the modules seem inserted and everything seems fine, no graphics happen. This talks about it and has some good general troubleshooting tips. It seems that lightdm wasn’t starting or staying started. But doing systemctl start lightdm seems to have started it and systemctl enable lightdm seems to have cured it.
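
The two systemctl steps can be wrapped up as below. This is only a sketch: the names run and ensure_dm and the DRY_RUN variable are my own, and by default it merely prints the commands so it is safe to try. Set DRY_RUN=0 to actually run them.

```shell
#!/bin/bash
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# ensure_dm [SERVICE]: start the display manager now and enable it at boot.
ensure_dm() {
    local svc="${1:-lightdm}"
    run sudo systemctl start "$svc"
    run sudo systemctl enable "$svc"
}

ensure_dm lightdm
```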

Nouveau Issues

In CentOS 6 and later the default thing to do on installation is to use the new open source Nouveau drivers. That’s nice and I’m glad that someone’s working on a wholesome alternative. But the problem is that these drivers under-perform, by a factor of 2 in my tests. Test it yourself before committing.

Now the really gruesome bit is that you can’t easily install the proprietary drivers while the Nouveau ones are in. Maybe nVidia will fix their installer to be less stupid but for now it’s quite a chore to extricate the Nouveau driver. The best plan is often to reinstall CentOS and make sure you select the reduced graphics mode. I forget what it’s called, but it doesn’t just affect the installation graphics, it affects what drivers are installed. With the low quality (or whatever it’s called) mode, the normal non-accelerated X drivers are installed and those can be replaced by the nVidia installer.
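
Before fighting with the installer it is worth checking whether Nouveau is actually the module in use. A quick sketch, reading the kernel's module list directly; the function name nouveau_loaded and its optional file argument are hypothetical and exist only to keep it testable.

```shell
#!/bin/bash
# nouveau_loaded [MODULES_FILE]: succeed if a module named exactly
# "nouveau" appears in the module list (default: /proc/modules).
nouveau_loaded() {
    grep -q '^nouveau ' "${1:-/proc/modules}"
}

if nouveau_loaded 2>/dev/null; then
    echo "nouveau is loaded; the nVidia installer will probably object"
else
    echo "nouveau is not loaded"
fi
```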

Legacy

Sometimes you’ll have an older machine:

:->[ws9-ablab.ucsd.edu][~]$ lspci | grep -i [n]vi
01:00.0 VGA compatible controller: NVIDIA Corporation NV43 [GeForce 6600 GT] (rev a2)

And running the normal installer fails with some kind of message about legacy drivers. On the machine above I had to run NVIDIA-Linux-x86_64-304.64.run and then it worked. This version was found on the driver page above and called Latest Legacy GPU version (304.xx series). There are other legacy series like 71.86.xx, 96.43.xx, and 173.14.xx. Use what the installers suggest.

Manual Tweaking With xrandr

I had two vertical 1080x1920 monitors and the "Display" program in Mate was just garbling them. Here’s what I did to sort that out.

xrandr --fb 2160x1920   \
       --output HDMI-1  \
       --auto \
       --pos 0x0 \
       --output DVI-I-1 \
       --auto \
       --pos 1080x0

Or more recently with a different card…

xrandr --fb 2160x1920 \
    --output HDMI-0 --auto --rotate left --pos 0x0 \
    --output DVI-D-0 --auto --rotate right --pos 1080x0

Here’s another example from my setup with three vertical HP monitors, each of which has the slightly unusual resolution of 1920x1200.

xrandr --fb 3600x1920 \
       --output VGA-0   --auto --pos 0x0 \
       --output DVI-D-0 --auto --pos 1200x0 \
       --output HDMI-0  --auto --pos 2400x0

Also note these options, which I did not always need, in case they are required.

--rotate left
--output A --left-of B
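
The side-by-side examples above all follow the same arithmetic: each output is placed at an x offset equal to the sum of the widths to its left, and the framebuffer is the total width by the common height. Here is a small sketch that generates such a command for identically sized monitors. The function name xrandr_row is made up, and it only prints the command rather than running it. It does not add --rotate.

```shell
#!/bin/bash
# xrandr_row WIDTH HEIGHT OUTPUT...: print an xrandr command that lays
# out the given outputs left to right, each WIDTH x HEIGHT pixels
# (as seen after any rotation).
xrandr_row() {
    local w="$1" h="$2"
    shift 2
    local out x=0 args=()
    for out in "$@"; do
        args+=(--output "$out" --auto --pos "${x}x0")
        x=$((x + w))          # next monitor starts where this one ends
    done
    echo xrandr --fb "${x}x${h}" "${args[@]}"
}

xrandr_row 1200 1920 VGA-0 DVI-D-0 HDMI-0
# prints: xrandr --fb 3600x1920 --output VGA-0 --auto --pos 0x0 --output DVI-D-0 --auto --pos 1200x0 --output HDMI-0 --auto --pos 2400x0
```

Check the output names with a plain xrandr first, since they differ between cards and drivers, then paste the printed command or pipe it to sh.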

In CentOS 7’s Mate I’m finding that the System->Preference->Hardware->Displays tool just can’t put my vertical monitors together properly. What works is to close that and use an xrandr command as shown above. Then go back to the Displays GUI tool when everything is correct. Then it will come up detected correctly and this is when you want to click "Apply" and then "Apply system-wide". I don’t know what that writes but once it’s written, things work as they should. Well, not the display manager of course, but who cares about that?

Dummy

From the xpra Xdummy documentation. "Proprietary drivers often install their own copy of libGL which conflicts with the use of software GL rendering. You cannot use this GL library to render directly on Xdummy (or Xvfb)."

This is why you might have trouble using non-interactive rendering tools.

Here is one way Andrey got this problem solved. First he grabbed a libGL.so.1 from a Mesa system (no nvidia drivers). That can be stored locally with no privileges.

Then run the application with something like this.

LD_PRELOAD=/home/${USER}/tmp/libGL.so.1 /usr/bin/Xvfb :96 -cc 4 -screen 0 1024x768x16

AMD

Just some quick notes on AMD/ATI drivers. AMD tries to match nVidia, but they’re a bit behind. However, here are some programs that might come in handy.

amdcccle
fglrxinfo
fgl_glxgears