Myluzh Blog

Strive to become a dream architect.

Installing the NVIDIA GPU Driver on a Linux Cloud Host (A100)

Published: 2024-5-8 Author: myluzh Category: Linux


I. CentOS
0x01 Check the GPUs
[root@ecs-50372952-3-1 ~]#  lspci | grep -i nvidia
00:06.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
00:07.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
00:08.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
00:09.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
00:0a.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
00:0b.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
00:0c.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
00:0d.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
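
On a multi-GPU instance it can help to confirm the device count before installing anything. The sketch below counts NVIDIA entries in lspci-style output read from stdin; the function name count_nvidia_gpus is made up for this illustration and is not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: count the lines that mention NVIDIA in
# lspci-style output supplied on stdin.
count_nvidia_gpus() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when the count is 0, so "|| true" keeps the function's status clean.
    grep -i -c 'nvidia' || true
}
```

Usage: `lspci | count_nvidia_gpus` should print 8 for the listing above. Running `lspci -nn` additionally shows the numeric vendor:device IDs, which is useful when the device name is not resolved, as with "Device 20f1" here.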

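0x02 Download the driver
Open the driver download page on the NVIDIA website, select your GPU model and click SEARCH, then click DOWNLOAD on the page that follows. On the final page, right-click AGREE & DOWNLOAD and copy the link address, then download the installer on the cloud host: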
[root@ecs-50372952-3-1 ~]#  wget https://us.download.nvidia.com/tesla/550.54.15/NVIDIA-Linux-x86_64-550.54.15.run
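
If you script the download, the URL can be assembled from the version number. This is only a sketch: the URL layout is inferred from the single link shown above and may not hold for every driver release, and the function name nvidia_driver_url is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the download URL for a given driver version.
# ASSUMPTION: the path layout is copied from the one link in this article;
# confirm the real link on the NVIDIA download page before relying on it.
nvidia_driver_url() {
    version=$1
    printf 'https://us.download.nvidia.com/tesla/%s/NVIDIA-Linux-x86_64-%s.run\n' \
        "$version" "$version"
}
```

Usage: `wget "$(nvidia_driver_url 550.54.15)"`. After downloading, `sh NVIDIA-Linux-x86_64-550.54.15.run --check` can be used to verify the integrity of the archive before running it.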

0x03 Preparation
1. Check whether kernel-devel and kernel-headers are installed for the running kernel; install them if they are missing.
[root@ecs-50372952-3-1 ~]#  rpm -qa | grep $(uname -r)
[root@ecs-50372952-3-1 ~]#  yum -y install kernel-devel-$(uname -r)
[root@ecs-50372952-3-1 ~]#  yum -y install kernel-headers-$(uname -r)
2. Install the build tools
[root@ecs-50372952-3-1 ~]#  yum -y install gcc gcc-c++ make
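
The driver build needs kernel-devel and kernel-headers packages that match the running kernel exactly, so a quick check is worth having. A minimal sketch, assuming package names of the form kernel-devel-<release> as printed by rpm -qa; the function name missing_kernel_pkgs is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: given a kernel release (as printed by `uname -r`)
# and an `rpm -qa`-style package list on stdin, print the kernel-devel /
# kernel-headers packages that are NOT installed for that release.
missing_kernel_pkgs() {
    release=$1
    pkgs=$(cat)   # read the whole package list once
    for name in kernel-devel kernel-headers; do
        # -x: match the whole line, -F: treat the pattern as a fixed string
        if ! printf '%s\n' "$pkgs" | grep -q -x -F "$name-$release"; then
            printf '%s\n' "$name-$release"
        fi
    done
}
```

Usage: `rpm -qa | missing_kernel_pkgs "$(uname -r)"`. No output means both packages are already installed for the running kernel.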

0x04 Disable the nouveau driver
# Check whether the nouveau driver is loaded. If it is, disable it; if it is not, skip the steps below.
[root@ecs-50372952-3-1 ~]#  lsmod |grep nouveau
nouveau              1898794  0
mxm_wmi                13021  1 nouveau
...
# Append two lines to /usr/lib/modprobe.d/dist-blacklist.conf
[root@ecs-50372952-3-1 ~]#  echo "blacklist nouveau" >> /usr/lib/modprobe.d/dist-blacklist.conf
[root@ecs-50372952-3-1 ~]#  echo "options nouveau modeset=0" >> /usr/lib/modprobe.d/dist-blacklist.conf
# Back up the current initramfs image
[root@ecs-50372952-3-1 ~]#  mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
# Build a new initramfs image
[root@ecs-50372952-3-1 ~]#  dracut /boot/initramfs-$(uname -r).img $(uname -r)
# Reboot, then check that nouveau is no longer loaded (no output means it is disabled)
[root@ecs-50372952-3-1 ~]#  lsmod |grep nouveau
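
Two small helpers can make the steps above safer to repeat. Running the `echo ... >>` lines twice leaves duplicate entries in the config file, and `lsmod | grep nouveau` also matches modules that merely list nouveau as a user (such as mxm_wmi above). The sketch below addresses both; the function names append_once and is_module_loaded are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so the step is safe to run more than once.
append_once() {
    line=$1
    file=$2
    grep -q -x -F -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Hypothetical helper: succeed only if the named module appears in the
# first column of lsmod-style output on stdin.
is_module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

Usage: `append_once "blacklist nouveau" /usr/lib/modprobe.d/dist-blacklist.conf`, and after the reboot `lsmod | is_module_loaded nouveau && echo "nouveau is still loaded"`.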

0x05 Install the driver
# Make the installer executable and run it
[root@ecs-50372952-3-1 ~]#  chmod +x NVIDIA-Linux-x86_64-550.54.15.run
[root@ecs-50372952-3-1 ~]#  sh NVIDIA-Linux-x86_64-550.54.15.run
# After rebooting, verify that the installation succeeded
[root@ecs-50372952-3-1 ~]#  nvidia-smi
# Enable GPU persistence mode
[root@ecs-50372952-3-1 ~]#  sudo nvidia-persistenced --persistence-mode
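
With eight GPUs, reading the full nvidia-smi table by eye is error-prone. A minimal sketch of an automated check, assuming input in the "name, driver_version" CSV form that nvidia-smi produces with the query options shown in the usage line; the function name check_driver_version is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read "name, driver_version" CSV lines on stdin and
# succeed only if there is at least one line and every line reports the
# expected driver version.
check_driver_version() {
    expected=$1
    awk -F', ' -v want="$expected" '
        $2 != want { bad = 1 }                  # a GPU reports another version
        END { if (NR == 0 || bad) exit 1 }      # no GPUs listed also counts as failure
    '
}
```

Usage: `nvidia-smi --query-gpu=name,driver_version --format=csv,noheader | check_driver_version 550.54.15 && echo "driver OK"`.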

II. Ubuntu
# Disable the nouveau driver
echo "blacklist nouveau" | sudo tee -a /etc/modprobe.d/blacklist-nouveau.conf
echo "options nouveau modeset=0" | sudo tee -a /etc/modprobe.d/blacklist-nouveau.conf
sudo update-initramfs -u
# Reboot, then confirm: no output means nouveau has been disabled
reboot
lsmod | grep nouveau
# Install gcc and cmake; run these commands in order
sudo apt-get update
sudo apt-get install gcc
sudo apt-get install cmake
# Check the versions; a version number in the output means the install succeeded
gcc --version
cmake --version
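# The remaining steps are the same as on CentOS: make the installer executable and run it (this example uses the 535.86.10 installer)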
sh NVIDIA-Linux-x86_64-535.86.10.run
# Verify that the installation succeeded
nvidia-smi
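
The Ubuntu steps above install gcc and cmake. The .run installer compiles a kernel module, which on a typical Ubuntu system also needs make and the headers for the running kernel, so a minimal image may be missing them. The sketch below lists those packages; it is an assumption about what a minimal image lacks, not something taken from the steps above, and the function name ubuntu_build_deps is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the packages commonly needed to build the
# NVIDIA kernel module for a given kernel release (from `uname -r`).
# ASSUMPTION: the headers package follows Ubuntu's usual
# linux-headers-<release> naming.
ubuntu_build_deps() {
    release=$1
    printf '%s\n' gcc make "linux-headers-$release"
}
```

Usage: `sudo apt-get install -y $(ubuntu_build_deps "$(uname -r)")`.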

Tags: centos ubuntu gpu ecs nvidia tesla a100
