KVM GPU passthrough Ubuntu 20.04

Environment

Enable IOMMU

Configure GRUB

Edit /etc/default/grub

# Intel CPU
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"
# AMD CPU
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt kvm_amd.npt=1 kvm_amd.avic=1"
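Setting the variable wholesale replaces any flags that were already there (for example quiet splash). As a minimal sketch, assuming GNU sed and a single double-quoted GRUB_CMDLINE_LINUX_DEFAULT line, the helper below appends a flag only when it is missing. The function name grub_append_param is hypothetical and not part of GRUB.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one kernel parameter to the
# GRUB_CMDLINE_LINUX_DEFAULT line of a grub defaults file.
# Assumes GNU sed and a parameter without '|', '&' or '\' characters.
grub_append_param() {
  local file="$1" param="$2"
  # Do nothing when the parameter is already on that line.
  if grep -E '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
    return 0
  fi
  # Insert the parameter just before the closing double quote.
  sed -i -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\1 ${param}\"|" "$file"
}

# Usage (try it on a copy first):
#   cp /etc/default/grub /tmp/grub.test
#   grub_append_param /tmp/grub.test intel_iommu=on
```

Working on a copy and diffing it against /etc/default/grub before installing it keeps a typo from ending up in the boot configuration.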

Update GRUB

sudo update-grub

Reboot

sudo shutdown -r now

Verify IOMMU is enabled

dmesg | grep IOMMU

Output:

IOMMU enabled
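If dmesg shows nothing, it can help to first confirm that the flag actually reached the running kernel. The sketch below checks a kernel command line string for the two switches used in this post. The function name cmdline_has_iommu is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the given kernel command line
# contains intel_iommu=on or amd_iommu=on as a whole word.
cmdline_has_iommu() {
  local cmdline="$1"
  case " $cmdline " in
    *" intel_iommu=on "*|*" amd_iommu=on "*) return 0 ;;
  esac
  return 1
}

# Usage on the running host:
#   cmdline_has_iommu "$(cat /proc/cmdline)" && echo "flag present"
```

A missing flag here usually means update-grub was not run or the machine has not been rebooted since.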

Enable IOMMU group

Check that IOMMU groups are populated

for a in /sys/kernel/iommu_groups/*; do find "$a" -type l; done | sort --version-sort

Output:

/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:04.0
/sys/kernel/iommu_groups/2/devices/0000:00:04.1
/sys/kernel/iommu_groups/3/devices/0000:00:04.2
/sys/kernel/iommu_groups/4/devices/0000:00:04.3
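All devices in an IOMMU group have to be handed to the guest together, so it is useful to see the group number next to each device address. A minimal sketch that prints one "group, tab, address" line per device, sorted by group number. The function name list_iommu_groups and the optional root argument are assumptions added here so the function can also be pointed at a test directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<group>\t<pci address>" for every device
# under an iommu_groups directory (defaults to the real sysfs path).
list_iommu_groups() {
  local root="${1:-/sys/kernel/iommu_groups}"
  local dev group
  for dev in "$root"/*/devices/*; do
    # Skip the literal pattern when the glob matched nothing.
    [ -e "$dev" ] || [ -L "$dev" ] || continue
    group="${dev%/devices/*}"   # strip "/devices/<address>"
    group="${group##*/}"        # keep only the group number
    printf '%s\t%s\n' "$group" "${dev##*/}"
  done | sort -n -k1,1
}

# Usage:
#   list_iommu_groups
```

No output at all means no groups exist, which points back at the kernel flag or the BIOS setting.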

Edit BIOS settings if IOMMU is not enabled

If the output is not as expected, enable the following BIOS setting:

VT-d

(Asus)

(Supermicro)

Isolation of the guest GPU

graph LR
  subgraph C [guest]
    C1[PCI device]
  end
  subgraph B [hypervisor]
    B1[VFIO] --> C1[PCI device]
  end
  subgraph A [Host]
    A1[PCI device] --> B1[VFIO]
  end

Using vfio-pci to manage PCI device

Find out vendor ID and device ID

lspci -nn | grep -i NVIDIA

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti] [10de:1e04] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation TU102 High Definition Audio Controller [10de:10f7] (rev a1)
01:00.2 USB controller [0c03]: NVIDIA Corporation TU102 USB 3.1 Host Controller [10de:1ad6] (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU102 USB Type-C UCSI Controller [10de:1ad7] (rev a1)

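For the GeForce RTX 2080 Ti VGA compatible controller: PCI address 01:00.0, vendor ID 10de, device ID 1e04. The audio, USB and UCSI functions on the same card have their own device IDs (10f7, 1ad6, 1ad7).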
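The vendor:device pairs can be pulled out of the lspci -nn output mechanically instead of being copied by hand. A small sketch, assuming each line ends with one bracketed [vendor:device] pair as in the listing above. The function name vfio_ids is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lspci -nn` lines on stdin and print the
# last [vendor:device] pair of each line as one comma-separated list.
vfio_ids() {
  sed -nE 's/.*\[([0-9a-f]{4}:[0-9a-f]{4})\].*/\1/p' | paste -sd, -
}

# Usage:
#   lspci -nn | grep -i NVIDIA | vfio_ids
```

For the four functions listed above this yields a list in the form expected by the vfio-pci.ids kernel parameter.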

Configure GRUB

/etc/default/grub

List the IDs of all functions of the GPU (VGA, audio, USB, UCSI), not only the VGA controller

GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on vfio-pci.ids=10de:1e04,10de:10f7,10de:1ad6,10de:1ad7"

Update GRUB

sudo update-grub

Reboot

sudo reboot

Verify PCI device is managed by vfio-pci

lspci -nnv

Find the line Kernel driver in use

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti] [10de:1e04] (rev a1) (prog-if 00 [VGA controller])
...
Kernel driver in use: vfio-pci
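Scanning the full lspci -nnv listing by eye is easy to get wrong when the card has several functions. As a sketch, the check below reads lspci output on stdin and succeeds only when every "Kernel driver in use" line names vfio-pci. The function name all_on_vfio is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when stdin contains at least one
# "Kernel driver in use:" line and all of them say vfio-pci.
all_on_vfio() {
  local drivers
  drivers=$(sed -n 's/^[[:space:]]*Kernel driver in use: //p')
  # Fail when no driver line was found or any driver differs.
  [ -n "$drivers" ] && ! grep -qvx 'vfio-pci' <<<"$drivers"
}

# Usage (10de is the NVIDIA vendor ID from the lspci -nn output):
#   lspci -nnk -d 10de: | all_on_vfio && echo "all functions bound to vfio-pci"
```

A function that shows no driver line at all is not counted as a failure here, so the result should still be compared against the number of functions on the card.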

Test GPU passthrough on a KVM instance

Fresh install

Run virt-install with the --host-device [pci_address] and --features kvm_hidden=on parameters

virt-install ... \
--host-device 01:00.0 \
--features kvm_hidden=on

Modify existing instance

virsh edit [domain]

Add PCI mapping hostdev block

0000:01:00.0 on the host is mapped to 0000:04:00.0 in the guest

:::warning

The bus number should be less than virtio's.

Increase virtio's bus number to free a smaller number for the newly added entry.

:::

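<devices>
  ...
  <!-- host 0000:01:00.0 mapped to guest 0000:04:00.0; managed='yes' is an assumption -->
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source><address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/></source>
    <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
  </hostdev>
</devices>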

kvm hidden within features block

<features>
  ...
  <kvm>
    <hidden state='on'/>
  </kvm>
</features>

Check GPU is working in guest

lspci

04:00.0 VGA compatible controller: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti] (rev a1)

Install NVIDIA driver

sudo apt update
sudo apt install nvidia-driver-460
sudo reboot

nvidia-smi

Wed Mar 10 08:19:43 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:04:00.0 Off |                  N/A |
| 15%   44C    P0     1W / 250W |      0MiB / 11019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
