NVIDIA GPUs have limits on how much physical memory they can address. This directly impacts DMA buffers: a DMA buffer allocated in physical memory that the NVIDIA GPU cannot address cannot be used (or its address may be truncated, resulting in bad memory accesses). See Chapter 34, Addressing Capabilities, for details on the addressing limitations of specific GPUs.
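The effect of an address limit can be illustrated with a small calculation. The following is only a sketch: the function names and the example buffer address are hypothetical, and the 32-bit width is just an example value, not the limit of any particular GPU.

```python
def fits_in_address_space(phys_addr: int, size: int, addr_bits: int) -> bool:
    """Return True if [phys_addr, phys_addr + size) lies below 2**addr_bits."""
    if size <= 0:
        raise ValueError("size must be positive")
    return phys_addr >= 0 and phys_addr + size <= (1 << addr_bits)


def truncated_address(phys_addr: int, addr_bits: int) -> int:
    """Address a device would see if the bits above addr_bits were dropped."""
    return phys_addr & ((1 << addr_bits) - 1)


if __name__ == "__main__":
    # Hypothetical 4 KB buffer located just above the 4 GB boundary.
    buf = 0x1_2000_0000
    print(fits_in_address_space(buf, 4096, 32))  # False
    print(hex(truncated_address(buf, 32)))       # 0x20000000
```

In this example the truncated address points at unrelated memory below 4 GB, which is the kind of bad memory access described above.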
Newer kernels provide a simple way to allocate memory that is guaranteed to reside within the 32-bit physical address space. Linux 2.6.15 provides this functionality with the __GFP_DMA32 interface. Kernels earlier than this version provide a software I/O TLB (SWIOTLB) on Intel's EM64T platform and IOMMU support on AMD's AMD64 platform.
Unfortunately, some problems exist with both interfaces. Early implementations of the Linux SWIOTLB set aside a very small amount of memory for their memory pool (only 4 MB). Also, when this memory pool is exhausted, some SWIOTLB implementations forcibly panic the kernel. This is also true for some implementations of the IOMMU interface.
The NVIDIA Linux driver does not support the SWIOTLB. NVIDIA recommends that users of Intel's EM64T platform upgrade to Linux 2.6.15 or a more recent Linux kernel, which provides the __GFP_DMA32 interface.
On AMD's AMD64 platform, the size of the IOMMU can be configured in the system BIOS or, if no IOMMU BIOS option is available, using the 'iommu=memaper' kernel parameter. This kernel parameter expects an order and instructs the Linux kernel to create an IOMMU of size 32 MB * 2^order (that is, 32 MB shifted left by the order) overlapping physical memory. If the system's default IOMMU is smaller than 64 MB, the Linux kernel automatically replaces it with a 64 MB IOMMU.
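As a rough illustration of how the order maps to an IOMMU size, the sketch below assumes the size is 32 MB shifted left by the order, which is how the Linux kernel parameter documentation describes 'memaper'. The function names are hypothetical, and the 64 MB floor simply models the replacement behavior described above.

```python
BASE_MB = 32      # size corresponding to order 0
MINIMUM_MB = 64   # smaller default IOMMUs are replaced by one of this size


def memaper_size_mb(order: int) -> int:
    """IOMMU size in MB for 'iommu=memaper=<order>', assuming 32 MB << order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return BASE_MB << order


def effective_default_mb(default_mb: int) -> int:
    """Model of the kernel replacing a default IOMMU smaller than 64 MB."""
    return max(default_mb, MINIMUM_MB)


if __name__ == "__main__":
    for order in range(4):
        print(order, memaper_size_mb(order))  # 0 32, 1 64, 2 128, 3 256
```

Under this assumption, an order of 2 yields a 128 MB IOMMU.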
To reduce the risk of stability problems as a result of IOMMU space exhaustion on the X86-64 platform, the NVIDIA Linux driver internally limits its use of these interfaces. By default, the driver will not use more than 60 MB of IOMMU space, leaving at least 4 MB for the rest of the system (assuming a 64 MB IOMMU).
This limit can be adjusted with the 'NVreg_RemapLimit' NVIDIA kernel module option. Specifically, if the IOMMU is larger than 64 MB, the limit can be adjusted to take advantage of the additional space. The 'NVreg_RemapLimit' option expects the size argument in bytes.
NVIDIA recommends leaving 4 MB available for the rest of the system when changing the limit. For example, if the internal limit is to be relaxed to account for a 128 MB IOMMU, the recommended remap limit is 124 MB. This remap limit can be specified by passing 'NVreg_RemapLimit=0x7c00000' to the NVIDIA kernel module.
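The arithmetic behind the recommended value can be sketched as follows. The helper names are hypothetical; the only inputs taken from the text are the 4 MB reserve and the fact that 'NVreg_RemapLimit' expects a size in bytes.

```python
MIB = 1024 * 1024
RESERVED_MB = 4  # space left for the rest of the system, per the recommendation


def remap_limit_bytes(iommu_mb: int) -> int:
    """Recommended remap limit in bytes for an IOMMU of iommu_mb megabytes."""
    if iommu_mb <= RESERVED_MB:
        raise ValueError("IOMMU must be larger than the reserved space")
    return (iommu_mb - RESERVED_MB) * MIB


def module_option(iommu_mb: int) -> str:
    """Format the limit as a kernel module option string."""
    return f"NVreg_RemapLimit={remap_limit_bytes(iommu_mb):#x}"


if __name__ == "__main__":
    print(module_option(128))  # NVreg_RemapLimit=0x7c00000
    print(module_option(64))   # NVreg_RemapLimit=0x3c00000
```

The second line corresponds to the default 60 MB limit for a 64 MB IOMMU.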
Also see the section 'The X86-64 platform (AMD64/EM64T) and early Linux 2.6 kernels'.