This release includes support for the Video Decode and Presentation API for Unix-like systems (VDPAU) on most GeForce 8 series and newer add-in cards, as well as motherboard chipsets with integrated graphics that have PureVideo support based on these GPUs.
VDPAU is specified as a generic API: the choice of which features to support, and the performance levels of those features, is left up to individual implementations. The details of NVIDIA's implementation are provided below.
The maximum supported VdpVideoSurface resolution is 4096x4096.
The following surface formats and get-/put-bits combinations are supported:
VDP_CHROMA_TYPE_420 (Supported get-/put-bits formats are VDP_YCBCR_FORMAT_NV12, VDP_YCBCR_FORMAT_YV12)
VDP_CHROMA_TYPE_422 (Supported get-/put-bits formats are VDP_YCBCR_FORMAT_UYVY, VDP_YCBCR_FORMAT_YUYV)
The maximum supported VdpBitmapSurface resolution is 8192x8192.
The following surface formats are supported:
VDP_RGBA_FORMAT_B8G8R8A8
VDP_RGBA_FORMAT_R8G8B8A8
VDP_RGBA_FORMAT_B10G10R10A2
VDP_RGBA_FORMAT_R10G10B10A2
VDP_RGBA_FORMAT_A8
Note that VdpBitmapSurfaceCreate's frequently_accessed parameter directly controls whether the bitmap data is placed in video RAM (VDP_TRUE) or system memory (VDP_FALSE). If the bitmap data cannot be placed in video RAM when requested, due to resource constraints, the implementation automatically falls back to placing the data in system RAM.
The maximum supported VdpOutputSurface resolution is 8192x8192.
The following surface formats are supported:
VDP_RGBA_FORMAT_B8G8R8A8
VDP_RGBA_FORMAT_R10G10B10A2
For all surface formats, the following get-/put-bits indexed formats are supported:
VDP_INDEXED_FORMAT_A4I4
VDP_INDEXED_FORMAT_I4A4
VDP_INDEXED_FORMAT_A8I8
VDP_INDEXED_FORMAT_I8A8
For all surface formats, the following get-/put-bits YCbCr formats are supported:
VDP_YCBCR_FORMAT_Y8U8V8A8
VDP_YCBCR_FORMAT_V8U8Y8A8
In all cases, VdpDecoder objects solely support 8-bit 4:2:0 streams, and only support writing to VDP_CHROMA_TYPE_420 surfaces.
The exact set of supported VdpDecoderProfile values depends on the hardware model in use. Hardware-specific support is listed below. When reading these lists, please note that VC1_SIMPLE and VC1_MAIN may be referred to as WMV, WMV3, or WMV9 in other contexts. Partial acceleration means that VLD (bitstream) decoding is performed on the CPU, with the GPU performing IDCT and motion compensation. Complete acceleration means that the GPU performs all of VLD, IDCT, and motion compensation.
These chips support the following VdpDecoderProfile values:
VDP_DECODER_PROFILE_MPEG1, VDP_DECODER_PROFILE_MPEG2_SIMPLE, VDP_DECODER_PROFILE_MPEG2_MAIN:
Partial acceleration.
Minimum width or height: 3 macroblocks (48 pixels).
Maximum width or height: 128 macroblocks (2048 pixels).
Maximum macroblocks: 8192
VDP_DECODER_PROFILE_H264_MAIN, VDP_DECODER_PROFILE_H264_HIGH:
Complete acceleration.
Minimum width or height: 3 macroblocks (48 pixels).
Maximum width or height: 128 macroblocks (2048 pixels).
Maximum macroblocks: 8192
VDP_DECODER_PROFILE_VC1_SIMPLE, VDP_DECODER_PROFILE_VC1_MAIN, VDP_DECODER_PROFILE_VC1_ADVANCED:
Partial acceleration.
Minimum width or height: 3 macroblocks (48 pixels).
Maximum width or height: 128 macroblocks (2048 pixels).
Maximum macroblocks: 8190
These chips support the following VdpDecoderProfile values:
VDP_DECODER_PROFILE_MPEG1, VDP_DECODER_PROFILE_MPEG2_SIMPLE, VDP_DECODER_PROFILE_MPEG2_MAIN:
Complete acceleration.
Minimum width or height: 3 macroblocks (48 pixels).
Maximum width or height: 128 macroblocks (2048 pixels).
Maximum macroblocks: 8192
VDP_DECODER_PROFILE_H264_MAIN, VDP_DECODER_PROFILE_H264_HIGH:
Complete acceleration.
Minimum width or height: 3 macroblocks (48 pixels).
Maximum width: 127 macroblocks (2032 pixels).
Maximum height: 128 macroblocks (2048 pixels).
Maximum macroblocks: 8190
Unsupported widths: 49, 54, 59, 64, 113, 118, 123 macroblocks (784, 864, 944, 1024, 1808, 1888, 1968 pixels).
VDP_DECODER_PROFILE_VC1_SIMPLE, VDP_DECODER_PROFILE_VC1_MAIN, VDP_DECODER_PROFILE_VC1_ADVANCED:
Complete acceleration.
Minimum width or height: 3 macroblocks (48 pixels).
Maximum width or height: 128 macroblocks (2048 pixels).
Maximum macroblocks: 8190
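The limits listed above can be checked in application code before a decoder is created. The following is a minimal sketch, not part of the driver or of VDPAU: the decoder_limits type, the stream_fits and pixels_to_mbs helpers, and the h264_limits table are hypothetical names introduced for illustration. The values are copied from the H.264 entry with the 127-macroblock width limit. Rounding pixel dimensions up to whole 16x16 macroblocks is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one profile's limits, in 16x16 macroblocks. */
typedef struct {
    uint32_t min_mbs;            /* minimum width or height */
    uint32_t max_width_mbs;      /* maximum width */
    uint32_t max_height_mbs;     /* maximum height */
    uint32_t max_total_mbs;      /* maximum width * height */
    const uint32_t *bad_widths;  /* unsupported widths, may be NULL */
    size_t bad_width_count;
} decoder_limits;

/* Values copied from the H.264 entry in the list above. */
static const uint32_t h264_bad_widths[] = {49, 54, 59, 64, 113, 118, 123};
const decoder_limits h264_limits = {
    3, 127, 128, 8190,
    h264_bad_widths,
    sizeof h264_bad_widths / sizeof h264_bad_widths[0]
};

/* Assumption: a partial macroblock counts as a whole one. */
uint32_t pixels_to_mbs(uint32_t pixels)
{
    return (pixels + 15u) / 16u;
}

/* Returns true if a stream of the given pixel size is within the limits. */
bool stream_fits(const decoder_limits *lim, uint32_t width_px, uint32_t height_px)
{
    uint32_t w = pixels_to_mbs(width_px);
    uint32_t h = pixels_to_mbs(height_px);

    if (w < lim->min_mbs || h < lim->min_mbs)
        return false;
    if (w > lim->max_width_mbs || h > lim->max_height_mbs)
        return false;
    if ((uint64_t)w * h > lim->max_total_mbs)
        return false;
    for (size_t i = 0; i < lim->bad_width_count; ++i) {
        if (lim->bad_widths[i] == w)
            return false;
    }
    return true;
}
```

For example, stream_fits(&h264_limits, 1920, 1080) accepts a 120x68 macroblock stream, while a 1024-pixel-wide stream is rejected because 64 macroblocks is one of the unsupported widths. An application that fails this check would presumably fall back to software decoding.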
The maximum supported video mixer resolution is 4096x4096.
The video mixer supports all video and output surface resolutions and formats that the implementation supports.
The video mixer supports at most 4 auxiliary layers.
The following features are supported:
VDP_VIDEO_MIXER_FEATURE_DEINTERLACE_TEMPORAL
VDP_VIDEO_MIXER_FEATURE_DEINTERLACE_TEMPORAL_SPATIAL
VDP_VIDEO_MIXER_FEATURE_INVERSE_TELECINE
VDP_VIDEO_MIXER_FEATURE_NOISE_REDUCTION
VDP_VIDEO_MIXER_FEATURE_SHARPNESS
VDP_VIDEO_MIXER_FEATURE_LUMA_KEY
In order for either VDP_VIDEO_MIXER_FEATURE_DEINTERLACE_TEMPORAL or VDP_VIDEO_MIXER_FEATURE_DEINTERLACE_TEMPORAL_SPATIAL to operate correctly, the application must supply at least 2 past and 1 future fields to each VdpVideoMixerRender call. If those fields are not provided, the VdpVideoMixer will fall back to bob de-interlacing.
Both regular de-interlacing and half-rate de-interlacing are supported. Both have the same requirements in terms of the number of past/future fields required. Both modes should produce equivalent results.
In order for VDP_VIDEO_MIXER_FEATURE_INVERSE_TELECINE to have any effect, one of VDP_VIDEO_MIXER_FEATURE_DEINTERLACE_TEMPORAL or VDP_VIDEO_MIXER_FEATURE_DEINTERLACE_TEMPORAL_SPATIAL must be requested and enabled. Inverse telecine has the same requirement on the minimum number of past/future fields that must be provided. Inverse telecine will not operate when "half-rate" de-interlacing is used.
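The field-history rules above can be summarized as a small decision function. This is only a sketch of the documented behavior, useful for logging or assertions in an application. The names deint_mode, effective_deinterlace, and inverse_telecine_active are hypothetical and are not part of VDPAU. It also assumes that inverse telecine is simply inactive when its requirements are not met, and that bob is used when neither advanced feature is enabled; the text above does not state either point explicitly.

```c
#include <stdbool.h>

typedef enum { DEINT_BOB, DEINT_ADVANCED } deint_mode;

/* Minimum field history stated in the text above. */
enum { MIN_PAST_FIELDS = 2, MIN_FUTURE_FIELDS = 1 };

/*
 * advanced_enabled: temporal or temporal-spatial de-interlacing was
 * requested and enabled on the mixer.
 * past, future: number of past and future fields passed to the render call.
 */
deint_mode effective_deinterlace(bool advanced_enabled,
                                 unsigned past, unsigned future)
{
    if (advanced_enabled &&
        past >= MIN_PAST_FIELDS && future >= MIN_FUTURE_FIELDS)
        return DEINT_ADVANCED;
    return DEINT_BOB; /* documented fallback when fields are missing */
}

/* Inverse telecine needs an advanced de-interlacer and full-rate output. */
bool inverse_telecine_active(bool ivtc_enabled, bool advanced_enabled,
                             bool half_rate, unsigned past, unsigned future)
{
    return ivtc_enabled && !half_rate &&
           effective_deinterlace(advanced_enabled, past, future) == DEINT_ADVANCED;
}
```

With 2 past fields and 1 future field, effective_deinterlace(true, 2, 1) yields DEINT_ADVANCED; dropping the future field yields DEINT_BOB.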
Whilst it is possible to apply de-interlacing algorithms to progressive streams using the techniques outlined in the VDPAU documentation, NVIDIA does not recommend doing so. One is likely to introduce more artifacts due to the inverse telecine process than are removed by detection of bad edits etc.
The resolution of VdpTime is approximately 10 nanoseconds. At some arbitrary point during system startup, the initial value of this clock is synchronized to the system's real-time clock, as represented by nanoseconds since Jan 1, 1970. However, no attempt is made to keep the two time-bases synchronized after this point. Divergence can and will occur.
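Because the two clocks drift apart, an application that schedules frames from system time may want to measure the current offset rather than assume it is zero. The sketch below shows the arithmetic only. The helper names realtime_ns, vdp_offset_ns, and sys_to_vdp_ns are hypothetical; the VDPAU time sample is assumed to come from a call such as VdpPresentationQueueGetTime, taken as close as possible to the system clock sample, and to be re-sampled periodically.

```c
#include <stdint.h>
#include <time.h>

/* Current system real-time clock in nanoseconds since Jan 1, 1970 (C11). */
uint64_t realtime_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Offset of the VDPAU clock relative to the system clock, from one pair
 * of samples taken back to back. Positive means VDPAU time is ahead.
 */
int64_t vdp_offset_ns(uint64_t vdp_now, uint64_t sys_now)
{
    return (int64_t)(vdp_now - sys_now);
}

/* Translate a system-clock timestamp into the VDPAU time-base. */
uint64_t sys_to_vdp_ns(uint64_t sys_time, int64_t offset)
{
    return sys_time + (uint64_t)offset;
}
```

A caller would sample both clocks, compute the offset once, and then convert each target presentation time with sys_to_vdp_ns. How often to refresh the offset depends on how much drift the application can tolerate.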
NVIDIA's VdpPresentationQueue supports two mechanisms for displaying surfaces: overlay and blit-based. The overlay path will be used wherever possible, with the blit path acting as a more general fallback. At present, the selection of overlay vs. blit path is made at the time of presentation queue creation.
The following conditions or system configurations will prevent usage of the overlay path:
Overlay hardware already in use, e.g. by another VDPAU, GL, or X11 application, or by SDI output.
SLI or Multi-GPU enabled on the given X screen.
Desktop rotation enabled on the given screen.
X composite extension enabled on the given screen. Note that simply having the extension enabled is enough to prevent overlay usage; running an actual compositing manager is not required.
The environment variable VDPAU_NVIDIA_NO_OVERLAY is set to a string representation of a non-zero integer.
The driver determines that the performance requirements of overlay usage cannot be met by the current hardware configuration.
Both the overlay and blit paths sync to VBLANK.
When TwinView is enabled, the blit path can only sync to one of the display devices; this may cause tearing corruption on the display device to which VDPAU is not syncing. You can use the environment variable VDPAU_NVIDIA_SYNC_DISPLAY_DEVICE to specify the display device to which VDPAU should sync. You should set this environment variable to the name of a display device; for example "CRT-1". Look for the line "Connected display device(s):" in your X log file for a list of the display devices present and their names. You may also find it useful to review Chapter 12, Configuring TwinView, and the section on Ensuring Identical Mode Timings in Chapter 18, Programming Modes.
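The two environment variables described in this section are normally set in the shell that launches the application, but a program can also set them for itself. The following sketch assumes that the driver reads them when VDPAU is initialized or when the presentation queue is created, so the call must happen before either step. The function name configure_vdpau_env is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/*
 * disable_overlay: non-zero to force the blit path.
 * sync_display:    display device name such as "CRT-1", or NULL to leave
 *                  the sync device unchanged.
 * Returns 0 on success, -1 if a variable could not be set.
 */
int configure_vdpau_env(int disable_overlay, const char *sync_display)
{
    /* Any string representation of a non-zero integer disables overlay. */
    if (disable_overlay &&
        setenv("VDPAU_NVIDIA_NO_OVERLAY", "1", 1) != 0)
        return -1;

    if (sync_display != NULL &&
        setenv("VDPAU_NVIDIA_SYNC_DISPLAY_DEVICE", sync_display, 1) != 0)
        return -1;

    return 0;
}
```

Calling configure_vdpau_env(1, "CRT-1") early in main would request the blit path and ask it to sync to the display device named CRT-1.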