High-resolution previews are currently a bit slow, even with hardware decode (added to PyAV in PyAV-Org/PyAV#1685). The problem is that most hardware decoders (e.g. VideoToolbox on Mac, D3D11/12 on Windows) return a UV-interleaved semi-planar format by default (NV12/P010LE). These formats are not very fast to convert to the packed RGB24 format we need.
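For reference, this is roughly the CPU-side repack we end up paying for today: libswscale converting the downloaded NV12 frame to packed RGB24. This is a minimal sketch, not the actual preview code; the function name and error handling are illustrative.

```c
#include <libswscale/swscale.h>
#include <libavutil/frame.h>
#include <libavutil/pixfmt.h>

/* Repack a downloaded NV12/P010LE frame to packed RGB24 on the CPU. */
static AVFrame *nv12_to_rgb24(const AVFrame *src)
{
    AVFrame *dst = av_frame_alloc();
    if (!dst)
        return NULL;
    dst->format = AV_PIX_FMT_RGB24;
    dst->width  = src->width;
    dst->height = src->height;
    if (av_frame_get_buffer(dst, 0) < 0) {
        av_frame_free(&dst);
        return NULL;
    }

    struct SwsContext *sws = sws_getContext(
        src->width, src->height, (enum AVPixelFormat)src->format,
        dst->width, dst->height, AV_PIX_FMT_RGB24,
        SWS_BILINEAR, NULL, NULL, NULL);
    if (!sws) {
        av_frame_free(&dst);
        return NULL;
    }
    sws_scale(sws, (const uint8_t *const *)src->data, src->linesize,
              0, src->height, dst->data, dst->linesize);
    sws_freeContext(sws);
    return dst;
}
```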
Many decoders actually support a wide range of output formats, and it would be great if we could get the GPU to do the conversion for us. Looking at the hardware decode API, one might believe that all this requires is calling av_hwframe_transfer_get_formats() and picking the format we want when calling av_hwframe_transfer_data(). That is misleading: if we look at the implementation of that function for various decoders, they each support only one download format! The way they work is that the hardware decodes into a GPU texture in some format, and we have to download in that same format, since it's a direct VRAM->DRAM transfer.
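A minimal sketch of that download path with the FFmpeg C API, assuming a decoder already set up with a hardware device context. The point is that the list returned by av_hwframe_transfer_get_formats() typically contains a single entry, so there is no real choice at transfer time:

```c
#include <libavutil/hwcontext.h>
#include <libavutil/frame.h>
#include <libavutil/mem.h>

static int download_hw_frame(const AVFrame *hw_frame, AVFrame *sw_frame)
{
    enum AVPixelFormat *formats = NULL;
    int ret = av_hwframe_transfer_get_formats(
        hw_frame->hw_frames_ctx, AV_HWFRAME_TRANSFER_DIRECTION_FROM,
        &formats, 0);
    if (ret < 0)
        return ret;

    /* Typically formats[0] is the only entry before the AV_PIX_FMT_NONE
     * terminator; asking for anything else just fails. */
    sw_frame->format = formats[0];
    av_freep(&formats);

    /* Direct VRAM -> DRAM copy in whatever format the frames context uses. */
    return av_hwframe_transfer_data(sw_frame, hw_frame, 0);
}
```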
The real solution is to tell the decoder to put the data into the format we want in the first place, in VRAM. Hardware decoders are configured through two mechanisms: the hardware device context and the hardware frames context. The device context is derived from codec + hardware config (we can pick a hardware type and a specific device, but that's about it); this is how we currently configure hardware decode. The frames context is an additional way of configuring the decoder that gives us more flexibility. It allows us to, among other things, look at the canonical software format for the frame in question (determined by the software codec) and decide what VRAM format we want. This is done in the get_format() callback (keep in mind that get_format() is called from C with the GIL released). If get_format() doesn't set a frames context, the decoder generates a default one and picks a format that works best for the hardware (but not necessarily what we want).
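A hedged sketch of what such a get_format() callback could look like at the C level (this is not existing PyAV code). It uses avcodec_get_hw_frames_parameters() to obtain the decoder's default frames parameters, then overrides sw_format before initialising the frames context. AV_PIX_FMT_VIDEOTOOLBOX and the YUV420P target are just examples, and whether a given sw_format is accepted depends on the device:

```c
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>

static enum AVPixelFormat get_format_cb(AVCodecContext *avctx,
                                        const enum AVPixelFormat *fmts)
{
    for (const enum AVPixelFormat *p = fmts; *p != AV_PIX_FMT_NONE; p++) {
        if (*p != AV_PIX_FMT_VIDEOTOOLBOX)
            continue;

        AVBufferRef *frames_ref = NULL;
        /* Ask the decoder for the frames parameters it would use by default... */
        if (avcodec_get_hw_frames_parameters(avctx, avctx->hw_device_ctx,
                                             *p, &frames_ref) < 0)
            break;

        AVHWFramesContext *frames = (AVHWFramesContext *)frames_ref->data;
        /* ...then override the VRAM pixel format before initialising it.
         * frames->sw_format starts as the codec's canonical software format
         * (e.g. NV12 for 8-bit H.264); request a planar layout instead. */
        frames->sw_format = AV_PIX_FMT_YUV420P;

        if (av_hwframe_ctx_init(frames_ref) < 0) {
            av_buffer_unref(&frames_ref);
            break;
        }
        avctx->hw_frames_ctx = frames_ref;  /* decoder takes this reference */
        return *p;
    }
    /* Fall back to the default (software) negotiation if setup fails. */
    return avcodec_default_get_format(avctx, fmts);
}
```

For PyAV this would presumably live in Cython, and since get_format() is invoked with the GIL released, any Python-level hook would have to reacquire it before touching Python objects.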
What we need:
Add a way to support generating hardware frames contexts in PyAV. This needs to happen inside get_format(), and we need to decide on the best API for it.
Add support in ReefShader to convert directly to packed RGB(A)32/64, or at least to fully planar YUV (see the constraint-query sketch below).
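As a starting point for deciding which target format to request, we can ask the device which software formats its frame pools support at all. This sketch just prints them, passing NULL as the hwconfig, which is enough for a rough check on most device types:

```c
#include <libavutil/hwcontext.h>
#include <libavutil/pixdesc.h>
#include <stdio.h>

static void print_supported_sw_formats(AVBufferRef *device_ref)
{
    AVHWFramesConstraints *c =
        av_hwdevice_get_hwframe_constraints(device_ref, NULL);
    if (!c)
        return;
    for (const enum AVPixelFormat *p = c->valid_sw_formats;
         p && *p != AV_PIX_FMT_NONE; p++)
        printf("supported sw_format: %s\n", av_get_pix_fmt_name(*p));
    av_hwframe_constraints_free(&c);
}
```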