
Make previews fast #1

Open · matthewlai opened this issue Dec 27, 2024 · 0 comments

matthewlai (Owner) commented Dec 27, 2024

High-resolution previews are currently a bit slow, even with hardware decode (added to PyAV in PyAV-Org/PyAV#1685). The problem is that most hardware decoders (e.g. VideoToolbox on macOS, D3D11/12 on Windows) return a UV-interleaved semi-planar format by default (NV12 or P010LE). These formats are not fast to convert to the packed RGB24 format we need.
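For illustration, a rough C sketch of what the slow path amounts to, assuming the decoder hands back NV12 and that the CPU-side conversion is done with libswscale (the function name and error handling are illustrative, not PyAV's actual internals):

```c
#include <libavutil/frame.h>
#include <libavutil/hwcontext.h>
#include <libswscale/swscale.h>

/* Download a decoded hardware frame and convert it to packed RGB24 on the
 * CPU. The sws_scale() step over a semi-planar NV12/P010 input is the part
 * that dominates the preview cost. */
static int hw_frame_to_rgb24(AVFrame *hw_frame, AVFrame *rgb)
{
    int ret;
    AVFrame *sw = av_frame_alloc();
    if (!sw)
        return AVERROR(ENOMEM);

    /* Direct VRAM -> DRAM copy in whatever format the decoder used (e.g. NV12). */
    if ((ret = av_hwframe_transfer_data(sw, hw_frame, 0)) < 0)
        goto done;

    rgb->format = AV_PIX_FMT_RGB24;
    rgb->width  = sw->width;
    rgb->height = sw->height;
    if ((ret = av_frame_get_buffer(rgb, 0)) < 0)
        goto done;

    /* CPU colour conversion: semi-planar YUV -> packed RGB24. */
    struct SwsContext *sws = sws_getContext(sw->width, sw->height, sw->format,
                                            rgb->width, rgb->height, AV_PIX_FMT_RGB24,
                                            SWS_BILINEAR, NULL, NULL, NULL);
    if (!sws) { ret = AVERROR(EINVAL); goto done; }
    sws_scale(sws, (const uint8_t * const *)sw->data, sw->linesize,
              0, sw->height, rgb->data, rgb->linesize);
    sws_freeContext(sws);
    ret = 0;
done:
    av_frame_free(&sw);
    return ret;
}
```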

Many decoders actually support a wide range of output formats, and it would be great if we could get the GPU to do the conversion for us. Looking at the hardware decode API, one might believe that this just requires calling av_hwframe_transfer_get_formats() and picking the format we want when calling av_hwframe_transfer_data(). That is misleading: if we look at the implementation of that function for the various decoders, they each support only one download format! The way they work is that the hardware decodes into a GPU texture in some format, and we have to download it in that same format, so the download is a direct VRAM->DRAM transfer.
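A small sketch of that query, just to show what it reports (most hwcontext backends list a single entry, the format the surface already has):

```c
#include <stdio.h>
#include <libavutil/frame.h>
#include <libavutil/hwcontext.h>
#include <libavutil/mem.h>
#include <libavutil/pixdesc.h>

/* Print the formats a decoded hardware frame can be downloaded in. Picking
 * anything other than the (usually single) reported format in
 * av_hwframe_transfer_data() is not a real conversion path. */
static void list_download_formats(const AVFrame *hw_frame)
{
    enum AVPixelFormat *formats = NULL;
    if (av_hwframe_transfer_get_formats(hw_frame->hw_frames_ctx,
                                        AV_HWFRAME_TRANSFER_DIRECTION_FROM,
                                        &formats, 0) < 0)
        return;
    for (enum AVPixelFormat *p = formats; *p != AV_PIX_FMT_NONE; p++)
        printf("downloadable as: %s\n", av_get_pix_fmt_name(*p));
    av_free(formats);
}
```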

The real solution is to tell the decoder to put the data into the format we want in the first place, in VRAM. Hardware decoders are configured through two mechanisms: the hardware device context and the hardware frames context. The hardware device context is derived from codec + hardware config (we can pick a hardware type and a specific device, but that's about it); this is how we currently configure hardware decode. The frames context is a (mostly additional) way of configuring the decoder that gives us more flexibility. It allows us, among other things, to look at the canonical software format for the frame in question (determined by the software codec) and decide which VRAM format we want. This is done in the get_format() callback (keep in mind that get_format() is called from C with the GIL released). If get_format() doesn't set a frames context, the decoder generates a default hardware frames context and picks a format that works best with the hardware (but not necessarily what we want).
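A minimal sketch of such a get_format() callback, using VideoToolbox as the example and assuming the driver accepts the requested sw_format (BGRA here); hw_device_ref is assumed to have been created earlier with av_hwdevice_ctx_create(), and the format choice is illustrative:

```c
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>

static AVBufferRef *hw_device_ref;  /* from av_hwdevice_ctx_create() */

static enum AVPixelFormat get_hw_format(AVCodecContext *avctx,
                                        const enum AVPixelFormat *fmts)
{
    for (const enum AVPixelFormat *p = fmts; *p != AV_PIX_FMT_NONE; p++) {
        if (*p != AV_PIX_FMT_VIDEOTOOLBOX)
            continue;

        /* Ask the codec for default frame parameters for this device... */
        AVBufferRef *frames_ref = NULL;
        if (avcodec_get_hw_frames_parameters(avctx, hw_device_ref, *p,
                                             &frames_ref) < 0)
            break;

        /* ...then override the software (VRAM) format before initialising
         * the frame pool. Whether a given format is accepted is
         * hardware/driver dependent. */
        AVHWFramesContext *fc = (AVHWFramesContext *)frames_ref->data;
        fc->sw_format = AV_PIX_FMT_BGRA;

        if (av_hwframe_ctx_init(frames_ref) < 0) {
            av_buffer_unref(&frames_ref);
            break;
        }
        /* Hand our reference over to the decoder. */
        avctx->hw_frames_ctx = frames_ref;
        return *p;
    }
    /* Fall back to FFmpeg's default selection (software, or a default
     * hardware frames context). */
    return avcodec_default_get_format(avctx, fmts);
}
```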

What we need:

  1. Add a way to generate hardware frames contexts in PyAV. This needs to happen inside get_format(), and we need to decide on the best API for it.
  2. Add support in ReefShader to convert, ideally, directly to packed RGB(A)32/64, or at least to fully planar YUV.