[SUGGESTION] AVX/AVX2/AVX-512 optimizations ? #62

Closed
MarcoRavich opened this issue Aug 7, 2023 · 4 comments

Comments

@MarcoRavich

Hi there,
since we don't have an embedded developer on our team, we honestly don't know whether this suggestion is technically sound...
...anyway we found some interesting resources about it:

  1. Improving the compute performance of video processing software using AVX (Advanced Vector Extensions) instructions
  2. [VIMEO] Optimizing for AVX2
  3. SIMD Acceleration for HEVC Decoding
  4. Accelerating x265 with Intel® Advanced Vector Extensions 512
  5. Intel AVX-512 tested in x265: how to enable it and does it help?
  6. AVX Optimizations and Performance: VisualStudio vs GCC

From what we understand, the performance improvement should range between 5 and 10%.

Hope that inspires!

@MarcoRavich
Author

Bump.

Many A/V decoders and encoders are introducing AVX optimizations; here are a couple of examples:

Last but not least, we've discovered this interesting (2019) article by @blegal and @cjego:

Hope that helps.

@iEvgeny
Owner

iEvgeny commented Sep 29, 2023

Low-level optimizations rely entirely on implementation in ffmpeg and are outside the purview of application software developers.
I'm currently busy implementing Zero Copy and I have to admit I'm stuck. The poor quality of graphics drivers for Linux and the specificity of the problem require more time. Progress is being made and I've already gotten the first results of reduced CPU and memory load, but I can't say anything about reduced playback latency yet.

@MarcoRavich
Author

First of all, thanks for the reply.

> Low-level optimizations rely entirely on implementation in ffmpeg and are outside the purview of application software developers.

Of course, but I believe that building binaries with AVX* compiler optimizations may help too.
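
To make the compiler-flag idea concrete, here is a minimal C++ sketch. It is not taken from CCTV Viewer or ffmpeg, and both function names are hypothetical. A plain loop such as `blend_average` is the kind of code GCC or Clang may auto-vectorize when built with flags like `-O3 -mavx2`, and the predefined macros `__AVX__`, `__AVX2__` and `__AVX512F__` show which instruction set the binary was compiled for. Keep in mind that a binary built this way will not run on CPUs that lack those instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reports the SIMD level this translation unit was
// compiled for, using macros the compiler predefines when flags such as
// -mavx, -mavx2 or -mavx512f are passed.
const char* compiled_simd_level() {
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#else
    return "baseline";
#endif
}

// Hypothetical pixel kernel: rounded average of two 8-bit buffers.
// The loop has no cross-iteration dependencies, so an optimizing compiler
// is free to process many bytes per instruction when SIMD is enabled.
void blend_average(const std::uint8_t* a, const std::uint8_t* b,
                   std::uint8_t* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        // Operands are promoted to int, so the sum cannot overflow.
        out[i] = static_cast<std::uint8_t>((a[i] + b[i] + 1) / 2);
    }
}
```

Printing `compiled_simd_level()` from two builds (with and without the AVX flags) is a quick way to confirm that the flags actually took effect before measuring anything.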

> I'm currently busy implementing Zero Copy and I have to admit I'm stuck. The poor quality of graphics drivers for Linux and the specificity of the problem require more time.

Dunno if it can help/inspire in any way, but I recently read this article about (VirtIO) ZeroCopy:
https://www.phoronix.com/news/VirtIO-Vsock-MSG-Zerocopy

> Progress is being made and I've already gotten the first results of reduced CPU and memory load, but I can't say anything about reduced playback latency yet.

CCTV latency is more than acceptable for its primary use (monitoring surveillance cameras), but optimizing it could allow it to be used in more time-sensitive applications.

Last but not least, if you feel temporarily stuck on ZC, it might be useful, even as a "recreation", to focus on other features (such as overlays).

Thanks again for your (volunteer) work!

@iEvgeny
Owner

iEvgeny commented Sep 30, 2023

CCTV Viewer does not operate on frame buffers directly. This is done by ffmpeg and partly by Qt.
As an experiment, you can rebuild ffmpeg and Qt on your machine locally with the desired optimizations. But I don't think it will have a noticeable effect.
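
Some background that may explain the limited effect: FFmpeg ships hand-written SIMD routines and picks among them at run time based on the CPU it detects, so its hot loops generally already use AVX2 where it is available, whatever flags the application was built with. The sketch below only illustrates that general dispatch pattern in C++. It is not FFmpeg's code, all function names are hypothetical, `sum_bytes_wide` is a plain unrolled stand-in rather than real AVX2 code, and `__builtin_cpu_supports` is a GCC/Clang builtin that is assumed here to be available on x86 builds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using SumFn = std::uint64_t (*)(const std::uint8_t*, std::size_t);

// Portable fallback implementation.
std::uint64_t sum_bytes_scalar(const std::uint8_t* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < n; ++i) total += p[i];
    return total;
}

// Stand-in for a SIMD-specialised version: four independent accumulators,
// then a scalar tail for the remaining bytes.
std::uint64_t sum_bytes_wide(const std::uint8_t* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    std::uint64_t lanes[4] = {0, 0, 0, 0};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        lanes[0] += p[i];
        lanes[1] += p[i + 1];
        lanes[2] += p[i + 2];
        lanes[3] += p[i + 3];
    }
    std::uint64_t total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < n; ++i) total += p[i];
    return total;
}

// Run-time CPU check; reports false on compilers or architectures
// where the builtin is not available.
bool cpu_has_avx2() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

// Choose the implementation once, at startup, and call it via the pointer.
SumFn select_sum_bytes() {
    return cpu_has_avx2() ? sum_bytes_wide : sum_bytes_scalar;
}
```

An application would call `select_sum_bytes()` once and reuse the returned pointer, which is why a single binary can run on old CPUs and still take the fast path on newer ones.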
