This issue was moved to a discussion.

AV1 support for HW decoding #1690

Closed
5 of 6 tasks
legraphista opened this issue Dec 23, 2024 · 4 comments
Comments

@legraphista
legraphista commented Dec 23, 2024

Overview

#1685 implemented HW support for CUDA devices. This support works for H264 and VP9, but not for AV1 (on supported devices).

Expected behavior

Decoding to be done on GPU.

Actual behavior

Decoding is done in software (even though allow_software_fallback is False)

Traceback:
n/a

Investigation

sample media: av1.webm

import av
import time

file = 'av1.webm'

hwaccel = av.codec.hwaccel.HWAccel(device_type='cuda', allow_software_fallback=False)
container = av.open(file, hwaccel=hwaccel)

start_time = time.time()
frame_count = 0
for packet in container.demux(video=0):
    for _ in packet.decode():
        frame_count += 1

hw_time = time.time() - start_time
hw_fps = frame_count / hw_time
container.close()

print(f"Decoded with cuda in {hw_time:.2f}s ({hw_fps:.2f} fps).")

Sanity Check:
FFmpeg:

$ ffmpeg -c:v av1_cuvid -i av1.webm -f null -
...
frame=  300 fps=0.0 q=-0.0 Lsize=N/A time=00:00:10.00 bitrate=N/A speed=17.3x    
# 30 FPS * 17.3x realtime ~ 500FPS decode speed

Py Sample:

$ python test.py
Decoded with cuda in 3.18s (94.25 fps).
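The comparison above can be put into numbers. This is a rough sketch that assumes ffmpeg's `speed=` figure is relative to realtime playback of the 30 fps source; `ffmpeg_decode_fps` is a hypothetical helper name, and the inputs are the figures reported in the two runs above.

```python
def ffmpeg_decode_fps(source_fps, speed):
    """Approximate decode rate implied by ffmpeg's `speed=` figure.

    Assumes `speed` is a multiple of realtime playback at `source_fps`.
    """
    return source_fps * speed


# Figures copied from the runs above.
ffmpeg_fps = ffmpeg_decode_fps(30, 17.3)  # ffmpeg -c:v av1_cuvid
pyav_fps = 94.25                          # PyAV with hwaccel requested

print(f"ffmpeg: ~{ffmpeg_fps:.0f} fps")
print(f"PyAV:   ~{pyav_fps:.0f} fps")
print(f"ratio:  ~{ffmpeg_fps / pyav_fps:.1f}x")
```

A gap of this size is only a hint: on its own it cannot distinguish an unoptimised hardware path from a software fallback.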

Reproduction

see above

Versions

PyAV v14.0.1
library configuration: --disable-static --enable-shared --libdir=/tmp/vendor/lib --prefix=/tmp/vendor --disable-alsa --disable-doc --disable-libtheora --disable-libfreetype --disable-libfontconfig --disable-libbluray --disable-libopenjpeg --disable-mediafoundation --enable-gmp --enable-gnutls --enable-libaom --enable-libdav1d --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopus --enable-libspeex --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libxcb --enable-libxml2 --enable-lzma --enable-zlib --enable-version3 --enable-libx264 --disable-libopenh264 --enable-libx265 --enable-gpl
library license: GPL version 3 or later
libavcodec     61. 19.100
libavdevice    61.  3.100
libavfilter    10.  4.100
libavformat    61.  7.100
libavutil      59. 39.100
libswresample   5.  3.100
libswscale      8.  3.100
  • I am/tried using the binary wheels
  • I compiled from source

Research

I have done the following:

Additional context

Tested on an RTX 4090; also reproduced on an L4 in GCP.

@matthewlai
Contributor

matthewlai commented Dec 23, 2024

A few questions:

  1. How do you know it's not doing hardware decoding? Is it just the speed difference? Currently the hw decode pipeline in PyAV hasn't been very optimised, so the speed is not expected to match ffmpeg CLI. You should be able to verify with nvidia-smi while the decode is happening.
  2. Does this work? ffmpeg -hwaccel cuda -i av1.webm -f null -
  3. Are you building PyAV with a custom ffmpeg? The ffmpeg shipped with the PyAV wheels doesn't have CUDA support.
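One way to act on the nvidia-smi suggestion in point 1 is to sample the decoder utilisation column while the decode loop runs. The sketch below assumes the usual `nvidia-smi dmon -s u` text layout (a `#`-prefixed header row naming columns such as `gpu` and `dec`, then one row per GPU); that layout may differ between driver versions, and both function names are hypothetical.

```python
import subprocess


def parse_dmon_decoder_util(dmon_output):
    """Return {gpu_index: decoder utilisation %} from `nvidia-smi dmon -s u` text."""
    columns = None
    result = {}
    for line in dmon_output.splitlines():
        fields = line.lstrip("#").split()
        if not fields:
            continue
        if line.startswith("#"):
            # The first header row names the columns; the second holds units.
            if columns is None and "dec" in fields:
                columns = fields
            continue
        if columns is None:
            continue
        row = dict(zip(columns, fields))
        # Unsupported metrics are printed as "-", so keep numeric values only.
        if row.get("gpu", "").isdigit() and row.get("dec", "").isdigit():
            result[int(row["gpu"])] = int(row["dec"])
    return result


def sample_decoder_util():
    """Take a single sample; requires nvidia-smi on PATH."""
    out = subprocess.run(
        ["nvidia-smi", "dmon", "-s", "u", "-c", "1"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_dmon_decoder_util(out)
```

Calling `sample_decoder_util()` from a second process or thread while the decode is running should show a non-zero value if the hardware decoder is actually in use.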

@legraphista
Author

legraphista commented Dec 23, 2024

  1. How do you know it's not doing hardware decoding? Is it just the speed difference? Currently the hw decode pipeline in PyAV hasn't been very optimised, so the speed is not expected to match ffmpeg CLI.

I re-compiled ffmpeg without libaom and the above code stopped working, whereas the ffmpeg command that uses HW decoding continued to work

  2. Does this work? ffmpeg -hwaccel cuda -i av1.webm -f null -
$ ffmpeg -hwaccel cuda -i av1.webm -f null -
...
frame=  300 fps=0.0 q=-0.0 Lsize=N/A time=00:00:10.00 bitrate=N/A speed=11.8x

$ echo $?
0
Full Logs

Tested on a GCP L4.
av1.mkv and av1.webm are the exact same file; GitHub wouldn't allow me to upload an .mkv.

(base) stefan@av1-decode-test-on-l4-with-pyav:~$ ffmpeg -c:v av1_cuvid -i av1.mkv -f null -
ffmpeg version git-2024-12-23-6c9218d Copyright (c) 2000-2024 the FFmpeg developers
  built with gcc 10 (Debian 10.2.1-6)
  configuration: --enable-nonfree --enable-cuda-nvcc --enable-cuda --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --enable-nonfree --enable-gpl --enable-cuda-nvcc --disable-static --enable-shared
  libavutil      59. 51.100 / 59. 51.100
  libavcodec     61. 28.100 / 61. 28.100
  libavformat    61.  9.101 / 61.  9.101
  libavdevice    61.  4.100 / 61.  4.100
  libavfilter    10.  6.101 / 10.  6.101
  libswscale      8. 12.100 /  8. 12.100
  libswresample   5.  4.100 /  5.  4.100
  libpostproc    58.  4.100 / 58.  4.100
Input #0, matroska,webm, from 'av1.mkv':
  Metadata:
    ENCODER         : Lavf61.9.101
  Duration: 00:00:10.00, start: 0.000000, bitrate: 59 kb/s
  Stream #0:0: Video: av1 (Main), yuv420p(tv, progressive), 3840x2160, SAR 1:1 DAR 16:9, 30 fps, 30 tbr, 1k tbn
    Metadata:
      ENCODER         : Lavc61.27.101 libaom-av1
      DURATION        : 00:00:10.000000000
Stream mapping:
  Stream #0:0 -> #0:0 (av1 (av1_cuvid) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf61.9.101
  Stream #0:0: Video: wrapped_avframe, nv12(tv, progressive), 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn
    Metadata:
      encoder         : Lavc61.28.100 wrapped_avframe
      DURATION        : 00:00:10.000000000
[out#0/null @ 0x55beb87c8100] video:129KiB audio:0KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: unknown
frame=  300 fps=0.0 q=-0.0 Lsize=N/A time=00:00:10.00 bitrate=N/A speed=  10x




(base) stefan@av1-decode-test-on-l4-with-pyav:~$ ffmpeg -hwaccel cuda -i av1.mkv -f null -
ffmpeg version git-2024-12-23-6c9218d Copyright (c) 2000-2024 the FFmpeg developers
  built with gcc 10 (Debian 10.2.1-6)
  configuration: --enable-nonfree --enable-cuda-nvcc --enable-cuda --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --enable-nonfree --enable-gpl --enable-cuda-nvcc --disable-static --enable-shared
  libavutil      59. 51.100 / 59. 51.100
  libavcodec     61. 28.100 / 61. 28.100
  libavformat    61.  9.101 / 61.  9.101
  libavdevice    61.  4.100 / 61.  4.100
  libavfilter    10.  6.101 / 10.  6.101
  libswscale      8. 12.100 /  8. 12.100
  libswresample   5.  4.100 /  5.  4.100
  libpostproc    58.  4.100 / 58.  4.100
Input #0, matroska,webm, from 'av1.mkv':
  Metadata:
    ENCODER         : Lavf61.9.101
  Duration: 00:00:10.00, start: 0.000000, bitrate: 59 kb/s
  Stream #0:0: Video: av1 (Main), yuv420p(tv, progressive), 3840x2160, SAR 1:1 DAR 16:9, 30 fps, 30 tbr, 1k tbn
    Metadata:
      ENCODER         : Lavc61.27.101 libaom-av1
      DURATION        : 00:00:10.000000000
Stream mapping:
  Stream #0:0 -> #0:0 (av1 (native) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf61.9.101
  Stream #0:0: Video: wrapped_avframe, nv12(tv, progressive), 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn
    Metadata:
      encoder         : Lavc61.28.100 wrapped_avframe
      DURATION        : 00:00:10.000000000
[out#0/null @ 0x55f467eb5100] video:129KiB audio:0KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: unknown
frame=  300 fps=176 q=-0.0 Lsize=N/A time=00:00:10.00 bitrate=N/A speed=5.88x    




(base) stefan@av1-decode-test-on-l4-with-pyav:~$ ffmpeg -i av1.mkv -f null -
ffmpeg version git-2024-12-23-6c9218d Copyright (c) 2000-2024 the FFmpeg developers
  built with gcc 10 (Debian 10.2.1-6)
  configuration: --enable-nonfree --enable-cuda-nvcc --enable-cuda --enable-cuvid --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --enable-nonfree --enable-gpl --enable-cuda-nvcc --disable-static --enable-shared
  libavutil      59. 51.100 / 59. 51.100
  libavcodec     61. 28.100 / 61. 28.100
  libavformat    61.  9.101 / 61.  9.101
  libavdevice    61.  4.100 / 61.  4.100
  libavfilter    10.  6.101 / 10.  6.101
  libswscale      8. 12.100 /  8. 12.100
  libswresample   5.  4.100 /  5.  4.100
  libpostproc    58.  4.100 / 58.  4.100
Input #0, matroska,webm, from 'av1.mkv':
  Metadata:
    ENCODER         : Lavf61.9.101
  Duration: 00:00:10.00, start: 0.000000, bitrate: 59 kb/s
  Stream #0:0: Video: av1 (Main), yuv420p(tv, progressive), 3840x2160, SAR 1:1 DAR 16:9, 30 fps, 30 tbr, 1k tbn
    Metadata:
      ENCODER         : Lavc61.27.101 libaom-av1
      DURATION        : 00:00:10.000000000
Stream mapping:
  Stream #0:0 -> #0:0 (av1 (native) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
[av1 @ 0x55998e39d300] Your platform doesn't support hardware accelerated AV1 decoding.
[av1 @ 0x55998e39d300] Failed to get pixel format.
[av1 @ 0x55998e39d300] Get current frame error
[vist#0:0/av1 @ 0x55998e38ab80] [dec:av1 @ 0x55998e39c780] Error submitting packet to decoder: Function not implemented
[av1 @ 0x55998e39d300] Your platform doesn't support hardware accelerated AV1 decoding.
[av1 @ 0x55998e39d300] Failed to get pixel format.
[av1 @ 0x55998e39d300] Get current frame error
[vist#0:0/av1 @ 0x55998e38ab80] [dec:av1 @ 0x55998e39c780] Error submitting packet to decoder: Function not implemented
[av1 @ 0x55998e39d300] Your platform doesn't support hardware accelerated AV1 decoding.
[av1 @ 0x55998e39d300] Failed to get pixel format.
[av1 @ 0x55998e39d300] Get current frame error
[vist#0:0/av1 @ 0x55998e38ab80] [dec:av1 @ 0x55998e39c780] Error submitting packet to decoder: Function not implemented
[av1 @ 0x55998e39d300] Your platform doesn't support hardware accelerated AV1 decoding.
[av1 @ 0x55998e39d300] Failed to get pixel format.
[av1 @ 0x55998e39d300] Get current frame error
[vist#0:0/av1 @ 0x55998e38ab80] [dec:av1 @ 0x55998e39c780] Error submitting packet to decoder: Function not implemented
[av1 @ 0x55998e39d300] Your platform doesn't support hardware accelerated AV1 decoding.
[av1 @ 0x55998e39d300] Failed to get pixel format.
[av1 @ 0x55998e39d300] Get current frame error

... truncated for brevity ...

[vist#0:0/av1 @ 0x55998e38ab80] [dec:av1 @ 0x55998e39c780] Error submitting packet to decoder: Function not implemented
[av1 @ 0x55998e39d300] Your platform doesn't support hardware accelerated AV1 decoding.
[av1 @ 0x55998e39d300] Failed to get pixel format.
[av1 @ 0x55998e39d300] Get current frame error
[vist#0:0/av1 @ 0x55998e38ab80] [dec:av1 @ 0x55998e39c780] Error submitting packet to decoder: Function not implemented
[av1 @ 0x55998e39d300] Your platform doesn't support hardware accelerated AV1 decoding.
[av1 @ 0x55998e39d300] Failed to get pixel format.
[av1 @ 0x55998e39d300] Get current frame error
[vist#0:0/av1 @ 0x55998e38ab80] [dec:av1 @ 0x55998e39c780] Error submitting packet to decoder: Function not implemented
[vist#0:0/av1 @ 0x55998e38ab80] [dec:av1 @ 0x55998e39c780] Decode error rate 1 exceeds maximum 0.666667
[vist#0:0/av1 @ 0x55998e38ab80] [dec:av1 @ 0x55998e39c780] Task finished with error code: -1145393733 (Error number -1145393733 occurred)
[vist#0:0/av1 @ 0x55998e38ab80] [dec:av1 @ 0x55998e39c780] Terminating thread with return code -1145393733 (Error number -1145393733 occurred)
[vf#0:0 @ 0x55998e39b980] No filtered frames for output stream, trying to initialize anyway.
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf61.9.101
  Stream #0:0: Video: wrapped_avframe, yuv420p(progressive), 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 1k tbn
    Metadata:
      encoder         : Lavc61.28.100 wrapped_avframe
      DURATION        : 00:00:10.000000000
[out#0/null @ 0x55998e396740] video:0KiB audio:0KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: unknown
[out#0/null @ 0x55998e396740] Output file is empty, nothing was encoded(check -ss / -t / -frames parameters if used)
frame=    0 fps=0.0 q=0.0 Lsize=N/A time=N/A bitrate=N/A speed=N/A    
Conversion failed!
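The numeric error code near the end of the log above can be unpacked. FFmpeg builds many of its error codes with the FFERRTAG macro, which packs four bytes into a negated 32-bit integer, so reversing that packing recovers a readable tag. A minimal sketch (`fferrtag_to_str` is a hypothetical helper name):

```python
def fferrtag_to_str(code):
    """Unpack a negative FFmpeg tag-style error code into its four bytes."""
    tag = -code
    # FFERRTAG stores the first character in the lowest byte.
    raw = bytes((tag >> shift) & 0xFF for shift in (0, 8, 16, 24))
    return raw.decode("latin-1")


print(fferrtag_to_str(-1145393733))  # prints ERED
```

The tag `ERED` looks like the ffmpeg CLI's "error rate exceeded" code, which would match the "Decode error rate 1 exceeds maximum 0.666667" line; that mapping is an assumption and is not confirmed anywhere in this thread.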

3. Are you building PyAV with a custom ffmpeg? The ffmpeg shipped with the PyAV wheels doesn't have CUDA support.

Yes. I compiled ffmpeg with CUDA support and shared libraries and installed it on the system (sudo make install && sudo ldconfig) before (re-)installing PyAV from master.

@matthewlai
Contributor

If you run python -m av --hwconfigs do you get a hwconfig for av1?

Can you do print(container.streams.video[0].codec_context.is_hwaccel)?

The way allow_software_fallback works right now is a bit misleading. It controls the case where the hardware decoder can be found and opened, but something goes wrong when setting it up to decode the actual video (e.g. resolution too high, subsampling mode not supported, etc.). In that case, if allow_software_fallback is set, we allow the decoder to do software decoding instead.

If the hardware decoder is never there to begin with, we silently fall back to software decoding, regardless of the allow_software_fallback setting. This is necessary because many files have multiple streams with different encodings, and there will be at least one that can't be hardware decoded (e.g. MJPEG). We don't want to fail in those cases. Maybe we can raise an exception if no stream can be accelerated.
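The two cases described above can be summarised as a small decision function. This is only a model of the behaviour as described in this comment, not PyAV's implementation; `choose_decode_path` and its parameter names are hypothetical.

```python
def choose_decode_path(hw_config_found, hw_setup_ok, allow_software_fallback):
    """Model which decode path is taken under the rules described above."""
    if not hw_config_found:
        # No hardware decoder for this codec: software decoding is used
        # silently, whatever allow_software_fallback says.
        return "software"
    if hw_setup_ok:
        return "hardware"
    if allow_software_fallback:
        # Decoder was found but could not be set up for this video.
        return "software"
    raise RuntimeError("hardware decoder found but could not be set up")


cases = [
    (False, False, False),
    (True, True, False),
    (True, False, True),
    (True, False, False),
]
for found, setup_ok, fallback in cases:
    try:
        outcome = choose_decode_path(found, setup_ok, fallback)
    except RuntimeError as exc:
        outcome = f"error: {exc}"
    print(f"found={found} setup_ok={setup_ok} fallback={fallback} -> {outcome}")
```

The first case is the surprising one: with no hardware decoder available, allow_software_fallback=False still ends in software decoding.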

@legraphista
Author

If you run python -m av --hwconfigs do you get a hwconfig for av1?

(base) stefan@av1-decode-test-on-l4-with-pyav:~$ python -m av --hwconfigs
Hardware configs:
    av1
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33475c00>
    av1_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33146890>
    h263
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x3347e5a0>
    h263p
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x3347e5a0>
    h264
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x3347e680>
    h264_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33146890>
    hevc
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x3347e800>
    hevc_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33146890>
    mjpeg
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x3347fa20>
    mjpeg_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33146890>
    mpeg1_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33146890>
    mpeg1video
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x334800f0>
    mpeg2_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33146890>
    mpeg2video
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33480080>
    mpeg4
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33480460>
    mpeg4_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33146890>
    vc1
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33483ac0>
    vc1_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33146890>
    vp8
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33483f60>
    vp8_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33146890>
    vp9
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33484100>
    vp9_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x33146890>
    wmv3
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x334839e0>

Can you do print(container.streams.video[0].codec_context.is_hwaccel)?

(base) stefan@av1-decode-test-on-l4-with-pyav:~$ python
Python 3.10.15 | packaged by conda-forge | (main, Oct 16 2024, 01:24:24) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import av
>>> container = av.open('av1.mkv')
>>> print(container.streams.video[0].codec_context.is_hwaccel)
False
>>> hwaccel = av.codec.hwaccel.HWAccel(device_type='cuda', allow_software_fallback=False)
>>> container = av.open('av1.mkv', hwaccel=hwaccel)
>>> print(container.streams.video[0].codec_context.is_hwaccel)
True
>>> 

Strangely, the paths diverge here.
On the GCP L4, I cannot decode the stream.
But on my PC, where libaom is compiled in and configured (so I can decode the file through the unintended software fallback, verified with nvtop), I get the following:

container = av.open('av1.mkv')
print(container.streams.video[0].codec_context.is_hwaccel)
# False

hwaccel = av.codec.hwaccel.HWAccel(device_type='cuda', allow_software_fallback=False)
container = av.open('av1.mkv', hwaccel=hwaccel)
print(container.streams.video[0].codec_context.is_hwaccel)
# False

Both systems have AV1 HW support:

Hardware configs:
    av1
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x7a6ba5c0>
    av1_cuvid
        <av.HWConfig device_type=cuda format=cuda is_supported=True at 0x7a3bba80>
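Since is_hwaccel reads False on the second machine even with allow_software_fallback=False, a caller-side guard could make the silent fallback explicit. This is a minimal sketch that relies only on the `codec_context.is_hwaccel` attribute used above; `require_hw_decoding` is a hypothetical helper, not a PyAV API, and the stand-in stream exists only so the example runs without PyAV or a GPU.

```python
from types import SimpleNamespace


def require_hw_decoding(video_streams):
    """Raise if none of the given video streams reports hardware decoding."""
    flags = [bool(s.codec_context.is_hwaccel) for s in video_streams]
    if not any(flags):
        raise RuntimeError("no video stream is hardware accelerated")
    return flags


# Stand-in for a stream whose codec context reports is_hwaccel == False.
fake_stream = SimpleNamespace(codec_context=SimpleNamespace(is_hwaccel=False))

try:
    require_hw_decoding([fake_stream])
except RuntimeError as exc:
    print(f"guard tripped: {exc}")
```

With PyAV, the same function could be called as `require_hw_decoding(container.streams.video)` right after `av.open(...)` and before timing the decode loop.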

PyAV-Org locked and limited conversation to collaborators on Dec 23, 2024
WyattBlue converted this issue into discussion #1691 on Dec 23, 2024
