.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "tutorials/nvenc_tutorial.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note Click :ref:`here ` to download the full example code .. rst-class:: sphx-glr-example-title .. _sphx_glr_tutorials_nvenc_tutorial.py: Accelerated video encoding with NVENC ===================================== .. _nvenc_tutorial: **Author**: `Moto Hira `__ .. warning:: Starting with version 2.8, we are refactoring TorchAudio to transition it into a maintenance phase. As a result: - The APIs described in this tutorial are deprecated in 2.8 and will be removed in 2.9. - The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. This tutorial shows how to use NVIDIA’s hardware video encoder (NVENC) with TorchAudio, and how it improves the performance of video encoding. .. GENERATED FROM PYTHON SOURCE LINES 24-47 .. note:: This tutorial requires FFmpeg libraries compiled with HW acceleration enabled. Please refer to :ref:`Enabling GPU video decoder/encoder ` for how to build FFmpeg with HW acceleration. .. note:: Most modern GPUs have both a HW decoder and a HW encoder, but some high-end GPUs such as A100 and H100 do not have a HW encoder. Please refer to the following for availability and format coverage. https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new Attempting to use the HW encoder on these GPUs fails with an error message like ``Generic error in an external library``. You can enable debug logging with :py:func:`torchaudio.utils.ffmpeg_utils.set_log_level` to see more detailed error messages issued along the way. .. GENERATED FROM PYTHON SOURCE LINES 47-61 .. 
code-block:: default import torch import torchaudio print(torch.__version__) print(torchaudio.__version__) import io import time import matplotlib.pyplot as plt from IPython.display import Video from torchaudio.io import StreamReader, StreamWriter .. rst-class:: sphx-glr-script-out .. code-block:: none 2.8.0+cu126 2.8.0 .. GENERATED FROM PYTHON SOURCE LINES 62-68 Check the prerequisites ----------------------- First, we check that TorchAudio correctly detects FFmpeg libraries that support HW decoder/encoder. .. GENERATED FROM PYTHON SOURCE LINES 69-72 .. code-block:: default from torchaudio.utils import ffmpeg_utils .. GENERATED FROM PYTHON SOURCE LINES 74-79 .. code-block:: default print("FFmpeg Library versions:") for k, ver in ffmpeg_utils.get_versions().items(): print(f" {k}:\t{'.'.join(str(v) for v in ver)}") .. rst-class:: sphx-glr-script-out .. code-block:: none FFmpeg Library versions: /pytorch/audio/examples/tutorials/nvenc_tutorial.py:76: UserWarning: torio.utils.ffmpeg_utils.get_versions has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. for k, ver in ffmpeg_utils.get_versions().items(): libavcodec: 60.3.100 libavdevice: 60.1.100 libavfilter: 9.3.100 libavformat: 60.3.100 libavutil: 58.2.100 .. GENERATED FROM PYTHON SOURCE LINES 81-86 .. code-block:: default print("Available NVENC Encoders:") for k in ffmpeg_utils.get_video_encoders().keys(): if "nvenc" in k: print(f" - {k}") .. rst-class:: sphx-glr-script-out .. code-block:: none Available NVENC Encoders: /pytorch/audio/examples/tutorials/nvenc_tutorial.py:82: UserWarning: torio.utils.ffmpeg_utils.get_video_encoders has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. 
Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. for k in ffmpeg_utils.get_video_encoders().keys(): - av1_nvenc - h264_nvenc - hevc_nvenc .. GENERATED FROM PYTHON SOURCE LINES 88-93 .. code-block:: default print("Available GPU:") print(torch.cuda.get_device_properties(0)) .. rst-class:: sphx-glr-script-out .. code-block:: none Available GPU: _CudaDeviceProperties(name='NVIDIA A10G', major=8, minor=6, total_memory=22598MB, multi_processor_count=80, uuid=566db26d-0405-a011-4198-2699df443f87, pci_bus_id=0, pci_device_id=30, pci_domain_id=0, L2_cache_size=6MB) .. GENERATED FROM PYTHON SOURCE LINES 94-97 We use the following helper function to generate test frame data. For details of synthetic video generation, please refer to :ref:`StreamReader Advanced Usage `. .. GENERATED FROM PYTHON SOURCE LINES 97-108 .. code-block:: default def get_data(height, width, format="yuv444p", frame_rate=30000 / 1001, duration=4): src = f"testsrc2=rate={frame_rate}:size={width}x{height}:duration={duration}" s = StreamReader(src=src, format="lavfi") s.add_basic_video_stream(-1, format=format) s.process_all_packets() (video,) = s.pop_chunks() return video .. GENERATED FROM PYTHON SOURCE LINES 109-116 Encoding videos with NVENC -------------------------- To use the HW video encoder, you need to specify it when defining the output video stream by providing the ``encoder`` option to :py:meth:`~torchaudio.io.StreamWriter.add_video_stream`. .. GENERATED FROM PYTHON SOURCE LINES 119-129 .. code-block:: default pict_config = { "height": 360, "width": 640, "frame_rate": 30000 / 1001, "format": "yuv444p", } frame_data = get_data(**pict_config) .. rst-class:: sphx-glr-script-out .. code-block:: none /pytorch/audio/examples/tutorials/nvenc_tutorial.py:101: UserWarning: torio.io._streaming_media_decoder.StreamingMediaDecoder has been deprecated. 
This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. s = StreamReader(src=src, format="lavfi") .. GENERATED FROM PYTHON SOURCE LINES 131-137 .. code-block:: default w = StreamWriter(io.BytesIO(), format="mp4") w.add_video_stream(**pict_config, encoder="h264_nvenc", encoder_format="yuv444p") with w.open(): w.write_video_chunk(0, frame_data) .. rst-class:: sphx-glr-script-out .. code-block:: none /pytorch/audio/examples/tutorials/nvenc_tutorial.py:132: UserWarning: torio.io._streaming_media_encoder.StreamingMediaEncoder has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. w = StreamWriter(io.BytesIO(), format="mp4") .. GENERATED FROM PYTHON SOURCE LINES 138-142 As with the HW decoder, the encoder by default expects the frame data to be in CPU memory. To send data from CUDA memory, you need to specify the ``hw_accel`` option. .. GENERATED FROM PYTHON SOURCE LINES 142-151 .. code-block:: default buffer = io.BytesIO() w = StreamWriter(buffer, format="mp4") w.add_video_stream(**pict_config, encoder="h264_nvenc", encoder_format="yuv444p", hw_accel="cuda:0") with w.open(): w.write_video_chunk(0, frame_data.to(torch.device("cuda:0"))) buffer.seek(0) video_cuda = buffer.read() .. rst-class:: sphx-glr-script-out .. code-block:: none /pytorch/audio/examples/tutorials/nvenc_tutorial.py:144: UserWarning: torio.io._streaming_media_encoder.StreamingMediaEncoder has been deprecated. 
This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. w = StreamWriter(buffer, format="mp4") .. GENERATED FROM PYTHON SOURCE LINES 153-156 .. code-block:: default Video(video_cuda, embed=True, mimetype="video/mp4") .. raw:: html


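Whether ``hw_accel`` should be set depends only on where the frame tensor lives. As a minimal sketch, one could derive the value from the tensor's device. The helper ``hw_accel_for`` is hypothetical and not part of TorchAudio, and falling back to device 0 when the device has no index is an assumption. The stand-in objects only mimic the ``device.type`` and ``device.index`` attributes of a ``torch.Tensor`` so that the sketch runs without a GPU.

```python
from types import SimpleNamespace


def hw_accel_for(frames):
    """Pick a value for the ``hw_accel`` option from the location of ``frames``.

    Hypothetical helper: ``frames`` only needs a ``device`` attribute with
    ``type`` and ``index`` fields, as a ``torch.Tensor`` has.
    """
    device = frames.device
    if device.type != "cuda":
        # CPU frames: leave ``hw_accel`` unset, matching the default behaviour.
        return None
    # A bare "cuda" device has no index; using device 0 here is an assumption.
    index = 0 if device.index is None else device.index
    return f"cuda:{index}"


# Stand-ins for tensors so that the sketch runs without torch or a GPU.
cpu_frames = SimpleNamespace(device=SimpleNamespace(type="cpu", index=None))
cuda_frames = SimpleNamespace(device=SimpleNamespace(type="cuda", index=0))

print(hw_accel_for(cpu_frames))
print(hw_accel_for(cuda_frames))
```

With a real tensor, the result could be passed as ``hw_accel=hw_accel_for(frame_data)`` to :py:meth:`~torchaudio.io.StreamWriter.add_video_stream`.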
.. GENERATED FROM PYTHON SOURCE LINES 157-167 Benchmark NVENC with StreamWriter --------------------------------- Now we compare the performance of the software encoder and the hardware encoder. Similar to the benchmark in NVDEC, we process videos of different resolutions and measure the time it takes to encode them. We also measure the size of the resulting video file. .. GENERATED FROM PYTHON SOURCE LINES 169-172 The following function encodes the given frames and measures the time it takes to encode them and the size of the resulting video data. The frames are expected to be in CUDA memory; when ``hw_accel`` is not given, they are copied to CPU memory inside the timed region. .. GENERATED FROM PYTHON SOURCE LINES 172-193 .. code-block:: default def test_encode(data, encoder, width, height, hw_accel=None, **config): assert data.is_cuda buffer = io.BytesIO() s = StreamWriter(buffer, format="mp4") s.add_video_stream(encoder=encoder, width=width, height=height, hw_accel=hw_accel, **config) with s.open(): t0 = time.monotonic() if hw_accel is None: data = data.to("cpu") s.write_video_chunk(0, data) elapsed = time.monotonic() - t0 size = buffer.tell() fps = len(data) / elapsed print(f" - Processed {len(data)} frames in {elapsed:.2f} seconds. ({fps:.2f} fps)") print(f" - Encoded data size: {size} bytes") return elapsed, size .. GENERATED FROM PYTHON SOURCE LINES 194-199 We conduct the tests for the following configurations: - Software encoder with 1, 4, and 8 threads - Hardware encoder with and without the ``hw_accel`` option. .. GENERATED FROM PYTHON SOURCE LINES 199-254 .. 
code-block:: default def run_tests(height, width, duration=4): # Generate the test data print(f"Testing resolution: {width}x{height}") pict_config = { "height": height, "width": width, "frame_rate": 30000 / 1001, "format": "yuv444p", } data = get_data(**pict_config, duration=duration) data = data.to(torch.device("cuda:0")) times = [] sizes = [] # Test software encoding encoder_config = { "encoder": "libx264", "encoder_format": "yuv444p", } for i, num_threads in enumerate([1, 4, 8]): print(f"* Software Encoder (num_threads={num_threads})") time_, size = test_encode( data, encoder_option={"threads": str(num_threads)}, **pict_config, **encoder_config, ) times.append(time_) if i == 0: sizes.append(size) # Test hardware encoding encoder_config = { "encoder": "h264_nvenc", "encoder_format": "yuv444p", "encoder_option": {"gpu": "0"}, } for i, hw_accel in enumerate([None, "cuda"]): print(f"* Hardware Encoder {'(CUDA frames)' if hw_accel else ''}") time_, size = test_encode( data, **pict_config, **encoder_config, hw_accel=hw_accel, ) times.append(time_) if i == 0: sizes.append(size) return times, sizes .. GENERATED FROM PYTHON SOURCE LINES 255-261 We then vary the resolution of the videos to see how these measurements change. 360P ---- .. GENERATED FROM PYTHON SOURCE LINES 261-264 .. code-block:: default time_360, size_360 = run_tests(360, 640) .. rst-class:: sphx-glr-script-out .. code-block:: none Testing resolution: 640x360 /pytorch/audio/examples/tutorials/nvenc_tutorial.py:101: UserWarning: torio.io._streaming_media_decoder.StreamingMediaDecoder has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. 
s = StreamReader(src=src, format="lavfi") * Software Encoder (num_threads=1) /pytorch/audio/examples/tutorials/nvenc_tutorial.py:178: UserWarning: torio.io._streaming_media_encoder.StreamingMediaEncoder has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. s = StreamWriter(buffer, format="mp4") - Processed 120 frames in 0.62 seconds. (192.36 fps) - Encoded data size: 381331 bytes * Software Encoder (num_threads=4) - Processed 120 frames in 0.24 seconds. (501.65 fps) - Encoded data size: 381307 bytes * Software Encoder (num_threads=8) - Processed 120 frames in 0.18 seconds. (672.48 fps) - Encoded data size: 390689 bytes * Hardware Encoder - Processed 120 frames in 0.05 seconds. (2264.57 fps) - Encoded data size: 1262979 bytes * Hardware Encoder (CUDA frames) - Processed 120 frames in 0.05 seconds. (2583.08 fps) - Encoded data size: 1262979 bytes .. GENERATED FROM PYTHON SOURCE LINES 265-268 720P ---- .. GENERATED FROM PYTHON SOURCE LINES 268-271 .. code-block:: default time_720, size_720 = run_tests(720, 1280) .. rst-class:: sphx-glr-script-out .. code-block:: none Testing resolution: 1280x720 /pytorch/audio/examples/tutorials/nvenc_tutorial.py:101: UserWarning: torio.io._streaming_media_decoder.StreamingMediaDecoder has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. 
s = StreamReader(src=src, format="lavfi") * Software Encoder (num_threads=1) /pytorch/audio/examples/tutorials/nvenc_tutorial.py:178: UserWarning: torio.io._streaming_media_encoder.StreamingMediaEncoder has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. s = StreamWriter(buffer, format="mp4") - Processed 120 frames in 2.31 seconds. (51.94 fps) - Encoded data size: 1335451 bytes * Software Encoder (num_threads=4) - Processed 120 frames in 0.94 seconds. (128.08 fps) - Encoded data size: 1336418 bytes * Software Encoder (num_threads=8) - Processed 120 frames in 0.88 seconds. (136.99 fps) - Encoded data size: 1344063 bytes * Hardware Encoder - Processed 120 frames in 0.33 seconds. (367.70 fps) - Encoded data size: 1358969 bytes * Hardware Encoder (CUDA frames) - Processed 120 frames in 0.15 seconds. (801.61 fps) - Encoded data size: 1358969 bytes .. GENERATED FROM PYTHON SOURCE LINES 272-275 1080P ----- .. GENERATED FROM PYTHON SOURCE LINES 275-278 .. code-block:: default time_1080, size_1080 = run_tests(1080, 1920) .. rst-class:: sphx-glr-script-out .. code-block:: none Testing resolution: 1920x1080 /pytorch/audio/examples/tutorials/nvenc_tutorial.py:101: UserWarning: torio.io._streaming_media_decoder.StreamingMediaDecoder has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. 
s = StreamReader(src=src, format="lavfi") * Software Encoder (num_threads=1) /pytorch/audio/examples/tutorials/nvenc_tutorial.py:178: UserWarning: torio.io._streaming_media_encoder.StreamingMediaEncoder has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see https://github.com/pytorch/audio/issues/3902 for more information. It will be removed from the 2.9 release. s = StreamWriter(buffer, format="mp4") - Processed 120 frames in 4.77 seconds. (25.18 fps) - Encoded data size: 2678241 bytes * Software Encoder (num_threads=4) - Processed 120 frames in 1.84 seconds. (65.11 fps) - Encoded data size: 2682028 bytes * Software Encoder (num_threads=8) - Processed 120 frames in 1.71 seconds. (70.31 fps) - Encoded data size: 2685086 bytes * Hardware Encoder - Processed 120 frames in 0.72 seconds. (166.12 fps) - Encoded data size: 1705900 bytes * Hardware Encoder (CUDA frames) - Processed 120 frames in 0.32 seconds. (370.92 fps) - Encoded data size: 1705900 bytes .. GENERATED FROM PYTHON SOURCE LINES 279-281 Now we plot the result. .. GENERATED FROM PYTHON SOURCE LINES 281-321 .. 
code-block:: default def plot(): fig, axes = plt.subplots(2, 1, sharex=True, figsize=[9.6, 7.2]) for items in zip(time_360, time_720, time_1080, "ov^X+"): axes[0].plot(items[:-1], marker=items[-1]) axes[0].grid(axis="both") axes[0].set_xticks([0, 1, 2], ["360p", "720p", "1080p"], visible=True) axes[0].tick_params(labeltop=False) axes[0].legend( [ "Software Encoding (threads=1)", "Software Encoding (threads=4)", "Software Encoding (threads=8)", "Hardware Encoding (CPU Tensor)", "Hardware Encoding (CUDA Tensor)", ] ) axes[0].set_title("Time to encode videos with different resolutions") axes[0].set_ylabel("Time [s]") for items in zip(size_360, size_720, size_1080, "v^"): axes[1].plot(items[:-1], marker=items[-1]) axes[1].grid(axis="both") axes[1].set_xticks([0, 1, 2], ["360p", "720p", "1080p"]) axes[1].set_ylabel("The encoded size [bytes]") axes[1].set_title("The size of encoded videos") axes[1].legend( [ "Software Encoding", "Hardware Encoding", ] ) plt.tight_layout() plot() .. image-sg:: /tutorials/images/sphx_glr_nvenc_tutorial_001.png :alt: Time to encode videos with different resolutions, The size of encoded videos :srcset: /tutorials/images/sphx_glr_nvenc_tutorial_001.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 322-347 Result ------ We observe a couple of things: - The time to encode video grows as the resolution becomes larger. - In the case of software encoding, increasing the number of threads helps reduce the encoding time. - The gain from extra threads diminishes around 8. - Hardware encoding is faster than software encoding in general. - Using ``hw_accel`` does not improve the speed of encoding itself as much; the gap seen at 720p and 1080p includes the time to copy frames from CUDA to CPU memory, which is part of the measurement when ``hw_accel`` is not used. - The size of the resulting videos grows as the resolution becomes larger. - The hardware encoder produces a smaller video file at the largest resolution (1080p), but a larger one at 360p. The last point is somewhat surprising to the author (who is not an expert in video production). 
It is often said that hardware encoders produce larger videos compared to software encoders. Some say that software encoders allow fine-grained control over the encoding configuration, so the resulting video is closer to optimal. Meanwhile, hardware encoders are optimized for performance and thus do not provide as much control over quality and binary size. .. GENERATED FROM PYTHON SOURCE LINES 349-361 Quality Spotcheck ----------------- So, how is the quality of videos produced with hardware encoders? A quick spot check of the videos reveals more noticeable artifacts at higher resolution, which might explain the smaller binary size (that is, the encoder is not allocating enough bits to produce quality output). The following images are raw frames of videos encoded with hardware encoders. .. GENERATED FROM PYTHON SOURCE LINES 363-369 360P ---- .. raw:: html NVENC sample 360P .. GENERATED FROM PYTHON SOURCE LINES 371-377 720P ---- .. raw:: html NVENC sample 720P .. GENERATED FROM PYTHON SOURCE LINES 379-385 1080P ----- .. raw:: html NVENC sample 1080P .. GENERATED FROM PYTHON SOURCE LINES 387-394 We can see that there are more noticeable artifacts at higher resolution. Perhaps one might be able to reduce these using the ``encoder_option`` argument. We did not try, but if you try that and find a better quality setting, feel free to let us know. ;) .. GENERATED FROM PYTHON SOURCE LINES 397-398 Tag: :obj:`torchaudio.io` .. rst-class:: sphx-glr-timing **Total running time of the script:** ( 0 minutes 23.330 seconds) .. _sphx_glr_download_tutorials_nvenc_tutorial.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: nvenc_tutorial.py ` .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: nvenc_tutorial.ipynb ` .. only:: html .. 
rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_