torchaudio.load_with_torchcodec

torchaudio.load_with_torchcodec(uri: Union[BinaryIO, str, PathLike], frame_offset: int = 0, num_frames: int = -1, normalize: bool = True, channels_first: bool = True, format: Optional[str] = None, buffer_size: int = 4096, backend: Optional[str] = None) → Tuple[Tensor, int]

Load audio data from source using TorchCodec’s AudioDecoder.

Note

This function supports the same API as load() and relies on TorchCodec’s decoding capabilities under the hood. It is provided for convenience, but we recommend porting your code to use TorchCodec’s AudioDecoder class natively for better performance: https://docs.pytorch.org/torchcodec/stable/generated/torchcodec.decoders.AudioDecoder. In TorchAudio 2.9, load() will rely on load_with_torchcodec(). Note that some parameters of load(), such as normalize, buffer_size, and backend, are ignored by load_with_torchcodec().
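As a rough illustration of the recommended port, the sketch below maps frame_offset and num_frames onto TorchCodec’s time-based AudioDecoder.get_samples_played_in_range(). The helper names frames_to_seconds and load_native are hypothetical, and the frame-to-second conversion is an assumption about how the arguments could be translated, not necessarily what load_with_torchcodec() does internally.

```python
from typing import Optional, Tuple


def frames_to_seconds(
    frame_offset: int, num_frames: int, sample_rate: int
) -> Tuple[float, Optional[float]]:
    """Convert a frame window into (start_seconds, stop_seconds).

    stop_seconds is None when num_frames is -1, meaning "decode to the end".
    """
    start = frame_offset / sample_rate
    if num_frames == -1:
        return start, None
    return start, (frame_offset + num_frames) / sample_rate


def load_native(path: str, frame_offset: int = 0, num_frames: int = -1):
    """Hypothetical helper: decode a window with torchcodec directly."""
    # Imported lazily so the pure helper above works without torchcodec.
    from torchcodec.decoders import AudioDecoder

    decoder = AudioDecoder(path)
    sample_rate = decoder.metadata.sample_rate
    start, stop = frames_to_seconds(frame_offset, num_frames, sample_rate)
    samples = decoder.get_samples_played_in_range(
        start_seconds=start, stop_seconds=stop
    )
    # samples.data is a float tensor laid out as [channel, time].
    return samples.data, samples.sample_rate
```

Calling load_native("speech.wav", frame_offset=8000, num_frames=16000) on a 16 kHz file (the file name is a placeholder) would request the range from 0.5 s to 1.5 s.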

Parameters

uri (path-like object or file-like object) –
Source of audio data. The following types are accepted:
- path-like: File path or URL.
- file-like: Object with read(size: int) -> bytes method.
frame_offset (int, optional) – Number of samples to skip before reading starts.
num_frames (int, optional) – Maximum number of samples to read. -1 reads all the remaining samples, starting from frame_offset.
normalize (bool, optional) – TorchCodec always returns normalized float32 samples. This parameter is ignored and a warning is issued if set to False. Default: True.
channels_first (bool, optional) – When True, the returned Tensor has dimension [channel, time]. Otherwise, the returned Tensor’s dimension is [time, channel].
format (str or None, optional) – Format hint for the decoder. May not be supported by all TorchCodec decoders. (Default: None)
buffer_size (int, optional) – Not used by TorchCodec AudioDecoder. Provided for API compatibility.
backend (str or None, optional) – Not used by TorchCodec AudioDecoder. Provided for API compatibility.
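The parameters above can be exercised with a small, self-generated file. The following is a minimal sketch, assuming torchaudio and torchcodec are installed; write_sine_wav and load_window are hypothetical helpers introduced only for this example, and the test tone is built with the Python standard library.

```python
import math
import struct
import wave


def write_sine_wav(path: str, sample_rate: int = 8000, seconds: float = 1.0) -> int:
    """Write a mono 16-bit PCM 440 Hz sine wave; return the number of frames."""
    n = int(sample_rate * seconds)
    frames = b"".join(
        struct.pack(
            "<h", int(32767 * 0.5 * math.sin(2 * math.pi * 440 * i / sample_rate))
        )
        for i in range(n)
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(frames)
    return n


def load_window(path: str):
    """Hypothetical helper: skip 2000 samples, then read at most 4000."""
    # Imported lazily so write_sine_wav works without torchaudio installed.
    import torchaudio

    waveform, sample_rate = torchaudio.load_with_torchcodec(
        path,
        frame_offset=2000,
        num_frames=4000,
        channels_first=False,  # request [time, channel] layout
    )
    return waveform, sample_rate
```

Going by the parameter descriptions, load_window on the generated one-second file should yield a float32 tensor laid out as [time, channel], with at most 4000 rows and a single channel.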

Returns

Resulting Tensor and sample rate. Always returns float32 tensors. If channels_first=True, shape is [channel, time], otherwise [time, channel].

Return type

(torch.Tensor, int)

Raises

Note

TorchCodec always returns normalized float32 samples, so the normalize parameter has no effect.
The buffer_size and backend parameters are ignored.
Not all audio formats supported by torchaudio backends may be supported by TorchCodec.
