Decoding audio streams with AudioDecoder
In this example, we’ll learn how to decode an audio file using the AudioDecoder class.
First, a bit of boilerplate: we’ll download an audio file from the web and define an audio playing utility. You can ignore that part and jump right below to Creating a decoder.
import requests
from IPython.display import Audio
def play_audio(samples):
    return Audio(samples.data, rate=samples.sample_rate)
# Audio source is CC0: https://opengameart.org/content/town-theme-rpg
# Attribution: cynicmusic.com pixelsphere.org
url = "https://opengameart.org/sites/default/files/TownTheme.mp3"
response = requests.get(url, headers={"User-Agent": ""})
if response.status_code != 200:
    raise RuntimeError(f"Failed to download audio. {response.status_code = }.")
raw_audio_bytes = response.content
Creating a decoder
We can now create a decoder from the raw (encoded) audio bytes. You can of course use a local audio file and pass the path as input. You can also decode audio streams from videos!
from torchcodec.decoders import AudioDecoder
decoder = AudioDecoder(raw_audio_bytes)
The audio has not yet been decoded by the decoder, but we already have access to some metadata via the metadata attribute, which is an AudioStreamMetadata object.
print(decoder.metadata)
AudioStreamMetadata:
duration_seconds_from_header: 97.48898
begin_stream_seconds_from_header: 0.025057
bit_rate: 108039.0
codec: mp3
stream_index: 0
sample_rate: 44100
num_channels: 2
sample_format: fltp
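The metadata alone is enough to estimate how large the decoded output will be before decoding anything. The sketch below is plain Python arithmetic on the values printed above. It assumes each decoded sample is stored as a 32-bit float (4 bytes), which this section does not state, and the duration comes from the file header, so the result should be read as an approximation only.

```python
# Values copied from the metadata printed above.
duration_seconds_from_header = 97.48898
sample_rate = 44100
num_channels = 2

# Assumption (not confirmed in this section): 4 bytes per decoded sample.
bytes_per_sample = 4

# Samples per channel is roughly duration times sample rate.
approx_samples_per_channel = round(duration_seconds_from_header * sample_rate)

# Total size across all channels.
approx_bytes = approx_samples_per_channel * num_channels * bytes_per_sample

print(f"~{approx_samples_per_channel} samples per channel")
print(f"~{approx_bytes / 1e6:.1f} MB once decoded")
```

Under these assumptions the file decodes to roughly 4.3 million samples per channel, which is useful to know before loading a long recording into memory.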
Decoding samples
To get decoded samples, we just need to call the get_all_samples() method, which returns an AudioSamples object:
samples = decoder.get_all_samples()
print(samples)
play_audio(samples)
AudioSamples:
data (shape): torch.Size([2, 4297722])
pts_seconds: 0.02505668934240363
duration_seconds: 97.45401360544217
sample_rate: 44100
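The printed fields are related by simple arithmetic, which can serve as a sanity check on decoded output. The sketch below uses only the numbers shown above (no decoding involved) and checks that the duration equals the number of samples per channel divided by the sample rate. It also converts pts_seconds into a sample count; the variable names are illustrative.

```python
# Values copied from the AudioSamples output printed above.
num_channels, num_samples = 2, 4297722
sample_rate = 44100
pts_seconds = 0.02505668934240363
reported_duration_seconds = 97.45401360544217

# Duration is the number of samples per channel divided by the sample rate.
computed_duration_seconds = num_samples / sample_rate

# The start timestamp expressed as a whole number of samples.
pts_in_samples = round(pts_seconds * sample_rate)

print(f"computed duration: {computed_duration_seconds:.6f} s")
print(f"reported duration: {reported_duration_seconds:.6f} s")
print(f"first sample starts {pts_in_samples} samples into the stream")
```

The computed duration matches the reported one, and the start timestamp works out to a whole number of samples at 44100 Hz.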