
# AudioEffector Usages

**Author**: [Moto Hira](moto@meta.com)

<div class="alert alert-danger"><h4>Warning</h4><p>Starting with version 2.8, we are refactoring TorchAudio to transition it
    into a maintenance phase. As a result:

    - The APIs described in this tutorial are deprecated in 2.8 and will be removed in 2.9.
    - The decoding and encoding capabilities of PyTorch for both audio and video
      are being consolidated into TorchCodec.

    Please see https://github.com/pytorch/audio/issues/3902 for more information.</p></div>

This tutorial shows how to use :py:class:`torchaudio.io.AudioEffector` to
apply various effects and codecs to a waveform tensor.


<div class="alert alert-info"><h4>Note</h4><p>This tutorial requires FFmpeg libraries.
   Please refer to `FFmpeg dependency <ffmpeg_dependency>` for
   details.</p></div>




## Overview

:py:class:`~torchaudio.io.AudioEffector` combines in-memory encoding,
decoding and filtering that are provided by
:py:class:`~torchaudio.io.StreamWriter` and
:py:class:`~torchaudio.io.StreamReader`.

The following figure illustrates the process.

<img src="https://download.pytorch.org/torchaudio/tutorial-assets/AudioEffector.png">




In [None]:
import torch
import torchaudio

print(torch.__version__)
print(torchaudio.__version__)

In [None]:
from torchaudio.io import AudioEffector, CodecConfig

import matplotlib.pyplot as plt
from IPython.display import Audio

In [None]:
for k, v in torchaudio.utils.ffmpeg_utils.get_versions().items():
    print(k, v)

## Usage

To use ``AudioEffector``, instantiate it with ``effect`` and
``format``, then pass the waveform to either the
:py:meth:`~torchaudio.io.AudioEffector.apply` or the
:py:meth:`~torchaudio.io.AudioEffector.stream` method.

.. code:: python

   effector = AudioEffector(effect=..., format=...,)

   # Apply at once
   applied = effector.apply(waveform, sample_rate)

The ``apply`` method applies the effect and codec to the entire waveform at
once. If the input waveform is long and memory consumption is a
concern, use the ``stream`` method to process it chunk by chunk.

.. code:: python

   # Apply chunk by chunk
   for applied_chunk in effector.stream(waveform, sample_rate, frames_per_chunk=...):
       ...
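
The chunking pattern itself can be sketched without FFmpeg. ``stream`` takes the chunk size in frames (the ``frames_per_chunk`` argument). The helper below, ``chunk_lengths``, is hypothetical and not part of torchaudio. It assumes that the effect does not change the number of frames and that a final partial chunk is returned as-is, which may not hold for every effect or codec.

```python
from typing import List


def chunk_lengths(num_frames: int, frames_per_chunk: int) -> List[int]:
    """Lengths of the chunks obtained by splitting ``num_frames`` into fixed-size pieces.

    Hypothetical helper for illustration only; it does not call torchaudio.
    """
    if frames_per_chunk <= 0:
        raise ValueError("frames_per_chunk must be positive")
    full, rest = divmod(num_frames, frames_per_chunk)
    # All chunks are full-sized except possibly the last one
    return [frames_per_chunk] * full + ([rest] if rest else [])


# Illustrative numbers: 160,000 frames processed 48,000 frames at a time
lengths = chunk_lengths(num_frames=160_000, frames_per_chunk=48_000)
print(len(lengths), "chunks:", lengths)
```

A smaller ``frames_per_chunk`` should lower the peak memory needed per iteration, at the cost of more iterations.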




## Example




In [None]:
src = torchaudio.utils.download_asset("tutorial-assets/Lab41-SRI-VOiCES-src-sp0307-ch127535-sg0042.wav")
waveform, sr = torchaudio.load(src, channels_first=False)

## Gallery




In [None]:
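def show(effect, *, stereo=False):
    """Apply ``effect`` to the sample waveform, plot spectrograms and return an audio player."""
    # Duplicate the channels along the channel dimension when a stereo input is requested
    wf = torch.cat([waveform] * 2, dim=1) if stereo else waveform
    figsize = (6.4, 2.1 if stereo else 1.2)

    # pad_end=False: do not pad the end with silence if the effect shortens the waveform
    effector = AudioEffector(effect=effect, pad_end=False)
    result = effector.apply(wf, int(sr))

    # result has shape (time, channel); plot one spectrogram per channel
    num_channels = result.size(1)
    f, ax = plt.subplots(num_channels, 1, squeeze=False, figsize=figsize, sharex=True)
    for i in range(num_channels):
        ax[i][0].specgram(result[:, i], Fs=sr)
    f.set_tight_layout(True)

    return Audio(result.numpy().T, rate=sr)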

## Original




In [None]:
show(effect=None)

## Effects




### tempo
https://ffmpeg.org/ffmpeg-filters.html#atempo



In [None]:
show("atempo=0.7")

In [None]:
show("atempo=1.8")
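
Since ``atempo`` changes the playback speed, the output length changes as well. The sketch below estimates the expected number of frames as the input length divided by the tempo factor. ``expected_num_frames`` is a hypothetical helper, and the sample rate and duration are illustrative assumptions; the actual filter output may differ by a small number of frames.

```python
def expected_num_frames(num_frames: int, tempo: float) -> int:
    """Approximate output length after a tempo change (hypothetical helper)."""
    if tempo <= 0:
        raise ValueError("tempo must be positive")
    # Faster tempo (> 1) shortens the audio, slower tempo (< 1) lengthens it
    return round(num_frames / tempo)


sample_rate = 16_000  # assumed sample rate, for illustration only
num_frames = 3 * sample_rate  # an assumed 3-second input

for tempo in (0.7, 1.8):
    n = expected_num_frames(num_frames, tempo)
    print(f"atempo={tempo}: about {n} frames ({n / sample_rate:.2f} s)")
```

This is also why a helper that plots the result unpadded is convenient here: a sped-up waveform is shorter than its input.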

### highpass
https://ffmpeg.org/ffmpeg-filters.html#highpass



In [None]:
show("highpass=frequency=1500")

### lowpass
https://ffmpeg.org/ffmpeg-filters.html#lowpass



In [None]:
show("lowpass=frequency=1000")

### allpass
https://ffmpeg.org/ffmpeg-filters.html#allpass



In [None]:
show("allpass")

### bandpass
https://ffmpeg.org/ffmpeg-filters.html#bandpass



In [None]:
show("bandpass=frequency=3000")

### bandreject
https://ffmpeg.org/ffmpeg-filters.html#bandreject



In [None]:
show("bandreject=frequency=3000")

### echo
https://ffmpeg.org/ffmpeg-filters.html#aecho



In [None]:
show("aecho=in_gain=0.8:out_gain=0.88:delays=6:decays=0.4")

In [None]:
show("aecho=in_gain=0.8:out_gain=0.88:delays=60:decays=0.4")

In [None]:
show("aecho=in_gain=0.8:out_gain=0.9:delays=1000:decays=0.3")
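
The effect strings in this gallery follow the FFmpeg filter syntax: options are written as ``key=value`` pairs separated by ``:``, and multiple filters can be chained with ``,``. The following is a minimal sketch of building such strings programmatically. ``ffmpeg_filter`` and ``chain`` are hypothetical helpers, not part of torchaudio or FFmpeg, and they do not escape special characters in option values.

```python
def ffmpeg_filter(name, **options):
    """Format a single filter as ``name=key=value:key=value`` (hypothetical helper)."""
    if not options:
        return name
    args = ":".join(f"{key}={value}" for key, value in options.items())
    return f"{name}={args}"


def chain(*filters):
    """Join filter descriptions into one chain, applied left to right."""
    return ",".join(filters)


echo = ffmpeg_filter("aecho", in_gain=0.8, out_gain=0.88, delays=60, decays=0.4)
# The cut-off value of 300 is an illustrative assumption
effect = chain(ffmpeg_filter("highpass", frequency=300), echo)

print(echo)
print(effect)
```

The resulting string can be passed as the ``effect`` argument in the same way as the hand-written strings above.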

### chorus
https://ffmpeg.org/ffmpeg-filters.html#chorus



In [None]:
show("chorus=0.5:0.9:50|60|40:0.4|0.32|0.3:0.25|0.4|0.3:2|2.3|1.3")

### fft filter
https://ffmpeg.org/ffmpeg-filters.html#afftfilt



In [None]:
# fmt: off
show(
    "afftfilt="
    "real='re * (1-clip(b * (b/nb), 0, 1))':"
    "imag='im * (1-clip(b * (b/nb), 0, 1))'"
)
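
The expression above multiplies the real and imaginary parts of each frequency bin by the same factor, ``1 - clip(b * (b / nb), 0, 1)``, where, according to the FFmpeg documentation, ``b`` is the current bin number and ``nb`` is the number of bins. The sketch below evaluates that factor in plain Python to show its shape. The value of ``nb`` is an illustrative assumption, and ``clip`` and ``bin_gain`` are hypothetical helpers that mirror the expression.

```python
def clip(x, lo, hi):
    """Clamp ``x`` to the range [lo, hi], like the ``clip`` function in FFmpeg expressions."""
    return max(lo, min(x, hi))


def bin_gain(b, nb):
    """Gain applied to bin ``b`` out of ``nb`` bins by the expression above."""
    return 1 - clip(b * (b / nb), 0, 1)


nb = 1024  # assumed number of bins, for illustration only
for b in (0, 8, 16, 24, 32, 64):
    print(f"bin {b:3d}: gain {bin_gain(b, nb):.3f}")
```

The gain falls to zero once ``b * b`` reaches ``nb``, so only the lowest bins pass, which makes this expression act as a steep low-pass filter.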

In [None]:
show(
    "afftfilt="
    "real='hypot(re,im) * sin(0)':"
    "imag='hypot(re,im) * cos(0)':"
    "win_size=512:"
    "overlap=0.75"
)

In [None]:
show(
    "afftfilt="
    "real='hypot(re,im) * cos(2 * 3.14 * (random(0) * 2-1))':"
    "imag='hypot(re,im) * sin(2 * 3.14 * (random(1) * 2-1))':"
    "win_size=128:"
    "overlap=0.8"
)
# fmt: on

### vibrato
https://ffmpeg.org/ffmpeg-filters.html#vibrato



In [None]:
show("vibrato=f=10:d=0.8")

### tremolo
https://ffmpeg.org/ffmpeg-filters.html#tremolo



In [None]:
show("tremolo=f=8:d=0.8")

### crystalizer
https://ffmpeg.org/ffmpeg-filters.html#crystalizer



In [None]:
show("crystalizer")

### flanger
https://ffmpeg.org/ffmpeg-filters.html#flanger



In [None]:
show("flanger")

### phaser
https://ffmpeg.org/ffmpeg-filters.html#aphaser



In [None]:
show("aphaser")

### pulsator
https://ffmpeg.org/ffmpeg-filters.html#apulsator



In [None]:
show("apulsator", stereo=True)

### haas
https://ffmpeg.org/ffmpeg-filters.html#haas



In [None]:
show("haas")

## Codecs




In [None]:
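def show_multi(configs):
    """Apply each ``AudioEffector`` configuration, plot spectrograms and return audio players."""
    results = []
    for config in configs:
        # Each config is a dict of keyword arguments for AudioEffector
        effector = AudioEffector(**config)
        results.append(effector.apply(waveform, int(sr)))

    num_configs = len(configs)
    figsize = (6.4, 0.3 + num_configs * 0.9)
    f, axes = plt.subplots(num_configs, 1, figsize=figsize, sharex=True)
    # Plot the first channel of each result, one row per configuration
    for result, ax in zip(results, axes):
        ax.specgram(result[:, 0], Fs=sr)
    f.set_tight_layout(True)

    return [Audio(r.numpy().T, rate=sr) for r in results]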

### ogg




In [None]:
results = show_multi(
    [
        {"format": "ogg"},
        {"format": "ogg", "encoder": "vorbis"},
        {"format": "ogg", "encoder": "opus"},
    ]
)

#### ogg - default encoder (flac)




In [None]:
results[0]

#### ogg - vorbis




In [None]:
results[1]

#### ogg - opus




In [None]:
results[2]

### mp3
https://trac.ffmpeg.org/wiki/Encode/MP3



In [None]:
results = show_multi(
    [
        {"format": "mp3"},
        {"format": "mp3", "codec_config": CodecConfig(compression_level=1)},
        {"format": "mp3", "codec_config": CodecConfig(compression_level=9)},
        {"format": "mp3", "codec_config": CodecConfig(bit_rate=192_000)},
        {"format": "mp3", "codec_config": CodecConfig(bit_rate=8_000)},
        {"format": "mp3", "codec_config": CodecConfig(qscale=9)},
        {"format": "mp3", "codec_config": CodecConfig(qscale=1)},
    ]
)
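
As a rough guide to what the ``bit_rate`` values above imply, the audio payload of a constant-bit-rate stream takes about ``bit_rate * duration / 8`` bytes. The sketch below is only an estimate under stated assumptions: ``estimated_bytes`` is a hypothetical helper, the 3-second duration is illustrative, container overhead is ignored, and the encoder may approximate the requested rate.

```python
def estimated_bytes(bit_rate: int, duration_sec: float) -> int:
    """Approximate payload size of a constant-bit-rate stream (hypothetical helper)."""
    # bit_rate is in bits per second; divide by 8 to convert bits to bytes
    return int(bit_rate * duration_sec / 8)


duration_sec = 3.0  # assumed clip length, for illustration only
sizes = {bit_rate: estimated_bytes(bit_rate, duration_sec) for bit_rate in (192_000, 8_000)}

for bit_rate, size in sizes.items():
    print(f"bit_rate={bit_rate}: about {size} bytes for {duration_sec} s")
```

The large gap between the two estimates is consistent with the audible difference between the ``bit_rate=192k`` and ``bit_rate=8k`` results.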

#### default



In [None]:
results[0]

#### compression_level=1



In [None]:
results[1]

#### compression_level=9



In [None]:
results[2]

#### bit_rate=192k



In [None]:
results[3]

#### bit_rate=8k



In [None]:
results[4]

#### qscale=9



In [None]:
results[5]

#### qscale=1



In [None]:
results[6]

Tag: :obj:`torchaudio.io`

