SlidingWindowCmn

class torchaudio.transforms.SlidingWindowCmn(cmn_window: int = 600, min_cmn_window: int = 100, center: bool = False, norm_vars: bool = False)

Apply sliding-window cepstral mean (and optionally variance) normalization per utterance.

This feature supports the following devices: CPU, CUDA

This API supports the following properties: Autograd, TorchScript

Parameters
  • cmn_window (int, optional) – Window size, in frames, for the running-average CMN computation. (Default: 600)

  • min_cmn_window (int, optional) – Minimum CMN window used at the start of decoding (adds latency only at the start). Applies only when center is False; ignored when center is True. (Default: 100)

  • center (bool, optional) – If True, use a window centered on the current frame (to the extent possible, modulo end effects). If False, the window extends to the left of the current frame. (Default: False)

  • norm_vars (bool, optional) – If True, normalize the variance to one. (Default: False)
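
To make the interaction of cmn_window, min_cmn_window and center concrete, here is a minimal pure-Python sketch of mean-only sliding-window normalization. The function name sliding_cmn_sketch is hypothetical, and the boundary handling is an assumption patterned on Kaldi-style windowing; it is not torchaudio's actual implementation and may differ from it in edge cases.

```python
def sliding_cmn_sketch(frames, cmn_window=600, min_cmn_window=100, center=False):
    """Mean-only sliding-window CMN over a list of per-frame feature lists.

    Hypothetical reference sketch; window boundaries are an assumption.
    """
    num_frames = len(frames)
    out = []
    for t in range(num_frames):
        if center:
            # Window centered on frame t.
            start = t - cmn_window // 2
            end = start + cmn_window
        else:
            # Window covering frame t and the cmn_window frames before it.
            start = t - cmn_window
            end = t + 1
        if start < 0:
            # Shift the window right so that it starts at frame 0.
            end -= start
            start = 0
        if not center:
            # Left-context mode: look ahead only as far as needed to
            # reach min_cmn_window frames at the start of the utterance.
            end = max(t + 1, min_cmn_window)
        if end > num_frames:
            # Shift the window left so that it ends at the last frame.
            start = max(0, start - (end - num_frames))
            end = num_frames
        window = frames[start:end]
        n = len(window)
        # Per-coefficient mean over the frames in the window.
        mean = [sum(col) / n for col in zip(*window)]
        out.append([x - m for x, m in zip(frames[t], mean)])
    return out


frames = [[1.0], [2.0], [3.0], [4.0]]
left = sliding_cmn_sketch(frames, cmn_window=2, min_cmn_window=1, center=False)
centered = sliding_cmn_sketch(frames, cmn_window=2, min_cmn_window=1, center=True)
print(left)      # [[0.0], [0.5], [1.0], [1.0]]
print(centered)  # [[-0.5], [0.5], [0.5], [0.5]]
```

In this sketch, raising min_cmn_window above 1 in left-context mode lets the first frames borrow a few future frames, which is where the start-up latency mentioned above comes from.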

Example
>>> import torchaudio
>>> waveform, sample_rate = torchaudio.load("test.wav", normalize=True)
>>> transform = torchaudio.transforms.SlidingWindowCmn(cmn_window=1000)
>>> cmn_waveform = transform(waveform)
forward(specgram: Tensor) → Tensor
Parameters

specgram (Tensor) – Spectrogram tensor of dimension (…, time, freq).

Returns

Normalized spectrogram tensor of dimension (…, time, freq).

Return type

Tensor
