InverseSpectrogram

class torchaudio.transforms.InverseSpectrogram(n_fft: int = 400, win_length: Optional[int] = None, hop_length: Optional[int] = None, pad: int = 0, window_fn: Callable[..., Tensor] = torch.hann_window, normalized: Union[bool, str] = False, wkwargs: Optional[dict] = None, center: bool = True, pad_mode: str = 'reflect', onesided: bool = True)

Create an inverse spectrogram to recover an audio signal from a spectrogram.

This feature supports the following devices: CPU, CUDA.
This API supports the following properties: Autograd, TorchScript.

Parameters
  • n_fft (int, optional) – Size of FFT, creates n_fft // 2 + 1 bins. (Default: 400)

  • win_length (int or None, optional) – Window size. (Default: n_fft)

  • hop_length (int or None, optional) – Length of hop between STFT windows. (Default: win_length // 2)

  • pad (int, optional) – Two-sided padding of the signal. (Default: 0)

  • window_fn (Callable[..., Tensor], optional) – A function to create a window tensor that is applied/multiplied to each frame/window. (Default: torch.hann_window)

  • normalized (bool or str, optional) – Whether the STFT output was normalized by magnitude. If a str is given, the choices are "window" and "frame_length", which select the normalization mode. True maps to "window". (Default: False)

  • wkwargs (dict or None, optional) – Arguments for window function. (Default: None)

  • center (bool, optional) – Whether the signal in the spectrogram was padded on both sides so that the \(t\)-th frame is centered at time \(t \times \text{hop\_length}\). (Default: True)

  • pad_mode (string, optional) – Controls the padding method used when center is True. (Default: "reflect")

  • onesided (bool, optional) – Controls whether the spectrogram was computed to return only half of the results to avoid redundancy. (Default: True)

Example
>>> import torch
>>> from torchaudio import transforms
>>> batch, freq, time = 2, 257, 100
>>> length = 25344
>>> spectrogram = torch.randn(batch, freq, time, dtype=torch.cdouble)
>>> transform = transforms.InverseSpectrogram(n_fft=512)
>>> waveform = transform(spectrogram, length)
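The numbers in the example above follow from the documented defaults. The sketch below works through that arithmetic in plain Python. The helper name `expected_shapes` is hypothetical and not part of torchaudio. The output-length rule it uses, hop_length * (frames - 1), is an assumption that holds for center=True and pad=0 when no explicit length is requested.

```python
def expected_shapes(n_fft, n_frames, win_length=None, hop_length=None, onesided=True):
    """Hypothetical helper: shapes implied by the documented defaults."""
    # Documented defaults: win_length = n_fft, hop_length = win_length // 2.
    win_length = n_fft if win_length is None else win_length
    hop_length = win_length // 2 if hop_length is None else hop_length
    # A one-sided spectrogram keeps n_fft // 2 + 1 frequency bins.
    freq_bins = n_fft // 2 + 1 if onesided else n_fft
    # Assumption: center=True and pad=0, so the centering padding is trimmed
    # and the natural output length is hop_length * (n_frames - 1).
    output_length = hop_length * (n_frames - 1)
    return freq_bins, output_length


freq_bins, output_length = expected_shapes(n_fft=512, n_frames=100)
print(freq_bins, output_length)  # 257 25344
```

This is why the example pairs freq = 257 and length = 25344 with n_fft=512 and 100 frames.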
Tutorials using InverseSpectrogram:
Audio Feature Augmentation
Speech Enhancement with MVDR Beamforming
forward(spectrogram: Tensor, length: Optional[int] = None) → Tensor
Parameters
  • spectrogram (Tensor) – Complex tensor of audio of dimension (…, freq, time).

  • length (int or None, optional) – The output length of the waveform.

Returns

Least-squares estimate of the original signal, of dimension (…, time).

Return type

Tensor
