HIFIGAN_VOCODER_V3_LJSPEECH¶
- torchaudio.prototype.pipelines.HIFIGAN_VOCODER_V3_LJSPEECH¶
[DEPRECATED]
Warning
This object is deprecated as of version 2.8 and will be removed in the 2.9 release. This deprecation is part of a larger refactoring effort to transition TorchAudio into a maintenance phase. Please see https://github.com/pytorch/audio/issues/3902 for more information.
- HiFiGAN Vocoder pipeline, trained on The LJ Speech Dataset
This pipeline can be used with an external component which generates mel spectrograms from text, for example, Tacotron2; see the examples in
HiFiGANVocoderBundle
. Although this works with the existing Tacotron2 bundles, for the best results one needs to retrain Tacotron2 using the same data preprocessing pipeline that was used for training HiFiGAN. In particular, the original HiFiGAN implementation uses a custom method of generating mel spectrograms from waveforms, different from torchaudio.transforms.MelSpectrogram
. We reimplemented this transform as HiFiGANVocoderBundle.get_mel_transform()
, making sure it is equivalent to the original HiFiGAN code.

The underlying vocoder is constructed by
torchaudio.prototype.models.hifigan_vocoder()
. The weights are converted from the ones published with the original paper [Kong et al., 2020] under MIT License. See links to pre-trained models on GitHub. Please refer to
HiFiGANVocoderBundle
for usage instructions.
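The flow described above — obtain the bundle's mel transform, compute a mel spectrogram, then run it through the vocoder — can be sketched as follows. This is a minimal sketch, assuming a torchaudio version that still ships the prototype pipeline (i.e. before removal in 2.9) and network access to download the pretrained weights; the `resynthesize` helper is illustrative, not part of the bundle API.

```python
import torch


def resynthesize(waveform: torch.Tensor) -> torch.Tensor:
    """Resynthesize a waveform through the HiFiGAN V3 LJ Speech vocoder.

    `waveform` is a float tensor of shape (batch, time), sampled at the
    bundle's sample rate. Returns the vocoder's estimated waveform.
    """
    # Deferred import: the prototype pipelines module is deprecated and
    # may not be present in newer torchaudio releases.
    import torchaudio.prototype.pipelines as pipelines

    bundle = pipelines.HIFIGAN_VOCODER_V3_LJSPEECH
    # HiFiGAN's own mel recipe, NOT torchaudio.transforms.MelSpectrogram
    mel_transform = bundle.get_mel_transform()
    vocoder = bundle.get_vocoder()  # downloads pretrained weights on first call

    mel = mel_transform(waveform)  # (batch, n_mels, frames)
    with torch.inference_mode():
        return vocoder(mel)
```

In practice the mel spectrogram would more often come from a Tacotron2 model retrained with this same preprocessing, as the note above explains; `get_mel_transform()` is what guarantees the spectrograms match what the vocoder saw during training.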