HiFiGANVocoderBundle¶
- class torchaudio.prototype.pipelines.HiFiGANVocoderBundle[source]¶
DEPRECATED
Warning
This class is deprecated from version 2.8. It will be removed in the 2.9 release. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. Please see https://github.com/pytorch/audio/issues/3902 for more information.
- Data class that bundles associated information to use pretrained
-
This class provides interfaces for instantiating the pretrained model along with the information necessary to retrieve pretrained weights and additional data to be used with the model.
Torchaudio library instantiates objects of this class, each of which represents a different pretrained model. Client code should access pretrained models via these instances.
This bundle can convert mel spectrorgam to waveforms and vice versa. A typical use case would be a flow like text -> mel spectrogram -> waveform, where one can use an external component, e.g. Tacotron2, to generate mel spectrogram from text. Please see below for the code example.
- Example: Transform synthetic mel spectrogram to audio.
>>> import torch >>> import torchaudio >>> # Since HiFiGAN bundle is in prototypes, it needs to be exported explicitly >>> from torchaudio.prototype.pipelines import HIFIGAN_VOCODER_V3_LJSPEECH as bundle >>> >>> # Load the HiFiGAN bundle >>> vocoder = bundle.get_vocoder() Downloading: "https://download.pytorch.org/torchaudio/models/hifigan_vocoder_v3_ljspeech.pth" 100%|████████████| 5.59M/5.59M [00:00<00:00, 18.7MB/s] >>> >>> # Generate synthetic mel spectrogram >>> specgram = torch.sin(0.5 * torch.arange(start=0, end=100)).expand(bundle._vocoder_params["in_channels"], 100) >>> >>> # Transform mel spectrogram into audio >>> waveform = vocoder(specgram) >>> torchaudio.save('sample.wav', waveform, bundle.sample_rate)
- Example: Usage together with Tacotron2, text to audio.
>>> import torch >>> import torchaudio >>> # Since HiFiGAN bundle is in prototypes, it needs to be exported explicitly >>> from torchaudio.prototype.pipelines import HIFIGAN_VOCODER_V3_LJSPEECH as bundle_hifigan >>> >>> # Load Tacotron2 bundle >>> bundle_tactron2 = torchaudio.pipelines.TACOTRON2_WAVERNN_CHAR_LJSPEECH >>> processor = bundle_tactron2.get_text_processor() >>> tacotron2 = bundle_tactron2.get_tacotron2() >>> >>> # Use Tacotron2 to convert text to mel spectrogram >>> text = "A quick brown fox jumped over a lazy dog" >>> input, lengths = processor(text) >>> specgram, lengths, _ = tacotron2.infer(input, lengths) >>> >>> # Load HiFiGAN bundle >>> vocoder = bundle_hifigan.get_vocoder() Downloading: "https://download.pytorch.org/torchaudio/models/hifigan_vocoder_v3_ljspeech.pth" 100%|████████████| 5.59M/5.59M [00:03<00:00, 1.55MB/s] >>> >>> # Use HiFiGAN to convert mel spectrogram to audio >>> waveform = vocoder(specgram).squeeze(0) >>> torchaudio.save('sample.wav', waveform, bundle_hifigan.sample_rate)
Properties¶
sample_rate¶
Methods¶
get_mel_transform¶
- HiFiGANVocoderBundle.get_mel_transform() Module [source]¶
DEPRECATED
Warning
This function has been deprecated. It will be removed from 2.9 release. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. Please see https://github.com/pytorch/audio/issues/3902 for more information.
Construct an object which transforms waveforms into mel spectrograms.
get_vocoder¶
- HiFiGANVocoderBundle.get_vocoder(*, dl_kwargs=None) HiFiGANVocoder [source]¶
DEPRECATED
Warning
This function has been deprecated. It will be removed from 2.9 release. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. Please see https://github.com/pytorch/audio/issues/3902 for more information.
Construct the HiFiGAN Generator model, which can be used a vocoder, and load the pretrained weight.
The weight file is downloaded from the internet and cached with
torch.hub.load_state_dict_from_url()
- Args:
dl_kwargs (dictionary of keyword arguments): Passed to
torch.hub.load_state_dict_from_url()
.- Returns:
Variation of
HiFiGANVocoder
.