Wav2Vec2Bundle
- class torchaudio.pipelines.Wav2Vec2Bundle
- Data class that bundles the information needed to use a pretrained Wav2Vec2Model.

  This class provides interfaces for instantiating the pretrained model, along with the information necessary to retrieve pretrained weights and additional data to be used with the model.

  The torchaudio library instantiates objects of this class, each of which represents a different pretrained model. Client code should access pretrained models via these instances.

  Please see below for the usage and the available values.

  Example - Feature Extraction
  >>> import torchaudio
  >>>
  >>> bundle = torchaudio.pipelines.HUBERT_BASE
  >>>
  >>> # Build the model and load pretrained weight.
  >>> model = bundle.get_model()
  Downloading: 100%|███████████████████████████████| 360M/360M [00:06<00:00, 60.6MB/s]
  >>>
  >>> # Resample audio to the expected sampling rate
  >>> waveform = torchaudio.functional.resample(waveform, sample_rate, bundle.sample_rate)
  >>>
  >>> # Extract acoustic features
  >>> features, _ = model.extract_features(waveform)
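The get_model() call in the example downloads the weights into the default torch.hub cache and prints a progress bar. As a minimal sketch, the download behaviour could be adjusted through dl_kwargs, which get_model() forwards to torch.hub.load_state_dict_from_url(). The helper names make_dl_kwargs and load_pretrained, and the path "~/models", are hypothetical and only serve the illustration; model_dir and progress are existing parameters of torch.hub.load_state_dict_from_url().

```python
from pathlib import Path
from typing import Any, Dict, Optional


def make_dl_kwargs(cache_dir: Optional[str] = None, progress: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for torch.hub.load_state_dict_from_url().

    This helper is hypothetical and not part of torchaudio; only the keys
    it emits (`model_dir`, `progress`) are real parameters of that function.
    """
    kwargs: Dict[str, Any] = {"progress": progress}
    if cache_dir is not None:
        # Expand "~" so the cache directory is an explicit path.
        kwargs["model_dir"] = str(Path(cache_dir).expanduser())
    return kwargs


def load_pretrained(bundle: Any, cache_dir: Optional[str] = None, progress: bool = True) -> Any:
    """Call bundle.get_model() with download options (hypothetical wrapper)."""
    return bundle.get_model(dl_kwargs=make_dl_kwargs(cache_dir, progress))


# Usage (needs torchaudio installed and network access on the first call):
#   import torchaudio
#   model = load_pretrained(
#       torchaudio.pipelines.HUBERT_BASE,
#       cache_dir="~/models",   # hypothetical directory
#       progress=False,         # suppress the download progress bar
#   )
```

Taking the bundle as an argument keeps the wrapper independent of any particular pretrained model, so the same call works for every bundle instance.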
 
Properties
sample_rate
Methods
get_model
- Wav2Vec2Bundle.get_model(*, dl_kwargs=None) -> Module
- Construct the model and load the pretrained weight.
  The weight file is downloaded from the internet and cached with torch.hub.load_state_dict_from_url().
- Parameters:
- dl_kwargs (dictionary of keyword arguments) – Passed to torch.hub.load_state_dict_from_url().
- Returns:
- Variation of Wav2Vec2Model.
  For the models listed below, an additional layer normalization is performed on the input. For all other models, a Wav2Vec2Model instance is returned.
- WAV2VEC2_LARGE_LV60K
- WAV2VEC2_ASR_LARGE_LV60K_10M 
- WAV2VEC2_ASR_LARGE_LV60K_100H 
- WAV2VEC2_ASR_LARGE_LV60K_960H 
- WAV2VEC2_XLSR53 
- WAV2VEC2_XLSR_300M 
- WAV2VEC2_XLSR_1B 
- WAV2VEC2_XLSR_2B 
- HUBERT_LARGE 
- HUBERT_XLARGE 
- HUBERT_ASR_LARGE 
- HUBERT_ASR_XLARGE 
- WAVLM_LARGE
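
The page does not spell out how the input layer normalization for the models listed above is computed. As a rough illustration only, the sketch below assumes it rescales the whole waveform to zero mean and unit variance, without learned scale or bias, following the formula used by torch.nn.functional.layer_norm (biased variance, eps added inside the square root). The function name normalize_waveform is hypothetical, and the code uses plain Python lists instead of tensors so that it runs without torch.

```python
import math
from typing import List


def normalize_waveform(samples: List[float], eps: float = 1e-5) -> List[float]:
    """Rescale samples to zero mean and (approximately) unit variance.

    Assumption: this mirrors a layer normalization over the full waveform,
    y = (x - mean) / sqrt(var + eps), with biased variance and no affine terms.
    """
    if not samples:
        return []
    n = len(samples)
    mean = sum(samples) / n
    # Biased (population) variance, as layer normalization uses.
    var = sum((x - mean) ** 2 for x in samples) / n
    scale = 1.0 / math.sqrt(var + eps)
    return [(x - mean) * scale for x in samples]


# Example: a short, quiet signal with a DC offset.
normalized = normalize_waveform([0.0, 0.5, 1.0, 0.5])
```

Since the returned model for the listed bundles performs this step on its input, client code presumably does not need to normalize the waveform itself before calling the model.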