torchaudio.functional.melscale_fbanks
- torchaudio.functional.melscale_fbanks(n_freqs: int, f_min: float, f_max: float, n_mels: int, sample_rate: int, norm: Optional[str] = None, mel_scale: str = 'htk') -> Tensor
- Create a frequency bin conversion matrix.
- Note
- For numerical compatibility with librosa, not all coefficients in the resulting filter bank have a magnitude of 1.
- Parameters:
- n_freqs (int) – Number of frequencies to highlight/apply 
- f_min (float) – Minimum frequency (Hz) 
- f_max (float) – Maximum frequency (Hz) 
- n_mels (int) – Number of mel filterbanks 
- sample_rate (int) – Sample rate of the audio waveform 
- norm (str or None, optional) – If "slaney", divide the triangular mel weights by the width of the mel band (area normalization). (Default: None)
- mel_scale (str, optional) – Scale to use: htk or slaney. (Default: htk)
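
As a rough illustration of what the htk option of mel_scale refers to, the sketch below uses the widely cited HTK mel formula, m = 2595 * log10(1 + f / 700), and its inverse. The helper names (hz_to_mel_htk, mel_to_hz_htk, mel_band_edges_hz) are hypothetical and not part of torchaudio, and the layout of n_mels + 2 equally spaced mel points is an assumption about how triangular filterbanks are typically built, not a statement about the library's internals.

```python
import math


def hz_to_mel_htk(freq_hz: float) -> float:
    """Convert Hz to mels with the commonly cited HTK formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz_htk(mel: float) -> float:
    """Inverse of hz_to_mel_htk."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges_hz(f_min: float, f_max: float, n_mels: int) -> list:
    """Return n_mels + 2 band edges (Hz), equally spaced on the mel axis.

    Assumption: filter i spans edges[i] .. edges[i + 2] and peaks at
    edges[i + 1], the usual triangular filterbank layout.
    """
    m_min = hz_to_mel_htk(f_min)
    m_max = hz_to_mel_htk(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz_htk(m_min + i * step) for i in range(n_mels + 2)]


# Hypothetical values: 4 filters between 0 Hz and 8 kHz.
edges = mel_band_edges_hz(0.0, 8000.0, 4)
```

Because the points are equally spaced in mels rather than in Hz, the bands get wider toward higher frequencies.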
 
- Returns:
- Triangular filter banks (fb matrix) of size (n_freqs, n_mels), that is, the number of frequencies to highlight/apply to by the number of filterbanks. Each column is a filterbank, so given a matrix A of size (…, n_freqs), the applied result is A @ melscale_fbanks(A.size(-1), ...).
- Return type:
- Tensor 
 
