LIBRISPEECH
- class torchaudio.datasets.LIBRISPEECH(root: Union[str, Path], url: str = 'train-clean-100', folder_in_archive: str = 'LibriSpeech', download: bool = False)
LibriSpeech [Panayotov et al., 2015] dataset.
- Parameters:
root (str or Path) – Path to the directory where the dataset is found or downloaded.
url (str, optional) – The URL to download the dataset from, or the type of the dataset to download. Allowed type values are "dev-clean", "dev-other", "test-clean", "test-other", "train-clean-100", "train-clean-360" and "train-other-500". (default: "train-clean-100")
folder_in_archive (str, optional) – The top-level directory of the dataset. (default: "LibriSpeech")
download (bool, optional) – Whether to download the dataset if it is not found at root path. (default: False)
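As a rough illustration of how the constructor parameters fit together, the sketch below wraps the constructor in a small helper. The helper name open_librispeech and the ALLOWED_SUBSETS tuple are hypothetical and not part of torchaudio; only the LIBRISPEECH constructor and its root, url and download arguments come from the signature above. Note that this helper accepts subset names only, whereas the url parameter itself may also be a full URL.

```python
from pathlib import Path

# Subset names copied from the `url` parameter description above.
ALLOWED_SUBSETS = (
    "dev-clean",
    "dev-other",
    "test-clean",
    "test-other",
    "train-clean-100",
    "train-clean-360",
    "train-other-500",
)


def open_librispeech(root, subset="train-clean-100", download=False):
    """Hypothetical helper: check the subset name, then build the dataset."""
    if subset not in ALLOWED_SUBSETS:
        raise ValueError(
            f"unknown subset {subset!r}; expected one of {ALLOWED_SUBSETS}"
        )
    # Imported lazily so the name check above works without torchaudio installed.
    import torchaudio

    return torchaudio.datasets.LIBRISPEECH(
        root=Path(root), url=subset, download=download
    )


# Example call (needs torchaudio, and network access when download=True):
# dataset = open_librispeech("./data", subset="dev-clean", download=True)
```

Checking the name before constructing the dataset surfaces a misspelled subset as an immediate ValueError in the caller's code.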
__getitem__
- LIBRISPEECH.__getitem__(n: int) -> Tuple[Tensor, int, str, int, int, int]
Load the n-th sample from the dataset.
- Parameters:
n (int) – The index of the sample to be loaded
- Returns:
Tuple of the following items:
- Tensor:
Waveform
- int:
Sample rate
- str:
Transcript
- int:
Speaker ID
- int:
Chapter ID
- int:
Utterance ID
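The six-item tuple is easy to mis-index by position, so one option is to give the fields names before using them. The sketch below is a minimal example under two assumptions: the names LibriSpeechSample and describe_sample are hypothetical, and the waveform is assumed to expose a shape attribute whose last dimension is the number of audio frames.

```python
from typing import Any, NamedTuple


class LibriSpeechSample(NamedTuple):
    """Hypothetical named view over the tuple returned by __getitem__."""

    waveform: Any  # a Tensor in practice
    sample_rate: int
    transcript: str
    speaker_id: int
    chapter_id: int
    utterance_id: int


def describe_sample(sample):
    """Summarise one sample without modifying it."""
    s = LibriSpeechSample(*sample)
    # Assumption: the last dimension of the waveform counts audio frames.
    num_frames = s.waveform.shape[-1]
    return {
        "speaker_id": s.speaker_id,
        "chapter_id": s.chapter_id,
        "utterance_id": s.utterance_id,
        "seconds": num_frames / s.sample_rate,
        "num_words": len(s.transcript.split()),
    }


# Example call (needs a constructed dataset):
# info = describe_sample(dataset[0])
```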
get_metadata
- LIBRISPEECH.get_metadata(n: int) -> Tuple[str, int, str, int, int, int]
Get metadata for the n-th sample from the dataset. Returns filepath instead of waveform, but otherwise returns the same fields as __getitem__().
- Parameters:
n (int) – The index of the sample to be loaded
- Returns:
Tuple of the following items:
- str:
Path to audio
- int:
Sample rate
- str:
Transcript
- int:
Speaker ID
- int:
Chapter ID
- int:
Utterance ID
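Because get_metadata returns a file path in place of the waveform, it suits tasks that only need the labels, such as building a per-speaker index. The sketch below is one possible way to do that; the function name index_by_speaker is hypothetical, and it assumes the dataset supports len(), which this section does not document.

```python
from collections import defaultdict


def index_by_speaker(dataset):
    """Group audio paths by speaker ID using metadata only."""
    by_speaker = defaultdict(list)
    # Assumption: the dataset implements __len__.
    for n in range(len(dataset)):
        (
            path,
            sample_rate,
            transcript,
            speaker_id,
            chapter_id,
            utterance_id,
        ) = dataset.get_metadata(n)
        by_speaker[speaker_id].append(path)
    return dict(by_speaker)


# Example call (needs a constructed dataset):
# paths_for = index_by_speaker(dataset)
```

The function works with any object that offers the same get_metadata and len() behaviour, so it can be tried on a small stand-in before running over a full subset.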