
torchaudio.functional.forced_align

torchaudio.functional.forced_align(log_probs: Tensor, targets: Tensor, input_lengths: Optional[Tensor] = None, target_lengths: Optional[Tensor] = None, blank: int = 0) → Tuple[Tensor, Tensor]

DEPRECATED

Warning

This function has been deprecated. It will be removed in the 2.9 release. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. Please see https://github.com/pytorch/audio/issues/3902 for more information.

Align a CTC label sequence to an emission.

This feature supports the following devices: CPU, CUDA

This API supports the following properties: TorchScript

Parameters
  • log_probs (Tensor) – Log probabilities of the CTC emission output. Tensor of shape (B, T, C), where B is the batch size, T is the input length, and C is the number of characters in the alphabet, including blank.

  • targets (Tensor) – Target sequence. Tensor of shape (B, L), where L is the target length.

  • input_lengths (Tensor or None, optional) – Lengths of the inputs (each value must be <= T). 1-D Tensor of shape (B,).

  • target_lengths (Tensor or None, optional) – Lengths of the targets. 1-D Tensor of shape (B,).

  • blank (int, optional) – The index of the blank symbol in the CTC emission. (Default: 0)

Returns

Tensor: Label for each time step in the alignment path computed using forced alignment.

Tensor: Log probability scores of the labels for each time step.

Return type

Tuple[Tensor, Tensor]

Note

The sequence length of log_probs must satisfy:

\[L_{\text{log\_probs}} \ge L_{\text{label}} + N_{\text{repeat}}\]

where \(N_{\text{repeat}}\) is the number of consecutively repeated tokens. For example, in the string “aabbc”, the number of repeats is 2.
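The length requirement can be checked before calling the function. The following is a minimal sketch in plain Python; the helper names count_repeats and min_emission_length are hypothetical and not part of torchaudio, and the sketch assumes a repeat is counted each time a token equals the token immediately before it.

```python
def count_repeats(tokens):
    """Count positions where a token equals the token immediately before it."""
    return sum(1 for prev, cur in zip(tokens, tokens[1:]) if prev == cur)


def min_emission_length(tokens):
    """Smallest number of frames that satisfies the inequality in the note above."""
    return len(tokens) + count_repeats(tokens)


tokens = list("aabbc")
print(count_repeats(tokens))        # 2 ("aa" and "bb")
print(min_emission_length(tokens))  # 7 (5 labels + 2 repeats)
```

Each repeat costs one extra frame because a CTC path has to place a blank between two identical neighbouring labels to keep them distinct.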

Note

The current version only supports batch_size==1.

Tutorials using forced_align:
  • CTC forced alignment API tutorial
  • Forced alignment for multilingual data
  • Forced Alignment with Wav2Vec2
