int_scaled_matmul
- torchao.quantization.int_scaled_matmul(a: Tensor, b: Tensor, scales1: Tensor) → Tensor
Performs scaled integer matrix multiplication.
- Parameters
a (torch.Tensor) – The first matrix to multiply.
b (torch.Tensor) – The second matrix to multiply.
scales1 (torch.Tensor) – The scaling factors for the rows of the result.
- Returns
The result of the scaled matrix multiplication.
- Return type
torch.Tensor
- Raises
AssertionError – If the dimensions of the input tensors do not match the expected shapes.
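The computation described above can be illustrated with a minimal pure-Python sketch. It is a reference for the semantics only, not torchao's implementation: the helper name `int_scaled_matmul_reference`, the use of nested lists instead of tensors, and the specific shape checks are all assumptions made for illustration. It assumes `a` is M×K, `b` is K×N, and `scales1` holds one scaling factor per row of the M×N result.

```python
def int_scaled_matmul_reference(a, b, scales1):
    """Hypothetical reference: integer matmul, then scale each result row."""
    m, k = len(a), len(a[0])
    # Assumed shape checks for this sketch (not torchao's actual assertions).
    assert len(b) == k, "inner dimensions of a and b must match"
    assert len(scales1) == m, "need one scale per row of the result"
    n = len(b[0])

    result = []
    for i in range(m):
        row = []
        for j in range(n):
            # Accumulate the dot product in integer arithmetic.
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            # Apply the scaling factor for row i of the result.
            row.append(acc * scales1[i])
        result.append(row)
    return result


a = [[1, -2], [3, 4]]    # integer matrix, shape (2, 2)
b = [[5, 6], [7, 8]]     # integer matrix, shape (2, 2)
scales1 = [0.5, 2.0]     # one scale per result row

out = int_scaled_matmul_reference(a, b, scales1)
print(out)  # [[-4.5, -5.0], [86.0, 100.0]]
```

Accumulating in integers and scaling once per row at the end avoids rounding during the matrix product; the real function operates on tensors, so consult its source for the exact dtype and shape requirements.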