Float8LinearRecipeName
- class torchao.float8.Float8LinearRecipeName(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Pre-made recipes for common float8 training configurations.
Values:
- TENSORWISE: The default. Dynamic per-tensor scaling with the cuBLAS tensorwise kernel. Fastest option.
- ROWWISE: Dynamic rowwise scaling with the CUTLASS rowwise kernel. Uses e4m3 for activations, weights, and gradients. Scales are rounded down (floor) to the nearest power of two for increased accuracy.
- ROWWISE_WITH_GW_HP: A modification of rowwise scaling that increases accuracy for grad_weight by keeping the grad_weight computation in high precision. Most accurate option.
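
The difference between per-tensor and rowwise scaling, and the power-of-two floor rounding mentioned for ROWWISE, can be illustrated with a small pure-Python sketch. This is not torchao's implementation: the helper names, the `eps` clamp, and the convention `scale = fp8_max / amax` are assumptions made for illustration, and the sketch leaves rounding off for the tensorwise case only because the description above mentions it for ROWWISE alone. The constant 448 is the largest finite value of the e4m3 (`float8_e4m3fn`) format.

```python
import math

E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3fn


def amax_to_scale(amax, round_to_power_of_two=False, eps=1e-12):
    """Hypothetical helper: scale such that amax * scale fits within e4m3 range."""
    scale = E4M3_MAX / max(amax, eps)  # eps clamp is an assumption to avoid division by zero
    if round_to_power_of_two:
        # Round the scale down to the nearest power of two.
        scale = 2.0 ** math.floor(math.log2(scale))
    return scale


def tensorwise_scale(matrix):
    """One scale for the whole tensor, derived from its global max-abs value."""
    amax = max(abs(x) for row in matrix for x in row)
    return amax_to_scale(amax)


def rowwise_scales(matrix):
    """One scale per row, each rounded down to a power of two."""
    return [
        amax_to_scale(max(abs(x) for x in row), round_to_power_of_two=True)
        for row in matrix
    ]


matrix = [
    [0.5, -2.0, 1.0],    # small-magnitude row
    [100.0, -7.0, 3.0],  # large-magnitude row
]

print(tensorwise_scale(matrix))  # 4.48
print(rowwise_scales(matrix))    # [128.0, 4.0]
```

With a single tensorwise scale, the small-magnitude first row is scaled by the same factor as the row containing 100.0, so it uses only a small part of the e4m3 range. The rowwise scales give each row its own factor, which is one way to see why rowwise scaling tends to be more accurate at the cost of a more involved kernel.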