Float8LinearRecipeName

class torchao.float8.Float8LinearRecipeName(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Pre-made recipes for common float8 training configurations.

Values:

TENSORWISE: Default, dynamic per-tensor scaling with the cuBLAS tensorwise kernel. Fastest option.
ROWWISE: Dynamic rowwise scaling with the CUTLASS rowwise kernel. Uses e4m3 for activations, weights, and gradients. Scales are rounded down (floor) to the nearest power of two for increased accuracy.
ROWWISE_WITH_GW_HP: A variant of rowwise scaling that improves the accuracy of grad_weight by keeping the grad_weight computation in high precision. Most accurate option.
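
The difference between per-tensor and rowwise scaling, and the power-of-two rounding mentioned for ROWWISE, can be illustrated with a small pure-Python sketch. This is not torchao's implementation: the helper names (floor_to_power_of_two, tensorwise_scale, rowwise_scales) are hypothetical, and the convention that a scale equals the e4m3 maximum divided by the absolute maximum of the data is an assumption made for illustration.

```python
import math

# Largest finite value representable in float8 e4m3 (the "fn" variant).
E4M3_MAX = 448.0

# Assumed guard against division by zero for all-zero inputs.
EPS = 1e-12


def floor_to_power_of_two(scale: float) -> float:
    """Round a positive scale down to the nearest power of two."""
    return 2.0 ** math.floor(math.log2(scale))


def tensorwise_scale(matrix):
    """One scale for the whole tensor, from its global absolute maximum."""
    amax = max(abs(x) for row in matrix for x in row)
    return E4M3_MAX / max(amax, EPS)


def rowwise_scales(matrix, round_to_power_of_two=True):
    """One scale per row, optionally floored to a power of two."""
    scales = []
    for row in matrix:
        amax = max(abs(x) for x in row)
        scale = E4M3_MAX / max(amax, EPS)
        if round_to_power_of_two:
            scale = floor_to_power_of_two(scale)
        scales.append(scale)
    return scales


# Two rows with very different magnitudes.
weights = [[0.5, -2.0, 1.0], [0.01, 0.02, -0.07]]

print(tensorwise_scale(weights))  # 224.0
print(rowwise_scales(weights))    # [128.0, 4096.0]
```

In this sketch the single tensorwise scale is dictated by the largest entry (2.0), so the small second row uses only a fraction of the e4m3 range, while the rowwise scales let each row use the range independently. Multiplying by a power of two changes only the floating-point exponent, so, barring overflow or underflow, the scaling step itself adds no rounding error.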
