ElasticTransform
- class torchvision.transforms.v2.ElasticTransform(alpha: Union[float, Sequence[float]] = 50.0, sigma: Union[float, Sequence[float]] = 5.0, interpolation: Union[InterpolationMode, int] = InterpolationMode.BILINEAR, fill: Union[int, float, Sequence[int], Sequence[float], None, dict[Union[type, str], Union[int, float, collections.abc.Sequence[int], collections.abc.Sequence[float], NoneType]]] = 0)
Transform the input with elastic transformations.
If the input is a torch.Tensor or a TVTensor (e.g. Image, Video, BoundingBoxes etc.), it can have an arbitrary number of leading batch dimensions. For example, the image can have [..., C, H, W] shape. A bounding box can have [..., 4] shape.

Given alpha and sigma, it will generate displacement vectors for all pixels based on random offsets. Alpha controls the strength and sigma controls the smoothness of the displacements. The displacements are added to an identity grid and the resulting grid is used to transform the input.
Note
The transformation of bounding boxes is approximate (not exact). We construct an approximation of the inverse grid as inverse_grid = identity - displacement. This is not an exact inverse of the grid used to transform images, i.e. grid = identity + displacement. Our assumption is that displacement * displacement is small and can be ignored. Large displacements would lead to large errors in the approximation.
- Applications:
Randomly transforms the morphology of objects in images and produces a see-through-water-like effect.
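The approximation described in the note above can be checked numerically in one dimension. The sketch below is a toy model, not torchvision code: the sinusoidal displacement and the function names are chosen purely for illustration. It maps a point forward with identity + displacement, maps it back with identity - displacement, and measures how far the result lands from the starting point.

```python
import math


def displacement(x, amplitude):
    # Smooth 1-D displacement field, chosen only for illustration.
    return amplitude * math.sin(x)


def round_trip_error(x, amplitude):
    # Forward map: grid = identity + displacement.
    forward = x + displacement(x, amplitude)
    # Approximate inverse: inverse_grid = identity - displacement.
    back = forward - displacement(forward, amplitude)
    return abs(back - x)


small = round_trip_error(1.0, amplitude=0.01)
large = round_trip_error(1.0, amplitude=0.1)
print(f"amplitude 0.01: error {small:.2e}")
print(f"amplitude 0.10: error {large:.2e}")
```

In this toy setting a tenfold increase in displacement amplitude increases the round-trip error by roughly a factor of one hundred, which is consistent with the error being second order in the displacement.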
- Parameters:
alpha (float or sequence of floats, optional) – Magnitude of displacements. Default is 50.0.
sigma (float or sequence of floats, optional) – Smoothness of displacements. Default is 5.0.
interpolation (InterpolationMode, optional) – Desired interpolation enum defined by torchvision.transforms.InterpolationMode. Default is InterpolationMode.BILINEAR. If input is Tensor, only InterpolationMode.NEAREST and InterpolationMode.BILINEAR are supported. The corresponding Pillow integer constants, e.g. PIL.Image.BILINEAR, are accepted as well.
fill (number or tuple or dict, optional) – Pixel fill value used when the padding_mode is constant. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. Fill value can also be a dictionary mapping data type to the fill value, e.g. fill={tv_tensors.Image: 127, tv_tensors.Mask: 0} where Image will be filled with 127 and Mask will be filled with 0.