AUGMENTATIONS

This module contains various image augmentation functions and classes.

Classes

class utils.augmentations.CustomRandomCrop(size, padding=0, resize=False, min_resize_index=None)

Bases: object

Custom augmentation class for performing random crop on a pair of stackable images and other associated tensors (e.g. attention maps).

Parameters:
  • size (int or tuple) – Desired output size for the crop. If size is an int, a square crop of size (size, size) is returned.

  • padding (int or tuple, optional) – Optional padding on each border of the image. Default is 0.

  • resize (bool, optional) – Whether to resize the other_img maps. Default is False.

  • min_resize_index (int, optional) – The minimum index of other_img maps to resize. Default is None.

Returns:

A tuple containing the cropped image and a list of cropped other_img maps.

Return type:

tuple

class utils.augmentations.CustomRandomHorizontalFlip(p=0.5)

Bases: object

Custom augmentation class for performing random horizontal flips on a pair of stackable images and other associated tensors (e.g. attention maps).

Parameters:

p (float) – Probability of applying the horizontal flip. Defaults to 0.5.
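
The key property of a paired flip is that the image and its associated maps share one random decision, so they stay spatially aligned. The following is a minimal sketch of that idea, not the library's implementation: the class name PairedHorizontalFlip is hypothetical, nested lists stand in for tensors, and other_img is assumed to be a list of maps.

```python
import random


class PairedHorizontalFlip:
    """Hypothetical stand-in: flips an image and its companion maps together."""

    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, img, other_img):
        # Draw once so the image and every companion map share the decision.
        if random.random() < self.p:
            img = [row[::-1] for row in img]
            other_img = [[row[::-1] for row in m] for m in other_img]
        return img, other_img


# Nested lists stand in for a (height, width) image and one attention map.
img = [[1, 2, 3], [4, 5, 6]]
attention = [[[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]]]

# p=1.0 makes the flip deterministic for demonstration purposes.
flipped_img, flipped_maps = PairedHorizontalFlip(p=1.0)(img, attention)
```

With p=1.0 both inputs are always mirrored along the width; with p=0.0 both are returned unchanged.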

class utils.augmentations.DoubleCompose(transforms)

Bases: object

Composes multiple transformations to be applied on a pair of stackable images and other associated tensors (e.g. attention maps).

Parameters:

transforms (list) – List of transformations to be applied. The transformations should accept two inputs (img, other_img) and return two outputs (img, other_img). For example, CustomRandomCrop and CustomRandomHorizontalFlip.

__iter__()

Returns an iterator for the transformations.

__getitem__(i)

Returns the transformation at index i.

__len__()

Returns the number of transformations.

__call__(img, other_img)

Applies the composed transformations on the input images.
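
The container protocol described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library's source: the class name PairCompose and the two toy transforms are hypothetical, and plain numbers stand in for tensors.

```python
class PairCompose:
    """Hypothetical sketch of a compose container for paired transforms."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __iter__(self):
        return iter(self.transforms)

    def __getitem__(self, i):
        return self.transforms[i]

    def __len__(self):
        return len(self.transforms)

    def __call__(self, img, other_img):
        # Each transform consumes and returns the (img, other_img) pair,
        # so the output of one step feeds the next.
        for transform in self.transforms:
            img, other_img = transform(img, other_img)
        return img, other_img


def double_both(img, other_img):
    # Toy paired transform: scales both inputs.
    return img * 2, other_img * 2


def increment_first(img, other_img):
    # Toy paired transform: changes only the first input.
    return img + 1, other_img


pipeline = PairCompose([double_both, increment_first])
result = pipeline(3, 10)  # numbers stand in for tensors
```

Transforms run in list order, so here the first input is doubled and then incremented, while the second is only doubled.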

class utils.augmentations.DoubleTransform(tf)

Bases: object

This class applies a given transformation to the first image and leaves the second input unchanged.

Parameters:

tf – The transformation to be applied.
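
A wrapper of this kind lets a single-input transform take part in a pipeline of paired transforms. The sketch below illustrates the behaviour described above; the class name FirstOnly is hypothetical and the (img, other_img) call signature is an assumption carried over from DoubleCompose.

```python
class FirstOnly:
    """Hypothetical sketch: apply a transform to the first input only."""

    def __init__(self, tf):
        self.tf = tf

    def __call__(self, img, other_img):
        # The second input passes through untouched.
        return self.tf(img), other_img


# A toy single-input transform that rescales pixel values to [0, 1].
wrapped = FirstOnly(lambda img: [value / 255 for value in img])
out_img, out_other = wrapped([0, 255], "attention-map")
```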

class utils.augmentations.RepeatedTransform(transform_list, autosqueeze=False)

Bases: object

This class applies a series of transforms to the same input.

Parameters:

transform_list (list) – The list of transformations to be applied.

class utils.augmentations.soft_aug(mean, std)

Bases: object

class utils.augmentations.strong_aug(size, mean, std)

Bases: object

A class representing a strong data augmentation pipeline (used in X-DER).

Parameters:
  • size (int) – The size of the output image.

  • mean (float) – The mean value for normalization.

  • std (float) – The standard deviation value for normalization.

Functions

utils.augmentations.apply_transform(x, transform, autosqueeze=False)

Applies a transform to a batch of images.

If the transform is a KorniaAugNoGrad instance, it is applied directly to the whole batch. Otherwise, it is applied to each image in the batch individually.

Parameters:
  • x (Tensor) – A batch of images.

  • transform – The transform to apply.

  • autosqueeze (bool, optional) – Whether to automatically squeeze the output tensor. Default is False.

Returns:

The transformed batch of images.

Return type:

Tensor

utils.augmentations.cutmix_data(x, y, alpha=1.0, cutmix_prob=0.5, force=False)

Generate a cutmix sample given a batch of images and labels.

Parameters:
  • x (torch.Tensor) – The batch of images.

  • y (torch.Tensor) – The batch of labels.

  • alpha (float, optional) – The alpha value used to calculate the size of the bounding box. Default is 1.0.

  • cutmix_prob (float, optional) – The probability of applying cutmix. Default is 0.5.

  • force (bool, optional) – Whether to force the application of cutmix. Default is False.

Returns:

A tuple (x, y_a, y_b, lam):
  • x (torch.Tensor) – The mixed batch of images.

  • y_a (torch.Tensor) – The batch of labels for the first image.

  • y_b (torch.Tensor) – The batch of labels for the second image.

  • lam (float) – The lambda value used to calculate the size of the bounding box.

Return type:

tuple

Raises:

AssertionError – If the input tensor x does not have 4 dimensions.
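
In the usual CutMix recipe, the two label batches and the lambda value are combined into a single loss by weighting each label's loss by the share of the image it covers. The sketch below shows that weighting for one sample. It is an illustration of the common recipe rather than a statement of how this library consumes the outputs, and the helper names cross_entropy and mixed_loss are hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of one label under predicted probabilities."""
    return -math.log(probs[label])


def mixed_loss(probs, y_a, y_b, lam):
    # Weight each label's loss by its mixing coefficient.
    return lam * cross_entropy(probs, y_a) + (1.0 - lam) * cross_entropy(probs, y_b)


# Predicted class probabilities for a single mixed sample (made-up values).
probs = [0.7, 0.2, 0.1]
loss = mixed_loss(probs, y_a=0, y_b=1, lam=0.75)
```

When lam is 1.0 the expression reduces to the ordinary loss on y_a, which matches the case where no region is replaced.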

utils.augmentations.normalize(x, mean, std)

Normalize the input tensor x of images using the provided mean and standard deviation.

Parameters:
  • x (torch.Tensor) – Input tensor to be normalized.

  • mean (list or tuple) – Mean values for each channel.

  • std (list or tuple) – Standard deviation values for each channel.

Returns:

Normalized tensor.

Return type:

torch.Tensor

Raises:

AssertionError – If the input tensor x does not have 4 dimensions.
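
Per-channel normalization computes (x - mean[c]) / std[c] for every value in channel c. The sketch below spells this out with nested lists standing in for a 4-dimensional tensor. The function name normalize_nested is hypothetical and the [batch][channel][row][col] layout is an assumption based on the shapes documented for the other functions on this page.

```python
def normalize_nested(x, mean, std):
    """Normalize a nested list indexed as [batch][channel][row][col]."""
    # Every image must have one mean and one std per channel.
    assert all(len(img) == len(mean) == len(std) for img in x)
    return [
        [
            [[(value - mean[c]) / std[c] for value in row] for row in channel]
            for c, channel in enumerate(img)
        ]
        for img in x
    ]


# One image, two channels, one row, two columns: shape (1, 2, 1, 2).
batch = [[[[0.5, 1.0]], [[0.25, 0.75]]]]
out = normalize_nested(batch, mean=[0.5, 0.25], std=[0.5, 0.25])
```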

utils.augmentations.rand_bbox(size, lam)

Generate a random bounding box given the size of the image and a lambda value.

Parameters:
  • size (tuple) – The size of the image in the format (batch_size, channels, height, width).

  • lam (float) – The lambda value used to calculate the size of the bounding box.

Returns:

A tuple (bbx1, bby1, bbx2, bby2):
  • bbx1 (int) – The x-coordinate of the top-left corner of the bounding box.

  • bby1 (int) – The y-coordinate of the top-left corner of the bounding box.

  • bbx2 (int) – The x-coordinate of the bottom-right corner of the bounding box.

  • bby2 (int) – The y-coordinate of the bottom-right corner of the bounding box.

Return type:

tuple
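
For orientation, the common CutMix formulation picks a box whose side lengths scale with sqrt(1 - lam), centres it at a uniformly random pixel, and clips it to the image. The sketch below follows that formulation. It is an assumption that this function behaves the same way, the name sketch_rand_bbox is hypothetical, and x is taken to run along the width.

```python
import math
import random


def sketch_rand_bbox(size, lam, rng=random):
    """Sample a clipped box covering roughly (1 - lam) of the image area."""
    _, _, height, width = size

    # Side lengths scale with sqrt(1 - lam), so the area scales with 1 - lam.
    cut_ratio = math.sqrt(1.0 - lam)
    cut_w = int(width * cut_ratio)
    cut_h = int(height * cut_ratio)

    # Uniformly random box centre.
    cx = rng.randrange(width)
    cy = rng.randrange(height)

    # Clip the box to the image borders.
    bbx1 = max(cx - cut_w // 2, 0)
    bby1 = max(cy - cut_h // 2, 0)
    bbx2 = min(cx + cut_w // 2, width)
    bby2 = min(cy + cut_h // 2, height)
    return bbx1, bby1, bbx2, bby2


bbx1, bby1, bbx2, bby2 = sketch_rand_bbox((8, 3, 32, 32), lam=0.75)
```

Because of the clipping, a box centred near a border covers less than (1 - lam) of the image, which is why CutMix implementations typically recompute the mixing coefficient from the final box area.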

utils.augmentations.random_crop(x, padding)

Randomly crops the input tensor.

Parameters:
  • x (torch.Tensor) – The input tensor with shape (batch_size, channels, height, width).

  • padding (int) – The padding size for the crop.

Returns:

The cropped tensor with shape (batch_size, channels, height, width).

Return type:

torch.Tensor

utils.augmentations.random_flip(x)

Randomly flips the input tensor along the last dimension.

Parameters:

x (torch.Tensor) – Input tensor of shape (batch_size, channels, height, width).

Returns:

Flipped tensor with the same shape as the input tensor.

Return type:

torch.Tensor

utils.augmentations.random_grayscale(x, prob=0.2)

Apply random grayscale transformation to the input tensor.

Parameters:
  • x (torch.Tensor) – Input tensor of shape (batch_size, channels, height, width).

  • prob (float, optional) – Probability of applying the grayscale transformation. Default is 0.2.

Returns:

Transformed tensor with random grayscale applied.

Return type:

torch.Tensor