LIDER MODEL
Base class for all models that use the Lipschitz regularization in LiDER (https://arxiv.org/pdf/2210.06443.pdf).
Classes
- class models.utils.lider_model.LiderOptimizer(backbone, loss, args, transform, dataset=None)
Bases: ContinualModel
Superclass for all models that use the Lipschitz regularization in LiDER (https://arxiv.org/pdf/2210.06443.pdf).
- dynamic_budget_lip_loss(features)
Compute the dynamic budget Lipschitz loss for a batch of features (eq. 7).
- Parameters:
features (List[torch.Tensor]) – The list of features of each layer, ordered from the input to the output of the network.
- Returns:
The dynamic budget Lipschitz loss.
- Return type:
torch.Tensor
- get_feature_lip_coeffs(features)
Compute the Lipschitz coefficient for all the layers of a network given a list of batches of features. The features are assumed to be ordered from the input to the output of the network.
- Parameters:
features (List[torch.Tensor]) – The list of features of each layer.
- Returns:
The list of Lipschitz coefficients for each layer.
- Return type:
List[torch.Tensor]
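To illustrate how an ordered list of per-layer features could map to per-layer coefficients, the sketch below pairs each entry with its successor and applies a layer-level estimator to each pair, so N feature entries yield N - 1 coefficients. The pairing is an assumption, not something this page states, and the code is not the library's implementation. `pairwise_layer_coeffs` and `norm_ratio` are hypothetical names, and plain Python lists stand in for `torch.Tensor` batches.

```python
import math


def norm_ratio(features_a, features_b):
    """Hypothetical stand-in estimator: ratio of output norm to input norm."""
    norm_a = math.sqrt(sum(x * x for x in features_a))
    norm_b = math.sqrt(sum(x * x for x in features_b))
    return norm_b / norm_a


def pairwise_layer_coeffs(features, layer_coeff):
    """Apply `layer_coeff` to every consecutive (input, output) feature pair.

    `features` is assumed to be ordered from network input to network output.
    """
    return [layer_coeff(a, b) for a, b in zip(features[:-1], features[1:])]


# Three feature entries give two layer coefficients.
features = [[3.0, 4.0], [6.0, 8.0], [3.0, 4.0]]
coeffs = pairwise_layer_coeffs(features, norm_ratio)
```

In the real class the per-pair estimator would be `get_layer_lip_coeffs`, whose actual computation follows the paper cited in its description.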
- get_layer_lip_coeffs(features_a, features_b)
Compute the Lipschitz coefficient of a layer given its batches of input and output features. The coefficient is estimated with the method of https://arxiv.org/pdf/2108.12905.pdf.
- Parameters:
features_a (torch.Tensor) – The batch of input features.
features_b (torch.Tensor) – The batch of output features.
- Returns:
The Lipschitz coefficient of the layer.
- Return type:
torch.Tensor
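For intuition about what a layer's Lipschitz coefficient measures, the sketch below computes the elementary pairwise lower bound: the largest ratio of output distance to input distance over all sample pairs in a batch. This is a swapped-in textbook estimate, not the estimator of the cited paper and not the library's code. `empirical_lipschitz_lower_bound` is a hypothetical name, and lists of floats stand in for tensor batches.

```python
import math
from itertools import combinations


def l2_distance(u, v):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def empirical_lipschitz_lower_bound(inputs, outputs, eps=1e-12):
    """Largest ||f(x_i) - f(x_j)|| / ||x_i - x_j|| over all sample pairs.

    Any valid Lipschitz constant of the layer is at least this value.
    """
    best = 0.0
    for i, j in combinations(range(len(inputs)), 2):
        d_in = l2_distance(inputs[i], inputs[j])
        if d_in < eps:
            continue  # skip (near-)duplicate inputs to avoid dividing by zero
        best = max(best, l2_distance(outputs[i], outputs[j]) / d_in)
    return best


# A layer that scales its input by 3 has Lipschitz constant 3.
inputs = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
outputs = [[3.0 * x for x in row] for row in inputs]
estimate = empirical_lipschitz_lower_bound(inputs, outputs)
```

The pairwise bound needs no access to the layer's weights, but it can only underestimate the true constant and its cost grows quadratically with the batch size.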
- get_norm(t)
Compute the norm of a tensor.
- Parameters:
t (torch.Tensor) – The tensor.
- Returns:
The norm of the tensor.
- Return type:
torch.Tensor
- init_net(dataset)
Compute the target Lipschitz coefficients for the network and initialize the network’s Lipschitz coefficients to match them.
- Parameters:
dataset (ContinualDataset) – The dataset to use for the computation.
- minimization_lip_loss(features)
Compute the Lipschitz minimization loss for a batch of features (eq. 8).
- Parameters:
features (List[torch.Tensor]) – The list of features of each layer, ordered from the input to the output of the network.
- Returns:
The Lipschitz minimization loss.
- Return type:
torch.Tensor
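The difference between the two losses on this page can be sketched as follows, under assumptions that the page itself does not confirm. Here the minimization loss is taken to be the mean of the per-layer coefficients, and the dynamic budget loss the mean absolute gap between those coefficients and per-layer targets (such as the ones `init_net` computes). The exact forms are given by the equations referenced above and may differ in norm or weighting. `minimization_loss` and `budget_loss` are hypothetical names operating on plain floats.

```python
def minimization_loss(lip_coeffs):
    """Assumed form: mean coefficient, so lowering it shrinks every layer's coefficient."""
    return sum(lip_coeffs) / len(lip_coeffs)


def budget_loss(lip_coeffs, targets):
    """Assumed form: mean absolute deviation of each coefficient from its target."""
    gaps = [abs(c - t) for c, t in zip(lip_coeffs, targets)]
    return sum(gaps) / len(gaps)


coeffs = [1.0, 2.0, 3.0]
targets = [1.0, 1.0, 1.0]

loss_min = minimization_loss(coeffs)        # pushes all coefficients toward zero
loss_budget = budget_loss(coeffs, targets)  # pushes each coefficient toward its target
```

Under this reading, the budget variant is zero whenever every layer sits exactly on its target, whereas the minimization variant keeps penalizing any positive coefficient.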
- top_eigenvalue(K, n_power_iterations=10)
Compute the top eigenvalue of a matrix K using the power iteration method. Stop gradient propagation after n_power_iterations.
- Parameters:
K (torch.Tensor) – The matrix to compute the top eigenvalue of.
n_power_iterations (int) – The number of power iterations to run. If positive, compute the gradient only for the first n_power_iterations iterations. If negative, compute the gradient only for the last |n_power_iterations| iterations.
- Returns:
The top eigenvalue of K.
- Return type:
torch.Tensor
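The power iteration method named above can be sketched in dependency-free Python. This is a minimal illustration for a single symmetric positive semi-definite matrix given as a list of rows. It is not the library's implementation, and it leaves out batching and the gradient truncation controlled by `n_power_iterations`. `power_iteration_top_eigenvalue`, `matvec`, and the `seed` parameter are hypothetical names introduced for the example.

```python
import math
import random


def matvec(K, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(k * x for k, x in zip(row, v)) for row in K]


def power_iteration_top_eigenvalue(K, n_power_iterations=10, seed=0):
    """Estimate the dominant eigenvalue of a symmetric PSD matrix K."""
    rng = random.Random(seed)
    # A random start is very unlikely to be orthogonal to the top eigenvector.
    v = [rng.gauss(0.0, 1.0) for _ in range(len(K))]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_power_iterations):
        w = matvec(K, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0  # K maps v to zero, e.g. K is the zero matrix
        v = [x / norm for x in w]  # re-normalize to keep the iterate bounded
    # Rayleigh quotient v^T K v of the unit-norm iterate.
    return sum(a * b for a, b in zip(v, matvec(K, v)))


# The eigenvalues of [[2, 1], [1, 2]] are 3 and 1.
top = power_iteration_top_eigenvalue([[2.0, 1.0], [1.0, 2.0]], n_power_iterations=50)
```

Each iteration shrinks the component along the non-dominant eigenvectors by the ratio of their eigenvalue to the largest one, so a small iteration count is usually enough when the spectrum is well separated.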