SECOND ORDER#

Arguments#

Options

--virtual_bs_n : int

Help: Virtual batch size iterations

  • Default: 1

--clip_grad : none_or_float

Help: Clip gradient norm (None means no clipping)

  • Default: 100

--tuning_style : str

Help: Strategy to use for tuning the model.

  • “lora”: LoRA

  • “full”: full fine-tuning

  • “ia3”: IA3

  • Default: lora

  • Choices: lora, full, ia3

--lora_r : int

Help: LoRA rank. Used if tuning_style is “lora”.

  • Default: 16

--num_epochs_pretuning : int

Help: Number of epochs for pre-tuning

  • Default: 3

--learning_rate_pretuning : float

Help: Learning rate for pre-tuning.

  • Default: 0.01

--fisher_mc_classes : int_or_all

Help: Number of classes to use for EWC Fisher computation.

  • “all”: slow but accurate, uses all classes

  • <int>: use subset of <int> classes, faster but less accurate

  • Default: all

--num_samples_align_pretuning : int

Help: Number of samples drawn from each Gaussian.

  • Default: 256

--batch_size_align_pretuning : int

Help: Batch size for CA.

  • Default: 128

--num_epochs_align_pretuning : int

Help: Number of epochs for CA.

  • Default: 10

--lr_align_pretuning : float

Help: Learning rate for CA.

  • Default: 0.01

--use_iel : 0|1|True|False -> bool

Help: Tune with ITA (0) or IEL (1).

  • Default: 0

  • Choices: 0, 1

--beta_iel : float

Help: Beta parameter of IEL (Eq. 18/19)

  • Default: 0.0

--alpha_ita : float

Help: Alpha parameter of ITA (Eq. 11)

  • Default: 0.0

--req_weight_cls : float

Help: Regularization weight for the classifier (alpha for ITA, beta for IEL). If None, the ITA/IEL alpha/beta is used.

  • Default: None

--simple_reg_weight_cls : float

Help: Regularization weight for simple MSE-based loss for the classifier.

  • Default: 0.0
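These options are registered through the model's static get_parser hook (listed under Classes below) and are passed on the training command line together with the framework's own model and dataset selection flags. The following is a minimal, illustrative argparse sketch of how a few of them might be declared; the none_or_float helper and the exact help strings are assumptions for illustration, not the module's source code.

    import argparse

    def none_or_float(x):
        # Assumed helper mirroring the none_or_float type shown for --clip_grad:
        # the literal string "None" disables clipping, anything else is a float.
        return None if x.lower() == 'none' else float(x)

    def add_second_order_args(parser):
        # Illustrative subset of the options documented above; defaults follow the listing.
        parser.add_argument('--virtual_bs_n', type=int, default=1,
                            help='Virtual batch size iterations')
        parser.add_argument('--clip_grad', type=none_or_float, default=100,
                            help='Clip gradient norm (None means no clipping)')
        parser.add_argument('--tuning_style', type=str, default='lora',
                            choices=['lora', 'full', 'ia3'],
                            help='Strategy to use for tuning the model')
        parser.add_argument('--lora_r', type=int, default=16,
                            help='LoRA rank, used if tuning_style is "lora"')
        parser.add_argument('--use_iel', type=int, default=0, choices=[0, 1],
                            help='Tune with ITA (0) or IEL (1)')
        parser.add_argument('--alpha_ita', type=float, default=0.0,
                            help='Alpha parameter of ITA')
        parser.add_argument('--beta_iel', type=float, default=0.0,
                            help='Beta parameter of IEL')
        return parser

    if __name__ == '__main__':
        parser = add_second_order_args(argparse.ArgumentParser())
        print(parser.parse_args(['--tuning_style', 'lora', '--lora_r', '32']))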

Classes#

class models.second_order.SecondOrder(backbone, loss, args, transform, dataset=None)[source]#

Bases: ContinualModel

COMPATIBILITY: List[str] = ['class-il', 'domain-il', 'task-il', 'general-continual']#
NAME: str = 'second_order'#
accuracy(pred, labels)[source]#
align_pretuning(cls, distributions_to_sample_from, desc='')[source]#
apply_grads(param_group)[source]#
begin_task(dataset)[source]#
compute_loss(stream_logits, stream_labels)[source]#

Compute the loss for the current task.

compute_statistics(dataset, distributions, use_lora)[source]#
create_features_dataset(data_loader, use_lora)[source]#
create_synthetic_features_dataset(distributions_to_sample_from=None, upto=None)[source]#
end_task(dataset)[source]#
forward(x, task_weights=None, returnt='out')[source]#
get_optimizer()[source]#
static get_parser(parser)[source]#
Return type:

ArgumentParser

get_scheduler()[source]#
get_sgd_optim(cls, lr)[source]#
grad_backup(param_group)[source]#
grad_recall(param_group, op='set')[source]#
linear_probing(dataset, classifier, lr, num_epochs, desc='', use_lora=False)[source]#
masked_loss(cls, x, labels)[source]#

Separate losses for current and previous tasks (an illustrative sketch follows the class listing below).

net: Model#
observe(inputs, labels, not_aug_inputs, epoch=None)[source]#
pretuning(dataset)[source]#
sched(optim, num_epochs)[source]#
update_statistics(dataset)[source]#
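The masked_loss method is documented only as separating the losses for current and previous tasks. A common way to do this in class-incremental learning is to restrict the cross-entropy to the current task's output columns; the sketch below illustrates that generic pattern under assumed n_past_classes / n_seen_classes bookkeeping and is not the actual SecondOrder implementation.

    import torch.nn.functional as F

    def masked_loss_sketch(logits, labels, n_past_classes, n_seen_classes):
        # Hide logits of previous tasks and of classes not yet seen, so the
        # cross-entropy only compares the current task's classes.
        masked = logits.clone()
        masked[:, :n_past_classes] = float('-inf')
        masked[:, n_seen_classes:] = float('-inf')
        loss_current = F.cross_entropy(masked, labels)
        # In the real model, a separate term on the previous tasks' outputs
        # (e.g. the ITA/IEL regularizer) would be combined with this loss.
        return loss_current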

Functions#

models.second_order.int_or_all(x)[source]#
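int_or_all is the argparse type used by --fisher_mc_classes, which accepts either the literal string "all" or an integer number of classes. A plausible implementation, assuming exactly that behavior:

    def int_or_all(x):
        # Return the literal string 'all' unchanged, otherwise parse an integer.
        return x if x == 'all' else int(x)

    # e.g. parser.add_argument('--fisher_mc_classes', type=int_or_all, default='all')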