TWF

Arguments

Options

--der_alpha (float)

Help: Distillation alpha hyperparameter for student stream (alpha in the paper).

  • Default: None

--der_beta (float)

Help: Distillation beta hyperparameter (beta in the paper).

  • Default: None

--lambda_fp (float)

Help: Weight of the feature propagation loss.

  • Default: None

--lambda_diverse_loss (float)

Help: Diverse loss hyperparameter.

  • Default: 0

--lambda_fp_replay (float)

Help: Weight of the feature propagation loss on replayed (buffer) samples.

  • Default: 0

--resize_maps (0|1|True|False → bool)

Help: Whether to downscale and then upscale feature maps before storing them in the buffer.

  • Default: 0

--min_resize_threshold (int)

Help: Minimum size of feature maps eligible for resizing.

  • Default: 16

--virtual_bs_iterations (int)

Help: Number of iterations accumulated into a single virtual batch.

  • Default: 1
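The TwF-specific options above could be registered with argparse roughly as follows. This is a minimal sketch, not the library's actual parser code: the flag names, defaults, and help strings come from the list above, while the function shape and the boolean conversion for `--resize_maps` are illustrative assumptions.

```python
import argparse

def get_twf_parser(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    # Distillation weights (alpha and beta in the TwF paper)
    parser.add_argument('--der_alpha', type=float, default=None,
                        help='Distillation alpha hyperparameter for the student stream.')
    parser.add_argument('--der_beta', type=float, default=None,
                        help='Distillation beta hyperparameter.')
    # Feature-propagation loss weights
    parser.add_argument('--lambda_fp', type=float, default=None,
                        help='Weight of the feature propagation loss.')
    parser.add_argument('--lambda_diverse_loss', type=float, default=0,
                        help='Diverse loss hyperparameter.')
    parser.add_argument('--lambda_fp_replay', type=float, default=0,
                        help='Weight of the feature propagation loss on replayed samples.')
    # Buffer-related behaviour
    parser.add_argument('--resize_maps', type=lambda x: str(x) in ('1', 'True'),
                        default=0,
                        help='Downscale/upscale feature maps before buffering.')
    parser.add_argument('--min_resize_threshold', type=int, default=16,
                        help='Minimum size of feature maps eligible for resizing.')
    parser.add_argument('--virtual_bs_iterations', type=int, default=1,
                        help='Iterations accumulated into a single virtual batch.')
    return parser
```

Note that with this hedged `0|1|True|False` conversion, the default `0` is left untouched (argparse does not apply `type` to defaults), matching the documented default above.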

Rehearsal arguments

Arguments shared by all rehearsal-based methods.

--buffer_size (int)

Help: The size of the memory buffer.

  • Default: None

--minibatch_size (int)

Help: The batch size of the memory buffer.

  • Default: None
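To see how `--buffer_size` and `--minibatch_size` interact, here is a toy rehearsal buffer using reservoir sampling. The class and its methods are purely illustrative (the library's actual buffer implementation is not shown here); they only demonstrate that `buffer_size` caps how many examples are stored, while `minibatch_size` controls how many are drawn per replay step.

```python
import random

class ToyBuffer:
    """Fixed-capacity memory buffer filled by reservoir sampling (illustrative)."""

    def __init__(self, buffer_size: int):
        self.buffer_size = buffer_size   # --buffer_size
        self.data = []
        self.seen = 0                    # examples observed so far

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.buffer_size:
            self.data.append(example)    # buffer not full: always store
        else:
            # overwrite a random slot with probability buffer_size / seen
            j = random.randrange(self.seen)
            if j < self.buffer_size:
                self.data[j] = example

    def get_data(self, minibatch_size: int):
        # --minibatch_size examples drawn uniformly for replay
        k = min(minibatch_size, len(self.data))
        return random.sample(self.data, k)
```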

Classes

class models.twf.TwF(backbone, loss, args, transform, dataset=None)

Bases: ContinualModel

Transfer without Forgetting: double-branch distillation + inter-branch skip attention.

COMPATIBILITY: List[str] = ['class-il', 'task-il']
NAME: str = 'twf'
begin_task(dataset)
end_task(dataset)
get_custom_double_transform(transform)
static get_parser(parser)
Return type: ArgumentParser

observe(inputs, labels, not_aug_inputs, epoch=None)
partial_distill_loss(net_partial_features, pret_partial_features, targets, teacher_forcing=None, extern_attention_maps=None)
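`partial_distill_loss` aligns intermediate feature maps of the current (student) branch with those of the frozen pretrained (teacher) branch, modulated by attention maps. The pure-Python toy below sketches the general idea as a per-layer attention-weighted mean squared error; the function name, flat-vector representation, and uniform averaging are illustrative assumptions, not the paper's exact formulation.

```python
def toy_partial_distill_loss(net_partial_features, pret_partial_features,
                             attention_maps=None):
    """Attention-weighted MSE between student and teacher feature lists.

    Each feature argument is a list (one entry per layer) of flat vectors.
    attention_maps, if given, holds per-layer, per-position weights in [0, 1].
    """
    total, n_layers = 0.0, len(net_partial_features)
    for layer, (f_s, f_t) in enumerate(zip(net_partial_features,
                                           pret_partial_features)):
        w = attention_maps[layer] if attention_maps else [1.0] * len(f_s)
        # mean squared error, each position scaled by its attention weight
        layer_loss = sum(wi * (a - b) ** 2
                         for wi, a, b in zip(w, f_s, f_t)) / len(f_s)
        total += layer_loss
    return total / n_layers
```

When the attention weight at a position is zero, that position contributes nothing to the loss, so the teacher constrains the student only where the attention maps are active.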

Functions

models.twf.batch_iterate(size, batch_size)
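From its signature, `batch_iterate(size, batch_size)` presumably yields successive index chunks covering `size` elements in groups of `batch_size`. A plausible sketch, assuming plain index lists and a final partial chunk (the real implementation may differ, e.g. by yielding tensors or dropping the remainder):

```python
def batch_iterate(size: int, batch_size: int):
    """Yield successive index lists covering `size` items in `batch_size` chunks."""
    n_chunks = (size + batch_size - 1) // batch_size  # ceiling division
    for i in range(n_chunks):
        yield list(range(i * batch_size, min((i + 1) * batch_size, size)))
```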