

--dataset str (with underscores replaced by dashes)

Help: Which dataset to perform experiments on.

  • Default: None

  • Choices: seq-cifar10-224, seq-cub200, seq-cifar10, seq-cars196, seq-resisc45, seq-cifar100-224, seq-cifar100-224-rs, seq-mnist, rot-mnist, perm-mnist, seq-mit67, seq-cropdisease, seq-imagenet-r, seq-tinyimg, seq-tinyimg-r, seq-isic, seq-cifar10-224-rs, seq-eurosat-rgb, mnist-360, seq-celeba, seq-cub200-rs, seq-cifar100, seq-chestx

--model str (with underscores replaced by dashes)

Help: Model name.

  • Default: None

  • Choices: dualprompt, pnn, der, er-ace-lider, ranpac, derpp-lider, xder-ce, er, llava, icarl, idefics, starprompt, derpp-cscct, ccic, lwf, lwf-mc, icarl-cscct, si, icarl-lider, agem, coda-prompt, gdumb-lider, zscl, er-ace-aer-abs, er-ace-cscct, agem-r, er-tricks, l2p, gdumb, ewc-on, gss, derpp-casper, twf, rpc, puridiver, cgil, er-ace-casper, icarl-casper, joint-gcl, gem, xder-rpc, xder, xder-rpc-cscct, dap, derpp, first-stage-starprompt, hal, sgd, slca, fdr, attriclip, clip, xder-rpc-casper, moe-adapters, mer, er-ace-tricks, er-ace, second-stage-starprompt, lws, lucir, joint, bic

--backbone str (with underscores replaced by dashes)

Help: Backbone network name.

  • Default: None

  • Choices: resnet50, resnet50-pt, resnet18, resnet18-7x7-pt, reduced-resnet18, resnet34, resnet32, mnistmlp, vit


Help: (deprecated) Loads the best arguments for each method, dataset and memory buffer. NOTE: This option is deprecated and not up to date.

  • Default: False


Help: The configuration used for this dataset (e.g., number of tasks, transforms, backbone architecture, etc.). The available configurations are defined in the datasets/config/<dataset> folder.

  • Default: None


Experiment arguments

Arguments used to define the experiment settings.


Help: Learning rate. This should be set as a default by the model (with set_defaults), by the dataset (with set_default_from_args, see utils), or explicitly with --lr=<value>.

  • Default: None


Help: Batch size.

  • Default: None


Help: Percentage in (0-1] of labeled examples per task.

  • Default: 1


Help: Train model on Joint (single task)?

  • Default: 0

  • Choices: 0, 1

--eval_future 0|1|True|False -> bool

Help: Evaluate future tasks?

  • Default: False
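The `0|1|True|False -> bool` signature used by the boolean flags on this page can be reproduced with a custom argparse type. The following is a minimal sketch, not the library's own converter: the name `parse_binary_flag` is hypothetical, and the case-insensitive matching is an assumption.

```python
import argparse


def parse_binary_flag(value: str) -> bool:
    """Convert '0', '1', 'True' or 'False' to a bool (hypothetical helper)."""
    normalized = str(value).strip().lower()  # case-insensitivity is an assumption
    if normalized in ("1", "true"):
        return True
    if normalized in ("0", "false"):
        return False
    raise argparse.ArgumentTypeError(f"expected 0|1|True|False, got {value!r}")


parser = argparse.ArgumentParser()
parser.add_argument("--eval_future", type=parse_binary_flag, default=False)

args = parser.parse_args(["--eval_future", "1"])
print(args.eval_future)  # True
```

A dedicated converter is needed because `type=bool` would treat any non-empty string, including `"False"` and `"0"`, as `True`.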

Validation and fitting arguments

Arguments used to define the validation strategy and the method used to fit the model.


Help: Percentage of samples FOR EACH CLASS drawn from the training set to build the validation set.

  • Default: None
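A per-class draw means that every class contributes the same share of its samples to the validation set. The sketch below illustrates the idea on plain index lists. The function name `split_per_class` is hypothetical, and treating the value as a fraction in [0, 1] is an assumption about the unit.

```python
import random
from collections import defaultdict


def split_per_class(labels, validation_fraction, seed=0):
    """Return (train_indices, val_indices), drawing the same fraction from each class."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        n_val = int(len(indices) * validation_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


labels = [0] * 10 + [1] * 20
train_idx, val_idx = split_per_class(labels, validation_fraction=0.1)
print(len(train_idx), len(val_idx))  # 27 3
```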


Help: Mode used for validation. Must be used in combination with the validation argument. Possible values:

    - current: uses only the current task for validation (default).

    - complete: uses data from both current and past tasks for validation.

  • Default: current

  • Choices: complete, current


Help: Strategy used for fitting the model. Possible values:

    - epochs: fits the model for a fixed number of epochs (default). NOTE: this option is controlled by the n_epochs argument.

    - iters: fits the model for a fixed number of iterations. NOTE: this option is controlled by the n_iters argument.

    - early_stopping: fits the model until the early stopping criterion is met. This option requires a validation set (see the validation argument). The criterion is: if the validation loss does not decrease for early_stopping_patience epochs, training stops.

  • Default: epochs

  • Choices: epochs, iters, time, early_stopping


Help: Number of epochs to wait before stopping the training if the validation loss does not decrease. Used only if fitting_mode=early_stopping.

  • Default: 5


Help: Metric used for early stopping. Used only if fitting_mode=early_stopping.

  • Default: loss

  • Choices: loss, accuracy


Help: Frequency of validation evaluation. Used only if fitting_mode=early_stopping.

  • Default: 1


Help: Minimum improvement required to consider a new best model. Used only if fitting_mode=early_stopping.

  • Default: 1e-06
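The early stopping settings above (patience, metric, and minimum improvement) interact as in the following minimal sketch. It is an illustration under stated assumptions, not the library's implementation: the class `EarlyStopping` and its attribute names are hypothetical, and a "loss" metric is assumed to improve when it decreases while "accuracy" improves when it increases.

```python
class EarlyStopping:
    """Track a validation metric and signal when training should stop (hypothetical)."""

    def __init__(self, patience=5, min_delta=1e-6, metric="loss"):
        self.patience = patience
        self.min_delta = min_delta
        self.lower_is_better = metric == "loss"  # accuracy: higher is better
        self.best = None
        self.bad_evals = 0  # consecutive evaluations without improvement

    def step(self, value):
        """Record one validation result; return True when training should stop."""
        if self.best is None:
            improved = True
        elif self.lower_is_better:
            improved = value < self.best - self.min_delta
        else:
            improved = value > self.best + self.min_delta

        if improved:
            self.best = value
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


stopper = EarlyStopping(patience=3)
decisions = [stopper.step(loss) for loss in [1.0, 0.8, 0.8, 0.8, 0.8]]
print(decisions)  # [False, False, False, False, True]
```

The minimum improvement keeps tiny fluctuations of the metric from resetting the patience counter.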


Help: Number of epochs. Used only if fitting_mode=epochs.

  • Default: None


Help: Number of iterations. Used only if fitting_mode=iters.

  • Default: None

Optimizer and learning rate scheduler arguments

Arguments used to define the optimizer and the learning rate scheduler.


Help: Optimizer.

  • Default: sgd

  • Choices: sgd, adam, adamw


Help: Optimizer weight decay.

  • Default: 0.0


Help: Optimizer momentum.

  • Default: 0.0

--optim_nesterov 0|1|True|False -> bool

Help: Optimizer Nesterov momentum.

  • Default: 0

--drop_last 0|1|True|False -> bool

Help: Drop the last batch if it is not complete?

  • Default: 0


Help: Learning rate scheduler.

  • Default: None


Help: Scheduler mode. Possible values:

    - epoch: the scheduler is called at the end of each epoch.

    - iter: the scheduler is called at the end of each iteration.

  • Default: epoch

  • Choices: epoch, iter


Help: Learning rate scheduler milestones (used if lr_scheduler=multisteplr).

  • Default: []


Help: Learning rate scheduler gamma (used if lr_scheduler=multisteplr).

  • Default: 0.1
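With a multi-step schedule, the learning rate is multiplied by gamma each time a milestone is reached. The sketch below computes the resulting value in closed form without any framework. The function name `multistep_lr` is hypothetical, and the example milestones are illustrative values only.

```python
from bisect import bisect_right


def multistep_lr(base_lr, milestones, gamma, epoch):
    """Learning rate at `epoch`: base_lr decayed by gamma once per milestone reached."""
    # bisect_right counts the milestones that are <= epoch.
    return base_lr * gamma ** bisect_right(sorted(milestones), epoch)


# Illustrative values: decay at epochs 35 and 45 with gamma=0.1.
for epoch in (0, 34, 35, 45):
    print(epoch, multistep_lr(0.1, [35, 45], 0.1, epoch))
```

With the default empty milestone list, the learning rate stays at its base value for the whole run.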

Noise arguments

Arguments used to define the noisy-label settings.

--noise_type field with aliases (str)

Help: Type of noise to apply. The symmetric type is supported by all datasets, while the asymmetric must be supported explicitly by the dataset (see datasets/utils/label_noise).

  • Default: symmetric


Help: Noise rate in [0-1].

  • Default: 0
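Symmetric noise needs no dataset-specific information, which is why every dataset can support it. A minimal sketch follows, assuming that a corrupted label is replaced by a class drawn uniformly from the other classes (some formulations also allow the original class to be redrawn). The function name `apply_symmetric_noise` is hypothetical and is not the helper in datasets/utils/label_noise.

```python
import random


def apply_symmetric_noise(labels, noise_rate, num_classes, seed=None):
    """Flip each label with probability noise_rate to a different, uniformly chosen class."""
    rng = random.Random(seed)
    noisy = []
    for label in labels:
        if rng.random() < noise_rate:
            # Assumption: the replacement is never the original class.
            candidates = [c for c in range(num_classes) if c != label]
            noisy.append(rng.choice(candidates))
        else:
            noisy.append(label)
    return noisy


clean = [0, 1, 2, 3] * 5
noisy = apply_symmetric_noise(clean, noise_rate=0.4, num_classes=4, seed=0)
print(sum(a != b for a, b in zip(clean, noisy)), "labels flipped")
```

Fixing the seed makes the corrupted targets reproducible across runs, which is the property a noisy-label cache relies on.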

--disable_noisy_labels_cache 0|1|True|False -> bool

Help: Disable caching the noisy label targets? NOTE: if the seed is not set, the noisy labels will be different at each run with this option disabled.

  • Default: 0


Help: Path where to save the noisy labels cache. The path is relative to the base_path.

  • Default: noisy_labels


Management arguments

Generic arguments to manage the experiment reproducibility, logging, debugging, etc.


Help: The random seed. If not provided, a random seed will be used.

  • Default: None

--permute_classes 0|1|True|False -> bool

Help: Permute classes before splitting into tasks? This applies the seed before permuting if the seed argument is present.

  • Default: 0
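Permuting before splitting changes which classes end up together in a task, and seeding the permutation makes the split reproducible. The sketch below shows the two steps on plain class indices. Both function names are hypothetical, and a fixed number of classes per task is an assumption.

```python
import random


def permute_classes(num_classes, seed=None):
    """Return a class order, shuffled with a dedicated seeded generator."""
    order = list(range(num_classes))
    random.Random(seed).shuffle(order)
    return order


def split_into_tasks(class_order, classes_per_task):
    """Group consecutive classes of the given order into tasks."""
    return [class_order[i:i + classes_per_task]
            for i in range(0, len(class_order), classes_per_task)]


tasks = split_into_tasks(permute_classes(10, seed=42), classes_per_task=2)
print(tasks)
```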


Help: The base path where to save datasets, logs, results.

  • Default: ./data/


Help: The path where to save the results. NOTE: this path is relative to base_path.

  • Default: results/


Help: The device (or devices) available to use for training. More than one device can be specified by separating them with a comma. If not provided, the code will use the least used GPU available (if there are any), otherwise the CPU. MPS is used automatically if no GPU is available and MPS is supported. If more than one GPU is available, Mammoth will use the least used one if --distributed=no.

  • Default: None


Help: Helper argument to include notes for this run. Example: distinguish between different versions of a model and keep their results separate.

  • Default: None


Help: Perform inference on validation every eval_epochs epochs. If not provided, the model is evaluated ONLY at the end of each task.

  • Default: None

--non_verbose 0|1|True|False -> bool

Help: Make progress bars non-verbose.

  • Default: 0

--disable_log 0|1|True|False -> bool

Help: Disable logging?

  • Default: 0


Help: Number of workers for the dataloaders (default=infer from number of cpus).

  • Default: None

--enable_other_metrics 0|1|True|False -> bool

Help: Enable computing additional metrics: forward and backward transfer.

  • Default: 0

--debug_mode 0|1|True|False -> bool

Help: Run only a few training steps per epoch. This also disables logging on wandb.

  • Default: 0

--inference_only 0|1|True|False -> bool

Help: Perform inference only for each task (no training).

  • Default: 0


Help: Optimization level for the code.

    - 0: no optimization.

    - 1: Use TF32, if available.

    - 2: Use BF16, if available.

    - 3: Use BF16 and torch.compile. BEWARE: torch.compile may break your code if you change the model after the first run! Use with caution.

  • Default: 0

  • Choices: 0, 1, 2, 3


Help: Enable distributed training?

  • Default: no

  • Choices: no, dp, ddp


Help: Save checkpoint every task or at the end of the training (last).

  • Default: None

  • Choices: last, task


Help: Save the model checkpoint with metadata in a single pickle file with the old structure (old_pickle) or with the new, safe structure (default). NOTE: the old_pickle structure requires weights_only=False, which will be deprecated by PyTorch.

  • Default: safe

  • Choices: old_pickle, safe


Help: Path of the checkpoint to load (.pt file for the specific task)

  • Default: None


Help: (optional) checkpoint save name.

  • Default: None


Help: Task to start from

  • Default: None


Help: Task limit

  • Default: None

Wandb arguments

Arguments to manage logging on Wandb.


Help: Wandb name for this run. Overrides the default name (args.model).

  • Default: None


Help: Wandb entity

  • Default: None


Help: Wandb project name

  • Default: None



Help: The size of the memory buffer.

  • Default: None


Help: The batch size of the memory buffer.

  • Default: None


utils.args.add_configuration_args(parser, args)

Arguments needed to define the configuration of the dataset and model.

utils.args.add_dynamic_parsable_args(parser, dataset, backbone)

Add the additional arguments of the chosen dataset and backbone to the parser.

  • parser (ArgumentParser) – the parser instance to extend

  • dataset (str) – the dataset name

  • backbone (str) – the backbone name


Adds the arguments used by all the models.


parser (ArgumentParser) – the parser instance



Return type:



Returns the initial parser for the arguments.

Return type:



Adds the management arguments.


parser (ArgumentParser) – the parser instance



Return type:



Adds the arguments used by all the rehearsal-based methods


parser (ArgumentParser) – the parser instance



Return type:


utils.args.build_parsable_args(parser, spec)

Builds the argument parser given a specification and extends the given parser.

The specification dictionary can either be a simple list of key-value arguments or follow the format:

    'name': {
        'type': type,
        'default': default,
        'choices': choices,
        'help': help,
        'required': True/False
    }

If the specification is a simple list of key-value arguments, the value of the argument is the default value. If the default is set to inspect.Parameter.empty, the argument is required. The type of the argument is inferred from the default value (default is str).

  • parser (ArgumentParser) – the argument parser

  • spec (dict) – the specification dictionary
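The two specification formats can be handled as in the following sketch. It is an illustration built only on standard argparse calls, not the actual body of build_parsable_args: the function name `build_parser_from_spec` and the example argument names are hypothetical.

```python
import argparse
import inspect


def build_parser_from_spec(parser, spec):
    """Extend `parser` with one --<name> option per entry of `spec` (hypothetical)."""
    for name, value in spec.items():
        if isinstance(value, dict):
            # Full format: forward only the keys that add_argument understands.
            keys = ("type", "default", "choices", "help", "required")
            kwargs = {key: value[key] for key in keys if key in value}
        elif value is inspect.Parameter.empty:
            # No default available: the argument becomes required.
            kwargs = {"type": str, "required": True}
        else:
            # Simple format: the value is the default and its type is inferred.
            # NOTE: bool defaults would need a dedicated converter instead of bool().
            inferred = type(value) if value is not None else str
            kwargs = {"type": inferred, "default": value}
        parser.add_argument(f"--{name}", **kwargs)
    return parser


spec = {
    "depth": 18,                                # simple key-value entry
    "weights_path": inspect.Parameter.empty,    # required entry
    "norm": {"type": str, "default": "batch",   # full entry
             "choices": ["batch", "layer"], "help": "Normalization layer."},
}
spec_parser = build_parser_from_spec(argparse.ArgumentParser(), spec)
spec_args = spec_parser.parse_args(["--weights_path", "w.pt", "--depth", "34"])
print(spec_args.depth, spec_args.norm, spec_args.weights_path)  # 34 batch w.pt
```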


the argument parser

Return type:



Check if an argument is defined multiple times during the string parsing. Prevents the user from typing the same argument multiple times as: --arg1=val1 --arg1=val2.
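argparse silently keeps the last value when an option is repeated, so a check of this kind has to inspect the raw command line. A minimal sketch, assuming that options always start with two dashes. The function name `find_duplicate_args` is hypothetical.

```python
def find_duplicate_args(argv):
    """Return the option names that appear more than once in argv."""
    seen, duplicates = set(), []
    for token in argv:
        if not token.startswith("--"):
            continue  # a value, not an option name
        name = token.split("=", 1)[0]  # handles --name=value and --name value
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates


print(find_duplicate_args(["--arg1=val1", "--arg1=val2", "--arg2", "val3"]))  # ['--arg1']
```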


Extracts the registered name from the dictionary arguments.

Return type:


utils.args.fix_model_parser_backwards_compatibility(main_parser, model_parser=None)

Fix the backwards compatibility of the get_parser method of the models.


the fixed parser

Return type:


utils.args.get_single_arg_value(parser, arg_name)

Returns the value of a single argument without explicitly parsing the arguments.

  • parser (ArgumentParser) – the argument parser

  • arg_name (str) – the name of the argument
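One way to read a single value before the full parser is ready is to probe the command line with a throwaway parser that knows only that argument. The sketch below takes this approach and is not necessarily how get_single_arg_value works internally. The name `peek_arg_value` and the explicit `argv` parameter are hypothetical.

```python
import argparse


def peek_arg_value(parser, arg_name, argv):
    """Return the raw string value of --<arg_name> in argv, or the parser's default."""
    # The probe ignores help flags, required arguments and abbreviations.
    probe = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    dest = arg_name.replace("-", "_")
    probe.add_argument(f"--{arg_name}", dest=dest, default=parser.get_default(dest))
    known, _unknown = probe.parse_known_args(argv)
    return getattr(known, dest)


main_parser = argparse.ArgumentParser()
main_parser.add_argument("--dataset", default=None)
main_parser.add_argument("--model", required=True)

argv = ["--dataset", "seq-cifar10", "--lr=0.1"]
print(peek_arg_value(main_parser, "dataset", argv))  # seq-cifar10
```

Because the probe uses `parse_known_args`, unrelated or not yet registered options such as `--lr` are skipped instead of raising an error.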


the value of the argument

Return type:


utils.args.update_cli_defaults(parser, cnf)

Updates the default values of the parser with the values in the configuration dictionary.

If an argument is defined as required in the parser but a default value is provided in the configuration dictionary, the argument is set as not required.

  • parser (ArgumentParser) – the argument parser

  • cnf (dict) – the configuration dictionary
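The behaviour described above, new defaults plus relaxing the required flag, can be sketched with standard argparse calls. This is an assumption-laden illustration rather than the actual implementation: the name `apply_config_defaults` is hypothetical, and the loop relies on the private `parser._actions` list.

```python
import argparse


def apply_config_defaults(parser, config):
    """Use `config` values as parser defaults and lift `required` where one is given."""
    parser.set_defaults(**config)
    for action in parser._actions:  # private attribute, used here for illustration
        if action.dest in config:
            action.required = False  # a default now exists, so the flag is optional


cfg_parser = argparse.ArgumentParser()
cfg_parser.add_argument("--lr", type=float, required=True)
cfg_parser.add_argument("--batch_size", type=int, default=None)

apply_config_defaults(cfg_parser, {"lr": 0.03, "batch_size": 32})
print(cfg_parser.parse_args([]))
print(cfg_parser.parse_args(["--lr", "0.1"]).lr)
```

Values given on the command line still take precedence over the configuration defaults.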



Return type:
