Arguments

MAIN MAMMOTH ARGS

--dataset : str (with underscores replaced by dashes)

Help: Which dataset to perform experiments on.

  • Default: None

  • Choices: seq-tinyimg, seq-tinyimg-r, seq-cifar10-224-rs, seq-mit67, seq-cifar100-224-rs, seq-eurosat-rgb, seq-resisc45, seq-cub200, seq-mnist, seq-cifar10-224, seq-imagenet-r, seq-isic, seq-cifar100, seq-cars196, seq-cub200-rs, rot-mnist, seq-cifar100-224, perm-mnist, seq-cropdisease, seq-cifar10, mnist-360, seq-chestx

--model : str (with underscores replaced by dashes)

Help: Model name.

  • Default: None

  • Choices: lwf-mc, xder-ce, derpp-lider, clip, second-stage-starprompt, gem, xder, dap, der, idefics, sgd, mer, llava, icarl-lider, joint-gcl, ranpac, ewc-on, puridiver, slca, l2p, hal, ccic, pnn, starprompt, gdumb-lider, agem, attriclip, dualprompt, fdr, si, xder-rpc, gss, er-ace-lider, moe-adapters, twf, icarl, er-tricks, lwf, er-ace-tricks, lucir, rpc, agem-r, bic, derpp, er-ace, cgil, er, er-ace-aer-abs, joint, first-stage-starprompt, gdumb, coda-prompt

--backbone : str (with underscores replaced by dashes)

Help: Backbone network name.

  • Default: None

  • Choices: resnet50, resnet50_pt, mnistmlp, vit, resnet18, resnet34

--load_best_args : unknown

Help: (deprecated) Loads the best arguments for each method, dataset, and memory buffer. NOTE: this option is no longer kept up to date.

  • Default: False

--dataset_config : str

Help: The configuration used for this dataset (e.g., number of tasks, transforms, backbone architecture, etc.). The available configurations are defined in the datasets/config/<dataset> folder.

  • Default: None

EXPERIMENT-RELATED ARGS

Experiment arguments

Arguments used to define the experiment settings.

--lr : float

Help: Learning rate. This should either be set as a default by the model (with set_defaults), by the dataset (with set_default_from_args, see utils), or with --lr=<value>.

  • Default: None

--batch_size : int

Help: Batch size.

  • Default: None

--label_perc : float

Help: Percentage in (0-1] of labeled examples per task.

  • Default: 1

--label_perc_by_class : float

Help: Percentage in (0-1] of labeled examples per class.

  • Default: 1

--joint : int

Help: Train model on Joint (single task)?

  • Default: 0

  • Choices: 0, 1

--eval_future : int

Help: Evaluate future tasks?

  • Default: 0

  • Choices: 0, 1

Validation and fitting arguments

Arguments used to define the validation strategy and the method used to fit the model.

--validation : float

Help: Percentage of samples FOR EACH CLASS drawn from the training set to build the validation set.

  • Default: None

--validation_mode : str

Help: Mode used for validation. Must be used in combination with the validation argument. Possible values:

  • current: uses only the current task for validation (default).

  • complete: uses data from both current and past tasks for validation.

  • Default: current

  • Choices: complete, current
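The per-class split described by the validation argument can be sketched as follows. This is a minimal, hypothetical illustration (split_validation is a made-up helper, not Mammoth's actual implementation):

```python
# Hypothetical sketch of a per-class validation split: hold out
# `validation_perc`% of the samples OF EACH CLASS, as --validation does.
import random
from collections import defaultdict

def split_validation(labels, validation_perc, seed=0):
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    val_idx = []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_val = int(len(idxs) * validation_perc / 100)
        val_idx.extend(idxs[:n_val])
    val_set = set(val_idx)
    train_idx = [i for i in range(len(labels)) if i not in val_set]
    return train_idx, val_idx

# --validation=10 on a toy set with two classes of 100 samples each:
labels = [0] * 100 + [1] * 100
train_idx, val_idx = split_validation(labels, validation_perc=10)
```

Because the percentage is taken per class, the validation set stays class-balanced even when tasks are imbalanced.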

--fitting_mode : str

Help: Strategy used for fitting the model. Possible values:

  • epochs: fits the model for a fixed number of epochs (default). NOTE: controlled by the n_epochs argument.

  • iters: fits the model for a fixed number of iterations. NOTE: controlled by the n_iters argument.

  • early_stopping: fits the model until the early-stopping criterion is met. Requires a validation set (see the validation argument); training stops if the validation loss does not decrease for early_stopping_patience epochs.

  • Default: epochs

  • Choices: epochs, iters, time, early_stopping

--early_stopping_patienceint

Help: Number of epochs to wait before stopping the training if the validation loss does not decrease. Used only if fitting_mode=early_stopping.

  • Default: 5

--early_stopping_metricstr

Help: Metric used for early stopping. Used only if fitting_mode=early_stopping.

  • Default: loss

  • Choices: loss, accuracy

--early_stopping_freqint

Help: Frequency of validation evaluation. Used only if fitting_mode=early_stopping.

  • Default: 1

--early_stopping_epsilonfloat

Help: Minimum improvement required to consider a new best model. Used only if fitting_mode=early_stopping.

  • Default: 1e-06
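The interaction of the early_stopping_* arguments above can be sketched as a small tracker. This is a hypothetical illustration of the criterion as described (the EarlyStopping class is made up, not Mammoth's code):

```python
# Hypothetical sketch: stop when the monitored metric has not improved
# by at least `epsilon` for `patience` consecutive evaluations.
class EarlyStopping:
    def __init__(self, patience=5, epsilon=1e-6, metric="loss"):
        self.patience, self.epsilon, self.metric = patience, epsilon, metric
        self.best = None
        self.bad_evals = 0

    def should_stop(self, value):
        # For 'loss' lower is better; for 'accuracy' higher is better.
        improved = (
            self.best is None
            or (self.metric == "loss" and value < self.best - self.epsilon)
            or (self.metric == "accuracy" and value > self.best + self.epsilon)
        )
        if improved:
            self.best = value
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience

stopper = EarlyStopping(patience=2)
losses = [1.0, 0.9, 0.91, 0.92]  # two consecutive non-improving evals
flags = [stopper.should_stop(loss) for loss in losses]
```

With early_stopping_freq > 1, should_stop would simply be called only every freq-th epoch.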

--n_epochs : int

Help: Number of epochs. Used only if fitting_mode=epochs.

  • Default: None

--n_iters : int

Help: Number of iterations. Used only if fitting_mode=iters.

  • Default: None

Optimizer and learning rate scheduler arguments

Arguments used to define the optimizer and the learning rate scheduler.

--optimizer : str

Help: Optimizer.

  • Default: sgd

  • Choices: sgd, adam, adamw

--optim_wd : float

Help: Optimizer weight decay.

  • Default: 0.0

--optim_mom : float

Help: Optimizer momentum.

  • Default: 0.0

--optim_nesterov : 0|1|True|False -> bool

Help: Optimizer Nesterov momentum.

  • Default: 0
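Flags typed "0|1|True|False -> bool" accept either the numeric or the textual spelling. A minimal sketch of such a parser, usable as an argparse type (the function name is hypothetical, not necessarily what Mammoth calls it):

```python
# Hypothetical parser for flags documented as "0|1|True|False -> bool":
# accepts 0/1 and true/false, case-insensitively.
import argparse

def binary_to_boolean_type(value):
    value = str(value).strip().lower()
    if value in ("1", "true"):
        return True
    if value in ("0", "false"):
        return False
    raise argparse.ArgumentTypeError(f"expected 0|1|True|False, got {value!r}")

parser = argparse.ArgumentParser()
parser.add_argument("--optim_nesterov", type=binary_to_boolean_type, default=0)
args = parser.parse_args(["--optim_nesterov", "True"])
```

Raising ArgumentTypeError (rather than ValueError) lets argparse print a clean usage message on bad input.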

--lr_scheduler : str

Help: Learning rate scheduler.

  • Default: None

--scheduler_mode : str

Help: Scheduler mode. Possible values:

  • epoch: the scheduler is called at the end of each epoch.

  • iter: the scheduler is called at the end of each iteration.

  • Default: epoch

  • Choices: epoch, iter

--lr_milestones : int

Help: Learning rate scheduler milestones (used if lr_scheduler=multisteplr).

  • Default: []

--sched_multistep_lr_gamma : float

Help: Learning rate scheduler gamma (used if lr_scheduler=multisteplr).

  • Default: 0.1
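The multisteplr semantics are standard: the learning rate is multiplied by gamma each time a milestone epoch is passed. A self-contained sketch of the resulting schedule (a plain function, not the PyTorch scheduler object itself):

```python
# Sketch of MultiStepLR semantics: the base learning rate is multiplied
# by `gamma` once for every milestone epoch already reached.
def multistep_lr(base_lr, epoch, milestones, gamma=0.1):
    passed = sum(epoch >= m for m in milestones)
    return base_lr * gamma ** passed

# e.g. --lr=0.1 --lr_milestones 30 60 --sched_multistep_lr_gamma=0.1
lrs = [multistep_lr(0.1, e, [30, 60]) for e in (0, 29, 30, 59, 60)]
```

With scheduler_mode=iter, the same decay would be applied per iteration rather than per epoch.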

Noise arguments

Arguments used to define the noisy-label settings.

--noise_type : field with aliases (str)

Help: Type of noise to apply. Symmetric noise is supported by all datasets, while asymmetric noise must be supported explicitly by the dataset (see datasets/utils/label_noise).

  • Default: symmetric

--noise_rate : float

Help: Noise rate in [0-1].

  • Default: 0
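Symmetric noise replaces each label, with probability noise_rate, by a different class chosen uniformly at random. A hypothetical sketch of this (apply_symmetric_noise is a made-up helper, not Mammoth's implementation in datasets/utils/label_noise):

```python
# Hypothetical sketch of symmetric label noise: each label is flipped,
# with probability `noise_rate`, to a different uniformly random class.
import random

def apply_symmetric_noise(labels, noise_rate, num_classes, seed=0):
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            noisy.append(rng.choice([c for c in range(num_classes) if c != y]))
        else:
            noisy.append(y)
    return noisy

labels = [0, 1, 2, 3] * 250                      # 1000 labels, 4 classes
noisy = apply_symmetric_noise(labels, noise_rate=0.2, num_classes=4)
flipped = sum(a != b for a, b in zip(labels, noisy))
```

The seed parameter here mirrors why the cache (and the seed argument) matter below: without a fixed seed, the flipped labels differ on every run.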

--disable_noisy_labels_cache : 0|1|True|False -> bool

Help: Disable caching of the noisy label targets? NOTE: if the seed is not set, the noisy labels will be different at each run when caching is disabled.

  • Default: 0

--cache_path_noisy_labels : str

Help: Path where to save the noisy labels cache. The path is relative to the base_path.

  • Default: noisy_labels

MANAGEMENT ARGS

Management arguments

Generic arguments to manage the experiment reproducibility, logging, debugging, etc.

--seed : int

Help: The random seed. If not provided, a random seed will be used.

  • Default: None

--permute_classes : 0|1|True|False -> bool

Help: Permute classes before splitting into tasks? If the seed argument is set, it is applied before permuting.

  • Default: 0
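Conceptually, permute_classes shuffles the class order (seeded, so runs remain reproducible) before the classes are sliced into tasks. A hypothetical sketch (permute_class_order is a made-up helper, not Mammoth's code):

```python
# Hypothetical sketch of --permute_classes=1: shuffle the class order
# with a fixed seed, then slice it into equally sized tasks.
import random

def permute_class_order(num_classes, num_tasks, seed):
    classes = list(range(num_classes))
    random.Random(seed).shuffle(classes)
    per_task = num_classes // num_tasks
    return [classes[i * per_task:(i + 1) * per_task] for i in range(num_tasks)]

# e.g. a 10-class dataset split into 5 tasks of 2 classes each:
tasks = permute_class_order(num_classes=10, num_tasks=5, seed=42)
```

Because the shuffle is seeded, the same seed always yields the same class-to-task assignment.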

--base_path : str

Help: The base path where to save datasets, logs, results.

  • Default: ./data/

--results_path : str

Help: The path where to save the results. NOTE: this path is relative to base_path.

  • Default: results/

--device : str

Help: The device (or devices) to use for training. Multiple devices can be specified by separating them with a comma. If not provided, the code uses the least-used GPU (if any is available), otherwise the CPU. MPS is supported and is used automatically if no GPU is available. If more than one GPU is available, Mammoth uses the least-used one when --distributed=no.

  • Default: None

--notes : str

Help: Helper argument to include notes for this run, e.g., to distinguish between different versions of a model and keep their results separate.

  • Default: None

--eval_epochs : int

Help: Perform inference on validation every eval_epochs epochs. If not provided, the model is evaluated ONLY at the end of each task.

  • Default: None

--non_verbose : 0|1|True|False -> bool

Help: Make progress bars non-verbose.

  • Default: 0

--disable_log : 0|1|True|False -> bool

Help: Disable logging?

  • Default: 0

--num_workers : int

Help: Number of workers for the dataloaders (by default, inferred from the number of CPUs).

  • Default: None

--enable_other_metrics : 0|1|True|False -> bool

Help: Enable computing additional metrics: forward and backward transfer.

  • Default: 0

--debug_mode : 0|1|True|False -> bool

Help: Run only a few training steps per epoch. This also disables logging on wandb.

  • Default: 0

--inference_only : 0|1|True|False -> bool

Help: Perform inference only for each task (no training).

  • Default: 0

--code_optimization : int

Help: Optimization level for the code. Possible values:

  • 0: no optimization.

  • 1: use TF32, if available.

  • 2: use BF16, if available.

  • 3: use BF16 and torch.compile. BEWARE: torch.compile may break your code if you change the model after the first run! Use with caution.

  • Default: 0

  • Choices: 0, 1, 2, 3

--distributed : str

Help: Enable distributed training?

  • Default: no

  • Choices: no, dp, ddp

--savecheck : str

Help: Save a checkpoint after every task (task) or only at the end of training (last).

  • Default: None

  • Choices: last, task

--loadcheck : str

Help: Path of the checkpoint to load (.pt file for the specific task).

  • Default: None

--ckpt_name : str

Help: (optional) checkpoint save name.

  • Default: None

--start_from : int

Help: Task to start from

  • Default: None

--stop_after : int

Help: Task limit

  • Default: None

Wandb arguments

Arguments to manage logging on Wandb.

--wandb_name : str

Help: Wandb name for this run. Overrides the default name (args.model).

  • Default: None

--wandb_entity : str

Help: Wandb entity

  • Default: None

--wandb_project : str

Help: Wandb project name

  • Default: None

REHEARSAL-ONLY ARGS

--buffer_size : int

Help: The size of the memory buffer.

  • Default: None

--minibatch_size : int

Help: The batch size of the memory buffer.

  • Default: None
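How these two arguments interact can be sketched with a reservoir-sampled memory buffer, a common filling strategy for rehearsal methods (this ReservoirBuffer class is a hypothetical illustration, not Mammoth's Buffer implementation):

```python
# Hypothetical sketch of a rehearsal buffer: --buffer_size examples are
# kept via reservoir sampling (every example seen so far has equal
# probability of being stored), and --minibatch_size of them are
# replayed alongside each training batch.
import random

class ReservoirBuffer:
    def __init__(self, buffer_size, seed=0):
        self.buffer_size = buffer_size
        self.rng = random.Random(seed)
        self.data = []
        self.num_seen = 0

    def add(self, example):
        if len(self.data) < self.buffer_size:
            self.data.append(example)
        else:
            # Replace a stored example with probability buffer_size / (num_seen + 1).
            j = self.rng.randrange(self.num_seen + 1)
            if j < self.buffer_size:
                self.data[j] = example
        self.num_seen += 1

    def sample(self, minibatch_size):
        return self.rng.sample(self.data, min(minibatch_size, len(self.data)))

buf = ReservoirBuffer(buffer_size=50)
for x in range(1000):           # stream of 1000 examples
    buf.add(x)
replay = buf.sample(32)         # e.g. --minibatch_size=32
```

Reservoir sampling is attractive here because it needs no knowledge of the stream length, matching the task-incremental setting.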