Core Concepts#

This section explains the fundamental concepts of continual learning and how they are implemented in Mammoth Lite.

Continual Learning Fundamentals#

Continual Learning Scenarios#

Mammoth Lite supports only a subset of the common continual learning scenarios:

Class-Incremental Learning (Class-IL): New classes are introduced in each task, and the model must classify among all seen classes without knowing task boundaries at test time.
Task-Incremental Learning (Task-IL): New classes are introduced in each task, but the task identity is known at test time, so the model only needs to classify among classes within that task.

Check out the awesome Three scenarios for continual learning paper for more details on these scenarios.

The Catastrophic Forgetting Problem#

When neural networks learn new tasks, they tend to forget previously learned information. This usually happens because:

Weight Interference: New task learning overwrites weights important for previous tasks
Feature Drift: Shared representations shift to accommodate new data
Output Bias: Decision boundaries change to favor recent training data

Mammoth Lite Framework Components#

Models (Continual Learning Algorithms)#

Models in Mammoth Lite implement different strategies to mitigate catastrophic forgetting:

Regularization-based Methods: Add penalties to prevent important weights from changing too much. This includes methods such as Elastic Weight Consolidation (here, you can find it in its less heavy “online” version: –model=ewc_on) and Learning without Forgetting (–model=lwf).
Replay-based Methods: Store and replay examples from previous tasks while learning new ones. This includes methods like Experience Replay (–model=er) and Dark Experience Replay (–model=der).
Architectural Methods: Allocate different network parameters for different tasks. This includes methods like Progressive Neural Networks (–model=pnn) and PackNet (not available yet).
Pretrain-based Methods: Use pretraining on a large dataset before continual learning. This includes methods like Learning to Prompt for Continua (–model=l2p) and CODA-Prompt (–model=coda_prompt).
Mixed Methods: Combine multiple strategies for better performance. The vast majority of methods found in literature are mixed, such as Slow Learner with Classifier Alignment (–model=slca), which combines regularization from a pretrained network and generative replay.

All models inherit from the ContinualModel base class and implement the observe method. Models can be registered with the framework using the @register_model decorator:

@register_model('my-model')
class MyContinualModel(ContinualModel):
    def observe(self, inputs, labels, not_aug_inputs, epoch=None) -> float:
    """
    Process a batch of training data.

    Args:
        inputs: input data processed with data augmentation
        labels: target labels
        not_aug_inputs: non-augmented input data
        epoch: current epoch number (optional)
    """
    # Implement your continual learning algorithm here
    pass

Datasets (Continual Learning Benchmarks)#

Datasets in Mammoth Lite split traditional datasets into sequential tasks:

Sequential CIFAR-10 CIFAR-10 split into 5 tasks of 2 classes each:

Task 1: airplane, automobile
Task 2: bird, cat
Task 3: deer, dog
Task 4: frog, horse
Task 5: ship, truck

Datasets inherit from ContinualDataset and define:

Task structure: How data is split across tasks (e.g., number of classes per task and number of tasks)
Transformations: Data augmentation and normalization
get_data_loaders: Method to return the COMPLETE train and test datasets, which will be split into tasks automatically during training.

In addition, datasets must be registered with the framework using the @register_dataset decorator.

@register_dataset('my-dataset')
class MyDataset(ContinualDataset):
    NAME = 'my-dataset'
    SETTING = 'class-il'  # or 'task-il', 'domain-il'
    N_CLASSES_PER_TASK = 2
    N_TASKS = 5

    def get_data_loaders(self):
    """Return train and test loaders for current task."""
    pass

Backbones (Neural Network Architectures)#

Backbones define the neural network architecture used for feature extraction:

For example, Mammoth Lite includes: - ResNet18: A popular convolutional neural network architecture - Vision Transformer: A transformer-based architecture for image classification

Backbones must inherit from MammothBackbone and can be registered with the framework using the @register_backbone decorator.

class MyBackbone(MammothBackbone):
    def __init__(self, num_classes, param1, param2):
    super().__init__()
    # Define your architecture

    def forward(self, x, returnt=ReturnTypes.OUT):
    """
    Forward pass with flexible return types.

    Args:
        x: Input tensor
        returnt: What to return (OUT, FEATURES, etc.)
    """
    pass

 @register_backbone('my-backbone-v1')
 def my_backbone(num_classes):
     return MyBackbone(num_classes, param1=10, param2=20)

 @register_backbone('my-backbone-v2')
 def my_backbone(num_classes):
     return MyBackbone(num_classes, param1=50, param2=100)

Training Process#

Task-by-Task Learning#

Mammoth Lite trains models sequentially on each task:

Task Setup: Load data for current task
Training Loop: Train for specified epochs
Evaluation: Test on all seen tasks
Task Transition: Move to next task

for task_id in range(dataset.N_TASKS):

    # Hook to initialize the model for the current task, if needed
    model.begin_task(dataset)

    # Training loop
    for epoch in range(args.n_epochs):

    # Hook to initialize stuff for the current epoch
    model.begin_epoch(epoch, dataset)

    for batch in dataset.train_loader:
        model.observe(batch.inputs, batch.labels, batch.not_aug_inputs)

    # Hook to handle the end of the current epoch
    model.end_epoch(epoch, dataset)

    # Hook to perform some final operations after training the task
    model.end_task(dataset)

    # Evaluate on all tasks
    accuracy = evaluate(model, dataset)

All this is taken care by the train function (in mammoth_lite/utils/training.py), which orchestrates the entire training process:

from mammoth_lite import train

# Train the model on the continual learning scenario
train(model, dataset)

Evaluation Metrics#

In Mammoth Lite, performance is measured in terms of Final Average Accuracy (FAA), which is the average accuracy across all tasks at the end of training.

More metrics, such as forgetting, are available in the full Mammoth framework.

Extending Mammoth Lite#

Adding Custom Models#

Inherit from ContinualModel
Implement the observe method
Register with @register_model('name')
Optionally define compatibility settings

Adding Custom Datasets#

Create a data source class inheriting from MammothDataset (see mammoth_lite/datasets/seq_cifar10.py for an example) - Define the dataset name, setting (e.g., ‘class-il’), number of classes per task, and number of tasks - Implement the get_data_loaders method to return train and test loaders
Create a continual dataset class inheriting from ContinualDataset
Register with @register_dataset('name')

Adding Custom Backbones#

Inherit from MammothBackbone
Implement forward pass with flexible return types
Register with @register_backbone('name')

Next Steps#

Now that you understand the core concepts:

Explore Examples: See Examples for hands-on implementations
Read API Docs: Check API Reference for detailed reference
Experiment: Try different model-dataset combinations
Contribute: Add your own models, datasets, or improvements

The examples section will show you how to implement these concepts in practice with detailed Jupyter notebooks.