DISTRIBUTED#
Distributed utilities for parallel processing.
Supports both Distributed Data Parallel (DDP) and Data Parallel (DP) models.
Examples
>>> from utils.distributed import make_ddp, make_dp
>>> model = make_ddp(model)  # for DDP
>>> model = make_dp(model)   # for DP
Note:
- DDP is not applicable to rehearsal methods (see make_ddp for more details).
- When using DDP, you might need the wait_for_master function.
Synchronization before and after training is handled automatically.
Classes#
- class utils.distributed.CustomDP(module, device_ids=None, output_device=None, dim=0)[source]#
Bases:
DataParallel
Custom DataParallel class that forwards the attributes listed in intercept_names directly to the wrapped model, so they can be accessed without going through .module.
- intercept_names = ['classifier', 'num_classes', 'set_return_prerelu']#
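The attribute-forwarding behavior above can be sketched in plain Python. This is an illustrative pattern, not the library's actual implementation (the real CustomDP subclasses torch.nn.DataParallel); InterceptingWrapper and TinyModel are hypothetical names:

```python
class InterceptingWrapper:
    """Forward the names in intercept_names to the wrapped module, so
    callers can write wrapper.num_classes instead of wrapper.module.num_classes."""

    intercept_names = ['classifier', 'num_classes', 'set_return_prerelu']

    def __init__(self, module):
        self.module = module

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name in self.intercept_names:
            return getattr(self.module, name)
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Writes to intercepted names pass through to the inner module.
        if name in self.intercept_names:
            setattr(self.module, name, value)
        else:
            super().__setattr__(name, value)


class TinyModel:
    def __init__(self):
        self.num_classes = 10
        self.classifier = 'linear-head'


wrapped = InterceptingWrapper(TinyModel())
print(wrapped.num_classes)         # forwarded to the inner model -> 10
wrapped.num_classes = 20           # write-through to the inner model
print(wrapped.module.num_classes)  # -> 20
```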
Functions#
- utils.distributed.make_ddp(model)[source]#
Create a DistributedDataParallel (DDP) model.
Note: DDP is not applicable to rehearsal methods (e.g., GEM, A-GEM, ER, etc.). This is because DDP leaves the rehearsal buffer unsynchronized: each process fills its own replica from a different data shard, so the replicas diverge. Ad-hoc solutions are possible, but they are not implemented here.
- Parameters:
model (Module) – The model to be wrapped with DDP.
- Returns:
The DDP-wrapped model.
- Return type:
Module
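The note above about rehearsal buffers can be made concrete with a small simulation. This is not the library's code; it only mimics DDP-style data sharding with two hypothetical ranks to show that unsynchronized per-process buffers end up holding different samples:

```python
# Why rehearsal buffers break under DDP: each process sees only its
# shard of the data stream, so a buffer filled locally on each rank
# diverges from the buffers on every other rank.

def run_rank(rank, num_ranks, stream, buffer_size=4):
    """Fill a rehearsal buffer on one simulated rank."""
    buffer = []
    shard = stream[rank::num_ranks]  # DDP-style round-robin sharding
    for sample in shard:
        if len(buffer) < buffer_size:
            buffer.append(sample)
    return buffer

stream = list(range(16))
buffers = [run_rank(r, 2, stream) for r in range(2)]
print(buffers)  # -> [[0, 2, 4, 6], [1, 3, 5, 7]]: the ranks disagree
```

Keeping the replicas consistent would require explicitly broadcasting or gathering buffer contents across processes, which is the ad-hoc synchronization the note says is not implemented.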
- utils.distributed.make_dp(model)[source]#
Create a DataParallel (DP) model.
- Parameters:
model – The model to be wrapped with DP.
- Returns:
The DP-wrapped model.
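A minimal sketch of what a make_dp-style helper does, assuming it wraps the model in a DataParallel subclass; the function name make_dp_sketch and the replication defaults are assumptions, not the library's exact implementation:

```python
import torch
import torch.nn as nn

def make_dp_sketch(model: nn.Module) -> nn.DataParallel:
    # DataParallel replicates the model across all visible GPUs on each
    # forward pass; on a CPU-only host, device_ids is empty and the call
    # falls through to the wrapped module unchanged.
    return nn.DataParallel(model)

model = make_dp_sketch(nn.Linear(4, 2))
out = model(torch.randn(3, 4))
print(out.shape)  # -> torch.Size([3, 2])

# The original module stays reachable via .module; CustomDP exists
# precisely to avoid this indirection for the intercept_names attributes.
print(type(model.module))
```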