advsecurenet.distributed package
advsecurenet.distributed.ddp_base_task module
- class advsecurenet.distributed.ddp_base_task.DDPBaseTask(model: BaseModel, rank: int, world_size: int)
Bases: object
Base class for DistributedDataParallel tasks.
- Parameters:
model (BaseModel) – The model to be used.
rank (int) – The rank of the current process.
world_size (int) – The total number of processes.
advsecurenet.distributed.ddp_coordinator module
- class advsecurenet.distributed.ddp_coordinator.DDPCoordinator(ddp_func, world_size, *args, **kwargs)
Bases: object
The generic DDP coordinator class. This class is used to train a model using DistributedDataParallel.
- ddp_setup(rank: int)
DDP setup function. This function is called by each process to set up the DDP environment. It sets the master address and port and initializes the process group. A free port on the machine is found automatically and used as the master port.
The default backend is nccl.
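The setup steps described above can be illustrated with a minimal sketch. It is not the advsecurenet implementation: find_free_port, set_master_env, and ddp_setup_sketch are hypothetical names, and the use of "localhost" as the master address is an assumption. The only PyTorch call used is torch.distributed.init_process_group, which reads MASTER_ADDR and MASTER_PORT from the environment when no init_method is given.

```python
import os
import socket


def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0 (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("localhost", 0))
        # getsockname() returns (host, port); the OS has filled in the port.
        return sock.getsockname()[1]


def set_master_env(port: int, addr: str = "localhost") -> None:
    """Export the rendezvous address and port read by torch.distributed."""
    os.environ["MASTER_ADDR"] = addr  # assumption: single-machine training
    os.environ["MASTER_PORT"] = str(port)


def ddp_setup_sketch(rank: int, world_size: int, port: int,
                     backend: str = "nccl") -> None:
    """Hypothetical stand-in for ddp_setup; requires PyTorch to be installed."""
    # Imported lazily so the helpers above remain usable without PyTorch.
    import torch.distributed as dist

    set_master_env(port)
    dist.init_process_group(backend=backend, rank=rank, world_size=world_size)
```

In a sketch like this, the port would need to be chosen once, before the worker processes start, so that every rank agrees on the same MASTER_PORT. The nccl backend expects CUDA devices; on a CPU-only machine, passing backend="gloo" is the usual alternative.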
- run()
Spawn the processes for DDP training.
- run_process(rank: int)
Set up DDP and call the training function.
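How run() and run_process() might fit together is sketched below. CoordinatorSketch is a hypothetical, simplified class written for illustration only. In particular, calling the training function as ddp_func(rank, world_size, *args, **kwargs) is an assumption, as is the use of torch.multiprocessing.spawn to start the workers; the source does not state either detail.

```python
from typing import Any, Callable


class CoordinatorSketch:
    """Hypothetical illustration of the coordinator pattern, not the library class."""

    def __init__(self, ddp_func: Callable[..., Any], world_size: int,
                 *args: Any, **kwargs: Any) -> None:
        self.ddp_func = ddp_func
        self.world_size = world_size
        self.args = args
        self.kwargs = kwargs

    def ddp_setup(self, rank: int) -> None:
        # Placeholder: a real version would set MASTER_ADDR / MASTER_PORT
        # and call torch.distributed.init_process_group here.
        pass

    def run_process(self, rank: int) -> Any:
        # Set up the process group first, then hand control to the training function.
        self.ddp_setup(rank)
        # Assumed calling convention for the training function.
        return self.ddp_func(rank, self.world_size, *self.args, **self.kwargs)

    def run(self) -> None:
        # Imported lazily so the class can be exercised without PyTorch installed.
        import torch.multiprocessing as mp

        # spawn() calls run_process(i) for i in range(nprocs), one process per rank.
        mp.spawn(self.run_process, nprocs=self.world_size, join=True)
```

Because spawn() pickles the target before starting each worker, the training function passed to a class like this would have to be defined at module level rather than as a lambda or nested function.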