
In a distributed training environment, which NCCL primitive is required to combine gradient updates from all GPUs into a single synchronized result across the cluster?



The NCCL primitive required to combine gradient updates from all GPUs into a single synchronized result is AllReduce. In distributed training, each GPU calculates a partial gradient based on its local batch of data. Because each GPU sees only its own shard of the data, these partial gradients must be summed (or averaged) and the combined result redistributed to every GPU, so that all replicas apply an identical update and the model weights stay synchronized across the cluster. AllReduce performs exactly this combine-and-broadcast in a single collective operation; NCCL typically implements it with bandwidth-efficient algorithms such as ring AllReduce over NVLink or the network.
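To make the mechanics concrete, here is a minimal single-process sketch of the ring AllReduce algorithm NCCL commonly uses (a reduce-scatter phase followed by an all-gather phase). The function name and the sequential inner loops are illustrative assumptions: in real NCCL the n ranks exchange chunks simultaneously over the interconnect, while this simulation steps through the ranks one at a time.

```python
def ring_allreduce(grads):
    """Simulate ring AllReduce in one process (illustrative sketch).

    grads: one equal-length gradient list per simulated rank; the
    vector length must divide evenly by the number of ranks.
    Returns per-rank buffers that all hold the element-wise sum.
    """
    n = len(grads)                     # number of simulated GPUs
    size = len(grads[0])
    assert size % n == 0, "vector length must split into n equal chunks"
    k = size // n                      # elements per chunk
    buf = [list(g) for g in grads]     # each rank's working buffer

    def sl(c):                         # slice covering chunk index c
        return slice((c % n) * k, (c % n) * k + k)

    # Phase 1: reduce-scatter. At step s, rank r sends chunk (r - s)
    # to rank (r + 1), which adds it into its own copy. After n - 1
    # steps, rank r holds the fully summed chunk (r + 1) % n.
    for s in range(n - 1):
        for r in range(n):
            c, dst = sl((r - s) % n), (r + 1) % n
            for i in range(c.start, c.stop):
                buf[dst][i] += buf[r][i]

    # Phase 2: all-gather. Each rank forwards its completed chunk
    # around the ring until every rank has every summed chunk.
    for s in range(n - 1):
        for r in range(n):
            c, dst = sl((r + 1 - s) % n), (r + 1) % n
            buf[dst][c] = buf[r][c]
    return buf


grads = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
synced = ring_allreduce(grads)
print(synced[0])  # every rank now holds [111, 222, 333]
```

Each rank sends and receives only 2 * (n - 1) / n of the full vector in total, which is why the ring variant is bandwidth-optimal regardless of cluster size. In practice, frameworks then divide the summed gradient by the world size to get the average before the optimizer step.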
