Actions: microsoft/DeepSpeed
Showing runs from all workflows
48,644 workflow runs
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-accelerate-v100
#3566:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-mii
#2082:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
Formatting
#6266:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-lightning-v100
#4706:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
amd-mi200
#49:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-torch19-v100
#437:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
amd-mi100
#49:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-inference
#4082:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-torch19-p40
#425:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
python
#1921:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-megatron
#1727:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-torch-latest-v100
#3571:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-transformers-v100
#4684:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-torch-latest-cpu
#1454:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
python
#1920:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-lightning-v100
#4705:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-torch19-v100
#436:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-megatron
#1726:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
amd-mi200
#48:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-torch-latest-v100
#3570:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-inference
#4081:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-mii
#2081:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
amd-mi100
#48:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-torch19-p40
#424:
Pull request #2999
synchronize
by
YizhouZ
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
nv-torch-latest-cpu
#1453:
Pull request #2999
synchronize
by
YizhouZ