docker exec -it --user root -e NVIDIA_VISIBLE_DEVICES=0,1,2,3 remote /bin/bash
Following an approach that others described on GitHub, I created a new container myself and configured it from there. With that setup, the error RuntimeError: NCCL Error 2: unhandled system error no longer occurs.
docker exec -it --user root --gpus all --ipc=host -e NVIDIA_VISIBLE_DEVICES=0,1,2,3 remote /bin/bash
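As far as I know, --gpus and --ipc are options of docker run (they apply when a container is created), and docker exec does not accept them. So the new container was presumably created with something like the sketch below. The image name my-image:latest is a hypothetical placeholder and does not come from the post; the other flags are copied from the commands above. The script only prints the two commands so they can be reviewed before running them by hand.

```shell
# Sketch only: IMAGE is a hypothetical placeholder, not taken from the post.
IMAGE="${IMAGE:-my-image:latest}"

# GPU access and host IPC (shared memory) are set when the container is created.
create_cmd="docker run -it --user root --gpus all --ipc=host -e NVIDIA_VISIBLE_DEVICES=0,1,2,3 --name remote $IMAGE /bin/bash"

# Opening another shell in the running container needs no GPU or IPC flags.
exec_cmd="docker exec -it --user root -e NVIDIA_VISIBLE_DEVICES=0,1,2,3 remote /bin/bash"

# Print both commands for review instead of executing them.
echo "$create_cmd"
echo "$exec_cmd"
```

Sharing the host IPC namespace with --ipc=host gives the container access to the host's shared memory, which NCCL uses for communication between processes; an undersized /dev/shm inside the container is a common cause of this NCCL error.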
Summary: