RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
This happened because I was using torch.nn.CrossEntropyLoss incorrectly: the number of classes (dimension 1 of pred) did not match the labels. The following reproduces my setup:
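import torch

# pred: (N, C, H, W) with C = 1, so the only valid class index is 0
pred = torch.zeros((128, 1, 128, 128))
# labels: (N, H, W)
labels = torch.zeros((128, 128, 128))
loss_fn = torch.nn.CrossEntropyLoss()
loss = loss_fn(pred, labels)
loss.backward()  # <-- error here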
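For comparison, this version runs without the error, because SmoothL1Loss only needs pred and labels to have the same shape:
# requires_grad=True so that loss.backward() has something to differentiate
pred = torch.zeros((128, 1, 128, 128), requires_grad=True)
labels = torch.zeros((128, 128, 128))
loss_fn = torch.nn.SmoothL1Loss()
# drop the singleton channel dimension so pred matches labels
loss = loss_fn(pred.view(128, 128, 128), labels)
loss.backward()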
This is only meant to illustrate the problem. I am not suggesting that you simply swap CrossEntropyLoss for an L1-style loss; the right fix depends on your actual task.
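If cross-entropy is the loss you actually need, a minimal sketch of a shape-consistent call could look like the following. The class count of 3, the smaller tensor sizes, and the helper name check_targets are assumptions made up for illustration; they are not part of my original code.

```python
import torch

def check_targets(logits, targets):
    """Raise ValueError if targets cannot serve as class indices for logits.

    Hypothetical helper for illustration; it does not account for ignore_index.
    """
    num_classes = logits.shape[1]  # CrossEntropyLoss reads the class count from dim 1
    if targets.dtype != torch.long:
        raise ValueError(f"targets must be torch.long, got {targets.dtype}")
    lo, hi = int(targets.min()), int(targets.max())
    if lo < 0 or hi >= num_classes:
        raise ValueError(
            f"targets span [{lo}, {hi}], expected [0, {num_classes - 1}]"
        )

num_classes = 3  # assumed value for illustration
pred = torch.randn(4, num_classes, 8, 8, requires_grad=True)  # (N, C, H, W)
labels = torch.randint(0, num_classes, (4, 8, 8))             # (N, H, W), int64

check_targets(pred, labels)  # fails early with a readable message
loss = torch.nn.CrossEntropyLoss()(pred, labels)
loss.backward()
```

Running a check like this on the CPU side before the loss call turns a vague device-side assert into an ordinary Python exception that names the offending range.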
Also, this RuntimeError message is not very informative, and I am surely not the only one who runs into it. If you hit it in a different situation, feel free to share it and we can compare notes.
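Because the CUDA message does not say which value was bad, one way to get a more readable error is to run the same loss call on a small CPU tensor, where the problem is reported synchronously at the failing call. The sketch below assumes a 2-class problem with one deliberately invalid label; the sizes and values are made up for illustration. It also sets CUDA_LAUNCH_BLOCKING=1, as the error message suggests, which should be done before CUDA is initialized, so it is set before importing torch here.

```python
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # set before CUDA is initialized

import torch

logits = torch.randn(4, 2)            # (N, C) with C = 2 classes, on the CPU
targets = torch.tensor([0, 1, 2, 1])  # 2 is out of range for 2 classes

message = None
try:
    torch.nn.CrossEntropyLoss()(logits, targets)
except (IndexError, RuntimeError) as err:
    # On the CPU the failure surfaces right here, at the loss call
    message = str(err)

print(message)
```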