RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

created at 09-27-2021

I ran into this error while using the open-source framework pytorch-deeplab-xception from GitHub to perform semantic segmentation on my own data set. It took me two days of debugging to climb out of this pit.
Going by the text of the error message, the solutions I found all said to reinstall cuDNN, PyTorch, and the graphics card driver so that their versions match. I did that, and after a lot of wasted effort I found it made no difference.

Because the error was raised in the loss function, I suspected the input dimensions. I tried reshaping the input label and the network output to three dimensions, four dimensions, and so on, but the error persisted.

Finally, in desperation, I printed the label and found that its values were outside the valid range. My data set has nine categories, so the label values should be 0~8, and the network requires a single-channel label. I had first written the label image with three channels, each holding values between 0 and 8 (the values on the three channels are identical at every position), and then used the convert function in Image to turn it into a single-channel image as the final label.
I never expected that this convert step would change the values of the image. When I finally printed them, the pixel values of these single-channel images ranged from 0 to 255. This really surprised me: I had assumed that converting three channels to one would simply take one of the layers without changing its values, not shift the value range this much.
So the network expects labels between 0 and 8, the labels I fed it fell outside that range, and the error was reported.
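
To illustrate why such labels break the loss, here is a minimal NumPy sketch of a pixel-wise negative log-likelihood. It is not the framework's code: `pixelwise_nll` is a hypothetical helper, and the explicit range check only stands in for the failure that shows up on the GPU. The loss has to look up one class score per pixel using the label as an index, so a label of 200 has nothing to look up when only nine scores exist.

```python
import numpy as np


def pixelwise_nll(log_probs, labels):
    """Mean negative log-likelihood for (C, H, W) log-probabilities and (H, W) labels."""
    num_classes = log_probs.shape[0]
    labels = np.asarray(labels, dtype=np.int64)
    if labels.min() < 0 or labels.max() >= num_classes:
        raise IndexError(
            f"labels span {labels.min()}..{labels.max()}, "
            f"but only classes 0..{num_classes - 1} exist"
        )
    # For every pixel, pick the log-probability of its labelled class.
    picked = np.take_along_axis(log_probs, labels[None, ...], axis=0)[0]
    return -picked.mean()


# Nine classes with uniform predictions on a 2x2 image (illustrative values only).
log_probs = np.log(np.full((9, 2, 2), 1.0 / 9.0))
print(pixelwise_nll(log_probs, np.array([[0, 8], [3, 5]])))

try:
    pixelwise_nll(log_probs, np.array([[0, 200], [3, 5]]))
except IndexError as err:
    print("out-of-range label:", err)
```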

problem causes

When this problem occurs, carefully check the dimensions and value range of the label first, rather than immediately reinstalling cuDNN, PyTorch, and so on; otherwise you may take a long detour.
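
A check like the following could be run on every label before training. It is a minimal sketch: `check_label` is a hypothetical helper, `NUM_CLASSES = 9` matches my data set, and `IGNORE_INDEX = 255` is an assumption about the value the loss is configured to skip, so adjust or drop it for your setup.

```python
import numpy as np

NUM_CLASSES = 9      # assumption: nine categories, labelled 0..8
IGNORE_INDEX = 255   # assumption: value the loss is configured to skip, if any


def check_label(label, num_classes=NUM_CLASSES, ignore_index=IGNORE_INDEX):
    """Raise ValueError if a label mask has the wrong shape or out-of-range values."""
    label = np.asarray(label)
    if label.ndim != 2:
        raise ValueError(f"expected a single-channel (H, W) label, got shape {label.shape}")
    values = np.unique(label)
    valid = ((values >= 0) & (values < num_classes)) | (values == ignore_index)
    bad = values[~valid]
    if bad.size > 0:
        raise ValueError(f"label values outside 0..{num_classes - 1}: {bad.tolist()}")
    return values


# A valid 2x2 label passes and returns the distinct values it contains.
print(check_label(np.array([[0, 3], [8, 255]], dtype=np.uint8)))
```

Running this on the label files would have pointed at the value range straight away, before any reinstalling.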

solution

Convert the three-channel image into a numpy array, take only one of the channels, and write it out as an image file to use as the label, instead of using the Image convert function directly to turn the three-channel, 24-bit data into single-channel, 8-bit data. In my case the latter changed the value range.
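
The steps above could look roughly like this, assuming Pillow and NumPy are installed and that all three channels really hold the same class index. The function names and file paths are hypothetical, not part of the framework.

```python
import numpy as np
from PIL import Image


def rgb_label_to_single_channel(rgb_label):
    """Return channel 0 of an (H, W, 3) label array as an (H, W) uint8 array."""
    arr = np.asarray(rgb_label)
    if arr.ndim != 3 or arr.shape[2] < 3:
        raise ValueError(f"expected an (H, W, 3) array, got shape {arr.shape}")
    # Taking one channel is only valid if all three hold the same class index.
    same = np.array_equal(arr[..., 0], arr[..., 1]) and np.array_equal(arr[..., 0], arr[..., 2])
    if not same:
        raise ValueError("channels differ, so a single channel is not a valid label")
    return np.ascontiguousarray(arr[..., 0], dtype=np.uint8)


def convert_label_file(src_path, dst_path):
    """Read a three-channel label image and write a single-channel PNG label."""
    with Image.open(src_path) as img:
        arr = np.array(img)
    single = rgb_label_to_single_channel(arr)
    # PNG is lossless; a lossy format such as JPEG would alter the class indices.
    Image.fromarray(single).save(dst_path, format="PNG")
    return single


# In-memory demo: a 2x2 label whose three channels are identical.
demo = np.stack([np.array([[0, 1], [7, 8]], dtype=np.uint8)] * 3, axis=-1)
print(np.unique(rgb_label_to_single_channel(demo)))
```

For files on disk, a call such as `convert_label_file("label_rgb.png", "label.png")` (both paths made up here) writes the single-channel label. Printing `np.unique` of the result is a quick way to confirm the values are still 0~8.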

edited at: 09-27-2021