CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`

I have hit this error twice, each time for a different reason:

  1. An embedding index that was out of range
  2. A vocabulary size that did not match the config

How do I find out the actual cause? If I run the same code on the CPU, the error message points to the specific location of the problem, instead of the confusing error shown in the title.
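One thing that may help when staying on the GPU is the `CUDA_LAUNCH_BLOCKING` environment variable, which makes CUDA kernel launches synchronous so that an error tends to surface nearer the call that caused it. The snippet below is a minimal sketch, assuming the variable is set before `torch` is imported and before any CUDA work starts; it is not guaranteed to produce a clearer message in every case, and running on the CPU remains the simpler check.

```python
import os

# Assumption: this runs at the very top of the script, before `import torch`
# and before any CUDA call, so the CUDA runtime sees the setting.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

launch_blocking = os.environ.get("CUDA_LAUNCH_BLOCKING")
print("CUDA_LAUNCH_BLOCKING =", launch_blocking)
```

The same effect can be had by exporting the variable in the shell before starting Python.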

Example 1

import torch
import torch.nn as nn

# 10 is num_embeddings (valid indices are 0..9), not the embedding dimension
position_embeddings = nn.Embedding(10, 128)
# Indices 1..10: the value 10 is out of range for this embedding
position_ids = torch.arange(1, 11, dtype=torch.long)
position_ids = position_ids.unsqueeze(0).expand([2, 7, 10])
x = position_embeddings(position_ids)

On the GPU, this code reports the error shown in the title. On the CPU, it reports the following error instead:

IndexError: index out of range in self

Because the embedding has only 10 rows, the valid indices are 0 to 9, so the line should be changed to

position_ids = torch.arange(0, 10, dtype=torch.long)
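A cheap way to catch this before it reaches the GPU is to compare the indices against the embedding size up front. The helper below is a minimal sketch; `find_out_of_range` is a hypothetical name, and it works on any iterable of Python integers rather than on tensors directly.

```python
def find_out_of_range(indices, num_embeddings):
    """Return the distinct indices that fall outside [0, num_embeddings)."""
    return sorted({i for i in indices if not 0 <= i < num_embeddings})


# Same values as in Example 1: an embedding with 10 rows.
bad_before = find_out_of_range(range(1, 11), 10)  # ids 1..10 (original code)
bad_after = find_out_of_range(range(0, 10), 10)   # ids 0..9 (corrected code)

print(bad_before)  # [10]
print(bad_after)   # []
```

With real tensors, one could pass `position_ids.flatten().tolist()` and `position_embeddings.num_embeddings` to the same helper.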

Example 2

The size of my vocabulary file vocab.txt is 21128. I added a character to the vocabulary but forgot to update the vocabulary size in the config, so token ids could exceed the size of the embedding table and the error shown in the title was reported.
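This kind of mismatch can be checked before training by counting the entries in the vocabulary file and comparing the count with the size stored in the config. The sketch below assumes a file with one token per line and no blank entries; `count_vocab_entries` and `vocab_matches_config` are hypothetical helper names, and the tiny five-token file is made up for the demonstration.

```python
import os
import tempfile


def count_vocab_entries(vocab_path):
    """Count non-empty lines in a one-token-per-line vocabulary file."""
    with open(vocab_path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def vocab_matches_config(vocab_path, config_vocab_size):
    """Return True if the file has exactly config_vocab_size entries."""
    return count_vocab_entries(vocab_path) == config_vocab_size


# Demonstration with a throwaway vocabulary file of five tokens.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vocab.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(["[PAD]", "[UNK]", "a", "b", "c"]) + "\n")

    n_entries = count_vocab_entries(path)
    stale_ok = vocab_matches_config(path, 4)  # config not updated after adding a token
    fresh_ok = vocab_matches_config(path, 5)  # config updated

print(n_entries, stale_ok, fresh_ok)
```

In a real project, the second argument would be whatever vocabulary size value the model config holds.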

