Tensorflow: InvalidArgumentError: Graph execution error:2 root error(s) found

created at 03-20-2022 views: 418

problem

  • Platform: google colab
  • Use GPU
  • When training the dl model, an error is encountered
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-56-09c5afe5a363> in <module>()
----> 1 text_vectorization.adapt(text_only_train_ds)
      2 
      3 tfidf_2gram_train_ds = train_ds.map(
      4     lambda x, y: (text_vectorization(x), y),
      5     num_parallel_calls=4)

3 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     53     ctx.ensure_initialized()
     54     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 55                                         inputs, attrs, num_outputs)
     56   except core._NotOkStatusException as e:
     57     if name is not None:

InvalidArgumentError: Graph execution error:

2 root error(s) found.
  (0) INVALID_ARGUMENT:  During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
     [[{{node map/TensorArrayUnstack/TensorListFromTensor/_42}}]]
     [[Func/map/while/body/_1/input/_48/_72]]
  (1) INVALID_ARGUMENT:  During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
     [[{{node map/TensorArrayUnstack/TensorListFromTensor/_42}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_adapt_step_194806]

The above error is generated by the following code

text_vectorization.adapt(text_only_train_ds)

tfidf_2gram_train_ds = train_ds.map(
    lambda x, y: (text_vectorization(x), y),
    num_parallel_calls=4)
tfidf_2gram_val_ds = val_ds.map(
    lambda x, y: (text_vectorization(x), y),
    num_parallel_calls=4)
tfidf_2gram_test_ds = test_ds.map(
    lambda x, y: (text_vectorization(x), y),
    num_parallel_calls=4)

model = get_model()
model.summary()
callbacks = [
    keras.callbacks.ModelCheckpoint("tfidf_2gram.keras",
                                    save_best_only=True)
]
model.fit(tfidf_2gram_train_ds.cache(),
          validation_data=tfidf_2gram_val_ds.cache(),
          epochs=10,
          callbacks=callbacks)
model = keras.models.load_model("tfidf_2gram.keras")
print(f"Test acc: {model.evaluate(tfidf_2gram_test_ds)[1]:.3f}")

solution

The current guess is that training should be done with CPUs, not GPUs. Sure enough, change runtime type to None

CPU

Because the data preprocessing here uses the CPU, and the specific calculation uses the GPU, the data set should be done by the CPU.

created at:03-20-2022
edited at: 03-20-2022: