cuDNN failed to initialize #8

Open
Al3n70rn opened this issue Jan 21, 2022 · 0 comments

Hi,

Thanks for this very interesting package.

I have slightly modified your notebook to use multiclass masks. Everything seems to work properly, but when I try to fit the model I get this error:

Training on 1 GPUs.
Epoch 1/10

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/indexed_slices.py:449: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradient_tape/mask_loss/cond/gradients/mask_loss/cond/map/while_grad/gradients/mask_loss/cond/map/while/GatherNd_grad/Squeeze:0", shape=(None,), dtype=int64), values=Tensor("gradient_tape/mask_loss/cond/gradients/mask_loss/cond/map/while_grad/gradients/AddN_3:0", shape=(None, None), dtype=float32), dense_shape=Tensor("mask_loss/cond/map/while/gradient_tape/mask_loss/cond/gradients/mask_loss/cond/map/while_grad/gradients/mask_loss/cond/map/while/GatherNd_grad/Shape:0", shape=(2,), dtype=int64))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "shape. This may consume a large amount of memory." % value)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/indexed_slices.py:449: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradient_tape/mask_loss/cond/gradients/mask_loss/cond/map/while_grad/gradients/mask_loss/cond/map/while/GatherNd_1_grad/Squeeze:0", shape=(None,), dtype=int64), values=Tensor("gradient_tape/mask_loss/cond/gradients/mask_loss/cond/map/while_grad/gradients/mask_loss/cond/map/while/transpose_grad/transpose:0", shape=(None, 28, 28, None), dtype=float32), dense_shape=Tensor("mask_loss/cond/map/while/gradient_tape/mask_loss/cond/gradients/mask_loss/cond/map/while_grad/gradients/mask_loss/cond/map/while/GatherNd_1_grad/Shape:0", shape=(4,), dtype=int64))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "shape. This may consume a large amount of memory." % value)

WARNING:tensorflow:Gradients do not exist for variables ['conv_0_semantic_upsample_2/kernel:0', 'conv_0_semantic_upsample_2/bias:0', 'conv_0_semantic_upsample_3/kernel:0', 'conv_0_semantic_upsample_3/bias:0', 'conv_0_semantic_upsample_4/kernel:0', 'conv_0_semantic_upsample_4/bias:0', 'conv_0_semantic_upsample_5/kernel:0', 'conv_0_semantic_upsample_5/bias:0', 'conv_0_semantic_upsample_6/kernel:0', 'conv_0_semantic_upsample_6/bias:0', 'conv_0_semantic_2/kernel:0', 'conv_0_semantic_2/bias:0', 'conv_0_semantic_3/kernel:0', 'conv_0_semantic_3/bias:0', 'conv_0_semantic_4/kernel:0', 'conv_0_semantic_4/bias:0', 'conv_0_semantic_5/kernel:0', 'conv_0_semantic_5/bias:0', 'conv_0_semantic_6/kernel:0', 'conv_0_semantic_6/bias:0', 'batch_normalization_0_semantic_2/gamma:0', 'batch_normalization_0_semantic_2/beta:0', 'batch_normalization_0_semantic_3/gamma:0', 'batch_normalization_0_semantic_3/beta:0', 'batch_normalization_0_semantic_4/gamma:0', 'batch_normalization_0_semantic_4/beta:0', 'batch_normalization_0_semantic_5/gamma:0', 'batch_normalization_0_semantic_5/beta:0', 'batch_normalization_0_semantic_6/gamma:0', 'batch_normalization_0_semantic_6/beta:0', 'conv_1_semantic_2/kernel:0', 'conv_1_semantic_2/bias:0', 'conv_1_semantic_3/kernel:0', 'conv_1_semantic_3/bias:0', 'conv_1_semantic_4/kernel:0', 'conv_1_semantic_4/bias:0', 'conv_1_semantic_5/kernel:0', 'conv_1_semantic_5/bias:0', 'conv_1_semantic_6/kernel:0', 'conv_1_semantic_6/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['conv_0_semantic_upsample_2/kernel:0', 'conv_0_semantic_upsample_2/bias:0', 'conv_0_semantic_upsample_3/kernel:0', 'conv_0_semantic_upsample_3/bias:0', 'conv_0_semantic_upsample_4/kernel:0', 'conv_0_semantic_upsample_4/bias:0', 'conv_0_semantic_upsample_5/kernel:0', 'conv_0_semantic_upsample_5/bias:0', 'conv_0_semantic_upsample_6/kernel:0', 'conv_0_semantic_upsample_6/bias:0', 'conv_0_semantic_2/kernel:0', 'conv_0_semantic_2/bias:0', 'conv_0_semantic_3/kernel:0', 'conv_0_semantic_3/bias:0', 'conv_0_semantic_4/kernel:0', 'conv_0_semantic_4/bias:0', 'conv_0_semantic_5/kernel:0', 'conv_0_semantic_5/bias:0', 'conv_0_semantic_6/kernel:0', 'conv_0_semantic_6/bias:0', 'batch_normalization_0_semantic_2/gamma:0', 'batch_normalization_0_semantic_2/beta:0', 'batch_normalization_0_semantic_3/gamma:0', 'batch_normalization_0_semantic_3/beta:0', 'batch_normalization_0_semantic_4/gamma:0', 'batch_normalization_0_semantic_4/beta:0', 'batch_normalization_0_semantic_5/gamma:0', 'batch_normalization_0_semantic_5/beta:0', 'batch_normalization_0_semantic_6/gamma:0', 'batch_normalization_0_semantic_6/beta:0', 'conv_1_semantic_2/kernel:0', 'conv_1_semantic_2/bias:0', 'conv_1_semantic_3/kernel:0', 'conv_1_semantic_3/bias:0', 'conv_1_semantic_4/kernel:0', 'conv_1_semantic_4/bias:0', 'conv_1_semantic_5/kernel:0', 'conv_1_semantic_5/bias:0', 'conv_1_semantic_6/kernel:0', 'conv_1_semantic_6/bias:0'] when minimizing the loss.

---------------------------------------------------------------------------

UnknownError                              Traceback (most recent call last)

<ipython-input-86-144b2c018829> in <module>()
     30     validation_data=val_data,
     31     validation_steps=val_data.y.shape[0] // batch_size,
---> 32     callbacks=train_callbacks)

6 frames

/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node resnet50_retinanetmask/conv1_conv/Conv2D (defined at <ipython-input-86-144b2c018829>:32) ]] [Op:__inference_train_function_125016]

Function call stack:
train_function

Do you have any clues on how to handle this issue?

Best regards,
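
Note: the error text above points at cuDNN initialization. One frequent cause on a shared or memory-constrained GPU is TensorFlow reserving all device memory up front, so enabling memory growth before any model is built is a common first thing to try. The sketch below is an assumption about the cause, not a confirmed fix for this notebook. `enable_gpu_memory_growth` is a hypothetical helper name; `tf.config.list_physical_devices` and `tf.config.experimental.set_memory_growth` are existing TensorFlow 2.x calls.

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of all at once.

    Returns the names of the GPUs that were configured (empty list if none).
    """
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment; nothing to configure.
        return []

    configured = []
    for gpu in tf.config.list_physical_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured.append(gpu.name)
        except RuntimeError:
            # Raised when the GPU has already been initialized;
            # restart the runtime and call this helper first.
            pass
    return configured


print("Memory growth enabled on:", enable_gpu_memory_growth())
```

This has to run in a fresh runtime, before the model is created or `fit` is called. If the error persists, the next things to check are any warnings printed before the traceback (as the error text itself suggests) and whether the installed CUDA/cuDNN versions match the ones the TensorFlow build expects.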
