
invalid device ordinal #74
Open · wq409813230 opened this issue Jun 22, 2017 · 6 comments

wq409813230 commented Jun 22, 2017

I have installed all the dependencies this repository needs, but something goes wrong when running the command below:

th test.lua -input_image /data/artwork/content/huaban.jpeg -model_t7 data/checkpoints/model.t7 -gpu 0

```
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-3022/cutorch/init.c line=734 error=10 : invalid device ordinal
/root/AI/torch/install/bin/luajit: test.lua:26: cuda runtime error (10) : invalid device ordinal at /tmp/luarocks_cutorch-scm-1-3022/cutorch/init.c:734
stack traceback:
[C]: in function 'setDevice'
test.lua:26: in main chunk
[C]: in function 'dofile'
...t/AI/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406670
```
Below is my GPU info:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26 Driver Version: 375.26 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro P5000 Off | 0000:03:00.0 On | Off |
| 26% 39C P8 8W / 180W | 110MiB / 16264MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1387 G /usr/lib/xorg/Xorg 108MiB |
+-----------------------------------------------------------------------------+
```
I really have no idea where the problem is.
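
For reference, a quick check like the sketch below (nothing here is taken from test.lua; it only assumes the cutorch package loads) shows how many devices the Torch side actually sees and reproduces the error when the device index is out of range:

```lua
-- Diagnostic sketch, run in the `th` REPL; assumes only that cutorch is installed.
-- It prints how many devices cutorch can see and then reproduces the
-- "invalid device ordinal" error with an out-of-range index.
require 'cutorch'

print('devices visible to cutorch: ' .. cutorch.getDeviceCount())
print('device 1: ' .. cutorch.getDeviceProperties(1).name)

-- cutorch device indices are 1-based, so 0 is always out of range:
local ok, err = pcall(cutorch.setDevice, 0)
print(ok, err)  -- expected: false plus an "invalid device ordinal" message
```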

DmitryUlyanov (Owner) commented Jun 22, 2017 via email

wq409813230 (Author) commented Jun 22, 2017

Hi Dmitry, thank you for your reply, but it still fails when I omit the -gpu argument. What confuses me is that chainer-fast-neuralstyle, which is implemented in Python, also has a '-gpu' argument, and it runs fine when I set -gpu 0.
[screenshot: qq 20170622160836]


engahmed1190 commented Oct 19, 2017

Hi, does this issue still persist? Has anyone found a solution for it?


gxlcliqi commented Nov 6, 2017

The GPU index starts from 1, so please try the option -gpu 1 instead of -gpu 0.
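
In other words, if a script exposes a 0-based -gpu flag (matching the numbering nvidia-smi uses), the value has to be shifted by one before it reaches cutorch. A minimal sketch, using a hypothetical opt table for the parsed flag (this is not the repository's actual test.lua code):

```lua
-- Minimal sketch, assuming a 0-based "-gpu" flag parsed into a hypothetical
-- opt table; not the repository's actual code.
require 'cutorch'

local opt = { gpu = 0 }             -- what "-gpu 0" would give us

if opt.gpu >= 0 then
   cutorch.setDevice(opt.gpu + 1)   -- cutorch is 1-based: nvidia-smi GPU 0 is device 1 here
   print('using ' .. cutorch.getDeviceProperties(opt.gpu + 1).name)
end
```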


psenough commented Jul 16, 2018

I also get this error, whatever GPU id I input. cuDNN works fine with Chainer.

My setup: Ubuntu 16.04, Torch7, CUDA 9.2, cuDNN 7.1.4
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.26 Driver Version: 396.26 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 970 Off | 00000000:01:00.0 On | N/A |
| 0% 46C P8 17W / 163W | 455MiB / 4040MiB | 1% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 958 G /usr/lib/xorg/Xorg 287MiB |
| 0 1897 G compiz 164MiB |
+-----------------------------------------------------------------------------+
```

I think it might be because Torch7 targets cuDNN R5 by default?!

I had to run `git clone https://github.com/soumith/cudnn.torch.git -b R7 && cd cudnn.torch && luarocks make cudnn-scm-1.rockspec` to get cuDNN 7 recognized by Torch, and had to redo `luarocks install cunn` and `luarocks install cutorch` after that, but I still get this same "invalid device ordinal" error.

Maybe there is some sort of version mismatch between cudnn, cunn, and cutorch? I don't know where builds of cunn.torch and cutorch.torch compatible with cudnn.torch R7 might be located. Does anyone have a clue?
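
One quick sanity check (a sketch only, assuming the R7 branch of cudnn.torch and cutorch are installed; none of this is from the repository) is to load the rebuilt bindings in the `th` REPL and see what they report:

```lua
-- Sanity-check sketch: confirm the rebuilt bindings load and what they report.
-- Assumes cudnn.torch (R7 branch) and cutorch are installed.
require 'cutorch'
require 'cudnn'

-- cudnn.version is set by the cudnn.torch binding; the exact value depends on the build.
print('cuDNN binding version: ' .. tostring(cudnn.version))
print('GPUs seen by cutorch:  ' .. cutorch.getDeviceCount())
```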

I'm not used to Ubuntu and Lua :S

@psenough

Found https://github.com/torch/cutorch/issues
and yeah, it doesn't look like they support CUDA 9 yet; that's probably the issue here, I think. :/
If anyone has any insights beyond "try downgrading", I'd appreciate the input.
