Pullback for tr produces a CPU Diagonal causing downstream scalar indexing on GPUs #682

Open
darsnack opened this issue Oct 17, 2022 · 3 comments
Labels
good first issue Good for newcomers GPU

Comments

@darsnack

Functions like tr(A * B) run into scalar indexing in the pullback for * when A and B are CuArrays. The pullback for tr creates a Diagonal, which causes the downstream matrix multiplications to hit the generic LinearAlgebra definition.
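For illustration, here is a minimal CPU-only sketch of the mechanism described above. The function `tr_pullback_sketch` is a hypothetical reconstruction of the behaviour reported in this issue, not the library's source; it assumes the cotangent is built with `fill`, which always allocates a plain `Vector` regardless of the input's array type.

```julia
using LinearAlgebra

# Hypothetical stand-in for the reported behaviour (an assumption, not the real rule):
# the cotangent of tr(x) is dy on the diagonal, stored in a freshly filled Vector.
tr_pullback_sketch(x, dy) = Diagonal(fill(dy, size(x, 1)))

A, B = randn(3, 3), randn(3, 3)

ΔC = tr_pullback_sketch(A * B, 1.0)  # cotangent of C = A * B
dA = ΔC * B'                         # pullback of * with respect to A
dB = A' * ΔC                         # pullback of * with respect to B

# The diagonal storage is a CPU Vector no matter what array type went in.
diag_storage = typeof(ΔC.diag)
```

On the CPU this is harmless. With CuArray inputs, the two products above would mix a CPU-backed Diagonal with GPU matrices, which is where the scalar indexing is reported to come from.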

@mcabbott
Member

Worse, it's a Diagonal{T,Array}. Would a Diagonal{T,CuArray} work?

@darsnack
Author

Yeah, that seems to avoid scalar indexing.

@mcabbott
Member

Then probably it can re-use what sum does, which should also allow 2nd derivatives:

_unsum(x, dy, ::Colon) = broadcast(last∘tuple, x, Ref(dy))

It would also be nice if the test noticed this. We have @gpu test_rrule(tr, randn(4, 4)) but it apparently isn't smart enough to object to the Array.
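A rough sketch of how that broadcast idea might carry over to tr, written with `last∘tuple`. The names `fill_like` and `tr_pullback_like` are hypothetical, and the example only runs on the CPU; whether the same code keeps everything on the GPU for a CuArray is an assumption taken from the discussion above, not something verified here.

```julia
using LinearAlgebra

# `last∘tuple` returns its last argument, so this broadcast yields an array
# shaped like `x` and filled with `dy`. (`fill_like` is a hypothetical name.)
fill_like(x, dy) = broadcast(last∘tuple, x, Ref(dy))

# Hypothetical tr pullback: broadcast over one column of `x`, so the diagonal's
# storage is produced by broadcasting over `x` rather than by `fill`.
tr_pullback_like(x, dy) = Diagonal(fill_like(view(x, :, 1), dy))

x = randn(4, 4)
dx = tr_pullback_like(x, 2.0)

# The kind of check a stricter test could make (an assumption, not the test
# suite's API): the diagonal storage should belong to the same array family as
# `x`. On the GPU one would compare against CuArray instead of Array.
same_family = dx.diag isa Array
```

Because the result comes from `broadcast` rather than a hand-written loop, it should in principle also differentiate a second time, which is the point made about 2nd derivatives.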

@mcabbott mcabbott added GPU good first issue Good for newcomers labels Oct 22, 2022