Add rules and tests for `kron` #741

simsurace · 2023-09-25T13:42:03Z

In Julia 1.9 there was an internal change in `kron` that introduced some mutation, which has made Zygote unable to differentiate `kron`. This PR adds rules to restore that ability.

Discovered in JuliaGaussianProcesses/TemporalGPs.jl#115

devmotion · 2023-09-25T13:53:22Z

> made Zygote unable to differentiate kron

Did the workaround in FluxML/Zygote.jl#1378 not fix it? As mentioned in #684, should ideally be fixed in ChainRules nevertheless, but I'm a bit curious.

simsurace · 2023-09-25T14:06:46Z

> > made Zygote unable to differentiate kron
>
> Did the workaround in FluxML/Zygote.jl#1378 not fix it? As mentioned in #684, should ideally be fixed in ChainRules nevertheless, but I'm a bit curious.

Thanks for commenting. I think @willtebbutt said that he will have a look at these rules later on.
I don't know if it is related, but the above-mentioned fix predates Julia 1.9 by several months. I observed the breakage when upgrading Julia from 1.8 to 1.9.

simsurace · 2023-09-26T19:51:34Z

Hi all, I rewrote the rules and now all the tests pass. There is probably room to optimize them; please let me know.

simsurace · 2023-09-27T08:45:58Z

Ok, I did not test on Julia 1.6. Apparently this requires special care.

simsurace · 2023-09-27T10:47:18Z

Why don't we see the full stack traces here? Is it due to using JuliaInterpreter?

simsurace · 2023-09-28T10:25:17Z

Ok, I made the suggested changes and added tests to check the correct behavior of the projections. However, there is a type-inference problem in the matrix-matrix case.

simsurace · 2023-09-28T12:09:50Z

The problem is this:

julia> x = Diagonal(rand(2)); y = Diagonal(rand(2)); z, pb = rrule(kron, x, y);

julia> @code_warntype unthunk(pb(z)[2])
MethodInstance for ChainRulesCore.unthunk(::Thunk{ChainRules.var"#2318#2321"{Base.ReshapedArray{Float64, 4, Diagonal{Float64, Vector{Float64}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Diagonal{Float64, Vector{Float64}}, ProjectTo{Diagonal, NamedTuple{(:diag,), Tuple{ProjectTo{AbstractArray, NamedTuple{(:element, :axes), Tuple{ProjectTo{Float64, NamedTuple{(), Tuple{}}}, Tuple{Base.OneTo{Int64}}}}}}}}}})
  from unthunk(x::Thunk) @ ChainRulesCore ~/.julia/packages/ChainRulesCore/0t04l/src/tangent_types/thunks.jl:204
Arguments
  #self#::Core.Const(ChainRulesCore.unthunk)
  x::Thunk{ChainRules.var"#2318#2321"{Base.ReshapedArray{Float64, 4, Diagonal{Float64, Vector{Float64}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Diagonal{Float64, Vector{Float64}}, ProjectTo{Diagonal, NamedTuple{(:diag,), Tuple{ProjectTo{AbstractArray, NamedTuple{(:element, :axes), Tuple{ProjectTo{Float64, NamedTuple{(), Tuple{}}}, Tuple{Base.OneTo{Int64}}}}}}}}}}
Body::Any
1 ─      nothing
│   %2 = Base.getproperty(x, :f)::ChainRules.var"#2318#2321"{Base.ReshapedArray{Float64, 4, Diagonal{Float64, Vector{Float64}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Diagonal{Float64, Vector{Float64}}, ProjectTo{Diagonal, NamedTuple{(:diag,), Tuple{ProjectTo{AbstractArray, NamedTuple{(:element, :axes), Tuple{ProjectTo{Float64, NamedTuple{(), Tuple{}}}, Tuple{Base.OneTo{Int64}}}}}}}}}
│   %3 = (%2)()::Any
└──      return %3

Any ideas how to make the unthunking type-stable here?

EDIT: The core of the problem is that `dot(y, first(eachslice(dz; dims = (2, 4))))` is type-unstable:

@code_warntype dot(y, first(eachslice(dz; dims = (2, 4))))
MethodInstance for LinearAlgebra.dot(::Diagonal{Float64, Vector{Float64}}, ::SubArray{Float64, 2, Base.ReshapedArray{Float64, 4, Diagonal{Float64, Vector{Float64}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64, Base.Slice{Base.OneTo{Int64}}, Int64}, false})
  from dot(D::Diagonal, B::AbstractMatrix) @ LinearAlgebra ~/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/share/julia/stdlib/v1.9/LinearAlgebra/src/diagonal.jl:806
Arguments
  #self#::Core.Const(LinearAlgebra.dot)
  D::Diagonal{Float64, Vector{Float64}}
  B::SubArray{Float64, 2, Base.ReshapedArray{Float64, 4, Diagonal{Float64, Vector{Float64}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64, Base.Slice{Base.OneTo{Int64}}, Int64}, false}
Body::Any
1 ─ %1  = LinearAlgebra.size(D)::Tuple{Int64, Int64}
│   %2  = LinearAlgebra.size(B)::Tuple{Int64, Int64}
│   %3  = (%1 == %2)::Bool
└──       goto #3 if not %3
2 ─       goto #4
3 ─ %6  = LinearAlgebra.size(D)::Tuple{Int64, Int64}
│   %7  = LinearAlgebra.size(B)::Tuple{Int64, Int64}
│   %8  = Base.string("Matrix sizes ", %6, " and ", %7, " differ")::String
│   %9  = LinearAlgebra.DimensionMismatch(%8)::Any
└──       LinearAlgebra.throw(%9)
4 ┄ %11 = Base.getproperty(D, :diag)::Vector{Float64}
│   %12 = LinearAlgebra.diagind(B)::Core.PartialStruct(StepRange{Int64, Int64}, Any[Core.Const(1), Int64, Int64])
│   %13 = LinearAlgebra.view(B, %12)::Core.PartialStruct(SubArray{Float64, 1, Base.ReshapedArray{Float64, 1, SubArray{Float64, 2, Base.ReshapedArray{Float64, 4, Diagonal{Float64, Vector{Float64}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64, Base.Slice{Base.OneTo{Int64}}, Int64}, false}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Tuple{StepRange{Int64, Int64}}, false}, Any[Base.ReshapedArray{Float64, 1, SubArray{Float64, 2, Base.ReshapedArray{Float64, 4, Diagonal{Float64, Vector{Float64}}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64, Base.Slice{Base.OneTo{Int64}}, Int64}, false}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, Core.PartialStruct(Tuple{StepRange{Int64, Int64}}, Any[Core.PartialStruct(StepRange{Int64, Int64}, Any[Core.Const(1), Int64, Int64])]), Core.Const(0), Core.Const(0)])
│   %14 = LinearAlgebra.dot(%11, %13)::Any
└──       return %14

and I cannot fix that without collecting either `y` or `dz`. Any other ideas?
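For reference, the slice-wise `dot` calls can be written out in index form. This is only a sketch: it assumes the four-index reshape used in the PR's pullback, `dz = reshape(z̄, size(y, 1), size(x, 1), size(y, 2), size(x, 2))`, and the index names $a, b, c, d$ are chosen here for illustration. The conjugates appear because `dot` conjugates its first argument.

$$
z_{(a,b),(c,d)} = x_{b,d}\, y_{a,c}
\quad\Longrightarrow\quad
\bar{x}_{b,d} = \sum_{a,c} y_{a,c}^{*}\, \bar{z}_{(a,b),(c,d)},
\qquad
\bar{y}_{a,c} = \sum_{b,d} x_{b,d}^{*}\, \bar{z}_{(a,b),(c,d)}
$$

In this notation, `dot(y, dz[:, b, :, d])` is the $(b, d)$ entry of $\bar{x}$, and slicing over `dims = (2, 4)` enumerates exactly those $(b, d)$ pairs.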

mcabbott · 2023-10-03T13:49:44Z

src/rulesets/LinearAlgebra/dense.jl

+        function kron_pullback(z̄)
+            dz = reshape(unthunk(z̄), size(y, 1), size(x, 1), size(y, 2), size(x, 2))
+            x̄ = @thunk(project_x(_dot_collect.(Ref(y), eachslice(dz; dims = (2, 4)))))
+            ȳ = @thunk(project_y(_dot_collect.(Ref(x), eachslice(dz; dims = (1, 3)))))


I was wondering if you have to make slices, given that kron is just reshape and .*. So here's an attempt to do without:

using ChainRulesCore
using LinearAlgebra  # for dot
# @btime is assumed to be in scope (e.g. from BenchmarkTools); it is not loaded in this snippet

function pr_rule(x::AbstractMatrix{<:Number}, y::AbstractMatrix{<:Number})
    # from https://github.com/JuliaDiff/ChainRules.jl/pull/741
    project_x = ProjectTo(x)
    project_y = ProjectTo(y)
    function kron_pullback(z̄)
        dz = reshape(unthunk(z̄), size(y, 1), size(x, 1), size(y, 2), size(x, 2))
        x̄ = @thunk(project_x(dot.(Ref(y), eachslice(dz; dims = (2, 4)))))
        ȳ = @thunk(project_y(dot.(Ref(x), eachslice(dz; dims = (1, 3)))))
        return NoTangent(), x̄, ȳ
    end
end

# using TensorCast
# mykron(x,y) = @cast z[(a,b), (c,d)] := x[b,d] * y[a,c]
# @pretty @cast z[(a,b), (c,d)] := x[b,d] * y[a,c]

function shape_rule(x::AbstractMatrix, y::AbstractMatrix)
    function back(dz)
        # Insert singleton dimensions so x and y broadcast against the 4-index dz
        x4 = reshape(x, 1, size(x,1), 1, size(x,2))
        y4 = reshape(y, size(y,1), 1, size(y,2), 1)
        dz4 = reshape(unthunk(dz), size(y,1), size(x,1), size(y,2), size(x,2))
        dx = @thunk ProjectTo(x)(reshape(sum(dz4 .* y4, dims=(1,3)), size(x)))  # might be missing conj
        dy = @thunk ProjectTo(y)(reshape(sum(dz4 .* x4, dims=(2,4)), size(y)))
        0, dx, dy
    end
end

let x = rand(10,20), y = rand(30,10)
    b1 = pr_rule(x, y)
    b2 = shape_rule(x, y)
    z = kron(x,y)
    _, dx1, _ = @btime map(unthunk, $b1($z))
    _, dx2, _ = @btime map(unthunk, $b2($z))
    dx1 ≈ dx2
end
# min 181.458 μs, mean 185.668 μs (4 allocations, 4.39 KiB)
# min 80.583 μs, mean 169.305 μs (32 allocations, 943.05 KiB)
# true
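Both rules lean on the index convention spelled out in the commented-out `@cast` line, `z[(a,b), (c,d)] := x[b,d] * y[a,c]`. The following is a minimal sketch that checks that convention numerically on small random matrices; the sizes and the names `z4`, `index_ok`, and `broadcast_ok` are illustrative choices, not part of the PR.

```julia
using LinearAlgebra

x = rand(2, 3)   # plays the role of `x` in the rules above
y = rand(4, 5)   # plays the role of `y`
z = kron(x, y)

# View z as a 4-index array: (row of y, row of x, column of y, column of x).
z4 = reshape(z, size(y, 1), size(x, 1), size(y, 2), size(x, 2))

# Index convention: z4[a, b, c, d] should equal x[b, d] * y[a, c].
index_ok = all(z4[a, b, c, d] ≈ x[b, d] * y[a, c]
               for a in axes(y, 1), b in axes(x, 1), c in axes(y, 2), d in axes(x, 2))

# Equivalently, kron is a broadcasted product of two reshaped inputs, reshaped back.
x4 = reshape(x, 1, size(x, 1), 1, size(x, 2))
y4 = reshape(y, size(y, 1), 1, size(y, 2), 1)
broadcast_ok = reshape(x4 .* y4, size(z)) ≈ z
```

Because Julia arrays are column-major, the fast index within each row or column block of `kron(x, y)` belongs to `y`, which is why the `y` dimensions come first in each pair of the reshape.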

It's a pity to allocate these big arrays dz4 .* y4 but still seems quicker. Possibly we could use lazy broadcasting to avoid that:

bc = Broadcast.instantiate(Broadcast.broadcasted(*, [1 2 3], [4, 5]));
sum(bc)  # OK
sum(bc; dims=1)  # ERROR: MethodError: no method matching reducedim_init(::typeof(identity), ::typeof(Base.add_sum), ::Base.Broadcast.Broadcasted{…}, ::Int64)
sum!([0 0 0], bc)  # ERROR: MethodError: no method matching sum!(::Matrix{Int64}, ::Base.Broadcast.Broadcasted
sum(bc; dims=1, init=0.0)  # OK, not sure if it's fast or not

On StaticArrays (mentioned above) both at present make a SizedMatrix, which I think is ProjectTo's attempt to fix things up. Surely this reshaping could be done in a static-friendly way but IDK exactly how.

julia> let x = @SMatrix(rand(5,5)), y = @SMatrix(rand(5,5))
           b1 = pr_rule(x, y)
           b2 = shape_rule(x, y)
           z = kron(x,y)
           _, dx1, _ = @btime map(unthunk, $b1($z))
           _, dx2, _ = @btime map(unthunk, $b2($z))
           dx1 ≈ dx2
       end
  min 2.458 μs, mean 2.558 μs (2 allocations, 512 bytes)
  min 4.006 μs, mean 5.198 μs (22 allocations, 11.38 KiB)
true

simsurace · 2023-10-04

Does this result scale to larger arrays?

mcabbott · 2023-10-05

Result meaning speed difference? It will vary with size & machine. On very small arrays reshaping is ~~faster~~ slower! (Like 3x3 I meant.)

Issues with StaticArrays will be similar at all sizes.

I think broadcasting over slices will work badly on CuArrays, and tend to make Arrays. But right now neither idea seems to work, not sure why

julia> using Metal

julia> bk = pr_rule(MtlArray(rand(Float32, 3,3)), MtlArray(rand(Float32, 3,3)));

julia> bk(MtlArray(rand(Float32, 9,9)))[2] |> unthunk
ERROR: GPU compilation of MethodInstance for (::GPUArrays.var"#broadcast_kernel#26")(::Metal.mtlKernelContext, ::MtlDeviceMatrix{…}, ::Base.Broadcast.Broadcasted{…}, ::Int64) failed
KernelError: passing and using non-bitstype argument

julia> bk2 = shape_rule(MtlArray(rand(Float32, 3,3)), MtlArray(rand(Float32, 3,3)));

julia> bk2(MtlArray(rand(Float32, 9,9)))[2] |> unthunk
ERROR: could not load symbol "LLVMExtraAddPropagateJuliaAddrspaces": dlsym(RTLD_DEFAULT, LLVMExtraAddPropagateJuliaAddrspaces): symbol not found

simsurace · 2023-10-05

If the reshape version is not strictly better than the current one, especially for large arrays, I would propose keeping the current version and leaving further optimizations to a separate PR.

mcabbott · 2023-10-05

A bit curious at what sizes it's slower for you?

But mainly I think the issue is less about the race than that simple solid-array operations have a better chance of behaving well with StaticArrays, and CuArrays. I haven't taken another pass to see if the first draft can be improved on.

simsurace · 2023-10-05

I haven't benchmarked anything myself yet. I will give it a go later.

simsurace · 2023-10-05

Hmm, results seem to be mixed. For larger sizes the allocations take their toll:

let x = rand(100,200), y = rand(300,100)
    b1 = pr_rule(x, y)
    b2 = shape_rule(x, y)
    z = kron(x,y)
    _, dx1, _ = @btime map(unthunk, $b1($z))
    _, dx2, _ = @btime map(unthunk, $b2($z))
    dx1 ≈ dx2
end
# 3.376 s (6 allocations: 390.84 KiB)
# 3.797 s (34 allocations: 8.94 GiB)
# true

I would suggest staying with the current implementation.
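The allocation figures reported for the reshape-based rule are consistent with a back-of-the-envelope count of its two broadcast temporaries, `dz4 .* y4` and `dz4 .* x4`. The sketch below assumes dense `Float64` arrays and that both thunks are evaluated, as `map(unthunk, ...)` does in the benchmarks; the helper name `temp_bytes` is hypothetical.

```julia
# Bytes for one dense Float64 array with the shape of dz4 .* y4,
# i.e. size(y,1) * size(x,1) * size(y,2) * size(x,2) elements.
temp_bytes(xsize, ysize) = sizeof(Float64) * prod(xsize) * prod(ysize)

# Two such temporaries are formed: one for dx and one for dy.
small_kib = 2 * temp_bytes((10, 20), (30, 10)) / 2^10      # x is 10×20, y is 30×10
large_gib = 2 * temp_bytes((100, 200), (300, 100)) / 2^30  # x is 100×200, y is 300×100
```

This gives 937.5 KiB for the small case (of the 943.05 KiB reported) and about 8.94 GiB for the large case, so nearly all of the allocation comes from materializing the two four-index products.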

ToucheSir · 2023-10-05

One way to ensure any implementation isn't excluding all GPU array types would be to toss a `@gpu` in front of the new tests, no?

devmotion reviewed Sep 25, 2023

devmotion reviewed Sep 25, 2023

simsurace requested review from devmotion and willtebbutt September 26, 2023 19:54

simsurace marked this pull request as ready for review September 26, 2023 19:55

simsurace mentioned this pull request Sep 27, 2023: Extend kron support FluxML/Zygote.jl#1458 (merged)

sethaxen reviewed Sep 27, 2023

sethaxen requested changes Sep 28, 2023

sethaxen reviewed Sep 28, 2023

simsurace requested a review from sethaxen September 28, 2023 12:13

devmotion reviewed Sep 28, 2023

mcabbott reviewed Oct 3, 2023

simsurace and others added 11 commits February 6, 2024 21:52

Add rules and tests

5bb9766

Add tests for rrule

f0902e3

Add rules and try to cover complex case

7c53f4b

Restrict types of arguments

c1226eb

Co-authored-by: David Widmann

Write rules functionally and fix them

236daf1

Add unthunk and @thunk

b2d4f4a

Change dimensions to make them recognizable

8b94cfc

Further simplify rules

b71b8ef

Only define rules for Julia 1.9 onwards

2ad5473

Add projections

fde509e

Add projections and remove redundant conj calls

4386143

simsurace and others added 5 commits February 6, 2024 21:52

Fix type instability

1b97828

Run tests only above Julia 1.9

5b74071

Co-authored-by: Seth Axen

Fix typo

72060d0

Enable all tests

6650adc

Improve version bound

f104172

Co-authored-by: David Widmann

simsurace force-pushed the kron branch from 3258dee to f104172 on February 6, 2024 20:53
