Saving cache to file and reading from it #68

Open
physicophilic opened this issue Dec 17, 2020 · 7 comments

Comments

@physicophilic

Hi, I am a new user of memoize and need some help.

As far as I can see, Memoize can use a dictionary as a cache, which I could probably write to a file. Can I read that dictionary back later and load it into the cache?

I checked the existing issues, but it seems this feature doesn't exist(?)

@cstjean
Collaborator

cstjean commented Dec 17, 2020

You should use a persistent dictionary type for that (i.e. one that is saved to a file), like

@memoize PersistentDict(file="blah.json") foo(x) = ...

but I don't know if anyone has made such a PersistentDict yet. It wouldn't be done in this package, in any case.

@physicophilic
Author

Any idea how this dict type could be constructed? If it's within my ability, I can try it.

@cstjean
Collaborator

cstjean commented Dec 20, 2020

That's a big question! The full solution is what you'd find by googling for "on-disk dictionary" or "on-disk hash table".

The easier solution would be to have a

struct PersistentDict
    dict::Dict
end

Then in setindex!(::PersistentDict, value, key), you call Serialization.serialize to write the dict to a file. It won't be very efficient, but maybe that'll be OK for your use case.
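
A minimal sketch of that idea, for illustration only. The `path` field, the one-argument constructor, and the choice to rewrite the whole file on every insertion are assumptions of this sketch, not something Memoize provides. It only covers lookup and insertion; deletion is left out.

```julia
using Serialization

# Hypothetical wrapper: an in-memory Dict that is re-serialized to `path`
# in full after every insertion. Simple, but O(n) work per write.
struct PersistentDict{K,V} <: AbstractDict{K,V}
    path::String
    dict::Dict{K,V}
end

# Load previously saved contents if the file exists, otherwise start empty.
function PersistentDict{K,V}(path::AbstractString) where {K,V}
    dict = isfile(path) ? deserialize(path)::Dict{K,V} : Dict{K,V}()
    return PersistentDict{K,V}(String(path), dict)
end

# Read-only operations are forwarded to the wrapped Dict.
Base.length(d::PersistentDict) = length(d.dict)
Base.iterate(d::PersistentDict, state...) = iterate(d.dict, state...)
Base.haskey(d::PersistentDict, key) = haskey(d.dict, key)
Base.getindex(d::PersistentDict, key) = d.dict[key]
Base.get(d::PersistentDict, key, default) = get(d.dict, key, default)

# Every write updates the Dict and then rewrites the file.
function Base.setindex!(d::PersistentDict, value, key)
    d.dict[key] = value
    serialize(d.path, d.dict)
    return d
end

# get! with a function argument: look up first, compute and store on a miss.
function Base.get!(f::Base.Callable, d::PersistentDict, key)
    haskey(d.dict, key) && return d.dict[key]
    d[key] = f()
    return d.dict[key]
end

# Usage: values written through one instance are visible to a later one.
path = tempname()
cache = PersistentDict{Int,String}(path)
cache[1] = "one"
reloaded = PersistentDict{Int,String}(path)
```

Files written by Serialization are generally only safe to read back with the same Julia version, so this is best suited to a throwaway cache rather than long-term storage.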

@physicophilic
Author

Thanks, this is what I was also looking for. :)

@willow-ahrens

I just tried this with https://github.com/jw3126/DiskBackedDicts.jl. Note that you could also use https://github.com/blenessy/PersistentCollections.jl.

Regardless of which disk-backed dictionary we use, there's a slight hiccup: Memoize calls empty! on the dict before using it (why?). Unless we can merge a PR to remove that call to empty!, we need to wrap our dictionary in a wrapper type that ignores calls to empty!.

For now, I can get things to work by hacking an override for the empty! method.

Base.empty!(::DiskBackedDict) = nothing

@cstjean
Collaborator

cstjean commented Dec 30, 2020

You can try removing it and see what breaks in the tests. I doubt that we'll be able to take it out in a non-breaking release, but I can definitely see that on the 0.5 milestone...

FWIW, this is almost certainly about being revision-friendly. It came from 4 years ago (1a3b4d5), so it predates Revise.jl.

I'd say the key property to keep here is that if I change the definition of a memoized function, it should reset the cache. Because of the eval shenanigans (#48), the reset of the cache is not trivial.

@willow-ahrens

willow-ahrens commented Dec 30, 2020

For now, I've opted to wrap my dictionary in the following:

struct NoEmptyMemoizeDict{K, V, T<:AbstractDict{K, V}} <: AbstractDict{K, V}
    parent::T
end

function Base.get!(f::Base.Callable, d::NoEmptyMemoizeDict, key)
    return get!(f, d.parent, key)
end

function Base.empty!(d::NoEmptyMemoizeDict)
    return nothing
end

function sudo_empty!(d::NoEmptyMemoizeDict)
    return empty!(d.parent)
end

This solution works for me for now.

However, I don't think that memoize can be truly revision-friendly without significant changes. For example, revisions of the same function which don't go through Memoize aren't tracked, nor are revisions of functions which the memoized function calls, even if those functions go through Memoize. To give two examples:

julia> using Memoize

julia> @memoize g() = 1
g (generic function with 1 method)

julia> @memoize f() = g()
f (generic function with 1 method)

julia> f()
1

julia> @memoize g() = 2
g (generic function with 1 method)

julia> f()
1

julia> @memoize fib(n) = if n <=2 1 else fib(n - 1) + fib(n - 2) end
fib (generic function with 1 method)

julia> fib(Int32(7))
13

julia> fib(n::Int64) = 1
fib (generic function with 2 methods)

julia> fib(Int32(7))
13

julia> fib(Int32(8))
2

To clarify, I think it might be possible (and perhaps useful) to create an efficient 265-aware memoize if we keep track of the world age bounds of the method cache (and keep a separate cache for each method like in #59, rather than having one cache per function like in the current version). However, the current version of Memoize only supports a subset of such revisions, and it might be better for the cache to never invalidate unless asked explicitly.
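
To make the "never invalidate unless asked explicitly" option concrete, here is a hand-rolled sketch that does not use Memoize at all. The names `FIB_CACHE`, `fib_cached`, and `reset_fib_cache!` are made up for this example.

```julia
# Hypothetical hand-rolled memoization: the cache is an ordinary global Dict
# that survives method redefinitions and is cleared only on request.
const FIB_CACHE = Dict{Int,Int}()

function fib_cached(n::Int)
    # get! runs the do-block only when `n` is not already in the cache.
    return get!(FIB_CACHE, n) do
        n <= 2 ? 1 : fib_cached(n - 1) + fib_cached(n - 2)
    end
end

# Explicit invalidation; nothing else ever empties the cache.
reset_fib_cache!() = empty!(FIB_CACHE)
```

With this design, stale results after a redefinition are the caller's responsibility, which is at least predictable.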
