Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change MultiDict to be backed by a MultiSet instead of a Vector; add delete!(md, k, v) #692

Draft
wants to merge 26 commits into
base: MultiDict-reboot
Choose a base branch
from

Conversation

NHDaly
Copy link
Contributor

@NHDaly NHDaly commented Oct 24, 2020

This PR builds on top of all the currently open PRs related to MultiDicts, to change the implementation to be built on a Dict{K, MultiSet{V}}. This allows logarithmic-time lookup of values, which makes delete!(md, k, v) and in(md, kv::Pair{K,V)) feasible performance-wise.

This PR then adds delete!(md, k, v) and updates in(k=>v, md::MultiDict).

PRs included in this one:

Fixes #654.

NHDaly and others added 26 commits August 18, 2020 22:43
Before, constructors would accept _either_ (k=>v, k=>v...), or (k=>[v],
k=>[v]...). Each worked in some contexts, but neither in all contexts.
Also, before, generators did not work.

Now, there are two simple rules:
1. MultiDict constructors expect pairs or tuples or generators of {K,V}:
    - `MultiDict(1=>1,1=>2)`, `MultiDict{Int,Int}(1=>2,2=>3)`, etc
2. To manually construct a MultiDict with {K,Vector{V}), you can
   construct with the _underlying Dict_ directly.
    - `MultiDict(Dict(1=>[1,2])`, `MultiDict{Int,Int}(1=>[2],2=>[3])`, etc
- Fixes broken constructor by adding `grow_to!` overload. We need this
  because MultiDict isn't an AbstractDict.
- Adds `empty(::MultiDict, ::Type{K}, ::Type{V})`, which was needed in
  the process, and also makes us more standard! :raised-hands:
Apply @eulerkochy's formatting suggestions

Co-authored-by: Koustav Chowdhury <[email protected]>
Fixes constructors that must do type-promotion, e.g.:
```julia
julia> MultiDict(1 => 2, 1 => "hello")
MultiDict{Int64,Any}(Dict{Int64,Array{Any,1}}(1 => [2, "hello"]))
```
…tually allowed duplicates

```julia
julia> collect(MultiDict(3=>3, 1=>2, 1=>4))
3-element Array{Pair{Int64,Int64},1}:
 3 => 3
 1 => 2
 1 => 4
```
…extend.

Per BlueStyle, we should overload via qualified name, and import names via `using`. I simply lapsed in my earlier commit.
Stop implicit overloading via `import`
A `MultiSet{T}` is very much like an `Accumulator{T,Int}`, but it
exposes a single-value-based interface. The fact that it is storing
counts under the hood is completely hidden from the interface.

For example, iterating the MultiSet returns each item in the MultiSet,
repeated per its count, as if the multiset _actually_ contained
duplicate items:
```julia
julia> show(collect(MultiSet([1,1,1,2,2])))
[2, 2, 1, 1, 1]
```
whereas the accumulator presents its count-based representation to the
user:
```julia
julia> show(collect(MultiSet([1,1,1,2,2]).data))
[2 => 2, 1 => 3]
```

For more info on MultiSets, see:
https://en.wikipedia.org/wiki/Multiset
```julia
julia> collect(md)
5-element Array{Pair{Int64,Int64},1}:
 2 => 10
 2 => 20
 1 => 10
 1 => 10
 1 => 20

julia> collect(md[1])
3-element Array{Int64,1}:
 10
 10
 20
```
The default implementation of keys() and values() lazily constructs the
keys and values out of `iterate()`, which is exactly what we'd want to
do, and it's more standard using a KeySet and ValueIterator. :)
Before, it was comparing the value to the collection returned from
`get()`, and now it returns the correct value.

There already existed a function doing the right thing, but it was
incorrectly overloaded for 2-Tuples, not Pairs, which is not what the
AbstractDict interface expects.
@NHDaly
Copy link
Contributor Author

NHDaly commented Oct 24, 2020

After this PR, i think i am finally done with the MultiDict changes! :)

I'll have a look at the OrderedMultiDict, like @StephenVavasis suggested, but i'm not sure when I'll get to it. (Sorry this has all been slow going; this is just my recreational programming 😅)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant