Generic Data Structures #33

Open
wants to merge 7 commits into
base: main

Conversation

elazarg
Member

@elazarg elazarg commented Apr 14, 2024

Decompose the data structures into domain-specific and generic data structures.

The ds package has zero dependencies, and it knows nothing about numbers or network.

Interfaces (all in interfaces.go):

  • Comparable (Equal, Copy)
  • Hashable (Comparable, Hash)
  • Set (Hashable + set operations)
  • Product of A x B (partitions, left and right projections, swap)
  • TripleSet, convenience (and associativity agnostic) interface for A x B x C
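
As a rough illustration of how the interface hierarchy above could be expressed with Go generics, here is a minimal sketch. The interface names follow the list; the exact method sets (`IsEmpty`, `Union`, `Intersect`, `Subtract`) and the `BitSet` and `UnionAll` helpers are assumptions made for the example, not code taken from `interfaces.go`.

```go
package main

import "fmt"

// Comparable, Hashable and Set mirror the names in the PR description.
// The method sets below are assumed for illustration.
type Comparable[Self any] interface {
	Equal(Self) bool
	Copy() Self
}

type Hashable[Self any] interface {
	Comparable[Self]
	Hash() int
}

type Set[Self any] interface {
	Hashable[Self]
	IsEmpty() bool
	Union(Self) Self
	Intersect(Self) Self
	Subtract(Self) Self
}

// BitSet is a hypothetical toy set over {0..63}; it only shows that a
// concrete type can satisfy Set[BitSet].
type BitSet uint64

func (b BitSet) Equal(o BitSet) bool       { return b == o }
func (b BitSet) Copy() BitSet              { return b }
func (b BitSet) Hash() int                 { return int(b % 1000003) }
func (b BitSet) IsEmpty() bool             { return b == 0 }
func (b BitSet) Union(o BitSet) BitSet     { return b | o }
func (b BitSet) Intersect(o BitSet) BitSet { return b & o }
func (b BitSet) Subtract(o BitSet) BitSet  { return b &^ o }

// Compile-time check that BitSet implements Set[BitSet].
var _ Set[BitSet] = BitSet(0)

// UnionAll is written once against the interface and works for any set
// type. The empty element is passed explicitly by the caller.
func UnionAll[S Set[S]](empty S, sets ...S) S {
	res := empty.Copy()
	for _, s := range sets {
		res = res.Union(s)
	}
	return res
}

func main() {
	u := UnionAll(BitSet(0), BitSet(0b0011), BitSet(0b0110))
	fmt.Println(u.Equal(BitSet(0b0111)))
}
```

The self-referential constraint `S Set[S]` is the same pattern that appears in the `Product[S1 Set[S1], S2 Set[S2]]` signature quoted later in the review.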

Implementations:

  • HashMap - a simple adapter around a go map, allowing Hashable keys.
  • HashSet
  • MultiMap is a non-injective map. It gives a simple "inverse" on a HashMap.
  • ProductLeft and ProductRight are map-based implementations of a cartesian product; each implements Product as a one-to-one mapping from sets to sets, where each key-value pair encodes the cross product of its two items.
  • TripleSetLeft, TripleSetRight and TripleSetOuter are left-, right- and outer-associative implementations of TripleSet.
  • DisjointSum is a tagged union of two sets. Go's generics place some limitations on the implementation; the empty element must always be explicitly passed to the constructor.
  • UnitSet is an idea - using sets of size 1 as the identity element of Product. It's probably not really helpful.
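
To make the "adapter around a go map, allowing Hashable keys" idea concrete, here is a minimal sketch of one way such an adapter can work: entries are bucketed by `Hash()` in a native map and disambiguated with `Equal`. The constructor name `NewHashMap` appears later in this PR; the `Insert`, `At` and `Size` methods, the bucket layout, and the `Ports` key type are assumptions for the example and may differ from the actual `hashmap.go`.

```go
package main

import "fmt"

// Hashable is a reduced, assumed version of the interface: only what the
// adapter needs.
type Hashable[Self any] interface {
	Equal(Self) bool
	Hash() int
}

type entry[K Hashable[K], V any] struct {
	key   K
	value V
}

// HashMap buckets entries by hash code, so keys need not be comparable
// in the sense Go's built-in map requires.
type HashMap[K Hashable[K], V any] struct {
	buckets map[int][]entry[K, V]
	size    int
}

func NewHashMap[K Hashable[K], V any]() *HashMap[K, V] {
	return &HashMap[K, V]{buckets: map[int][]entry[K, V]{}}
}

// Insert adds the pair, or overwrites the value of an equal key.
func (m *HashMap[K, V]) Insert(k K, v V) {
	h := k.Hash()
	for i, e := range m.buckets[h] {
		if e.key.Equal(k) {
			m.buckets[h][i].value = v
			return
		}
	}
	m.buckets[h] = append(m.buckets[h], entry[K, V]{key: k, value: v})
	m.size++
}

// At returns the value stored under a key equal to k, if any.
func (m *HashMap[K, V]) At(k K) (V, bool) {
	for _, e := range m.buckets[k.Hash()] {
		if e.key.Equal(k) {
			return e.value, true
		}
	}
	var zero V
	return zero, false
}

func (m *HashMap[K, V]) Size() int { return m.size }

// Ports is a hypothetical slice-backed key. Slices cannot be keys of a
// built-in Go map, which is the situation the adapter addresses.
type Ports []int

func (p Ports) Equal(o Ports) bool {
	if len(p) != len(o) {
		return false
	}
	for i := range p {
		if p[i] != o[i] {
			return false
		}
	}
	return true
}

func (p Ports) Hash() int {
	h := 17
	for _, x := range p {
		h = h*31 + x
	}
	return h
}

func main() {
	m := NewHashMap[Ports, string]()
	m.Insert(Ports{80, 443}, "web")
	m.Insert(Ports{80, 443}, "http(s)") // equal key: overwrites
	m.Insert(Ports{22}, "ssh")
	v, ok := m.At(Ports{80, 443})
	fmt.Println(v, ok, m.Size()) // prints "http(s) true 2"
}
```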

The interval package has interval and intervalset. Nothing particularly new about them.

netp is unchanged.

The netset package handles sets of network elements.

  • ICMPSet is largely the one proposed in Add ICMP set #23.
  • IPBlock is adapted to implement Set.
  • TCPUDPSet is a TripleSet of ProtocolSet x PortSet x PortSet
  • TransportSet is a disjoint union of TCPUDPSet + ICMPSet
  • ConnectionSet is a TripleSet of IPBlock x IPBlock x TransportSet

The connection package now contains the state-aware connection set (named Set) and its JSON formatting. In my opinion it should be moved to the analyzer.

Many tests are adapted, but some data structures do not have dedicated tests.

There is no implementation of a dynamically-bounded tuple, since I could not find a proper way to fit that into a generic data structure with a well-defined degenerate edge case.

The diff is probably not useful; it should be easier to read the code.

Contributor

@adisos adisos left a comment

Looks good, added few initial comments.

pkg/netset/connectionset.go Outdated Show resolved Hide resolved
pkg/netset/tcpudpset.go Outdated Show resolved Hide resolved
pkg/ds/interfaces.go Outdated Show resolved Hide resolved
pkg/ds/hashmap.go Outdated Show resolved Hide resolved
pkg/ds/hashmap.go Show resolved Hide resolved
pkg/netset/connectionset.go Show resolved Hide resolved
@elazarg
Member Author

elazarg commented Jun 19, 2024

We may need to use ordered maps instead of hash maps, so the lists of partitions are consistently ordered.
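
For context, Go does not specify map iteration order, so a partition list built by ranging over a map can come out in a different order on each run. The sketch below shows one simple alternative to an ordered map: collect the keys and sort them before producing output. The string keys and the `sortedKeys` helper are hypothetical; this is not the approach taken in the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order, giving callers a
// stable iteration order regardless of map internals.
func sortedKeys(m map[string][]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Hypothetical partitions keyed by a printable form of the left set.
	partitions := map[string][]int{
		"10.0.0.0/8":     {80, 443},
		"192.168.0.0/16": {22},
		"172.16.0.0/12":  {53},
	}
	for _, k := range sortedKeys(partitions) {
		fmt.Println(k, partitions[k])
	}
}
```

Sorting on output keeps the hash map for lookups but requires a total order on keys; an ordered map pays that cost on every insertion instead.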

Contributor

@adisos adisos left a comment

few more questions and comments

pkg/ds/interfaces.go Outdated Show resolved Hide resolved
pkg/ds/interfaces.go Outdated Show resolved Hide resolved
pkg/ds/interfaces.go Show resolved Hide resolved
pkg/ds/interfaces.go Show resolved Hide resolved
pkg/netset/icmpset.go Outdated Show resolved Hide resolved
pkg/netset/connectionset.go Outdated Show resolved Hide resolved
pkg/ds/unitset.go Outdated Show resolved Hide resolved
pkg/ds/tripleset_outer.go Show resolved Hide resolved
pkg/netp/icmp.go Outdated Show resolved Hide resolved
Signed-off-by: Elazar Gershuni <[email protected]>
Signed-off-by: Elazar Gershuni <[email protected]>
Also, rename NewMap to NewHashMap

Signed-off-by: Elazar Gershuni <[email protected]>
Signed-off-by: Elazar Gershuni <[email protected]>
@elazarg
Member Author

elazarg commented Jun 25, 2024

Added a detailed README, and changed the implementation of MultiMap to use HashSet.

@elazarg
Member Author

elazarg commented Jun 26, 2024

BTW, I think the class ICMP would be better named ICMPRule, and it is a set. The file icmp.go in the netp package should only have functions that are related to ICMP proper. Such a change would also remove the need to expose implementation details of the ICMP class to the ICMPSet class.

@elazarg elazarg requested a review from adisos July 3, 2024 15:44
@elazarg
Member Author

elazarg commented Jul 3, 2024

@adisos is there anything left for me to implement/document so this PR can be merged?

@elazarg elazarg mentioned this pull request Jul 6, 2024
Contributor

@adisos adisos left a comment

few small comments and questions

pkg/ds/hashmap.go Outdated Show resolved Hide resolved
return m.m.IsEmpty()
}

// Size returns the number of unique pairs in the Product object
Contributor

the Size documentation is incorrect? it does not return the number of pairs..

// Product is a cartesian product of sets S1 x S2
type Product[S1 Set[S1], S2 Set[S2]] interface {
Set[Product[S1, S2]]
Partitions() []Pair[S1, S2]
Contributor

Suggested change
Partitions() []Pair[S1, S2]
// Partitions returns a slice of pairs in the product set (note that the order is arbitrary)
Partitions() []Pair[S1, S2]

type Product[S1 Set[S1], S2 Set[S2]] interface {
Set[Product[S1, S2]]
Partitions() []Pair[S1, S2]
NumPartitions() int
Contributor

should explain the difference between NumPartitions and Size ?
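
One way to see why the two numbers can differ: the PR description says each stored pair encodes the cross product of two sets, so the number of stored pairs is generally smaller than the number of element pairs they cover. The sketch below illustrates only that arithmetic. The `partition` type and both helper functions are hypothetical, and whether `Size` in the PR corresponds to the element count is an assumption the thread does not confirm.

```go
package main

import "fmt"

// partition is a hypothetical stand-in for Pair[S1, S2]: it encodes the
// cross product Left x Right.
type partition struct {
	Left, Right []int
}

// numPartitions counts the stored (set, set) pairs.
func numPartitions(ps []partition) int { return len(ps) }

// numElementPairs counts the element pairs covered by all partitions,
// assuming the partitions do not overlap.
func numElementPairs(ps []partition) int {
	total := 0
	for _, p := range ps {
		total += len(p.Left) * len(p.Right)
	}
	return total
}

func main() {
	ps := []partition{
		{Left: []int{1, 2}, Right: []int{10, 20, 30}}, // covers 6 element pairs
		{Left: []int{3}, Right: []int{10}},            // covers 1 element pair
	}
	fmt.Println(numPartitions(ps), numElementPairs(ps)) // prints "2 7"
}
```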

Set[Product[S1, S2]]
Partitions() []Pair[S1, S2]
NumPartitions() int
Left(empty S1) S1
Contributor

Suggested change
Left(empty S1) S1
// Left returns the left projection from pairs in the product set on S1, given input of an empty set in S1
Left(empty S1) S1

Partitions() []Pair[S1, S2]
NumPartitions() int
Left(empty S1) S1
Right(empty S2) S2
Contributor

Suggested change
Right(empty S2) S2
// Right returns the right projection from pairs in the product set S2, given input of an empty set in S2
Right(empty S2) S2

NumPartitions() int
Left(empty S1) S1
Right(empty S2) S2
Swap() Product[S2, S1]
Contributor

Suggested change
Swap() Product[S2, S1]
// Swap returns a new Product object, built from the input object, with left and right swapped
Swap() Product[S2, S1]

@adisos
Contributor

adisos commented Jul 8, 2024

> @adisos is there anything left for me to implement/document so this PR can be merged?

added few small comments, still reviewing the tests and some implementation details.

elazarg and others added 3 commits July 8, 2024 15:43
Co-authored-by: Adi Sosnovich <[email protected]>
interval:
* Separate tests from interval_test.go
* Improve documentation.
* Export set-like functions that are well defined.
* Rename interval.Subtract to interval.SubtractSplit, and add tests.
* Handle empty cases first.
* Preallocate Elements.

intervalset:
* Guard Size() from overflow, and use intervalset.CalculateSize().
* Handle empty cases first.
* Remove String() method, since it is not obvious; clients should implement.

Signed-off-by: Elazar Gershuni <[email protected]>
@elazarg
Member Author

elazarg commented Jul 8, 2024

I've improved and fixed the documentation of interfaces.go. @adisos please re-review this part.

Additionally, I've improved the interval package.

interval:

  • Separate tests from intervalset_test.go
  • Improve documentation.
  • Export set-like functions that are well defined.
  • Rename interval.Subtract to interval.SubtractSplit, and add tests.
  • Handle empty cases first.
  • Preallocate Elements.

intervalset:

  • Guard Size() from overflow, and use intervalset.CalculateSize().
  • Handle empty cases first.
  • Remove String() method, since it is not obvious; clients should implement.

I think it's better to make GenericInterval and GenericCanonicalSet generic types (with integral constraints), so clients can choose whether or not to rely on things like Size() being a 64-bit integer. interval.Interval should be an alias for GenericInterval[int], and interval.Set should be an alias for GenericCanonicalSet[int].
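
A minimal sketch of what this proposal could look like, assuming a closed interval with `start` and `end` fields. The `Integer` constraint, the field names, and the `IsEmpty` and `Size` bodies are assumptions for illustration; the overflow guard mentioned above is deliberately left out to keep the example short.

```go
package main

import "fmt"

// Integer is an assumed constraint covering the signed integer types.
type Integer interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64
}

// GenericInterval is the closed interval [start, end] over T.
type GenericInterval[T Integer] struct {
	start, end T
}

func (i GenericInterval[T]) IsEmpty() bool { return i.end < i.start }

// Size returns the number of elements, in the interval's own integer
// type. No overflow guard here; a real implementation would need one.
func (i GenericInterval[T]) Size() T {
	if i.IsEmpty() {
		return 0
	}
	return i.end - i.start + 1
}

// Interval keeps the existing name as an alias for the int instantiation.
type Interval = GenericInterval[int]

func main() {
	ports := Interval{start: 1, end: 1024}
	small := GenericInterval[int8]{start: -3, end: 3}
	fmt.Println(ports.Size(), small.Size()) // prints "1024 7"
}
```

With the alias in place, existing callers that use `Interval` keep compiling, while new clients can pick a narrower or wider integer type explicitly.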

@elazarg elazarg requested a review from adisos July 8, 2024 15:13

2 participants