If I have a struct like:
type Foo struct {
title string
Tags map[string]string
}
How might approach maintaining a unique set of such structs? From what I understand, although struct equality is a thing - map equality isn't. This means I can't compare my above structs. Therefore I can't just implement the map as set pattern.
The two options that might work I can think of are: convert the Tags to a sorted [][]string
or use reflect.Deepequal. Anyone have a better idea?
Depending on what you are doing, one option might be to store the structs as the values in the map rather than keys. For this to work, you will need to create some way to generate a unique key from each struct value though.
Something like this might work:
// Doesn't have to be a string: just has to be suitable for use as a map key.
func (foo *Foo) key() string {
return key_string
}
fooSet := make(map[string] *Foo)
// Store a Foo
fooSet[x.key()] = x
// Check if x is in the set:
if fooSet[x.key()] != nil {
println("x is in the set")
}
How well this works will depend on how efficient you can derive a key for your struct.
There are a few ways of implementing this. James Henstridge actually had a good idea, and I attempted to implement it. It performed pretty poorly just to use map in the first place, without my own hash algorithm.
The way I solved this problem is just keep an array of your structs and then remove any duplicates as you insert them.
package structset
type Foo struct {
title string
Tags map[string]string
}
func (f Foo) Equals(f2 Foo) bool {
if f.title != f2.title {
return false
}
if len(f.Tags) != len(f2.Tags) {
return false
}
for k, v := range f.Tags {
if w, ok := f2.Tags[k]; !ok || v != w {
return false
}
}
return true
}
type FooSet []Foo
func (this FooSet) Add(value Foo) {
if !this.Contains(value) {
this = append(this, value)
}
}
func (this FooSet) Length() int {
return len(this)
}
func (this FooSet) Contains(f Foo) bool {
for _, v := range this {
if v.Equals(f) {
return true
}
}
return false
}
func NewSet() FooSet {
return FooSet(make([]Foo, 0, 100))
}
I benchmarked this on my i7-3770K Windows machine and got:
BenchmarkSmallSetWithFewCollisions 50000 46615 ns/op
BenchmarkSmallSetWithMoreCollisions 50000 46575 ns/op
BenchmarkSmallSetWithManyCollisions 50000 46605 ns/op
BenchmarkMediumSetWithFewCollisions 1000 2335296 ns/op
BenchmarkMediumSetWithMoreCollisions 1000 2352298 ns/op
BenchmarkMediumSetWithManyCollisions 1000 2336796 ns/op
BenchmarkLargeSetWithFewCollisions 50 46805944 ns/op
BenchmarkLargeSetWithMoreCollisions 50 47376016 ns/op
BenchmarkLargeSetWithManyCollisions 50 46815946 ns/op
To eek out a very small amount of performance, you can insert all your data into the array first, and then remove all duplicates after.
The remove duplicates code is:
func (this FooSet) RemoveDuplicates() {
length := len(this) - 1
for i := 0; i < length; i++ {
for j := i + 1; j <= length; j++ {
if this[i].Equals(this[j]) {
this[j] = this[length]
this = this[0:length]
length--
j--
}
}
}
}
The benchmarks for this is:
BenchmarkSmallSetWithFewCollisions 50000 45245 ns/op
BenchmarkSmallSetWithMoreCollisions 50000 45615 ns/op
BenchmarkSmallSetWithManyCollisions 50000 45555 ns/op
BenchmarkMediumSetWithFewCollisions 1000 2294791 ns/op
BenchmarkMediumSetWithMoreCollisions 1000 2309293 ns/op
BenchmarkMediumSetWithManyCollisions 1000 2286290 ns/op
BenchmarkLargeSetWithFewCollisions 50 46235870 ns/op
BenchmarkLargeSetWithMoreCollisions 50 46515906 ns/op
BenchmarkLargeSetWithManyCollisions 50 45865824 ns/op
Here is the benchmark of just assigning Foo to a map[string]Foo.
BenchmarkSmallSetWithFewCollisions 50000 65718 ns/op
BenchmarkSmallSetWithMoreCollisions 50000 64238 ns/op
BenchmarkSmallSetWithManyCollisions 50000 55016 ns/op
BenchmarkMediumSetWithFewCollisions 500 3429435 ns/op
BenchmarkMediumSetWithMoreCollisions 500 3117395 ns/op
BenchmarkMediumSetWithManyCollisions 1000 2826858 ns/op
BenchmarkLargeSetWithFewCollisions 20 82635495 ns/op
BenchmarkLargeSetWithMoreCollisions 20 85285830 ns/op
BenchmarkLargeSetWithManyCollisions 20 73659350 ns/op
It seems to me even if a map was hashable, it still wouldn't perform very well.
Are you sure your example works? I believe you have to pass a pointer to the Add() method for your code to work. Anyways, here is my implementation:
package math
// types
type IntPoint struct {
X, Y int
}
// set implementation for small number of items
type IntPointSet struct {
slice []IntPoint
}
// functions
func (p1 IntPoint) Equals(p2 IntPoint) bool {
return (p1.X == p2.X) && (p1.Y == p2.Y)
}
func (set *IntPointSet) Add(p IntPoint) {
if ! set.Contains(p) {
set.slice = append(set.slice, p)
}
}
func (set IntPointSet) Contains(p IntPoint) bool {
for _, v := range set.slice {
if v.Equals(p) {
return true
}
}
return false
}
func (set IntPointSet) NumElements() int {
return len(set.slice)
}
func NewIntPointSet() IntPointSet {
return IntPointSet{(make([]IntPoint, 0, 10))}
}