Believe it or not, it appears that the iota (the last letter) of this word has been encoded in two different ways in Unicode:
I assume that sometimes the letter is being encoded as a single letter, and at other times it is encoded as a letter+accent.
Is there some kind of map or database for converting between the two forms that I can import into my code?
Believe it or not
Let's leave the world of fantasy.
Duplicated vowel+oxia characters in Greek Unicode range
Unicode: Frequently Asked Questions: Normalization
The Go Blog: Text normalization in Go
For example,
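package main

import (
	"bytes"
	"fmt"

	"golang.org/x/text/unicode/norm"
)

// Equal reports whether a and b are equivalent under Unicode normalization.
// It walks both strings one NFKD-normalized segment at a time, so neither
// string has to be normalized in full before comparing.
func Equal(a, b string) bool {
	var ia, ib norm.Iter
	ia.InitString(norm.NFKD, a)
	ib.InitString(norm.NFKD, b)
	for !ia.Done() && !ib.Done() {
		if !bytes.Equal(ia.Next(), ib.Next()) {
			return false
		}
	}
	// Equal only if both iterators ran out at the same time.
	return ia.Done() && ib.Done()
}

func main() {
	a := "εἰμ\u03AF" // final iota as U+03AF (iota with tonos)
	b := "εἰμ\u1F77" // final iota as U+1F77 (iota with oxia)
	fmt.Println(a)
	fmt.Println(b)
	fmt.Println(a == b)      // byte-wise comparison
	fmt.Println(Equal(a, b)) // comparison after normalization
}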
Output:
εἰμί
εἰμί
false
true
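
If you want to convert rather than just compare, the general route is to normalize every string to one form with the same package (for instance norm.NFC.String(s)) before storing or comparing it. If pulling in golang.org/x/text is not an option, here is a minimal standard-library sketch of the "map" the question asks about. The table is hand-written and deliberately partial: it covers only the seven lowercase vowel+oxia code points of the Greek Extended block and rewrites them to their vowel+tonos counterparts. FoldOxia and oxiaToTonos are names made up for this illustration, not part of any library.

```go
package main

import (
	"fmt"
	"strings"
)

// oxiaToTonos rewrites lowercase vowel+oxia code points (Greek Extended
// block) to the equivalent vowel+tonos code points (basic Greek block).
// Assumption: only these seven pairs matter for your data. Uppercase and
// dialytika variants are intentionally left out of this sketch.
var oxiaToTonos = strings.NewReplacer(
	"\u1F71", "\u03AC", // alpha
	"\u1F73", "\u03AD", // epsilon
	"\u1F75", "\u03AE", // eta
	"\u1F77", "\u03AF", // iota
	"\u1F79", "\u03CC", // omicron
	"\u1F7B", "\u03CD", // upsilon
	"\u1F7D", "\u03CE", // omega
)

// FoldOxia (hypothetical helper) returns s with every vowel+oxia code point
// listed in the table replaced by its vowel+tonos twin.
func FoldOxia(s string) string {
	return oxiaToTonos.Replace(s)
}

func main() {
	a := "εἰμ\u03AF" // iota with tonos
	b := "εἰμ\u1F77" // iota with oxia
	fmt.Println(a == b)
	fmt.Println(FoldOxia(a) == FoldOxia(b))
}
```

This prints false and then true. Unlike full normalization it leaves combining sequences (letter followed by a separate accent) untouched, so treat it as a narrow fix for the duplicated oxia/tonos characters only.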