I'm writing a Go program that takes a list of strings and sorts them into buckets by the first character of each string. However, I want it to group accented characters with the unaccented character they most resemble. So if I have a bucket for the letter A, I want strings that start with Á to go into it as well.
Does Go have anything built in for mapping an accented character to its base letter, or is my best bet a large switch statement listing every character and its accented variations?
It looks like there are some add-on packages for this. Here's an example:
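package main

import (
	"fmt"

	// The go.text packages formerly hosted on code.google.com now live under golang.org/x/text.
	"golang.org/x/text/collate"
	"golang.org/x/text/language"
)

func main() {
	strs := []string{"abc", "áab", "aaa"}
	// collate.Loose makes the collator ignore diacritics, case and width,
	// so "á" is compared as if it were "a".
	cl := collate.New(language.English, collate.Loose)
	cl.SortStrings(strs)
	fmt.Println(strs)
}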
outputs:
[aaa áab abc]
Also, check out the following reference on text normalization: http://blog.golang.org/normalization