What is the difference between the length of a rune slice and RuneCountInString?

We can get the number of runes in a string by getting the length of the rune slice converted from the string.

s := "世界"
runes := []rune(s)
fmt.Println(len(runes))

Or we can use the RuneCountInString function in the unicode/utf8 package.

fmt.Println(utf8.RuneCountInString(s))

What's the difference between the two?

The difference is that the first one:

runes  := []rune(s)
length := len(runes)

has to step through s to build a slice of runes and then ask that slice for its length, whereas utf8.RuneCountInString simply steps through s, incrementing a counter each time it decodes a sequence of contiguous bytes that makes up one UTF-8 encoded rune.

So the []rune(s) version has to do more work than utf8.RuneCountInString does: it builds and fills a slice that is only used to read its length.


A cursory look around the source suggests that []rune(someString) is implemented by stringtoslicerune, which actually makes two passes over the string: one to find out how many runes there are and another to copy those runes into a slice. I'm not certain about this, as I'm not that familiar with the implementation details of Go.