Why does the utf8.ValidString function fail to detect invalid Unicode characters?

From https://en.wikipedia.org/wiki/UTF-8#Invalid_code_points, I learned that U+D800 through U+DFFF are invalid code points. In decimal, that is 55296 through 57343.

The maximum valid Unicode code point is '\U0010FFFF', which is 1114111 in decimal.

My code:

package main

import "fmt"
import "unicode/utf8"

func main() {

    fmt.Println("Case 1(Invalid Range)")
    str := fmt.Sprintf("%c", rune(55296+1))
    if !utf8.ValidString(str) {
        fmt.Print(str, " is not a valid Unicode")
    } else {
        fmt.Println(str, " is valid unicode character")
    }

    fmt.Println("Case 2(More than maximum valid range)")
    str = fmt.Sprintf("%c", rune(1114111+1))
    if !utf8.ValidString(str) {
        fmt.Print(str, " is not a valid Unicode")
    } else {
        fmt.Println(str, " is valid unicode character")
    }
}

Why does ValidString not return false for the invalid Unicode characters given as input? I am sure my understanding is wrong somewhere; could someone explain?

Your problem happens in Sprintf. Because you give it an invalid code point, the %c verb replaces it with rune(65533), which is U+FFFD, the Unicode replacement character used in place of invalid characters. So the string you end up checking is valid UTF-8.

The same thing happens with a conversion such as str := string([]rune{ 55297 }), so the replacement occurs when a rune is encoded into a string, not when the rune value itself is created. This is not immediately obvious from https://blog.golang.org/strings
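To see the replacement for yourself, here is a minimal sketch. The helper name runesToString is made up for this illustration; it only wraps Go's built-in []rune to string conversion.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runesToString is a hypothetical helper that wraps Go's built-in
// []rune -> string conversion so the result can be inspected.
func runesToString(rs ...rune) string {
	return string(rs)
}

func main() {
	// 0xD801 (55297) is a surrogate; 0x110000 is one past the maximum code point.
	for _, r := range []rune{0xD801, 0x110000} {
		s := runesToString(r)
		// Both inputs are encoded as U+FFFD, so ValidString reports true.
		fmt.Printf("%#x -> %q, valid UTF-8: %t\n", r, s, utf8.ValidString(s))
	}
}
```

In both cases the string already holds the replacement character by the time ValidString sees it.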

If you want to force your string to contain invalid UTF-8, build it from raw bytes, which are not validated on conversion. You can write the first string like this:

str := string([]byte{237, 159, 193})
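As a quick check, the sketch below runs that byte sequence through utf8.ValidString and compares it with the bytes of the replacement character. The helper isValidUTF8 is a name invented for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isValidUTF8 is a hypothetical helper: it converts raw bytes to a string
// (no validation happens in that conversion) and then checks the encoding.
func isValidUTF8(b []byte) bool {
	return utf8.ValidString(string(b))
}

func main() {
	// 237 (0xED) starts a three-byte sequence, but 193 (0xC1) is not a
	// continuation byte, so the sequence is malformed.
	fmt.Println(isValidUTF8([]byte{237, 159, 193}))

	// EF BF BD is the encoding of U+FFFD, which is well-formed.
	fmt.Println(isValidUTF8([]byte{0xEF, 0xBF, 0xBD}))
}
```

The first call should report false and the second true, which is the difference between a genuinely malformed string and one that contains the replacement character.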

You take an invalid value and convert it using Sprintf. It is converted to the error value, utf8.RuneError (U+FFFD). You then check the error value, which is itself a valid Unicode code point.

package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {

    fmt.Println("Case 1: Invalid Range")
    str := fmt.Sprintf("%c", rune(55296+1))
    fmt.Printf("%q %X %d %d\n", str, str, []rune(str)[0], utf8.RuneError)
    if !utf8.ValidString(str) {
        fmt.Print(str, " is not a valid Unicode")
    } else {
        fmt.Println(str, " is valid unicode character")
    }

    fmt.Println("Case 2: More than maximum valid range")
    str = fmt.Sprintf("%c", rune(1114111+1))
    fmt.Printf("%q %X %d %d\n", str, str, []rune(str)[0], utf8.RuneError)
    if !utf8.ValidString(str) {
        fmt.Print(str, " is not a valid Unicode")
    } else {
        fmt.Println(str, " is valid unicode character")
    }

}

Output:

Case 1: Invalid Range
"�" EFBFBD 65533 65533
�  is valid unicode character
Case 2: More than maximum valid range
"�" EFBFBD 65533 65533
�  is valid unicode character
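
If the goal is to reject the code point itself, one option is to test the rune with utf8.ValidRune before it is ever formatted into a string. This is only a sketch; checkCodePoint and its messages are made up for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// checkCodePoint is a hypothetical helper that validates the rune value
// directly, before any conversion can substitute U+FFFD for it.
func checkCodePoint(r rune) string {
	if !utf8.ValidRune(r) {
		return fmt.Sprintf("U+%04X is not a valid Unicode code point", r)
	}
	return fmt.Sprintf("%c is a valid Unicode character", r)
}

func main() {
	fmt.Println(checkCodePoint(rune(55296 + 1)))   // surrogate range
	fmt.Println(checkCodePoint(rune(1114111 + 1))) // above the maximum
	fmt.Println(checkCodePoint('A'))
}
```

utf8.ValidRune reports false both for surrogates and for values above utf8.MaxRune, so it covers the two cases from the question.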