How do I process strings containing emoji (decode or remove invalid Unicode code points) in Go?

Example string:

"\u0410\u043b\u0435\u043a\u0441\u0430\u043d\u0434\u0440\u044b! 
\u0421\u043f\u0430\u0441\u0438\u0431\u043e \ud83d\udcf8 link.ru \u0437\u0430 
#hashtag  Русское слово, an English word"

Without this \ud83d\udcf8 my func works well:

func convertUnicode(text string) string {
    s, err := strconv.Unquote(`"` + text + `"`)
    if err != nil {
        // Error.Printf("can't convert: %s | err: %s\n", text, err)
        return text
    }
    return s
}

My question is: how can I detect that the text contains entries of this kind? And how can I convert them to emoji, or remove them from the text? Thanks.

Well, it is probably not that simple: neither \ud83d nor \udcf8 is a valid code point on its own, but together they form a surrogate pair, which UTF-16 uses to encode \U0001F4F8. strconv.Unquote will not combine the two surrogate halves for you, so you have to combine them yourself.

  1. Use strconv.Unquote to unquote as you did.
  2. Convert to []rune for convenience.
  3. Find surrogate pairs with unicode/utf16.IsSurrogate.
  4. Combine surrogate pairs with unicode/utf16.DecodeRune.
  5. Convert back to string.
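
The steps above can be sketched roughly as follows, with one caveat that is an assumption about current Go releases: strconv.Unquote appears to reject an escaped surrogate half such as \ud83d with a syntax error instead of returning it, and converting a lone surrogate rune to a string yields U+FFFD. This sketch therefore reorders the steps. It first finds the escaped pairs with a regular expression, combines each pair with utf16.DecodeRune, drops any unpaired half, and only then calls strconv.Unquote. The regular expressions and their names (surrogatePair, loneSurrogate) are illustrative and not part of the original code.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"unicode/utf16"
)

// surrogatePair matches an escaped high surrogate (D800-DBFF) immediately
// followed by an escaped low surrogate (DC00-DFFF), e.g. \ud83d\udcf8.
var surrogatePair = regexp.MustCompile(
	`\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})`)

// loneSurrogate matches any escaped surrogate half that is still left over.
var loneSurrogate = regexp.MustCompile(`\\u[dD][89a-fA-F][0-9a-fA-F]{2}`)

func convertUnicode(text string) string {
	// Replace each escaped surrogate pair with the character it encodes.
	text = surrogatePair.ReplaceAllStringFunc(text, func(m string) string {
		sub := surrogatePair.FindStringSubmatch(m)
		hi, _ := strconv.ParseUint(sub[1], 16, 32) // regexp guarantees valid hex
		lo, _ := strconv.ParseUint(sub[2], 16, 32)
		return string(utf16.DecodeRune(rune(hi), rune(lo)))
	})

	// Unpaired halves cannot be decoded, so remove them.
	text = loneSurrogate.ReplaceAllString(text, "")

	// The remaining escapes are ordinary ones that strconv.Unquote accepts.
	s, err := strconv.Unquote(`"` + text + `"`)
	if err != nil {
		return text
	}
	return s
}

func main() {
	in := `\u0421\u043f\u0430\u0441\u0438\u0431\u043e \ud83d\udcf8 link.ru`
	fmt.Println(convertUnicode(in)) // Спасибо 📸 link.ru
}
```

To remove the emoji instead of converting it, return an empty string from the replacement function. Note that this sketch does not treat an escaped backslash directly before a surrogate escape (as in \\ud83d) specially, and that strconv.Unquote still fails on input containing a raw newline or an unescaped double quote.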