I'm trying to parse a string into a regular JSON struct in Go. I don't control the original string, and it may contain unwanted control characters, like this:
// \x1c and \x1a stand for the raw control bytes present in the input
originalstring := "{\"os\": \"\x1c09:@>A>DB Windows 8.1 \x1a>@?>@0B82=0O\"}"
input := []byte(originalstring)
var event JsonStruct
parsingError := json.Unmarshal(input, &event)
When I try to parse this in Go, I get this error:
invalid character '\x1c' in string literal
I previously handled this in Java like this:
event = charset.decode(charset.encode(event)).toString();
eventJSON = new JsonObject(event);
Any idea?
According to the JSON standard (ECMA-404, RFC 8259), control characters (U+0000 through U+001F) inside strings must be escaped for the JSON to be valid. If you want to preserve your control characters, you'll have to turn them into valid escape sequences; if you don't want to preserve them, you'll have to remove them before unmarshaling.
Here is an implementation of the latter:
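package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// stripCtlFromUTF8 removes ASCII control characters (code points below 32, and DEL).
func stripCtlFromUTF8(str string) string {
	return strings.Map(func(r rune) rune {
		if r >= 32 && r != 127 {
			return r
		}
		return -1 // a negative value tells strings.Map to drop the rune
	}, str)
}

func main() {
	// The raw control bytes are written as \x escapes here so they stay visible.
	js := []byte(stripCtlFromUTF8("{\"os\": \"\x1c09:@>A>DB Windows 8.1 \x1a>@?>@0B82=0O\"}"))
	t := struct {
		OS string
	}{}
	err := json.Unmarshal(js, &t)
	fmt.Println("error:", err)
	fmt.Println(t)
}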
On the playground: http://play.golang.org/p/QRtkS8LF1z
You need to convert the control characters to Unicode escape sequences in the notation \uYYYY,
where each Y is a hexadecimal digit. A working example of that is:
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"unicode"
)

// convert rewrites every control character as a \uXXXX escape sequence.
func convert(input string) string {
	var buf bytes.Buffer
	for _, r := range input {
		if unicode.IsControl(r) {
			fmt.Fprintf(&buf, "\\u%04X", r)
		} else {
			buf.WriteRune(r)
		}
	}
	return buf.String()
}

func main() {
	// The raw control byte is written as a \x escape here so it stays visible.
	input := convert("{\"os\": \"09:@>A>DB Windows 8.1 \x1a>@?>@0B82=0O\"}")
	fmt.Println(input)
	js := []byte(input)
	t := struct {
		OS string
	}{}
	err := json.Unmarshal(js, &t)
	fmt.Println("error:", err)
	fmt.Println(t)
}
Which produces:
{"os": "09:@>A>DB Windows 8.1 \u001A>@?>@0B82=0O"}
error: <nil>
{09:@>A>DB Windows 8.1 >@?>@0B82=0O}