I have a text dump file with strings like this one:
x\x9cK\xb42\xb5\xaa.\xb6\xb2\xb0R\xcaK-\x09J\xccKOU
I need to convert them to []byte.
Can someone please suggest how this can be done in Go?
The Python equivalent is decode('string_escape').
Here is one way of doing it. Note that this isn't a complete decoder for Python's string_escape format, but it may be sufficient for the example you've given.
package main

import (
	"fmt"
	"log"
	"regexp"
	"strconv"
)

func main() {
	b := []byte(`x\x9cK\xb42\xb5\xaa.\xb6\xb2\xb0R\xcaK-\x09J\xccKOU`)
	// Match a literal backslash, an 'x', and exactly two hex digits.
	re := regexp.MustCompile(`\\x([0-9a-fA-F]{2})`)
	r := re.ReplaceAllFunc(b, func(in []byte) []byte {
		// in is the whole match (e.g. `\x9c`); skip the leading `\x`.
		i, err := strconv.ParseInt(string(in[2:]), 16, 64)
		if err != nil {
			log.Fatalf("Failed to convert hex: %s", err)
		}
		return []byte{byte(i)}
	})
	fmt.Println(r)
	fmt.Println(string(r))
}
I did have the idea of using the json decoder, but unfortunately it doesn't understand the \xYY syntax.
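A possible alternative to the json decoder is strconv.Unquote, which does understand \xYY because it parses Go string literal syntax. The sketch below assumes the input contains no unescaped double quotes or raw newlines and uses only escapes that Go also accepts; unquoteEscaped is a hypothetical helper name, not something from the standard library.

```go
package main

import (
	"fmt"
	"log"
	"strconv"
)

// unquoteEscaped is a hypothetical helper: it wraps s in double quotes and
// lets strconv.Unquote interpret the result as a Go string literal.
// Assumption: s has no unescaped '"' and no raw newline, otherwise Unquote
// returns an error.
func unquoteEscaped(s string) ([]byte, error) {
	out, err := strconv.Unquote(`"` + s + `"`)
	if err != nil {
		return nil, fmt.Errorf("unquote %q: %w", s, err)
	}
	return []byte(out), nil
}

func main() {
	b, err := unquoteEscaped(`x\x9cK\xb42\xb5\xaa.\xb6\xb2\xb0R\xcaK-\x09J\xccKOU`)
	if err != nil {
		log.Fatal(err)
	}
	// Print the decoded bytes as space-separated hex.
	fmt.Printf("% x\n", b)
}
```

Go's escape rules are not identical to Python's: for example, Go rejects \' inside a double-quoted literal and requires exactly three digits in an octal escape, so this is not a drop-in replacement for string_escape.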
Here's how you might approach writing a little parser (if you needed to support other escape sequences in the future):
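import (
	"encoding/hex"
	"fmt"
)

// decode converts \xYY escapes in bs to raw bytes. For any other escaped
// character it drops the backslash and keeps the character itself.
func decode(bs string) ([]byte, error) {
	in := []byte(bs)
	res := make([]byte, 0, len(in))
	esc := false
	for i := 0; i < len(in); i++ {
		switch {
		case esc:
			// Check esc first so that an escaped backslash (\\) is emitted
			// as a literal backslash instead of starting a new escape.
			switch {
			case in[i] == 'x':
				// Guard against a truncated escape such as a trailing \x4.
				if i+3 > len(in) {
					return nil, fmt.Errorf("truncated \\x escape at offset %d", i-1)
				}
				b, err := hex.DecodeString(string(in[i+1 : i+3]))
				if err != nil {
					return nil, err
				}
				res = append(res, b...)
				i += 2
			default:
				res = append(res, in[i])
			}
			esc = false
		case in[i] == '\\':
			esc = true
		default:
			res = append(res, in[i])
		}
	}
	if esc {
		return nil, fmt.Errorf("trailing backslash at end of input")
	}
	return res, nil
}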
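As a rough illustration of supporting other escapes, the sketch below adds a small lookup table for single-character escapes next to the \xYY case. The names decodeEscapes and simpleEscapes are hypothetical, and the table is an assumed subset of common escapes, not the full Python string_escape set (octal escapes, for instance, are not handled).

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// simpleEscapes maps single-character escapes to their byte values.
// Assumption: this subset is enough for the data at hand.
var simpleEscapes = map[byte]byte{
	'n': '\n', 't': '\t', 'r': '\r', '\\': '\\', '\'': '\'', '"': '"',
}

// decodeEscapes is a hypothetical helper that decodes \xYY and the
// escapes listed in simpleEscapes.
func decodeEscapes(s string) ([]byte, error) {
	res := make([]byte, 0, len(s))
	for i := 0; i < len(s); i++ {
		if s[i] != '\\' {
			res = append(res, s[i])
			continue
		}
		i++ // move to the character after the backslash
		if i >= len(s) {
			return nil, fmt.Errorf("trailing backslash at offset %d", i-1)
		}
		if s[i] == 'x' {
			// A \x escape needs two hex digits after the 'x'.
			if i+2 >= len(s) {
				return nil, fmt.Errorf("truncated \\x escape at offset %d", i-1)
			}
			b, err := hex.DecodeString(s[i+1 : i+3])
			if err != nil {
				return nil, fmt.Errorf("bad \\x escape at offset %d: %w", i-1, err)
			}
			res = append(res, b...)
			i += 2
			continue
		}
		if b, ok := simpleEscapes[s[i]]; ok {
			res = append(res, b)
			continue
		}
		// Unknown escape: keep the backslash and the character as-is.
		res = append(res, '\\', s[i])
	}
	return res, nil
}

func main() {
	out, err := decodeEscapes(`a\x41\tb\\n`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", out)
}
```

Keeping unknown escapes verbatim is a design choice in this sketch; the parser above drops the backslash instead, so pick whichever behaviour matches the dump format.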