How to check if a given string is in form of multiple json string separated by spaces/newline?
For example,
given: "test" 123 {"Name": "mike"}
(3 json concatenated with space)
return: true
, since each of item ("test"
123
and {"Name": "mike"}
) is a valid json.
In Go, I can write a O(N^2) function like:
// check given string is json or multiple json concatenated with space/newline
func validateJSON(str string) error {
// only one json string
if isJSON(str) {
return nil
}
// multiple json string concatenate with spaces
str = strings.TrimSpace(str)
arr := []rune(str)
start := 0
end := 0
for start < len(str) {
for end < len(str) && !unicode.IsSpace(arr[end]) {
end++
}
substr := str[start:end]
if isJSON(substr) {
for end < len(str) && unicode.IsSpace(arr[end]) {
end++
}
start = end
} else {
if end == len(str) {
return errors.New("error when parsing input: " + substr)
}
for end < len(str) && unicode.IsSpace(arr[end]) {
end++
}
}
}
return nil
}
func isJSON(str string) bool {
var js json.RawMessage
return json.Unmarshal([]byte(str), &js) == nil
}
But this won't work for large input.
There are two options. The simplest, from a coding standpoint, is going to be just to decode the JSON string normally. You can make this most efficient by decoding to an empty struct:
package main
import "encoding/json"
func main() {
input := []byte(`{"a":"b", "c": 123}`)
var x struct{}
if err := json.Unmarshal(input, &x); err != nil {
panic(err)
}
input = []byte(`{"a":"b", "c": 123}xxx`) // This one fails
if err := json.Unmarshal(input, &x); err != nil {
panic(err)
}
}
This method has a few potential drawbacks:
interface{}
must be used, which introduces the potential for some serious performance penalties.x
value must still be allocated, and at least one reflection call is likely under the sheets, which may introduce a noticeable performance penalty for some workloads.Given these limitations, my recommendation is to use the second option: loop through the entire JSON input, ignoring the actual contents. This is made simple with the standard library json.Decoder:
package main
import (
"bytes"
"encoding/json"
"io"
)
func main() {
input := []byte(`{"a":"b", "c": 123}`)
dec := json.NewDecoder(bytes.NewReader(input))
for {
_, err := dec.Token()
if err == io.EOF {
break // End of input, valid JSON
}
if err != nil {
panic(err) // Invalid input
}
}
input = []byte(`{"a":"b", "c": 123}xxx`) // This input fails
dec = json.NewDecoder(bytes.NewReader(input))
for {
_, err := dec.Token()
if err == io.EOF {
break // End of input, valid JSON
}
if err != nil {
panic(err) // Invalid input
}
}
}
As Volker mentioned in the comments, use a *json.Decoder to decode all json documents in your input successively:
package main
import (
"encoding/json"
"io"
"log"
"strings"
)
func main() {
input := `"test" 123 {"Name": "mike"}`
dec := json.NewDecoder(strings.NewReader(input))
for {
var x json.RawMessage
switch err := dec.Decode(&x); err {
case nil:
// not done yet
case io.EOF:
return // success
default:
log.Fatal(err)
}
}
}
Try it on the playground: https://play.golang.org/p/1OKOii9mRHn
Try fastjson.Scanner:
s := `"test" 123 {"Name": "mike"}`
var sc fastjson.Scanner
sc.Init(s)
// Iterate over a stream of json objects
for sc.Next() {}
if sc.Error() != nil {
fmt.Println("ok")
} else {
fmt.Println("false")
}