I'm wondering why JSON serialization of structs containing large strings is slow in Crystal.
The following code performs rather poorly:
struct Page
  include AutoJson
  field :uri, String
  field :html, String
end
page = Page.new(url, html) # html is a string containing ±128KB of html
page.to_json
Whereas the following code in JavaScript (Node.js) or Go is pretty much instantaneous (roughly 10 to 20 times faster):
Node.js
page = { url: url, html: html }
JSON.stringify(page)
Go
type Page struct {
	Uri  string `json:"uri"`
	Html string `json:"html"`
}
page := Page{uri, html}
data, _ := json.Marshal(page)
Considering Crystal is usually very fast (on par with Go and much faster than V8 Javascript) it kinda left me wondering what was going on here.
I've been experimenting with the Crystal code a little bit and it seems as if the incriminating bit here is the double-quote string escaping of large strings (which is obviously required when serializing json objects). But why would it take so long, I don't know (multiple allocations, copies?).
For the record, in these examples, html is a roughly 128KB HTML file loaded from disk using whatever synchronous method is available. File reading operations are obviously not taken into consideration when benchmarking these snippets.
Like many other APIs, Crystal's JSON implementation is not really optimized for speed. So far the goal has merely been to get it working. That is actually already quite fast for most use cases, but there are certainly huge improvements awaiting.
I'm not sure what exactly the reason is here. It might be related to string escaping, although this needs to be done in other languages as well.
Regarding the comparison to JavaScript, transforming an object to JSON is actually quite performant there because JSON is a native data format of JavaScript and its serialization is implemented very efficiently. It is not evaluated as dynamic code but runs as compiled code inside the JavaScript VM.
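To make the escaping step concrete, here is a minimal Go sketch of the work a serializer has to do for every string value. It is only an illustration under some assumptions: the function name `escapeJSONString` is made up, the input is assumed to be valid UTF-8, and this is not how Crystal's or Go's standard library is actually implemented (Go's `encoding/json`, for instance, additionally escapes `<`, `>` and `&` by default). The point is that escaping can be done in a single pass with one up-front allocation, copying runs of ordinary bytes in bulk, so escaping by itself does not have to be expensive.

```go
package main

import (
	"fmt"
	"strings"
)

const hexDigits = "0123456789abcdef"

// escapeJSONString (hypothetical helper) returns s as a quoted JSON
// string literal. Runs of bytes that need no escaping are copied in one
// call; only the special bytes are handled individually.
func escapeJSONString(s string) string {
	var b strings.Builder
	b.Grow(len(s) + 2) // a single allocation covers the common case
	b.WriteByte('"')
	start := 0
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c >= 0x20 && c != '"' && c != '\\' {
			continue // ordinary byte, stays part of the current run
		}
		b.WriteString(s[start:i]) // flush the run before the special byte
		switch c {
		case '"':
			b.WriteString(`\"`)
		case '\\':
			b.WriteString(`\\`)
		case '\n':
			b.WriteString(`\n`)
		case '\r':
			b.WriteString(`\r`)
		case '\t':
			b.WriteString(`\t`)
		default:
			// Remaining control characters are written as \u00XX.
			b.WriteString(`\u00`)
			b.WriteByte(hexDigits[c>>4])
			b.WriteByte(hexDigits[c&0xf])
		}
		start = i + 1
	}
	b.WriteString(s[start:]) // flush whatever follows the last special byte
	b.WriteByte('"')
	return b.String()
}

func main() {
	fmt.Println(escapeJSONString("<a href=\"x\">\n</a>"))
}
```

If an implementation instead appends to the output one character at a time, or builds intermediate strings for each escape, the cost grows quickly on inputs of 100 KB or more, which would be one plausible explanation for the slowdown.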
Try:
crystal build test.cr --release --no-debug
If that doesn't resolve the issue, then it would be worthwhile creating a ticket at https://github.com/crystal-lang/crystal/issues
The --no-debug flag may not be necessary, but as of this writing, there is an open issue indicating that in some contexts it is:
I tested this with Crystal 0.25.1 (LLVM 6.0.1), Go 1.10.3 and Node.js v8.11.2 on macOS x86_64.
The examples all read a 161 KB HTML file into a string, open a tempfile and do 10,000 iterations of serializing the page object and writing it to the file.
This generates about 1.5 GB of JSON. The system has a very fast PCIe SSD, so IO throughput is not a bottleneck.
I chose to actually write the data to a file to make sure compilers cannot optimize the function calls away.
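For anyone who wants to time the serialization loop itself, here is a rough Go sketch of a timing harness. It is not the code used for the measurements in this answer. The names `timeMarshal` and `countingWriter` are hypothetical, the input is a generated placeholder of roughly 161 KB rather than the real index.html, and the iteration count is reduced to 1000 to keep the run short. The counting sink still consumes every serialized buffer, so the calls cannot be dropped, but it takes the disk out of the picture; passing an `*os.File` instead reproduces the setup described above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
	"time"
)

type Page struct {
	Uri  string `json:"uri"`
	Html string `json:"html"`
}

// countingWriter (hypothetical) discards data but keeps the byte count,
// so the result of every Marshal call is still consumed.
type countingWriter struct{ n int64 }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += int64(len(p))
	return len(p), nil
}

// timeMarshal serializes page n times into w and returns the number of
// bytes written together with the elapsed wall-clock time.
func timeMarshal(w io.Writer, page Page, n int) (int64, time.Duration, error) {
	var total int64
	start := time.Now()
	for i := 0; i < n; i++ {
		data, err := json.Marshal(page)
		if err != nil {
			return total, time.Since(start), err
		}
		written, err := w.Write(data)
		total += int64(written)
		if err != nil {
			return total, time.Since(start), err
		}
	}
	return total, time.Since(start), nil
}

func main() {
	// Placeholder input of about 161 KB; the original test read index.html.
	html := strings.Repeat("<p class=\"x\">hello</p>\n", 7000)
	page := Page{"http://www.example.org", html}

	var sink countingWriter
	total, elapsed, err := timeMarshal(&sink, page, 1000)
	if err != nil {
		panic(err)
	}
	mb := float64(total) / 1e6
	fmt.Printf("%.1f MB in %v (%.1f MB/s)\n", mb, elapsed, mb/elapsed.Seconds())
}
```

Running the same loop once against the counting sink and once against a temp file gives a quick check of whether IO really is negligible on a given machine.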
Crystal
require "json"
require "tempfile"
url = "http://www.example.org"
html = File.read("index.html")
record(Page, uri : String, html : String) do
  include JSON::Serializable
end
Tempfile.open("foo") do |io|
  10_000.times do
    page = Page.new(url, html)
    page.to_json(io)
  end
end
Go
package main
import (
	"encoding/json"
	"io/ioutil"
	"log"
	"os"
)
// Struct tags use the key:"value" form; a tag written as json="uri" is ignored.
type Page struct {
	Uri  string `json:"uri"`
	Html string `json:"html"`
}
func main() {
	buf, err := ioutil.ReadFile("index.html")
	if err != nil {
		log.Fatal(err)
	}
	uri := "http://www.example.org"
	html := string(buf)
	file, err := ioutil.TempFile(os.TempDir(), "foo")
	if err != nil {
		log.Fatal(err)
	}
	defer os.Remove(file.Name())
	for i := 0; i < 10000; i++ {
		page := Page{uri, html}
		// Named data rather than json to avoid shadowing the package.
		data, err := json.Marshal(page)
		if err != nil {
			log.Fatal(err)
		}
		_, err = file.Write(data)
		if err != nil {
			log.Fatal(err)
		}
	}
}
Node.js
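const fs = require('fs')
const tmp = require('tmp')
const uri = 'http://www.example.org'
// Pass an encoding so html is a string; without it readFileSync returns a Buffer,
// which JSON.stringify would serialize as an array of byte values.
const html = fs.readFileSync('index.html', 'utf8')
tmp.file((err, path, fd) => {
  if (err) throw err;
  for (let i = 0; i < 10000; i++) {
    const page = { uri, html }
    const json = JSON.stringify(page)
    fs.writeSync(fd, json)
  }
})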
Results
Note that I compiled the Crystal example with --release and updated the code for 0.25.1.
The Node.js example used v8 instead of v10, because v10 was incompatible with the node-tmp npm module I used for tempfiles.
The benchmarks were done on an early-2015 13" Retina MacBook Pro with i7-5557U CPU, 16 GB RAM and 1 TB PCIe SSD.