top | item 27049637

(no title)

WillAbides | 4 years ago

I apparently had a fundamental misunderstanding of the meaning of branchless. That's embarrassing.

As for simdjson-go, I did benchmark it. rjson outperformed simdjson-go in most benchmarks, but simdjson-go was about 3% faster reading citm_catalog.json.

https://github.com/WillAbides/rjson#simdjson

Edit:

I see on your bio that you are one of the simdjson authors. I hope you will indulge a question about it. I am generally more interested parsing a large volume of json documents efficiently than I am in doing it quickly. Since simdjson uses parallel instructions, would that mean that the speedup comes at the expense of being able to process more documents in parallel on the same hardware?

discuss

order

glangdale|4 years ago

I'm not sure I believe your benchmarks, honestly. They are way slower than the C++ version. But I don't really have a horse in the race as simdjson-go is its own thing. Still, I'm mildly surprised that a byte-by-byte parser is the same speed as a SIMD approach, and usually when someone tells me this, the answer is usually that they buggered up the benchmarking. I would suggest comparing your numbers to the simdjson paper or whatever brag numbers simdjson-go has as a reality check; one of us is clearly wrong.

The use of parallelism in simdjson is the use of SIMD instructions, not multiple cores. So the answer is "no" to the second question.