Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need a std::map or unordered_map implementation. #346

Open
nicehero opened this issue Jan 14, 2022 · 2 comments
Open

Need a std::map or unordered_map implementation. #346

nicehero opened this issue Jan 14, 2022 · 2 comments

Comments

@nicehero
Copy link

Describe the proposed feature
add type
jsoncons::mjson (std::map or unordered_map implementation)
jsoncons::mojson (preserve order std::map or unordered_map implementation)
like
jsoncons::json (sort vector implementation)
jsoncons::ojson (preserve order sort vector implementation)

The performance of insert is too low.
https://github.com/nicehero/json_benchmark

And I love other featrues from this project such as proxy_type ,bson support etc.

nlohmann insert&erase times:1000 cost:0.000472 qps:2117616
jsoncons insert&erase times:1000 cost:0.011752 qps:85090
rapidjson insert&erase times:1000 cost:0.002199 qps:454700

jpg

@danielaparker
Copy link
Owner

danielaparker commented Jan 14, 2022

Thanks for posting this, the results are quite interesting. It's uncommon to see json benchmarks that go beyond parsing and serializing.

In terms of structure, nlohmann stores json objects as a std::map<key,value>, jsoncons as a std::vector of key-value pairs sorted by name, and rapidjson as an unsorted array of key- value pairs. Consequently, I would expect rapidjson to perform better than nlohmann and jsoncons for inserts, and worse for access, which is reflected in your tests.

Of note, by sticking to sequential data structures, jsoncons and rapidjson have more compact data representations and use less memory than nlohmann.

But it seems jsoncons performs significantly poorer than nlohmann for inserts. I downloaded your project and did some experiments. I tried pre-allocating memory for the object storage with jj.reserve(), and using jj.try_emplace() rather than jj[str] =, but neither made much difference. It seems most of the time is spent moving the old key-value pairs to make room for the new, rather than in allocations. For additional confirmation, I tried inserting the keys in sorted order, and for that special case jsoncons was much faster than nlohmann.

One thing though is that your test is inserting 500 key-value pairs into an object, that seems to me to be a lot of object members. On my machine, up until about 50 keys, which seems to me to be a more typical number, jsoncons is faster than nlohmann, than falls behind.

However, let's suppose we want to match nlohmann for inserts of larger numbers (> 50) of key-value pairs. One option as you propose is to support std::map or std::unordered_map as a backing store. I haven't fully thought out how feasible that is.

As another option, note that appending all the new items to a vector and then sorting once at the end (which is what jsoncons::json::parse does) is much faster than keeping the vector in sorted order for each insert (which is what the json insert functions do.) Given that in your benchmarks (and mine) jsoncons parsing is somewhat faster than nlohmann, I think we could support comparable insert performance for larger objects by providing a special interface for anyone that needed that, perhaps something like

json j; // Original value

json_builder  builder(std::move(j));

for (int i = 0; i < 500; ++i)
    builder.insert(key,val); // appends

j = builder.get_result(); // sorts and moves result into j

Any comments welcome.

Daniel

@danielaparker
Copy link
Owner

danielaparker commented Mar 7, 2022

After considering this request, I think I want to stick to the flat map memory layout, I don't think I want to introduce std::map or std::unordered_map (which have different iterator requirements) as alternative backing stores. Rather, I'd like to support an efficient unordered map for JSON objects following robin-hood-hashing, using the flat array memory layout. So when that's done we'll have an additional typedef ujson, which should perform well on your benchmarks. I think it will also prove to be a useful intermediate type for mapping JSON objects to C++ data structures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants