Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SortedHashMap? #28

Closed
bruno-roustant opened this issue Sep 22, 2021 · 4 comments · Fixed by #29
Closed

SortedHashMap? #28

bruno-roustant opened this issue Sep 22, 2021 · 4 comments · Fixed by #29
Milestone

Comments

@bruno-roustant
Copy link
Collaborator

There are cases where we want a HashMap with iteration in sorted order. In those cases we can use the JDK TreeMap, but it uses more memory and provides slower O(log(N)) #get as well as slower iteration (2x slower than JDK HashMap in my benchmark).
Often the TreeMap is built once and is not modified afterward, in this case TreeMap is not really advantageous.

There could be an opportunity to have a SortedHashMap based on HPPC HashMap (open-addressing). It would require to extend HPPC HashMap and have a int[] orderArray sorted using HPPC IndirectSort (it could even sort on values). If the orderArray is built once and reused (for example discarded only if the map is modified, which does not happen in the use-case I describe), then such a map requires much less memory and provides O(1) #get and fast iteration.

But such a map has a fixed iteration order! So we get into big perf issue if we iterate it to add to another HPPC HashMap.

So here is my question:
Would it fit into HPPC with a warning that it must not be used to copy to another map (the same warning as ScatterMap had), or is it too dangerous or too exotic to fit?

@dweiss
Copy link
Member

dweiss commented Sep 23, 2021

Hi Bruno. I was wondering about adding sorted maps (and insertion-order-preserving maps) a few times but I just couldn't find the implementation scenario that would be generic and fast enough. The "iteration order" array seems more like a helpful specific façade - maintaining the entry order array is impossible while adding/removing elements so it's not like a general-purpose replacement for a TreeMap (and I think it would confuse people).

What you suggest makes sense to me as a lightweight, read-only view over a delegate map object, with the ability to create (or recreate) the iteration order array. Something like this, for example:

{code}
var delegate = new KTypeVTypeMap<Object, Object>();
var sortedMap = new SortedIterationKTypeVTypeMap<>(delegate, (Comparator) kvPair-> {...});
// sortedMap follows the order of the comparator. In assertions-mode it should also attempt to detect modifications to the delegate and fail on such modifications?
{code}

@bruno-roustant
Copy link
Collaborator Author

I like this idea of a read-only view with a delegate. It's even better than an ImmutableSortedMap builder because it offers wider use cases, while still being clear on the intended usage.
I'll try to sketch something.

@dweiss
Copy link
Member

dweiss commented Sep 23, 2021

Try it! I think you can try with one of the concrete classes first, the template can follow-up as it's more tedious to write.

@bruno-roustant
Copy link
Collaborator Author

Actually it could be nice to also have a KTypeVTypeImmutableSortedMap with a builder and taking a parameter to select performance or memory.
The immutable sorted map for performance would be a read-only sorted-iteration view on an internal KTypeVTypeHashMap (inaccessible thus immutable).
The immutable sorted map for memory would be composed of only 2 arrays for keys and values, of exactly the size required, and with a binary search for the get/contains. It would provide a very fast iteration.

@dweiss dweiss added this to the 0.9.1 milestone Dec 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants