[Durability] Increased memory usage post Serde. #419

EthanEChristian · 2019-02-05T15:47:37Z

I have been investigating memory usage when creating and serializing sessions, and I ran across the fact that sessions consume less heap before serialization than after it.

For example, I have a session that has 31478 productions. Looking at heap dumps:

Pre-Serialization:

Post-Serialization (new JVM):

Talking with @mrrodriguez, he pointed to the Fressian handlers for the Clojure data structures. The problem with them is that, pre-serialization, a lot of the Clara rulebase uses common references to constraints from rules, but when serialized these references are not recreated, meaning that large portions of productions become duplicated.

I'm going to try to use an identity-based cache when serializing the session, so that references can be maintained during deserialization, similar to the way we handle records today.
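
To make the identity-based cache idea concrete, here is a minimal sketch in Java. It is an illustration only and assumes nothing about the actual Fressian handlers or the Clara durability code: the class `IdentityCacheSketch`, the `Token` type, and the `write`/`read` methods are all hypothetical, and the "wire format" is just an in-memory list. The writer gives each distinct object (by reference, not by equality) an id the first time it is seen and emits a back-reference afterwards; the reader rebuilds one object per id, so shared references survive the round trip.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Clara or Fressian API.
final class IdentityCacheSketch {

    // One entry in the pretend "wire format": a full value the first time an
    // object is seen, or a back-reference (value == null) on later sightings.
    static final class Token {
        final int id;
        final List<String> value;

        Token(int id, List<String> value) {
            this.id = id;
            this.value = value;
        }
    }

    // Serialize: key the cache on object identity so that two references to
    // the same constraint are written once and then referred to by id.
    static List<Token> write(List<List<String>> constraints) {
        Map<List<String>, Integer> seen = new IdentityHashMap<>();
        List<Token> out = new ArrayList<>();
        for (List<String> c : constraints) {
            Integer id = seen.get(c);
            if (id == null) {
                id = seen.size();
                seen.put(c, id);
                // The copy stands in for bytes leaving the JVM.
                out.add(new Token(id, new ArrayList<>(c)));
            } else {
                out.add(new Token(id, null));
            }
        }
        return out;
    }

    // Deserialize: ids were assigned sequentially, so the list index is the id.
    static List<List<String>> read(List<Token> tokens) {
        List<List<String>> byId = new ArrayList<>();
        List<List<String>> out = new ArrayList<>();
        for (Token t : tokens) {
            if (t.value != null) {
                List<String> fresh = new ArrayList<>(t.value);
                byId.add(fresh);
                out.add(fresh);
            } else {
                // Reuse the object already rebuilt for this id.
                out.add(byId.get(t.id));
            }
        }
        return out;
    }
}
```

If the same constraint list is passed twice to `write`, the result of `read` holds two references to a single rebuilt list rather than two equal copies, which is the property the duplicated productions are missing.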

WilliamParker · 2019-02-05T23:39:48Z

I'm surprised that the difference is so large, actually; I suspect that there is indeed room for significant improvement. Keep in mind that when you perform operations on a Clojure data structure, depending on the particular case, both the original data structure and the new view will share some underlying Java objects. This is how Clojure performs these operations efficiently - if it had to copy the whole structure on every write it would be much slower. This blog post is a good place to start if you haven't read about this before: https://hypirion.com/musings/understanding-persistent-vector-pt-1 . Unfortunately, taking advantage of this sharing could be difficult or infeasible during SerDe, and you might have to weigh CPU vs. memory costs. However, IMO there are good odds that there is much lower-hanging fruit available to reduce this increase in memory use.
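
As a rough illustration of the structural sharing described above, the following Java sketch uses a tiny immutable linked list. It is a deliberately simplified stand-in, not Clojure's actual persistent vector or list implementation, and the names `PersistentListSketch`, `Node`, `cons`, `size`, and `deepCopy` are hypothetical. Prepending an element creates one new node and reuses the existing list as its tail; a naive copy, like a serializer that is unaware of sharing, allocates every node again.

```java
// Hypothetical illustration only; not Clojure's real data structures.
final class PersistentListSketch {

    // An immutable cons cell; a null tail marks the end of the list.
    static final class Node {
        final int head;
        final Node tail;

        Node(int head, Node tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    // "Modify" by prepending: one new node, the old list is reused untouched.
    static Node cons(int head, Node tail) {
        return new Node(head, tail);
    }

    static int size(Node n) {
        int count = 0;
        for (Node cur = n; cur != null; cur = cur.tail) {
            count++;
        }
        return count;
    }

    // What a sharing-unaware round trip amounts to: every node is rebuilt,
    // so nothing is shared with the original list any more.
    static Node deepCopy(Node n) {
        return n == null ? null : new Node(n.head, deepCopy(n.tail));
    }
}
```

With `a = cons(1, cons(2, null))` and `b = cons(0, a)`, the tail of `b` is the very same object as `a`, so holding both lists costs three nodes rather than five. After `deepCopy(b)`, the copy's tail is equal in content to `a` but is a separate set of objects.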

EthanEChristian · 2019-02-06T15:10:14Z

@WilliamParker
I agree that we probably can't reproduce the sharing at that level, but we should be able to replicate identical collections on SerDe. From what I can tell, the vast majority of the duplication comes from the constraints of the rules themselves, as these will be held by the nodes and the pre-eval'd fns for those nodes.

mrrodriguez · 2019-02-06T15:12:29Z

@WilliamParker
That is a good point when it comes to altering large data structures. I wonder how much of that is done at the ruleset SerDe level.

> I agree that we probably can't reproduce the sharing at that level, but we should be able to replicate identical collections on SerDe. From what I can tell, the vast majority of the duplication comes from the constraints of the rules themselves, as these will be held by the nodes and the pre-eval'd fns for those nodes.

@EthanEChristian It makes sense to me that there would still be quite a bit of value in maintaining the identity of collections here, since many nodes, etc., refer to the exact same in-memory structures.

EthanEChristian · 2019-03-05T15:45:31Z

I have merged #420, so I think I will close this issue.

EthanEChristian added enhancement performance-optimization labels Feb 5, 2019

EthanEChristian self-assigned this Feb 5, 2019

WilliamParker added the durability label Feb 5, 2019

EthanEChristian added a commit to EthanEChristian/clara-rules that referenced this issue Feb 6, 2019

oracle-samples#419: [Durability] Increased memory usage post Serde.

2cf5bae

EthanEChristian mentioned this issue Feb 6, 2019

#419: [Durability] Increased memory usage post Serde. #420

Merged

WilliamParker mentioned this issue Feb 7, 2019

0.19.1 release #421

Closed

EthanEChristian mentioned this issue Feb 7, 2019

Expose flag to retain compile-ctx instead of maintaining it #422

Closed

EthanEChristian added a commit that referenced this issue Mar 5, 2019

#419: [Durability] Increased memory usage post Serde. (#420)

4c0cdc7

EthanEChristian closed this as completed Mar 5, 2019
