Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 344: Add a function for reporting potential performance problem… #416

Merged
merged 2 commits into from
Feb 7, 2019

Conversation

WilliamParker
Copy link
Collaborator

…s in a traced session

Pull request for issue 344
@mrrodriguez this is one of the tracing helpers we've discussed before.

@EthanEChristian this is probably of interest to you. Possibly @eraserhd as well if I recall some discussions on the Slack channel in recent months correctly.

Some remaining items to address:

  • How does this handle nodes with more than one descendant terminal node? I'd like to add tests for this.
  • This will report autogenerated rules created from the extraction added to fix Negated conjunctions and binding visibility #149 . This is an existing issue with session inspection as well though - I don't think addressing this needs to block a merge.

(map second))
(eng/get-conditions-and-rule-names node))))

(defn traced-session->ranked-productions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might also be interesting to have something similar to pair with this. Once we have the production name it might be nice to be able to easily get the nodes and number of interactions that make up a specific production. This could probably be added in the future if we see the need though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, there's lots of interesting views on the tracing data that one could provide.

Copy link
Collaborator

@mrrodriguez mrrodriguez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad you are adding some functionality like this. Using the trace info is certainly more approachable with examples like this. Also, this fn could be immediately useful to narrow down a perf problem on it's own. 💫

(map second))
(eng/get-conditions-and-rule-names node))))

(defn traced-session->ranked-productions
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd probably prefer the name to just beranked-productions.

I'm not sure there is a reason to express "conversion" that the x->y tends to imply.
I don't 100% agree [1] with the suggestions of this post [2], but do think it has some fairly good guidance in this regard

[1] It does mention

There are exceptions to every rule.
So it covers that base.

[2] https://stuartsierra.com/2016/01/09/how-to-name-clojure-functions

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That name is OK with me, will change.

that this will provide useful information for a first pass at rules
performance problem debugging. This should not be used to drive user logic.

This currently returns a Clojure array map currently in order to conveniently have the rules
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"currently" is repeated in this sentence.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 good catch, will change

@@ -114,3 +114,48 @@
(first))]
(.-trace ^PersistentTracingListener listener)
(throw (ex-info "No tracing listener attached to session." {:session session}))))

(defn ^:private node-id->productions
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my comment for traced-session->ranked-productions,
I'd prefer node-productions since this doesn't seem like a "conversion" operation.

It's subjective of course.

Copy link
Collaborator Author

@WilliamParker WilliamParker Feb 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps "productions-for-id"? In any case this is an internal private function so it probably doesn't matter much.

"Given a session and a node ID return a list of the rule and query names associated
with the node."
[session id]
(let [node (-> session .-rulebase :id-to-node (get id))]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be preferable to remove the reflective access here. It doesn't matter perf-wise in this case of course, but the less reflection warnings in the codebase the less noise there is to find the important ones.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just use the components function on the session actually and avoid direct field access entirely. It's better to use the protocols on the session anyway in case there was another implementation at some point. https://github.com/cerner/clara-rules/blob/0.19.0/src/main/clojure/clara/rules/engine.cljc#L1990


production-name->interactions (frequencies production-names)

ranked-tuples (reverse (sort-by second production-name->interactions))]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reverse could be avoided by using the 3 arity sort-by.
I believe that'd be

ranked-tuples (sort-by second > production-name->interactions)

or could generalize as

ranked-tuples (sort-by second #(compare %2 %1) production-name->interactions)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that that would work. IMO this is a little easier to understand though and performance doesn't seem like a concern here. I'm inclined to leave it as is.


ranked-tuples (reverse (sort-by second production-name->interactions))]

(apply array-map (into [] cat ranked-tuples))))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: It could be a sorted-map-by with the value-info tied to metadata or something like that.
However, this is simpler and seems fine for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that could work if we used symbols as the keys. Since strings don't support metadata you'd need to do something more complicated involving storing the "metadata" elsewhere and make sure you didn't create a memory leak in the process. Alternatively the keys could be symbols instead. We'd also need to make sure the compared value was never equal, perhaps by adding a tiebreak on alphabetical order, since sorted-map-by will consider equal values on the comparator to be equal keys.

@WilliamParker
Copy link
Collaborator Author

I'm going to go ahead and merge this; my last commit addressed review comments. If there's anything outstanding to address I'm open to making further changes but it seems like we've covered everything of importance and this is not a change to the core compiler/engine/memory code demanding higher scrutiny.

@WilliamParker WilliamParker merged commit dde409c into oracle-samples:master Feb 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Negated conjunctions and binding visibility
3 participants