
Copy parsing logic from observe method to observe decorator #1343

Merged · 9 commits · Nov 18, 2020

Conversation

@aaronayres35 (Contributor) commented Nov 9, 2020

Checklist

- Tests
- [ ] Update API reference (docs/source/traits_api_reference)
- [ ] Update User manual (docs/source/traits_user_manual)
- [ ] Update type annotation hints in traits-stubs

This PR uses https://github.com/enthought/traits/blob/master/traits/tests/check_observe_timing.py to validate attempts to improve the efficiency of observe.

In this PR, the only change is to copy the expression-parsing logic from the observe method on HasTraits into the observe decorator. The idea is that the parsing is now done at decoration time, rather than each time the HasTraits class is instantiated. There may well be a cleaner solution, but this does show an effective 2x speed-up for instantiating a class with a method decorated with @observe('name') or @observe('child.name') (see below for a truncated benchmarking report; the relative time to instantiate 100000 objects, compared with a control, drops from 50 to 22 for @observe('name') and from 56 to 26 for @observe('child.name')).
EDIT: In this PR, the logic for parsing the expression is pulled out from the observe method into a helper function that is then also called in the observe decorator and in _create_property_observe_state.
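
As a rough, self-contained sketch of the idea (the helper name _parse_expression matches the name that appears in the review diff further down this thread, but parse_string and the decorator internals here are simplified stand-ins, not the actual traits implementation):

def parse_string(text):
    # Stand-in for the comparatively expensive string -> ObserverExpression parse.
    return ("parsed", text)


def _parse_expression(expression):
    # Accept a string, an already-parsed expression, or a list of either, and
    # return a list of parsed expressions (this mirrors the list-wrapping
    # behavior noted further below).
    if not isinstance(expression, list):
        expression = [expression]
    return [
        parse_string(item) if isinstance(item, str) else item
        for item in expression
    ]


def observe(expression, post_init=False):
    # Sketch of the decorator: the parse now happens here, once per class
    # definition, rather than once per HasTraits instantiation.
    parsed = _parse_expression(expression)

    def decorator(handler):
        # "_observe_inputs" is an illustrative attribute name, not the real one.
        handler._observe_inputs = dict(expression=parsed, post_init=post_init)
        return handler

    return decorator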

running the benchmarking script on master (EDIT: before #1344 was merged) with n=100000 gives the following for the scenarios relevant to this PR:

Given a subclass of HasTraits with a trait defined, e.g. name = Str(),
compare having a method with no decorator to one that is decorated with
@on_trait_change("name"), one decorated with @observe("name"), and one
with @observe(trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning the trait to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 8.18560 1.00 (2): 0.12473 1.00 (3): 0.00916 1.00
@on_trait_change('name') -
(1): 8.19230 1.00 (2): 0.69737 5.59 (3): 0.00610 0.67
@observe('name') -
(1): 8.43314 1.03 (2): 6.35615 50.96 (3): 0.00764 0.83
@observe(trait('name')) -
(1): 8.69472 1.06 (2): 2.77342 22.23 (3): 0.00555 0.61


Given a subclass of HasTraits with a trait referring to an instance of
HasTraits, e.g. child = Instance(AnotherClass) where AnotherClass has a
name trait, compare having a method with no decorator to one that is
decorated with @on_trait_change("child.name"), one decorated with
@observe("child.name"), and one with
@observe(trait('child').trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 8.85182 1.00 (2): 0.27680 1.00 (3): 0.12229 1.00 (4): 0.00621 1.00
@on_trait_change('child.name') -
(1): 9.18126 1.04 (2): 9.12100 32.95 (3): 2.37042 19.38 (4): 0.01042 1.68
@observe('child.name') -
(1): 9.19860 1.04 (2): 15.63174 56.47 (3): 6.42534 52.54 (4): 0.00815 1.31
@observe(trait('child').trait('name')) -
(1): 9.61881 1.09 (2): 7.60048 27.46 (3): 6.63096 54.22 (4): 0.00705 1.14


Whereas on this branch we get the following (notice the case (2) times for @observe('name') and @observe('child.name'), as these are where the improvement was observed, no pun intended):

Given a subclass of HasTraits with a trait defined, e.g. name = Str(),
compare having a method with no decorator to one that is decorated with
@on_trait_change("name"), one decorated with @observe("name"), and one
with @observe(trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning the trait to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 8.90710 1.00 (2): 0.12655 1.00 (3): 0.00587 1.00
@on_trait_change('name') -
(1): 8.41993 0.95 (2): 0.74310 5.87 (3): 0.00575 0.98
@observe('name') -
(1): 13.46335 1.51 (2): 2.86318 22.63 (3): 0.00652 1.11
@observe(trait('name')) -
(1): 9.10302 1.02 (2): 2.94359 23.26 (3): 0.00586 1.00


Given a subclass of HasTraits with a trait referring to an instance of
HasTraits, e.g. child = Instance(AnotherClass) where AnotherClass has a
name trait, compare having a method with no decorator to one that is
decorated with @on_trait_change("child.name"), one decorated with
@observe("child.name"), and one with
@observe(trait('child').trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 9.70678 1.00 (2): 0.29501 1.00 (3): 0.13536 1.00 (4): 0.00672 1.00
@on_trait_change('child.name') -
(1): 9.62292 0.99 (2): 9.46349 32.08 (3): 2.46827 18.24 (4): 0.00972 1.45
@observe('child.name') -
(1): 18.71555 1.93 (2): 7.79267 26.42 (3): 6.69751 49.48 (4): 0.00802 1.19
@observe(trait('child').trait('name')) -
(1): 9.98197 1.03 (2): 7.76467 26.32 (3): 6.61317 48.86 (4): 0.00750 1.12


This PR also modifies a test that failed once this change was introduced. The test failure seemed to come from an assumption that the expression would not be parsed at decoration time (i.e. that the class dict would contain "expression": "name" rather than "expression": [trait("name")]). One thing to note is that the expression is now not only converted from a string to an ObserverExpression, but also wrapped in a list. As mentioned above, this logic was lifted directly from the observe method into the observe decorator, but it is possible this is behavior we do not want here.

Note:

This PR basically just moves the calls to parse from instantiation time to construction time. Notice that the total time (adding the times for cases (1) and (2)) is nearly identical between this branch and master (EDIT: before #1344 was merged). The parsing time is essentially removed from instantiation and added to construction.

benchmarking results after #1344 was merged:

on master:

Given a subclass of HasTraits with a trait defined, e.g. name = Str(),
compare having a method with no decorator to one that is decorated with
@on_trait_change("name"), one decorated with @observe("name"), and one
with @observe(trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning the trait to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 8.14856 1.00 (2): 0.12748 1.00 (3): 0.00501 1.00
@on_trait_change('name') -
(1): 8.53549 1.05 (2): 0.71485 5.61 (3): 0.00579 1.16
@observe('name') -
(1): 8.93377 1.10 (2): 3.00881 23.60 (3): 0.00601 1.20
@observe(trait('name')) -
(1): 9.05333 1.11 (2): 2.99731 23.51 (3): 0.00626 1.25


Given a subclass of HasTraits with a trait referring to an instance of
HasTraits, e.g. child = Instance(AnotherClass) where AnotherClass has a
name trait, compare having a method with no decorator to one that is
decorated with @on_trait_change("child.name"), one decorated with
@observe("child.name"), and one with
@observe(trait('child').trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 9.32618 1.00 (2): 0.28355 1.00 (3): 0.13320 1.00 (4): 0.00638 1.00
@on_trait_change('child.name') -
(1): 9.45452 1.01 (2): 9.60013 33.86 (3): 2.58519 19.41 (4): 0.00995 1.56
@observe('child.name') -
(1): 9.54662 1.02 (2): 7.98574 28.16 (3): 6.84289 51.37 (4): 0.00764 1.20
@observe(trait('child').trait('name')) -
(1): 10.46585 1.12 (2): 7.94395 28.02 (3): 6.92553 51.99 (4): 0.00757 1.19


Given a subclass of HasTraits with a trait that is defined as Property
that depends on a simple trait, compare having the Property be defined
as Property(), Property(depends_on="name"), Property(observe="name"),
and Property(observe=trait('name')). (Note that this is a feature
currently available on master only).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for changing the trait being depended-on / observed.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 14.67828 1.00 (2): 0.12879 1.00 (3): 0.00480 1.00
depends_on='name' -
(1): 14.28861 0.97 (2): 6.08075 47.22 (3): 0.00736 1.53
observe='name' -
(1): 14.73340 1.00 (2): 2.95466 22.94 (3): 0.00618 1.29
observe=trait('name') -
(1): 14.92038 1.02 (2): 2.93276 22.77 (3): 0.00617 1.28


Identical to the simple property case, only using the @cached_property
decorator on the property's getter method.

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for changing the trait being depended-on / observed.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 14.36050 1.00 (2): 0.12756 1.00 (3): 0.00493 1.00
depends_on='name' -
(1): 14.78415 1.03 (2): 13.29759 104.25 (3): 0.00951 1.93
observe='name' -
(1): 16.06462 1.12 (2): 2.86701 22.48 (3): 0.00588 1.19
observe=trait('name') -
(1): 15.23267 1.06 (2): 2.89664 22.71 (3): 0.00625 1.27


Given a subclass of HasTraits with a trait that is defined as Property
that depends on an extended trait, compare having the Property be
defined as Property(), Property(depends_on="child.name"),
Property(observe="child.name"), and
Property(observe=trait('child').trait('name')). (Note that this is a
feature currently available on master only).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 14.98461 1.00 (2): 0.28057 1.00 (3): 0.13279 1.00 (4): 0.00628 1.00
depends_on='child.name' -
(1): 15.26837 1.02 (2): 9.91370 35.33 (3): 2.11438 15.92 (4): 0.00830 1.32
observe='child.name' -
(1): 17.21465 1.15 (2): 9.00519 32.10 (3): 7.22518 54.41 (4): 0.00781 1.24
observe=trait('child').trait('name') -
(1): 16.16348 1.08 (2): 7.92671 28.25 (3): 6.69571 50.42 (4): 0.00738 1.17


Identical to the extended property case only using the @cached_property
decorator on the property's getter method.

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 15.90703 1.00 (2): 0.27660 1.00 (3): 0.12601 1.00 (4): 0.00614 1.00
depends_on='child.name' -
(1): 15.32743 0.96 (2): 20.29507 73.37 (3): 4.28814 34.03 (4): 0.00874 1.42
observe='child.name' -
(1): 16.28781 1.02 (2): 7.99143 28.89 (3): 6.97696 55.37 (4): 0.00820 1.33
observe=trait('child').trait('name') -
(1): 16.60755 1.04 (2): 7.93145 28.67 (3): 7.04930 55.94 (4): 0.00745 1.21


on this branch:

Given a subclass of HasTraits with a trait defined, e.g. name = Str(),
compare having a method with no decorator to one that is decorated with
@on_trait_change("name"), one decorated with @observe("name"), and one
with @observe(trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning the trait to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 8.66655 1.00 (2): 0.13542 1.00 (3): 0.00478 1.00
@on_trait_change('name') -
(1): 8.54825 0.99 (2): 0.68965 5.09 (3): 0.00558 1.17
@observe('name') -
(1): 8.52480 0.98 (2): 2.75759 20.36 (3): 0.00559 1.17
@observe(trait('name')) -
(1): 8.76548 1.01 (2): 2.90105 21.42 (3): 0.00637 1.33


Given a subclass of HasTraits with a trait referring to an instance of
HasTraits, e.g. child = Instance(AnotherClass) where AnotherClass has a
name trait, compare having a method with no decorator to one that is
decorated with @on_trait_change("child.name"), one decorated with
@observe("child.name"), and one with
@observe(trait('child').trait('name')).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 9.46985 1.00 (2): 0.28750 1.00 (3): 0.13019 1.00 (4): 0.00622 1.00
@on_trait_change('child.name') -
(1): 9.69560 1.02 (2): 9.30700 32.37 (3): 2.50511 19.24 (4): 0.00906 1.46
@observe('child.name') -
(1): 10.30737 1.09 (2): 8.17948 28.45 (3): 6.65118 51.09 (4): 0.00810 1.30
@observe(trait('child').trait('name')) -
(1): 10.42507 1.10 (2): 8.01708 27.89 (3): 6.79402 52.19 (4): 0.00862 1.39


Given a subclass of HasTraits with a trait that is defined as Property
that depends on a simple trait, compare having the Property be defined
as Property(), Property(depends_on="name"), Property(observe="name"),
and Property(observe=trait('name')). (Note that this is a feature
currently available on master only).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for changing the trait being depended-on / observed.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 14.54702 1.00 (2): 0.13931 1.00 (3): 0.00500 1.00
depends_on='name' -
(1): 15.50286 1.07 (2): 6.11413 43.89 (3): 0.00687 1.37
observe='name' -
(1): 15.05815 1.04 (2): 2.88068 20.68 (3): 0.00613 1.23
observe=trait('name') -
(1): 15.38447 1.06 (2): 3.10586 22.30 (3): 0.00607 1.21


Identical to the simple property case, only using the @cached_property
decorator on the property's getter method.

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for changing the trait being depended-on / observed.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 14.28698 1.00 (2): 0.12869 1.00 (3): 0.00502 1.00
depends_on='name' -
(1): 15.14075 1.06 (2): 12.05918 93.71 (3): 0.00622 1.24
observe='name' -
(1): 14.79252 1.04 (2): 2.83872 22.06 (3): 0.00625 1.24
observe=trait('name') -
(1): 15.06480 1.05 (2): 2.83408 22.02 (3): 0.00586 1.17


Given a subclass of HasTraits with a trait that is defined as Property
that depends on an extended trait, compare having the Property be
defined as Property(), Property(depends_on="child.name"),
Property(observe="child.name"), and
Property(observe=trait('child').trait('name')). (Note that this is a
feature currently available on master only).

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 15.00476 1.00 (2): 0.28116 1.00 (3): 0.13626 1.00 (4): 0.00655 1.00
depends_on='child.name' -
(1): 15.18960 1.01 (2): 9.72240 34.58 (3): 2.07473 15.23 (4): 0.00870 1.33
observe='child.name' -
(1): 15.76752 1.05 (2): 7.82174 27.82 (3): 6.62114 48.59 (4): 0.00797 1.22
observe=trait('child').trait('name') -
(1): 16.86445 1.12 (2): 7.86445 27.97 (3): 6.58888 48.36 (4): 0.00787 1.20


Identical to the extended property case only using the @cached_property
decorator on the property's getter method.

The timing we are interested in:
(1) Time to import the module, i.e. the class construction.
(2) Time to instantiate the HasTraits object.
(3) Time for reassigning child to a new value.
(4) Time for reassigning child.name to a new value.

Times are reported as:
(#): absolute_time relative_time_to_control

control -
(1): 15.54366 1.00 (2): 0.27699 1.00 (3): 0.13029 1.00 (4): 0.00653 1.00
depends_on='child.name' -
(1): 15.56834 1.00 (2): 20.16819 72.81 (3): 4.17065 32.01 (4): 0.00873 1.34
observe='child.name' -
(1): 15.94161 1.03 (2): 7.90081 28.52 (3): 6.72361 51.61 (4): 0.00805 1.23
observe=trait('child').trait('name') -
(1): 16.70588 1.07 (2): 7.81514 28.21 (3): 6.84599 52.54 (4): 0.00727 1.11


At this point this PR has a minimal effect on the benchmarking.
I was initially a bit confused about why the "(1) Time to import the module, i.e. the class construction" is no longer increasing here. Every construction uses the same couple of expressions, so I believe the parsed results are being cached by the lru_cache. My confusion was that for case (1) I expected the cache to be cleared or reset before every attempted construction, but that is not the case because the cache is set up in the setup for the timeit.timeit() call. At this point this may not be the ideal benchmark for gauging the effects of this change, because importing/constructing the exact same class many times is not a realistic scenario.
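
To make the last point concrete, here is a self-contained sketch (illustrative names only, not the actual check_observe_timing.py script) showing that timeit's setup runs once per timeit() call, so a cache created and warmed there stays warm for every repetition of the timed statement, while an explicit cache_clear() forces a re-parse each time:

import timeit

setup = """
from functools import lru_cache

@lru_cache(maxsize=128)
def parse(text):              # stand-in for the real expression parser
    return tuple(text.split("."))

parse("child.name")           # cache warmed once, in setup
"""

print(timeit.timeit("parse('child.name')", setup=setup, number=100000))
print(timeit.timeit(
    "parse.cache_clear(); parse('child.name')",
    setup=setup,
    number=100000,
))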

@aaronayres35 requested a review from kitchoi on November 9, 2020
@kitchoi (Contributor) commented Nov 9, 2020

Whereas on this branch we get the following (notice the case (2) times for @observe('name') and @observe('child.name'), as these are where the improvement was observed, no pun intended):

In the benchmark report, item (1) refers to the time to import the module, i.e. the class construction. That time has increased. In addition to the change in instantiation time, there is a degradation in the import time.

By making instantiation faster, this makes import slower, which could be undesirable in use cases where a project imports a lot of modules (e.g. from pyface.api import something) but only uses very few of the imported code.

@aaronayres35 (Contributor, Author)

Note that a very similar approach can be taken for the Property cases, by adding the same logic to

traits/traits/has_traits.py

Lines 306 to 341 in fa45275

def _create_property_observe_state(observe, property_name, cached):
    """ Create the metadata for setting up an observer for Property.

    Parameters
    ----------
    observe : str or list or Expression
        As is accepted by HasTraits.observe expression argument
        This is the value provided in Property(observe=...)
    property_name : str
        The name of the property trait.
    cached : boolean
        Whether the property is cached or not.

    Returns
    -------
    state : dict
        State to be used by _init_traits_observers
    """

    def handler(instance, event):
        if cached:
            cache_name = TraitsCache + property_name
            old = instance.__dict__.pop(cache_name, Undefined)
        else:
            old = Undefined
        instance.trait_property_changed(property_name, old)

    def handler_getter(instance, name):
        return types.MethodType(handler, instance)

    return dict(
        expression=observe,
        dispatch="same",
        handler_getter=handler_getter,
        post_init=False,
    )

before the return statement.
Doing so, we see roughly the same 2x speed-up in instantiation time, but again a slowdown in the time to import the module / class construction time. If we decide to continue with this approach, I will open a separate PR for that change as well; if we decide to try a different approach, I will not.

@kitchoi (Contributor) commented Nov 9, 2020

Parsing the expression early in the decorator effectively stores the parsed results on the class as a cache.

This was an idea originally floated by @mdickinson: we could also look into applying functools.lru_cache on the parse function here:

def parse(text):

This approach might require a little investigation into the memory footprint too (machines nowadays have so much memory... but still there is a limit).
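
As a rough starting point for that investigation, here is a self-contained sketch of measuring the memory held by a full 128-entry cache (the cached values here are plain tuples standing in for ObserverExpression objects, so the real numbers would be larger):

import tracemalloc

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()
# 128 entries standing in for cached parse results keyed by their source strings.
cache = {
    "child.name_%d" % i: ("child", "name_%d" % i)
    for i in range(128)
}
after, _ = tracemalloc.get_traced_memory()
print("approx. bytes held by a 128-entry cache:", after - before)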

In any case, I would be quite interested in seeing the profiling results to back up these changes, e.g. it'd be good to see a profile which clearly indicates the parse function is the costly step.

@aaronayres35 (Contributor, Author)

In the benchmark report, item (1) refers to the time to import the module, i.e. the class construction. That time has increased. In addition to the change in instantiation time, there is a degradation in the import time.

By making instantiation faster, this makes import slower, which could be undesirable in use cases where a project imports a lot of modules (e.g. from pyface.api import something) but only uses very few of the imported code.

Theoretically though, for each import we expect >= 1 instantiation, as in general you wouldn't import a class that will never be instantiated. So an increase in import time with a comparable decrease in instantiation time would be a net benefit (correct?).

In this PR, if you look at the sum of the case (1) and (2) times, you see that between master and this branch the total times are roughly the same (within a second or so), because all this PR really does is move the calls to parse from one case to the other.
FWIW, I confirmed that on master parse is never called in case (1) and is called 10000 times in case (2), and on this branch it is the opposite.

@aaronayres35 (Contributor, Author) commented Nov 9, 2020

In any case, I would be quite interested in seeing the profiling results to back up these changes, e.g. it'd be good to see a profile which clearly indicates the parse function is the costly step.

I ran snakeviz on the first scenario (a subclass of HasTraits with a trait defined, e.g. name = Str(), and a method decorated with @observe("name")), looking just at case (2), the time to instantiate the HasTraits object. This is the result on master, sorted by tottime (the total time spent in the function, not including calls to subfunctions):

[Screenshot: snakeviz profile on master, sorted by tottime]

the same profile on this branch:

[Screenshot: snakeviz profile on this branch, sorted by tottime]
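
For reference, a minimal sketch of producing this kind of profile with the standard library (HasName is an illustrative class, not the benchmark's actual one; the dump/snakeviz step at the end is optional):

import cProfile
import pstats

from traits.api import HasTraits, Str, observe


class HasName(HasTraits):
    name = Str()

    @observe("name")
    def _name_updated(self, event):
        pass


profiler = cProfile.Profile()
profiler.enable()
for _ in range(10000):
    HasName()  # case (2): instantiate the HasTraits object
profiler.disable()

pstats.Stats(profiler).sort_stats("tottime").print_stats(10)
profiler.dump_stats("observe.prof")  # then: snakeviz observe.prof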

@kitchoi (Contributor) commented Nov 9, 2020

Thank you! It is really good to see the profiles - it confirms parsing being one big chunk of the time spent. We can explore the options to optimize the _add_or_remove_extra_graphs code paths too, but that is likely more complicated than just adding a cache for the parsing step.

I think I have a preference towards lru_cache in #1344 for it adds pretty much no complexity to the code structure, at the cost of some memory which would also be needed on this branch (the parsed expressions will be stored on the class).

@mdickinson @aaronayres35 What is your opinion on this?

@aaronayres35 (Contributor, Author)

Thank you! It is really good to see the profiles - it confirms parsing being one big chunk of the time spent. We can explore the options to optimize the _add_or_remove_extra_graphs code paths too, but that is likely more complicated than just adding a cache for the parsing step.

I think I have a preference towards lru_cache in #1344 for it adds pretty much no complexity to the code structure, at the cost of some memory which would also be needed on this branch (the parsed expressions will be stored on the class).

@mdickinson @aaronayres35 What is your opinion on this?

Yeah, I think I agree. One thing to note is that we are only running the benchmark on a few simple cases. It might be good to try and gauge how well they would both generalize. For example, is a maxsize of 128 appropriate for the lru_cache in most cases? It seems to work well for the benchmark, but I'm not certain that means it will always be sufficient. (I picked it because it is the default, and Mark suggested in the Slack discussion that he suspected somewhere between 100 and 200 would be sufficient.)
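
One cheap way to gauge whether maxsize=128 is enough in a real application would be to inspect the cache statistics after exercising the code. A hedged sketch with a stand-in parse function (not the real one in traits.observation):

from functools import lru_cache

@lru_cache(maxsize=128)
def parse(text):               # stand-in for the real parser
    return tuple(text.split("."))

# ... exercise the application / benchmark, then inspect the cache:
parse("name"); parse("child.name"); parse("name")
print(parse.cache_info())      # CacheInfo(hits=1, misses=2, maxsize=128, currsize=2)
# A steadily growing `misses` count while currsize == maxsize would suggest
# that 128 is too small.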

@kitchoi (Contributor) commented Nov 9, 2020

I said in this comment:

it adds pretty much no complexity to the code structure

One complexity to do with caching is mutability: The ObserverExpression objects are treated as immutable, so I think we are good here.

It might be good to try and gauge how well they would both generalize. For example is a maxsize of 128 appropriate for the lru_cache for most cases? It seems to work well for the benchmark but I'm not certain that means it will always be sufficient.

That's a good point. Perhaps we could try to work out how many ObserverExpression objects the cache can hold, just to get a sense of how realistic the cache is going to be in practice... though that might vary from Python version to Python version. Alternatively, the cache size can be set to None (unbounded). In any case, I think this discussion belongs in #1344.

@aaronayres35 (Contributor, Author)

Thank you! It is really good to see the profiles - it confirms parsing being one big chunk of the time spent. We can explore the options to optimize the _add_or_remove_extra_graphs code paths too, but that is likely more complicated than just adding a cache for the parsing step.

I think I have a preference towards lru_cache in #1344 for it adds pretty much no complexity to the code structure, at the cost of some memory which would also be needed on this branch (the parsed expressions will be stored on the class).

@mdickinson @aaronayres35 What is your opinion on this?

I should also note that #1344 provides a speed-up for the property cases too (I should have added that to the reports in the PR description there as well!), whereas if we go with the solution in this PR, as I mentioned here: #1343 (comment), we would need to copy more code there as well.

…o use said helper function for the property case
@aaronayres35 (Contributor, Author)

As mentioned in this comment: #1343 (comment)
I pulled the logic out into a helper function.

Additionally, as mentioned in this comment: #1343 (comment)
I used that helper function to address the property case as well.

@aaronayres35 requested a review from kitchoi on November 12, 2020
@aaronayres35 (Contributor, Author) commented Nov 12, 2020

Travis CI is failing with:

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated

Not sure how to fix this / what the problem is, googling now

@aaronayres35 (Contributor, Author)

Travis CI is failing with:

The same failure is occurring on #1346.

@kitchoi (Contributor) commented Nov 13, 2020

For reference, if I try to import the api module of a closed-source library depending on ETS, just that import triggers ~150 @on_trait_change and depends_on expressions being evaluated as part of class construction (the library has not migrated to observe yet). I am mentioning this because parsing the expression early and putting it on the class is also a form of cache. Classes defined at module level will stick around unless they are deleted, which typically only happens when the program exits.

@kitchoi removed their request for review on November 13, 2020
@kitchoi (Contributor) commented Nov 13, 2020

For those who want to compare the performance with master, it would be good to merge master to this branch, so that one can compare the difference brought by the PR after the LRU cache has been put in place.

@aaronayres35 (Contributor, Author)

For those who want to compare the performance with master, it would be good to merge master to this branch, so that one can compare the difference brought by the PR after the LRU cache has been put in place.

I merged master into this branch and updated the PR description with the new benchmarking results comparing this branch to master as it currently stands.

@mdickinson (Member) left a comment

LGTM; a few suggestions and change requests.

@@ -333,6 +333,8 @@ def handler(instance, event):
     def handler_getter(instance, name):
         return types.MethodType(handler, instance)
 
+    observe = _parse_expression(observe)
@mdickinson (Member) commented on the diff:

It's a bit confusing to use the same name for the input and output, especially when they might have different types. I suggest using expressions instead of observe here, to match the similar case further down.

aaronayres35 and others added 2 commits November 18, 2020 10:31
Co-authored-by: Mark Dickinson <mdickinson@enthought.com>
@mdickinson (Member) left a comment

LGTM
