dev/core#1781 - CRM_Utils_Time - Allow TIME_FUNC to better isolate timing bugs #17414

totten · 2020-05-28T00:18:35Z

Overview

As a general matter, a unit-test is supposed to be both reproducible and representative. Reproducibility is achieved by controlling the inputs to the test. This PR aims to improve reproducibility for certain tests by controlling time-inputs with the option TIME_FUNC.

Logic which relies on time poses a challenge to reproducibility -- every call to time() (or similar) is essentially a new and non-deterministic input. Two calls to time() may (or may not) produce the same result, depending on external happenstance. (To wit: When is the test being executed? How fast is the machine? How much concurrent load is there? etc)

CRM_Utils_Time provides a mechanism (setTime() and getTime()) for taking tighter control over timing, and it is used for testing certain time-sensitive subsystems (eg "Scheduled Reminders"). However, we still see some flaky (hard-to-reproduce) results in the "Schedule Reminder" tests, and -- as pointed out in dev/core#1781 -- the current approach does not try to control for variations in the pace of execution.

Before

When using CRM_Utils_Time::setTime(), it simulates a change in the clock, and the clock is allowed to proceed at a natural/real pace. Thus, suppose we have this code:

// Example: Code which makes two calls to getTime()
CRM_Utils_Time::setTime('2020-01-02 03:04:00');
$time_a = CRM_Utils_Time::getTime();
do_stuff();
$time_b = CRM_Utils_Time::getTime();

Under the current policy, $time_a and $time_b will differ (intuitively) by how long it actually took to do_stuff(). That's pretty natural, but it's also non-deterministic.

This policy represents a trade-off: it is representative of real-world execution. If there is a non-deterministic timing bug in the actual logic, then we'd like the test system to raise an error about it. On the other hand, because it's non-deterministic, the results appear flaky and difficult to reproduce. Sometimes the test passes; sometimes it fails.

After

By default, CRM_Utils_Time still performs simulations where the clock proceeds at a natural pace. Thus, by default, the behavior is more representative of how the system works in real-world usage.

However, you may optionally use a synthetic clock where the pace is pre-determined. Choose a TIME_FUNC:

TIME_FUNC=frozen      ## every call to getTime() yields the same, static value
TIME_FUNC=natural     ## every call to getTime() progresses in tandem with the real/natural clock
TIME_FUNC=linear:333  ## every call to getTime() progresses by 333ms
TIME_FUNC=prng:333    ## every call to getTime() progresses by a random interval, up to 333ms

Comments

For a simple visualization of what TIME_FUNC does, consider this example: https://gist.github.com/totten/d84321dfeb76861aa65a04507897fcc0

For more complete example of debugging a flaky test, consider CRM_Core_BAO_ActionScheduleTest and its functions testMembershipEndDateRepeat, testEventTypeEndDateRepeat, and testRepetitionFrequencyUnit -- which (anecdotally) appear to be flaky. We can run one of these tests with deterministic time functions:

for TIME_FUNC in frozen prng:333 prng:666 prng:1000 prng:1500 linear:333 linear:666 linear:1000 ; do
  export TIME_FUNC
  echo
  echo "Running with TIME_FUNC=$TIME_FUNC"
  env CIVICRM_UF=UnitTests phpunit6 tests/phpunit/CRM/Core/BAO/ActionScheduleTest.php --filter testRepetitionFrequencyUnit
done

I've done multiple trials of testRepetitionFrequencyUnit with each TIME_FUNC, and we do indeed get different results depending on the TIME_FUNC. The results have been stable so far.

PASSING: frozen prng:333 prng:666 linear:333 linear:666 linear:1000 linear:1500
FAILING: prng:1000 prng:1500

This doesn't specifically tell you what the bug is. It could be a real bug in the main logic or less important bug in the test logic. However, you can now reproduce it (locally, in an IDE, etc) by setting TIME_FUNC=prng:1500.

…ming bugs Overview -------- `CRM_Utils_Time` is a helper for (a) getting the time and (b) setting the time to a mocked value (for testing purposes). Timing bugs can manifest as flaky tests. This provides a new option, `TIME_FUNC`, which may help isolate some timing bugs (provided the test consistently uses `CRM_Utils_Time`). Before ------ When using `CRM_Utils_Time::setTime()`, it simulates a change in the clock, and the clock is allowed to proceed at a natural/real pace. This policy aims to be representative of real-world execution (wherein the clock also executes at a natural pace). However, consider that the pace at which the code executes depends on numerous factors (CPU speed, memory managers, system-load, system-caches, the specific code/use-case, etc). If your use-case relies on multiple calls to `getTime()`, and if you run the use-case in repeated trials, then you may find somewhat chaotic variations in the exact values of `getTime()`. Consequently, if there is a timing-sensitive bug, you may find that some trials pass while others fail, and it is impossible to reproduce the natural circumstance of the failure. After ----- By default, `CRM_Utils_Time` still performs simulations where the clock proceeds a natural/real pace. However, if you see flaky results and suspect a timing bug, you can re-run the test suite with alternative time functions (`TIME_FUNC`): ```bash TIME_FUNC=frozen ## every call to getTime() yields the same, static value TIME_FUNC=natural ## every call to getTime() progresses in tandem with the real/natural clock TIME_FUNC=linear:333 ## every call to getTime() progresses by 333ms TIME_FUNC=prng:333 ## every call to getTime() progresses by a random interval, up to 333ms ``` Now "time" is not so natural or chaotic - it's more synthetic and deterministic, which can help with debugging. For example, consider `CRM_Core_BAO_ActionScheduleTest` and its functions `testMembershipEndDateRepeat`, `testEventTypeEndDateRepeat`, and `testRepetitionFrequencyUnit` -- which (anecdotally) appear to be flaky. We can run one of these tests with deterministic time functions: ```bash for TIME_FUNC in frozen prng:333 prng:666 prng:1000 prng:1500 linear:333 linear:666 linear:1000 ; do export TIME_FUNC echo echo "Running with TIME_FUNC=$TIME_FUNC" env CIVICRM_UF=UnitTests phpunit6 tests/phpunit/CRM/Core/BAO/ActionScheduleTest.php --filter testRepetitionFrequencyUnit done ``` We get different results depending on the `TIME_FUNC`. I've done multiple trials with each `TIME_FUNC`, and results have been stable so far: * PASSING: `frozen` `prng:333` `prng:666` `linear:333` `linear:666` `linear:1000` `linear:1500` * FAILING: `prng:1000` `prng:1500` This doesn't specifically tell you what the bug is. It could be a real bug in the main logic or less important bug in the test logic. However, you can now *reproduce it* (locally, in an IDE, etc) by setting `TIME_FUNC=prng:1500`.

civibot · 2020-05-28T00:18:38Z

(Standard links)

If this is your first pull-request for CiviCRM, please browse CONTRIBUTING.md for information about the development and testing processes.
If you are reviewing this pull-request, you may wish to consult the test sites and the Review Standards (long template, short template).

totten · 2020-05-28T01:26:29Z

(Note: Rewrote the description to try to make it more accessible... such as it...)

eileenmcnaughton · 2020-05-28T02:55:00Z

OK - any progress on these flakey tests is good....

The `CRM_Queue_QueueTest` is producing failures with the [TIME_FUNC](civicrm#17414) option of `linear:500` (the value currently used in PR test runs) -- which is to say, it fails when the system runs very slowly (with some 500ms stalling during execution). This gets it working under some very slow scenarios: ``` for TIME_FUNC in natural frozen linear:500 linear:1250 prng:500 prng:666 prng:1000 prng:1500 ; do export TIME_FUNC; echo; echo "TIME_FUNC=$TIME_FUNC" ; env CIVICRM_UF=UnitTests phpunit6 tests/phpunit/CRM/Queue/QueueTest.php ; done ```

civibot bot added the master label May 28, 2020

eileenmcnaughton merged commit cd999d6 into civicrm:master May 28, 2020

totten deleted the master-settime branch June 4, 2020 21:58

totten mentioned this pull request Jun 6, 2020

CommunityMessages - Define more consistent behavior in the face of slow execution #17529

Merged

totten mentioned this pull request Jun 7, 2020

QueueTest - Be more forgiving about slow execution #17532

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dev/core#1781 - CRM_Utils_Time - Allow TIME_FUNC to better isolate timing bugs #17414

dev/core#1781 - CRM_Utils_Time - Allow TIME_FUNC to better isolate timing bugs #17414

totten commented May 28, 2020 •

edited

Loading

civibot bot commented May 28, 2020

totten commented May 28, 2020

eileenmcnaughton commented May 28, 2020

dev/core#1781 - CRM_Utils_Time - Allow TIME_FUNC to better isolate timing bugs #17414

dev/core#1781 - CRM_Utils_Time - Allow TIME_FUNC to better isolate timing bugs #17414

Conversation

totten commented May 28, 2020 • edited Loading

Overview

Before

After

Comments

civibot bot commented May 28, 2020

totten commented May 28, 2020

eileenmcnaughton commented May 28, 2020

totten commented May 28, 2020 •

edited

Loading