Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev/core#1781 - CRM_Utils_Time - Allow TIME_FUNC to better isolate timing bugs #17414

Merged
merged 1 commit into from
May 28, 2020

Conversation

totten
Copy link
Member

@totten totten commented May 28, 2020

Overview

As a general matter, a unit-test is supposed to be both reproducible and representative. Reproducibility is achieved by controlling the inputs to the test. This PR aims to improve reproducibility for certain tests by controlling time-inputs with the option TIME_FUNC.

Logic which relies on time poses a challenge to reproducibility -- every call to time() (or similar) is essentially a new and non-deterministic input. Two calls to time() may (or may not) produce the same result, depending on external happenstance. (To wit: When is the test being executed? How fast is the machine? How much concurrent load is there? etc)

CRM_Utils_Time provides a mechanism (setTime() and getTime()) for taking tighter control over timing, and it is used for testing certain time-sensitive subsystems (eg "Scheduled Reminders"). However, we still see some flaky (hard-to-reproduce) results in the "Schedule Reminder" tests, and -- as pointed out in dev/core#1781 -- the current approach does not try to control for variations in the pace of execution.

Before

When using CRM_Utils_Time::setTime(), it simulates a change in the clock, and the clock is allowed to proceed at a natural/real pace. Thus, suppose we have this code:

// Example: Code which makes two calls to getTime()
CRM_Utils_Time::setTime('2020-01-02 03:04:00');
$time_a = CRM_Utils_Time::getTime();
do_stuff();
$time_b = CRM_Utils_Time::getTime();

Under the current policy, $time_a and $time_b will differ (intuitively) by how long it actually took to do_stuff(). That's pretty natural, but it's also non-deterministic.

This policy represents a trade-off: it is representative of real-world execution. If there is a non-deterministic timing bug in the actual logic, then we'd like the test system to raise an error about it. On the other hand, because it's non-deterministic, the results appear flaky and difficult to reproduce. Sometimes the test passes; sometimes it fails.

After

By default, CRM_Utils_Time still performs simulations where the clock proceeds at a natural pace. Thus, by default, the behavior is more representative of how the system works in real-world usage.

However, you may optionally use a synthetic clock where the pace is pre-determined. Choose a TIME_FUNC:

TIME_FUNC=frozen      ## every call to getTime() yields the same, static value
TIME_FUNC=natural     ## every call to getTime() progresses in tandem with the real/natural clock
TIME_FUNC=linear:333  ## every call to getTime() progresses by 333ms
TIME_FUNC=prng:333    ## every call to getTime() progresses by a random interval, up to 333ms

Comments

For a simple visualization of what TIME_FUNC does, consider this example: https://gist.github.com/totten/d84321dfeb76861aa65a04507897fcc0

For more complete example of debugging a flaky test, consider CRM_Core_BAO_ActionScheduleTest and its functions testMembershipEndDateRepeat, testEventTypeEndDateRepeat, and testRepetitionFrequencyUnit -- which (anecdotally) appear to be flaky. We can run one of these tests with deterministic time functions:

for TIME_FUNC in frozen prng:333 prng:666 prng:1000 prng:1500 linear:333 linear:666 linear:1000 ; do
  export TIME_FUNC
  echo
  echo "Running with TIME_FUNC=$TIME_FUNC"
  env CIVICRM_UF=UnitTests phpunit6 tests/phpunit/CRM/Core/BAO/ActionScheduleTest.php --filter testRepetitionFrequencyUnit
done

I've done multiple trials of testRepetitionFrequencyUnit with each TIME_FUNC, and we do indeed get different results depending on the TIME_FUNC. The results have been stable so far.

  • PASSING: frozen prng:333 prng:666 linear:333 linear:666 linear:1000 linear:1500
  • FAILING: prng:1000 prng:1500

This doesn't specifically tell you what the bug is. It could be a real bug in the main logic or less important bug in the test logic. However, you can now reproduce it (locally, in an IDE, etc) by setting TIME_FUNC=prng:1500.

…ming bugs

Overview
--------

`CRM_Utils_Time` is a helper for (a) getting the time and (b) setting the time
to a mocked value (for testing purposes).

Timing bugs can manifest as flaky tests.

This provides a new option, `TIME_FUNC`, which may help isolate some timing
bugs (provided the test consistently uses `CRM_Utils_Time`).

Before
------

When using `CRM_Utils_Time::setTime()`, it simulates a change in the clock, and
the clock is allowed to proceed at a natural/real pace.

This policy aims to be representative of real-world execution (wherein the
clock also executes at a natural pace).  However, consider that the pace at
which the code executes depends on numerous factors (CPU speed, memory
managers, system-load, system-caches, the specific code/use-case, etc).  If
your use-case relies on multiple calls to `getTime()`, and if you run the
use-case in repeated trials, then you may find somewhat chaotic variations
in the exact values of `getTime()`.

Consequently, if there is a timing-sensitive bug, you may find that some trials
pass while others fail, and it is impossible to reproduce the natural
circumstance of the failure.

After
-----

By default, `CRM_Utils_Time` still performs simulations where the clock
proceeds a natural/real pace.

However, if you see flaky results and suspect a timing bug, you can re-run the
test suite with alternative time functions (`TIME_FUNC`):

```bash
TIME_FUNC=frozen      ## every call to getTime() yields the same, static value
TIME_FUNC=natural     ## every call to getTime() progresses in tandem with the real/natural clock
TIME_FUNC=linear:333  ## every call to getTime() progresses by 333ms
TIME_FUNC=prng:333    ## every call to getTime() progresses by a random interval, up to 333ms
```

Now "time" is not so natural or chaotic - it's more synthetic and
deterministic, which can help with debugging.

For example, consider `CRM_Core_BAO_ActionScheduleTest` and its functions
`testMembershipEndDateRepeat`, `testEventTypeEndDateRepeat`, and
`testRepetitionFrequencyUnit` -- which (anecdotally) appear to be flaky.  We
can run one of these tests with deterministic time functions:

```bash
for TIME_FUNC in frozen prng:333 prng:666 prng:1000 prng:1500 linear:333 linear:666 linear:1000 ; do
  export TIME_FUNC
  echo
  echo "Running with TIME_FUNC=$TIME_FUNC"
  env CIVICRM_UF=UnitTests phpunit6 tests/phpunit/CRM/Core/BAO/ActionScheduleTest.php --filter testRepetitionFrequencyUnit
done
```

We get different results depending on the `TIME_FUNC`. I've done multiple trials with each `TIME_FUNC`, and
results have been stable so far:

* PASSING: `frozen` `prng:333` `prng:666` `linear:333` `linear:666` `linear:1000` `linear:1500`
* FAILING: `prng:1000` `prng:1500`

This doesn't specifically tell you what the bug is.  It could be a real bug
in the main logic or less important bug in the test logic.  However, you can
now *reproduce it* (locally, in an IDE, etc) by setting `TIME_FUNC=prng:1500`.
@civibot
Copy link

civibot bot commented May 28, 2020

(Standard links)

@civibot civibot bot added the master label May 28, 2020
@totten
Copy link
Member Author

totten commented May 28, 2020

(Note: Rewrote the description to try to make it more accessible... such as it...)

@eileenmcnaughton
Copy link
Contributor

OK - any progress on these flakey tests is good....

@eileenmcnaughton eileenmcnaughton merged commit cd999d6 into civicrm:master May 28, 2020
@totten totten deleted the master-settime branch June 4, 2020 21:58
totten added a commit to totten/civicrm-core that referenced this pull request Jun 7, 2020
The `CRM_Queue_QueueTest` is producing failures with the
[TIME_FUNC](civicrm#17414) option of
`linear:500` (the value currently used in PR test runs) -- which is to say,
it fails when the system runs very slowly (with some 500ms stalling during
execution).

This gets it working under some very slow scenarios:

```
for TIME_FUNC in natural frozen linear:500 linear:1250 prng:500 prng:666 prng:1000 prng:1500 ; do
  export TIME_FUNC;
  echo;
  echo "TIME_FUNC=$TIME_FUNC" ;
  env CIVICRM_UF=UnitTests phpunit6 tests/phpunit/CRM/Queue/QueueTest.php ;
done
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants