-
-
Notifications
You must be signed in to change notification settings - Fork 825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dev/core#1781 - CRM_Utils_Time - Allow TIME_FUNC to better isolate timing bugs #17414
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…ming bugs Overview -------- `CRM_Utils_Time` is a helper for (a) getting the time and (b) setting the time to a mocked value (for testing purposes). Timing bugs can manifest as flaky tests. This provides a new option, `TIME_FUNC`, which may help isolate some timing bugs (provided the test consistently uses `CRM_Utils_Time`). Before ------ When using `CRM_Utils_Time::setTime()`, it simulates a change in the clock, and the clock is allowed to proceed at a natural/real pace. This policy aims to be representative of real-world execution (wherein the clock also executes at a natural pace). However, consider that the pace at which the code executes depends on numerous factors (CPU speed, memory managers, system-load, system-caches, the specific code/use-case, etc). If your use-case relies on multiple calls to `getTime()`, and if you run the use-case in repeated trials, then you may find somewhat chaotic variations in the exact values of `getTime()`. Consequently, if there is a timing-sensitive bug, you may find that some trials pass while others fail, and it is impossible to reproduce the natural circumstance of the failure. After ----- By default, `CRM_Utils_Time` still performs simulations where the clock proceeds a natural/real pace. However, if you see flaky results and suspect a timing bug, you can re-run the test suite with alternative time functions (`TIME_FUNC`): ```bash TIME_FUNC=frozen ## every call to getTime() yields the same, static value TIME_FUNC=natural ## every call to getTime() progresses in tandem with the real/natural clock TIME_FUNC=linear:333 ## every call to getTime() progresses by 333ms TIME_FUNC=prng:333 ## every call to getTime() progresses by a random interval, up to 333ms ``` Now "time" is not so natural or chaotic - it's more synthetic and deterministic, which can help with debugging. For example, consider `CRM_Core_BAO_ActionScheduleTest` and its functions `testMembershipEndDateRepeat`, `testEventTypeEndDateRepeat`, and `testRepetitionFrequencyUnit` -- which (anecdotally) appear to be flaky. We can run one of these tests with deterministic time functions: ```bash for TIME_FUNC in frozen prng:333 prng:666 prng:1000 prng:1500 linear:333 linear:666 linear:1000 ; do export TIME_FUNC echo echo "Running with TIME_FUNC=$TIME_FUNC" env CIVICRM_UF=UnitTests phpunit6 tests/phpunit/CRM/Core/BAO/ActionScheduleTest.php --filter testRepetitionFrequencyUnit done ``` We get different results depending on the `TIME_FUNC`. I've done multiple trials with each `TIME_FUNC`, and results have been stable so far: * PASSING: `frozen` `prng:333` `prng:666` `linear:333` `linear:666` `linear:1000` `linear:1500` * FAILING: `prng:1000` `prng:1500` This doesn't specifically tell you what the bug is. It could be a real bug in the main logic or less important bug in the test logic. However, you can now *reproduce it* (locally, in an IDE, etc) by setting `TIME_FUNC=prng:1500`.
(Standard links)
|
(Note: Rewrote the description to try to make it more accessible... such as it...) |
OK - any progress on these flakey tests is good.... |
totten
added a commit
to totten/civicrm-core
that referenced
this pull request
Jun 7, 2020
The `CRM_Queue_QueueTest` is producing failures with the [TIME_FUNC](civicrm#17414) option of `linear:500` (the value currently used in PR test runs) -- which is to say, it fails when the system runs very slowly (with some 500ms stalling during execution). This gets it working under some very slow scenarios: ``` for TIME_FUNC in natural frozen linear:500 linear:1250 prng:500 prng:666 prng:1000 prng:1500 ; do export TIME_FUNC; echo; echo "TIME_FUNC=$TIME_FUNC" ; env CIVICRM_UF=UnitTests phpunit6 tests/phpunit/CRM/Queue/QueueTest.php ; done ```
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Overview
As a general matter, a unit-test is supposed to be both reproducible and representative. Reproducibility is achieved by controlling the inputs to the test. This PR aims to improve reproducibility for certain tests by controlling time-inputs with the option
TIME_FUNC
.Logic which relies on time poses a challenge to reproducibility -- every call to
time()
(or similar) is essentially a new and non-deterministic input. Two calls totime()
may (or may not) produce the same result, depending on external happenstance. (To wit: When is the test being executed? How fast is the machine? How much concurrent load is there? etc)CRM_Utils_Time
provides a mechanism (setTime()
andgetTime()
) for taking tighter control over timing, and it is used for testing certain time-sensitive subsystems (eg "Scheduled Reminders"). However, we still see some flaky (hard-to-reproduce) results in the "Schedule Reminder" tests, and -- as pointed out in dev/core#1781 -- the current approach does not try to control for variations in the pace of execution.Before
When using
CRM_Utils_Time::setTime()
, it simulates a change in the clock, and the clock is allowed to proceed at a natural/real pace. Thus, suppose we have this code:Under the current policy,
$time_a
and$time_b
will differ (intuitively) by how long it actually took todo_stuff()
. That's pretty natural, but it's also non-deterministic.This policy represents a trade-off: it is representative of real-world execution. If there is a non-deterministic timing bug in the actual logic, then we'd like the test system to raise an error about it. On the other hand, because it's non-deterministic, the results appear flaky and difficult to reproduce. Sometimes the test passes; sometimes it fails.
After
By default,
CRM_Utils_Time
still performs simulations where the clock proceeds at a natural pace. Thus, by default, the behavior is more representative of how the system works in real-world usage.However, you may optionally use a synthetic clock where the pace is pre-determined. Choose a
TIME_FUNC
:Comments
For a simple visualization of what
TIME_FUNC
does, consider this example: https://gist.github.com/totten/d84321dfeb76861aa65a04507897fcc0For more complete example of debugging a flaky test, consider
CRM_Core_BAO_ActionScheduleTest
and its functionstestMembershipEndDateRepeat
,testEventTypeEndDateRepeat
, andtestRepetitionFrequencyUnit
-- which (anecdotally) appear to be flaky. We can run one of these tests with deterministic time functions:I've done multiple trials of
testRepetitionFrequencyUnit
with eachTIME_FUNC
, and we do indeed get different results depending on theTIME_FUNC
. The results have been stable so far.frozen
prng:333
prng:666
linear:333
linear:666
linear:1000
linear:1500
prng:1000
prng:1500
This doesn't specifically tell you what the bug is. It could be a real bug in the main logic or less important bug in the test logic. However, you can now reproduce it (locally, in an IDE, etc) by setting
TIME_FUNC=prng:1500
.