Fix flaky test in service_domain_db_SUITE #4072
Conversation
…r_cluster_join

sync() assumes that sending a message from the mim() node to the mim2() node is instant.
But it is not true, so the check_for_updates message is received after the sync_local call in tests.
I.e. the following scenario fails:
1. Test node asks mim2() to insert a domain.
2. mim2() sends check_for_updates to the mim() node.
3. Test node calls sync_local on the mim() node.
4. mim() handles sync_local and replies to the test node.
5. mim() finally receives check_for_updates from mim2(), sent in step 2.
small_tests_24 / small_tests / 139130e
elasticsearch_and_cassandra_25 / elasticsearch_and_cassandra_mnesia / 139130e
small_tests_25 / small_tests / 139130e
small_tests_25_arm64 / small_tests / 139130e
ldap_mnesia_24 / ldap_mnesia / 139130e
dynamic_domains_pgsql_mnesia_24 / pgsql_mnesia / 139130e
ldap_mnesia_25 / ldap_mnesia / 139130e
dynamic_domains_mysql_redis_25 / mysql_redis / 139130e
dynamic_domains_pgsql_mnesia_25 / pgsql_mnesia / 139130e
dynamic_domains_mssql_mnesia_25 / odbc_mssql_mnesia / 139130e

disco_and_caps_SUITE:disco_with_caps:user_can_query_friend_resources
{error,{{assertion_failed,assert_many,false,[is_roster_set],[],[]},
[{escalus_new_assert,assert_true,2,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_new_assert.erl"},
{line,84}]},
{escalus_story,'-make_all_clients_friends/1-fun-0-',2,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,108}]},
{escalus_utils,'-each_with_index/3-fun-0-',3,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_utils.erl"},
{line,87}]},
{lists,foldl_1,3,[{file,"lists.erl"},{line,1355}]},
{escalus_utils,'-each_with_index/3-fun-0-',3,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_utils.erl"},
{line,87}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1350}]},
{escalus_utils,distinct_pairs,2,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_utils.erl"},
{line,60}]},
{escalus_story,make_all_clients_friends,1,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,106}]}]}}

disco_and_caps_SUITE:disco_with_caps:user_can_query_friend_features
{error,{{assertion_failed,assert_many,false,[is_roster_set],[],[]},
[{escalus_new_assert,assert_true,2,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_new_assert.erl"},
{line,84}]},
{escalus_story,'-make_all_clients_friends/1-fun-0-',2,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,108}]},
{escalus_utils,'-each_with_index/3-fun-0-',3,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_utils.erl"},
{line,87}]},
{lists,foldl_1,3,[{file,"lists.erl"},{line,1355}]},
{escalus_utils,'-each_with_index/3-fun-0-',3,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_utils.erl"},
{line,87}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1350}]},
{escalus_utils,distinct_pairs,2,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_utils.erl"},
{line,60}]},
{escalus_story,make_all_clients_friends,1,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,106}]}]}}

disco_and_caps_SUITE:disco_with_caps:user_cannot_query_stranger_features
{error,
{timeout_when_waiting_for_stanza,
[{escalus_client,wait_for_stanza,
[{client,
<<"alice_user_cannot_query_stranger_features_619@domain.example.com/res1">>,
escalus_tcp,<0.12918.0>,
[{event_manager,<0.12882.0>},
{server,<<"domain.example.com">>},
{username,
<<"alicE_user_cannot_query_stranger_features_619">>},
{resource,<<"res1">>}],
[{event_client,
[{event_manager,<0.12882.0>},
{server,<<"domain.example.com">>},
{username,
<<"alicE_user_cannot_query_stranger_features_619">>},
{resource,<<"res1">>}]},
{resource,<<"res1">>},
{username,
<<"alice_user_cannot_query_stranger_features_619">>},
{server,<<"domain.example.com">>},
{host,<<"localhost">>},
{port,5222},
{auth,{escalus_auth,auth_plain}},
{wspath,undefined},
{username,
<<"alicE_user_cannot_query_stranger_features_619">>},
{server,<<"domain.example.com">>},
{host,<<"localhost">>},
{password,<<"matygrysa">>},
{stream_id,<<"3bbf4c7dd1186876">>}]},
5000],
[{file,
"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_client.erl"},
{line,136}]},
{disco_and_caps_SUITE,
'-user_cannot_query_stranger_features/1-fun-0-',2,
[{file,
"/h... disco_and_caps_SUITE:disco_with_caps:user_cannot_query_friend_resources_with_unknown_node{error,{{assertion_failed,assert_many,false,[is_roster_set],[],[]},
[{escalus_new_assert,assert_true,2,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_new_assert.erl"},
{line,84}]},
{escalus_story,'-make_all_clients_friends/1-fun-0-',2,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,108}]},
{escalus_utils,'-each_with_index/3-fun-0-',3,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_utils.erl"},
{line,87}]},
{lists,foldl_1,3,[{file,"lists.erl"},{line,1355}]},
{escalus_utils,'-each_with_index/3-fun-0-',3,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_utils.erl"},
{line,87}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1350}]},
{escalus_utils,distinct_pairs,2,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_utils.erl"},
{line,60}]},
{escalus_story,make_all_clients_friends,1,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,106}]}]}}

internal_mnesia_25 / internal_mnesia / 139130e

pubsub_SUITE:dag+basic:publish_with_max_items_test
{error,{{badmatch,false},
[{pubsub_tools,check_response,2,
[{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
{line,444}]},
{pubsub_tools,receive_response,3,
[{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
{line,434}]},
{pubsub_tools,receive_and_check_response,4,
[{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
{line,424}]},
{escalus_story,story,4,
[{file,"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,72}]},
{test_server,ts_tc,3,[{file,"test_server.erl"},{line,1782}]},
{test_server,run_test_case_eval1,6,
[{file,"test_server.erl"},{line,1291}]},
{test_server,run_test_case_eval,9,
[{file,"test_server.erl"},{line,1223}]}]}} pgsql_mnesia_24 / pgsql_mnesia / 139130e pgsql_mnesia_25 / pgsql_mnesia / 139130e mysql_redis_25 / mysql_redis / 139130e mssql_mnesia_25 / odbc_mssql_mnesia / 139130e internal_mnesia_25 / internal_mnesia / 139130e |
Codecov Report
Patch coverage:

Additional details and impacted files

@@            Coverage Diff            @@
##           master    #4072    +/-   ##
========================================
  Coverage   83.88%   83.89%
========================================
  Files         526      526
  Lines       33181    33187        +6
========================================
+ Hits        27834    27842        +8
+ Misses       5347     5345        -2

☔ View full report in Codecov by Sentry.
true = is_pid(LocalPid),
Nodes = [node(Pid) || Pid <- all_members()],
%% Ping from all nodes in the cluster
[pong = rpc:call(Node, ?MODULE, ping, [LocalPid]) || Node <- Nodes],
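The loop above relies on a basic gen_server property: the server replies to a `ping` call only after it has processed every message that reached its mailbox earlier, and the PR's reasoning is that pinging from each cluster node therefore flushes any `check_for_updates` that node sent before. A minimal, hypothetical sketch of how such a `ping/1` could look (illustrative only, not the actual MongooseIM module; only the `ping` and `check_for_updates` names come from the PR):

```erlang
-module(domain_sync_sketch).
-behaviour(gen_server).

-export([start_link/0, ping/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Executed on every cluster node via rpc:call/4; the PR relies on this call
%% reaching the server after any check_for_updates the same node sent earlier.
ping(Pid) ->
    gen_server:call(Pid, ping).

init([]) ->
    {ok, #{}}.

%% By the time the reply is sent, every message that arrived before the ping
%% request (including check_for_updates casts) has already been handled.
handle_call(ping, _From, State) ->
    {reply, pong, State}.

%% Placeholder for the real update handling.
handle_cast(check_for_updates, State) ->
    {noreply, State}.
```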
Why not `erpc:multicall/4` instead maybe? Just a suggestion, otherwise I can merge as is.
just to keep it simple.
It is basically 2 nodes (mim and mim2) in the tests anyway.
And this function is used only in tests :D
And one of the nodes is local :)
Btw, super good PR description, love it 😄
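For reference, a sketch of what the suggested `erpc:multicall/4` variant might have looked like (an assumption for illustration, not the code that was merged; `all_members/0` and `ping/1` are the same helpers as in the snippet above):

```erlang
%% Sketch only: pings the local server from every cluster node in parallel.
ping_from_all_nodes(LocalPid) ->
    true = is_pid(LocalPid),
    Nodes = [node(Pid) || Pid <- all_members()],
    %% erpc:multicall/4 runs ?MODULE:ping(LocalPid) on each node and returns
    %% one {ok, Result} (or error tuple) per node, in the same order as Nodes.
    Results = erpc:multicall(Nodes, ?MODULE, ping, [LocalPid]),
    [{ok, pong} = Result || Result <- Results],
    ok.
```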
Fix a race condition in the `service_domain_db_suite.db_keeps_syncing_after_cluster_join` testcase.

Our tests assume that sending a message from the `mim()` node to the `mim2()` node is instant.
But it is not true, so the `check_for_updates` message is received after the `sync_local` call in tests.
I.e. the following scenario fails:

1. Test node asks `mim2()` to insert a domain.
2. `mim2()` sends `check_for_updates` to the `mim()` node.
3. Test node calls `sync_local` on the `mim()` node.
4. `mim()` handles `sync_local` and replies to the test node.
5. `mim()` finally receives `check_for_updates` from `mim2()`, sent in step 2.

This PR addresses https://circleci-mim-results.s3.eu-central-1.amazonaws.com/PR/4066/183250/pgsql_cets.25.3-amd64/big/ct_run.test%4048ff31232359.2023-08-01_06.11.42/big_tests.tests.service_domain_db_SUITE.logs/run.2023-08-01_06.28.17/service_domain_db_suite.db_keeps_syncing_after_cluster_join.html

If we add some debug logging, we can see that `ping` sometimes arrives earlier than `check_for_updates`.

The proposed changes make sure that `check_for_updates` is received by the local server, even if it is sent by a remote server.

It is easy to reproduce the failure with `repeat_until_any_fail` for this testcase (and after the fix I cannot trigger the error anymore).

Looking at how the flaky test fails, there are two places where `assert_domains_are_equal` could fail before the fix.

This PR does not change any server logic; it only fixes the test (the correct state is achieved eventually, the test is just checking the condition too fast).
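For readers who want to reproduce this kind of flakiness themselves: Common Test's `repeat_until_any_fail` group property reruns a group until one case fails. A hypothetical sketch (the group name and suite layout are illustrative, not taken from the actual suite):

```erlang
%% In the CT suite, wrap the flaky case in a group that keeps rerunning it
%% until the first failure (here at most 100 times).
groups() ->
    [{flaky_repro, [{repeat_until_any_fail, 100}],
      [db_keeps_syncing_after_cluster_join]}].

all() ->
    [{group, flaky_repro}].
```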