
Allow single server instance in test suite #7262

Merged
merged 21 commits into from
Mar 13, 2021
Conversation


@dplewis dplewis commented Mar 11, 2021

New Pull Request Checklist

Issue Description

Closes: #7080

After cleaning up the JS SDK test suite I realized the test suite in this repo could use some TLC.
With the recent addition of GitHub Actions and compatibility testing, we are constantly running into flaky tests. This can be a hindrance to future development. With the test suite growing every day, we should start looking for poorly written tests.

Approach

The goal of this PR is to create a single server at the start of the tests. This means every test will run against the same instance unless a separate instance is created for an individual test. The database is deleted after each test. Since a clean server isn't created before each test, we can find tests that rely on one. If a test relies on a clean server, that means it is affected either by another test in the same file (a test that isn't properly cleaned up) or by another test file (server instances created between files).

  • Only one beforeAll allowed
  • Fixed tests that rely on a clean server instance.
  • Properly close connections on servers created using express.
  • Test speed improvement, since an instance isn't created for every test.
  • Removed Promise.all where necessary.

This is a first pass. If you see await reconfigureServer() at the start of a test, it may mean there are poorly written tests in the file that require a new server instance to be created. There are many reasons why a test can be poorly written. The main one is broken invariants: the state of the test suite should be the same before and after every test.
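The lifecycle described above can be sketched roughly as follows. All names here (startServer, the shape of the server object) are stand-ins rather than the real parse-server helpers, and the real reconfigureServer is async; the point is simply that the shared instance is only restarted when the requested configuration differs from the current one:

```javascript
// Sketch of the single-instance idea: keep one server running and only
// restart it when a test asks for a different configuration.
let currentServer = null;
let currentConfig = null;
let restartCount = 0;

// Stub standing in for actually booting Parse Server.
function startServer(config) {
  restartCount += 1;
  return { config };
}

function reconfigureServer(config = {}) {
  const sameConfig = JSON.stringify(config) === JSON.stringify(currentConfig);
  if (currentServer && sameConfig) {
    return currentServer; // reuse the running instance
  }
  currentConfig = config;
  currentServer = startServer(config);
  return currentServer;
}

reconfigureServer();               // first boot
reconfigureServer();               // reused, no restart
reconfigureServer({ push: true }); // config changed, restart
reconfigureServer({ push: true }); // reused again
console.log(restartCount);         // 2
```

Under this scheme a test that never customizes the config pays no startup cost at all; only config-changing tests trigger a restart.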

Let me know if you have any questions on the decisions made for any changes to any tests.

TODOs before merging

  • Add test cases
  • Add entry to changelog
  • Add changes to documentation (guides, repository pages, in-code descriptions)
  • Add security check
  • Add new Parse Error codes to Parse JS SDK
  • ...


codecov bot commented Mar 11, 2021

Codecov Report

Merging #7262 (f6db41c) into master (8b0e8cd) will decrease coverage by 0.05%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7262      +/-   ##
==========================================
- Coverage   94.03%   93.97%   -0.06%     
==========================================
  Files         179      179              
  Lines       13137    13140       +3     
==========================================
- Hits        12353    12348       -5     
- Misses        784      792       +8     
Impacted Files Coverage Δ
src/Adapters/Storage/Postgres/PostgresClient.js 66.66% <0.00%> (-11.12%) ⬇️
src/Adapters/Storage/Mongo/MongoStorageAdapter.js 92.25% <0.00%> (-0.67%) ⬇️
src/Routers/UsersRouter.js 93.75% <0.00%> (-0.63%) ⬇️
src/RestWrite.js 93.84% <0.00%> (-0.17%) ⬇️
...dapters/Storage/Postgres/PostgresStorageAdapter.js 95.50% <0.00%> (-0.08%) ⬇️
src/Adapters/Files/GridFSBucketAdapter.js 80.32% <0.00%> (+0.81%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8b0e8cd...f6db41c. Read the comment docs.


mtrezza commented Mar 11, 2021

Interesting, let's see if I got this right...

You are saying that currently for each test a new server is spun up and therefore each test runs on a clean server instance. Some tests may currently only be passing, because they run on a clean server instance. Your approach to identify poorly written tests is to not spin up a new server for every test and see whether they fail.

But couldn't the opposite be just as true? A test may rely on a previously run test and the change that test did to the server?


dplewis commented Mar 11, 2021

That's basically it. There are a lot of other benefits to using this approach. I have noticed some weird behavior.

The current deleteAllClasses code relies on _SCHEMA, but if a class exists without an entry in the schema, that table won't get deleted, which could break a test. Edit: discovered it is an indexing issue.

I'm still looking into this as this is just a first pass. I would like to never have to use await reconfigureServer().

Got the test suite to 5-7 minutes.


mtrezza commented Mar 11, 2021

Interesting approach. I think if the server is not restarted for every test, then the tests should run in random order. Otherwise the likelihood that a test passes only because of a previously run test increases. I think randomizing the order will reveal a lot, did you try that?

Got the test suite to 5-7 minutes.

Very cool, especially now that we increased the test environments
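Jasmine supports randomized spec order natively through its env configuration, so this could be switched on globally from the test bootstrap. A sketch of what such a helper fragment might look like (the environment variable name is an assumption, not something from this repo):

```javascript
// Jasmine bootstrap fragment (sketch): run specs in random order, with an
// optional fixed seed so a particular failing order can be replayed.
jasmine.getEnv().configure({
  random: true,
  seed: process.env.JASMINE_SEED, // e.g. JASMINE_SEED=12345 to replay a run
});
```

The seed that produced a failing run is printed by Jasmine at the end of the run, which makes order-dependent failures reproducible instead of flaky.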


dplewis commented Mar 11, 2021

These problems also exist in the JS SDK. Specifically the user spec. I'll randomize that suite first then this one.


dplewis commented Mar 11, 2021

I updated my original post with a list of possible test suites to investigate. If anyone wants to give me a hand that would be much appreciated.

@dplewis dplewis requested review from mtrezza and davimacedo March 12, 2021 14:35

dplewis commented Mar 12, 2021

@mtrezza I am comfortable with where this is at.

Here are some of my observations from using a single instance server.

  • A test using reconfigureServer({ ...some configuration }) should call reconfigureServer({}) afterwards to reset the instance back to the default so that it doesn't interfere with other tests. I also ran a randomized run: 627 failures. With reconfigureServer in a global afterEach / beforeEach, the randomized run failed only 5 tests.

  • Using await reconfigureServer({}) or TestUtils.destroyAllDataPermanently at the beginning of a test means indexes need to be regenerated (used for Polygon and GeoPoint 2d indexes, user unique indexes, and role indexes).

I may have fixed all Push related flaky tests.
I would like to see what will happen when this is merged into #7214


mtrezza commented Mar 12, 2021

If I add any more changes I may as well add reconfigureServer into a global afterEach / beforeEach.

What is the general pattern we want to follow and enforce in future PRs? I think we need to decide on a pattern that we think works best to mitigate flaky tests and yield best code robustness, excluding any special cases a developer may want to test. That should be documented in the contribution guide as well, otherwise we'll always have a mix of patterns. I do not see a clear pattern before or after this PR, as the server is reconfigured at different points, sometimes in afterAll, sometimes in beforeEach, sometimes not at all.

I think randomizing is a big step forward, and luckily that is controlled globally for all tests. Your suggestion of a global afterEach / beforeEach hits the point in this regard; maybe it is actually something we want?


dplewis commented Mar 12, 2021

@mtrezza Sorry I updated my previous comment.


mtrezza commented Mar 12, 2021

I also ran a randomized run: 627 failures.

There we go. I am not surprised, because a developer may as well deliberately (not) reconfigure a server between tests. I think we need to decide on a pattern, document it, and clean up existing tests to conform with it, which may take a while and happen gradually. Is it possible to randomize only certain files? That would help us make this change gradually. As for the pattern, it depends on whether randomized means all tests across all files or only tests within describes.

I would like to see what will happen when this is merged into #7214

Me too! To not block #7214, maybe we can just clean up push tests with the new pattern we decide on and then already merge, and the rest afterwards.

A test using reconfigureServer({ ...some configuration }) should use reconfigureServer({}) to reset the instance back to the default so that it wouldn't interfere with other tests.

How would the server then be reconfigured with a different configuration?


dplewis commented Mar 12, 2021

My main goal for this PR was to fix #7080. I'm currently saving 4 minutes per Mongo test run. If we can keep this speed improvement while documenting a pattern, I'm all for it.

I think we need to decide on a pattern, document it and clean up existing tests to conform with it

Cleaning up existing tests would reduce the speed improvement. Here are some improvements in this PR and my reasoning behind them. I'll start with the first 2 test files. For clarity RS = reconfigureServer and SI = single instance

In this file there is 1 RS call. If I don't add an RS call after the test is done to reset the SI, every test that follows will have push: pushAdapterOptions until another RS call. This could interfere with other tests and test spec files that would otherwise use the default push adapter.

[screenshot: spec file with a single RS call]

In this file each test has an RS call, so there is no interference between them. I added an RS afterAll here so there isn't any interference with other test files; this is for speed improvements. I could have used an RS afterEach, which would be useful for randomization; similarly, I could add an RS call after each individual test too.

[screenshot: spec file with an RS call in each test]

After typing all that I came to a conclusion. The master branch calls RS before all 2800 tests, but the number of tests that actually need that RS call for cleanup is 400.


mtrezza commented Mar 12, 2021

Thanks for explaining. I know you've done all the work already, but your explanation confirms for me that we need to find a pattern that works for all, it has to be documentable to guide developers and PR reviewers.

My 2cts:

  • Speed is surely a factor, but it should not win against test robustness; it should not be the defining factor when defining a pattern
  • afterAll should never be necessary; I would even say that we prohibit it and afterEach. Every test should have the obligation to prepare the environment as needed, not the other way around; using afterX turns that on its head and makes tests more complicated to debug, which is what we want to get away from.

So these would already be rules:

  • Every test must prepare its environment as needed with reconfigureServer({...}) without depending on a previous test.
  • It's not allowed to use afterEach, afterAll.

Another suggestion:

  • It is not permitted to call reconfigureServer in beforeEach, because it often leads to double reconfiguration. Say 50% of tests need a certain config, and for the other 50% the developer calls reconfigure again inside the test because they need a different configuration.

This is the kind of rule set that we are currently missing I think. Ideally we can even build a CI test to enforce it. I assume this would bring a speed improvement, but maybe not the improvement we currently see in this PR, and that's also fine if there is a benefit in return.
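Such a CI check could be approximated with a simple source scan. A sketch under stated assumptions: the function name, rule names, and regexes below are illustrative, not an actual parse-server script, and a real check would walk the spec directory instead of taking a string:

```javascript
// Hypothetical lint sketch for the proposed rule set: flag spec sources that
// use afterEach/afterAll, or that call reconfigureServer inside beforeEach.
function findRuleViolations(source) {
  const violations = [];
  if (/\bafterAll\s*\(/.test(source)) violations.push('uses afterAll');
  if (/\bafterEach\s*\(/.test(source)) violations.push('uses afterEach');
  // [^] matches any character including newlines; lazy so it stops at the
  // first reconfigureServer after a beforeEach.
  if (/beforeEach\s*\([^]*?reconfigureServer/.test(source)) {
    violations.push('reconfigureServer in beforeEach');
  }
  return violations;
}

const sample = `
beforeEach(async () => { await reconfigureServer({}); });
it('works', () => {});
afterAll(() => {});
`;
console.log(findRuleViolations(sample).length); // 2
```

A regex scan like this is crude (it would flag commented-out code, for example), but it would be enough to fail CI and point the author at the rule being broken.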


dplewis commented Mar 12, 2021

Another thing we should document is the use of Config.get(Parse.applicationId). Misusing it can also break random testing.

In the following example, you should get the config again after you call reconfigureServer. I also noticed a few tests setting the config directly, like config[someConfig] = someValue, or getting it globally in a beforeEach. I don't know if setting it directly is the best option.

Stale reference (the captured object is not updated by a later reconfigure):

```js
await reconfigureServer({ push: pushAdapterOptions });
const config = Config.get(Parse.applicationId);
// config.push === pushAdapterOptions
await reconfigureServer();
// config.push === pushAdapterOptions (stale)
```

Fresh lookup (get the config again after each reconfigure):

```js
await reconfigureServer({ push: pushAdapterOptions });
let config = Config.get(Parse.applicationId);
// config.push === pushAdapterOptions
await reconfigureServer();
config = Config.get(Parse.applicationId);
// config.push !== pushAdapterOptions
```
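The staleness above can be simulated with a minimal runnable sketch. Everything here (the registry Map, the Config object, the 'default'/'apns' values, and a synchronous reconfigureServer) is a stand-in for the real parse-server internals; the point is only that reconfiguring stores a new config object, so a reference captured earlier no longer reflects the current instance:

```javascript
// Minimal stand-in for a config registry: reconfiguring replaces the stored
// object, so references captured earlier go stale.
const registry = new Map();

const Config = {
  put(appId, config) { registry.set(appId, { ...config }); },
  get(appId) { return registry.get(appId); },
};

// Hypothetical default: no custom push adapter ('default').
function reconfigureServer(options = {}) {
  Config.put('appId', { push: options.push || 'default' });
}

reconfigureServer({ push: 'apns' });
const stale = Config.get('appId');
reconfigureServer(); // back to defaults; a new object is stored
console.log(stale.push);               // 'apns' (stale reference)
console.log(Config.get('appId').push); // 'default' (fresh lookup)
```

This is why a test that reconfigures mid-test must call Config.get again rather than hold on to the object it fetched earlier.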


mtrezza commented Mar 12, 2021

Yes, I'm using a similar pattern sometimes, and one has to be careful with this. We could disallow that in our "Test Quality Guide" if we think it's likely to be used incorrectly by developers.


dplewis commented Mar 13, 2021

It's not allowed to use afterEach, afterAll.

afterAll and afterEach have their uses.

Every test must prepare its environment as needed with reconfigureServer({...}) without depending on a previous test.

I added automatic detection of whether a test changes the single instance server. If a change is detected, the server is reset, so developers don't have to worry about it.

I also added the jasmine test reporter for debugging.

@davimacedo @mtrezza I think this is good to go.


mtrezza commented Mar 13, 2021

afterAll and afterEach have their uses

How, if the tests are randomized? I see them used for example in the CLI tests.


dplewis commented Mar 13, 2021

How if the test are randomized?

I see what you are saying; luckily our afterEach hooks have little effect on randomness. If we want randomness there are a few things we need to do.

  • Make Config global (I only found 1 test affected by this and fixed it)

```
RestQuery.each test afterSave response object is return
  Message:
    TypeError: Cannot read property 'allowCustomObjectId' of undefined
  Stack:
    TypeError: Cannot read property 'allowCustomObjectId' of undefined
        at new RestWrite (/Users/dplewis/Documents/GitHub/parse-server/lib/RestWrite.js:62:21)
        at Object.create (/Users/dplewis/Documents/GitHub/parse-server/lib/rest.js:133:15)
        at UserContext.done (/Users/dplewis/Documents/GitHub/parse-server/spec/RestQuery.spec.js:420:10)
        at <Jasmine>
        at process._tickCallback (internal/process/next_tick.js:68:7)
```

  • UserPII.spec.js acts weird when run in random order. I don't know much about PII.
  • The GraphQL tests need to be looked at again. I don't know much about GraphQL. They create a lot of servers that would have to be properly closed; I reduced the number of servers created.
  • LiveQuery tests
  • Remove the afterAll and afterEach hooks

```diff
  if (Object.keys(openConnections).length > 0) {
    fail('There were open connections to the server left after the test finished');
  }
- TestUtils.destroyAllDataPermanently(true).then(done, done);
+ await TestUtils.destroyAllDataPermanently(true);
+ if (didChangeConfiguration) {
```
Member:

That's a clever idea, but doesn't that cause an unnecessary reconfiguration in many cases?

I'm thinking:

  • Test A does reconfiguration with custom config
  • afterEach does reconfiguration with default config
  • Test B does reconfiguration with custom config

@dplewis dplewis Mar 13, 2021

Test A does reconfiguration with custom config
afterEach does reconfiguration with default config
Test B doesn't do reconfiguration (unaffected by Test A Config)
afterEach doesn't do reconfiguration
Test C doesn't do reconfiguration (unaffected by Test A Config)
afterEach doesn't do reconfiguration
Test D does reconfiguration with custom config

The net reconfiguration is the same. If you are interested, I left a counter in the test suite:

https://github.com/parse-community/parse-server/runs/2099411665

As you can see, it reconfigures 461 times, which is much better than the 2825 it was before.

Member:

I think I'm getting it. You are assuming that tests B and C do not need a reconfig, therefore none should be done. And removing these unnecessary reconfigs was something this PR also does (among other things)?

Member Author:

Yes

Member Author:

I figured out how random testing works.

  1. Randomize the test suites and run them.
  2. Before a test suite runs, randomize the tests within it.

Basically, randomizing a single test suite can isolate the problem instead of running the entire suite.

I'll try it in a separate PR. Only 40 failing tests to go.
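That two-level randomization can be sketched with a plain shuffle. The suite and test names below are made up for illustration, and Jasmine performs this kind of shuffling internally when random order is enabled; this is just the shape of the idea:

```javascript
// Two-level randomization sketch: shuffle the suites, then shuffle the
// tests inside each suite.
function shuffle(items, rand = Math.random) {
  const a = [...items];
  // Fisher-Yates shuffle.
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

const suites = [
  { name: 'ParseUser', tests: ['signup', 'login', 'logout'] },
  { name: 'Push', tests: ['send', 'schedule'] },
];

const order = shuffle(suites).map(suite => ({
  name: suite.name,
  tests: shuffle(suite.tests),
}));
console.log(order.length); // 2
```

Because each suite's tests are shuffled only within that suite, a failure can be reproduced by re-running that one suite in the same order, without replaying the whole run.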

Member:

Great discovery! And just to be clear, no tests were removed or disabled in this PR, correct? I noticed re-ordering and re-grouping, but not sure whether I've missed something else.

@dplewis dplewis Mar 13, 2021

That is correct.

Edit: I have removed some of the duplicate tests.

@dplewis dplewis requested a review from mtrezza March 13, 2021 01:58
@mtrezza mtrezza left a comment

LGTM!


mtrezza commented Mar 13, 2021

I don't think we need UserPII in its current form. It seems to just test whether a logged out user can access a user object in various ways, which are effectively all the same and unnecessary. That doesn't require a distinct test suite IMO, and should already be covered in the login / logout or object access tests.

Regarding a "Test Quality Guide" for how to write tests, which rules would you say it should include, given the insight you have now after digging into the tests? (I will open a separate PR for this).

@dplewis dplewis merged commit 9563793 into master Mar 13, 2021
@dplewis dplewis deleted the clean-tests branch March 13, 2021 15:05
@dplewis dplewis mentioned this pull request Mar 13, 2021
Arul- pushed a commit to Arul-/parse-server that referenced this pull request Mar 25, 2021
* initial pass

* reconfigureServer when needed

* finish postgres tests

* mongo tests

* more tests

* clean up

* re-add skipped test

* Fix transaction tests

* handle batch

* AuthenticationAdapter fix

* More reconfiguration

* clean up

* properly terminate cli servers

* handle Parse.Push

* Flaky PushController

* ensure reconfigureServer when changed

* fix postgres tests

* remove console.log

* LiveQuery spec remove duplicates and listeners
@parseplatformorg (Contributor)

🎉 This change has been released in version 5.0.0-beta.1

@parseplatformorg parseplatformorg added the state:released-beta Released as beta version label Nov 1, 2021
@parseplatformorg (Contributor)

🎉 This change has been released in version 5.0.0

@parseplatformorg parseplatformorg added the state:released Released as stable version label Mar 14, 2022

Successfully merging this pull request may close these issues.

Performance between wiredTiger mongo and mmapv1
3 participants