
Allow single server instance in test suite #7262

Merged
merged 21 commits into from
Mar 13, 2021
Conversation


@dplewis dplewis commented Mar 11, 2021

New Pull Request Checklist

Issue Description

Closes: #7080

After cleaning up the JS SDK test suite I realized the test suite in this repo could use some TLC.
With the recent addition of GitHub Actions and compatibility testing, we are constantly running into flaky tests. This can be a hindrance to future development. With the test suite growing every day, we should start looking for poorly written tests.

Approach

The goal of this PR is to create a single server at the start of the tests. This means every test will run against the same instance unless a separate instance is created for an individual test. The database is deleted after each test. Since a clean server isn't created before each test, we can find tests that rely on one. If a test relies on a clean server, that means it is affected either by another test in the same file (a test that isn't properly cleaned up) or by another test file (server instances created between files).

  • Only one beforeAll allowed
  • Fixed tests that rely on a clean server instance.
  • Properly close connections on servers created using express.
  • Test speed improvement, since an instance isn't created for every test.
  • Removed Promise.all where necessary.

This is a first pass. If you see await reconfigureServer() at the start of a test, it may mean there are poorly written tests in the file that require a new server instance to be created. There are many reasons why a test can be poorly written. The main one is broken invariants: the state of the test suite should be the same before and after every test.
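The lifecycle described above can be sketched roughly as follows. All names here (startServer, the shape of the server object) are stand-ins rather than the real parse-server helpers, and the real reconfigureServer is async; the point is simply that the shared instance is only restarted when the requested configuration differs from the current one:

```javascript
// Sketch of the single-instance idea: keep one server running and only
// restart it when a test asks for a different configuration.
let currentServer = null;
let currentConfig = null;
let restartCount = 0;

// Stub standing in for actually booting Parse Server.
function startServer(config) {
  restartCount += 1;
  return { config };
}

function reconfigureServer(config = {}) {
  const sameConfig = JSON.stringify(config) === JSON.stringify(currentConfig);
  if (currentServer && sameConfig) {
    return currentServer; // reuse the running instance
  }
  currentConfig = config;
  currentServer = startServer(config);
  return currentServer;
}

reconfigureServer();               // first boot
reconfigureServer();               // reused, no restart
reconfigureServer({ push: true }); // config changed, restart
reconfigureServer({ push: true }); // reused again
console.log(restartCount);         // 2
```

Under this scheme a test that never customizes the config pays no startup cost at all; only config-changing tests trigger a restart.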

Let me know if you have any questions on the decisions made for any changes to any tests.

TODOs before merging

  • Add test cases
  • Add entry to changelog
  • Add changes to documentation (guides, repository pages, in-code descriptions)
  • Add security check
  • Add new Parse Error codes to Parse JS SDK
  • ...


codecov bot commented Mar 11, 2021

Codecov Report

Merging #7262 (f6db41c) into master (8b0e8cd) will decrease coverage by 0.05%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7262      +/-   ##
==========================================
- Coverage   94.03%   93.97%   -0.06%     
==========================================
  Files         179      179              
  Lines       13137    13140       +3     
==========================================
- Hits        12353    12348       -5     
- Misses        784      792       +8     
Impacted Files Coverage Δ
src/Adapters/Storage/Postgres/PostgresClient.js 66.66% <0.00%> (-11.12%) ⬇️
src/Adapters/Storage/Mongo/MongoStorageAdapter.js 92.25% <0.00%> (-0.67%) ⬇️
src/Routers/UsersRouter.js 93.75% <0.00%> (-0.63%) ⬇️
src/RestWrite.js 93.84% <0.00%> (-0.17%) ⬇️
...dapters/Storage/Postgres/PostgresStorageAdapter.js 95.50% <0.00%> (-0.08%) ⬇️
src/Adapters/Files/GridFSBucketAdapter.js 80.32% <0.00%> (+0.81%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8b0e8cd...f6db41c. Read the comment docs.


mtrezza commented Mar 11, 2021

Interesting, let's see if I got this right...

You are saying that currently for each test a new server is spun up and therefore each test runs on a clean server instance. Some tests may currently only be passing, because they run on a clean server instance. Your approach to identify poorly written tests is to not spin up a new server for every test and see whether they fail.

But couldn't the opposite be just as true? A test may rely on a previously run test and the change that test did to the server?


dplewis commented Mar 11, 2021

That's basically it. There are a lot of other benefits to using this approach. I have noticed some weird behavior.

The current deleteAllClasses code relies on _SCHEMA, but if a class exists without an entry in the schema, that table won't get deleted, which could break a test. Edit: discovered it is an indexing issue.

I'm still looking into this as this is just a first pass. I would like to never have to use await reconfigureServer().

Got the test suite to 5-7 minutes.


mtrezza commented Mar 11, 2021

Interesting approach. I think if the server is not restarted for every test, then the tests should run in random order. Otherwise the likelihood that a test passes only because of a previously run test increases. I think randomizing the order will reveal a lot, did you try that?

Got the test suite to 5-7 minutes.

Very cool, especially now that we increased the test environments
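Jasmine supports randomized spec order natively through its env configuration, so this could be switched on globally from the test bootstrap. A sketch of what such a helper fragment might look like (the environment variable name is an assumption, not something from this repo):

```javascript
// Jasmine bootstrap fragment (sketch): run specs in random order, with an
// optional fixed seed so a particular failing order can be replayed.
jasmine.getEnv().configure({
  random: true,
  seed: process.env.JASMINE_SEED, // e.g. JASMINE_SEED=12345 to replay a run
});
```

The seed that produced a failing run is printed by Jasmine at the end of the run, which makes order-dependent failures reproducible instead of flaky.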


dplewis commented Mar 11, 2021

These problems also exist in the JS SDK. Specifically the user spec. I'll randomize that suite first then this one.


dplewis commented Mar 11, 2021

I updated my original post with a list of possible test suites to investigate. If anyone wants to give me a hand that would be much appreciated.

@dplewis dplewis requested review from mtrezza and davimacedo March 12, 2021 14:35

dplewis commented Mar 12, 2021

@mtrezza I am comfortable with where this is at.

Here are some of my observations from using a single instance server.

  • A test using reconfigureServer({ ...some configuration }) should call reconfigureServer({}) afterwards to reset the instance back to the default so that it doesn't interfere with other tests. I also ran a randomized run: 627 failures. With reconfigureServer in a global afterEach / beforeEach, the randomized run failed only 5 tests.

  • Using await reconfigureServer({}) or TestUtils.destroyAllDataPermanently at the beginning of a test means indexes need to be regenerated (used for Polygon and GeoPoint 2d indexes, user unique indexes, and role indexes).

I may have fixed all Push related flaky tests.
I would like to see what will happen when this is merged into #7214


mtrezza commented Mar 12, 2021

If I add any more changes I may as well add reconfigureServer into a global afterEach / beforeEach.

What is the general pattern we want to follow and enforce in future PRs? I think we need to decide on a pattern that we think works best to mitigate flaky tests and yield best code robustness, excluding any special cases a developer may want to test. That should be documented in the contribution guide as well, otherwise we'll always have a mix of patterns. I do not see a clear pattern before or after this PR, as the server is reconfigured at different points, sometimes in afterAll, sometimes in beforeEach, sometimes not at all.

I think randomizing is a big step forward, and luckily that is controlled globally for all tests. Your suggestion of a global afterEach / beforeEach hits the point in this regard; maybe it is actually something we want?


dplewis commented Mar 12, 2021

@mtrezza Sorry I updated my previous comment.


mtrezza commented Mar 12, 2021

I also ran a randomized run: 627 failures.

There we go. I am not surprised, because a developer may as well deliberately (not) reconfigure a server between tests. I think we need to decide on a pattern, document it, and clean up existing tests to conform with it, which may take a while and happen gradually. Is it possible to randomize only certain files? That would help us make this change gradually. As for the pattern, it depends on whether randomized means all tests across all files or only tests within describes.

I would like to see what will happen when this is merged into #7214

Me too! To not block #7214, maybe we can just clean up push tests with the new pattern we decide on and then already merge, and the rest afterwards.

A test using reconfigureServer({ ...some configuration }) should use reconfigureServer({}) to reset the instance back to the default so that it wouldn't interfere with other tests.

How would the server then be reconfigured with a different configuration?


dplewis commented Mar 12, 2021

My main goal for this PR was to fix #7080. I'm currently saving 4 minutes per Mongo test run. If we can keep this speed improvement while documenting a pattern, I'm all for it.

I think we need to decide on a pattern, document it and clean up existing tests to conform with it

Cleaning up existing tests would reduce the speed improvement. Here are some improvements in this PR and my reasoning behind them. I'll start with the first 2 test files. For clarity RS = reconfigureServer and SI = single instance

In this file there is 1 RS call. If I don't add an RS call after the test is done to reset the SI, every test that follows will have push: pushAdapterOptions until another RS call. This could interfere with other tests and test spec files that would otherwise use the default push adapter.

[screenshot: spec file with a single RS call]

In this file each test has an RS call, so there is no interference between them. I added an RS afterAll here so there isn't any interference with other test files; this is for speed improvements. I could have used an RS afterEach, which would be useful for randomization; similarly, I could add an RS call after each individual test too.

[screenshot: spec file with an RS call in each test]

After typing all that I came to a conclusion. The master branch calls RS before all 2800 tests, but the number of tests that actually need that RS call for cleanup is 400.


mtrezza commented Mar 12, 2021

Thanks for explaining. I know you've done all the work already, but your explanation confirms for me that we need to find a pattern that works for all, it has to be documentable to guide developers and PR reviewers.

My 2cts:

  • Speed is surely a factor, but it should not win against test robustness; it should not be the defining factor when defining a pattern
  • afterAll should never be necessary; I would even say that we prohibit it and afterEach. Every test should have the obligation to prepare the environment as needed, not the other way around; using afterX turns that on its head and makes tests more complicated to debug, which is what we want to get away from.

So these would already be rules:

  • Every test must prepare its environment as needed with reconfigureServer({...}) without depending on a previous test.
  • It's not allowed to use afterEach, afterAll.

Another suggestion:

  • It is not permitted to call reconfigureServer in beforeEach, because it often leads to double reconfiguration. Say 50% of tests need a certain config, and for the other 50% the developer calls reconfigure again inside the test because they need a different configuration.

This is the kind of rule set that we are currently missing I think. Ideally we can even build a CI test to enforce it. I assume this would bring a speed improvement, but maybe not the improvement we currently see in this PR, and that's also fine if there is a benefit in return.
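Such a CI check could be approximated with a simple source scan. A sketch under stated assumptions: the function name, rule names, and regexes below are illustrative, not an actual parse-server script, and a real check would walk the spec directory instead of taking a string:

```javascript
// Hypothetical lint sketch for the proposed rule set: flag spec sources that
// use afterEach/afterAll, or that call reconfigureServer inside beforeEach.
function findRuleViolations(source) {
  const violations = [];
  if (/\bafterAll\s*\(/.test(source)) violations.push('uses afterAll');
  if (/\bafterEach\s*\(/.test(source)) violations.push('uses afterEach');
  // [^] matches any character including newlines; lazy so it stops at the
  // first reconfigureServer after a beforeEach.
  if (/beforeEach\s*\([^]*?reconfigureServer/.test(source)) {
    violations.push('reconfigureServer in beforeEach');
  }
  return violations;
}

const sample = `
beforeEach(async () => { await reconfigureServer({}); });
it('works', () => {});
afterAll(() => {});
`;
console.log(findRuleViolations(sample).length); // 2
```

A regex scan like this is crude (it would flag commented-out code, for example), but it would be enough to fail CI and point the author at the rule being broken.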


dplewis commented Mar 12, 2021

Another thing we should document is the use of Config.get(Parse.applicationId). Misusing it can also break random testing.

In the following example, you should get the config again after you call reconfigureServer. I also noticed a few tests setting the config directly, like config[someConfig] = someValue, or getting it globally in a beforeEach. I don't know if setting it directly is the best option.

Stale reference (the captured object is not updated by a later reconfigure):

```js
await reconfigureServer({ push: pushAdapterOptions });
const config = Config.get(Parse.applicationId);
// config.push === pushAdapterOptions
await reconfigureServer();
// config.push === pushAdapterOptions (stale)
```

Fresh lookup (get the config again after each reconfigure):

```js
await reconfigureServer({ push: pushAdapterOptions });
let config = Config.get(Parse.applicationId);
// config.push === pushAdapterOptions
await reconfigureServer();
config = Config.get(Parse.applicationId);
// config.push !== pushAdapterOptions
```
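The staleness above can be simulated with a minimal runnable sketch. Everything here (the registry Map, the Config object, the 'default'/'apns' values, and a synchronous reconfigureServer) is a stand-in for the real parse-server internals; the point is only that reconfiguring stores a new config object, so a reference captured earlier no longer reflects the current instance:

```javascript
// Minimal stand-in for a config registry: reconfiguring replaces the stored
// object, so references captured earlier go stale.
const registry = new Map();

const Config = {
  put(appId, config) { registry.set(appId, { ...config }); },
  get(appId) { return registry.get(appId); },
};

// Hypothetical default: no custom push adapter ('default').
function reconfigureServer(options = {}) {
  Config.put('appId', { push: options.push || 'default' });
}

reconfigureServer({ push: 'apns' });
const stale = Config.get('appId');
reconfigureServer(); // back to defaults; a new object is stored
console.log(stale.push);               // 'apns' (stale reference)
console.log(Config.get('appId').push); // 'default' (fresh lookup)
```

This is why a test that reconfigures mid-test must call Config.get again rather than hold on to the object it fetched earlier.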


mtrezza commented Mar 12, 2021

Yes, I'm using a similar pattern sometimes, and one has to be careful with this. We could disallow that in our "Test Quality Guide" if we think it's likely to be used incorrectly by developers.


dplewis commented Mar 13, 2021

It's not allowed to use afterEach, afterAll.

afterAll and afterEach have their uses.

Every test must prepare its environment as needed with reconfigureServer({...}) without depending on a previous test.

I added automatic detection of whether a test changes the single instance server. If a change is detected, the server is reset, so developers don't have to worry about it.

I also added the jasmine test reporter for debugging.

@davimacedo @mtrezza I think this is good to go.


mtrezza commented Mar 13, 2021

afterAll and afterEach have their uses

How, if the tests are randomized? I see them used for example in the CLI tests.


dplewis commented Mar 13, 2021

How if the test are randomized?

I see what you are saying; luckily our afterEach hooks have little effect on randomness. If we want randomness there are a few things we need to do.

  • Make Config global (I only found 1 test affected by this and fixed it)

```
RestQuery.each test afterSave response object is return
  Message:
    TypeError: Cannot read property 'allowCustomObjectId' of undefined
  Stack:
    TypeError: Cannot read property 'allowCustomObjectId' of undefined
        at new RestWrite (/Users/dplewis/Documents/GitHub/parse-server/lib/RestWrite.js:62:21)
        at Object.create (/Users/dplewis/Documents/GitHub/parse-server/lib/rest.js:133:15)
        at UserContext.done (/Users/dplewis/Documents/GitHub/parse-server/spec/RestQuery.spec.js:420:10)
        at <Jasmine>
        at process._tickCallback (internal/process/next_tick.js:68:7)
```

  • UserPII.spec.js acts weird when run in random order. I don't know much about PII.
  • The GraphQL tests need to be looked at again. I don't know much about GraphQL. They create a lot of servers that would have to be properly closed; I reduced the number of servers created.
  • LiveQuery tests
  • Remove the afterAll and afterEach hooks

```diff
  if (Object.keys(openConnections).length > 0) {
    fail('There were open connections to the server left after the test finished');
  }
- TestUtils.destroyAllDataPermanently(true).then(done, done);
+ await TestUtils.destroyAllDataPermanently(true);
+ if (didChangeConfiguration) {
```
Member:

That's a clever idea, but doesn't that cause an unnecessary reconfiguration in many cases?

I'm thinking:

  • Test A does reconfiguration with custom config
  • afterEach does reconfiguration with default config
  • Test B does reconfiguration with custom config

@dplewis dplewis Mar 13, 2021

Test A does reconfiguration with custom config
afterEach does reconfiguration with default config
Test B doesn't do reconfiguration (unaffected by Test A Config)
afterEach doesn't do reconfiguration
Test C doesn't do reconfiguration (unaffected by Test A Config)
afterEach doesn't do reconfiguration
Test D does reconfiguration with custom config

The net reconfiguration is the same. If you are interested, I left a counter in the test suite:

https://github.com/parse-community/parse-server/runs/2099411665

As you can see, it reconfigures 461 times, which is much better than the 2825 it was before.

Member:

I think I'm getting it. You are assuming that tests B and C do not need a reconfig, therefore none should be done. And removing these unnecessary reconfigs was something this PR also does (among other things)?

Member Author:

Yes

Member Author:

I figured out how random testing works.

  1. Randomize the test suites and run them.
  2. Before a test suite runs, randomize the tests within it.

Basically, randomizing a single test suite can isolate the problem instead of running the entire suite.

I'll try it in a separate PR. Only 40 failing tests to go.
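That two-level randomization can be sketched with a plain shuffle. The suite and test names below are made up for illustration, and Jasmine performs this kind of shuffling internally when random order is enabled; this is just the shape of the idea:

```javascript
// Two-level randomization sketch: shuffle the suites, then shuffle the
// tests inside each suite.
function shuffle(items, rand = Math.random) {
  const a = [...items];
  // Fisher-Yates shuffle.
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

const suites = [
  { name: 'ParseUser', tests: ['signup', 'login', 'logout'] },
  { name: 'Push', tests: ['send', 'schedule'] },
];

const order = shuffle(suites).map(suite => ({
  name: suite.name,
  tests: shuffle(suite.tests),
}));
console.log(order.length); // 2
```

Because each suite's tests are shuffled only within that suite, a failure can be reproduced by re-running that one suite in the same order, without replaying the whole run.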

Member:

Great discovery! And just to be clear, no tests were removed or disabled in this PR, correct? I noticed re-ordering and re-grouping, but not sure whether I've missed something else.

@dplewis dplewis Mar 13, 2021

That is correct.

Edit: I have removed some of the duplicate tests.

@dplewis dplewis requested a review from mtrezza March 13, 2021 01:58
@mtrezza mtrezza left a comment

LGTM!


mtrezza commented Mar 13, 2021

I don't think we need UserPII in its current form. It seems to just test whether a logged out user can access a user object in various ways, which are effectively all the same and unnecessary. That doesn't require a distinct test suite IMO, and should already be covered in the login / logout or object access tests.

Regarding a "Test Quality Guide" for how to write tests, which rules would you say it should include, given the insight you have now after digging into the tests? (I will open a separate PR for this).

@dplewis dplewis merged commit 9563793 into master Mar 13, 2021
@dplewis dplewis deleted the clean-tests branch March 13, 2021 15:05
@dplewis dplewis mentioned this pull request Mar 13, 2021
Arul- pushed a commit to Arul-/parse-server that referenced this pull request Mar 25, 2021
* initial pass

* reconfigureServer when needed

* finish postgres tests

* mongo tests

* more tests

* clean up

* re-add skipped test

* Fix transaction tests

* handle batch

* AuthenticationAdapter fix

* More reconfiguration

* clean up

* properly terminate cli servers

* handle Parse.Push

* Flaky PushController

* ensure reconfigureServer when changed

* fix postgres tests

* remove console.log

* LiveQuery spec remove duplicates and listeners
@parseplatformorg (Contributor)

🎉 This change has been released in version 5.0.0-beta.1

@parseplatformorg parseplatformorg added the state:released-beta Released as beta version label Nov 1, 2021
@parseplatformorg (Contributor)

🎉 This change has been released in version 5.0.0

@parseplatformorg parseplatformorg added the state:released Released as stable version label Mar 14, 2022

Successfully merging this pull request may close these issues.

Performance between wiredTiger mongo and mmapv1
3 participants