
[7.0.0] Performance regression from 6.8 (over 3x slower in our test suite) #15853

Closed
GC-Mark opened this issue Apr 7, 2021 · 37 comments · Fixed by #16113
Assignees
Labels
type: performance 🏃‍♀️ Performance related type: regression A bug that didn't appear until a specific Cy version release v7.0.0 🐛 Issue present since 7.0.0

Comments

@GC-Mark

GC-Mark commented Apr 7, 2021

I have split this off from the performance regression issue related to 6.6 - 6.7 #15779

Can also confirm our tests are taking way longer in v7.0.0 than they were in v6.8.0.

Also, because the tests are taking longer, we get random failing tests due to timeout issues.

I'm not sure what info I can give you to help you debug this, but if there is anything you need me to do, just ask.

Our last few CI runs:

[Screenshot: recent CI run times]

v6.8.0:

[Screenshot: v6.8.0 run times]

v7.0.0 (including a random failing test due to a timeout):

[Screenshot: v7.0.0 run times]

@GC-Mark GC-Mark changed the title [7.0.0] Performance regression from 6.8 [7.0.0] Performance regression from 6.8 (over 3x slower in out test suite) Apr 7, 2021
@GC-Mark GC-Mark changed the title [7.0.0] Performance regression from 6.8 (over 3x slower in out test suite) [7.0.0] Performance regression from 6.8 (over 3x slower in our test suite) Apr 7, 2021
@rbayliss

rbayliss commented Apr 7, 2021

Yeah, I can confirm this as well: our tests are much slower under 7.0. In our case it's closer to 50% slower, although we have much longer tests overall. This only happens in CI (GitHub Actions) for us; local runs of `cypress run` show no measurable difference. Rolling back to 6.9.1 fixes the slowness for us.

@csvan

csvan commented Apr 7, 2021

Probably related: #15827

@lassesteffen

We also noticed a heavy increase in run time in our test suite (13 mins to 28 mins, roughly 2x slower) when upgrading from 6.8.0 to 7.0.0. Very interested in a workaround / fix for this.

@lugus

lugus commented Apr 8, 2021

I can confirm as well; we rolled back to 6.9.

@pkyeck

pkyeck commented Apr 8, 2021

I can also confirm the increase in duration when switching from 5.6.0 to 7.0.0 on CircleCI:

before:

✔  All specs passed!                        10:10       73       73        -        -        -  

after:

✔  All specs passed!                        16:59       73       73        -        -        -  

@lukeapage
Contributor

lukeapage commented Apr 8, 2021

I'm trying to narrow down the commit in 7.0 that causes this, but it takes a long time for me as our process is quite slow. Will continue tomorrow.

6f9e80a - slow - bad

232bd87 - ok - good

@agg23 agg23 self-assigned this Apr 8, 2021
@j1000

j1000 commented Apr 8, 2021

We are observing the same thing. On my workstation it's as fast as ever, but on our CI machine (with fewer resources) it's taking 3x as long. Maybe it's using more memory?

@JayBizzle

CI was definitely the slowest for us, but we also saw the slowdown locally, just not as bad.

@lukeapage
Contributor

I narrowed it down above in #15853 (comment) (editing the comment), but I don't see anything in the bad commit that's suspicious. I'm also in doubt because the Cypress commit history is an absolute nightmare to navigate: I'm used to rebased or squashed commits so you see everything flat, but not only does Cypress not do that, they also have a develop and a master branch and merge between them constantly 😱

[Screenshot: commit graph]

@estefafdez

Same here: almost 1h extra running the same tests on 7.0 and 7.0.1 compared to 6.9.1.

@lukeapage
Contributor

I can't see any obvious regression in the times it takes to run Cypress's own tests, but I don't think that is surprising. Cypress uses itself for testing, but all of its tests are fast because they run against tiny pages with very little JS and few additional resources, whereas real-world Cypress usage is on resource-heavy pages.

My own app is a very heavy SPA with 3 MB of JavaScript, and we get a 7x speed regression on low-spec CI machines and no regression on fast multi-core developer machines.

@pjg

pjg commented Apr 10, 2021

I believe #15841 might play a role here. Depending on the testing setup (local / with backend API running / CI), those incorrect network requests might have to wait for a timeout, which could be the cause of the slowdown.

@lukeapage
Contributor

We don’t use intercept or any stubbing and still have a big slowdown

@brian-mann
Member

brian-mann commented Apr 11, 2021

> I narrowed it down above in #15853 (comment) (editing the comment), but I don't see anything in the bad commit that's suspicious. I'm also in doubt because the Cypress commit history is an absolute nightmare to navigate: I'm used to rebased or squashed commits so you see everything flat, but not only does Cypress not do that, they also have a develop and a master branch and merge between them constantly 😱

There was a period of time where we accidentally did not squash PRs, but that shouldn't be the case any more. Most of the commits are likely due to us using a monorepo, as you're seeing commits to things outside of the binary. The master <-> develop flow is expected because we release on 2-week cycles: master represents the latest tip(s) that have been released, and develop represents the latest tip(s) that are slated to be released.

@agg23
Contributor

agg23 commented Apr 12, 2021

I apologize for the delay. If you haven't seen my testing in #15779 (comment), take a look. After a bunch of playing with it, I arrived at the following system for automatically bisecting, assuming we're looking for differences in Chrome/Electron CPU time. This gist contains the final run scripts.

This was run on an AWS t2.xlarge (this is important), configured in the following way:

wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo apt install ./google-chrome-stable_current_amd64.deb
sudo apt install build-essential unzip

git clone https://github.com/cypress-io/cypress-example-kitchensink.git
cd cypress-example-kitchensink
yarn

cd ../
git clone https://github.com/cypress-io/cypress.git

The process was initiated via:

git bisect start 0399a6e58e226b860a41acf2847ee43adb8ad41e 73317218230319d45689bbad3ce46e7ab2312e18
git bisect run ./bisect.sh

considering 0399a6e (tag 7.0.0) as known bad and 7331721 (tag 6.8.0) as known good.
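For context, `git bisect run` drives the search automatically by interpreting the script's exit code: 0 marks the commit good, any code from 1 to 127 (except 125) marks it bad, and 125 skips the commit. A minimal sketch of what a `bisect.sh` along these lines might look like (the threshold, paths, and build commands are assumptions for illustration, not the actual gist):

```shell
#!/bin/sh
# Hypothetical sketch of a bisect.sh driven by `git bisect run`.
# Exit code contract: 0 = good, 1-127 (except 125) = bad, 125 = skip commit.

THRESHOLD=60  # assumed cutoff in seconds between a "fast" and a "slow" run

# Map a measured duration (seconds) to a bisect verdict.
classify() {
  if [ "$1" -lt "$THRESHOLD" ]; then
    echo 0   # fast enough: good
  else
    echo 1   # slow: bad
  fi
}

# Placeholder for the real workload: install deps for this Cypress commit and
# time a kitchensink run (the comment above measured Chrome/Electron CPU time
# instead of wall-clock time).
run_suite() {
  start=$(date +%s)
  yarn --frozen-lockfile >/dev/null 2>&1 || return 125
  (cd ../cypress-example-kitchensink && yarn cypress run >/dev/null 2>&1) || return 125
  end=$(date +%s)
  echo $((end - start))
}

# Under `git bisect run ./bisect.sh` the entry point would be:
#   elapsed=$(run_suite) || exit 125
#   exit "$(classify "$elapsed")"
```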

This process turned up b52ac98 as the first bad commit. This commit is notable as it both updates to a new major version of Electron (12.0, from 11.3.0) and a new major version of Node (14.16.0, from 12.18.3). However, it doesn't seem likely that upgrading Electron would change Chrome performance whatsoever.

| Commit | Chrome real | Chrome user | Chrome sys | Electron real | Electron user | Electron sys |
| --- | --- | --- | --- | --- | --- | --- |
| 3fc5f03 (good) | 119.90 | 101.23 | 25.86 | 111.85 | 49.45 | 9.13 |
| f6a5d1e (bad) | 112.19 | 100.00 | 25.46 | 107.51 | 75.89 | 21.13 |
| e76edd7 (good) | 102.69 | 36.74 | 5.55 | 100.84 | 39.95 | 6.44 |
| f31013d (bad) | 109.51 | 96.38 | 24.04 | 106.62 | 75.27 | 20.78 |
| 7e9da23 (good) | 112.28 | 92.88 | 24.48 | 107.10 | 48.36 | 9.24 |
| bc4d5ea (good) | 108.89 | 91.79 | 23.99 | 103.88 | 47.09 | 9.34 |
| b52ac98 (bad) | 109.05 | 96.30 | 24.99 | 110.26 | 77.43 | 21.02 |
| c449775 (good) | 109.71 | 93.54 | 25.01 | 103.92 | 48.06 | 8.87 |

Looking at the numbers, that does indeed seem to be the case, as Chrome numbers did not change significantly other than one run being significantly faster than the others. This also points to another potential problem. As seen in #15779 (comment), both Chrome and Electron were noted to slow down with each version upgrade.

  1. I speculate that Electron (the default browser for `cypress run`, which is what will be used in CI) has been slowed at least by the Electron upgrades, and that this is the primary cause of the slowdowns reported in #15779 and #15853. This problem is exacerbated on limited CI machines.
  2. Chrome has a more niche slowdown, specifically correlated with the strength of the machine. With ample resources Chrome runs fine, but when constrained, the differences between 6.6, 6.7, 6.8, and 7.0 slow it down in some way.

TLDR:
We will be looking into the Electron upgrade performance, and I am going to re-bisect on a slower AWS instance to compare the results. The issue with Chrome performance has not yet been identified.

@GC-Mark
Author

GC-Mark commented Apr 13, 2021

Anyone tried the latest 7.1.0 version?

@SimonasB88

SimonasB88 commented Apr 13, 2021

Not sure how this relates, but I noticed extremely high CPU usage on my Mac when running tests from the local terminal on 7.1.0. I still need to double-check against other apps on the Mac, but there wasn't such high CPU usage in earlier versions.

@estefafdez

@GC-Mark we did, same result and time as 7.0 and 7.0.1

[Screenshot: run times]

@agg23
Contributor

agg23 commented Apr 13, 2021

> Anyone tried the latest 7.1.0 version?

Just for clarity, 7.1.0 does not include any changes we expect would influence performance.

@GC-Mark
Author

GC-Mark commented Apr 13, 2021

> Anyone tried the latest 7.1.0 version?

> Just for clarity, 7.1.0 does not include any changes we expect would influence performance.

Ha, I bet that's what you said about version 7.0.0 😆

@yktoo

yktoo commented Apr 13, 2021

I confirm that, too. Tests with both 7.0.1 and 7.1.0 take at least double the time compared to 6.9.1.

@mmonteiroc

Same here switching from version 6.8.0 to 7.1.0 directly

@Case09

Case09 commented Apr 14, 2021

Same, jumped from 6.2.0 to 7.1.0, all suites take double the time to finish

@pete-om

pete-om commented Apr 19, 2021

> Do you have any instructions of how to generate a metric like this using Cypress?

Hey, I don't want to pollute this thread with off-topic stuff, but the TL;DR is to pass values from Cypress (e.g. elapsed time) to a StatsD server (search for statsd-client), which then feeds into a Prometheus data source in Grafana.
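Sketching that pipeline's first hop (everything here is illustrative: the metric name, the measured value, and the StatsD address are assumptions; only the StatsD timing line format `<metric>:<value>|ms` is fixed by the protocol):

```shell
# Hypothetical sketch: format one elapsed-time measurement as a StatsD
# "timing" metric and ship it to a StatsD daemon over UDP.
METRIC="cypress.suite.duration"   # hypothetical metric name
ELAPSED_MS=6420                   # e.g. captured by a CI wrapper around `cypress run`

# StatsD timing line, e.g. "cypress.suite.duration:6420|ms"
LINE=$(printf '%s:%s|ms' "$METRIC" "$ELAPSED_MS")
echo "$LINE"

# Fire-and-forget UDP send to a local StatsD daemon (Prometheus can then
# ingest it via statsd_exporter, and Grafana reads from Prometheus):
#   printf '%s' "$LINE" | nc -u -w1 127.0.0.1 8125
```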

@yann-combarnous

@agg23, any update on current investigation status, or place to look for updates? Thx in advance.

@agg23
Contributor

agg23 commented Apr 21, 2021

@flotwig is now the primary investigator on this issue. I am unaware of any new breakthroughs.

@IgorPahota

I have the same issue. I have a mock API using the intercept and fixture features; after updating to 7.0 it just stopped working! Running tests with mocks is now slower than with the real API!

@agg23 agg23 assigned flotwig and unassigned agg23 Apr 21, 2021
flotwig added a commit that referenced this issue Apr 21, 2021
Fixes (avoids?) #15853

Electron v12.0.0-beta.16 and above contain an unknown bug causing a major slowdown when video recording is enabled. For now maybe we can downgrade Electron to this last known good version, v12.0.0-beta.14
@lukeapage
Contributor

Will this commit give us any improvement?
04e854e

@flotwig
Contributor

flotwig commented Apr 21, 2021

@lukeapage yeah, this slowdown seems to be related to video recording, and there was a bug in Cypress 7 causing it to always capture video frames. So with 04e854e you'll at least be able to disable video to avoid this issue.
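For reference, the workaround flotwig describes uses the standard `video` config option; note it only fully avoids the slow path once the fix in 04e854e is in a release. In the legacy `cypress.json` used by Cypress 6/7:

```json
{
  "video": false
}
```

The same override can be passed per-run with `cypress run --config video=false`, which is handy if you only want to disable video in CI.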

@jennifer-shehane jennifer-shehane added type: performance 🏃‍♀️ Performance related stage: work in progress type: regression A bug that didn't appear until a specific Cy version release labels Apr 22, 2021
flotwig added a commit that referenced this issue Apr 22, 2021
@cypress-bot
Contributor

cypress-bot bot commented Apr 22, 2021

The code for this is done in cypress-io/cypress#16113, but has yet to be released.
We'll update this issue and reference the changelog when it's released.

@mmonteiroc

mmonteiroc commented Apr 23, 2021

Hey! This issue is not related to video recording... On our side we never recorded videos with Cypress, yet we hit this issue when upgrading to 7.1.0, among other issues that we opened... @flotwig

@lukeapage
Contributor

@mmonteiroc see #15853 (comment).
There were 2 bugs: 1. a slowdown when video is enabled; 2. video frames always being captured, even with recording disabled.

@cypress-bot
Contributor

cypress-bot bot commented Apr 26, 2021

Released in 7.2.0.

This comment thread has been locked. If you are still experiencing this issue after upgrading to
Cypress v7.2.0, please open a new issue.

@cypress-bot cypress-bot bot locked as resolved and limited conversation to collaborators Apr 26, 2021