Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[New Feature] Sharding / Parallel Execution #1732

Merged
merged 8 commits into from
Jun 17, 2024

Conversation

sdfgsdfgd
Copy link
Contributor

@sdfgsdfgd sdfgsdfgd commented May 5, 2024

Highlights

Parallel Execution and Sharding

  • New shards parameter --shards 5 or -s 5 to distribute tests across multiple devices

  • Enhance --deviceid parameter to allow specification of multiple devices like --deviceid 'emulator-5554,emulator-5556,1C011FDEE052JE'

  • Implement sharding to distribute test flows evenly across emulators and devices ( theoretically limited by adb port limitation of 64 pairs in one instance of maestro )

  • Uses specified devices, if not uses available/connected devices, if not prompts the user for device creation for the missing devices up to shard count

  • Wait until all shards are initialised

  • Can run even multiple sharded maestro instances of real devices and emulators simultaneously at the same time, up to infinity (or 64? who knows), as much as the host memory allows (Haven't been able to test beyond 7 shards yet, if there is an inefficiency I'll be happy to work on it to improve this so we can indeed support as many devices as possible)

  • Improved Port Management and thread synchronizations, use of a cute little CountDownLatch to make shards wait

  • Set dynamic port allocation for the AndroidDriver to enhance RPC communication

  • JVM Coroutines (version 1.6.3) to improve asynchronous task management

  • Small clean up for assemble gradle task by: Disable lint checks on assemble tasks to streamline the build process and quiet the errors

  • Enables dynamic reception of the server port parameter from the Activity Manager (am - nifty little tool !)
    https://developer.android.com/tools/adb#am
    https://developer.android.com/studio/test/command-line

Impact:
these enhancements not only increase the robustness and flexibility of Maestro but also set the stage for future extensions and improvements

Run multiple sharded maestro clusters on a horde of real devices and simulators mixed together, run 15 real devices alongside 17 emulators, as 5 different clusters of 120 different devices - go crazy. Help us find a bottleneck so we can actually achieve this 100% smoothly and reliably. I haven't been able to test beyond almost 5 to 7 shards on my device yet.

Attachments

demo setup: 4 devices/shards 8 test files in folder, (the tests scroll & tap around settings app)

  1. 1 x my real device (scrcpy)
  2. 2 x API 34 android emulators
  3. 1 x prompt created android 34

demo command, run installDist executable, or with run gradle task:

  1. ./maestro --device 'emulator-5554,emulator-5556,emulator-5558,1C011FDEE002JE' test -s 5 ~/Desktop
  2. run gradle task run --args="--device 'emulator-5554,emulator-5556,emulator-5558,1C011FDEE002JE' test -s 5 ~/Desktop
Sharding_720p.mov

New different ways to run :

  • maestro test -s 4 ~/Desktop
    run with 4 shards on all yaml files under Desktop. Either reuses all online devices or prompts for creation of 4 devices if no running devices found
  • maestro test -s 4 ~/Desktop --device '1C011FDEE002JE'
    reuses a physical device with serial 1C011FDEE002JE and either reuses 3 other online devices, or prompts for creation of enough devices to reach 4 shards

@axelniklasson
Copy link
Contributor

Thanks for taking the time to create this PR @sdfgsdfgd, this will be a great addition! Someone from the team will review it next week and work with you to get this merged.

@sdfgsdfgd
Copy link
Contributor Author

sdfgsdfgd commented May 11, 2024

thank you, in the meantime

  1. noticed a minor bug with device creation's 2nd and later runs if you do sharding without any online devices
  2. report file generations for each shard I totally forgot about this,

fixed both ! pushing them soon

…for all shards 3. Fix bug on device creation for two or more devices 4. Move device cfg creation sequence to happen before launch
Copy link
Contributor

@igorsmotto igorsmotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is a big one, but most importantly is a hard one. Your ideas on how to solve this challenge are really good.

Unfortunately, let me say upfront that it will take me a while to fully review it, since of its high complexity.

In the meantime I'll be posting here all my questions as I spend more and more time in this PR.


It seems like this whole solution doesn't work on iOS::

  1. It can only create one device, probably because it tries to create them all of the same name (Error: A device with name Maestro_iPhone11_17 is already connected).
  2. Even if I use already created multiple devices, it seems like only one is actually being operated by Maestro, unsure what is going on.
  3. Haven't tested: Unsure what will happen if the user pass a list of devices that has a mix of android and devices in it. We should probably error it out, this definitely means that the user is making a mistake. This is also a weird user experience if no devices is passed, because the user is prompted multiple times what platforms it wants to run that shard.

Is this something planned? To only add support for Android? If so, then we need to error out and properly document that this feature can only work for Android


Having shards silently overrides the sequence execution, which can be unexpected to a user given that the sequence is a workspace-configuration.
Adding support to it is non-trivial, as it would essentially means there is a need to build a graph to have some level of parallel execution, so I'd expect this to error out.


The TestCommand is responsible for creating those devices, but this suggests that the StartDeviceCommand should also have a shard parameter. This way a user can avoid having to interact with Maestro and just create a bunch of devices in a single sweep.

Improve wording

Co-authored-by: Igor Lema <48068670+igorsmotto@users.noreply.github.com>
@sdfgsdfgd
Copy link
Contributor Author

This PR is a big one, but most importantly is a hard one. Your ideas on how to solve this challenge are really good.

Unfortunately, let me say upfront that it will take me a while to fully review it, since of its high complexity.

In the meantime I'll be posting here all my questions as I spend more and more time in this PR.

It seems like this whole solution doesn't work on iOS::

  1. It can only create one device, probably because it tries to create them all of the same name (Error: A device with name Maestro_iPhone11_17 is already connected).
  2. Even if I use already created multiple devices, it seems like only one is actually being operated by Maestro, unsure what is going on.
  3. Haven't tested: Unsure what will happen if the user pass a list of devices that has a mix of android and devices in it. We should probably error it out, this definitely means that the user is making a mistake. This is also a weird user experience if no devices is passed, because the user is prompted multiple times what platforms it wants to run that shard.

Is this something planned? To only add support for Android? If so, then we need to error out and properly document that this feature can only work for Android

Having shards silently overrides the sequence execution, which can be unexpected to a user given that the sequence is a workspace-configuration. Adding support to it is non-trivial, as it would essentially means there is a need to build a graph to have some level of parallel execution, so I'd expect this to error out.

The TestCommand is responsible for creating those devices, but this suggests that the StartDeviceCommand should also have a shard parameter. This way a user can avoid having to interact with Maestro and just create a bunch of devices in a single sweep.

Is this something planned? To only add support for Android? If so, then we need to error out and properly document that this feature can only work for Android

yes, this is exactly right. Sry I forgot to mention this :) The support is initially for Android. I was hoping we could address iOS and Web support seperately in a different thread. Or unless somebody wants to contribute that here, go ahead. Otherwise I might help out with this effort and look into those after this PR is merged ? I'll be around slack as well, if you'd like to discuss more.

Having shards silently overrides the [sequence execution](https://maestro.mobile.dev/cli/test-suites-and-reports#sequential-execution), which can be unexpected to a user given that the sequence is a workspace-configuration.
Adding support to it is non-trivial, as it would essentially means there is a need to build a graph to have some level of parallel execution, so I'd expect this to error out.

great point ! Completely missed this. I will make it error out if the sequence order thing is enabled, you're right this part would be trivial to address right now. Can get back to it at some point later perhaps. Thank u

The TestCommand is responsible for creating those devices, but this suggests that the StartDeviceCommand should also have a shard parameter. This way a user can avoid having to interact with Maestro and just create a bunch of devices in a single sweep.

agreed, addressing this as well

@igorsmotto
Copy link
Contributor

yes, this is exactly right. Sry I forgot to mention this :) The support is initially for Android. I was hoping we could address iOS and Web support seperately in a different thread.

@sdfgsdfgd

Got it!

However, to not impose a worse experience to maestro users, its important for this PR to either error out if the user tries to do something that is not supported (mixing platforms or simply using iOS), or to add support for iOS. Otherwise, it will quickly turn into a bug report as a user tries to use it unknowingly.

Looking from above - I don't think this PR is that far from adding support to iOS. I see just a couple of changes that needs to be done:

  • LocalXCTestInstaller is receiving the hard coded port 22087 to spawn the XCTestServer (which already receives the port to run the server by argument) inside the Simulator, it needs to be a unique port per instance.
  • To create an iOS device the name needs to be unique.
    Both changes looks to be very close to exactly what you did for Android.

Let me know what you end up deciding to do!

@sdfgsdfgd
Copy link
Contributor Author

sdfgsdfgd commented May 25, 2024

yes iOS support seems to be a very minimal addition on top of the big work - including it with my next commit as well, almost done with addressing this round's feedbacks.

Looking from above - I don't think this PR is that far from adding support to iOS. I see just a couple of changes that needs to be done:

  • LocalXCTestInstaller is receiving the hard coded port 22087 to spawn the XCTestServer (which already receives the port to run the server by argument) inside the Simulator, it needs to be a unique port per instance.
  • To create an iOS device the name needs to be unique.
    Both changes looks to be very close to exactly what you did for Android.

Let me know what you end up deciding to do!

sdfgsdfgd added 3 commits June 1, 2024 21:48
… App.port CLI 2. Use port range 7000-7128 instead of 9000-9128 3. Fix reporting of sharded test runs to be beautifully verbose with device info 4.
@sdfgsdfgd
Copy link
Contributor Author

sdfgsdfgd commented Jun 2, 2024

addressed everything so far, added iOS support for sharding
suites.mergeSummaries()?.saveReport() should report to a single file as before, with testsuites w/ device info merged inside xml this time

polished the reporting a little bit, I think verbose display of device to test mapping is going to be very helpful to all of us, see the example screenshot below

also - MAESTRO_DRIVER_STARTUP_TIMEOUT with even 2 sharded wait time easily crosses the default time of 15 secs. Maybe the default value could be increased a bit to a more realistic 20-30 secs instead.

Screenshot 2024-06-02 at 2 15 25 PM

edit: I don't know where those conflicts are coming from, I can't see anything wrong when I merge the master, ._.

LMK any issues or performance related stuff comes up. I could improve resource cleanup at the end

@sdfgsdfgd sdfgsdfgd requested a review from igorsmotto June 2, 2024 04:24
@igorsmotto
Copy link
Contributor

That's great @sdfgsdfgd! Thank you once again for all the hard work here!

From my initial tests it seems like everything is working perfectly. Including iOS, which is amazing!
I'm now in the process to run this branch against our whole internal test suite (it does take some time to run) and I'll get back to you as soon as I can.

Also, I took the liberty to push a merge commit to resolve the conflicts.

@igorsmotto igorsmotto merged commit dc1c0fa into mobile-dev-inc:main Jun 17, 2024
1 check passed
@igorsmotto
Copy link
Contributor

After testing thoroughly didn't found any regression. Thanks for the amazing work @sdfgsdfgd! 🎉

This is probably going to be released on the next Maestro version. Release date TBD

@bartekpacia
Copy link
Contributor

See also #1818

:)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants