
Update androidxTestEspressoVersion to 3.5.0 #17615

Closed
wants to merge 3 commits

Conversation

@oguzkocer (Contributor) commented Dec 5, 2022

Updates androidxTestEspressoVersion to 3.5.0.

To test:

Since this dependency change is only related to testing, CI checks should be enough.
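For anyone verifying locally, one way to confirm which Espresso version Gradle actually resolves is a dependency report. The module and configuration names below mirror the ones used elsewhere in this thread, but may differ in other setups:

```shell
# Print the resolved dependency tree for an instrumented-test configuration
# and filter it down to Espresso entries. Module and configuration names are
# taken from this project's setup and may differ elsewhere.
./gradlew :WordPress:dependencies \
  --configuration jetpackWasabiDebugAndroidTestRuntimeClasspath \
  | grep -i espresso
```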

Regression Notes

  1. Potential unintended areas of impact
    N/A

  2. What I did to test those areas of impact (or what existing automated tests I relied on)
    N/A

  3. What automated tests I added (or what prevented me from doing so)
    N/A

PR submission checklist:

  • I have completed the Regression Notes.
  • I have considered adding accessibility improvements for my changes.
  • I have considered if this change warrants user-facing release notes and have added them to RELEASE-NOTES.txt if necessary.

@oguzkocer oguzkocer added this to the 21.4 milestone Dec 5, 2022
@wpmobilebot (Contributor) commented Dec 5, 2022

📲 You can test these changes on WordPress by downloading wordpress-installable-build-pr17615-83228e9.apk

App: WordPress
Build Flavor: Jalapeno
Build Type: Debug
Commit: 83228e9

Note: This installable build uses the JalapenoDebug build flavor, and does not support Google Login.

@wpmobilebot (Contributor) commented Dec 5, 2022

📲 You can test these changes on Jetpack by downloading jetpack-installable-build-pr17615-83228e9.apk

App: Jetpack
Build Flavor: Jalapeno
Build Type: Debug
Commit: 83228e9

Note: This installable build uses the JalapenoDebug build flavor, and does not support Google Login.

Base automatically changed from deps/remove-unused-dependencies-for-libs to trunk December 5, 2022 13:32
@ParaskP7 ParaskP7 self-assigned this Dec 6, 2022
@ParaskP7 ParaskP7 self-requested a review December 6, 2022 09:54
@ParaskP7 (Contributor) commented Dec 6, 2022

👋 @pachlava !

I am shamelessly pinging you here in case you can help with this single, consistent UI test failure on e2eLoginWithMagicLink. The logs that Firebase provides in this case aren't very helpful, at least to me; maybe you can figure out what's wrong, since you've been working with failing UI tests for some time now. 🙏 🤞 🙇

I see this test got re-enabled a couple of months ago (see here); as such, I am not sure if it is just flaky or something else (see PR). I am also adding @jostnes here, in case you would like to collaborate on this, and I am quoting the note below, copy-pasted from the PR that re-enabled those tests. 🤔

Tests were previously disabled because of flakiness. After some testing in FTL and local,
looks like we can re-enable these back without any code change. There was a recent
update to the FTL emulator version so that change might have helped with the flakiness.

Below, just an FYI for you on what I've done so far.

  1. First, I ran the e2eLoginWithMagicLink test and the whole LoginTests suite locally, on a physical device of mine (Pixel 4), and everything went ✅ on my side.
  2. Then, I ran the e2eLoginWithMagicLink test and the whole LoginTests suite via AS, but on Firebase (following your lovely instructions here, kudos!), and the results were again (kind of) ✅ as well, see below:
    1. First try (LoginTests suite): The failure was on e2eLoginWithPasswordlessAccount this time, but e2eLoginWithMagicLink was successful. 🤷
    2. Second try (e2eLoginWithMagicLink only): ✅
    3. Third try (LoginTests suite again): ✅
    4. Another try (after merging androidxTestCoreVersion -> 1.5.0 and androidxTestExtJunitVersion -> 1.1.3): ✅

PS: You will notice that I used Pixel 6, API 33 to run these Firebase tests, because AS, for some reason, didn't allow me to run them on our usual Pixel 2, API 30 configuration; see the AS error message below, which points to an incompatible device/OS combination, which, to be fair, is totally true. Maybe we should update our CI configuration to point to newer devices at some point, but that is another discussion to be had, right... 🤷

Pixel 2, Google | Android 11, API Level 30 (R) | English | Portrait
	Skipped triggering the test execution: Incompatible device/OS combination
Test running failed: No test results
Finish

@pachlava (Contributor) commented Dec 6, 2022

@ParaskP7 👋🌞

I'm still looking into it, but I'm very close to being sure that this is caused by the test execution taking too long. In both runs associated with this PR (one, two) the execution time was over 15 min, which is what this warning is about:

(screenshot: FTL warning about the 15-minute timeout)

So, it's possible that it has nothing to do with this exact test; it just happens to be the one executing when the time reaches 15 min. Usually, the WPAndroid tests run for 11-12 minutes. I'll keep looking into why it became slower (the update to 3.5.0 or some other reason).

@ParaskP7 (Contributor) commented Dec 6, 2022

👋 @pachlava !

Thank you so much for jumping into this problem so quickly, much appreciated! 🙇

I'm still looking into it, but I'm very close to being sure that this is caused by the test execution taking too long. In both runs associated with this PR (one, two) the execution time was over 15 min, which is what this warning is about:

Ah, this is very interesting! 🥇

It didn't even occur to me to make the timeout association you did, nice catch! 😅 💪

I wonder how come it times out exactly on this e2eLoginWithMagicLink test, every single time. I think we've run the UI tests on this PR about 10 times now. Maybe this specific test is now hanging forever, without making any progress at all? 🤔

So, it's possible that it has nothing to do with this exact test; it just happens to be the one executing when the time reaches 15 min.

Hmmm, you think? 🤔

Still, it is very unlikely that all the other tests succeed and this test is the one that fails, every single time.

Maybe what we can do to verify this assumption is to:

  1. First, try disabling this e2eLoginWithMagicLink test (via @Ignore) and see how long it takes for all the other tests to complete?
  2. Then, try disabling all the other tests (20 of them, via @Ignore) and see how long it takes for e2eLoginWithMagicLink to complete.

Wdyt? 🤔
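As a lighter-weight variant of the two steps above, the suite can also be narrowed from the command line instead of annotating tests, via the instrumentation runner's class filter. The Gradle task and test package below are illustrative guesses, not the project's verified names:

```shell
# Run only one instrumented test method by passing a class filter to the
# instrumentation runner. The task name and test package are illustrative
# placeholders; substitute the project's real ones.
./gradlew :WordPress:connectedDebugAndroidTest \
  -Pandroid.testInstrumentationRunnerArguments.class=org.wordpress.android.e2e.LoginTests#e2eLoginWithMagicLink
```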

Usually, the WPAndroid tests run for 11-12 minutes.

This is good to know, thanks for sharing! 💯

Having said that, I now feel that we are already at our limit here, right? We have a buffer of ~3 minutes, which might start causing us trouble going forward, if it isn't already causing hiccups. 🤔

Can't we update this 15-minute limit somehow? 🤔

If not, maybe we should try making our tests faster, or splitting them into two runs? 🤔

I'll continue taking a look into why it became slower (update to 3.5.0 or any other reason).

Again, thank you so much for helping us here Sergiy! 🙇

@pachlava (Contributor) commented Dec 6, 2022

@ParaskP7 would it be possible to temporarily roll back to 3.4.0 to see if the tests run faster after that?

Regardless of the results from ^, I think the normal execution still being close to 15 minutes is a sign that we need to adjust it to 30? 🤔 (with the potential of adding more tests / FTL just being slow). It looks like the default value of the --timeout parameter is 15 min, and to make it adjustable, this param has to be added to android_firebase_test.rb.
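For context on what such support would wrap: FTL's limit is the `--timeout` flag of `gcloud firebase test android run`, which indeed defaults to 15m. A hypothetical invocation raising it could look like this (the APK paths and device matrix are illustrative placeholders, not the toolkit's actual values):

```shell
# Hypothetical sketch of an FTL run with a raised timeout. The APK paths and
# device matrix are placeholders; only the flags themselves are real gcloud
# options.
gcloud firebase test android run \
  --app app-debug.apk \
  --test app-debug-androidTest.apk \
  --device model=Pixel2.arm,version=30,locale=en,orientation=portrait \
  --timeout 30m
```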

@ParaskP7 (Contributor) commented Dec 6, 2022

@ParaskP7 would it be possible to temporarily roll back to 3.4.0 to see if the tests run faster after that?

Of course @pachlava, feel free to create a branch from this PR and roll it back to 3.4.0 there, then open a test PR based on that branch. Then you could try any other update you'd like on that newly created test branch and/or PR of yours! 💯

PS: I am asking you to do that in order to avoid adding temporary commits to this branch and PR. I hope that's not too much trouble for you. 🙏

Regardless of the results from ^, I think the normal execution still being close to 15 minutes is a sign that we need to adjust it to 30? 🤔 (with the potential of adding more tests / FTL just being slow).

That's exactly what I was thinking too! 🤔

It looks like the default value of the --timeout parameter is 15 min, and to make it adjustable, this param has to be added to android_firebase_test.rb.

Sure, I would consider doing that too, but only after we've exhausted all other options. Wdyt? 🤔

@pachlava (Contributor) commented Dec 6, 2022

Still, it is very unlikely that all the other tests succeed and this test is the one that fails, every single time.

The order of test execution on FTL tends to be the same from what I've seen, and the e2eLoginWithMagicLink test is always the 7th test. In the cases of timeout, you can see that all previous tests were green, and at the same time no remaining tests are executed after the magic link one (we have 15 tests overall). If it were just flakiness, the remaining 7 tests should still have been executed, I think.

feel free to create a branch from this PR and roll-it-back to 3.4.0 there

I'll do it @ParaskP7 and will get back with the results 👋

@ParaskP7 (Contributor) commented Dec 6, 2022

The order of test execution on FTL tends to be the same from what I've seen, and the e2eLoginWithMagicLink test is always the 7th test.

👍

In the cases of timeout, you can see that all previous tests were green, and at the same time no remaining tests are executed after the magic link one (we have 15 tests overall).

👍

If it were just flakiness, the remaining 7 tests should still have been executed, I think.

Hmmm... true, this makes sense, I think! 😅 🤔

I'll do it @ParaskP7 and will get back with the results 👋

👍 🙇 👋

@pachlava (Contributor) commented Dec 6, 2022

@ParaskP7 👋

I've created a testing branch and PR #17631 where I:

  • First, rolled back to Espresso 3.4.0 with 93b0edf, with no changes to the tests. The test run took 11 minutes, and all 15 tests were executed 🟢. This is the usual situation for 3.4.0.
  • Then reverted the rollback and disabled all BlockEditor tests with 7b05941, which are usually executed at the very beginning. If my theory about the timeout being the culprit is correct, e2eLoginWithMagicLink should now pass, and either all remaining tests will be executed with no failures (if this takes less than 15 minutes), or some other test will fail when the time reaches 15 minutes. After the execution was done, the latter turned out to be true: as can be seen from this video, it's now the e2eNavigateThroughPosts test that is in progress when the time reaches 15 minutes, and it's marked as failed, while e2eLoginWithMagicLink has passed.

I wonder how come it times out exactly on this e2eLoginWithMagicLink test, every single time. I think we've run the UI tests on this PR about 10 times now. Maybe this specific test is now hanging forever, without making any progress at all? 🤔

Now I see what you mean, sorry. I took a look at the first commit from this PR (4fb6f5a), which had 5 reruns of Instrumented Tests. In all 5 runs, the execution took a few seconds more than 15 minutes, and e2eLoginWithMagicLink was the one executing when the time reached 15 min. What I forgot to mention earlier, which could maybe have helped, is that in every case I look at the execution video, where it's visible that the test is not hanging; it just gets shut down. Also, it can be seen from the video that it's indeed the e2eLoginWithMagicLink test in progress when the time reaches 15 min.

I think this all is a sign that:

  • The switch to 3.5.0 affects test performance negatively.
  • Regardless of the plans to switch to 3.5.0, it probably makes sense to add support for --timeout parameter.

Speaking of how to move on with this PR, I'm on the fence, if you ask my opinion. On one hand, it's good to stay up-to-date, and maybe it makes sense to update to 3.5.0 once timeout support is available, since there are some new features in the release notes. On the other hand, we can see that the execution is at least 4 minutes slower with 3.5.0, which is a lot given multiple reruns per day on such an active repo. So maybe it's worth waiting a bit, taking into account that 3.5.0 is relatively young 🤔

@ParaskP7 (Contributor) commented Dec 7, 2022

👋 @pachlava !

Thank you so much for looking into all that, and of course for testing things out to figure out what's what; I really appreciate the help here! 🙇

The switch to 3.5.0 affects test performance negatively.

Yeap, it seems so indeed. 💯

I wonder why, and whether there is something in our configuration that causes this trouble and performance degradation... 🤔

FYI: I know our UI e2e tests were built a while back, and that since then we have only been maintaining them, without actively trying to improve them. Also, I know we have lots of Thread.sleep(...) calls here and there, for example see idleFor(...) and its usages, which can go as high as an 8-second delay, for example see generateStats(). We have about 40 of those call sites, each sleeping for seconds. If we take an average of 4 seconds and multiply that by the 40 call sites, we get a delay of about 3 minutes, just from Thread.sleep(...). This, by default, doesn't help with UI test performance. We might want to consider idling resources at some point, or at least deal with these delays sooner rather than later. ⏳
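The back-of-envelope math above can be sketched out explicitly (the 40 call sites and the 4-second average are the rough estimates from this comment, not measured values):

```shell
# Rough estimate of pure sleep overhead per UI test run, using the rough
# estimates from the discussion above (not measured values).
call_sites=40
avg_delay_s=4
total_s=$((call_sites * avg_delay_s))
echo "~${total_s}s (~$((total_s / 60))m $((total_s % 60))s) spent just sleeping"
```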

Also, I wonder if our using Pixel 2 API 30, which is quite old and outdated by now, especially device-wise, in comparison to something like Pixel 6 API 33, has anything to do with it. Wdyt? 🤔

Regardless of the plans to switch to 3.5.0, it probably makes sense to add support for --timeout parameter.

I totally agree on that; we should add this --timeout option to the release-toolkit repo, and more specifically to the android_firebase_test.rb ruby file, in advance, to make it configurable for clients to choose their own timeout according to their needs. 💯

Having said the above, for now I wouldn't change the --timeout and would stick to the 15-minute default. But I would like us to have this option available for when we do need to change the timeout, or for cases where we need to test whether increasing the timeout fixes the issue and what that means in terms of how much longer the UI tests then run.

Speaking of how to move on with this PR, I'm on the fence, if you ask my opinion. On one hand, it's good to stay up-to-date, and maybe it makes sense to update to 3.5.0 once timeout support is available, since there are some new features in the release notes. On the other hand, we can see that the execution is at least 4 minutes slower with 3.5.0, which is a lot given multiple reruns per day on such an active repo. So maybe it's worth waiting a bit, taking into account that 3.5.0 is relatively young 🤔

Yeap, I agree with you on that Sergiy, and I am actually not so much on the fence here; I think we should definitely delay the upgrade to 3.5.0. We should stick to 3.4.0 for now, and investigate whether others in the community are/were also experiencing such performance degradation, whether there are open tickets about it, whether workarounds exist, etc.

Then, we could also wait for a possible newer 3.5.1 release that might fix some or all of this. We would then try it out again, check the results, and take it from there.


Cc @oguzkocer , what are your thoughts on all that? 🤔

@pachlava (Contributor) commented Dec 7, 2022

I know we have lots of Thread.sleep(...) calls here and there, for example see idleFor(...) and its usages, which can go as high as an 8-second delay, for example see generateStats().

Generally, I agree with you. Even though I saw that a different idle() is mostly used, waiting for 100ms in the "least harmful" kind of finite waiting loop, and generateStats() is only used in Screenshot tests, we still have a lot of 1-2 sec delays that are not helping with performance. I'm not sure the time savings will reach 3 minutes, but there will definitely be some.

We might want to consider idling resources at some point, or at least deal with these delays sooner rather than later. ⏳

I agree with you totally 👍

Also, I wonder if our using Pixel 2 API 30, which is quite old and outdated by now, especially device-wise, in comparison to something like Pixel 6 API 33, has anything to do with it. Wdyt? 🤔

I know that some work was done by @jostnes in this area recently, since a few months ago we were still using API 28. I think the main guidance for device selection was speed + test stability, and the "Pixel2.arm + API30" combination was the fastest and most stable back then. Since this task does not require a lot of active involvement (or Internet 😅), I'll run a series of tests today in the dedicated PR #17636 I just opened, to see what else we can try.

Having said the above, for now I wouldn't change the --timeout and would stick to the 15-minute default.

👍 about that. It can be a good natural limiter, and we will still have the option to adjust it really easily, if needed.

I think we should definitely delay the upgrade to 3.5.0

I agree here as well @ParaskP7 💯

@ParaskP7 (Contributor) commented Dec 7, 2022

👋 @pachlava !

...even though I saw a different idle()...

Yeap, this is a different idle(...) function and its usage is negligible, which is why I didn't mention it. However, this other idleFor(...) is indeed a heavy one. 🚔

...and generateStats() is only used in Screenshot tests...

True, we should exclude anything screenshot related from the metrics, good catch! 💯

I'll run a series of tests today in the dedicated PR #17636 I just opened, to see what else we can try.

This is awesome, thank you so much! 🙇 ❤️ 🤞

@pachlava (Contributor) commented Dec 7, 2022

I'll run a series of tests today in the dedicated PR #17636 I just opened, to see what else we can try.

Some follow up on this, @ParaskP7 👋

Even though FTL lists a wide selection of devices for use ...

(screenshot: FTL device catalog)

...and it looks like they are available right now at no additional cost (or am I getting this wrong? 🤔), this is the actual selection:

(screenshot: the actually available device selection)

Not that interesting any longer, with Pixel 3 probably being the fanciest.

  • I tried using Pixel 3 (938f2c5; the FTL run is marked as failed because of an unrelated app crash, while all the tests passed, but that's a different topic 😞), but the test run took 3 minutes longer than it takes on Pixel 2 ARM API 30.

  • Then I tried Pixel 2 ARM API 33 (834825e), and four tests failed that only rarely fail otherwise.

  • Then I went for Pixel 2 ARM API 32 (727eff3); the tests passed there, and execution took 11:10. That's still slower than what we have with Pixel 2 ARM API 30: 10:49 (here) or even 09:19 (here).

This is in line with what @jostnes described in /woocommerce/woocommerce-android/pull/7454:

... From testing, found that Pixel 3 API 30 runs the test really slow so will exclude that device for now. There are also some changes needed if we remain to use the current Pixel 2 device.

There is Pixel 2 (ARM) emulator that supports multiple APIs including newer ones. I tested that out and it runs the tests well without requiring any other code changes. Will update the tests to use that device with API 30 (Android 11) as it is currently the most used version.

After those checks, if we choose from what we currently have available, Pixel 2 ARM API 30 shows the best performance 🤷‍♂️
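As an aside, the actually runnable catalog (and which API levels each model supports) can also be checked from the CLI rather than the console; these are real gcloud subcommands, though the output naturally varies over time:

```shell
# List FTL's current Android device catalog, including supported API levels.
gcloud firebase test android models list

# Inspect a single model in detail, e.g. the Pixel 2 ARM emulator used in CI.
gcloud firebase test android models describe Pixel2.arm
```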

@pachlava (Contributor) commented Dec 7, 2022

I'll also actually try out the mysterious MediumPhone.arm now 🙂

@ParaskP7 (Contributor) commented Dec 7, 2022

👋 @pachlava !

Thank you so much for going to all this trouble to figure out whether we could use another device/API alternative; you are awesome! ❤️

After those checks, if we choose from what we currently have available, the Pixel 2 ARM API 30 shows the best performance 🤷‍♂️

🤷 Yeaaa, it does seem so...

I'll also actually try out the mysterious MediumPhone.arm now 🙂

😅 Good luck! 🍀 🤞

@pachlava (Contributor) commented Dec 8, 2022

😅 Good luck! 🍀 🤞

Thanks, @ParaskP7, it was needed 😅! With MediumPhone, the behaviour is very similar to Pixel 2:

  • 1b157fd, API 30: no difference in speed from what we use now ("Pixel 2 API 30")
  • 85e4b49, API 32: ~same performance
  • 952358c, API 33: slower performance, and the same kind of failures as on Pixel 2 API 33 (834825e)

This makes me think that the API version plays the main role, while the model/device plays a secondary role (does it play any role at all, apart from a different screen size 🤔?), and there's no win in changing from what we currently use to the other available options.

@ParaskP7 (Contributor) commented Dec 8, 2022

👋 @pachlava !

With MediumPhone, the behaviour is very similar to Pixel 2.

Thank you a lot for sharing this data and for testing with MediumPhone! 🥇 🙇

This makes me think that the API version plays the main role, while the model/device plays a secondary role (does it play any role at all, apart from a different screen size 🤔?), and there's no win in changing from what we currently use to the other available options.

Yea, it indeed seems so; and once more, thanks for the thorough testing, which helped us reach this first conclusion! 💯

I think we now have a better idea of where to focus, which is the 3.5.0 library update itself, more so than anything else. 🙏

@oguzkocer (Contributor, Author):

@pachlava @ParaskP7 Thank you so much for looking into this in so much detail. I agree with both of you that we should postpone/skip this update for now, especially due to the increase in execution time. I'll close the PR, and we can circle back to it when it makes sense.

@oguzkocer oguzkocer closed this Dec 9, 2022
@oguzkocer oguzkocer deleted the dep-updates/androidxTestEspressoVersion branch December 9, 2022 17:23
ParaskP7 added a commit that referenced this pull request May 11, 2023
This is a temporary test configuration to see if connected tests will
stop timing out after 15 minutes of run time.

------------------------------------------------------------------------

This temporary test configuration is applied because running the below
Gradle 'dependencies' command did show that the 'Espresso' dependency is
now being transitively updated to '3.5.0' (from '3.4.0'), which is known
to add significant delays to our UI tests, as seen when testing such an
update via its dedicated dependency update PR (see below):

Gradle Command: ./gradlew :WordPress:dependencies --configuration jetpackWasabiDebugAndroidTestRuntimeClasspath
Espresso PR: https://github.com/wordpress-mobile/WordPress-Android/pull/17615