Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValidateJavaVersion is expensive #8369

Closed
jeromelaban opened this issue Sep 25, 2023 · 14 comments · Fixed by #8663
Closed

ValidateJavaVersion is expensive #8369

jeromelaban opened this issue Sep 25, 2023 · 14 comments · Fixed by #8663
Assignees
Labels
Area: App+Library Build Issues when building Library projects or Application projects. need-attention A xamarin-android contributor needs to review

Comments

@jeromelaban
Copy link

jeromelaban commented Sep 25, 2023

Android application type

.NET Android (net7.0-android, etc.)

Affected platform version

net 7 and later

Description

The task ValidateJavaVersion takes about 23 seconds to run for every project of a solution, even when running on a beefy machine (Standard_L16s_v3 with premium SSDs).

https://github.com/xamarin/xamarin-android/blob/4fffbf686bad5479a8ef7954d7d9d0d68941f034/src/Xamarin.Android.Build.Tasks/Tasks/ValidateJavaVersion.cs#L14

Interestingly, the caching mechanism here does seem to work, but not in a meaningful way:

https://github.com/xamarin/xamarin-android/blob/4fffbf686bad5479a8ef7954d7d9d0d68941f034/src/Xamarin.Android.Build.Tasks/Tasks/ValidateJavaVersion.cs#L104

Steps to Reproduce

Build any android library and observe the task takes a long time to run. In a large solution, this accounts for about 4 minutes of build time.

image

image

Did you find any workaround?

None at this time.
image

Relevant log output

No response

@jeromelaban jeromelaban added Area: App+Library Build Issues when building Library projects or Application projects. needs-triage Issues that need to be assigned. labels Sep 25, 2023
@dellis1972
Copy link
Contributor

out of curisity how long does java -version take to run on the command line?
also what output does it produce for your sdk? (I see you are using a custom java sdk).

@dellis1972 dellis1972 added need-info Issues that need more information from the author. and removed needs-triage Issues that need to be assigned. labels Sep 25, 2023
@microsoft-github-policy-service
Copy link
Contributor

Hi @jeromelaban. We have added the "need-info" label to this issue, which indicates that we have an open question for you before we can take further action. This issue will be closed automatically in 7 days if we do not hear back from you by then - please feel free to re-open it if you come back to this issue after that time.

@jeromelaban
Copy link
Author

jeromelaban commented Sep 25, 2023

So running the commands directly give a 7.4s for java.exe and 3.8s for javac.exe. Do you get similar values on your end?

About the Java SDK, we're using is the one by default on azure devops hosted agents, but maybe we've been using the wrong one for a while... which one's the most appropriate in that context?

@microsoft-github-policy-service microsoft-github-policy-service bot added need-attention A xamarin-android contributor needs to review and removed need-info Issues that need more information from the author. labels Sep 25, 2023
@dellis1972
Copy link
Contributor

so on my mac I get the following

 time java -version
openjdk version "17.0.8" 2023-07-18 LTS
OpenJDK Runtime Environment Microsoft-8035246 (build 17.0.8+7-LTS)
OpenJDK 64-Bit Server VM Microsoft-8035246 (build 17.0.8+7-LTS, mixed mode, sharing)
java -version  0.04s user 0.03s system 81% cpu 0.094 total

So that is really odd its taking so long. I'll have to try on a windows box somewhere and see what I see.

@jonathanpeppers
Copy link
Member

One thing to note is that we cache the results of these calls with an AppDomain lifetime with RegisterTaskObject. So you'd see this on the first build per MSBuild node unless the Java SDK path changes.

We do call both javac -version and java -version, maybe we could drop one of those?

@jonathanpeppers
Copy link
Member

@jeromelaban can you tell if the Java SDK path is changing between calls? Maybe this invalidates the cache for all of them?

@dellis1972 dellis1972 added need-info Issues that need more information from the author. and removed need-attention A xamarin-android contributor needs to review labels Sep 25, 2023
@microsoft-github-policy-service
Copy link
Contributor

Hi @jeromelaban. We have added the "need-info" label to this issue, which indicates that we have an open question for you before we can take further action. This issue will be closed automatically in 7 days if we do not hear back from you by then - please feel free to re-open it if you come back to this issue after that time.

@jeromelaban
Copy link
Author

@jonathanpeppers The path does not seem to change between invocations.

Here's a binlog, if there are specifics you're looking for:
build-5.0.0-dev.pool-update.2751-netcoremobile-net8-UWP.binlog.zip

@microsoft-github-policy-service microsoft-github-policy-service bot added need-attention A xamarin-android contributor needs to review and removed need-info Issues that need more information from the author. labels Sep 25, 2023
@jonathanpeppers
Copy link
Member

The caching is working:

ValidateJavaVersion
    Using cached value for `C:\hostedtoolcache\windows\Java_Temurin-Hotspot_jdk\11.0.20-101\x64\bin\java.exe -version`: 11.0.20.1
    Found Java SDK version 11.0.20.1.
    Using cached value for `C:\hostedtoolcache\windows\Java_Temurin-Hotspot_jdk\11.0.20-101\x64\bin\javac.exe -version`: 11.0.20.1
    Found Java SDK version 11.0.20.1.

But what is weird is how the non-cached calls like like 20s:
image
image
(It seems like I see one of these per MSBuild node)

@jeromelaban can you check what AzDO image you are using?

It does seem to be using a different Temurin JDK than what VS installs:

JavaSdkPath = C:\hostedtoolcache\windows\Java_Temurin-Hotspot_jdk\11.0.20-101\x64

We normally use Microsoft Open JDK in C:\Program Files\Microsoft\jdk-11.0.19.7-hotspot.

@jeromelaban
Copy link
Author

jeromelaban commented Sep 25, 2023

The caching is working, but not always, that's the curious part. It could be because the build is using multiple msbuild nodes.

I'm currently using this particular image: https://github.com/actions/runner-images/tree/releases/win22/20230918. I've checked with more recent builds with the same image, and this particular task can vary between 8s and 20s.

Now with regards to selecting the Open JDK that comes with VS, do you select it with boots? or do you expect to have this specific path?

@jonathanpeppers
Copy link
Member

I wonder if some version of C:\Program Files\Microsoft\jdk-11.0.19.7-hotspot is already preinstalled.

But the AzDO images have JAVA_HOME set, and it overrides it?

@dellis1972
Copy link
Contributor

dellis1972 commented Sep 25, 2023

@jeromelaban out of curiosity can you add the msbuild flag -m:1 to force it to use only one node. See if that changes the cache usage?

@jonpryor
Copy link
Member

jonpryor commented Oct 3, 2023

I wonder if it would be "good enough" to instead parse $(_JavaSdkDirectory)\release, which would replace a java -version invocation with reading a file with contents:

IMPLEMENTOR="Microsoft"
IMPLEMENTOR_VERSION="Microsoft-40648"
JAVA_VERSION="11.0.16.1"
JAVA_VERSION_DATE="2022-08-12"
LIBC="default"
MODULES=""
OS_ARCH="x86_64"
OS_NAME="Darwin"
SOURCE=""

…which we already have code to read: https://github.com/xamarin/xamarin-android-tools/blob/8a971d94a3fa2f0e8f69c5cf742c6836c14be1cd/src/Xamarin.Android.Tools.AndroidSdk/JdkInfo.cs#L208-L234

and we already create a JdkInfo instance in <ResolveSdks/>, so maybe that value should be cached instead?

https://github.com/xamarin/xamarin-android/blob/a1b1f7be388b69506d4c3521a4e357f88e43080b/src/Xamarin.Android.Build.Tasks/Tasks/ResolveSdksTask.cs#L97

Do we even need <ValidateJavaVersion/> anymore? Considering that the <ResolveSdks/> task runs first, and runs anyway, should we just "merge" these tasks?

@jeromelaban
Copy link
Author

@jeromelaban out of curiosity can you add the msbuild flag -m:1 to force it to use only one node. See if that changes the cache usage?

It does though not everywhere (maybe some nodes are still created anyways), but it's not particularly applicable in larger projects :)

jonathanpeppers added a commit to jonathanpeppers/xamarin-android that referenced this issue Jan 22, 2024
Fixes: dotnet#8369

During Android project builds, the `<ValidateJavaVersion/>` MSBuild task
currently shells out to `java -version` and `javac -version` to
determine if we have a *valid* & usable JDK.

Unfortunately, some customer logs have reported build times that are
completely bananas:

    Task ValidateJavaVersion 23.469s
    ...
    Top 10 most expensive tasks
    ...
    ValidateJavaVersion = 3:46.740, 35 calls

`java -version` and/or `javac -version` must be quite slow! Reading the
source for OpenJDK, it appears that `java -version` actually starts a
JVM to call a single method:

* https://github.com/openjdk/jdk/blob/df370d725e5ae55a05479e8375bf665233ac3e44/src/java.base/share/native/libjli/java.c#L1895-L1913
* https://github.com/openjdk/jdk/blob/c9cacfb25d1f15c879c961d2965a63c9fe4d9fa7/src/java.base/share/classes/java/lang/VersionProps.java.template#L191-L233

That could explain the slowness?

To improve this, let's just parse the file:

* C:\Program Files\Microsoft\jdk-11.0.19.7-hotspot\release

With contents:

    IMPLEMENTOR="Microsoft"
    IMPLEMENTOR_VERSION="Microsoft-7621296"
    JAVA_VERSION="11.0.19"
    JAVA_VERSION_DATE="2023-04-18"
    LIBC="default"
    MODULES="java.base java.compiler java.datatransfer java.xml java.prefs java.desktop java.instrument java.logging java.management java.security.sasl java.naming java.rmi java.management.rmi java.net.http java.scripting java.security.jgss java.transaction.xa java.sql java.sql.rowset java.xml.crypto java.se java.smartcardio jdk.accessibility jdk.internal.vm.ci jdk.management jdk.unsupported jdk.internal.vm.compiler jdk.aot jdk.internal.jvmstat jdk.attach jdk.charsets jdk.compiler jdk.crypto.ec jdk.crypto.cryptoki jdk.crypto.mscapi jdk.dynalink jdk.internal.ed jdk.editpad jdk.hotspot.agent jdk.httpserver jdk.internal.le jdk.internal.opt jdk.internal.vm.compiler.management jdk.jartool jdk.javadoc jdk.jcmd jdk.management.agent jdk.jconsole jdk.jdeps jdk.jdwp.agent jdk.jdi jdk.jfr jdk.jlink jdk.jshell jdk.jsobject jdk.jstatd jdk.localedata jdk.management.jfr jdk.naming.dns jdk.naming.ldap jdk.naming.rmi jdk.net jdk.pack jdk.rmic jdk.scripting.nashorn jdk.scripting.nashorn.shell jdk.sctp jdk.security.auth jdk.security.jgss jdk.unsupported.desktop jdk.xml.dom jdk.zipfs"
    OS_ARCH="x86_64"
    OS_NAME="Windows"
    SOURCE=".:git:6171c19d2c18"

We can just call `StreamReader.ReadLine()` until `JAVA_VERSION="` is
found.

I kept the existing logic in place, with an escape hatch if it were ever
needed. After a few .NET 9 previews, we could consider removing old
logic entirely?

This logic is considerably faster, because we no longer launch a JVM!

* Before: Task ValidateJavaVersion 377ms
* After: Task ValidateJavaVersion 2ms

Where you can test the "Before" logic by passing
`-p:_AndroidUseJavaExeVersion=true`.

I assume the customer's log from before (35 calls) would be around 70ms
give or take instead of 3 hours and 46 minutes? Even if this estimation
was 10x worse, it would still be a huge improvement! :eyes:
dellis1972 pushed a commit that referenced this issue Jan 23, 2024
Fixes: #8369

During Android project builds, the `<ValidateJavaVersion/>` MSBuild task
currently shells out to `java -version` and `javac -version` to
determine if we have a *valid* & usable JDK.

Unfortunately, some customer logs have reported build times that are
completely bananas:

    Task ValidateJavaVersion 23.469s
    ...
    Top 10 most expensive tasks
    ...
    ValidateJavaVersion = 3:46.740, 35 calls

`java -version` and/or `javac -version` must be quite slow! Reading the
source for OpenJDK, it appears that `java -version` actually starts a
JVM to call a single method:

* https://github.com/openjdk/jdk/blob/df370d725e5ae55a05479e8375bf665233ac3e44/src/java.base/share/native/libjli/java.c#L1895-L1913
* https://github.com/openjdk/jdk/blob/c9cacfb25d1f15c879c961d2965a63c9fe4d9fa7/src/java.base/share/classes/java/lang/VersionProps.java.template#L191-L233

That could explain the slowness?

To improve this, let's just parse the file:

* C:\Program Files\Microsoft\jdk-11.0.19.7-hotspot\release

With contents:

    IMPLEMENTOR="Microsoft"
    IMPLEMENTOR_VERSION="Microsoft-7621296"
    JAVA_VERSION="11.0.19"
    JAVA_VERSION_DATE="2023-04-18"
    LIBC="default"
    MODULES="java.base java.compiler java.datatransfer java.xml java.prefs java.desktop java.instrument java.logging java.management java.security.sasl java.naming java.rmi java.management.rmi java.net.http java.scripting java.security.jgss java.transaction.xa java.sql java.sql.rowset java.xml.crypto java.se java.smartcardio jdk.accessibility jdk.internal.vm.ci jdk.management jdk.unsupported jdk.internal.vm.compiler jdk.aot jdk.internal.jvmstat jdk.attach jdk.charsets jdk.compiler jdk.crypto.ec jdk.crypto.cryptoki jdk.crypto.mscapi jdk.dynalink jdk.internal.ed jdk.editpad jdk.hotspot.agent jdk.httpserver jdk.internal.le jdk.internal.opt jdk.internal.vm.compiler.management jdk.jartool jdk.javadoc jdk.jcmd jdk.management.agent jdk.jconsole jdk.jdeps jdk.jdwp.agent jdk.jdi jdk.jfr jdk.jlink jdk.jshell jdk.jsobject jdk.jstatd jdk.localedata jdk.management.jfr jdk.naming.dns jdk.naming.ldap jdk.naming.rmi jdk.net jdk.pack jdk.rmic jdk.scripting.nashorn jdk.scripting.nashorn.shell jdk.sctp jdk.security.auth jdk.security.jgss jdk.unsupported.desktop jdk.xml.dom jdk.zipfs"
    OS_ARCH="x86_64"
    OS_NAME="Windows"
    SOURCE=".:git:6171c19d2c18"

We can just call `StreamReader.ReadLine()` until `JAVA_VERSION="` is
found.

I kept the existing logic in place, with an escape hatch if it were ever
needed. After a few .NET 9 previews, we could consider removing old
logic entirely?

This logic is considerably faster, because we no longer launch a JVM!

* Before: Task ValidateJavaVersion 377ms
* After: Task ValidateJavaVersion 2ms

Where you can test the "Before" logic by passing
`-p:_AndroidUseJavaExeVersion=true`.

I assume the customer's log from before (35 calls) would be around 70ms
give or take instead of 3 hours and 46 minutes? Even if this estimation
was 10x worse, it would still be a huge improvement! :eyes:
@github-actions github-actions bot locked and limited conversation to collaborators Feb 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Area: App+Library Build Issues when building Library projects or Application projects. need-attention A xamarin-android contributor needs to review
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants