Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[generator] Cache static final field values. #1248

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Conversation

jpobst
Copy link
Contributor

@jpobst jpobst commented Aug 27, 2024

Fixes: #1243

Consider the Thread.State enum:

/* partial */ class Thread {
    public static final /* partial */ enum State {
        NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED
    }
}

This "actually" is a class with fields for each enum value:

% javap java.lang.Thread.State
Compiled from "Thread.java"
public final class java.lang.Thread$State extends java.lang.Enum<java.lang.Thread$State> {
  public static final java.lang.Thread$State NEW;
  public static final java.lang.Thread$State RUNNABLE;
  public static final java.lang.Thread$State BLOCKED;
  public static final java.lang.Thread$State WAITING;
  public static final java.lang.Thread$State TIMED_WAITING;
  public static final java.lang.Thread$State TERMINATED;
  public static java.lang.Thread$State[] values();
  public static java.lang.Thread$State valueOf(java.lang.String);
  static {};
}

When we bind it, we bind the fields as properties:

public sealed partial class State : Java.Lang.Enum
{
internal State() : base (default(Java.Interop.JniObjectReference), default(Java.Interop.JniObjectReferenceOptions)) { }
public static Java.Lang.Thread.State Blocked { get { throw null; } }
[System.ComponentModel.EditorBrowsableAttribute(1)]
[System.Diagnostics.DebuggerBrowsableAttribute(0)]
public override Java.Interop.JniPeerMembers JniPeerMembers { get { throw null; } }
public static Java.Lang.Thread.State New { get { throw null; } }
public static Java.Lang.Thread.State Runnable { get { throw null; } }
public static Java.Lang.Thread.State Terminated { get { throw null; } }
public static Java.Lang.Thread.State TimedWaiting { get { throw null; } }
public static Java.Lang.Thread.State Waiting { get { throw null; } }
[Java.Interop.JniMethodSignatureAttribute("valueOf", "(Ljava/lang/String;)Ljava/lang/Thread$State;")]
public static Java.Lang.Thread.State ValueOf(string name) { throw null; }
[Java.Interop.JniMethodSignatureAttribute("values", "()[Ljava/lang/Thread$State;")]
public static Java.Interop.JavaObjectArray<Java.Lang.Thread.State>? Values() { throw null; }
}

However, each of those properties involves a ValueManager lookup:

namespace Java.Lang {
	public partial class Thread {
		public sealed partial class State : global::Java.Lang.Enum {
			public static global::Java.Lang.Thread.State? Runnable {
				get {
					const string __id = "RUNNABLE.Ljava/lang/Thread$State;";

					var __v = _members.StaticFields.GetObjectValue (__id);
					return global::Java.Interop.JniEnvironment.Runtime.ValueManager.GetValue<global::Java.Lang.Thread.State? >(ref __v, JniObjectReferenceOptions.Copy);
				}
			}
		}
	}
}

This is JavaInterop1, not XAJavaInterop1, but .NET for Android isn't much different:

			public static Java.Lang.Thread.State? Runnable {
				get {
					const string __id = "RUNNABLE.Ljava/lang/Thread$State;";

					var __v = _members.StaticFields.GetObjectValue (__id);
					return global::Java.Lang.Object.GetObject<Java.Lang.Thread.State> (__v.Handle, JniHandleOwnership.TransferLocalRef);
				}
			}

The problem is that value lookup is not fast -- identity hash code needs to be obtained, locks obtained, etc. -- to the point that repeated enum lookups can actually show up in profiles.

Instead, update property generation for all final fields to cache the return value. This means we'd only need to call StaticFields.GetObjectValue() and "GetValue" once, instead of once per-access:

			global::Java.Lang.Thread.State? _Runnable_cache;
			public static global::Java.Lang.Thread.State? Runnable {
				get {
					if (_Runnable_cache != null) return _Runnable_cache;
					const string __id = "RUNNABLE.Ljava/lang/Thread$State;";

					var __v = _members.StaticFields.GetObjectValue (__id);
					return _Runnable_cache  = global::Java.Interop.JniEnvironment.Runtime.ValueManager.GetValue<global::Java.Lang.Thread.State? >(ref __v, JniObjectReferenceOptions.Copy);
				}
			}

Note there are a few wrinkles here:

  • If the field value is null, we will set the cached value to null which will trigger us to re-request the value. IE: the cache will not work for null values. This likely doesn't occur enough to be a concern.
  • The cache field (_Runnable_cache) could be a nullable value type or a nullable reference type. This complicates things as .NET treats these as very different things. A solution was found using casting that allows us to treat them the same without needing to explicitly use Nullable.HasValue or Nullable.Value.
  • Generating the correct code when NRT is not enabled gets pretty complex, as we then have to track what field types are value types and which are reference types to generate different code. To avoid this complexity, this entire feature is only enabled when NRT is enabled for the binding project. This includes Mono.Android, every AndroidX/GPS library we bind, and every user binding created from our templates (unless the user explicitly disabled NRT). This limitation should not impact too many bindings.

@jpobst jpobst force-pushed the cached-fields branch 6 times, most recently from 8417edd to 29f4095 Compare August 27, 2024 23:43
@jpobst jpobst marked this pull request as ready for review August 28, 2024 21:05
@jpobst jpobst requested a review from jonpryor August 28, 2024 21:05
@jonpryor
Copy link
Member

jonpryor commented Sep 3, 2024

Draft commit message:

Fixes: https://github.com/dotnet/java-interop/issues/1243

Consider the [`Thread.State` enum][0]:

	// Java
	/* partial */ class Thread {
	    public static final /* partial */ enum State {
	        NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED
	    }
	}

This "actually" is a class with *fields* for each enum value:

	% javap java.lang.Thread.State
	Compiled from "Thread.java"
	public final class java.lang.Thread$State extends java.lang.Enum<java.lang.	Thread$State> {
	  public static final java.lang.Thread$State BLOCKED;
	  public static final java.lang.Thread$State NEW;
	
	}

When we bind it, we bind the fields as properties:

	// C# Binding
	partial class Thread {
	    public sealed partial class State : Java.Lang.Enum {
	        public static Java.Lang.Thread.State Blocked { get { throw null; } }
	        public static Java.Lang.Thread.State New { get { throw null; } }
	        // …
	    }
	}

However, each of those properties involves a ValueManager lookup:

	// C# Binding
	public partial class Thread {
	    public sealed partial class State : global::Java.Lang.Enum {
	        public static global::Java.Lang.Thread.State? Blocked {
	            get {
	                const string __id = "BLOCKED.Ljava/lang/Thread$State;";

	                var __v = _members.StaticFields.GetObjectValue (__id);
	                // JavaInterop1 / Java.Base.dll
	                return global::Java.Interop.JniEnvironment.Runtime.ValueManager.GetValue<global::Java.Lang.Thread.State? >(ref __v, JniObjectReferenceOptions.Copy);
	                // XAJavaInterop1 / Mono.Android.dll
	                return global::Java.Lang.Object.GetObject<Java.Lang.Thread.State> (__v.Handle, JniHandleOwnership.TransferLocalRef);
	            }
	        }
	    }
	}

The problem is that value lookup is not fast -- identity hash code
needs to be obtained, locks obtained, etc. -- to the point that
[repeated enum lookups can actually show up in profiles][1],
[necessitating manual caching of properties][2].

Instead, for all static "readonly" fields, fields which are `final`
but have no value:

	% xpath -e '//field[@final="true" and @static="true" and not(@value)]' src/Java.Base/obj/Release-net8.0/mcw/api.xml
	
	<field deprecated="not deprecated" final="true" name="BLOCKED" static="true" synthetic="false" transient="false" type="java.lang.Thread.State" type-generic-aware="java.lang.Thread.State" jni-signature="Ljava/lang/Thread$State;" visibility="public" volatile="false" />

update binding generation to *cache* the value returned.  This means
we'd only need to call `StaticFields.GetObjectValue()` and "GetValue"
*once*, instead of once per-access:

	// C# Binding, now with caching!
	public partial class Thread {
	    public sealed partial class State : global::Java.Lang.Enum {
	        static global::Java.Lang.Thread.State? _Runnable_cache;
	        public static global::Java.Lang.Thread.State? Blocked {
	            get {
	                if (_Runnable_cache != null) {
	                    return _Runnable_cache;
			}
	                const string __id = "BLOCKED.Ljava/lang/Thread$State;";

	                var __v = _members.StaticFields.GetObjectValue (__id);
	                // JavaInterop1 / Java.Base.dll
	                return (Thread.State?) (_Runnable_cache = global::Java.Interop.JniEnvironment.Runtime.ValueManager.GetValue<global::Java.Lang.Thread.State? >(ref __v, JniObjectReferenceOptions.Copy));
	                // XAJavaInterop1 / Mono.Android.dll
	                return (Thread.State?) (_Runnable_cache = global::Java.Lang.Object.GetObject<Java.Lang.Thread.State> (__v.Handle, JniHandleOwnership.TransferLocalRef));
	            }
	        }
	    }
	}

Note there are a few wrinkles here:

  - If the field value is `null`, we will set the cached value to
    `null` which will trigger us to re-request the value. i.e.: the
    cache will not work for `null` values.  This likely doesn't occur
    enough to be a concern.

  - The cache field (`_Runnable_cache`) could be a nullable value
    type or a nullable reference type.  This complicates things as
    .NET treats these as very different things.  A solution was found
    using casting that allows us to treat them the same without
    needing to explicitly use `Nullable.HasValue` or `Nullable.Value`.

  - Generating the correct code when NRT is not enabled gets pretty
    complex, as we then have to track what field types are value
    types and which are reference types to generate different code.
    To avoid this complexity, this entire feature is only enabled
    when NRT is enabled for the binding project.  This includes
    `Mono.Android`, every AndroidX/GPS library we bind, and every
    user binding created from our templates (unless the user
    explicitly disabled NRT).
    
    This limitation should not impact too many bindings.

[0]: https://developer.android.com/reference/java/lang/Thread.State
[1]: https://github.com/dotnet/java-interop/issues/1243#issuecomment-2291308192
[2]: https://github.com/dotnet/maui/pull/24248

@jonpryor
Copy link
Member

jonpryor commented Sep 3, 2024

@jpobst: a problem with this approach occurred to me around thread safety: reads and writes of pointer-sized values are atomic, and thus writes to and reads from the *_cache fields of reference types are atomic.

What won't be atomic are reads and writes to int?/Nullable<T>.

For now, could we modify the PR so that only reference types are supported, skipping "readonly" int/etc. fields?

@jonpryor
Copy link
Member

jonpryor commented Sep 3, 2024

On Discord, @jpobst wrote:

i was really trying to avoid treating value/reference types differently.

I think we can do this, by using two fields instead of one. Instead of _SdkInt_cache in:

// android.os.Build
partial class Build {
    static Android.OS.BuildVersionCodes? _SdkInt_cache;

    public static Android.OS.BuildVersionCodes SdkInt {
        get {
            if (_SdkInt_cache != null) return (BuildVersionCodes) _SdkInt_cache;
            const string __id = "SDK_INT.I";
            var __v = _members.StaticFields.GetInt32Value(__id);
            return (BuildVersionCodes) (_SdkInt_cache = __v);
        }
    }
}

we would instead have the fields:

  • static int _have_*: 0 if not set, 1 if set.
  • static T _*_cache: cached value.

a'la:

// android.os.Build
partial class Build {
    static int _have_SdkInt;
    static Android.OS.BuildVersionCodes _SdkInt_cache;

    public static Android.OS.BuildVersionCodes SdkInt {
        get {
            if (1 == Interlocked.CompareExchange (ref _have_SdkInt, 1, 0)) return _SdkInt_cache;
            const string __id = "SDK_INT.I";
            var __v = _members.StaticFields.GetInt32Value(__id);
            _SdkInt_cache = __v;
            return _SdkInt_cache;
        }
    }
}

TODO: is the above actually thread-safe?

jonpryor added a commit that referenced this pull request Sep 4, 2024
Context: #1243
Context: #1248

Java's `final` keyword is contextual, and maps to (at least?) three
separate keywords in C#:

  * `const` on fields
  * `readonly` on fields
  * `sealed` on types and methods

When binding fields, we only support "const" `final` fields: fields
for which the value is known at compile-time.

Non-`const` fields are bound as properties, requiring a lookup for
every property access.

This can be problematic, performance-wise, as `final` fields without
a compile-time value only need to be looked up once; afterward, their
value cannot change [^1].  As such, we should consider altering our
binding of "readonly" static properties to *cache* the value.

PR #1248 implemented a "nullable"-based approach to caching the field
value.  While this approach works for reference types, it is likely
not thread safe for `int?` and other value types.

[There is a comment on #1248 to make the approach thread-safe][0],
but @jonpryor isn't entirely sure if it's correct.  The
"straightfoward" approach would be to use a C# `lock` statement,
but that requires a GC-allocated lock object, which would increase
memory use.  Furthermore, if this code is wrong, the only way to fix
it is by regenerating the bindings.

@jonpryor considered moving the thread-safety logic into a separate
type, moving it outside of the generated code.  This is implemented
as `ReadOnlyProperty<T>`, in this commit.

To help figure this out, along with the performance implications,
add a `ReadOnlyPropertyTiming` test fixture to
`Java.Interop-PerformanceTests.dll` to measure performance, and
update `JavaTiming` to have the various proposed binding ideas so
that we can determine the resulting code size.

Results are as follows:

| Approach                                              | Code Size (bytes) | Total (s) | Amortized (ticks) |
| ----------------------------------------------------- | ----------------: | --------: | ----------------: |
| No caching (current)                                  |                21 | 0.0029275 |              2927 |
| "nullable" caching (not thread-safe; #1248 approach)  |                65 | 0.0000823 |                82 |
| Inlined thread-safe caching                           |                48 | 0.0000656 |                65 |
| `ReadOnlyProperty<T>` caching                         |        24+17 = 41 | 0.0001644 |               164 |

Worst performance is to not cache anything.  At least the expected
behavior is verified.

"Nullable" caching is quite performant.  Pity it isn't thread-safe.

"Inlined thread-safe caching" is ~20% faster than "nullable" caching.

`ReadOnlyProperty<T>` caching is nearly 2x slower than "nullable".

Can `ReadOnlyProperty<T>` be made faster?

[0]: #1248 (comment)

[^1]: Not strictly true; *instance* fields can change within the
      object constructor, and *static* fields change change within
      the static constructor.  As #1248 is about static fields of
      *bound* types, there should be no way for us to observe this.
      Things become trickier with instance fields.
jonpryor added a commit that referenced this pull request Sep 4, 2024
Context: #1243
Context: #1248

Java's `final` keyword is contextual, and maps to (at least?) three
separate keywords in C#:

  * `const` on fields
  * `readonly` on fields
  * `sealed` on types and methods

When binding fields, we only support "const" `final` fields: fields
for which the value is known at compile-time.

Non-`const` fields are bound as properties, requiring a lookup for
every property access.

This can be problematic, performance-wise, as `final` fields without
a compile-time value only need to be looked up once; afterward, their
value cannot change [^1].  As such, we should consider altering our
binding of "readonly" static properties to *cache* the value.

PR #1248 implemented a "nullable"-based approach to caching the field
value.  While this approach works for reference types, it is likely
not thread safe for `int?` and other value types.

[There is a comment on #1248 to make the approach thread-safe][0],
but @jonpryor isn't entirely sure if it's correct.  The
"straightfoward" approach would be to use a C# `lock` statement,
but that requires a GC-allocated lock object, which would increase
memory use.  Furthermore, if this code is wrong, the only way to fix
it is by regenerating the bindings.

@jonpryor considered moving the thread-safety logic into a separate
type, moving it outside of the generated code.  This is implemented
as `ReadOnlyProperty<T>`, in this commit.

To help figure this out, along with the performance implications,
add a `ReadOnlyPropertyTiming` test fixture to
`Java.Interop-PerformanceTests.dll` to measure performance, and
update `JavaTiming` to have the various proposed binding ideas so
that we can determine the resulting code size.

Results are as follows:

| Approach                                              | Code Size (bytes) | Total (s) | Amortized (ticks) |
| ----------------------------------------------------- | ----------------: | --------: | ----------------: |
| No caching (current)                                  |                21 | 0.0029098 |              2909 |
| "nullable" caching (not thread-safe; #1248 approach)  |                65 | 0.0000827 |                82 |
| Inlined thread-safe caching                           |                48 | 0.0000664 |                66 |
| `ReadOnlyProperty<T>` caching                         |        19+21 = 40 | 0.0001548 |               154 |

Worst performance is to not cache anything.  At least the expected
behavior is verified.

"Nullable" caching is quite performant.  Pity it isn't thread-safe.

"Inlined thread-safe caching" is ~20% faster than "nullable" caching.

`ReadOnlyProperty<T>` caching is nearly 2x slower than "nullable".

Can `ReadOnlyProperty<T>` be made faster?

[0]: #1248 (comment)

[^1]: Not strictly true; *instance* fields can change within the
      object constructor, and *static* fields change change within
      the static constructor.  As #1248 is about static fields of
      *bound* types, there should be no way for us to observe this.
      Things become trickier with instance fields.
@jpobst
Copy link
Contributor Author

jpobst commented Sep 10, 2024

This is on hold due to:

  • Concerns about thread-safety
  • Making this change this late in the .NET 9 development cycle

We intend to pick this back up for .NET 10.

@jpobst jpobst marked this pull request as draft September 10, 2024 21:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Java enumerations should *cache* the field values
2 participants