Reference assemblies need to include private struct fields #16402

jaredpar · 2016-02-18T02:04:28Z

Today the reference assemblies for .NET assemblies strip private fields from structs. This has an observable, and potentially dangerous, impact on project that consume them:

Allows pointers to be created for structs with fields of a reference type.
Prevents the compiler from catching cyclic struct layout problems.
Breaks C# definite assignment checking.
Allows structs using [FieldOffset] to verify when they should not.
Prevents developers from correctly planning for interop scenarios.

More details are available here and an example of the problems this can produce is here.

I understand part of the motivation for removing these fields is to keep the reference assemblies small. Keeping the fields necessitates keeping the type definitions for the types of the fields and this can cascade into many more types / members being included thus increasing size.

Cutting private fields for classes is fine as it's unobservable. Unfortunately for structs fields are observable irrespective of their accessibility and must be maintained in reference assemblies. The only action I think that can be done to curtail the number of types a struct brings in is the following:

A reference assembly can represent an inaccessible struct field which is a reference type as object.

Essentially class, interface, delegate and generic type parameters constrained to class can be transformed to object.

// Implementation assembly
private interface IBigInterface { ... } 
public struct S 
{
  private IBigInterface _field;
}

// Reference assembly
public struct S 
{
  private object _field;
}

This is unobservable to the compiler and can help limit the number of types brought it.

The text was updated successfully, but these errors were encountered:

nguerrera · 2016-02-18T03:10:44Z

This has versioning implications. The best definition for a valid reference assembly I have is that it's indistinguishable to the compiler from what could just have been an earlier version of the implementation assembly. As such, for any transformation deemed invalid from implementation to reference assembly, the inverse should be deemed to be a breaking change. We should update https://github.com/dotnet/corefx/blob/master/Documentation/coding-guidelines/breaking-change-rules.md accordingly.

(That document could probably use a general review as I've just noted that the second bullet is ~~wrong~~ misleading. :( EDIT: Submitted dotnet/corefx#6225 to fix that at least.)

jaredpar · 2016-02-18T16:35:11Z

I agree there are versioning implications here. I looked at that document for a bit but I'm having trouble processing it because it's unclear what it is targeting for breaking changes. Is the document concerned only about compilation back compat or is it concerned with binary compat? This has a big distinction for structs contained in a reference assembly. Here are my quick thoughts on it.

Binary Compat

At this level structs need to be broken down into two categories: those which have a field which is a reference type and those which do not.

Structs without a reference type field cannot change size between versions. The developer has to assume the worst which is the struct is involved in a PInvoke / COM interop scenario. Changing the size of the struct in such a scenario is an observable change which could break runtime compat.

There is probably some flexibility if such structs could change shape so long as they don't change size or add a reference type field. At a glance that seems okay but I may be missing something.

A struct with a reference type field could never be used in interop though. Hence it can fall back to the compile compat rules.

Compile Compat

For structs the hardest language to deal with compat on is C# as it has both definite assignment rules (F# does as well) and unsafe code (unique to C#). The compat rules for C# would be the following:

A struct with no inaccessible fields is frozen. Developer must assume such a struct was via direct field initialization in C# (no overall value assignment). Any change in this type by adding an accessible or inaccessible field would break such code.
A struct with at least one inacessible field can add / remove fields of any accessibility. Developer can safely assume struct instances were assigned with either default(T) or a ctor call.
- Exception here is the new field cannot be a reference type if the struct previously did not have any reference type fields. Developer much assume the struct was previously used as a pointer type and adding a reference field would break that.
- Remove of a public field could break a FieldOffset trick. I don't know if this falls into the compat rules considered here though.
A struct can change the type of inaccessible reference type fields to other reference types.

gafter · 2016-02-18T16:42:44Z

For compile-time compat, the following rules should work for producing a reference assembly. We drop all private fields, but add back certain synthesized private fields for a value type (struct) as follows:

If there are any private fields that are or contain any value type members, add a single private field of type int
If there are any private fields that are or contain any reference type members, add a single private field of type object.
If the type is generic, then for every type parameter of the type, if there are any private fields that are or contain any members whose type is that type parameter, we add a direct private field of that type.

jaredpar · 2016-02-18T16:45:04Z

@gafter

If there are any private fields that are or contain any value type members, add a single private field of type int

This breaks the ability for developers to write correct PInvoke and unsafe code. The size of the struct is important in both scenarios to write correct code.

gafter · 2016-02-18T21:31:55Z

This breaks the ability for developers to write correct PInvoke and unsafe code. The size of the struct is important in both scenarios to write correct code.

@jaredpar Can you unpack that assertion? The size of a struct isn't considered to be a compile-time constant except for primitive types and enums. What kind of code would be broken by this?

nguerrera · 2016-02-18T21:40:37Z

I would say that interop is the issue more than unsafe code. You can certainly write correct unsafe code without having size or layout guarantees. e.g. There is nothing wrong with &foo.x even if x moves around.

If we put further restrictions on struct changes in the name of interop, then structs marked with auto layout in v1 would be exempt. They are rejected by the interop layer, but that (justifiably) does not prevent them from being used by unsafe managed code.

I wish C# made auto the default and that sequential was opt-in for interop so that we didn't have to assume that any struct might be used for interop, but rather only those that were explicitly marked sequential or explicit.

jaredpar · 2016-02-18T21:55:47Z

The size of a struct isn't considered to be a compile-time constant except for primitive types and enums

Agree but there is plenty of interop that is built around exactly such types. Or more accurately structs that transitively wrap those types.

Take for example Guid. This is a struct that is often used in interop and contains only primitive type fields. Adding / removing a primitive field to this struct would be essentially unobservable from the stand point of C#. But it would wreak havoc on interop code that used Guid

jaredpar · 2016-02-18T21:56:33Z

I wish C# made auto the default and that sequential was opt-in for interop so that we didn't have to assume that any struct might be used for interop, but rather only those that were explicitly marked sequential or explicit.

Agree. Auto is also advantageous from the stand point of performance because the JIT can pack structs tighter. Stupid back compat 😦

scottdorman · 2016-02-19T00:33:05Z

Stupid back compat

Well said, @jaredpar! :)

I wish C# made auto the default and that sequential was opt-in for interop...

That would have been advantageous, especially since intereop usually requires a good understanding of what's going on at runtime. Another choice that would have been good was requiring constructor call chaining rather than implicitly adding it. :) I normally always follow that rule, except for the one time in the code that "discovered" this issue, specifically because it takes away the possibility of ambiguity and lead to "strange" results/compiler errors. 20-20 hindsight and all.

weshaggard · 2018-01-03T19:14:56Z

Update CCIExtensions to understand private struct fields
Update GenAPI's to emit these private modifiers into the reference assembly code.

Cost assumes its @weshaggard

Will be done along with dotnet/corefx#6185.

jkotas · 2018-01-03T19:20:48Z

How are you going to deal with the fact that the private fields are different between different implementations in number of cases?

weshaggard · 2018-01-03T20:24:57Z

@jkotas is that true for any public structs?

jkotas · 2018-01-03T21:43:06Z

A few examples:

// Similar differences exist for other async-related structs. These structs have been tweaked
// for performance a lot in .NET Core, and I think we would love to have the option to keep tweaking
// them without setting their layout in stone.
// .NET Framework
public struct AsyncVoidMethodBuilder
{
    private SynchronizationContext m_synchronizationContext;
    private AsyncMethodBuilderCore m_coreState;
    private Task m_task;
}
// .NET Core
public struct AsyncVoidMethodBuilder
{
    private SynchronizationContext _synchronizationContext;
    private AsyncTaskMethodBuilder _builder;
}
// .NET Native
public struct AsyncVoidMethodBuilder
{
    private Action m_moveNextAction;
    private SynchronizationContext m_synchronizationContext;
    private Task m_task;
}

// .NET Framework
public struct AsyncFlowControl: IDisposable
{
    private bool useEC;
    private ExecutionContext _ec;
    private SecurityContext _sc;
    private Thread _thread;
}
// .NET Core
public struct AsyncFlowControl : IDisposable
{
    private Internal.Runtime.Augments.RuntimeThread _thread;
}

// .NET Framework
struct OpCode
{
    private String m_stringname; // not used - computed lazily
    private StackBehaviour m_pop;
    private StackBehaviour m_push;
    private OperandType m_operand;
    private OpCodeType m_type;
    private int m_size;
    private byte m_s1;
    private byte m_s2;
    private FlowControl m_ctrl;
    private bool m_endsUncondJmpBlk;
    private int m_stackChange;
}
// .NET Core
public struct OpCode
{
    private OpCodeValues m_value;
    private int m_flags;
}

// Similar differences exist for other XXXHandles
// NET Framework and .NET Core
struct RuntimeTypeHandle
{
    private RuntimeType m_type;
}
// .NET Native:
struct RuntimeTypeHandle
{
    private IntPtr _value;
}

// Similar differences exist for other types related to Span
// Fast span
ref struct Span<T>
{
    internal readonly ByReference<T> _pointer;
    private readonly int _length;
}
// Slow span
ref struct Span<T>
{
    private readonly Pinnable<T> _pinnable;
    private readonly IntPtr _byteOffset;
    private readonly int _length;
}

For compile-time compat, the following rules should work for producing a reference assembly. We drop all private fields, but add back certain synthesized private fields for a value type (struct) as follows: - If there are any private fields that are or contain any value type members, add a single private field of type int - If there are any private fields that are or contain any reference type members, add a single private field of type object. - If the type is generic, then for every type parameter of the type, if there are any private fields that are or contain any members whose type is that type parameter, we add a direct private field of that type. For more details see issue https://github.com/dotnet/corefx/issues/6185 this blog is helpful as well http://blog.paranoidcoding.com/2016/02/15/are-private-members-api-surface.html

This also includes an update to the genapi tool that includes the necessary private fields in structs to enable the compiler to do correctness checks. See https://github.com/dotnet/corefx/issues/6185.

For compile-time compat, the following rules should work for producing a reference assembly. We drop all private fields, but add back certain synthesized private fields for a value type (struct) as follows: - If there are any private fields that are or contain any value type members, add a single private field of type int - If there are any private fields that are or contain any reference type members, add a single private field of type object. - If the type is generic, then for every type parameter of the type, if there are any private fields that are or contain any members whose type is that type parameter, we add a direct private field of that type. For more details see issue https://github.com/dotnet/corefx/issues/6185 this blog is helpful as well http://blog.paranoidcoding.com/2016/02/15/are-private-members-api-surface.html

Contributes to https://github.com/dotnet/corefx/issues/6185.

weshaggard · 2018-01-12T22:49:51Z

All the refs in corefx that have structs have had their refs updated to have private fields that follow the rules defined in our tools
dotnet/buildtools#1859 https://github.com/dotnet/buildtools/blob/master/src/Microsoft.Cci.Extensions/Writers/CSharp/CSharpWriter.cs#L141.

This fixes dotnet#678. In a nutshell, the compiler needs to know whether a struct has any fields in order to apply definitive assignment rules. While stripping all private fields from types is generally OK, we can't do this for structs. Fortunately, for private fields the compiler doesn't really care what they are, but that their characteristics are. For example: 1. Does the struct have any fields? 2. Does the struct contain any reference types (to validate generic instantiations that have the unmanaged constraint)? 3. Does the struct use the generic parameter in a field declaration (to validate cyclic layout problems)? This adds dummy fields to structs to conform to these rules. These aren't computed separately but are instead taken from .NET Core. For more details, see this issue in CoreFX: https://github.com/dotnet/corefx/issues/6185

This fixes dotnet#678. In a nutshell, the compiler needs to know whether a struct has any fields in order to apply definitive assignment rules. While stripping all private fields from types is generally OK, we can't do this for structs. Fortunately, for private fields the compiler doesn't really care what they are, but what their characteristics are. For example: 1. Does the struct have any fields? 2. Does the struct contain any reference types (to validate generic instantiations that have the unmanaged constraint)? 3. Does the struct use the generic parameter in a field declaration (to validate cyclic layout problems)? This adds dummy fields to structs to conform to these rules. These aren't computed separately but are instead taken from .NET Core. For more details, see this issue in CoreFX: https://github.com/dotnet/corefx/issues/6185

Contributes to https://github.com/dotnet/corefx/issues/6185.

jaredpar assigned weshaggard Feb 18, 2016

weshaggard removed their assignment Apr 7, 2017

weshaggard self-assigned this Jan 3, 2018

weshaggard referenced this issue in weshaggard/corefx Jan 12, 2018

Update references assemblies to include private fields for structs

2bb1b19

Contributes to https://github.com/dotnet/corefx/issues/6185.

weshaggard referenced this issue in weshaggard/corefx Jan 12, 2018

Update references assemblies to include private fields for structs

2e97d43

Contributes to https://github.com/dotnet/corefx/issues/6185.

weshaggard closed this as completed Jan 12, 2018

filipnavara referenced this issue in emclient/sysdrawing-coregraphics May 20, 2019

Update references assemblies to include private fields for structs

ed2258d

Contributes to https://github.com/dotnet/corefx/issues/6185.

msftgits transferred this issue from dotnet/corefx Jan 31, 2020

msftgits added this to the 2.1.0 milestone Jan 31, 2020

ghost locked as resolved and limited conversation to collaborators Jan 2, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reference assemblies need to include private struct fields #16402

Reference assemblies need to include private struct fields #16402

jaredpar commented Feb 18, 2016

nguerrera commented Feb 18, 2016

jaredpar commented Feb 18, 2016

gafter commented Feb 18, 2016

jaredpar commented Feb 18, 2016

gafter commented Feb 18, 2016

nguerrera commented Feb 18, 2016

jaredpar commented Feb 18, 2016

jaredpar commented Feb 18, 2016

scottdorman commented Feb 19, 2016

weshaggard commented Jan 3, 2018

jkotas commented Jan 3, 2018

weshaggard commented Jan 3, 2018

jkotas commented Jan 3, 2018 •

edited

Loading

weshaggard commented Jan 12, 2018 •

edited

Loading

Reference assemblies need to include private struct fields #16402

Reference assemblies need to include private struct fields #16402

Comments

jaredpar commented Feb 18, 2016

nguerrera commented Feb 18, 2016

jaredpar commented Feb 18, 2016

Binary Compat

Compile Compat

gafter commented Feb 18, 2016

jaredpar commented Feb 18, 2016

gafter commented Feb 18, 2016

nguerrera commented Feb 18, 2016

jaredpar commented Feb 18, 2016

jaredpar commented Feb 18, 2016

scottdorman commented Feb 19, 2016

weshaggard commented Jan 3, 2018

jkotas commented Jan 3, 2018

weshaggard commented Jan 3, 2018

jkotas commented Jan 3, 2018 • edited Loading

weshaggard commented Jan 12, 2018 • edited Loading

jkotas commented Jan 3, 2018 •

edited

Loading

weshaggard commented Jan 12, 2018 •

edited

Loading