Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crossgen2 determinism fixes #164

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
using System.IO;
using System.Diagnostics;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Managed mirror of NativeFormatWriter.h/.cpp
Expand Down Expand Up @@ -2041,13 +2042,9 @@ void ComputeLayout()
// we can use the ordering to terminate the lookup prematurely.
uint mask = ((_nBuckets - 1) << 8) | 0xFF;

// sort it by hashcode
_Entries.Sort(
(a, b) =>
{
return (int)(a.Hashcode & mask) - (int)(b.Hashcode & mask);
}
);
// Sort by hashcode. This sort must be stable since we need determinism even if two entries have
// the same hashcode. This is deterministic if entries are added in a deterministic order.
_Entries = _Entries.OrderBy(entry => entry.Hashcode & mask).ToList();
fadimounir marked this conversation as resolved.
Show resolved Hide resolved

// Start with maximum size entries
_entryIndexSize = 2;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -64,8 +64,20 @@ public override byte[] GetILBytes()
if (_ilBytes != null)
return _ilBytes;

byte[] ilBytes = _methodBody.GetILBytes();
return (_ilBytes = ilBytes);
// This lock is necessary since the JIT requires that we provide the same address
// for the IL buffer each time (it uses the address to detect recursive calls).
jkotas marked this conversation as resolved.
Show resolved Hide resolved
// Since MethodBodyBlock.GetAllBytes creates a new array each time, if two threads
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Just mention the name of the API we're calling here (GetILBytes) to reduce confusion. There's no GetAllBytes here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using Lazy to avoid writing our own double-checked locking?

Copy link
Contributor

@fadimounir fadimounir Nov 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like what Jan suggested here, to use Interlocked.CompareExchange

// call this on the same function then it is possible for both to set _ilBytes.
// Then, one of the threads will receive the first value on the first call and the
// second value on all subsequent calls, and we will not detect the recursive call.
lock (this)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should rather use Interlocked.CompareExchange

{
if (_ilBytes != null)
return _ilBytes;

byte[] ilBytes = _methodBody.GetILBytes();
return (_ilBytes = ilBytes);
}
}

public override bool IsInitLocals
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -96,8 +96,8 @@ public void RegisterEmitter(ISignatureEmitter emitter)
public int ModuleToIndex(EcmaModule module)
{
AssemblyName assemblyName = module.Assembly.GetName();

if (!_manifestAssemblies.Contains(assemblyName))
int assemblyRefIndex;
if (!_assemblyRefToModuleIdMap.TryGetValue(assemblyName.Name, out assemblyRefIndex))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong: this will cause missing entries in the _manifestAssemblies (the bug that I fixed recently).

The right way to fix this is to get rid of _manifestAssemblies, and emit a sorted version of _assemblyRefToModuleIdMap in GetData(), sorted by module ID

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And make _assemblyRefToModuleIdMap a Dictionary<AssemblyName, int>

{
if (_emissionCompleted)
{
Expand All @@ -108,14 +108,11 @@ public int ModuleToIndex(EcmaModule module)
// the verification logic would be broken at runtime.
Debug.Assert(_nodeFactory.CompilationModuleGroup.VersionsWithModule(module));

assemblyRefIndex = _nextModuleId++;
_manifestAssemblies.Add(assemblyName);
if (!_assemblyRefToModuleIdMap.ContainsKey(assemblyName.Name))
_assemblyRefToModuleIdMap.Add(assemblyName.Name, _nextModuleId);

_nextModuleId++;
_assemblyRefToModuleIdMap.Add(assemblyName.Name, assemblyRefIndex);
}

return _assemblyRefToModuleIdMap[assemblyName.Name];
return assemblyRefIndex;
}

public override int ClassCode => 791828335;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,11 +27,15 @@ public override ObjectData GetData(NodeFactory factory, bool relocsOnly = false)
{
ReadyToRunCodegenNodeFactory r2rFactory = (ReadyToRunCodegenNodeFactory)factory;
ObjectDataSignatureBuilder dataBuilder = new ObjectDataSignatureBuilder();
dataBuilder.AddSymbol(this);

EcmaModule targetModule = _signatureContext.GetTargetModule(_arrayType);
SignatureContext innerContext = dataBuilder.EmitFixup(r2rFactory, ReadyToRunFixupKind.READYTORUN_FIXUP_NewArray, targetModule, _signatureContext);
dataBuilder.EmitTypeSignature(_arrayType, innerContext);
if (!relocsOnly)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: would be nice to follow the same convention we have in other places:

            if (relocsOnly)
            {
                return new ObjectData(Array.Empty<byte>(), null, 1, null);
            }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we use a mix of both, but I like the if (relocsOnly) one better because of the indentation. Can I move all of them over to that one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the ones which return trivially like that. There are some legitimate uses of it as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, leave it the way it is.

{
dataBuilder.AddSymbol(this);

EcmaModule targetModule = _signatureContext.GetTargetModule(_arrayType);
SignatureContext innerContext = dataBuilder.EmitFixup(r2rFactory, ReadyToRunFixupKind.READYTORUN_FIXUP_NewArray, targetModule, _signatureContext);
dataBuilder.EmitTypeSignature(_arrayType, innerContext);
}

return dataBuilder.ToObjectData();
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,10 +25,14 @@ public override ObjectData GetData(NodeFactory factory, bool relocsOnly = false)
{
ReadyToRunCodegenNodeFactory r2rFactory = (ReadyToRunCodegenNodeFactory)factory;
ObjectDataSignatureBuilder dataBuilder = new ObjectDataSignatureBuilder();
dataBuilder.AddSymbol(this);

dataBuilder.EmitFixup(r2rFactory, ReadyToRunFixupKind.READYTORUN_FIXUP_StringHandle, _token.Module, _signatureContext);
dataBuilder.EmitUInt(_token.TokenRid);
if (!relocsOnly)
{
dataBuilder.AddSymbol(this);

dataBuilder.EmitFixup(r2rFactory, ReadyToRunFixupKind.READYTORUN_FIXUP_StringHandle, _token.Module, _signatureContext);
dataBuilder.EmitUInt(_token.TokenRid);
}

return dataBuilder.ToObjectData();
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@
using System.Threading;

using Internal.IL;
using Internal.IL.Stubs;
using Internal.JitInterface;
using Internal.TypeSystem;

Expand All @@ -26,7 +25,7 @@ public abstract class Compilation : ICompilation
protected readonly NodeFactory _nodeFactory;
protected readonly Logger _logger;
private readonly DevirtualizationManager _devirtualizationManager;
private ILCache _methodILCache;
protected ILCache _methodILCache;
private readonly HashSet<ModuleDesc> _modulesBeingInstrumented;


Expand Down Expand Up @@ -79,10 +78,8 @@ public bool CanInline(MethodDesc caller, MethodDesc callee)

public virtual MethodIL GetMethodIL(MethodDesc method)
{
// Flush the cache when it grows too big
if (_methodILCache.Count > 1000)
_methodILCache = new ILCache(_methodILCache.ILProvider, NodeFactory.CompilationModuleGroup);

// TODO: Remove the ILCache and use the pinned heap for the IL stream once this GC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see why we would ever want to remove the ILCache. Using pinned heap is not going to help. I would delete this comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we would ever want to remove the ILCache

I don't see why we need it in the first place. This just seems to help releasing the pinning of the IL byte arrays to reduce GC fragmentation. The IL streams are saved on the type system method object, and there's no point in "caching".

My understanding is that the pinned heap will not have fragmentation. If so, we should just allocate the IL byte arrays on it, and remove the ILCache

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ILCache does not do any pinning. All pinning is done by jitinterface (CorInfoImpl.cs).

The purpose of ILCache is to cache the ILCode for given MethodDesc. The algorithm to compute ILCode for given MethodDesc is not cheap.

My understanding is that the pinned heap will not have fragmentation.

Pinned heap will have tend to have worse fragmentation than regular heaps because of it cannot be compacted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see what you mean. I debugged this, and it's more involved than I thought. I was under the impression that GetMethodIL() would directly just go to the EcmaMethodDesc's IL byte array that we load from metadata. This is why I was under the impression that the ILCache was used to free up the pins.

@SrivastavaAnubhav please delete the comment then.

// feature ships (in dotnet5)
return _methodILCache.GetOrCreateValue(method).MethodIL;
}

Expand All @@ -106,7 +103,7 @@ public bool IsModuleInstrumented(ModuleDesc module)
return _modulesBeingInstrumented.Contains(module);
}

private sealed class ILCache : LockFreeReaderHashtable<MethodDesc, ILCache.MethodILData>
public sealed class ILCache : LockFreeReaderHashtable<MethodDesc, ILCache.MethodILData>
{
public ILProvider ILProvider { get; }
private readonly CompilationModuleGroup _compilationModuleGroup;
Expand Down Expand Up @@ -147,7 +144,7 @@ protected override MethodILData CreateValueFromKey(MethodDesc key)
return new MethodILData() { Method = key, MethodIL = methodIL };
}

internal class MethodILData
public class MethodILData
{
public MethodDesc Method;
public MethodIL MethodIL;
Expand Down Expand Up @@ -302,6 +299,11 @@ protected override void ComputeDependencyNodeDependencies(List<DependencyNodeCor
}
}
}

if (_methodILCache.Count > 1000)
{
_methodILCache = new ILCache(_methodILCache.ILProvider, NodeFactory.CompilationModuleGroup);
}
}

public ISymbolNode GetFieldRvaData(FieldDesc field) => NodeFactory.CopiedFieldRva(field);
Expand Down