Adding version bubbles and complex compilation rules to compilation in crossgen2 for .NET 5.
This document describes the concept of a version bubble, how to specify them to the ahead of time compiler (crossgen2), and the effect of various options on compilation.
System.Runtime.Versioning.NonVersionableAttribute. If the inlinee is marked as NonVersionable, it may ALWAYS be inlined into the method being compiled.
System.__Canon will always be considered to be part of the version bubble. Also, the very well known types (object, string, int, uint, short, ushort, byte, sbyte, long, ulong, float, double, IntPtr, and UIntPtr) are considered to be part of the version bubble as long as the generic is not constrained on an interface or class. For example:
class MyGeneric<T> {}
class ConstrainedGeneric<T> where T : IEquatable<T> {}
...
// MyGeneric<int> would always be in the version bubble where MyGeneric was defined.
// ConstrainedGeneric<int> would only be in the version bubble of ConstrainedGeneric if ConstrainedGeneric shared a version bubble with System.Private.CoreLib.
// MyGeneric<DateTime> would only be in the version bubble of MyGeneric if MyGeneric shared a version bubble with System.Private.CoreLib.
// ConstrainedGeneric<DateTime> would only be in the version bubble of ConstrainedGeneric if ConstrainedGeneric shared a version bubble with System.Private.CoreLib.
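The four cases above can be checked mechanically. The following is a minimal sketch, not crossgen2 code: TypeArg, VersionBubbleRules, and IsInstantiationInBubble are hypothetical names, assemblies are modeled as plain strings, and the sketch assumes the generic definition itself already lives in the bubble being tested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model type: a type argument and the assembly that defines it.
public sealed record TypeArg(string Name, string DefiningAssembly);

public static class VersionBubbleRules
{
    // The "very well known" types listed in the text.
    private static readonly HashSet<string> WellKnown = new()
    {
        "object", "string", "int", "uint", "short", "ushort", "byte", "sbyte",
        "long", "ulong", "float", "double", "IntPtr", "UIntPtr",
    };

    // Assumption: the generic definition is already inside bubbleAssemblies.
    // Returns true when every type argument is System.__Canon, is a well-known
    // type used with an unconstrained generic, or is defined inside the bubble.
    public static bool IsInstantiationInBubble(
        ISet<string> bubbleAssemblies,
        bool constrainedOnInterfaceOrClass,
        IEnumerable<TypeArg> typeArgs)
    {
        return typeArgs.All(arg =>
            arg.Name == "System.__Canon"
            || (!constrainedOnInterfaceOrClass && WellKnown.Contains(arg.Name))
            || bubbleAssemblies.Contains(arg.DefiningAssembly));
    }
}

public static class Demo
{
    public static void Main()
    {
        var appOnly = new HashSet<string> { "MyApp" };
        var appAndCoreLib = new HashSet<string> { "MyApp", "System.Private.CoreLib" };
        var intArg = new[] { new TypeArg("int", "System.Private.CoreLib") };
        var dateTimeArg = new[] { new TypeArg("DateTime", "System.Private.CoreLib") };

        // MyGeneric<int>, bubble contains only the app: True
        Console.WriteLine(VersionBubbleRules.IsInstantiationInBubble(appOnly, false, intArg));
        // ConstrainedGeneric<int>, bubble contains only the app: False
        Console.WriteLine(VersionBubbleRules.IsInstantiationInBubble(appOnly, true, intArg));
        // MyGeneric<DateTime>, bubble contains only the app: False
        Console.WriteLine(VersionBubbleRules.IsInstantiationInBubble(appOnly, false, dateTimeArg));
        // ConstrainedGeneric<DateTime>, bubble also contains CoreLib: True
        Console.WriteLine(VersionBubbleRules.IsInstantiationInBubble(appAndCoreLib, true, dateTimeArg));
    }
}
```

The well-known-type exemption is deliberately skipped for constrained generics, which is why ConstrainedGeneric<int> falls back to the ordinary "is the argument's assembly in the bubble" test.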
A compilation group is a set of assemblies that are compiled together. Typically in the customer case, this is a set of assemblies that is compiled in a similar timescale. In general, all assemblies compiled at once are considered to be part of the same version bubble, but in the case of the core libraries and ASP.NET this isn’t actually true. Both of these layers include assemblies which support replacement by either higher layers (in the case of the WinForms/WPF frameworks) or by the application (in the case of ASP.NET).
The end-user developer will specify which version bubble an application uses with a notation such as the following in the application's project file:
<PropertyGroup>
<CompilationVersionBubble>XXX</CompilationVersionBubble>
</PropertyGroup>
If CompilationVersionBubble is set to IncludeFrameworks, then the application will be compiled with a version bubble that includes the entire set of frameworks and the application. The framework in this scenario is the core-sdk concept of frameworks, of which today we have the ASP.NET framework, WinForms, WPF, and the Core-Sdk. If the property is Application, then the version bubble will be that of the application only. (Note that Application will require additional development effort in the runtime to support some new token resolution behavior, etc.) If the property is Assembly, then the version bubble will be at the individual assembly level. (Note: naming here is theoretical and unvetted; this is simply the level of control proposed to give to typical developers.)
The default value of the CompilationVersionBubble flag would depend on how the application is published. My expectation is that a standalone application will default to IncludeFrameworks, and that if we implement sufficient support for Application, that will be the other default. Otherwise the default shall be Assembly. Interaction with Docker build scenarios is also quite interesting; I suspect we would like to enable IncludeFrameworks by default for Docker scenarios if possible.
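The expected defaults can be written down as a small decision function. This is only a sketch of the expectation stated above, not an implemented build rule: the enum mirrors the proposed (unvetted) property values, while VersionBubbleDefaults, Choose, isStandalone, and applicationBubbleSupported are hypothetical names. The Docker case is left out because the text only expresses a preference for it.

```csharp
using System;

// Mirrors the three proposed property values; the naming is unvetted in the text.
public enum CompilationVersionBubble { Assembly, Application, IncludeFrameworks }

public static class VersionBubbleDefaults
{
    // Both parameters are hypothetical inputs, not real build properties.
    public static CompilationVersionBubble Choose(bool isStandalone, bool applicationBubbleSupported)
    {
        // Standalone publish: the frameworks ship with the app, so include them.
        if (isStandalone) return CompilationVersionBubble.IncludeFrameworks;

        // Otherwise prefer Application when the runtime supports it, else Assembly.
        return applicationBubbleSupported
            ? CompilationVersionBubble.Application
            : CompilationVersionBubble.Assembly;
    }
}

public static class Demo
{
    public static void Main()
    {
        Console.WriteLine(VersionBubbleDefaults.Choose(true, false));  // IncludeFrameworks
        Console.WriteLine(VersionBubbleDefaults.Choose(false, true));  // Application
        Console.WriteLine(VersionBubbleDefaults.Choose(false, false)); // Assembly
    }
}
```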
There are 3 sets of files to pass to crossgen2:
Note, this approach is probably more complete than we will finish in one release, but encompasses a large set of future vision in this space.
For non-generic code this is straightforward. Either compile all the non-generic code in the binary, or compile only that which is specified via a profile guided optimization step. This choice shall be driven by a per “input assembly” switch, because in the presence of a composite R2R image we will likely want different policies for different assemblies, as has proven valuable in the past. Until proven otherwise, per-assembly specification of this behavior shall be considered sufficient.
We shall set a guideline for how much generic code to generate, and the amount of generic code to generate shall be gated as a multiplier of the amount of non-generic code generated.
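One way to read this guideline is as a size budget. The sketch below is an illustration under stated assumptions, not the crossgen2 heuristic: it assumes "amount of code" is measured in estimated bytes, that candidates arrive in priority order, and that anything which does not fit in the remaining budget is skipped. GenericCandidate, GenericBudget, and Select are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: a generic instantiation and its estimated compiled size.
public sealed record GenericCandidate(string Name, long EstimatedBytes);

public static class GenericBudget
{
    // Budget = multiplier * size of the non-generic code already generated.
    // Walks the candidates in the given order and keeps each one that still
    // fits; candidates that would exceed the budget are skipped, not fatal.
    public static List<GenericCandidate> Select(
        long nonGenericBytes, double multiplier, IEnumerable<GenericCandidate> candidates)
    {
        long budget = (long)(nonGenericBytes * multiplier);
        long used = 0;
        var selected = new List<GenericCandidate>();
        foreach (var candidate in candidates)
        {
            if (used + candidate.EstimatedBytes > budget) continue;
            selected.Add(candidate);
            used += candidate.EstimatedBytes;
        }
        return selected;
    }
}

public static class Demo
{
    public static void Main()
    {
        var picked = GenericBudget.Select(1000, 0.5, new[]
        {
            new GenericCandidate("A", 300),
            new GenericCandidate("B", 300),
            new GenericCandidate("C", 100),
        });
        // Budget is 500 bytes: A fits, B would reach 600, C fits afterwards.
        Console.WriteLine(string.Join(", ", picked.Select(p => p.Name))); // A, C
    }
}
```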
For generic code we also need a per assembly switch to adjust between various behaviors, but the proposal is as follows:
With the advent of a version bubble larger than a single binary and the ability to generate generics comes the problem of managing the multiple copies of generic code that might be generated.
The traditional NGEN model was to greedily generate generic code everywhere and assume it wouldn’t get out of hand. As generics have become more used and applications have become more broken down into many assemblies, this model has become less workable, and thus this model attempts to prevent such overgeneration.
Application construction consists of one or more frameworks, which may be built and distributed independently from the application (Docker container scenario), and the application itself.
For instance,
Application
ASP.NET
Runtime Layer
Each layer in this stack will be compiled as a consistent set of crossgen2 compilations.
I propose to reduce the generics duplication problem by allowing duplication between layers, but not within a layer. There are two ways to do this. The first is to produce composite R2R images for a layer. Within a single composite R2R image generation, running heuristics and generating generics eagerly should be straightforward. This composite R2R image would have all instantiations statically computed that are local to that particular layer of compilation, and also any instantiations from other layers. The duplication problem would be reduced in that a single analysis would trigger these multi-layer dependent compilations, so while there may be duplication between layers, there wouldn’t be duplication within a layer. And given that the count of layers is not expected to exceed 3 or 4, that duplication will not be a major concern.
The second approach is to split compilation up into assembly level units, run the heuristics per assembly, generate the completely local generics in the individual assemblies, and then nominate a final mop up assembly that consumes a series of data files produced by the individual assembly compilations and holds all of the stuff that didn’t make sense in the individual assemblies. In my opinion this second approach would be better for debug builds, but the first approach is strictly better for release builds, and really shouldn’t be terribly slow.
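The "duplicate between layers, never within a layer" goal of the first (composite) approach can be illustrated with one set per layer. This is a toy model under stated assumptions, not the crossgen2 algorithm: instantiations are plain strings, each layer is given as a list of per-assembly requests, and LayeredGenerics and PlanPerLayer are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LayeredGenerics
{
    // Input: layer name -> the instantiations each assembly in that layer asks for.
    // Output: layer name -> the set of instantiations compiled once for that layer.
    public static Dictionary<string, HashSet<string>> PlanPerLayer(
        Dictionary<string, List<string[]>> requestsByLayer)
    {
        var plan = new Dictionary<string, HashSet<string>>();
        foreach (var (layer, perAssembly) in requestsByLayer)
        {
            // A set removes duplicates requested by several assemblies in the
            // same layer; nothing is shared across layers, so the same
            // instantiation may still appear once in each layer.
            plan[layer] = new HashSet<string>(perAssembly.SelectMany(r => r));
        }
        return plan;
    }
}

public static class Demo
{
    public static void Main()
    {
        var requests = new Dictionary<string, List<string[]>>
        {
            ["Runtime Layer"] = new()
            {
                new[] { "List<int>" },
                new[] { "List<int>", "Dictionary<string, int>" },
            },
            ["Application"] = new()
            {
                new[] { "List<int>" },
                new[] { "List<int>" },
            },
        };

        var plan = LayeredGenerics.PlanPerLayer(requests);
        Console.WriteLine(plan["Runtime Layer"].Count); // 2
        Console.WriteLine(plan["Application"].Count);   // 1
    }
}
```

In this model List<int> is compiled once in each of the two layers, which is the between-layer duplication the proposal accepts because the number of layers stays small.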
Address the concern of reducing generics duplication by implementing composite R2R files instead of attempting to build a multifile approach, where each layer is expected to know the exact layer above, or be fully R2R with respect to all layers above. Loading of R2R images built with version bubble dependence will lazily verify that version rules are not violated, and produce a FailFast if the version bubble usage rules are violated. For customers who produce customized versions of the underlying frameworks, if any differences are present they will likely be forced to provide their own targeting packs to use the fully AOT layered scenarios.