Low Level Struct Improvements

Summary

This proposal is an aggregation of several different proposals for struct performance improvements: ref fields and the ability to override lifetime defaults. The goal is a design that takes the various proposals into account to create a single overarching feature set for low level struct improvements.

Motivation

Earlier versions of C# added a number of low level performance features to the language: ref returns, ref struct, function pointers, etc. These enabled .NET developers to write highly performant code while continuing to leverage the C# language rules for type and memory safety. They also allowed the creation of fundamental performance types in the .NET libraries like Span<T>.

As these features have gained traction in the .NET ecosystem, developers, both internal and external, have been providing us with information on the remaining friction points: places where they still need to drop to unsafe code to get their work done, or require the runtime to special case types like Span<T>.

Today Span<T> is accomplished by using the internal type ByReference<T>, which the runtime effectively treats as a ref field. This provides the benefit of ref fields but with the downside that the language provides no safety verification for it, as it does for other uses of ref. Further, only dotnet/runtime can use this type as it's internal, so third parties cannot design their own primitives based on ref fields. Part of the motivation for this work is to remove ByReference<T> and use proper ref fields in all code bases.

This proposal plans to address these issues by building on top of our existing low level features. Specifically it aims to:

  • Allow ref struct types to declare ref fields.
  • Allow the runtime to fully define Span<T> using the C# type system and remove special case types like ByReference<T>
  • Allow struct types to return ref to their fields.
  • Allow runtime to remove unsafe uses caused by limitations of lifetime defaults
  • Allow the declaration of safe fixed buffers for managed and unmanaged types in struct

Detailed Design

The rules for ref struct safety are defined in the span safety document. This document will describe the required changes to this document as a result of this proposal. Once accepted as an approved feature these changes will be incorporated into that document.

Once this design is complete our Span<T> definition will be as follows:

readonly ref struct Span<T>
{
    readonly ref T _field;
    readonly int _length;

    // This constructor does not exist today but will be added as a part 
    // of changing Span<T> to have ref fields. It is a convenient, and
    // safe, way to create a length one span over a stack value that today 
    // requires unsafe code.
    public Span(ref T value)
    {
        _field = ref value;
        _length = 1;
    }
}

Provide ref fields and scoped

The language will allow developers to declare ref fields inside of a ref struct. This can be useful for example when encapsulating large mutable struct instances or defining high performance types like Span<T> in libraries besides the runtime.

ref struct S 
{
    public ref int Value;
}

A ref field will be emitted into metadata using the ELEMENT_TYPE_BYREF signature. This is no different than how we emit ref locals or ref arguments. For example ref int _field will be emitted as ELEMENT_TYPE_BYREF ELEMENT_TYPE_I4. This will require us to update ECMA335 to allow this entry but this should be rather straightforward.

Developers can continue to initialize a ref struct with a ref field using the default expression in which case all declared ref fields will have the value null. Any attempt to use such fields will result in a NullReferenceException being thrown.

ref struct S 
{
    public ref int Value;
}

S local = default;
local.Value.ToString(); // throws NullReferenceException

While the C# language pretends that a ref cannot be null this is legal at the runtime level and has well defined semantics. Developers who introduce ref fields into their types need to be aware of this possibility and should be strongly discouraged from leaking this detail into consuming code. Instead ref fields should be validated as non-null using the runtime helpers, with an exception thrown when an uninitialized struct is used incorrectly.

ref struct S1 
{
    private ref int Value;

    public int GetValue()
    {
        if (System.Runtime.CompilerServices.Unsafe.IsNullRef(ref Value))
        {
            throw new InvalidOperationException(...);
        }

        return Value;
    }
}

A ref field can be combined with readonly modifiers in the following ways:

  • readonly ref: this is a field that cannot be ref reassigned outside a constructor or init methods. It can be value assigned though outside those contexts.
  • ref readonly: this is a field that can be ref reassigned but cannot be value assigned at any point. This is how an in parameter could be ref reassigned to a ref field.
  • readonly ref readonly: a combination of ref readonly and readonly ref.

ref struct ReadOnlyExample
{
    ref readonly int Field1;
    readonly ref int Field2;
    readonly ref readonly int Field3;

    void Uses(int[] array)
    {
        Field1 = ref array[0];  // Okay
        Field1 = array[0];      // Error: can't assign ref readonly value (value is readonly)
        Field2 = ref array[0];  // Error: can't repoint readonly ref
        Field2 = array[0];      // Okay
        Field3 = ref array[0];  // Error: can't repoint readonly ref
        Field3 = array[0];      // Error: can't assign ref readonly value (value is readonly)
    }
}

A readonly ref struct will require that ref fields are declared readonly ref. There is no requirement that they are declared readonly ref readonly. This does allow a readonly struct to have indirect mutations via such a field, but that is no different than a readonly field that points to a reference type today.

A readonly ref will be emitted to metadata using the initonly flag, same as any other field. A ref readonly field will be attributed with System.Runtime.CompilerServices.IsReadOnlyAttribute. A readonly ref readonly will be emitted with both items.

This feature requires runtime support and changes to the ECMA spec. As such these will only be enabled when the corresponding feature flag is set in corelib. The issue tracking the exact API is https://github.com/dotnet/runtime/issues/64165

The set of changes to our span safety rules necessary to allow ref fields is small and targeted. The rules already account for ref fields existing and being consumed from APIs. The changes need to focus on only two aspects: how they are created and how they are ref reassigned.

First the rules establishing ref-safe-to-escape values for fields need to be updated for ref fields as follows:

An expression in the form ref e.F is ref-safe-to-escape as follows:

  1. If F is a ref field its ref-safe-to-escape scope is the safe-to-escape scope of e.
  2. Else if e is of a reference type, it has ref-safe-to-escape of calling method
  3. Else its ref-safe-to-escape is taken from the ref-safe-to-escape of e.

This does not represent a rule change though as the rules have always accounted for ref state to exist inside a ref struct. This is in fact how the ref state in Span<T> has always worked and the consumption rules correctly account for this. The change here is just accounting for developers to be able to access ref fields directly and ensure they do so by the existing rules implicitly applied to Span<T>.

This does mean though that ref fields can be returned as ref from a ref struct but normal fields cannot.

ref struct RS
{
    ref int _refField;
    int _field;

    // Okay: this falls into bullet one above. 
    public ref int Prop1 => ref _refField;

    // Error: This is bullet three above and the ref-safe-to-escape of `this`
    // in a `struct` is the current method scope.
    public ref int Prop2 => ref _field;
}

This may seem like an error at first glance but this is a deliberate design point. Again though, this is not a new rule being created by this proposal, it is instead acknowledging the existing rules Span<T> behaved by now that developers can declare their own ref state.

Next the rules for ref reassignment need to be adjusted for the presence of ref fields. The primary scenario for ref reassignment is ref struct constructors storing ref parameters into ref fields. The support will be more general but this is the core scenario. To support this the rules for ref reassignment will be adjusted to account for ref fields as follows:

Ref reassignment rules

The left operand of the = ref operator must be an expression that binds to a ref local variable, a ref parameter (other than this), an out parameter, or a ref field.

For a ref reassignment in the form e1 = ref e2 both of the following must be true:

  1. e2 must have ref-safe-to-escape at least as large as the ref-safe-to-escape of e1
  2. e1 must have the same safe-to-escape as e2

That means the desired Span<T> constructor works without any extra annotation:

readonly ref struct Span<T>
{
    readonly ref T _field;
    readonly int _length;

    public Span(ref T value)
    {
        // Falls into the `x.e1 = ref e2` case, where `x` is the implicit `this`. The 
        // safe-to-escape of `this` is *return only* and ref-safe-to-escape of `value` is 
        // *calling method* hence this is legal.
        _field = ref value;
        _length = 1;
    }
}

The change to ref reassignment rules means ref parameters can now escape from a method as a ref field in a ref struct value. As discussed in the compat considerations section this can change the rules for existing APIs that never intended for ref parameters to escape as a ref field. The lifetime rules for parameters are based solely on their declaration not on their usage. All ref and in parameters have ref-safe-to-escape of return only and hence can now be returned by ref or a ref field. In order to support APIs having ref parameters that can be escaping or non-escaping, and thus restore C# 10 call site semantics, the language will introduce limited lifetime annotations.

scoped modifier

The keyword scoped will be used to restrict the lifetime of a value. It can be applied to a ref or a value that is a ref struct and has the impact of restricting the ref-safe-to-escape or safe-to-escape lifetime, respectively, to the current method. For example:

Parameter or Local        ref-safe-to-escape    safe-to-escape
Span<int> s               current method        calling method
scoped Span<int> s        current method        current method
ref Span<int> s           calling method        calling method
scoped ref Span<int> s    current method        calling method

In this relationship the ref-safe-to-escape of a value can never exceed the safe-to-escape.

This allows for APIs in C# 11 to be annotated such that they have the same rules as C# 10:

Span<int> CreateSpan(scoped ref int parameter)
{
    // Just as with C# 10, the implementation of this method isn't relevant to callers.
}

Span<int> BadUseExamples(int parameter)
{
    // Legal in C# 10 and legal in C# 11 due to scoped ref
    return CreateSpan(ref parameter);

    // Legal in C# 10 and legal in C# 11 due to scoped ref
    int local = 42;
    return CreateSpan(ref local);

    // Legal in C# 10 and legal in C# 11 due to scoped ref
    Span<int> span = stackalloc int[42];
    return CreateSpan(ref span[0]);
}

The scoped annotation also means that the this parameter of a struct can now be defined as scoped ref T. Previously it had to be special cased in the rules as ref parameter that had different ref-safe-to-escape rules than other ref parameters (see all the references to including or excluding the receiver in the span safety rules). Now it can be expressed as a general concept throughout the rules which further simplifies them.

The scoped annotation can also be applied to the following locations:

  • locals: This annotation sets the lifetime as safe-to-escape, or ref-safe-to-escape in case of a ref local, to current method irrespective of the initializer lifetime.

Span<int> ScopedLocalExamples()
{
    // Error: `span` has a safe-to-escape of *current method*. That is true even though the 
    // initializer has a safe-to-escape of *calling method*. The annotation overrides the 
    // initializer
    scoped Span<int> span = default;
    return span;

    // Okay: the initializer has safe-to-escape of *calling method* hence so does `span2` 
    // and the return is legal.
    Span<int> span2 = default;
    return span2;

    // The declarations of `span3` and `span4` are functionally identical because the 
    // initializer has a safe-to-escape of *current method* meaning the `scoped` annotation
    // is effectively implied on `span3`
    Span<int> span3 = stackalloc int[42];
    scoped Span<int> span4 = stackalloc int[42];
}

Other uses for scoped on locals are discussed below.

The scoped annotation cannot be applied to any other location, including returns, fields, array elements, etc. Further, while scoped has an effect when applied to any ref, in or out, it only has an effect on by-value declarations whose type is a ref struct. A declaration like scoped int has no effect because a non ref struct is always safe to return. The compiler will create a diagnostic for such cases to avoid developer confusion.

Change the behavior of out parameters

To further limit the impact of the compat change of making ref and in parameters returnable as ref fields, the language will change the default ref-safe-to-escape value for out parameters to be current method. Effectively out parameters are implicitly scoped out going forward. From a compat perspective this means they cannot be returned by ref:

ref int Sneaky(out int i) 
{
    i = 42;

    // Error: ref-safe-to-escape of out is now the current method
    return ref i;
}

This will increase the flexibility of APIs that return ref struct values and have out parameters because the language no longer has to consider the parameter being captured by reference. This is important because it's a common pattern in reader style APIs:

Span<byte> Read(Span<byte> buffer, out int read)
{
    // .. 
}

Span<byte> Use()
{
    var buffer = new byte[256];

    // If we keep current `out` ref-safe-to-escape this is an error. The language must consider
    // the `read` parameter as returnable as a `ref` field
    //
    // If we change `out` ref-safe-to-escape this is legal. The language does not consider the 
    // `read` parameter to be returnable hence this is safe
    int read;
    return Read(buffer, out read);
}

The language will also no longer consider arguments passed to an out parameter to be returnable. Treating the input to an out parameter as returnable was extremely confusing to developers. It essentially subverts the intent of out by forcing developers to consider the value passed by the caller which is never used except in languages that don't respect out. Going forward languages that support ref struct must ensure the original value passed to an out parameter is never read.

C# achieves this via its definite assignment rules. That both satisfies our ref safety rules and allows for existing code which assigns and then returns out parameter values.

Span<int> StrangeButLegal(out Span<int> span)
{
    span = default;
    return span;
}

Together these changes mean the argument to an out parameter does not contribute safe-to-escape or ref-safe-to-escape values to method invocations. This significantly reduces the overall compat impact of ref fields as well as simplifies how developers think about out. An argument to an out parameter does not contribute to the return, it is simply an output.

Infer safe-to-escape of declaration expressions

The safe-to-escape of a declaration variable from an out argument (M(x, out var y)) or deconstruction ((var x, var y) = M()) is the narrowest of the following:

  • calling method
  • if out variable is marked scoped, then the current local scope (i.e. current method or narrower).
  • if out variable's type is ref struct, consider all arguments to the containing invocation, including the receiver:
    • STE of any argument where its corresponding parameter is not out and has STE of ReturnOnly or wider
    • RSTE of any argument where its corresponding parameter has RSTE of ReturnOnly or wider

See also Examples of inferred safe-to-escape of declaration expressions.
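The inference rules above can be sketched with a small example. This is only an illustration under stated assumptions: `DeclarationInference` and its `Echo` helper are hypothetical names that do not appear in the proposal, and the comments describe the scope the rules above would infer for each `out var` declaration.

```csharp
using System;

static class DeclarationInference
{
    // Hypothetical helper: hands its input span back through the out parameter.
    static void Echo(Span<int> source, out Span<int> result) => result = source;

    public static int HeapCase()
    {
        Span<int> heap = new int[] { 1, 2, 3 };

        // Every contributing argument is safe-to-escape to the calling method,
        // so `fromHeap` is inferred with that same scope.
        Echo(heap, out var fromHeap);
        return fromHeap.Length;
    }

    public static int StackCase()
    {
        Span<int> stack = stackalloc int[4];

        // `stack` is only safe-to-escape to the current method, so `fromStack`
        // is inferred with that narrower scope. Returning `fromStack` itself
        // from a Span<int>-returning method would be rejected.
        Echo(stack, out var fromStack);
        return fromStack.Length;
    }
}
```

Both methods only return the length, which is a plain int and therefore always safe to return regardless of the scope inferred for the span.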

Implicitly scoped parameters

Overall there are two ref locations which are implicitly declared as scoped:

  • this on a struct instance method
  • out parameters

The span safety rules will be written in terms of scoped ref and ref. For span safety purposes an in parameter is equivalent to ref and out is equivalent to scoped ref. Both in and out will only be specifically called out when it is important to the semantic of the rule. Otherwise they are just considered ref and scoped ref respectively.

When discussing the ref-safe-to-escape of arguments that correspond to in parameters they will be generalized as ref arguments in the spec. In the case the argument is an lvalue then the ref-safe-to-escape is that of the lvalue, otherwise it is current method. Again in will only be called out here when it is important to the semantic of the current rule.
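A short sketch of the lvalue versus non-lvalue distinction for arguments to in parameters follows. The `InArguments` type and its members are hypothetical names used only for illustration.

```csharp
using System;

static class InArguments
{
    static readonly int[] Storage = { 5 };

    // An in parameter is treated like ref for lifetime purposes, so it can be
    // returned by reference.
    static ref readonly int Identity(in int value) => ref value;

    // Okay: the argument is an lvalue (an array element) whose
    // ref-safe-to-escape is the calling method.
    public static ref readonly int FromArray() => ref Identity(in Storage[0]);

    // Would be an error if uncommented: the literal is not an lvalue, so the
    // argument has a ref-safe-to-escape of current method and the result
    // cannot be returned by reference.
    // public static ref readonly int FromLiteral() => ref Identity(42);
}
```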

Return-only escape scope

The design also requires the introduction of a new escape scope: return only. This is similar to calling method in that it can be returned but it can only be returned through a return statement.

Return only is a scope which is greater than current method but smaller than calling method. An expression provided to a return statement must be at least return only. As such most existing rules fall out. For example, assignment into a ref parameter from an expression with a safe-to-escape of return only will fail because it's smaller than the ref parameter's safe-to-escape, which is calling method. The need for this new escape scope will be discussed below.

There are three locations which default to return only:

  • A ref or in parameter. This is done in part for ref struct to prevent silly cyclic assignment issues. It is done uniformly though to simplify the model as well as minimize compat changes.
  • An out parameter for a ref struct will have safe-to-escape of return only. This allows for return and out to be equally expressive. This does not have the silly cyclic assignment problem because out is implicitly scoped so the ref-safe-to-escape is still smaller than the safe-to-escape.
  • A this parameter for a struct constructor. This falls out due to being modeled as out parameters.

Any expression or statement which explicitly returns a value from a method or lambda must have a safe-to-escape, and if applicable a ref-safe-to-escape, of at least return only. That includes return statements, expression bodied members and lambda expressions.

Likewise any assignment to an out must have a safe-to-escape of at least return only. This is not a special case though, this just follows from the existing assignment rules.

Note: An expression whose type is not a ref struct type always has a safe-to-escape of calling method.
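The difference between return only and calling method can be sketched as follows. This is an illustration rather than part of the rules: `IntRef` and `ReturnOnlyDemo` are hypothetical types, and the sketch assumes a compiler and runtime with ref field support.

```csharp
using System;

ref struct IntRef
{
    public ref int Target;

    public IntRef(ref int target)
    {
        Target = ref target;
    }
}

static class ReturnOnlyDemo
{
    // Okay: `value` has a ref-safe-to-escape of return only, which is wide
    // enough for the value to leave through a return statement.
    public static IntRef Wrap(ref int value) => new IntRef(ref value);

    // Would be an error if uncommented: `destination` has a safe-to-escape of
    // calling method, which is wider than the return only scope of the new
    // IntRef value.
    // public static void Store(ref int value, ref IntRef destination)
    // {
    //     destination = new IntRef(ref value);
    // }

    public static int RoundTrip()
    {
        int[] data = { 10 };
        IntRef wrapped = Wrap(ref data[0]);

        // Writes through the ref field into the array element.
        wrapped.Target = 42;
        return data[0];
    }
}
```

In this sketch `RoundTrip` observes the write made through the ref field because `Target` aliases `data[0]`.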

Rules for method invocation

The span safety rules for method invocation will be updated in several ways. The first is by recognizing the impact that scoped has on arguments. For a given argument expr that is passed to parameter p:

  1. If p is scoped ref then expr does not contribute ref-safe-to-escape when considering arguments.
  2. If p is scoped then expr does not contribute safe-to-escape when considering arguments.
  3. If p is out then expr does not contribute ref-safe-to-escape or safe-to-escape

The language "does not contribute" means the arguments are simply not considered when calculating the ref-safe-to-escape or safe-to-escape value of the method return respectively. That is because the values can't contribute to that lifetime as the scoped annotation prevents it.

The method invocation rules can now be simplified. The receiver no longer needs to be special cased, in the case of struct it is now simply a scoped ref T. The value rules need to change to account for ref field returns:

A value resulting from a method invocation e1.M(e2, ...) is safe-to-escape from the narrowest of the following scopes:

  1. The calling method
  2. When the return is a ref struct the safe-to-escape contributed by all argument expressions
  3. When the return is a ref struct the ref-safe-to-escape contributed by all ref arguments

The ref calling rules can be simplified to:

A value resulting from a method invocation ref e1.M(e2, ...) is ref-safe-to-escape the narrowest of the following scopes:

  1. The calling method
  2. The safe-to-escape contributed by all argument expressions
  3. The ref-safe-to-escape contributed by all ref arguments

This rule now lets us define the two variants of desired methods:

Span<int> CreateWithoutCapture(scoped ref int value)
{
    // Error: value Rule 3 specifies that the safe-to-escape be limited to the ref-safe-to-escape
    // of the ref argument. That is the *current method* for value hence this is not allowed.
    return new Span<int>(ref value);
}

Span<int> CreateAndCapture(ref int value)
{
    // Okay: value Rule 3 specifies that the safe-to-escape be limited to the ref-safe-to-escape
    // of the ref argument. That is the *calling method* for value hence this is allowed.
    return new Span<int>(ref value);
}

Span<int> ComplexScopedRefExample(scoped ref Span<int> span)
{
    // Okay: the safe-to-escape of `span` is *calling method* hence this is legal.
    return span;

    // Okay: the local `refLocal` has a ref-safe-to-escape of *current method* and a 
    // safe-to-escape of *calling method*. In the call below it is passed to a 
    // parameter that is `scoped ref` which means it does not contribute 
    // ref-safe-to-escape. It only contributes its safe-to-escape hence the returned
    // rvalue ends up as safe-to-escape of *calling method*
    Span<int> local = default;
    ref Span<int> refLocal = ref local;
    return ComplexScopedRefExample(ref refLocal);

    // Error: similar analysis as above but the safe-to-escape scope of `stackLocal` is 
    // *current method* hence this is illegal
    Span<int> stackLocal = stackalloc int[42];
    return ComplexScopedRefExample(ref stackLocal);
}

The presence of ref fields means the rules around "method arguments must match" need to be updated, as a ref parameter can now be stored as a field in a ref struct argument to the method. Previously the rule only had to consider another ref struct being stored as a field. The impact of this is discussed in the compat considerations. The new rule is:

For any method invocation e.M(a1, a2, ... aN)

  1. Calculate the narrowest safe-to-escape from:
    • calling method
    • The safe-to-escape of all arguments
    • The ref-safe-to-escape of all ref arguments whose corresponding parameters have a ref-safe-to-escape of calling method
  2. All ref arguments of ref struct types must be assignable by a value with that safe-to-escape. This is a case where ref does not generalize to include in and out

For any method invocation e.M(a1, a2, ... aN)

  1. Calculate the narrowest safe-to-escape from:
    • calling method
    • The safe-to-escape of all arguments
    • The ref-safe-to-escape of all ref arguments whose corresponding parameters are not scoped
  2. All out arguments of ref struct types must be assignable by a value with that safe-to-escape.

The presence of scoped allows developers to reduce the friction this rule creates by marking parameters which are not returned as scoped. This removes their arguments from (1) in both cases above and provides greater flexibility to callers.
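As a minimal sketch of how scoped relaxes this rule at a call site, consider the two helpers below. `ArgumentsMustMatch`, `CopyFirst` and `CopyFirstScoped` are hypothetical names introduced only for this illustration.

```csharp
using System;

static class ArgumentsMustMatch
{
    // `source` is not scoped, so callers must assume it could be stored
    // into `destination`.
    public static void CopyFirst(ref Span<int> destination, Span<int> source)
        => destination[0] = source[0];

    // `source` is scoped, so it does not contribute safe-to-escape and is
    // removed from step (1) of the rule.
    public static void CopyFirstScoped(ref Span<int> destination, scoped Span<int> source)
        => destination[0] = source[0];

    public static int Run()
    {
        Span<int> heap = new int[1];
        Span<int> stack = stackalloc int[] { 7 };

        // Would be an error if uncommented: `stack` is only safe-to-escape to
        // the current method but `heap` is safe-to-escape to the calling method.
        // CopyFirst(ref heap, stack);

        // Okay: the scoped parameter cannot be captured by `heap`.
        CopyFirstScoped(ref heap, stack);
        return heap[0];
    }
}
```

The bodies of the two helpers are identical; only the declaration differs, which reflects the point that lifetime rules are driven by the signature and not by the implementation.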

Impact of this change is discussed more deeply below. Overall this will allow developers to make call sites more flexible by annotating non-escaping ref-like values with scoped.

Parameter scope variance

The scoped modifier and [UnscopedRef] attribute (see below) on parameters also impacts our object overriding, interface implementation and delegate conversion rules. The signature for an override, interface implementation or delegate conversion can:

  • Add scoped to a ref or in parameter
  • Add scoped to a ref struct parameter
  • Remove [UnscopedRef] from an out parameter
  • Remove [UnscopedRef] from a ref parameter of a ref struct type

Any other difference with respect to scoped or [UnscopedRef] is considered a mismatch.
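As a rough illustration of the first allowed difference, the sketch below overrides a method and adds scoped to a ref parameter. The `Parser` and `NonCapturingParser` types are hypothetical names introduced only for this example.

```csharp
using System;

abstract class Parser
{
    // The base signature permits `input` to be captured by the returned span.
    public abstract Span<int> Parse(ref int input);
}

sealed class NonCapturingParser : Parser
{
    // Okay: the override adds `scoped` to a ref parameter. It promises less
    // escaping than the base signature, so callers going through `Parser`
    // remain safe.
    public override Span<int> Parse(scoped ref int input) => new int[] { input };
}
```

Going the other way, with scoped on the base method and a plain ref on the override, would fall under "any other difference" and be treated as a mismatch.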

The compiler will report a diagnostic for unsafe scoped mismatches across overrides, interface implementations, and delegate conversions when:

  • The method returns a ref struct or returns a ref or ref readonly, or the method has a ref or out parameter of ref struct type, and
  • The method has at least one additional ref, in, or out parameter, or a parameter of ref struct type.

The rules above ignore this parameters because ref struct instance methods cannot be used for overrides, interface implementations, or delegate conversions.

The diagnostic is reported as an error if the mismatched signatures are both using C# 11 ref safety rules; otherwise, the diagnostic is a warning.

The scoped mismatch warning may be reported on a module compiled with C# 7.2 ref safety rules where scoped is not available. In some such cases, it may be necessary to suppress the warning if the other mismatched signature cannot be modified.

The scoped modifier and [UnscopedRef] attribute also have the following effects on method signatures:

  • The scoped modifier and [UnscopedRef] attribute do not affect hiding
  • Overloads cannot differ only on scoped or [UnscopedRef]

The section on ref fields and scoped is long, so it closes with a brief summary of the proposed breaking changes:

  • A value that has ref-safe-to-escape to the calling method is returnable by ref or ref field.
  • An out parameter would be considered safe-to-escape within the current method.

Detailed Notes:

  • A ref field can only be declared inside of a ref struct
  • A ref field cannot be declared static, volatile or const
  • A ref field cannot have a type that is ref struct
  • The reference assembly generation process must preserve the presence of a ref field inside a ref struct
  • A readonly ref struct must declare its ref fields as readonly ref
  • For by-ref values the scoped modifier must appear before in, out, or ref
  • The span safety rules document will be updated as outlined in this document
  • The new span safety rules will be in effect when either
    • The core library contains the feature flag indicating support for ref fields
    • The langversion value is 11 or higher

Syntax

12.6.2 Local variable declarations: added 'scoped'?.

local_variable_declaration
    : 'scoped'? local_variable_mode_modifier? local_variable_type local_variable_declarators
    ;

local_variable_mode_modifier
    : 'ref' 'readonly'?
    ;

12.9.4 The for statement: added 'scoped'? indirectly from local_variable_declaration.

12.9.5 The foreach statement: added 'scoped'?.

foreach_statement
    : 'foreach' '(' 'scoped'? local_variable_type identifier 'in' expression ')'
      embedded_statement
    ;

11.6.2 Argument lists: added 'scoped'? for out declaration variable.

argument_value
    : expression
    | 'in' variable_reference
    | 'ref' variable_reference
    | 'out' ('scoped'? local_variable_type)? identifier
    ;

--.-.- Deconstruction expressions:

[TBD]

14.6.2 Method parameters: added 'scoped'? to parameter_modifier.

fixed_parameter
    : attributes? parameter_modifier? type identifier default_argument?
    ;

parameter_modifier
    : 'this' 'scoped'? parameter_mode_modifier?
    | 'scoped' parameter_mode_modifier?
    | parameter_mode_modifier
    ;

parameter_mode_modifier
    : 'in'
    | 'ref'
    | 'out'
    ;

19.2 Delegate declarations: added 'scoped'? indirectly from fixed_parameter.

11.16 Anonymous function expressions: added 'scoped'?.

explicit_anonymous_function_parameter
    : 'scoped'? anonymous_function_parameter_modifier? type identifier
    ;

anonymous_function_parameter_modifier
    : 'in'
    | 'ref'
    | 'out'
    ;

Sunset restricted types

The compiler has a concept of a set of "restricted types" which is largely undocumented. These types were given a special status because in C# 1.0 there was no general purpose way to express their behavior. Most notably the fact that the types can contain references to the execution stack. Instead the compiler had special knowledge of them and restricted their use to ways that would always be safe: disallowed returns, cannot use as array elements, cannot use in generics, etc ...

Once ref fields are available and extended to support ref struct these types can be correctly defined in C# using a combination of ref struct and ref fields. Therefore when the compiler detects that a runtime supports ref fields it will no longer have a notion of restricted types. It will instead use the types as they are defined in the code.

To support this our span safety rules will be updated as follows:

  • __makeref will be treated as a method with the signature static TypedReference __makeref<T>(ref T value)
  • __refvalue will be treated as a method with the signature static ref T __refvalue<T>(TypedReference tr). The expression __refvalue(tr, int) will effectively use the second argument as the type parameter.
  • __arglist as a parameter will have a ref-safe-to-escape and safe-to-escape of current method.
  • __arglist(...) as an expression will have a ref-safe-to-escape and safe-to-escape of current method.

Conforming runtimes will ensure that TypedReference, RuntimeArgumentHandle and ArgIterator are defined as ref struct. Further TypedReference must be viewed as having a ref field to a ref struct for any possible type (it can store any value). That combined with the above rules will ensure references to the stack do not escape beyond their lifetime.

Note: strictly speaking this is a compiler implementation detail vs. part of the language. But given the relationship with ref fields it is being included in the language proposal for simplicity.

Provide unscoped

One of the most notable friction points is the inability to return fields by ref in instance members of a struct. This means developers can't create ref returning methods / properties and have to resort to exposing fields directly. This reduces the usefulness of ref returns in struct where it is often the most desired.

struct S
{
    int _field;

    // Error: this, and hence _field, can't return by ref
    public ref int Prop => ref _field;
}

The rationale for this default is reasonable but there is nothing inherently wrong with a struct escaping this by reference, it is simply the default chosen by the span safety rules.

To fix this the language will provide the opposite of the scoped lifetime annotation by supporting an UnscopedRefAttribute. This can be applied to any ref and it will change the ref-safe-to-escape to be one level wider than its default. For example:

  • if applied to a struct instance method it will become return only where previously it was current method.
  • if applied to a ref parameter it will become calling method where previously it was return only

When applying [UnscopedRef] to an instance method of a struct it has the impact of modifying the implicit this parameter. This means this acts as an unannotated ref of the same type.

struct S
{
    int field; 

    // Error: `field` has the ref-safe-to-escape of `this` which is *current method* because 
    // it is a `scoped ref`
    ref int Prop1 => ref field;

    // Okay: `field` has the ref-safe-to-escape of `this` which is *calling method* because 
    // it is a `ref`
    [UnscopedRef] ref int Prop2 => ref field;
}

The annotation can also be placed on out parameters to restore them to C# 10 behavior.

ref int SneakyOut([UnscopedRef] out int i)
{
    i = 42;
    return ref i;
}

For the purposes of span safety rules, such an [UnscopedRef] out is considered simply a ref. Similar to how in is considered ref for lifetime purposes.

The [UnscopedRef] annotation will be disallowed on init members and constructors inside struct. Those members are already special with respect to ref semantics as they view readonly members as mutable. This means taking ref to those members appears as a simple ref, not ref readonly. This is allowed within the boundary of constructors and init. Allowing [UnscopedRef] would permit such a ref to incorrectly escape outside the constructor and permit mutation after readonly semantics had taken place.

The attribute type will have the following definition:

namespace System.Diagnostics.CodeAnalysis
{
    [AttributeUsage(
        AttributeTargets.Method | AttributeTargets.Property | AttributeTargets.Parameter,
        AllowMultiple = false,
        Inherited = false)]
    public sealed class UnscopedRefAttribute : Attribute
    {
    }
}

Detailed Notes:

  • An instance method or property annotated with [UnscopedRef] has ref-safe-to-escape of this set to the calling method.
  • A member annotated with [UnscopedRef] cannot implement an interface.
  • It is an error to use [UnscopedRef] on
    • A member that is not declared on a struct
    • A static member, init member or constructor on a struct
    • A parameter marked scoped
    • A parameter passed by value
    • A parameter passed by reference that is not implicitly scoped

ScopedRefAttribute

The scoped annotations will be emitted into metadata via the System.Runtime.CompilerServices.ScopedRefAttribute type. The attribute will be matched by namespace-qualified name so the definition does not need to appear in any specific assembly.

The ScopedRefAttribute type is for compiler use only - it is not permitted in source. The type declaration is synthesized by the compiler if not already included in the compilation.

The type will have the following definition:

namespace System.Runtime.CompilerServices
{
    [AttributeUsage(AttributeTargets.Parameter, AllowMultiple = false, Inherited = false)]
    internal sealed class ScopedRefAttribute : Attribute
    {
    }
}

The compiler will emit this attribute on the parameter with scoped syntax. This will only be emitted when the syntax causes the value to differ from its default state. For example scoped out will cause no attribute to be emitted.
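One way to observe this emission rule is to inspect parameter metadata with reflection. The sketch below assumes a C# 11 compiler; the class ScopedRefMetadataDemo and its three methods are hypothetical and exist only so there is something to inspect. Going by the rule above, the scoped ref parameter should carry the attribute, while the plain ref and the scoped out parameters should not.

```csharp
using System;
using System.Linq;
using System.Reflection;

static class ScopedRefMetadataDemo
{
    // Hypothetical methods used only to inspect the emitted metadata.
    public static void ScopedRefParam(scoped ref int value) { }
    public static void PlainRefParam(ref int value) { }
    public static void ScopedOutParam(scoped out int value) { value = 0; }

    // Matches the attribute by namespace-qualified name, as the compiler does.
    const string ScopedRefName = "System.Runtime.CompilerServices.ScopedRefAttribute";

    // Returns true when the first parameter of the named method carries the attribute.
    public static bool HasScopedRefAttribute(string methodName)
    {
        ParameterInfo parameter =
            typeof(ScopedRefMetadataDemo).GetMethod(methodName)!.GetParameters()[0];

        return parameter.GetCustomAttributesData()
            .Any(a => a.AttributeType.FullName == ScopedRefName);
    }

    static void Main()
    {
        foreach (string name in new[] { "ScopedRefParam", "PlainRefParam", "ScopedOutParam" })
        {
            Console.WriteLine($"{name}: {HasScopedRefAttribute(name)}");
        }
    }
}
```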

RefSafetyRulesAttribute

There are several differences in the ref safety rules between C#7.2 and C#11. Any of these differences could result in breaking changes when recompiling with C#11 against references compiled with C#10 or earlier.

  1. unscoped ref/in/out parameters may escape a method invocation as a ref field of a ref struct in C#11, not in C#7.2
  2. out parameters are implicitly scoped in C#11, and unscoped in C#7.2
  3. ref/in parameters to ref struct types are implicitly scoped in C#11, and unscoped in C#7.2

To reduce the chance of breaking changes when recompiling with C#11, we will update the C#11 compiler to use the ref safety rules for method invocation that match the rules that were used to analyze the method declaration. Essentially, when analyzing a call to a method compiled with an older compiler, the C#11 compiler will use C#7.2 ref safety rules.

To enable this, the compiler will emit a new [module: RefSafetyRules(11)] attribute when the module is compiled with -langversion:11 or higher or compiled with a corlib containing the feature flag for ref fields.

The argument to the attribute indicates the language version of the ref safety rules used when the module was compiled. The version is currently fixed at 11 regardless of the actual language version passed to the compiler.

The expectation is that future versions of the compiler will update the ref safety rules and emit attributes with distinct versions.

If the compiler loads a module that includes a [module: RefSafetyRules(version)] with a version other than 11, the compiler will report a warning for the unrecognized version if there are any calls to methods declared in that module.

When the C#11 compiler analyzes a method call:

  • If the module containing the method declaration includes [module: RefSafetyRules(version)], regardless of version, the method call is analyzed with C#11 rules.
  • If the module containing the method declaration is from source, and compiled with -langversion:11 or with a corlib containing the feature flag for ref fields, the method call is analyzed with C#11 rules.
  • If the module containing the method declaration references System.Runtime { ver: 7.0 }, the method call is analyzed with C#11 rules. This rule is a temporary mitigation for modules compiled with earlier previews of C#11 / .NET 7 and will be removed later.
  • Otherwise, the method call is analyzed with C#7.2 rules.
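The selection order above can be modeled as a small pure function. This is only a sketch of the decision order, not compiler code: ModuleFacts, RefSafetyRuleSet and RefSafetyRuleSelector are hypothetical names, and the three booleans stand in for facts the compiler would determine about the module that declares the method.

```csharp
using System;

enum RefSafetyRuleSet { CSharp7_2, CSharp11 }

// Hypothetical summary of what is known about the declaring module.
sealed record ModuleFacts(
    bool HasRefSafetyRulesAttribute,
    bool IsSourceWithLangVersion11OrRefFieldsCorlib,
    bool ReferencesSystemRuntime7);

static class RefSafetyRuleSelector
{
    // Mirrors the four bullets in order; the first matching rule wins.
    public static RefSafetyRuleSet Select(ModuleFacts module)
    {
        // Any [module: RefSafetyRules(version)], regardless of version.
        if (module.HasRefSafetyRulesAttribute) return RefSafetyRuleSet.CSharp11;

        // Source module compiled with -langversion:11 or a ref-fields corlib.
        if (module.IsSourceWithLangVersion11OrRefFieldsCorlib) return RefSafetyRuleSet.CSharp11;

        // Temporary mitigation for modules built with early C# 11 previews.
        if (module.ReferencesSystemRuntime7) return RefSafetyRuleSet.CSharp11;

        return RefSafetyRuleSet.CSharp7_2;
    }

    static void Main()
    {
        var legacy = new ModuleFacts(false, false, false);
        var annotated = new ModuleFacts(true, false, false);

        Console.WriteLine(Select(legacy));    // prints CSharp7_2
        Console.WriteLine(Select(annotated)); // prints CSharp11
    }
}
```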

A pre-C#11 compiler will ignore any RefSafetyRulesAttribute and analyze method calls with C#7.2 rules only.

The RefSafetyRulesAttribute will be matched by namespace-qualified name so the definition does not need to appear in any specific assembly.

The RefSafetyRulesAttribute type is for compiler use only - it is not permitted in source. The type declaration is synthesized by the compiler if not already included in the compilation.

namespace System.Runtime.CompilerServices
{
    [AttributeUsage(AttributeTargets.Module, AllowMultiple = false, Inherited = false)]
    internal sealed class RefSafetyRulesAttribute : Attribute
    {
        public RefSafetyRulesAttribute(int version) { Version = version; }
        public readonly int Version;
    }
}
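Because the attribute is matched by name, a tool can look for it the same way. The following is a minimal sketch using reflection; RefSafetyRulesProbe is a hypothetical helper, and what it reports depends on which compiler and language version produced the module being inspected.

```csharp
using System;
using System.Reflection;

static class RefSafetyRulesProbe
{
    const string AttributeName = "System.Runtime.CompilerServices.RefSafetyRulesAttribute";

    // Returns the version recorded on the module, or null when the attribute is absent.
    public static int? GetRefSafetyRulesVersion(Module module)
    {
        foreach (CustomAttributeData attribute in module.GetCustomAttributesData())
        {
            // Match by namespace-qualified name; the type may be synthesized per assembly.
            if (attribute.AttributeType.FullName == AttributeName
                && attribute.ConstructorArguments.Count == 1
                && attribute.ConstructorArguments[0].Value is int version)
            {
                return version;
            }
        }

        return null;
    }

    static void Main()
    {
        int? version = GetRefSafetyRulesVersion(typeof(RefSafetyRulesProbe).Module);
        Console.WriteLine(version?.ToString() ?? "no RefSafetyRules attribute found");
    }
}
```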

Safe fixed size buffers

The language will relax the restrictions on fixed sized arrays such that they can be declared in safe code and the element type can be managed or unmanaged. This will make types like the following legal:

internal struct CharBuffer
{
    internal char Data[128];
}

These declarations, much like their unsafe counterparts, will define a sequence of N elements in the containing type. These members can be accessed with an indexer and can also be converted to Span<T> and ReadOnlySpan<T> instances.

When indexing into a fixed buffer of type T the readonly state of the container must be taken into account. If the container is readonly then the indexer returns ref readonly T else it returns ref T.

Accessing a fixed buffer without an indexer has no natural type however it is convertible to Span<T> types. In the case the container is readonly the buffer is implicitly convertible to ReadOnlySpan<T>, else it can implicitly convert to Span<T> or ReadOnlySpan<T> (the Span<T> conversion is considered better).

The resulting Span<T> instance will have a length equal to the size declared on the fixed buffer. The safe-to-escape scope of the returned value will be equal to the safe-to-escape scope of the container, just as it would if the backing data was accessed as a field.

For each fixed declaration in a type where the element type is T the language will generate a corresponding get only indexer method whose return type is ref T. The indexer will be annotated with the [UnscopedRef] attribute as the implementation will be returning fields of the declaring type. The accessibility of the member will match the accessibility on the fixed field.

For example, the signature of the indexer for CharBuffer.Data will be the following:

[UnscopedRef] internal ref char DataIndexer(int index) => ...;

If the provided index is outside the declared bounds of the fixed array then an IndexOutOfRangeException will be thrown. In the case a constant value is provided then it will be replaced with a direct reference to the appropriate element. Unless the constant is outside the declared bounds in which case a compile time error would occur.

There will also be a named accessor generated for each fixed buffer that provides by value get and set operations. Having this means that fixed buffers will more closely resemble existing array semantics by having a ref accessor as well as byval get and set operations. This means compilers will have the same flexibility when emitting code consuming fixed buffers as they do when consuming arrays. This should make operations like await over fixed buffers easier to emit.

This also has the added benefit that it will make fixed buffers easier to consume from other languages. Named indexers are a feature that has existed since the 1.0 release of .NET. Even languages which cannot directly emit a named indexer can generally consume them (C# is actually a good example of this).

The backing storage for the buffer will be generated using the [InlineArray] attribute. This is a mechanism discussed in issue 12320 which allows specifically for the case of efficiently declaring a sequence of fields of the same type. This particular issue is still under active discussion and the expectation is that the implementation of this feature will follow however that discussion goes.
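The `char Data[128]` declaration syntax above is proposal syntax. To illustrate the backing-storage idea on its own, here is a sketch that declares the storage directly with System.Runtime.CompilerServices.InlineArrayAttribute, assuming a toolchain where that attribute and the matching language support are available (.NET 8 with C# 12). The class InlineArrayDemo is a hypothetical name, and this is not the code the compiler would generate for the proposed syntax.

```csharp
using System;
using System.Runtime.CompilerServices;

// 128 consecutive char elements laid out inline in the struct.
[InlineArray(128)]
struct CharBuffer
{
    private char _element0;
}

static class InlineArrayDemo
{
    public static string Run()
    {
        var buffer = new CharBuffer();

        // Element access by index, as with an array.
        buffer[0] = 'o';

        // A writable buffer converts to Span<char> whose length is the declared size.
        Span<char> span = buffer;
        span[1] = 'k';

        return $"{span.Length}:{buffer[0]}{buffer[1]}";
    }

    static void Main() => Console.WriteLine(Run()); // prints 128:ok
}
```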

Initializers with ref values in new and with expressions

In section 11.7.15.3 Object initializers, we update the grammar to:

initializer_value
    : 'ref' expression // added
    | expression
    | object_or_collection_initializer
    ;

In the section for with expression, we update the grammar to:

member_initializer
    : identifier '=' 'ref' expression // added
    | identifier '=' expression
    ;

The left operand of the assignment must be an expression that binds to a ref field.
The right operand must be an expression that yields an lvalue designating a value of the same type as the left operand.

We add a similar rule to ref local reassignment:
If the left operand is a writeable ref (i.e. it designates anything other than a ref readonly field), then the right operand must be a writeable lvalue.

The escape rules for constructor invocations remain:

A new expression that invokes a constructor obeys the same rules as a method invocation that is considered to return the type being constructed.

Namely the rules of method invocation updated above:

An rvalue resulting from a method invocation e1.M(e2, ...) is safe-to-escape from the smallest of the following scopes:

  1. The calling method
  2. The safe-to-escape contributed by all argument expressions
  3. When the return is a ref struct then ref-safe-to-escape contributed by all ref arguments

For a new expression with initializers, the initializer expressions count as arguments (they contribute their safe-to-escape) and the ref initializer expressions count as ref arguments (they contribute their ref-safe-to-escape), recursively.
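A short sketch of a ref initializer in an object initializer, assuming a C# 11 compiler; Holder and RefInitializerDemo are hypothetical names used only for illustration. The initializer binds the ref field to a local, and writes through the field are visible in that local.

```csharp
using System;

ref struct Holder
{
    public ref int Value;
}

static class RefInitializerDemo
{
    public static int Run()
    {
        int x = 1;

        // `Value = ref x` counts as a ref argument to the `new` expression,
        // so `holder` is not allowed to escape the scope of `x`.
        var holder = new Holder { Value = ref x };

        // Writes through the ref field land in `x`.
        holder.Value = 42;

        return x;
    }

    static void Main() => Console.WriteLine(Run()); // prints 42
}
```

Returning `holder` from Run would be rejected under the rules above, because the ref-safe-to-escape of `x` is the current method.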

Changes in unsafe context

Pointer types (section 22.3) are extended to allow managed types as referent type. Such pointer types are written as a managed type followed by a * token. They produce a warning.

The address-of operator (section 22.6.5) is relaxed to accept a variable with a managed type as its operand.

The fixed statement (section 22.7) is relaxed to accept fixed_pointer_initializer that is the address of a variable of managed type T or that is an expression of an array_type with elements of a managed type T.

The stack allocation initializer (section 22.9) is similarly relaxed.
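A minimal sketch of the relaxed pointer rules, assuming a C# 11 compiler with unsafe code enabled (for example AllowUnsafeBlocks in the project file). ManagedPointerDemo is a hypothetical name. The pointer declaration and the address-of expression compile with a warning instead of an error.

```csharp
using System;

static class ManagedPointerDemo
{
    public static unsafe string Run()
    {
        string text = "hello";

        // Pointer to a managed type and address-of a managed-type variable:
        // accepted in an unsafe context, reported as a warning rather than an error.
        string* pointer = &text;

        return *pointer;
    }

    static void Main() => Console.WriteLine(Run()); // prints hello
}
```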

Considerations

There are several things other parts of the development stack should consider when evaluating this feature.

Compat Considerations

The challenge in this proposal is the compatibility implications this design has to our existing span safety rules. While those rules fully support the concept of a ref struct having ref fields they do not allow for APIs, other than stackalloc, to capture ref state that refers to the stack. The span safety rules have a hard assumption that a constructor of the form Span(ref T value) does not exist. That means the safety rules do not account for a ref parameter being able to escape as a ref field hence it allows for code like the following.

Span<int> CreateSpan()
{
    // This is legal according to the 7.2 span rules because they do not account
    // for a constructor in the form Span(ref T value) existing. 
    int local = 42;
    return new Span<int>(ref local);
}

Effectively there are three ways for a ref parameter to escape from a method invocation:

  1. By value return
  2. By ref return
  3. By ref field in ref struct that is returned or passed as ref / out parameter

The existing rules only account for (1) and (2). They do not account for (3) hence gaps like returning locals as ref fields are not accounted for. This design must change the rules to account for (3). This will have a small impact to compatibility for existing APIs. Specifically it will impact APIs that have the following properties.

  • Have a ref struct in the signature
    • Where the ref struct is a return type, ref or out parameter
    • Has an additional in or ref parameter excluding the receiver

In C# 10 callers of such APIs never had to consider that ref state input to the API could be captured as a ref field. That allowed for several patterns to exist, safely in C# 10, that will be unsafe in C# 11 due to the ability for ref state to escape as a ref field. For example:

Span<int> CreateSpan(ref int parameter)
{
    // The implementation of this method is irrelevant when considering the lifetime of the 
    // returned Span<T>. The span safety rules only look at the method signature, not the 
    // implementation. In C# 10 ref fields didn't exist hence there was no way for `parameter`
    // to escape by ref in this method
}

Span<int> BadUseExamples(int parameter)
{
    // Legal in C# 10 but would be illegal with ref fields
    return CreateSpan(ref parameter);

    // Legal in C# 10 but would be illegal with ref fields
    int local = 42;
    return CreateSpan(ref local);

    // Legal in C# 10 but would be illegal with ref fields
    Span<int> span = stackalloc int[42];
    return CreateSpan(ref span[0]);
}

The impact of this compatibility break is expected to be very small. The impacted API shape made little sense in the absence of ref fields hence it is unlikely customers created many of these. Experiments running tools to spot this API shape over existing repositories back up that assertion. The only repository with any significant counts of this shape is dotnet/runtime and that is because that repo can create ref fields via the ByReference<T> intrinsic type.

Even so the design must account for such APIs existing because it expresses a valid pattern, just not a common one. Hence the design must give developers the tools to restore the existing lifetime rules when upgrading to C# 11. Specifically it must provide mechanisms that allow developers to annotate ref parameters as unable to escape by ref or ref field. That allows customers to define APIs in C# 11 that have the same C# 10 callsite rules.
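As a sketch of that mechanism, assuming a C# 11 compiler: marking the parameter as scoped ref tells callers that the ref is not captured in the returned Span<int>, so the call site that passes a local compiles as it did in C# 10. ScopedRefDemo and its members are hypothetical names.

```csharp
using System;

static class ScopedRefDemo
{
    // `scoped` states that `seed` is not captured by the returned span.
    static Span<int> CreateSpan(scoped ref int seed) => new int[] { seed, seed + 1 };

    public static Span<int> FromLocal()
    {
        int local = 42;

        // Legal: the result cannot refer to `local` because the parameter is scoped.
        // Without `scoped` this return is an error under the C# 11 rules.
        return CreateSpan(ref local);
    }

    public static int Run()
    {
        Span<int> span = FromLocal();
        return span[1];
    }

    static void Main() => Console.WriteLine(Run()); // prints 43
}
```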

Reference Assemblies

A reference assembly for a compilation using features described in this proposal must maintain the elements that convey span safety information. That means all lifetime annotation attributes must be preserved in their original position. Any attempt to replace or omit them can lead to invalid reference assemblies.

Representing ref fields is more nuanced. Ideally a ref field would appear in a reference assembly as would any other field. However a ref field represents a change to the metadata format and that can cause issues with tool chains that are not updated to understand this metadata change. A concrete example is C++/CLI which will likely error if it consumes a ref field. Hence it's advantageous if ref fields can be omitted from reference assemblies in our core libraries.

A ref field by itself has no impact on span safety rules. As a concrete example consider that flipping the existing Span<T> definition to use a ref field has no impact on consumption. Hence the ref itself can be omitted safely. However a ref field does have other impacts to consumption that must be preserved:

  • A ref struct which has a ref field is never considered unmanaged
  • The type of the ref field impacts infinite generic expansion rules. Hence if the type of a ref field contains a type parameter that must be preserved

Given those rules here is a valid reference assembly transformation for a ref struct:

// Impl assembly 
ref struct S<T>
{
    ref T _field;
}

// Ref assembly 
ref struct S<T>
{
    object _o; // force managed 
    T _f; // maintain generic expansion protections
}

Open Issues

Change the design to avoid compat breaks

This design proposes several compatibility breaks with our existing span safety rules. Even though the breaks are believed to be minimally impactful significant consideration was given to a design which had no breaking changes.

The compat preserving design though was significantly more complex than this one. In order to preserve compat ref fields need distinct lifetimes for the ability to return by ref and return by ref field. Essentially it requires us to provide ref-field-safe-to-escape tracking for all parameters to a method. This needs to be calculated for all expressions and tracked in all values virtually everywhere that ref-safe-to-escape is tracked today.

Further this value has relationships with ref-safe-to-escape. For example it's nonsensical to have a value that can be returned as a ref field but not directly as ref. That is because ref fields can be trivially returned by ref already (ref state in a ref struct can be returned by ref even when the containing value cannot). Hence the rules further need constant adjustment to ensure these values are sensible with respect to each other.

Also it means the language needs syntax to represent ref parameters that can be returned in three different ways: by ref field, by ref and by value. The default being returnable by ref. Going forward though the more natural return, particularly when ref struct are involved is expected to be by ref field or ref. That means new APIs require an extra syntax annotation to be correct by default. This is undesirable.

These compat changes though will impact methods that have the following properties:

  • Have a Span<T> or ref struct
    • Where the ref struct is a return type, ref or out parameter
    • Has an additional in or ref parameter (excluding the receiver)

To understand the impact it's helpful to break APIs into categories:

  1. Want consumers to account for ref being captured as a ref field. Prime example is the Span(ref T value) constructors
  2. Do not want consumers to account for ref being captured as a ref field. These though break into two categories
    1. Unsafe APIs. These are APIs inside the Unsafe and MemoryMarshal types, of which MemoryMarshal.CreateSpan is the most prominent. These APIs do capture the ref unsafely but they are also known to be unsafe APIs.
    2. Safe APIs. These are APIs which take ref parameters for efficiency but it is not actually captured anywhere. Examples are small but one is AsnDecoder.ReadEnumeratedBytes

This change primarily benefits (1) above. These are expected to make up the majority of APIs that take a ref and return a ref struct going forward. The changes negatively impact (2.1) and (2.2) as it breaks the existing calling semantics because the lifetime rules change.

The APIs in category (2.1) though are largely authored by Microsoft or by developers who stand the most to benefit from ref fields (the Tanners of the world). It is reasonable to assume this class of developers would be amenable to a compatibility tax on upgrade to C# 11 in the form of a few annotations to retain the existing semantics if ref fields were provided in return.

The APIs in category (2.2) are the biggest issue. It is unknown how many such APIs exist and it's unclear if these would be more / less frequent in 3rd party code. The expectation is there is a very small number of them, particularly if we take the compat break on out. Searches so far have revealed a very small number of these existing in public surface area. This is a hard pattern to search for though as it requires semantic analysis. Before taking this change a tool based approach would be needed to verify the assumptions around this impacting a small number of known cases.

For both cases in category (2) though the fix is straightforward. The ref parameters that do not want to be considered capturable must add scoped to the ref. In (2.1) this will likely also force the developer to use Unsafe or MemoryMarshal but that is expected for unsafe style APIs.

Ideally the language could reduce the impact of silent breaking changes by issuing a warning when an API silently falls into the troublesome behavior. That would be a method that takes a ref and returns a ref struct but does not actually capture the ref in the ref struct. The compiler could issue a diagnostic in that case informing developers such ref should be annotated as scoped ref instead.

Decision This design can be achieved but the resulting feature is more difficult to use to the point the decision was made to take the compat break.

Decision The compiler will provide a warning when a method meets the criteria but does not capture the ref parameter as a ref field. This should suitably warn customers on upgrade about the potential issues they are creating

Keywords vs. attributes

This design calls for using attributes to annotate the new lifetime rules. This also could've been done just as easily with contextual keywords. For instance [DoesNotEscape] could map to scoped. However keywords, even the contextual ones, generally must meet a very high bar for inclusion. They take up valuable language real estate and are more prominent parts of the language. This feature, while valuable, is going to serve a minority of C# developers.

On the surface that would seem to favor not using keywords but there are two important points to consider:

  1. The annotations will affect program semantics. Having attributes impact program semantics is a line C# is reluctant to cross and it's unclear if this is the feature that should justify the language taking that step.
  2. The developers most likely to use this feature intersect strongly with the set of developers that use function pointers. That feature, while also used by a minority of developers, did warrant a new syntax and that decision is still seen as sound.

Taken together this means syntax should be considered.

A rough sketch of the syntax would be:

  • [RefDoesNotEscape] maps to scoped ref
  • [DoesNotEscape] maps to scoped
  • [RefDoesEscape] maps to unscoped

Decision Use syntax for scoped and scoped ref; use attribute for unscoped.

Allow fixed buffer locals

This design allows for safe fixed buffers that can support any type. One possible extension here is allowing such fixed buffers to be declared as local variables. This would allow a number of existing stackalloc operations to be replaced with a fixed buffer. It would also expand the set of scenarios in which we could have stack style allocations, as stackalloc is limited to unmanaged element types while fixed buffers are not.

class FixedBufferLocals
{
    void Example()
    {
        Span<int> span = stackalloc int[42];
        int buffer[42];
    }
}

This holds together but does require us to extend the syntax for locals a bit. Unclear if this is or isn't worth the extra complexity. Possible we could decide no for now and bring back later if sufficient need is demonstrated.

Example of where this would be beneficial: https://github.com/dotnet/runtime/pull/34149

Decision hold off on this for now

To use modreqs or not

A decision needs to be made if methods marked with new lifetime attributes should or should not translate to modreq in emit. There would be effectively a 1:1 mapping between annotations and modreq if this approach was taken.

The rationale for adding a modreq is the attributes change the semantics of span safety. Only languages which understand these semantics should be calling the methods in question. Further when applied to OHI scenarios, the lifetimes become a contract that all derived methods must implement. Having the annotations exist without modreq can lead to situations where virtual method chains with conflicting lifetime annotations are loaded (can happen if only one part of virtual chain is compiled and other is not).

The initial span safety work did not use modreq but instead relied on languages and the framework to understand. At the same time though all of the elements that contribute to the span safety rules are a strong part of the method signature: ref, in, ref struct, etc ... Hence any change to the existing rules of a method already results in a binary change to the signature. To give the new lifetime annotations the same impact they will need modreq enforcement.

The concern is whether or not this is overkill. It does have the negative impact that making signatures more flexible, by say adding [DoesNotEscape] to a parameter, will result in a binary compat change. That trade off means that over time frameworks like BCL likely won't be able to relax such signatures. It could be mitigated to a degree by taking the same approach the language takes with in parameters and only applying modreq in virtual positions.

Decision Do not use modreq in metadata. The difference between out and ref is not modreq but they now have different span safety lifetimes. There is no real benefit to only half enforcing the rules with modreq here.

Allow multi-dimensional fixed buffers

Should the design for fixed buffers be extended to include multi-dimensional style arrays? Essentially allowing for declarations like the following:

struct Dimensions
{
    int array[42, 13];
}

Decision Do not allow for now

Violating scoped

The runtime repository has several non-public APIs that capture ref parameters as ref fields. These are unsafe because the lifetime of the resulting value is not tracked. For example the Span<T>(ref T value, int length) constructor.

The majority of these APIs will likely choose to have proper lifetime tracking on the return which will be achieved simply by updating to C# 11. A few though will want to keep their current semantics of not tracking the return value because their entire intent is to be unsafe. The most notable examples are MemoryMarshal.CreateSpan and MemoryMarshal.CreateReadOnlySpan. This will be achieved by marking the parameters as scoped.

That means the runtime needs an established pattern for unsafely removing scoped from a parameter:

  1. Unsafe.AsRef<T>(in T value) could expand its existing purpose by changing to scoped in T value. This would allow it to both remove in and scoped from parameters. It then becomes the universal "remove ref safety" method
  2. Introduce a new method whose entire purpose is to remove scoped: ref T Unsafe.AsUnscoped<T>(scoped in T value). This removes in as well because if it did not then callers still need a combination of method calls to "remove ref safety" at which point the existing solution is likely sufficient.
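A sketch of what the first option looks like from the caller's side, assuming a runtime where the parameter of Unsafe.AsRef is already scoped (as in .NET 7 and later). Unscope and UnscopeDemo are hypothetical names. The helper deliberately defeats lifetime tracking, so it is only as safe as its callers; here the ref points into a heap array, which keeps the example well defined.

```csharp
using System;
using System.Runtime.CompilerServices;

static class UnscopeDemo
{
    // Deliberately unsafe: because the parameter of Unsafe.AsRef is scoped, its result
    // is not tied to `value`, so the compiler allows returning it from this method.
    public static ref T Unscope<T>(scoped ref T value) => ref Unsafe.AsRef(in value);

    public static int Run()
    {
        int[] data = { 1 };

        // The ref refers to heap storage, so it stays valid after the call returns.
        ref int first = ref Unscope(ref data[0]);
        first = 7;

        return data[0];
    }

    static void Main() => Console.WriteLine(Run()); // prints 7
}
```

Passing a ref to a local through Unscope and returning the result would compile as well, which is exactly the hazard such a method introduces.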

Unscoped this by default?

The design only has two locations which are scoped by default:

  • this is scoped ref
  • out is scoped ref

The decision on out is to significantly reduce the compat burden of ref fields and at the same time is a more natural default. It lets developers actually think of out as data flowing outward only, whereas if it's ref then the rules must consider data flowing in both directions. This leads to significant developer confusion.

The decision on this is undesirable because it means a struct cannot return a field by ref. This is an important scenario to high perf developers and the [UnscopedRef] attribute was added essentially for this one scenario.

Keywords have a high bar and adding it for a single scenario is suspect. As such thought was given to whether we could avoid this keyword at all by making this simply ref by default and not scoped ref. All members that need this to be scoped ref could do so by marking the method scoped (as a method can be marked readonly to create a readonly ref today).

On a normal struct this is mostly a positive change as it only introduces compat issues when a member has a ref return. There are very few of these methods and a tool could spot these and convert them to scoped members quickly.

On a ref struct this change introduces significantly bigger compat issues. Consider the following:

ref struct Sneaky
{
    int Field;
    ref int RefField;

    public void SelfAssign()
    {
        // This pattern of ref reassign to fields on this inside instance methods is now
        // completely legal.
        RefField = ref Field;
    }

    static Sneaky UseExample()
    {
        Sneaky local = default;

        // Error: this is illegal, and must be illegal, by our existing rules as the 
        // ref-safe-to-escape of local is now an input into method arguments must match. 
        local.SelfAssign();

        // This would be dangerous as local now has a dangerous `ref` but the above 
        // prevents us from getting here.
        return local;
    }
}

Essentially it would mean all instance method invocations on mutable ref struct locals would be illegal unless the local was further marked as scoped. The rules have to consider the case where fields were ref reassigned to other fields in this. A readonly ref struct doesn't have this problem because the readonly nature prevents ref reassignment. Still this would be a significant back compat breaking change as it would impact virtually every existing mutable ref struct.

A readonly ref struct though is still problematic once we expand to having ref fields to ref struct. It allows for the same basic problem by just moving the capture into the value of the ref field:

readonly ref struct ReadOnlySneaky
{
    readonly int Field;
    readonly ref ReadOnlySpan<int> Span;

    public void SelfAssign()
    {
        // Instance method captures a ref to itself
        Span = new ReadOnlySpan<int>(ref Field, 1);
    }
}

Some thought was given to the idea of having this have different defaults based on the type of struct or member. For example:

  • this as ref: struct, readonly ref struct or readonly member
  • this as scoped ref: ref struct or readonly ref struct with ref field to ref struct

This minimizes compat breaks and maximizes flexibility but at the cost of complicating the story for customers. It also doesn't fully solve the problem because future features, like safe fixed buffers, require that a mutable ref struct have ref returns for fields which don't work by this design alone as it would fall into the scoped ref category.

Decision Keep this as scoped ref

ref fields to ref struct

This feature opens up a new set of ref safety rules because it allows for a ref field to refer to a ref struct. The generic nature of ByReference<T> meant that up until now the runtime could not have such a construct. As a result all of our rules are written under the assumption this is not possible. The ref field feature is largely not about making new rules but codifying the existing rules in our system. Allowing ref fields to ref struct requires us to codify new rules because there are several new scenarios to consider.

The first is that a readonly ref is now capable of storing ref state. For example:

readonly ref struct Container
{
    readonly ref Span<int> Span;

    void Store(Span<int> span)
    {
        Span = span;
    }
}

This means that when applying the method arguments must match rules we must treat a readonly ref T as a potential method output whenever T potentially has a ref field to a ref struct.

The second issue is that the language must consider a new type of escape scope: ref-field-safe-to-escape. All ref struct which transitively contain a ref field have another escape scope representing the value(s) in the ref field(s). In the case of multiple ref fields they can be collectively tracked as a single value. The default value for this for parameters is calling method.

ref struct Nested
{
    ref Span<int> Span;
}

Span<int> M(ref Nested nested) => nested.Span;

This value is not related to the escape scope of the container; that is as the container scope gets smaller it has no impact on the ref-field-safe-to-escape of the ref field values. Further the ref-field-safe-to-escape can never be smaller than the safe-to-escape of the container.

ref struct Nested
{
    ref Span<int> Span;
}

void M(ref Nested nested)
{
    scoped ref Nested refLocal = ref nested;

    // the ref-field-safe-to-escape of refLocal is still *calling method* which means the
    // following is illegal
    refLocal.Span = stackalloc int[42];

    scoped Nested valLocal = nested;

    // the ref-field-safe-to-escape of valLocal is still *calling method* which means the
    // following is still illegal
    valLocal.Span = stackalloc int[42];
}

This ref-field-safe-to-escape-scope has essentially always existed. Up until now ref fields could only point to normal struct hence it was trivially collapsed to calling method. To support ref fields to ref struct our existing rules need to be updated to take into account this new escape scope.

Third the rules for ref reassignment need to be updated to ensure that we don't violate ref-field-safe-to-escape for the values. Essentially for x.e1 = ref e2 where the type of e1 is a ref struct the ref-field-safe-to-escape must be equal.

These problems are very solvable. The compiler team has sketched out a few versions of these rules and they largely fall out from our existing analysis. The problem is there is no consuming code for such rules that helps prove out their correctness and usability. This makes us very hesitant to add support because of the fear we'll pick wrong defaults and back the runtime into a usability corner when it does take advantage of this. This concern is particularly strong because .NET 8 likely pushes us in this direction with allow T: ref struct and Span<Span<T>>. The rules would be better written in conjunction with consumption code.

Decision Delay allowing ref field to ref struct until .NET 8 where we have scenarios that will help drive the rules around these scenarios.

What will make C# 11.0?

The features outlined in this document don't need to be implemented in a single pass. Instead they can be implemented in phases across several language releases in the following buckets:

  1. ref fields and scoped
  2. [UnscopedRef]
  3. ref fields to ref struct
  4. Sunset restricted types
  5. fixed sized buffers

What gets implemented in which release is merely a scoping exercise.

Decision Only (1) and (2) made C# 11.0. The rest will be considered in future versions of C#.

Future Considerations

Advanced lifetime annotations

The lifetime annotations in this proposal are limited in that they allow developers to change the default escape / don't escape behavior of values. This does add powerful flexibility to our model but it does not radically change the set of relationships that can be expressed. At the core the C# model is still effectively binary: can a value be returned or not?

That allows limited lifetime relationships to be understood. For example a value that can't be returned from a method has a smaller lifetime than one that can be returned from a method. There is no way to describe the lifetime relationship between values that can be returned from a method though. Specifically there is no way to say that one value has a larger lifetime than the other once it's established both can be returned from a method. The next step in our lifetime evolution would be allowing such relationships to be described.

Other languages such as Rust allow this type of relationship to be expressed and hence can handle more complex scoped style operations. Our language could similarly benefit if such a feature were included. At the moment there is no motivating pressure to do this, but if there is in the future our scoped model could be expanded to include it in a fairly straightforward fashion.

Every scoped could be assigned a named lifetime by adding a generic style argument to the syntax. For example scoped<'a> is a value that has lifetime 'a. Constraints like where could then be used to describe the relationships between these lifetimes.

void M(scoped<'a> ref MyStruct s, scoped<'b> Span<int> span)
  where 'b >= 'a
{
    s.Span = span;
}

This method defines two lifetimes 'a and 'b and their relationship, specifically that 'b is at least as large as 'a. This allows for the callsite to have more granular rules for how values can be safely passed into methods vs. the more coarse grained rules present today.

Issues

The following issues are all related to this proposal:

Proposals

The following proposals are related to this proposal:

Existing samples

Utf8JsonReader

This particular snippet requires unsafe because it runs into issues with passing around a Span<T> which can be stack allocated to an instance method on a ref struct. Even though this parameter is not captured the language must assume it is and hence needlessly causes friction here.

Utf8JsonWriter

This snippet wants to mutate a parameter by escaping elements of the data. The escaped data can be stack allocated for efficiency. Even though the parameter is not escaped the compiler assigns it a safe-to-escape scope of outside the enclosing method because it is a parameter. This means in order to use stack allocation the implementation must use unsafe in order to assign back to the parameter after escaping the data.

Fun Samples

ReadOnlySpan<T>

public readonly ref struct ReadOnlySpan<T>
{
    readonly ref readonly T _value;
    readonly int _length;

    public ReadOnlySpan(in T value)
    {
        _value = ref value;
        _length = 1;
    }
}

Frugal list

struct FrugalList<T>
{
    private T _item0;
    private T _item1;
    private T _item2;

    public int Count => 3;

    public ref T this[int index]
    {
        [UnscopedRef] get
        {
            switch (index)
            {
                case 0: return ref _item0;
                case 1: return ref _item1;
                case 2: return ref _item2;
                default: throw null;
            }
        }
    }
}
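As a rough usage sketch (not part of the original sample), the indexer can be exercised as below. It assumes a C# 11 compiler and the `UnscopedRefAttribute` from `System.Diagnostics.CodeAnalysis`. The `FrugalListDemo` class and its `SumAfterWrites` method are hypothetical names introduced only for illustration, and `Count` is written as a property here so the struct compiles without an explicit constructor.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

struct FrugalList<T>
{
    private T _item0;
    private T _item1;
    private T _item2;

    public int Count => 3;

    public ref T this[int index]
    {
        // [UnscopedRef] lets the getter return a ref to a field of `this`.
        [UnscopedRef] get
        {
            switch (index)
            {
                case 0: return ref _item0;
                case 1: return ref _item1;
                case 2: return ref _item2;
                default: throw new ArgumentOutOfRangeException(nameof(index));
            }
        }
    }
}

// Hypothetical demo type, not part of the proposal.
static class FrugalListDemo
{
    public static int SumAfterWrites()
    {
        var list = new FrugalList<int>();

        // Writes go straight through the returned ref into the struct's fields.
        list[0] = 10;
        ref int second = ref list[1];
        second = 20;
        list[2] += 12;

        int sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += list[i];
        }
        return sum;
    }
}
```

Calling `FrugalListDemo.SumAfterWrites()` should yield 42, since each write lands in the local `list` rather than in a copy.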

Stack based linked list

ref struct StackLinkedListNode<T>
{
    T _value;
    ref StackLinkedListNode<T> _next;

    public T Value => _value;

    public bool HasNext => !Unsafe.IsNullRef(ref _next);

    public ref StackLinkedListNode<T> Next 
    {
        get
        {
            if (!HasNext)
            {
                throw new InvalidOperationException("No next node");
            }

            return ref _next;
        }
    }

    public StackLinkedListNode(T value)
    {
        this = default;
        _value = value;
    }

    public StackLinkedListNode(T value, ref StackLinkedListNode<T> next)
    {
        _value = value;
        _next = ref next;
    }
}

Examples and Notes

Below are a set of examples demonstrating how and why the rules work the way they do. Included are several examples showing dangerous behaviors and how the rules prevent them from happening. It's important to keep these in mind when making adjustments to the proposal.

Ref reassignment and call sites

Demonstrating how ref reassignment and method invocation work together.

ref struct RS
{
    ref int _refField;

    public ref int Prop => ref _refField;

    public RS(int[] array)
    {
        _refField = ref array[0];
    }

    public RS(ref int i)
    {
        _refField = ref i;
    }

    public RS CreateRS() => ...;

    public ref int M1(RS rs)
    {
        // The call site arguments for Prop contribute here:
        //   - `rs` contributes no ref-safe-to-escape as the corresponding parameter, 
        //      which is `this`, is `scoped ref`
        //   - `rs` contributes safe-to-escape of *calling method*
        // 
        // This is an lvalue invocation and the arguments contribute only safe-to-escape 
        // values of *calling method*. That means `local1` is ref-safe-to-escape to 
        // *calling method*
        ref int local1 = ref rs.Prop;

        // Okay: this is legal because `local1` has ref-safe-to-escape of *calling method*
        return ref local1;

        // The arguments contribute here:
        //   - `this` contributes no ref-safe-to-escape as the corresponding parameter
        //     is `scoped ref`
        //   - `this` contributes safe-to-escape of *calling method*
        //
        // This is an rvalue invocation and following those rules the safe-to-escape of 
        // `local2` will be *calling method*
        RS local2 = CreateRS();

        // Okay: this follows the same analysis as `ref rs.Prop` above
        return ref local2.Prop;

        // The arguments contribute here:
        //   - `local3` contributes ref-safe-to-escape of *current method*
        //   - `local3` contributes safe-to-escape of *calling method*
        // 
        // This is an rvalue invocation which returns a `ref struct` and following those 
        // rules the safe-to-escape of `local4` will be *current method*
        int local3 = 42;
        var local4 = new RS(ref local3);

        // Error: 
        // The arguments contribute here:
        //   - `local4` contributes no ref-safe-to-escape as the corresponding parameter
        //     is `scoped ref`
        //   - `local4` contributes safe-to-escape of *current method*
        // 
        // This is an lvalue invocation and following those rules the ref-safe-to-escape 
        // of the return is *current method*
        return ref local4.Prop;
    }
}
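The first legal case above can be turned into a small compilable sketch. It assumes a C# 11 compiler. `RefReturnDemo`, `First` and `Run` are hypothetical names introduced for illustration, and only the members needed for this case are reproduced from `RS`.

```csharp
ref struct RS
{
    ref int _refField;

    public ref int Prop => ref _refField;

    public RS(int[] array)
    {
        _refField = ref array[0];
    }

    public RS(ref int i)
    {
        _refField = ref i;
    }
}

// Hypothetical demo type, not part of the proposal.
static class RefReturnDemo
{
    // Same shape as the first return in M1: `rs` has safe-to-escape of
    // *calling method*, so the ref produced by `rs.Prop` can be returned.
    static ref int First(RS rs)
    {
        ref int local1 = ref rs.Prop;
        return ref local1;
    }

    public static int Run()
    {
        int[] data = { 1, 2, 3 };

        // The returned ref points into the heap array, so the write is visible
        // through `data` afterwards.
        ref int first = ref First(new RS(data));
        first = 42;
        return data[0];
    }
}
```

Here `RefReturnDemo.Run()` should return 42. Building the `RS` from a local via `new RS(ref local)` and returning `ref rs.Prop` is the case the rules reject, as in the `local4` example.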

Ref reassignment and unsafe escapes

The reason for the following line in the ref reassignment rules may not be obvious at first glance:

e1 must have the same safe-to-escape as e2

This is because the lifetime of the values pointed to by ref locations are invariant. The indirection prevents us from allowing any kind of variance here, even to narrower lifetimes. If narrowing is allowed then it opens up the following unsafe code:

ref struct RS { }
void Example(ref Span<int> p)
{
    Span<int> local = stackalloc int[42];
    ref Span<int> refLocal = ref local;

    // The safe-to-escape of refLocal is narrower than p. For a non-ref reassignment 
    // this would be allowed as it's safe to assign wider lifetimes to narrower ones.
    // In the case of ref reassignment though this rule prevents it as the 
    // safe-to-escape values are different.
    refLocal = ref p;

    // If it were allowed this would be legal as the safe-to-escape of refLocal
    // is *containing method* and that is satisfied by stackalloc. At the same time
    // it would be assigning through p and escaping the stackalloc to the calling
    // method
    // 
    // This is the equivalent of saying p = stackalloc int[13]!!!
    refLocal = stackalloc int[13];
}

For a ref to non ref struct this rule is trivially satisfied as the values all have the same safe-to-escape scope. This rule really only comes into play when the value is a ref struct.

This behavior of ref will also be important in a future where we allow ref fields to ref struct.

scoped locals

The use of scoped on locals will be particularly helpful to code patterns which conditionally assign values with different safe-to-escape scope to locals. It means code no longer needs to rely on initialization tricks like = stackalloc byte[0] to define a local safe-to-escape but now can simply use scoped.

// Old way 
// Span<byte> span = stackalloc byte[0];
// New way 
scoped Span<byte> span;
int len = ...;
if (len < MaxStackLen)
{
    span = stackalloc byte[len];
}
else
{
    span = new byte[len];
}

This pattern comes up frequently in low level code. When the ref struct involved is Span<T> the above trick can be used. It is not applicable to other ref struct types though and can result in low level code needing to resort to unsafe to work around the inability to properly specify the lifetime.
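A self-contained variant of the snippet above might look as follows. It is a sketch assuming a C# 11 compiler. The `ScopedLocalDemo` class, the `SumOfIndices` method and the value chosen for `MaxStackLen` are hypothetical, since the original sample leaves them unspecified.

```csharp
using System;

// Hypothetical demo type, not part of the proposal.
static class ScopedLocalDemo
{
    // Hypothetical threshold; the original sample does not define MaxStackLen.
    const int MaxStackLen = 64;

    public static int SumOfIndices(int len)
    {
        // `scoped` gives the local a safe-to-escape of the current method up front,
        // so either a stackalloc or a heap array may be assigned to it later.
        scoped Span<int> span;
        if (len < MaxStackLen)
        {
            span = stackalloc int[len];
        }
        else
        {
            span = new int[len];
        }

        for (int i = 0; i < span.Length; i++)
        {
            span[i] = i;
        }

        int sum = 0;
        foreach (int value in span)
        {
            sum += value;
        }
        return sum;
    }
}
```

With these assumptions `ScopedLocalDemo.SumOfIndices(4)` takes the stack path and `ScopedLocalDemo.SumOfIndices(100)` takes the heap path, returning 6 and 4950 respectively.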

scoped parameter values

One source of repeated friction in low level code is that the default escape scope for parameters is permissive: they are safe-to-escape to the calling method. This is a sensible default because it lines up with the coding patterns of .NET as a whole. In low level code though there is a larger use of ref struct and this default can cause friction with other parts of the span safety rules.

The main friction point occurs because of the method arguments must match rule. This rule most commonly comes into play with instance methods on ref struct where at least one parameter is also a ref struct. This is a common pattern in low level code where ref struct types commonly leverage Span<T> parameters in their methods. For example it will occur on any writer style ref struct that uses Span<T> to pass around buffers.

This rule exists to prevent scenarios like the following:

ref struct RS
{
    Span<int> _field;
    void Set(Span<int> p)
    {
        _field = p;
    }

    static void DangerousCode(ref RS p)
    {
        Span<int> span = stackalloc int[] { 42 };

        // Error: if allowed this would let the method return a reference to 
        // the stack
        p.Set(span);
    }
}

Essentially this rule exists because the language must assume that all inputs to a method escape to their maximum allowed scope. When there are ref or out parameters, including the receivers, it's possible for the inputs to escape as fields of those ref values (as happens in RS.Set above).

In practice though there are many such methods which pass ref struct as parameters that never intend to capture them in output. It is just a value that is used within the current method. For example:

ref struct JsonReader
{
    Span<char> _buffer;
    int _position;

    internal bool TextEquals(ReadOnlySpan<char> text)
    {
        var current = _buffer.Slice(_position, text.Length);
        return current.SequenceEqual(text);
    }
}

class C
{
    static void M(ref JsonReader reader)
    {
        Span<char> span = stackalloc char[4];
        span[0] = 'd';
        span[1] = 'o';
        span[2] = 'g';

        // Error: The safe-to-escape of `span` is the current method scope 
        // while `reader` is outside the current method scope hence this fails
        // by the above rule.
        if (reader.TextEquals(span))
        {
            ...
        }
    }
}

In order to work around this low level code will resort to unsafe tricks to lie to the compiler about the lifetime of their ref struct. This significantly reduces the value proposition of ref struct as they are meant to be a means to avoid unsafe while continuing to write high performance code.

This is where scoped is an effective tool on ref struct parameters because it removes them from consideration as being returned from the method according to the updated method arguments must match rule. A ref struct parameter which is consumed, but never returned, can be labeled as scoped to make call sites more flexible.

ref struct JsonReader
{
    Span<char> _buffer;
    int _position;

    internal bool TextEquals(scoped ReadOnlySpan<char> text)
    {
        var current = _buffer.Slice(_position, text.Length);
        return current.SequenceEqual(text);
    }
}

class C
{
    static void M(ref JsonReader reader)
    {
        Span<char> span = stackalloc char[4];
        span[0] = 'd';
        span[1] = 'o';
        span[2] = 'g';

        // Okay: the compiler never considers `span` as capturable here hence it doesn't
        // contribute to the method arguments must match rule
        if (reader.TextEquals(span))
        {
            ...
        }
    }
}
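For reference, a compilable version of the scoped form could be sketched as below, assuming a C# 11 compiler. The constructor on `JsonReader`, the bounds check in `TextEquals`, the use of `SequenceEqual` for content comparison, and the `ScopedParameterDemo` class with its methods are all additions made for illustration and are not part of the original sample.

```csharp
using System;

ref struct JsonReader
{
    Span<char> _buffer;
    int _position;

    // Hypothetical constructor so the sketch can be driven from a caller.
    public JsonReader(Span<char> buffer)
    {
        _buffer = buffer;
        _position = 0;
    }

    internal bool TextEquals(scoped ReadOnlySpan<char> text)
    {
        // Guard added for the sketch so Slice cannot throw on short buffers.
        if (text.Length > _buffer.Length - _position)
        {
            return false;
        }

        var current = _buffer.Slice(_position, text.Length);
        return current.SequenceEqual(text);
    }
}

// Hypothetical demo type, not part of the proposal.
static class ScopedParameterDemo
{
    static bool StartsWithDog(ref JsonReader reader)
    {
        Span<char> span = stackalloc char[3];
        span[0] = 'd';
        span[1] = 'o';
        span[2] = 'g';

        // Accepted because `text` is scoped: the stack allocated span is not
        // considered capturable by `reader`.
        return reader.TextEquals(span);
    }

    public static bool Run(string input)
    {
        var reader = new JsonReader(input.ToCharArray());
        return StartsWithDog(ref reader);
    }
}
```

Under these assumptions `ScopedParameterDemo.Run("dog food")` returns true and `ScopedParameterDemo.Run("cat")` returns false, and no unsafe code is needed at the call site.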

Preventing tricky ref assignment from readonly mutation

When a ref is taken to a readonly field in a constructor or init member the type is ref not ref readonly. This is a long standing behavior that allows for code like the following:

struct S
{
    readonly int i; 

    public S(string s)
    {
        M(ref i);
    }

    static void M(ref int i) { }
}

That does pose a potential problem though if such a ref were able to be stored into a ref field on the same type. It would allow for direct mutation of a readonly struct from an instance member:

readonly ref struct S
{ 
    readonly int i; 
    readonly ref int r; 
    public S()
    {
        i = 0;
        r = ref i;
    }

    public void Oops()
    {
        r++;
    }
}

The proposal prevents this though because it violates the span safety rules. Consider the following:

  • The ref-safe-to-escape of this is current method and safe-to-escape is calling method. These are both standard for this in a struct member.
  • The ref-safe-to-escape of i is current method. This falls out from the field lifetimes rules. Specifically rule 4.

At that point the line r = ref i is illegal by ref reassignment rules.

These rules were not intended to prevent this behavior but do so as a side effect. It's important to keep this in mind for any future rule update to evaluate the impact to scenarios like this.

Silly cyclic assignment

One aspect this design struggled with is how freely a ref can be returned from a method. Allowing all ref to be returned as freely as normal values is likely what most developers intuitively expect. However it allows for pathological scenarios that the compiler must consider when calculating ref safety. Consider the following:

ref struct S
{
    int field;
    ref int refField;

    static void SelfAssign(ref S s)
    {
        s.refField = ref s.field;
    }
}

This is not a code pattern that we expect any developers to use. Yet when a ref can be returned with the same lifetime as a value it is legal under the rules. The compiler must consider all legal cases when evaluating a method call and this leads to such APIs being effectively unusable.

void M(ref S s)
{
    ...
}

S Usage()
{
    // safe-to-escape to calling method
    S local = default; 

    // Error: compiler is forced to assume the worst and concludes a self assignment
    // is possible here and must issue an error.
    M(ref local);
}

To make these APIs usable the compiler ensures that the ref lifetime for a ref parameter is smaller than the lifetime of any references in the associated parameter value. This is the rationale for having the ref-safe-to-escape of a ref to ref struct be return only and that of out be containing method. That prevents cyclic assignment because of the difference in lifetimes.

It is also why [UnscopedRef] only promotes the ref-safe-to-escape of any ref to ref struct values to return only and not calling method. Consider that using calling method allows for cyclic assignment and would force a viral use of [UnscopedRef] for a ref struct:

ref struct S
{
    byte Field;

    [UnscopedRef]
    public Span<byte> Data => new Span<byte>(ref Field, 1);
}

void M(ref S s)
{
    // Error: passing a scoped ref to [UnscopedRef] ref 
    Span<byte> span = s.Data;
}

This is correctly illegal in that case because the compiler has to consider the pathological case that S.Data could cyclic assign via this. That forces all methods that call S.Data to further mark their ref parameters as [UnscopedRef]. The annotation spreads virally up to the method which creates the value as a local. This is why return only exists as an escape scope. It does complicate the spec / implementation but it serves to make the feature significantly more usable.

Note: this cyclic assignment problem does continue to exist for [UnscopedRef] out to ref struct because that causes the safe-to-escape and ref-safe-to-escape to be equivalent.

ref struct RS
{
    int field;
    ref int refField;
}

void M1(out RS p)
{
    // Error: from method arguments must match:
    // Step 1 would calculate the narrowest escape as *containing method*
    // Step 2 would fail the assignment check because p safe-to-escape is *return only*
    M2(out p);
}

void M2([UnscopedRef] out RS p)
{
    // The lifetimes of LHS and RHS are equivalent here and hence this is legal
    p.refField = ref p.field;
}

In terms of advanced annotations the [UnscopedRef] design creates the following:

ref struct S { }

// C# code
S Create1(ref S p)
S Create2([UnscopedRef] ref S p)

// Annotation equivalent
scoped<'b> S Create1(scoped<'a> ref scoped<'b> S)
scoped<'a> S Create2(scoped<'a> ref scoped<'b> S)
  where 'b >= 'a

readonly cannot be deep through ref fields

Consider the below code sample:

ref struct S
{
    ref int Field;

    readonly void Method()
    {
        // Legal or illegal?
        Field = 42;
    }
}

When designing the rules for ref fields on readonly instances in a vacuum the rules can be validly designed such that the above is legal or illegal. Essentially readonly can validly be deep through a ref field or it can apply only to the ref. Applying only to the ref prevents ref reassignment but allows normal assignment which changes the referred to value.

This design does not exist in a vacuum though, it is designing rules for types that already effectively have ref fields. The most prominent of which, Span<T>, already has a strong dependency on readonly not being deep here. Its primary scenario is the ability to assign to the ref field through a readonly instance.

readonly ref struct SpanOfOne
{
    readonly ref int Field;

    public ref int this[int index]
    {
        get
        {
            if (index != 0)
                throw new Exception();
            return ref Field;
        }
    }
}

This means we must choose the shallow interpretation of readonly.
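A small sketch of what the shallow interpretation permits, assuming a C# 11 compiler. The constructor, the `Increment` method and the `ShallowReadonlyDemo` class are hypothetical additions for illustration, and the sketch indexes its single element at 0.

```csharp
using System;

readonly ref struct SpanOfOne
{
    readonly ref int Field;

    // Hypothetical constructor: the only place the ref itself can be assigned.
    public SpanOfOne(ref int value)
    {
        Field = ref value;
    }

    public ref int this[int index]
    {
        get
        {
            if (index != 0)
                throw new IndexOutOfRangeException();
            return ref Field;
        }
    }

    // Legal in a readonly struct: this assigns to the referred-to value,
    // it does not reassign the ref.
    public void Increment() => Field++;
}

// Hypothetical demo type, not part of the proposal.
static class ShallowReadonlyDemo
{
    public static int Run()
    {
        int target = 1;
        var span = new SpanOfOne(ref target);

        span[0] = 41;      // write through the ref returned by the indexer
        span.Increment();  // write through the ref field from a readonly member

        return target;
    }
}
```

`ShallowReadonlyDemo.Run()` should return 42: both writes reach `target` even though every member of `SpanOfOne` is readonly. A statement such as `Field = ref other;` outside the constructor is what `readonly` on the ref field forbids.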

Modeling constructors

One subtle design question is: How are constructor bodies modeled for ref safety? Essentially how is the following constructor analyzed?

ref struct S
{
    ref int field;

    public S(ref int f)
    {
        field = ref f;
    }
}

There are roughly two approaches:

  1. Model as a static method where this is a local where its safe-to-escape is calling method
  2. Model as a static method where this is an out parameter.

Further a constructor must meet the following invariants:

  1. Ensure that ref parameters can be captured as ref fields.
  2. Ensure that ref to fields of this cannot be escaped through ref parameters. That would violate tricky ref assignment.

The intent is to pick the form that satisfies our invariants without introducing any special rules for constructors. Given that, the best model for constructors is viewing this as an out parameter. The return only nature of the out allows us to satisfy all the invariants above without any special casing:

public static void ctor(out S @this, ref int f)
{
    // The ref-safe-to-escape of `ref f` is *return only* which is also the 
    // safe-to-escape of `this.field` hence this assignment is allowed
    @this.field = ref f;
}

Method arguments must match

The method arguments must match rule is a common source of confusion for developers. It's a rule which has a number of special cases that are hard to understand unless you are familiar with the reasoning behind the rule. For the sake of better understanding the reasons for the rule we will simplify ref-safe-to-escape and safe-to-escape to simply escape-scope.

Methods can pretty liberally return state passed to them as parameters. Essentially any reachable state which is unscoped can be returned (including returning by ref). This can be returned directly through a return statement or indirectly by assigning into a ref value.

Direct returns don't pose many problems for ref safety. The compiler simply needs to look at all the returnable inputs to a method and then it effectively restricts the return value to be the minimum escape-scope of the inputs. That return value then goes through normal processing.

Indirect returns pose a significant problem because all ref are both an input and output to the method. These outputs already have a known escape-scope. The compiler can't infer new ones, it has to consider them at their current level. That means the compiler has to look at every single ref which is assignable in the called method, evaluate its escape-scope, and then verify no returnable input to the method has a smaller escape-scope than that ref. If any such case exists then the method call must be illegal because it could violate ref safety.

Method arguments must match is the process by which the compiler asserts this safety check.

A different way to evaluate this which is often easier for developers to consider is to do the following exercise:

  1. Look at the method definition and identify all places where state can be indirectly returned:
     a. Mutable ref parameters pointing to ref struct
     b. Mutable ref parameters with ref assignable ref fields
     c. Assignable ref params or ref fields pointing to ref struct (consider recursively)
  2. Look at the call site:
     a. Identify the escape scopes that line up with the locations identified above
     b. Identify the escape scopes of all inputs to the method that are returnable (don't line up with scoped parameters)

If any value in 2.b is smaller than 2.a then the method call must be illegal. Let's look at a few examples to illustrate the rules:

ref struct R { }

class Program
{
    static void F0(ref R a, scoped ref R b) => throw null;

    static void F1(ref R x, scoped R y)
    {
        F0(ref x, ref y);
    }
}

Looking at the call to F0 let's go through (1) and (2). The parameters with potential for indirect return are a and b as both can be directly assigned. The arguments which line up to those parameters are:

  • a which maps to x that has escape-scope of calling method
  • b which maps to y that has escape-scope of current method

The set of returnable inputs to the method are:

  • x with escape-scope of calling method
  • ref x with escape-scope of calling method
  • y with escape-scope of current method

The value ref y is not returnable since it maps to a scoped ref hence it is not considered an input. But given that there is at least one input with a smaller escape scope (y argument) than one of the outputs (x argument) the method call is illegal.

A different variation is the following:

ref struct R { }

class Program
{
    static void F0(ref R a, ref int b) => throw null;

    static void F1(ref R x)
    {
        int y = 42;
        F0(ref x, ref y);
    }
}

Again the parameters with potential for indirect return are a and b as both can be directly assigned. But b can be excluded because it does not point to a ref struct hence cannot be used to store ref state. Thus we have:

  • a which maps to x that has escape-scope of calling method

The set of returnable inputs to the method are:

  • x with escape-scope of calling method
  • ref x with escape-scope of calling method
  • ref y with escape-scope of current method

Given that there is at least one input with a smaller escape scope (ref y argument) than one of the outputs (x argument) the method call is illegal.

This is the logic that the method arguments must match rule is trying to encompass. It goes further as it considers both scoped as a way to remove inputs from consideration and readonly as a way to remove ref as an output (can't assign into a readonly ref so it can't be a source of output). These special cases do add complexity to the rules but it's done so for the benefit of the developer. The compiler seeks to remove all inputs and outputs it knows can't contribute to the result to give developers maximum flexibility when calling a member. Much like overload resolution it's worth the effort to make our rules more complex when it creates more flexibility for consumers.

Examples of inferred safe-to-escape of declaration expressions

Related to Infer safe-to-escape of declaration expressions.

ref struct RS
{
    public RS(ref int x) { } // assumed to be able to capture 'x'

    static void M0(RS input, out RS output) => output = input;

    static void M1()
    {
        var i = 0;
        var rs1 = new RS(ref i); // safe-to-escape of 'rs1' is CurrentMethod
        M0(rs1, out var rs2); // safe-to-escape of 'rs2' is CurrentMethod
    }

    static void M2(RS rs1)
    {
        M0(rs1, out var rs2); // safe-to-escape of 'rs2' is CallingMethod
    }

    static void M3(RS rs1)
    {
        M0(rs1, out scoped var rs2); // 'scoped' modifier forces safe-to-escape of 'rs2' to the current local scope (CurrentMethod or narrower).
    }
}

Note that the local scope which results from the scoped modifier is the narrowest which could possibly be used for the variable--to be any narrower would mean the expression refers to variables which are only declared in a narrower scope than the expression.