Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Type extracting patterns #3050

Closed
IS4Code opened this issue Dec 23, 2019 · 28 comments
Closed

Proposal: Type extracting patterns #3050

IS4Code opened this issue Dec 23, 2019 · 28 comments

Comments

@IS4Code
Copy link

IS4Code commented Dec 23, 2019

In short

object obj1 = ...;
if(obj1 is List<new T> list) // or "is List<_> list" if the type argument is not important
{
    // 'list' here is a variable of type List<T> where T is extracted in some way from obj1.GetType()
    // this section of code will be separated to another method, parametrized with T, dispatched at runtime
    var first = list[0];
    list.Add(first);
    list.Add(default(T)); // T is usable here
}
var x = default(T); // error - T is no longer "assigned" here (but could exist as a valid symbol similarly to variables)

Motivation

Imagine having your favourite sorting algorithm:

public static void LightSpeedSort<T>(IList<T> list) where T : IComparable<T>

This is the sort of method you would see in a general-purpose library of algorithms, but it can only be used when you know the type of T beforehand and you can constraint it to be IComparable<T>. What if you happened to be in a situation where sorting is optional and could improve performance of other algorithms, but can be omitted when T cannot be compared with itself?

static bool TrySort<T>(IList<T> list)
{
    Type t = typeof(T);
    if(typeof(IComparable<T>).IsAssignableFrom(t)) // check that T implements IComparable<T>
    {
        typeof(Algorithms).
        GetMethod(nameof(Algorithms.LightSpeedSort), BindingFlags.Public | BindingFlags.Static) // hope there aren't overloads
        MakeGenericMethod(t).Invoke(null, new object[]{list});
        return true;
    }
    return false;
}

This code wouldn't be very useful if its purpose was to improve performance, as reflection and dynamic invocation is not really the fastest. It can be improved by caching the result of GetMethod and perhaps keeping a collection of delegates for every encountered type, but it becomes more and more complex when you try to make it more performant.

(I am aware that Comparer<T>.Default can be used in a case like this, but there are other examples where such a class isn't available.)

Idea

Rather than a simple "type-is" operator (something like if(T : IComparable<T>)), I propose a new addition to the current pattern matching system that would enable obtaining new types from a pattern:

static bool TrySort<T>(IList<T> list)
{
    if(list is IList<new T2> sortableList where T2 : IComparable<T2>)
    {
        Algorithms.LightSpeedSort<T2>(sortableList);
        return true;
    }
}

In the inner scope, T2 can be used as a type and sortableList is guaranteed to be IList<T2> if the check succeeds. Notice that here T2 doesn't even need to be identical to T; in fact, the function need not be generic at all and list can be object, which is useful when it comes from an external source (deserialization) or dynamic context.

Such a code would be, in essence, transformed to something like this:

static bool TrySort<T>(IList<T> list)
{
    if(/* dynamically call Inner, providing list */ is bool result)
    {
        return result;
    }
    return false;
}

static bool Inner<T2>(IList<T2> sortableList) where T2 : IComparable<T2>
{
    Algorithms.TrySort(sortableList);
    return true;
}

Possible implementations of the dynamic call are shown later, but they are not the core point of this proposal.

Syntax

The syntax is an enhancement to the current type pattern:

<expr> is <type> <name> where <constraints>

The type would additionally allow new <tname> anywhere in the type to denote a new type extracted from the pattern. <constraints> are standard generic constraints restricting the newly introduced types. Like in the normal type pattern, <name> can be omitted.

The type referred to with new <tname> is usable from the moment it is encountered, and can be used in the rest of the pattern, without the new (is (new T, T) checks for (object, object), (string, string) etc.).

In addition to new T, _ could also be used as a sort of a wildcard without any constraints, i.e. as in obj is List<_> l. This is treated in the same way as new T, but the type cannot be referred to in any way after that.

new T doesn't have to be used only as a generic argument; it can be the whole type, as in obj is new T val where T : struct, IComparable<T>.

The provided constraints should be compatible with any generic type definition that is used with an extracted type. For example is List<Nullable<new T>> is not valid as T on Nullable<T> is constrained to struct. is List<Nullable<new T>> where T : struct is valid.

Semantics

If the match succeeds, the extracted types are selected in any way that satisfies the pattern. This means that if the type can be matched in two or more ways, the compiler or the runtime is allowed to obtain the types in the most efficient way, e.g. ((IEnumerable<int>)obj) is IEnumerable<new T> simply makes T an alias of int as that is guaranteed to be always possible.

Implementation

I do not propose a concrete method of handling this feature, but here are examples of what it could be transformed to:

DLR-based

Without generating any complex code and transforming compile-time code to runtime checks, the DLR can be used to merge the checking of type and invoking the method into a single call. Describing this in terms of dynamic operations also makes it easier to define:

Using the DLR, the code above would simply be analogous to this:

static bool TrySort<T>(IList<T> list)
{
    try{
        return Inner((dynamic)list);
    }catch(RuntimeBinderException) // the call could be configured in a way not to generate the exception at all
    {
        
    }
    return false;
}

static bool Inner<T2>(IList<T2> arg) where T2 : IComparable<T2>
{
    Algorithms.TrySort(arg);
    return true;
}

When used in a switch statement, multiple Inner methods could be produced, performing overload resolution at runtime as well (but only when the order of case statements follows the order of specialization of the methods).

This relies on the capabilities (and limitations) of the DLR, but while it may use the optimizations already in place for dynamic, it has the downside that the current syntax doesn't give you any hint that the DLR is use (therefore Microsoft.CSharp is needed).

A different syntax could be imagined, one that reflects this fact better, such as obj1 dynamic is List<new T> list (or possibly obj1 is dynamic List<new T> list, but that sounds more like shapes/duck-typing to me).

Reflection-based

The code above can be roughly translated to this:

static bool TrySort<T>(IList<T> list)
{
    if(
        list.GetType().GetInterfaces()
        .Where(inf => typeof(IList<>).Equals(inf.GetGenericTypeDefinition())
        .Select(inf => inf.GetGenericArguments()[0])
        .Where(arg => typeof(IComparable<>).MakeGenericType(arg).IsAssignableFrom(arg))
        .FirstOrDefault() is Type t
    ){
        return (bool)methodof(Inner<>).MakeGenericMethod(t).Invoke(null, new object[]{list});
        // methodof emitted as ldtoken
    }
    return false;
}

static bool Inner<T2>(IList<T2> arg) where T2 : IComparable<T2>
{
    Algorithms.TrySort(arg);
    return true;
}

The actual transformed code would be of course more complex, employing caching to reduce unnecessary calls as much as possible. With some help from the runtime, it can even be completely removed if T already matches the pattern in one instance of the method for a specific generic argument.

Precise algorithm to use during transformation 1) Move the scope where the new types are usable into another method with those types as generic parameters (with the same constraints). The first parameter of the method is the original value, the rest are references to the variables used within the scope. If the original method has generic parameters, the new method will be located in a new generic type with the same parameters. The first statement of the method casts the original value to the matched type (if the cast fails for some strange reason, the original check will fail then as well).
If there are `return`, `break`, `continue`, or `goto` statements, the method can return a signalling `int` which is used to determine what to do after the inner method returns. `await` and `yield` might also be implementable, assuming the inner method would return `Task` or `IEnumerable`. Using `__arglist` inside the scope has to be hoisted before the inner method is called, and passed via `RuntimeArgumentHandle`.
  1. Generate a new delegate type to be used for the specific instantiations of the new method, without the new generic parameters. Create a map from System.Type to the delegate type, used for every encountered type of the original value when necessary.

  2. Determine if there are shortcuts to satisfy the pattern without relying on the runtime type of the object. For example, ((IList<T>)obj) is IList<new T2> where T2 : struct can be satisfied by checking if T is a non-nullable value type (in a specific instantiation of the method) and the result can be simply cached, if the check succeeds (if not, it goes to the full check). Similarly, if the pattern can be statically proven or disproven that it can be matched (for example if the original type is a sealed type not deriving from MarshalByRefObject), the compiler can simply emit a constant true or false and select the proper types without any runtime overhead. If the original value is null, the pattern is never matched.

  3. If the match can be statically determined, the result is used. If the match depends on generic parameters of the method or its containing type, a bool value is stored that is initialized once for every combination of generic arguments. If the result of the match isn't determined this way (or the generic check fails), a full runtime check occurs:

    • The type of the value is obtained. If the object implements ICustomTypeProvider, its result can be used first, or another interface can be invented if better dynamic support is needed.
    • The delegate map is searched first to see if the result has already been determined. If the type isn't stored in the map, the check proceeds, otherwise the delegate is called (or the match fails if it is null).
    • Reflection is performed to match and extract the types. If the pattern is a sealed type, the types are directly compared. If the pattern is a class, the base types are traversed to find it. If the pattern is an interface, the interfaces the type implements are iterated. If the pattern is an SZ or MD array (is new T[], is new T[,] etc.), the corresponding properties are checked.
    • If the type is an array, the matching proceeds to its element type. If the type is a generic type, its arguments are checked (following the specific covariance and contravariance rules). For all the newly encountered element types or type arguments, the check proceeds further inside the pattern, until the extracted type (new T) is found as one of the concrete types.
    • This process produces a sequence of System.Type tuples for all the new extracted types. The first tuple that matches all the contraints is used, or the match fails if there is no such tuple.
  4. If the types can be statically found, and the runtime cached check succeeds, the inner method is instantiated at compile-time and called normally. Otherwise, if the full runtime check succeeded, the inner method is taken and instantiated via MakeGenericMethod, then bound to the delegate, stored in the map, and invoked via the delegate.

Why a language feature?

The same question can be asked for the dynamic keyword and many other features. The whole transformation can be performed manually, but a simple implementation will be very inefficient, while as it becomes more complex, it also becomes more error-prone and could suffer from oversights, like not being thread-safe etc. Incorporating it into the language allows representing a complicated system in a simple and intuitive way, and it is more flexible when you make changes to the code.

The compiler can also emit one thing that cannot be expressed manually: ldtoken for methods, meaning the inner method can be always found without worrying about its name. Other things that the runtime might eventually support could be used to further improve the performance.

Lastly, the compiler is in the position to select the optimal strategy when analyzing the type, deducing as much information at compile-time as possible. In the first case with the IComparable<T> check, the compiler would be able to unify T with T2 and "add" IComparable<T> to it, so the inner scope would be able to call LightSpeedSort. The resulting code would be unverifiable (because the method itself doesn't specify the constraint), but valid nevertheless. This cannot be normally done, as the compiler doesn't allow ignoring generic constraints in any way.

Scoping and unbound variables

In the case of if(x is T y), y is in scope in the inner statement or block (where it is assigned), but it is also in scope in the code that follows if (where it is unassigned). This is different to if(x is IList<new T> y), as the type of y is not defined after the if and so the variable is not defined either. If consistency is desired, it can be implemented in any of these ways:

  • If the check failed and condition is false, y remains unassigned. While y itself exists as a variable, its type is not fully defined and can never be assigned again.

  • If the check failed and condition is false, y remains unassigned. The type of y shall be considered unbound until it is assigned a value that can be statically matched with the type pattern of y.

  • The code that follows and uses y becomes part of the inner method, which is instantiated (at compile-time) with the types inferred from the assigned value. For example:

    static int Method(object obj)
    {
        if(!(obj is List<new T> l)
        {
            l = new List<float>();
        }
        return l.Count;
    }
    
    // This is transformed to:
    
    delegate int InnerDelegate(object obj);
    
    static void Method(object obj)
    {
        if(/* checks that obj matches the pattern and produces delegate "del" from an instantiation of Inner */)
        {
            return del.Invoke(obj);
        }else{
            return Inner<float>(obj);
        }
    }
    
    static int Inner<T>(object obj)
    {
        if(!(obj is List<T> l))
        {
            l = new List<T>();
        }
        return l.Count;
    }

Note that if l is unassigned, its type is (for the moment) effectively an unbound List<T>, and the first assignment to the variable binds it to List<float>. Before any assignment, T is also considered to be "unassigned". This feature can be optionally extended to any variable definition:

List<new T> list where T : struct;
if(type == typeof(int))
{
    list = new List<int>();
}else{
    list = new List<long>();
}

In each of the branches, T is concretely determined, but after this code, list is assigned but T is only usable based on its constraints (value type). Again, this would be transformed to an inner generic method, instantiated at compile-time with either int or long.

default and as

This mechanism can be extended to as as well. In case the pattern is a nullable type (reference or Nullable<T>) and the check fails, there is a need to assign a default value and a default type like in this case:

if(!(obj is IList<new T> l))
{
    l = null; // or l = default;
}

// This is analogous to:
var l = obj as IList<new T>;

The branching can be implemented as outlined in the previous section, but T still has to be provided with a "default" value.

In this case, the compiler could generate a "dummy" unique type that satisfies all its constraints. Unless new() or struct is used, the type would be abstract, so no potential interfaces have to be implemented. Otherwise, the necessary methods would be implemented as throwing NotImplementedException or another exception.

For example, var l = obj as List<new T> where T : struct, IEquatable<T> is transformed as follows:

if(/* extract del of Inner like before */)
{
    del.Invoke(obj);
}else{
    Inner<Dummy1>(obj);
}

void Inner<T>(object obj) where T : struct, IEquatable<T>
{
    var l = obj as List<T>;
}

struct Dummy1 : IEquatable<Dummy1>
{
    public bool Equals(Dummy1 arg) => throw new NotImplementedException();
}

l would be only good for performing a null check, but at least that common pattern can be used. The ?? operator might be used to provide a better default value (like with if(!(obj is…): (obj as List<new T>) ?? new List<object>();, again separating the rest of the code into a generic method like before with List<float>.

Examples

public static int Count(object obj)
{
    switch(obj)
    {
        case ICollection col1: return col1.Count;
        case ICollection<_> col2: return col2.Count;
    }
    throw new ArgumentException("Argument is not a collection.", nameof(obj));
}

// roughly transformed to:

delegate bool InnerMethodDelegate(object obj, out int result);
static readonly Type InnerMethodDelegateType = typeof(InnerMethodDelegate);
static readonly Dictionary<Type, InnerMethodDelegate> typeCache = new Dictionary<Type, InnerMethodDelegate>();
static readonly Type ICollectionType = typeof(ICollection<>);

class HiddenClass
{
    public static bool InnerMethod<T>(object obj, out int result)
    {
        if(obj is ICollection<T> col2)
        {
            result = col2.Count;
            return true;
        }
        result = default;
        return false;
    }
}

static readonly MethodInfo InnerMethodInfo = typeof(HiddenClass).GetMethod(nameof(HiddenClass.InnerMethod));
// = methodof(HiddenClass.InnerMethod<>)

public static int Count(object obj)
{
    if(obj is ICollection col1)
    {
        return col1.Count;
    }
    if(obj != null)
    {
        Type objType = obj.GetType();
        if(!typeCache.TryGetValue(objType, out var del))
        {
            foreach(var interf in objType.GetInterfaces())
            {
                if(interf.IsGenericType && ICollectionType.Equals(interf.GetGenericTypeDefinition()))
                {
                    Type argType = interf.GetGenericArguments()[0];
                    try
                    {
                        var methodInst = InnerMethodInfo.MakeGenericMethod(argType);
                        del = (InnerMethodDelegate)Delegate.CreateDelegate(InnerMethodDelegateType, methodInst);
                        break;
                    }catch(ArgumentException)
                    {

                    }
                }
            }
            typeCache.Add(objType, del);
        }
        if(del != null && del.Invoke(obj, out int result))
        {
            return result;
        }
    }
    throw new ArgumentException("Argument is not a collection.", nameof(obj));
}
@HaloFour
Copy link
Contributor

This seems overly complex, relies on a lot of reflection and is really bizarre in that it introduces generic type parameters in the middle of a method. IMO the effort isn't worth the reward. I would need to see more and common use cases.

@bigredd0087
Copy link

Your example can already be implemented currently without any changes to the language by adding a constraint to TrySort:

static bool TrySort<T>(IList<T> list) where T : IComparable<T>
{
        Algorithms.LightSpeedSort<T>(list);
        return true;
}

@IS4Code
Copy link
Author

IS4Code commented Dec 23, 2019

@HaloFour Perhaps yes, but there are also other features in the language that use reflection. dynamic, as I mentioned, expression trees, and the new() generic parameter constraint uses Activator.CreateInstance under the hood.

@bigredd0087 The purpose of my example was to conditionally sort a list with TrySort only if that operation is available. Checking whether a list is restricted to only contain objects that are comparable is something that can only be done with reflection.

@bigredd0087
Copy link

The purpose of my example was to conditionally sort a list with TrySort only if that operation is available. Checking whether a list is restricted to only contain objects that are comparable is something that can only be done with reflection.

Okay, I did not understand that part. Can you expand on your example? I'm having a hard time seeing what kind of code would call TrySort without knowing at compile-time whether or not T implements IComparable.

@IS4Code
Copy link
Author

IS4Code commented Dec 23, 2019

@bigredd0087 As I said, there could be an operation that works well if the list is sorted, but might not require it. However, I might not be good at examples and there could be better cases where this would be an advantage.

Generally, if you have an object and you want to treat it as a specific type, you use polymorphism and is or as. Before generics, in C# 1, you would simply check is IList and you are done. But now a list is IList<T> and the closest non-generic interface to that is IEnumerable.

This means you can have a list without having any way of knowing that, unless you use reflection. You could go full dynamic on it, but then you lose any potentially useful compile-time checking, because you know you are just calling methods on IList<>.

So let's say you wanted to add a count function to some dynamically typed embedded scripting language where you can also store handles to .NET objects. You could check if the argument is System.Array and then get its length, but you cannot do the same for any collection, because there is no interface common to all collections (ICollection<T> doesn't implement ICollection) aside from IEnumerable. You would want to do something like this:

if(obj is ICollection<> col)
{
    return col.Count;
}

But that is not possible, you cannot obtain the property in this way. In Java, you could do that (obj instanceof List<?>) as generics are a compile-time feature, but here ICollection<string> and ICollection<int> have nothing in common.

I have realized the DLR can be used to perform the type matching:

Inner((dynamic)obj);

static int Inner<T>(ICollection<T> col)
{
    return col.Count;
}

This infers the type arguments from the value if possible, but there are still some caveats. RuntimeBinderException is thrown, so you have to catch it but you also have to be sure that it was coming from the call itself and not from inside of the method. Also there are some limitations related to the dynamic dispatch, like decreased performance and the fact you cannot use some types as parameters, like Span<T>.

@bigredd0087
Copy link

@IllidanS4 It is still not clear to me why generic constraints are not sufficient to achieve what you are asking for. For instance if I call TrySort the compiler knows whether or not T implements IComparable. In fact TrySort isn't even needed. I can just call LightSpeedSort directly and the compiler will tell me at compile-time whether or not I can use that function with the type I am feeding it.

So let's say you wanted to add a count function to some dynamically typed embedded scripting language where you can also store handles to .NET objects. You could check if the argument is System.Array and then get its length, but you cannot do the same for any collection, because there is no interface common to all collections (ICollection doesn't implement ICollection) aside from IEnumerable.

This is why Enumerable provides the Count extension method. And your proposal does not solve this problem. The example you gave would only work for types that implement ICollection, which as you say, not all collections implement.

@IS4Code
Copy link
Author

IS4Code commented Dec 23, 2019

@bigredd0087 You would have to add where T : IComparable<T> to every method that would happen to call TrySort even if you need it in only one specific place. The point was to optionally sort a list if you determine at runtime that the list is actually sortable. Propagating where T : IComparable<T> to every type and method that calls LightSpeedSort, which is what you suggest, will make the code accept only collections that are always sortable, but you wouldn't be able to consume the rest.

My example would work, because I said that not all collections implement ICollection. The point is that the rest implements ICollection<T> for some T, so if you could find the T at runtime, Count becomes accessible at compile-time. Also the method is Enumerable.Count<T>, so you cannot count a collection whose element type you don't know at compile time. The only way would be iterating the collection as a sequence and counting it manually, but you risk hanging your program on that if you encounter a never-ending sequence.

By the way, do the "angle brackets" correctly render in your browser, or is that an issue of GitHub's quotation? In your quote of my message, you copied "ICollection doesn't implement ICollection", but the text was "ICollection[T] doesn't implement ICollection" if I replace the parentheses).

@HaloFour Also regarding the uncertainty about introducing a new type in the middle of a method, I understand that it might be confusing, but it essentially makes a section of the code generic and infers its generic arguments from a runtime value, which is cool in my opinion. There was also a time when defining a variable in the middle of a function (or an expression) was weird, and the new keyword here at least makes the intention clear.

At least this is less confusing than changing the characteristics of a type based on scope, like what happens in the alternate solution to the issue:

if(T : IComparable<T>)
{
    // here T implements IComparable<T>
}
// here it does not

@bigredd0087
Copy link

You would have to add where T : IComparable to every method that would happen to call TrySort even if you need it in only one specific place. The point was to optionally sort a list if you determine at runtime that the list is actually sortable. Propagating where T : IComparable to every type and method that calls LightSpeedSort, which is what you suggest, will make the code accept only collections that are always sortable, but you wouldn't be able to consume the rest.

To me this being a problem is suggestive that the code is suffering from a larger issue of not enough separation of concerns. But I digress. :)

This code will accomplish what you want and should be quite performant:

static class Extensions
{
    public static bool TrySort<T>(this IList<T> list) => new Sorter<T>().TrySort(list); 
}

struct Sorter<T>
{
    static readonly bool IsComparable = typeof(IComparable<T>).IsAssignableFrom(typeof(T));

    public bool TrySort(IList<T> list)
    {
        if(IsComparable)
        {
            LightSpeedSort(list);
            return true;
        }
        else
            return false;
    }
}

//usage
var res = new int[] { 3, 2, 1 }.TrySort();

@bigredd0087
Copy link

My bad. The code that I put up doesn't work. I left out the constraint on LightSpeedSort. You could use that code for caching the reflection calls though.

@canton7
Copy link

canton7 commented Dec 24, 2019

I'm personally against language features which require reflection under the hood. They hide the cost of the operation from the user, and result in unexpected TargetInvocationExceptions if something goes wrong.

For example, in:

if (list is IList<new T2> sortableList where T2 : IComparable<T2>)
{
    Algorithms.LightSpeedSort<T2>(sortableList);
}

If Algorithms.LightSpeedSort throws, calling code will get a TargetInvocationException instead the underlying exception, which is unexpected.

I'm also not sure that a compiler feature which would produce unverifiable code will fly...


If something like memberof (#712) were implemented (with the necessary support for generic methods), I think your problem could be solved using a helper method, rather than requiring a new language feature. Something like:

if (InvokeIfConstraintsMet(methodof(Algorithms.LightSpeedSort), list))
{
    ....
}

@IS4Code
Copy link
Author

IS4Code commented Dec 24, 2019

@canton7

If Algorithms.LightSpeedSort throws, calling code will get a TargetInvocationException instead the underlying exception, which is unexpected.

Not at all. With the transformation I described, the actual method will be called via a delegate, which doesn't do anything with the exceptions that are thrown inside. The debugger will show the correct line; only the stack trace would reveal the hidden inner method, but that also already happens for await and yield which separate different parts of the code.

Your InvokeIfConstraintsMet would be inefficient without any caching, and it would also require a delegate type in order to be comparable in efficiency with is.

I'm also not sure that a compiler feature which would produce unverifiable code will fly...

Well, there are many features that do this. Even without unsafe, your average modern code is probably unverifiable. ref returns are unverifiable. Passing readonly field as in parameter is unverifiable. [FieldOffset] is unverifiable if fields overlap. Even the proposal to allow "noinit" local variables makes the code unverifiable.

Like in all these cases, it would be the compiler's responsibility to ensure that the resulting code, even though unverifiable, is actually valid. And even in this case (in the unification of T and T2), this step can be skipped and the full runtime check could be performed, resulting in a 100% verifiable code (as the constraint checking will be done at runtime with MakeGenericMethod).

@xZise
Copy link

xZise commented Jan 6, 2020

This seems to be an interesting proposal for an issue I've had where I needed to compare two objects, but I couldn't cast the object to IComparable<T>, because I don't know T beforehand.

At the moment I've been using dynamic, but maybe it would be possible with type safety:

private int Compare(object nodeValue1, object nodeValue2)
{
    // Here there should be no case where both are null, so value1 will never be null.
    dynamic value1 = nodeValue1 ?? nodeValue2;
    dynamic value2 = nodeValue1 == null ? null : nodeValue2;
    int result = value1.CompareTo(value2);
    if (nodeValue1 == null)
    {
        result = -result;
    }
    return result;
}

@IS4Code
Copy link
Author

IS4Code commented Jan 6, 2020

@xZise That is a good use-case, and I think the syntax could be:

if(value2 != null)
{
    if(value1 is IComparable<new T> obj when value2 is T arg)
    {
        result = obj.CompareTo(arg);
    }
}else{
    if(value1 is IComparable<new T> obj where T : class)
    {
        result = obj.CompareTo(null);
    }
}

There is a slight modification to the original approach as when has to be accounted for, and so is IComparable<new T> must extract all types that satisfy the pattern and not just the first one, as the condition has to be checked for all of them.

@canton7
Copy link

canton7 commented Jan 6, 2020

Why when not &&?

@IS4Code
Copy link
Author

IS4Code commented Jan 6, 2020

@canton7 Because the pattern would be solely value1 is IComparable<new T> and thus satisfied by any type T. If the type implements IComparable<string> and IComparable<int>, it could select string even if value2 is int with &&.

@canton7
Copy link

canton7 commented Jan 6, 2020

value1 is IComparable<new T> obj && value1 is T && value2 is T?

I'm not a fan of this syntax, but there's no point in being needlessly complex!

@IS4Code
Copy link
Author

IS4Code commented Jan 6, 2020

@canton7 It might be complex but not really needlessly. Imagine this situation:

object a = 1;
bool match = a is int x && Predicate(x);

You'd expect Predicate to be called at most once, and indeed that's what happens. Now let's take this:

object a = 1;
bool match = a is new T x && Predicate<T>(x);

is new T can be satisfied with many types: object, ValueType, int, IComparable, IComparable<int>, IConvertible, IEquatable<int>, IFormattable. The point is that the implementation would be free to choose any of these (and would probably go with object as that's the easiest to try), and once it chooses T, it becomes "fixed" (for the concrete type of a). When Predicate<T>(x) is executed, T should already be determined.

If what you suggest was possible, Predicate could be called many types before finding the correct type, which is counterintuitive and (in my opinion) more complex than extra when (which is already allowed in switch), as it would make is new T depend on the whole expression where it is used.

I am not sure if ambiguous patterns are possible at the moment (i.e. patterns that can be satisfied in more than one way), but there is another analogue to this in the language:

try{...}
catch(Exception e) when(Predicate(e))
{

}

versus

try{...}
catch(Exception e)
{
    if(Predicate(e))
    {
        ...
    }
}

Your suggestion would be correct if we assumed that at most one instantiation of IComparable<T> is ever implemented by a type, but even if this might be safe to assume in the case of this interface, it might not hold in general.

@rhaokiel
Copy link

rhaokiel commented Jan 16, 2020

This proposal fills this specific niche that I run into quite a bit:

I prefer to make very generic strongly-typed controls and interfaces as possible. The less repeated code for unique instances the better. The issue comes in when implementing the fine details for specific cases. Generics currently offer no way of splitting flow based on whether the underlying type supports a given detail or not. The only way to split the flow is through a costly Reflection call, or by creating significant amounts of duplicate code, just so a specific instance can have a unique fine detail.

Generic constraints on methods don't fill this niche, because the constraint must be propagated all the way up the stack until there is no more generic. But at compile-time, a specific instance/call will have the type information and so should be able to fill in whether a specific fine detail should be added or not.

My latest example, consider IComparer, IComparer<T>, and Comparison<T>. I'd like to be able to leave the door open for any of these to be used for sorting a column in a ListView. The ListView columns would be able to contain any item that a developer wants to put in and is therefore catch-all type object. And I could put the burden of conforming to a specific method back on the using developer. But would they realize some of the implications of conforming to the interface IComparer so that I can give back whatever object and get a comparison value? One such implication is that if you use Comparer<T>.Create(Comparison<T>) and I use IComparer.Compare, null values will not make it back to the delegate, because they will be pre-handled by Comparer<T>. My work around:

public static IComparer Create<iT>(IComparer<iT> cmp) => Create<iT>(cmp.Compare);
public static IComparer Create<iT>(Comparison<iT> cmp)
{
    if (typeof(iT).IsClass)
        return (IComparer)Activator.CreateInstance(typeof(ComparisonAdapter<>).MakeGenericType(typeof(iT)), cmp);
    if (typeof(iT).IsAssignableToGeneric(typeof(Nullable<>), out Type[] el))
        return (IComparer)Activator.CreateInstance(typeof(NullableComparisonAdapter<>).MakeGenericType(el), cmp);

    // fallback, if struct is not nullable then let the default Comparer handle null values the way it wants
    return Comparer<iT>.Create(cmp);
}

That could easily be handled with generic constraints as so... except:

public static IComparer Create<iT>(IComparer<iT> cmp) where iT : class => new ComparisonAdapter<iT>(cmp.Compare);
public static IComparer Create<iT>(Comparison<iT> cmp) where iT : class => new ComparisonAdapter<iT>(cmp);
public static IComparer Create<iT>(IComparer<iT> cmp) where iT : struct => Comparer<iT>.Create(cmp.Compare);
public static IComparer Create<iT>(Comparison<iT> cmp) where iT : struct => Comparer<iT>.Create(cmp);
public static IComparer Create<iT>(IComparer<iT?> cmp) where iT : struct => new NullableComparisonAdapter<iT>(cmp.Compare);
public static IComparer Create<iT>(Comparison<iT?> cmp) where iT : struct => new NullableComparisonAdapter<iT>(cmp);

... except that calling from a non-constrained generic method, the compiler would complain that it doesn't know which function to call. And the compiler would also complain that some of those method definitions are duplicates of one another. Not to mention all the duplicate code that I had to copy-paste.

I am very much in favor of this proposal.

@macsux
Copy link

macsux commented Feb 6, 2021

Seems like this is an ask to add equivalent of Java wildcard generics which has a lot of usefulness. Covariance /contravariance on interfaces is a lot more restrictive and does not address all usecases, requiring falling back to full reflection. https://docs.oracle.com/javase/tutorial/extra/generics/wildcards.html

@IS4Code
Copy link
Author

IS4Code commented Feb 6, 2021

@macsux In a sense, but it's more powerful. Imagine being able not only to cast something to List<*> (List<_> in the proposal), but also to get the actual thing the wildcard stands for and bring that into scope, if the pattern succeeded.

@macsux
Copy link

macsux commented Feb 6, 2021 via email

@HaloFour
Copy link
Contributor

HaloFour commented Feb 6, 2021

There are some really neat things you can do with generic wildcards in Java and I think it would be cool to see that in C#, but that would certainly involve some serious runtime changes to how generics work. In Java those generic type arguments don't exist at runtime so the flexibility comes from the compiler trusting that what you're doing is safe by verifying that you're staying within the generic bounds (generic capture compiler error messages are the most fun to troubleshoot!). But to the runtime that List<String> is just a List and those methods all accept/return Object. In .NET List<string> and List<object> are two distinct and incompatible types entirely, and while "open" generic types can be referenced by reflection you can't do much of anything with them. So the runtime would need a way to describe a variable/member of type List<?> or List<T : U> and then enforce the appropriate variance down to the member level. IMO that would be a pretty huge undertaking and probably not considered worth it given that there are other approaches to solving these problems without having to get clever with generics.

I'm really curious if/how Java will handle generic wildcards when Valhalla ships which is supposed to bring value types and generic specialization to the JVM. I'd have to imagine that, like in .NET/C#, variance will never be legal with value types.

@macsux
Copy link

macsux commented Feb 6, 2021

"In .NET List and List are two distinct and incompatible types entirely"

There are ways to bypass runtime check on this. You can already assign List to IEnumerable, but not IList. The reason is list exposes partial API surface that would make certain operations not safe, but some are perfectly valid. Here's a very interesting demonstration that proves what you say above can't be done, actually is today:

    var a = new List<string>() { "foo", "bar" };
    List<object> b = Unsafe.As<List<object>>(a);
    string i = (string)b[0]; // works
    b[0] = "hello"; // works
    b.Add(1); // runtime error, but can condition can be detected via compiler (or analyzer)

The issue is there are a lot of ways to make this crash your app right now because you've essentially bypassed all type checks by casting one memory space to another (no different than using pointers). These checks can be moved into the compiler for cases where runtime doesn't allow for them.

@HaloFour
Copy link
Contributor

HaloFour commented Feb 6, 2021

@macsux

There are ways to bypass runtime check on this.

There's a lot of very unsafe things you can do but definitely shouldn't do. This is one of them. You're exploiting the fact that the runtime happens to generate a single implementation for reference types as an optimization, but this is an implementation detail and there's no guarantee that this is the case, certainly not one that the compiler can enforce. Try that with struct generic type arguments and I'm sure that things go off the rails very quickly.

@IS4Code
Copy link
Author

IS4Code commented Feb 6, 2021

@HaloFour Maybe for the beginning, the runtime could simply provide a fast way to call a generic method when one of the type arguments is provided at runtime, like this (pseudocode):

bool Method<T>(object a)
{
    return a is List<T>;
}

void Test()
{
    object list = new List<string>();
    object str = "x";
    Method<str.GetType() as class>(list);
}

Of course this can be done with MakeGenericMethod, but I presume an intrinsic could speed things up significantly.

Of course the call can fail if the type violates generic constraints, so some way to signal that is also needed. For the remaining type matching algorithms, they can be put in some package, since it is just a matter for getting types from a type at runtime, and using them when calling a method.

The failure of the generic method call could then become a part of the result of the match:

if(list is new TList where TList : IList<TElem>)
{

//translated to:

InnerMethod<list.GetType() as class, /* infer */>(list);
void InnerMethod<TList, TElem>(TList list) where TList : IList<TElem>
{

Here the runtime is also required to infer TElem, but that can be resolved in C# as well (in which case the constraint will only be used for verification that the extraction was correct).

@declard
Copy link

declard commented Jun 7, 2022

This is somewhat similar to pattern matching on existentially qualified data in Haskell which brings a type into scope, but there it works without type casting and the types are erased in runtime: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/existential_quantification.html

In C# it may be problematic since is construction brings variable into the whole scope after it, since that's already the used approach, for the sake of consistency let it be this way. What if another construction is used, for example:

List<T> LightSpeedSort<T>(List<T> list) ...
...
if (try LightSpeedSort(list) as object result) ...
void LightSpeedSort<T>(List<T> list) ...
...
if (try LightSpeedSort(list)) ...
IList LightSpeedSort<T>(List<T> list) ...
...
if (try LightSpeedSort(list) as var result) ...

try would try to find a cast from whatever static type list has to whatever the argument type of LightSpeedSort is, while bringing the type into scope of the function, checking if the constraints are satisfied, prohibiting the result type from running out by requiring var not used if the result type of the func depends on the inferred type, and checking a cast exists from the result type to the target (as) type.

This way we minimize scope of changes by introducing a single yet complex operator.

The downside is that a separate function is required.

@orthoxerox
Copy link

orthoxerox commented Aug 10, 2022

I've just run into a situation where something like that would've been useful. I basically had a hierarchy of abstract types like Foo, Foo<T> : Foo and Foo<T, U> : Foo, from which I derived a bunch of concrete types. I also had to create a separate class that processed a list of Foos slightly differently based on their most derived abstract type, and I couldn't stick this logic into virtual methods on the types themselves, because separation of concerns.

So, without thinking, I wrote foo switch { Foo<T, U> foo2 => and realized I couldn't do that. Something like a type extracting pattern would've definitely helped me there.

@IS4Code
Copy link
Author

IS4Code commented Aug 11, 2022

@orthoxerox

I've just run into a situation where something like that would've been useful. I basically had a hierarchy of abstract types like Foo, Foo<T> : Foo and Foo<T, U> : Foo, from which I derived a bunch of concrete types. I also had to create a separate class that processed a list of Foos slightly differently based on their most derived abstract type, and I couldn't stick this logic into virtual methods on the types themselves, because separation of concerns.

So, without thinking, I wrote foo switch { Foo<T, U> foo2 => and realized I couldn't do that. Something like a type extracting pattern would've definitely helped me there.

These days I use the dynamic overload resolution to achieve basically the same thing. You could do this:

void SwitchFoo(Foo foo)
{
    CaseFoo((dynamic)foo);
}

void CaseFoo(Foo foo)
{
    //...
}

void CaseFoo1<T>(Foo<T> foo1)
{
    //...
}

void CaseFoo2<T, U>(Foo<T, U> foo2)
{
    //...
}

I can imagine this being a way my proposed syntax could be implemented as well, with minimal required new code generation. Just note that "new" generic constraints (i.e. other than class and struct) are not recognized by the DLR at the moment (tracked here), and of course this has other limitations of dynamic invocations.

@dotnet dotnet locked and limited conversation to collaborators Dec 6, 2024
@333fred 333fred converted this issue into discussion #8808 Dec 6, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants