This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Proposal: Type extracting patterns #3050
Comments
This seems overly complex, relies on a lot of reflection and is really bizarre in that it introduces generic type parameters in the middle of a method. IMO the effort isn't worth the reward. I would need to see more and common use cases. |
Your example can already be implemented currently without any changes to the language by adding a constraint to TrySort:
static bool TrySort<T>(IList<T> list) where T : IComparable<T>
{
    Algorithms.LightSpeedSort<T>(list);
    return true;
} |
@HaloFour Perhaps yes, but there are also other features in the language that use reflection. @bigredd0087 The purpose of my example was to conditionally sort a list with |
Okay, I did not understand that part. Can you expand on your example? I'm having a hard time seeing what kind of code would call TrySort without knowing at compile-time whether or not T implements IComparable. |
@bigredd0087 As I said, there could be an operation that works well if the list is sorted, but might not require it. However, I might not be good at examples and there could be better cases where this would be an advantage. Generally, if you have an object and you want to treat it as a specific type, you use polymorphism and This means you can have a list without having any way of knowing that, unless you use reflection. You could go full So let's say you wanted to add a
if(obj is ICollection<> col)
{
    return col.Count;
}
But that is not possible, you cannot obtain the property in this way. In Java, you could do that ( I have realized the DLR can be used to perform the type matching:
Inner((dynamic)obj);
static int Inner<T>(ICollection<T> col)
{
    return col.Count;
}
This infers the type arguments from the value if possible, but there are still some caveats. |
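For readers who want to try the DLR trick from the comment above, here is a minimal sketch. The names `DynamicCount` and `CountOf` and the `-1` fallback are assumptions made for illustration and are not part of the proposal. The sketch also shows one of the caveats: without a catch-all overload, a value that matches no overload makes the runtime binder throw.

```csharp
using System.Collections.Generic;

public static class DynamicCount
{
    // Entry point: casting to dynamic defers overload resolution to run time,
    // where the binder picks an overload from the runtime type of obj.
    public static int CountOf(object obj)
    {
        if (obj == null) return -1;
        return Inner((dynamic)obj);
    }

    // Applicable when a T can be inferred such that the value is an ICollection<T>.
    private static int Inner<T>(ICollection<T> col) => col.Count;

    // Catch-all overload (hypothetical): without it, a non-matching value
    // would cause a RuntimeBinderException at the call site.
    private static int Inner(object other) => -1;
}
```

Calling `DynamicCount.CountOf(new List<int> { 1, 2, 3 })` goes through the generic overload, while a string or a boxed integer falls through to the catch-all. This relies on the Microsoft.CSharp runtime binder being available.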
@IllidanS4 It is still not clear to me why generic constraints are not sufficient to achieve what you are asking for. For instance if I call TrySort the compiler knows whether or not T implements IComparable. In fact TrySort isn't even needed. I can just call LightSpeedSort directly and the compiler will tell me at compile-time whether or not I can use that function with the type I am feeding it.
This is why Enumerable provides the Count extension method. And your proposal does not solve this problem. The example you gave would only work for types that implement ICollection, which as you say, not all collections implement. |
@bigredd0087 You would have to add My example would work, because I said that not all collections implement By the way, do the "angle brackets" correctly render in your browser, or is that an issue of GitHub's quotation? In your quote of my message, you copied "ICollection doesn't implement ICollection", but the text was "ICollection[T] doesn't implement ICollection" if I replace the parentheses). @HaloFour Also regarding the uncertainty about introducing a new type in the middle of a method, I understand that it might be confusing, but it essentially makes a section of the code generic and infers its generic arguments from a runtime value, which is cool in my opinion. There was also a time when defining a variable in the middle of a function (or an expression) was weird, and the At least this is less confusing than changing the characteristics of a type based on scope, like what happens in the alternate solution to the issue:
if(T : IComparable<T>)
{
    // here T implements IComparable<T>
}
// here it does not |
To me this being a problem is suggestive that the code is suffering from a larger issue of not enough separation of concerns. But I digress. :) This code will accomplish what you want and should be quite performant:
static class Extensions
{
    public static bool TrySort<T>(this IList<T> list) => new Sorter<T>().TrySort(list);
}
struct Sorter<T>
{
    static readonly bool IsComparable = typeof(IComparable<T>).IsAssignableFrom(typeof(T));
    public bool TrySort(IList<T> list)
    {
        if(IsComparable)
        {
            LightSpeedSort(list);
            return true;
        }
        else
            return false;
    }
}
//usage
var res = new int[] { 3, 2, 1 }.TrySort(); |
My bad. The code that I put up doesn't work. I left out the constraint on LightSpeedSort. You could use that code for caching the reflection calls though. |
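A sketch of how the caching idea from the comment above might be made to compile today: the constrained method is instantiated through reflection once per `T` and kept as a delegate. The insertion sort is only a stand-in for `LightSpeedSort`, and the names `SortOrNull`, `CreateSort` and `SortExtensions` are hypothetical; none of this is taken from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class Algorithms
{
    // Stand-in for LightSpeedSort; a plain insertion sort keeps the sketch short.
    public static void LightSpeedSort<T>(IList<T> list) where T : IComparable<T>
    {
        for (int i = 1; i < list.Count; i++)
        {
            T item = list[i];
            int j = i - 1;
            while (j >= 0 && list[j].CompareTo(item) > 0)
            {
                list[j + 1] = list[j];
                j--;
            }
            list[j + 1] = item;
        }
    }
}

public static class Sorter<T>
{
    // Built once per T; null when T does not implement IComparable<T>.
    private static readonly Action<IList<T>> SortOrNull = CreateSort();

    private static Action<IList<T>> CreateSort()
    {
        if (!typeof(IComparable<T>).IsAssignableFrom(typeof(T)))
            return null;

        // The constraint cannot be proven statically here, so the generic
        // method is closed over T via reflection and bound to a delegate.
        MethodInfo open = typeof(Algorithms).GetMethod(nameof(Algorithms.LightSpeedSort));
        MethodInfo closed = open.MakeGenericMethod(typeof(T));
        return (Action<IList<T>>)closed.CreateDelegate(typeof(Action<IList<T>>));
    }

    public static bool TrySort(IList<T> list)
    {
        if (SortOrNull == null) return false;
        SortOrNull(list);
        return true;
    }
}

public static class SortExtensions
{
    public static bool TrySort<T>(this IList<T> list) => Sorter<T>.TrySort(list);
}
```

The reflection cost is paid once per element type; later calls only invoke the cached delegate. Note that this only inspects the static `T`, so a `List<object>` holding comparable values is still reported as not sortable.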
I'm personally against language features which require reflection under the hood. They hide the cost of the operation from the user, and result in unexpected For example, in: if (list is IList<new T2> sortableList where T2 : IComparable<T2>)
{
Algorithms.LightSpeedSort<T2>(sortableList);
} If I'm also not sure that a compiler feature which would produce unverifiable code will fly... If something like if (InvokeIfConstraintsMet(methodof(Algorithms.LightSpeedSort), list))
{
....
} |
Not at all. With the transformation I described, the actual method will be called via a delegate, which doesn't do anything with the exceptions that are thrown inside. The debugger will show the correct line; only the stack trace would reveal the hidden inner method, but that also already happens for Your
Well, there are many features that do this. Even without Like in all these cases, it would be the compiler's responsibility to ensure that the resulting code, even though unverifiable, is actually valid. And even in this case (in the unification of |
This seems to be an interesting proposal for an issue I've had where I needed to compare two At the moment I've been using
private int Compare(object nodeValue1, object nodeValue2)
{
    // Here there should be no case where both are null, so value1 will never be null.
    dynamic value1 = nodeValue1 ?? nodeValue2;
    dynamic value2 = nodeValue1 == null ? null : nodeValue2;
    int result = value1.CompareTo(value2);
    if (nodeValue1 == null)
    {
        result = -result;
    }
    return result;
} |
@xZise That is a good use-case, and I think the syntax could be:
if(value2 != null)
{
    if(value1 is IComparable<new T> obj when value2 is T arg)
    {
        result = obj.CompareTo(arg);
    }
}else{
    if(value1 is IComparable<new T> obj where T : class)
    {
        result = obj.CompareTo(null);
    }
}
There is a slight modification to the original approach as |
Why |
@canton7 Because the pattern would be solely |
I'm not a fan of this syntax, but there's no point in being needlessly complex! |
@canton7 It might be complex but not really needlessly. Imagine this situation:
object a = 1;
bool match = a is int x && Predicate(x);
You'd expect
object a = 1;
bool match = a is new T x && Predicate<T>(x);
If what you suggest was possible, I am not sure if ambiguous patterns are possible at the moment (i.e. patterns that can be satisfied in more than one way), but there is another analogue to this in the language:
try{...}
catch(Exception e) when(Predicate(e))
{
}
versus
try{...}
catch(Exception e)
{
    if(Predicate(e))
    {
        ...
    }
}
Your suggestion would be correct if we assumed that at most one instantiation of |
This proposal fills this specific niche that I run into quite a bit: I prefer to make very generic strongly-typed controls and interfaces as possible. The less repeated code for unique instances the better. The issue comes in when implementing the fine details for specific cases. Generics currently offer no way of splitting flow based on whether the underlying type supports a given detail or not. The only way to split the flow is through a costly Reflection call, or by creating significant amounts of duplicate code, just so a specific instance can have a unique fine detail. Generic constraints on methods don't fill this niche, because the constraint must be propagated all the way up the stack until there is no more generic. But at compile-time, a specific instance/call will have the type information and so should be able to fill in whether a specific fine detail should be added or not. My latest example, consider
That could easily be handled with generic constraints as so... except:
... except that calling from a non-constrained generic method, the compiler would complain that it doesn't know which function to call. And the compiler would also complain that some of those method definitions are duplicates of one another. Not to mention all the duplicate code that I had to copy-paste. I am very much in favor of this proposal. |
Seems like this is an ask to add an equivalent of Java wildcard generics, which are very useful. Covariance/contravariance on interfaces is a lot more restrictive and does not address all use cases, requiring falling back to full reflection. https://docs.oracle.com/javase/tutorial/extra/generics/wildcards.html |
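@macsux In a sense, but it's more powerful. Imagine being able not only to cast something to List<*> (List<_> in the proposal), but also to get the actual thing the wildcard stands for and bring that into scope, if the pattern succeeded. |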
The downside is that the constraint cannot be made part of the public contract. The current issue with generics is that if I have nested generic types like a Russian doll, the constraint at each level has to be declared as a type variable on the method signature. The caller is now responsible for providing those args explicitly, even though the receiver declares that he doesn't actually care what the specific type is, only that it conforms to certain constraints. Here is an example of what I mean in Java.
https://github.com/AxonFramework/AxonFramework/blob/master/messaging/src/main/java/org/axonframework/deadline/AbstractDeadlineManager.java
On Sat., Feb. 6, 2021, 5:36 a.m. IS4 wrote:
@macsux In a sense, but it's more powerful. Imagine being able not only to cast something to List<*> (List<_> in the proposal), but also to get the actual thing the wildcard stands for and bring that into scope, *if* the pattern succeeded.
|
There are some really neat things you can do with generic wildcards in Java and I think it would be cool to see that in C#, but that would certainly involve some serious runtime changes to how generics work. In Java those generic type arguments don't exist at runtime so the flexibility comes from the compiler trusting that what you're doing is safe by verifying that you're staying within the generic bounds (generic capture compiler error messages are the most fun to troubleshoot!). But to the runtime that I'm really curious if/how Java will handle generic wildcards when Valhalla ships which is supposed to bring value types and generic specialization to the JVM. I'd have to imagine that, like in .NET/C#, variance will never be legal with value types. |
"In .NET List and List are two distinct and incompatible types entirely" There are ways to bypass runtime check on this. You can already assign List to IEnumerable, but not IList. The reason is list exposes partial API surface that would make certain operations not safe, but some are perfectly valid. Here's a very interesting demonstration that proves what you say above can't be done, actually is today: var a = new List<string>() { "foo", "bar" };
List<object> b = Unsafe.As<List<object>>(a);
string i = (string)b[0]; // works
b[0] = "hello"; // works
b.Add(1); // runtime error, but this condition can be detected via the compiler (or an analyzer)
The issue is there are a lot of ways to make this crash your app right now because you've essentially bypassed all type checks by casting one memory space to another (no different than using pointers). These checks can be moved into the compiler for cases where the runtime doesn't allow for them. |
There's a lot of very unsafe things you can do but definitely shouldn't do. This is one of them. You're exploiting the fact that the runtime happens to generate a single implementation for reference types as an optimization, but this is an implementation detail and there's no guarantee that this is the case, certainly not one that the compiler can enforce. Try that with struct generic type arguments and I'm sure that things go off the rails very quickly. |
@HaloFour Maybe for the beginning, the runtime could simply provide a fast way to call a generic method when one of the type arguments is provided at runtime, like this (pseudocode):
bool Method<T>(object a)
{
    return a is List<T>;
}
void Test()
{
    object list = new List<string>();
    object str = "x";
    Method<str.GetType() as class>(list);
}
Of course this can be done with Of course the call can fail if the type violates generic constraints, so some way to signal that is also needed. For the remaining type matching algorithms, they can be put in some package, since it is just a matter for getting types from a type at runtime, and using them when calling a method. The failure of the generic method call could then become a part of the result of the match:
if(list is new TList where TList : IList<TElem>)
{
    //translated to:
    InnerMethod<list.GetType() as class, /* infer */>(list);
    void InnerMethod<TList, TElem>(TList list) where TList : IList<TElem>
    {
Here the runtime is also required to infer |
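As a point of comparison for the pseudocode in the comment above, this is roughly what supplying a type argument at run time looks like with today's reflection API. The names `RuntimeGenericCall`, `IsListOf` and `TryIsListOf` are made up for this sketch, and the nullable return value is just one possible way to signal a constraint violation.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class RuntimeGenericCall
{
    // Generic target, mirroring Method<T> from the pseudocode, with a
    // 'class' constraint so that a violation can be demonstrated.
    public static bool IsListOf<T>(object a) where T : class
    {
        return a is List<T>;
    }

    // Calls IsListOf<T> with T supplied as a System.Type at run time.
    // Returns null when elementType violates the 'class' constraint.
    public static bool? TryIsListOf(Type elementType, object a)
    {
        MethodInfo open = typeof(RuntimeGenericCall).GetMethod(nameof(IsListOf));
        MethodInfo closed;
        try
        {
            closed = open.MakeGenericMethod(elementType);
        }
        catch (ArgumentException)
        {
            // MakeGenericMethod rejects type arguments that break the constraints.
            return null;
        }
        return (bool)closed.Invoke(null, new object[] { a });
    }
}
```

With `object str = "x"` and `object list = new List<string>()`, the call `RuntimeGenericCall.TryIsListOf(str.GetType(), list)` corresponds to the pseudocode's `Method<str.GetType() as class>(list)`. Each call pays for `MakeGenericMethod` and `Invoke`, which is the overhead a runtime-supported fast path would aim to avoid.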
This is somewhat similar to pattern matching on existentially quantified data in Haskell, which brings a type into scope, but there it works without type casting and the types are erased at runtime: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/existential_quantification.html In C# it may be problematic since
List<T> LightSpeedSort<T>(List<T> list) ...
...
if (try LightSpeedSort(list) as object result) ...

void LightSpeedSort<T>(List<T> list) ...
...
if (try LightSpeedSort(list)) ...

IList LightSpeedSort<T>(List<T> list) ...
...
if (try LightSpeedSort(list) as var result) ...
This way we minimize the scope of changes by introducing a single yet complex operator. The downside is that a separate function is required. |
I've just run into a situation where something like that would've been useful. I basically had a hierarchy of abstract types like So, without thinking, I wrote |
These days I use the dynamic overload resolution to achieve basically the same thing. You could do this:
void SwitchFoo(Foo foo)
{
    CaseFoo((dynamic)foo);
}
void CaseFoo(Foo foo)
{
    //...
}
void CaseFoo1<T>(Foo<T> foo1)
{
    //...
}
void CaseFoo2<T, U>(Foo<T, U> foo2)
{
    //...
}
I can imagine this being a way my proposed syntax could be implemented as well, with minimal required new code generation. Just note that "new" generic constraints (i.e. other than |
In short
Motivation
Imagine having your favourite sorting algorithm:
This is the sort of method you would see in a general-purpose library of algorithms, but it can only be used when you know the type of T beforehand and you can constrain it to be IComparable<T>. What if you happened to be in a situation where sorting is optional and could improve performance of other algorithms, but can be omitted when T cannot be compared with itself?
This code wouldn't be very useful if its purpose was to improve performance, as reflection and dynamic invocation are not really the fastest. It can be improved by caching the result of GetMethod and perhaps keeping a collection of delegates for every encountered type, but it becomes more and more complex when you try to make it more performant.
(I am aware that Comparer<T>.Default can be used in a case like this, but there are other examples where such a class isn't available.)
Idea
Rather than a simple "type-is" operator (something like if(T : IComparable<T>)), I propose a new addition to the current pattern matching system that would enable obtaining new types from a pattern:
In the inner scope, T2 can be used as a type and sortableList is guaranteed to be IList<T2> if the check succeeds. Notice that here T2 doesn't even need to be identical to T; in fact, the function need not be generic at all and list can be object, which is useful when it comes from an external source (deserialization) or dynamic context.
Such a code would be, in essence, transformed to something like this:
Possible implementations of the dynamic call are shown later, but they are not the core point of this proposal.
Syntax
The syntax is an enhancement to the current type pattern:
The type would additionally allow new <tname> anywhere in the type to denote a new type extracted from the pattern. <constraints> are standard generic constraints restricting the newly introduced types. Like in the normal type pattern, <name> can be omitted.
The type referred to with new <tname> is usable from the moment it is encountered, and can be used in the rest of the pattern, without the new (is (new T, T) checks for (object, object), (string, string) etc.).
In addition to new T, _ could also be used as a sort of a wildcard without any constraints, i.e. as in obj is List<_> l. This is treated in the same way as new T, but the type cannot be referred to in any way after that.
new T doesn't have to be used only as a generic argument; it can be the whole type, as in obj is new T val where T : struct, IComparable<T>.
The provided constraints should be compatible with any generic type definition that is used with an extracted type. For example is List<Nullable<new T>> is not valid as T on Nullable<T> is constrained to struct. is List<Nullable<new T>> where T : struct is valid.
Semantics
If the match succeeds, the extracted types are selected in any way that satisfies the pattern. This means that if the type can be matched in two or more ways, the compiler or the runtime is allowed to obtain the types in the most efficient way, e.g. ((IEnumerable<int>)obj) is IEnumerable<new T> simply makes T an alias of int as that is guaranteed to be always possible.
Implementation
I do not propose a concrete method of handling this feature, but here are examples of what it could be transformed to:
DLR-based
Without generating any complex code and transforming compile-time code to runtime checks, the DLR can be used to merge the checking of type and invoking the method into a single call. Describing this in terms of dynamic operations also makes it easier to define:
Using the DLR, the code above would simply be analogous to this:
When used in a switch statement, multiple Inner methods could be produced, performing overload resolution at runtime as well (but only when the order of case statements follows the order of specialization of the methods).
This relies on the capabilities (and limitations) of the DLR, but while it may use the optimizations already in place for dynamic, it has the downside that the current syntax doesn't give you any hint that the DLR is used (therefore Microsoft.CSharp is needed).
A different syntax could be imagined, one that reflects this fact better, such as obj1 dynamic is List<new T> list (or possibly obj1 is dynamic List<new T> list, but that sounds more like shapes/duck-typing to me).
Reflection-based
The code above can be roughly translated to this:
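To make the idea concrete, here is a hand-written sketch of a reflection-based equivalent of `if (list is IList<new T2> sortableList where T2 : IComparable<T2>)`. It is an illustration under assumptions, not the transformation the proposal specifies: the names `ExtractedSort`, `Inner` and `TrySort` are hypothetical, there is no caching, and the inner scope simply sorts through `List<T>.Sort` instead of `LightSpeedSort`.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ExtractedSort
{
    // Plays the role of the scope guarded by the pattern: inside, T2 is an
    // ordinary type parameter and the value is known to be an IList<T2>.
    private static void Inner<T2>(object value) where T2 : IComparable<T2>
    {
        var sortableList = (IList<T2>)value;
        var sorted = new List<T2>(sortableList);
        sorted.Sort();
        for (int i = 0; i < sorted.Count; i++) sortableList[i] = sorted[i];
    }

    public static bool TrySort(object list)
    {
        if (list == null) return false; // a null value never matches

        MethodInfo open = typeof(ExtractedSort).GetMethod(
            nameof(Inner), BindingFlags.NonPublic | BindingFlags.Static);

        // Look for an implemented IList<X> and try X as the extracted type T2.
        foreach (Type iface in list.GetType().GetInterfaces())
        {
            if (!iface.IsGenericType || iface.GetGenericTypeDefinition() != typeof(IList<>))
                continue;

            Type t2 = iface.GetGenericArguments()[0];

            // Check the constraint T2 : IComparable<T2> by hand.
            if (!typeof(IComparable<>).MakeGenericType(t2).IsAssignableFrom(t2))
                continue;

            open.MakeGenericMethod(t2).Invoke(null, new[] { list });
            return true;
        }
        return false;
    }
}
```

`ExtractedSort.TrySort(new List<int> { 3, 1, 2 })` sorts the list in place and returns true, while a `List<object>` or a string yields false. Every call repeats the interface scan and the `MakeGenericMethod` call, so this shows only the unoptimized shape.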
The actual transformed code would be of course more complex, employing caching to reduce unnecessary calls as much as possible. With some help from the runtime, it can even be completely removed if T already matches the pattern in one instance of the method for a specific generic argument.
Precise algorithm to use during transformation
1) Move the scope where the new types are usable into another method with those types as generic parameters (with the same constraints). The first parameter of the method is the original value, the rest are references to the variables used within the scope. If the original method has generic parameters, the new method will be located in a new generic type with the same parameters. The first statement of the method casts the original value to the matched type (if the cast fails for some strange reason, the original check will fail then as well).
2) Generate a new delegate type to be used for the specific instantiations of the new method, without the new generic parameters. Create a map from System.Type to the delegate type, used for every encountered type of the original value when necessary.
3) Determine if there are shortcuts to satisfy the pattern without relying on the runtime type of the object. For example, ((IList<T>)obj) is IList<new T2> where T2 : struct can be satisfied by checking if T is a non-nullable value type (in a specific instantiation of the method) and the result can be simply cached, if the check succeeds (if not, it goes to the full check). Similarly, if the pattern can be statically proven or disproven that it can be matched (for example if the original type is a sealed type not deriving from MarshalByRefObject), the compiler can simply emit a constant true or false and select the proper types without any runtime overhead. If the original value is null, the pattern is never matched.
4) If the match can be statically determined, the result is used. If the match depends on generic parameters of the method or its containing type, a bool value is stored that is initialized once for every combination of generic arguments. If the result of the match isn't determined this way (or the generic check fails), a full runtime check occurs:
   ICustomTypeProvider, its result can be used first, or another interface can be invented if better dynamic support is needed.
   null).
   is new T[], is new T[,] etc.), the corresponding properties are checked.
   new T) is found as one of the concrete types.
   System.Type tuples for all the new extracted types. The first tuple that matches all the constraints is used, or the match fails if there is no such tuple.
5) If the types can be statically found, and the runtime cached check succeeds, the inner method is instantiated at compile-time and called normally. Otherwise, if the full runtime check succeeded, the inner method is taken and instantiated via MakeGenericMethod, then bound to the delegate, stored in the map, and invoked via the delegate.
Why a language feature?
The same question can be asked for the dynamic keyword and many other features. The whole transformation can be performed manually, but a simple implementation will be very inefficient, while as it becomes more complex, it also becomes more error-prone and could suffer from oversights, like not being thread-safe etc. Incorporating it into the language allows representing a complicated system in a simple and intuitive way, and it is more flexible when you make changes to the code.
The compiler can also emit one thing that cannot be expressed manually: ldtoken for methods, meaning the inner method can be always found without worrying about its name. Other things that the runtime might eventually support could be used to further improve the performance.
Lastly, the compiler is in the position to select the optimal strategy when analyzing the type, deducing as much information at compile-time as possible. In the first case with the IComparable<T> check, the compiler would be able to unify T with T2 and "add" IComparable<T> to it, so the inner scope would be able to call LightSpeedSort. The resulting code would be unverifiable (because the method itself doesn't specify the constraint), but valid nevertheless. This cannot be normally done, as the compiler doesn't allow ignoring generic constraints in any way.
Scoping and unbound variables
In the case of if(x is T y), y is in scope in the inner statement or block (where it is assigned), but it is also in scope in the code that follows if (where it is unassigned). This is different to if(x is IList<new T> y), as the type of y is not defined after the if and so the variable is not defined either. If consistency is desired, it can be implemented in any of these ways:
If the check failed and condition is false, y remains unassigned. While y itself exists as a variable, its type is not fully defined and can never be assigned again.
If the check failed and condition is false, y remains unassigned. The type of y shall be considered unbound until it is assigned a value that can be statically matched with the type pattern of y.
The code that follows and uses y becomes part of the inner method, which is instantiated (at compile-time) with the types inferred from the assigned value. For example:
Note that if l is unassigned, its type is (for the moment) effectively an unbound List<T>, and the first assignment to the variable binds it to List<float>. Before any assignment, T is also considered to be "unassigned". This feature can be optionally extended to any variable definition:
In each of the branches, T is concretely determined, but after this code, list is assigned but T is only usable based on its constraints (value type). Again, this would be transformed to an inner generic method, instantiated at compile-time with either int or long.
default and as
This mechanism can be extended to as as well. In case the pattern is a nullable type (reference or Nullable<T>) and the check fails, there is a need to assign a default value and a default type like in this case:
The branching can be implemented as outlined in the previous section, but T still has to be provided with a "default" value.
In this case, the compiler could generate a "dummy" unique type that satisfies all its constraints. Unless new() or struct is used, the type would be abstract, so no potential interfaces have to be implemented. Otherwise, the necessary methods would be implemented as throwing NotImplementedException or another exception.
For example, var l = obj as List<new T> where T : struct, IEquatable<T> is transformed as follows:
l would be only good for performing a null check, but at least that common pattern can be used. The ?? operator might be used to provide a better default value (like with if(!(obj is…): (obj as List<new T>) ?? new List<object>();, again separating the rest of the code into a generic method like before with List<float>.
Examples