Design Notes on Kotlin Value Classes #237

Open
elizarov opened this issue Feb 1, 2021 · 82 comments

Comments

@elizarov
Contributor

elizarov commented Feb 1, 2021

This issue is for discussing and gathering feedback for Design Notes on Kotlin Value Classes: https://github.com/Kotlin/KEEP/blob/master/notes/value-classes.md

@altavir

altavir commented Feb 2, 2021

Yesterday we had a long discussion about this in the Kotlin Telegram chat. There was a nice proposal there that should at least be discussed. The idea is to be able to isolate all value class mutations into a single mutating lambda function. It could be done in two ways: either allow mutations only in a mutating lambda/function with an appropriate receiver, or allow those lambdas alongside simple mutations (one can also allow only lambdas at first and relax the rule later).

There are two practical benefits of such lambdas:

  1. If we are doing several mutations, we need to either create several copies of a value object with subsequently changing fields or we need a scope for atomic changes. Subsequent changes could be in theory detected and glued together, but it would require a lot of work from the compiler and in the end, won't be guaranteed to work properly in all cases. The lambda on the other hand could guarantee atomicity of changes.
  2. Introducing a lambda provides a small barrier in the usage of such operations. One must remember that copying value classes is still an expensive operation and one should not use it mindlessly. Performance is maybe not the primary point, but it seems a bit unexpected when two different assignment operations differ so dramatically in performance.

Also, I have an additional educational point in favor of isolation: the mental model. The mental model explained in the design notes is understandable from the point of view of an experienced programmer, but there are a lot of concepts there. All those mutating modifiers, var declarations, etc. add a lot of things to understand and remember. Mutating via a lambda, on the other hand, keeps things simple: it is just an in-place mutation that does not affect previous usages of the variable.

At the end of the design notes document, there is a short discussion about non-constructors for data classes (the idea is really nice, but I think it should be discussed more). Those non-constructors in fact follow the same idea as the mutating lambda, so having them both would probably also simplify the mental model.

@elizarov
Contributor Author

elizarov commented Feb 8, 2021

Update on the var keyword reuse

See section on Mutable properties of immutable types

This forum post quotes an important use-case https://discuss.kotlinlang.org/t/inline-classes-var-to-be-deprecated/20762 where an inline value class is used as a "handle" to some mutable data storage. It nicely shows how a regular var can be used on immutable value classes for a good reason.

We’d love to rescue and keep this use-case. We don’t have an alternative plan that will make it possible yet, but we are looking for it. There are two potential approaches.

Note: both approaches also open a road to implementing a regular (non-value) interface with regular var properties by immutable value classes, which is not directly possible if var is hardcoded to have a different meaning in value and non-value contexts.

A new keyword to denote properties with mutating setter (wither)

In addition to two kinds of properties that we already have in Kotlin (val for a read-only property with a getter and var for a writable property with a getter and a setter) we can add a separate keyword for the 3rd kind of a property – a mutating property with a getter and a mutating setter (wither).

How to name it? That is the question!

Introducing the 3rd type of property will make the resulting scheme harder to learn and might require a new hard keyword in the language. However, we can alleviate the former concern by nudging learners with IDE inspections. For example, value class Foo(var prop: Type) could have an automated suggestion to change var with a backing field (that will not be allowed on a value class) to this new property type.

A new modifier to denote a regularly mutatable property on an immutable value

Alternatively, we can add some new modifier that, when added before a var property on a value class (or interface), turns it into a regular mutable property with a setter that is supposed to mutate some mutable data structure and does not affect the value it is being called on.

Basically, this will be a modifier that is opposite to mutating (in the same sense that open is opposite to final). So, by default, all var properties on a value class/interface will be considered to implicitly have a mutating modifier, while all properties on a regular non-value class/interface will be considered to implicitly have this new "opposite to mutating" modifier.

How to name it? That is the question!

@fvasco

fvasco commented Feb 8, 2021

Sorry for my trivial question, but I must have missed something important.

If a value class is an immutable data type, why is it important to mark a property like a var or a val?

All other properties are really similar to regular properties without a backing field.

If so, the Grid example can be written as:

value class Grid(ptr: Long) {
    var width: Int
        get() = GridNative.getWidth(ptr)
        set(v) { GridNative.setWidth(ptr, v) }
}

or

value class Grid(ptr: Long)

var Grid.width: Int
        get() = GridNative.getWidth(ptr)
        set(v) { GridNative.setWidth(ptr, v) }

It looks really similar to a regular class

class Grid(val ptr: Long)

var Grid.width: Int
        get() = GridNative.getWidth(ptr)
        set(v) { GridNative.setWidth(ptr, v) }

Do we really need a new modifier?

@elizarov
Contributor Author

elizarov commented Feb 9, 2021

@fvasco

If a value class is an immutable data type, why is it important to mark a property like a var or a val?

Consider, for example, an immutable type for complex numbers with properties re: Double (for its real part), im: Double (for its imaginary part), absoluteValue: Double, etc. For our discussion, it is completely unimportant how these properties are actually stored, whether they have backing fields or not. All of them can be stored in backing fields, or just a few can be stored. Typically re and im will be stored, while others will be computed, but that is not important in larger business data structures, as the "storage approach" for some big enterprise entities might even evolve and change over time.

Regardless of "backed by a field or not", among all these properties, re and im are special. You can always reconstruct your complex number with different values of re and im. They are "mutatable properties of an immutable class". They are the ones we'll want to mark as var or in some other way (see the comment above). But, for example, an absoluteValue property is truly read-only (a val property). You can only query it, but you cannot, in general, update the absoluteValue of an arbitrary complex number and expect a meaningful result in all cases.

For this specific example of a complex number you could define mutation for absoluteValue if you want, but you'll have to figure out what to do with zero. For larger business entities, there could be lots of read-only (val) properties that cannot be meaningfully mutated at all.

Now, having read this, you can say that Kotlin already has a feature to distinguish these two kinds of properties on immutable classes. In Kotlin, we can define an immutable Complex class like this:

data class Complex(val re: Double, val im: Double) { // constructor properties
    val absoluteValue: Double // other properties
        get() = kotlin.math.hypot(re, im) // computed, no backing field
}

That is, we can distinguish different kinds of properties by whether they are constructor properties or body properties. Constructor val properties are actually mutable, since they can be mutated using a copy() function to get a new value. But this constructor/non-constructor is a wrong dichotomy:

  • Constructor properties must be backed by a field. While this is a good default, there is no intrinsic relation between mutation ability and being in a stored field (e.g. multiple mutable properties can be packed into a single backing field). We could lift this restriction on the constructor properties (with some effort) but see the other point.
  • Constructor properties are a class-only concept. What if you need to define an interface for your class, how do you distinguish those two kinds of properties in an interface? What if you need to define an extension property? How do you distinguish between a truly read-only property and a property that has a mutating setter ("wither")? (An example is given in this section of the KEEP.)
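
As a minimal sketch of the constructor/body distinction in today's Kotlin: constructor properties can be "changed" through copy(), while a body property such as absoluteValue cannot. The hypot-based getter is an assumption added only so the class compiles; the snippet above leaves the body unspecified.

```kotlin
import kotlin.math.hypot

data class Complex(val re: Double, val im: Double) { // constructor properties
    // Assumed implementation: computed on demand, no backing field.
    val absoluteValue: Double get() = hypot(re, im)
}

fun main() {
    val z = Complex(3.0, 4.0)
    // copy() accepts only constructor properties, so re and im are the
    // "mutatable properties of an immutable class".
    val w = z.copy(re = 0.0)
    println(z.absoluteValue) // 5.0
    println(w)               // Complex(re=0.0, im=4.0)
    // z.copy(absoluteValue = 1.0) // does not compile: not a constructor parameter
}
```

Note that nothing comparable exists for interfaces or extension properties, which is the gap the second bullet describes.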

@Animeshz

Animeshz commented Feb 9, 2021

Constructor properties are a class-only concept. What if you need to define an interface for your class, how do you distinguish those two kinds of properties in an interface? What if you need to define an extension property? How do you distinguish between a truly read-only property and a property that has a mutating setter ("wither")? (An example is given in this section of the KEEP.)

I guess value classes should just be a sort of literal (because they really are not allocating memory on the heap, unlike other wrapper classes) and hence fully immutable. (Correct me if I'm wrong.)

Also written in KEEP:

Value classes are immutable classes that disavow the concept of identity for their instances.

operator fun Complex.add(other: Complex) should return another complex instead of mutating it, just like operations on primitives don't change their operands but return a new value.

complex.re += 5 doesn't really make much sense; complex += Complex(5, 0) makes more sense.

And to extend/shrink modulus of a complex number there can be another function to take the components, form a new complex and return that.

Edit: Also wanna add that there could be an annotation which generates a compile-time copy() method (similar to kotlinx.serialization) in case there's a need to change only a few fields 👀.

@fvasco

fvasco commented Feb 9, 2021

Hi, @elizarov,
I noticed that I already answered with a really similar example here.

My question is: how many modifiers are allowed for the re and im properties?

Function parameters are val by default, and there isn't any other possible modifier.

Honestly, I don't see the point of adding some kind of wither modifier.

I made a trivial example, just to understand better the issue.

value class IntPointer(val+wither pointer: Long) {

  var value : Int get() ... set() ...
}

How does this declaration differ from:

value class IntPointer(val pointer: Long) { ... }
value class IntPointer(pointer: Long) { ... }

If any value class is identified by its state, I don't need to update an unmodifiable type; I have to create a new value with new arguments, and it is always possible to create a new value with different values (please correct me if I am wrong).

I really see a value class like a primitive type.

var real = 5.0
var imaginary = 2.0 + 3.i

How do these two variables differ?
I am not considering the primitive/object question (there is no primitive type concept in Kotlin), nor stack/heap allocation (Kotlin does not guarantee the allocation type for either).

@elizarov
Contributor Author

elizarov commented Feb 9, 2021

@fvasco The difference is syntactic. The goal is to enable working with immutable classes just like with mutable types. Indeed, every time you "modify" an immutable class you create a new value. We want the syntax for this operation to be as concise as for mutable classes so that people who choose to model their domain with immutable classes don't suffer from the added boilerplate. Now, assume I want to modify a Complex number (that is, to create a new complex value) by adding 5 to its real part (taking example from @Animeshz). I don't want to write this, as this is verbose and does not actually express what I wanted to do:

complex += Complex(5, 0)

I want to write this, just like I do it for mutable classes:

complex.re += 5

The example is not that great for complex numbers, though. The actual design notes have a more striking example where the syntactic difference is quite apparent in the section on deep mutations.

@fvasco

fvasco commented Feb 9, 2021

I agree with you, @elizarov.
But I don't understand how your examples require a wither.
Maybe we have different views on the same problem and I have not explained my view very well.

I see a strict analogy between base classes (Int, Char, ...) and value classes: both lack identity and both are immutable.

Considering your example above, in my view it is possible to update a variable if it is declared as a var, regardless of whether it is a base or a value class.

So:

value class Complex(re: Double, im: Double)

var mutableComplex = 1.0 + 2.0.i
mutableComplex += 5.0 // works

val constantComplex = 1.0 + 2.0.i
constantComplex += 5.0 // fail

Regarding the deep mutation, assuming that all types are value classes, the statement order.delivery.status.message = updatedMessage works if order is declared as var order.
Instead it is not possible to update val order (like val c: Char).

@edrd-f

edrd-f commented Feb 9, 2021

Some general feedback: value classes for the purpose of optimizations look great. Often it's necessary to model domain objects like Email(value: String) and Username(value: String), and having to box/unbox them all the time is unnecessarily expensive.

On the other hand, the whole "immutable way of working with mutability" (i.e. creating new instances when assigning new values for var/wither properties of value classes) is very implicit. When I see point.x += 10 it's not possible to know if this is an assignment happening to a regular class or a value class without inspecting point. What if the code base is heavily multi-threaded and Point starts as a value class and is later changed to a regular class? With the current design, it seems this would compile with no errors or warnings, however all sorts of bugs would appear.

Using copy makes it explicit, so I know what's going on without having to look at the class definition. Sure, the IDE could give some hints, but we also spend a lot of time reviewing code on platforms such as GitHub, where there's no intelligent code navigation for Kotlin.

I understand copy is much more verbose whenever it's necessary to update deeply nested values, but I'd like to see some real use cases for this. The example given in the Deep mutations section looks ad hoc, and I think it's important to also have real use cases to check that we're not trying to solve locally what should be solved through better code design.

Another point is how compiler plugins could be used to solve some of the mentioned issues. For example, the following:

order = order.copy(
    delivery = order.delivery.copy(
        status = order.delivery.status.copy(
            message = updatedMessage
        )
    )
)

could be

val updatedOrder = order.deepCopy {
  delivery.status.message.mutate(newValue = updatedMessage) 
}

where deepCopy and mutate could be generated for every value class. This would be more explicit and wouldn't (probably) require language changes.

@philipguin

philipguin commented Feb 9, 2021

@fvasco Continuing from the other thread:

On potentially making val/var superfluous and removing them altogether, there would be a few things to consider:

  1. What if, conceptually, we’d like to restrict which combination of things can belong to a value class? In that case, val/var would have some utility, where a library could emit “instances” and the user could “change” only a subset of fields. This means the library author can better protect against misuse.
  2. Removing the keyword altogether may introduce parse issues, depending on how it’s implemented. I don’t think there’s a precedent for the syntax anywhere else in Kotlin.
  3. What if, on Kotlin/Native for instance, they do want to eventually allow mutating fields directly? In that case, they would definitely want to keep var/val in use.

I’m sure there’s a lot more, but it is still interesting to consider.

More generally, I notice the comparison of value types to primitives occurring frequently, but I have to wonder how useful that analogy really is. In a theoretical sense it may be nice, but the “law of leaky abstractions” probably applies here as well. We are adding a lot of functionality to what was previously just the set of primitives after all, and the list of supported features can only grow.

I also have to wonder if Valhalla won’t be petitioned to add normal field setters at some point in the future - assuming it’s even possible. The desire to focus on immutability rather than just emulate C# structs always struck me as odd, though I’m sure they have good reason for it. Carving out space for this in Kotlin might not be the worst idea, especially for Kotlin/Native.

Finally, on the case of non-mutating custom setters: I’m not sure what the “mutating” keyword would be, but my vote is for option 2 suggested by @elizarov, where the keyword is only needed on the non-mutating case. However, what if we moved the keyword to the setter itself?:

var x: Float
    get() = ...
    const set(v) { ... } // or whichever keyword

By doing so, const is available as it cannot clash with the existing const val. Not sure if there's another consideration.

@fvasco

fvasco commented Feb 10, 2021

@philipguin

What if, conceptually, we’d like to restrict which combination of things can belong to a value class?

This issue should be addressed in the init block.
How can some kind of var forbid something like Complex(NAN, INFINITY)?

Removing the keyword altogether may introduce parse issues. I don’t think there’s a precedent for the syntax anywhere else in Kotlin.

Nothing new.

class Sum(a: Int, b: Int) {
  val n = a + b
}

Yes, updating the parser is an issue, but we are considering changing the language.

What if, on Kotlin/Native for instance, they do want to eventually allow mutating fields directly?

I will vote no for this proposal.
There are enough mutating types in Kotlin, I think we can define an immutable one, at least.

@fvasco

fvasco commented Feb 10, 2021

@edrd-f

What if the code base is heavily multi-threaded and Point starts as a value class and is later changed to a regular class?

With my proposal, switching from

value class Point(x: Int, y: Int)

to

data class Point(val x: Int, val y: Int)

results in a compilation error.

Instead, switching from immutable to mutable leads to an errors' party.

@elizarov
Contributor Author

@edrd-f

Another point is how compiler plugins could be used to solve some of the mentioned issues.

Everything can be solved by a plugin. The whole idea of having better support for immutability is to have this plugin built-in into the language itself.

For example, the following /skipped/ could be

val updatedOrder = order.deepCopy {
  delivery.status.message.mutate(newValue = updatedMessage) 
}

This is similar to the concerns that @altavir shared above, so let me answer both here. The argument is that these "implicit copies" are spooky, hard to see without an IDE, etc., so let's have some more verbose, more explicit, scoped syntax for them. We could invent lots of different approaches to this "more explicit syntax" (the quote above shows just one). These concerns are valid, yet, regardless of what this "more explicit syntax" will look like, they gloss over the key fact:

Mutations (creating new copies) of immutable types are safe (they cannot have side-effects on unrelated parts of the system), while mutations of mutable types are dangerous (they can accidentally affect unrelated code that kept a reference to the same instance).

It is wrong to design a language in such a way that performing a safe and mostly harmless operation (like creating a new immutable value) requires more boilerplate than performing a more dangerous operation (like updating a field in a mutable instance) that requires a lot of attention and forethought from a programmer.
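
A minimal sketch of the key fact in today's Kotlin, using a data class with copy() as a stand-in for a value class. MutablePoint and Point are illustrative names, not types from the design notes.

```kotlin
// Mutable class: every holder of the reference observes the change.
class MutablePoint(var x: Int, var y: Int)

// Immutable class: "mutation" means creating a new value.
data class Point(val x: Int, val y: Int)

fun main() {
    val shared = MutablePoint(1, 2)
    val alias = shared          // unrelated code keeping a reference
    alias.x += 10
    println(shared.x)           // 11: the change leaked through the alias

    val original = Point(1, 2)
    var local = original
    local = local.copy(x = local.x + 10) // what `local.x += 10` would mean for a value class
    println(original.x)         // 1: the original value is untouched
    println(local.x)            // 11
}
```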

Indeed, if those two kinds of mutations look syntactically the same, then changing a (safe) immutable value class to a (dangerous) mutable class will cause the old code to still compile, yet it will start producing weird bugs. Let's see how this problem can be solved. Both @altavir and @edrd-f propose to distinguish (syntactically or contextually) these two kinds of mutations. However, taking into account the key fact, it means that we have to make (dangerous) mutations of mutable classes more explicit, not vice versa. The safer operation should not be more verbose than a more dangerous one.

There is a simpler solution to this problem. First of all, note that the problem is not novel. It is happening right now all the time. People make mistakes of using mutable classes where they should be using immutable ones. As you write, it compiles "with no errors or warnings, however, all sorts of bugs would appear."

However, as we improve support for immutable classes in the language (make it less burdensome to use immutable values) we can start adding mechanisms to require immutability in various contexts (like asynchronous data pipelines) or at least warn on attempts to use mutable data there. Now if you accidentally use a mutable class where you should have been using an immutable one then you'll get, at least, warned.

@elizarov
Contributor Author

elizarov commented Feb 10, 2021

@fvasco But I don't understand how your examples require a wither.

Wither is not needed for mutableComplex += 5.0 (it works in Kotlin now!), wither is needed for order.delivery.status.message = updatedMessage:

  • It needs to create a new copy of status with an updated message field.
  • Then it needs to create a new copy of delivery with an updated status field.
  • Then it needs to create a new copy of order with an updated delivery field.
  • Only then does it assign this new copy of order to the order property/local (which must be var or it will not compile).
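
The steps above can be sketched in today's Kotlin with data classes standing in for value classes. The type names Order, Delivery, and Status and the helper updateMessage are assumptions made for illustration; only the property path order.delivery.status.message comes from the discussion.

```kotlin
data class Status(val message: String)
data class Delivery(val status: Status)
data class Order(val delivery: Delivery)

// Hypothetical helper spelling out what a "wither" chain would do implicitly.
fun updateMessage(order: Order, updatedMessage: String): Order {
    val newStatus = order.delivery.status.copy(message = updatedMessage) // new copy of status
    val newDelivery = order.delivery.copy(status = newStatus)            // new copy of delivery
    return order.copy(delivery = newDelivery)                            // new copy of order
}

fun main() {
    var order = Order(Delivery(Status("pending")))
    val original = order
    // Final step: assign the new copy back, which is why `order` must be a var.
    order = updateMessage(order, "shipped")
    println(original.delivery.status.message) // pending
    println(order.delivery.status.message)    // shipped
}
```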

@philipguin

philipguin commented Feb 10, 2021

Would custom setters be called for each nested “wither” in the example? That would explain the motivation for disallowing them. However, the case I’m interested in is the one lacking a backing field, which I suppose wouldn’t be touched being just functions, essentially.

@fvasco

fvasco commented Feb 10, 2021

Yes @elizarov, I agree.

But I meant to discuss the new modifiers (#237 (comment)), not the implementation details.

@kyay10

kyay10 commented Feb 10, 2021

I think it's fair to mention here the partially related issue KT-44530. The quick TL;DR is that there is currently a missing optimisation whenever you return a lambda or store it in a local variable that only gets passed to inline functions: the compiler doesn't realise that the lambda doesn't have to be boxed. If that optimisation is implemented, one could use a @JvmInline value class with a backing lambda and a few tricks to simulate a multi-field value class that isn't boxed unless it needs to be.

Better yet, the Kotlin language could include some sort of @AutoInline or @LambdaBacked annotation, or something along those lines, that automatically writes all that boilerplate for you, or it could even use the same underlying mechanism as the lambda optimisation (which should not be incredibly hard to implement) to avoid boxing a multi-field value class when it doesn't need to be boxed. My point here is that even if the Kotlin team isn't interested in the idea of @AutoInline value classes right now, or just doesn't have the time to implement them, they can still be implemented manually by users if that lambda optimisation is implemented.

Again, that optimisation is possibly trivial and, from a certain perspective, it is something that a beginner who has just learned about inline functions would probably expect the compiler to do automatically; check the linked issue for more detailed info and examples of what this brings to the table in terms of possibilities. Hopefully I'm not going super off-topic, because I do think this is relevant to this discussion as a future possibility at least. Feel free, however, to delete this comment if you think that it's harmful to the current discussion.

@rnett

rnett commented Feb 10, 2021

One question/clarification about value interfaces: how do value interfaces and non-value interfaces interact inheritance-wise? It doesn't make sense for a non-value interface to extend a value interface, but the opposite (i.e. ImmutableList extending List) does. I would think value interfaces would be allowed to extend non-value interfaces, but I didn't see it specified anywhere.

Two questions about the scope functions section:

If I understand your mutating inline fun <T, R> T.run(block: mutating T.() -> R): R example right, it wouldn't be callable on non-var value objects?

If I understand the apply example right, there is a difference between:

var state: State = ...
state.tags += "tag"
state.updateNow()

and

var state: State = ...
state.apply{
    tags += "tag"
    updateNow()
}

where the proper replacement of the first snippet is

var state: State = ...
state = state.apply{
    tags += "tag"
    updateNow()
}

This is different than how apply is used currently (the docs say "The return value is the object itself", and while the identity is irrelevant, this wouldn't do it with respect to equality either) and I expect it would trip people up. Especially in cases where you are using apply on deep mutable or immutable var variables, i.e. call.symbol.owner.apply{ } differs in whether it updates owner depending on if owner is mutable or an immutable var value.

Given that we essentially want the lambda to match the call site's mutability, one easy way to handle this would be allowing overloads on mutating. Something type system based seems more ideal though (like the tagged projections that were mentioned).

Another thing that was hinted at a bit in the "Read-only collections vs immutable collections" section: MutableList will have essentially the same methods as a var ImmutableList. It would be nice if those could be combined, for example if there was a way for a value class/interface to implement interfaces only when mutable (using its mutating methods). It would prevent having to create duplicates of any mutating functions for MutableList and mutating ImmutableList, etc. This probably falls under the tagged projection proposal as well though.

@edrd-f

edrd-f commented Feb 11, 2021

@elizarov, first of all, thanks for taking the time to reply to our feedback.

About these statements:

It is wrong to design a language in such a way that performing a safe and mostly harmless operation (like creating a new immutable value) requires more boilerplate than performing a more dangerous operation (like updating a field in a mutable instance) that requires a lot of attention and forethought from a programmer.

Both @altavir and @edrd-f propose to distinguish (syntactically or contextually) these two kinds of mutations. However, taking into account the key fact, it means that we have to make (dangerous) mutations of mutable classes more explicit, not vice versa. The safer operation should not be more verbose than a more dangerous one.

I agree when looking from an idealistic perspective; however, from a realistic perspective, I don't think it will be practical to have more warnings/boilerplate for mutability while switching the default to immutability, considering Kotlin has to interoperate with Java and JavaScript, and these use mutability heavily. It's the same story as flexible types: ideally, everything would be nullable when dealing with platform types, but that would lead to an insane amount of null-handling code and noisy type declarations, so there's a pragmatic compromise of safety for interop conciseness. So this:

[...] as we improve support for immutable classes in the language (make it less burdensome to use immutable values) we can start adding mechanisms to require immutability in various contexts (like asynchronous data pipelines) or at least warn on attempts to use mutable data there. Now if you accidentally use a mutable class where you should have been using an immutable one then you'll get, at least, warned.

Has two potential issues: worse interoperability and more IDE dependency to know what's going on (which is bad for code reviews). The alternative is to drop the warnings and keep the idea of same syntax for mutable and immutable assignments, however we're then forced to look at the class declarations to know what's safe to mutate and what's not.

The alternative @altavir and I proposed solves the ambiguity problem and I'm sure there are ways of designing something pretty concise so that the additional syntax shouldn't be an issue.

Lastly, I'd like to reinforce what @altavir said about an easier mental model. Kotlin already has properties instead of getters/setters, which some complain "hide behaviors". It has delegated properties. It has dispatch and extension receivers. It's now going to have multiple receivers...
These are some examples of features that have implicit behaviors and require some experience to know exactly what's going on, but the steeper learning curve is justified by how useful they are in almost every domain. Now, given the issues I described, I'm not sure the benefits of the proposed immutability facilities are widely applicable enough to justify the extra implicitness they create, and that's the reason why I suggested they be provided as a compiler plugin.

@altavir

altavir commented Feb 11, 2021

@elizarov @edrd-f Let me reiterate since I've skipped some details that were present in the chat discussion.
The initial proposal by Roman is to do the following:

value class B(var c: Int, var d: Int)
value class A(var b: B)

var a = A(..)

a.b.c += 1 //returns Unit, but the state of a is changed
a.b.d -= 1 //second rewrite

My thought (not originally mine, I am translating the result of the discussion) is to do

a.mutate{ //or mutate(a){ which is probably even better
  b.c += 1
  b.d -= 1
} //both changes are done atomically here and the value

As you see the syntax is exactly the same but is allowed only in a specific colored (see @ilmirus PR) scope. This syntax could actually co-exist with the first one to allow atomic changes and could be used for a restrict-first-relax-later introduction (only scoped changes at first, but also non-scoped later).

The scoped change also has the benefit of always explicitly knowing, which variable is actually a root of a change. In a simple lens assignment, you can't know if the variable is the one that is being changed or an intermediate step.
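
A rough sketch of how such a scoped mutation could be emulated in today's Kotlin, assuming data classes as stand-ins for value classes and hand-written mutable "drafts". ADraft, BDraft, and this mutate helper are hypothetical names; nothing here is part of the proposal itself.

```kotlin
// Immutable values (stand-ins for `value class A` and `value class B`).
data class B(val c: Int, val d: Int)
data class A(val b: B)

// Hypothetical mutable drafts that exist only inside the mutate scope.
class BDraft(var c: Int, var d: Int)
class ADraft(a: A) {
    val b = BDraft(a.b.c, a.b.d)
    fun build(): A = A(B(b.c, b.d))
}

// Hypothetical helper: collects all changes on a draft, then builds ONE new value.
fun mutate(a: A, block: ADraft.() -> Unit): A = ADraft(a).apply(block).build()

fun main() {
    var a = A(B(c = 1, d = 1))
    val before = a
    a = mutate(a) {
        b.c += 1
        b.d -= 1
    }
    println(before) // A(b=B(c=1, d=1))
    println(a)      // A(b=B(c=2, d=0))
}
```

In this emulation the root of the change is always the argument of mutate, and the intermediate states are never visible as separate values.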

@quickstep24

Did you mean to write value class B(var c: Int, var d: Int)?

@altavir

altavir commented Feb 11, 2021

Yes. It won't compile otherwise

@fvasco

fvasco commented Feb 12, 2021

In the section Abstracting mutation into functions

A type has to be mentioned twice: as a receiver type and as a return type.

It is not clear what the return type of the copy method is. Why should state.copy(...) not return a State?

The intent of writing a function that returns an updated receiver is not immediately clear

Yes, I agree.
But a few lines above:

Sometimes a convention of naming such functions as withXxx may be used, but we don’t follow it here.

Why should a language feature cover for a deliberately wrong choice by the programmer?

Finally, that section does not specify how the following code should work:

val state = State(...)
state.updateNow()

I don't find enough motivation for a new feature; moreover, we should examine Java interoperability more closely.

final var state = new State(...);
state.updateNow(); // or StateKt.updateNow(state);

This is valid but misleading Java code.

@fvasco
Copy link

fvasco commented Feb 17, 2021

To help the developer use immutable objects (see also my post above), we can introduce a new operator to "invoke and assign", very similar to "plus and assign" (with similar pros and cons).
Here is an example; I call the operator .=.

Here we play with an immutable list.

var list = listOf(1, 2, 3)
list += 4
list .= filter { it % 2 == 0 }

The example in the previous post becomes:

var state = State(...)
state .= updateNow()

This operator is more explicit because it is read at the call site and there is no magic under the hood, and it is more flexible because it can be used with existing types.

var t = n.seconds
t .= absoluteValue
t /= 42

or

myDataClass .= copy( value = "abc" )
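
For comparison, here is a minimal sketch of what the proposed operator would stand for in today's Kotlin, assuming `x .= f(...)` simply means `x = x.f(...)`. The `.=` operator itself does not exist; the helper name `appendThenKeepEven` is made up for illustration.

```kotlin
// Sketch only: `.=` is a proposed operator, so it is spelled out by hand here.
fun appendThenKeepEven(start: List<Int>): List<Int> {
    var list = start
    list += 4                           // plus-and-assign: list = list + 4
    list = list.filter { it % 2 == 0 }  // proposed spelling: list .= filter { it % 2 == 0 }
    return list
}

fun main() {
    println(appendThenKeepEven(listOf(1, 2, 3))) // [2, 4]
}
```

In both assignments the variable is rebound to a new list, and the original list is left untouched.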

@fvasco
Copy link

fvasco commented Feb 17, 2021

Regarding the section Var this as a ref (inout) parameter in disguise, here I try to enhance delegation to achieve the same goal.

Honestly, I am not really a fan of this proposal; I don't see this feature as clearly useful.

My first idea is to use & to access the delegated instance, i.e.:

val value by lazy { ... }
....
if( &value.isInitialized() ) { ... }

My second idea is to allow delegation in function's arguments, i.e.:

fun <T> f(value by Lazy<T>) {
    if ( ... ) println(value)
}

Using these building blocks we can write code as:

// use a `Closeable` and close if it has been initialized
fun <T : Closeable, R> Lazy<T>.useLazy(block: (Lazy<T>) -> R): R =
    try {
        block(this)
    } finally {
        if (isInitialized()) value.close()
    }


val largeBuffer by lazy { allocateBuffer(10_000) }
&largeBuffer.useLazy { buf ->
    if (debug) dump(buf)
}

What does this have to do with inout?

Now it is possible to provide ref variables as a library.

interface Ref<T> { var obj: T }
operator fun <T> Ref<T>.getValue(thisRef: Any?, property: KProperty<*>) = this.obj
operator fun <T> Ref<T>.setValue(thisRef: Any?, property: KProperty<*>, value: T) { obj = value }

fun <T> ref(value: T): Ref<T> = object : Ref<T> { override var obj: T = value }

fun main() {
    var a by ref(1)
    var b by ref(2)
    swap(&a, &b)
    println("$a, $b") // 2, 1
}

fun <T> swap(a by Ref<T>, b by Ref<T>) {
    a = b.also { b = a }
}

This implementation is a little more verbose (explicit) than the native one.
Finally, this implementation is more flexible, developers can implement Ref using a @Volatile var or a StateFlow.
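
As a point of reference, the same idea can be approximated in today's Kotlin by passing the `Ref` holders explicitly. This is only a sketch under that assumption: the `&a` syntax and `by` parameters above are proposals, so the delegate objects (`aRef`, `bRef`, names chosen for illustration) are kept in separate variables here.

```kotlin
import kotlin.reflect.KProperty

interface Ref<T> { var obj: T }

// Delegate operators so that `var a by someRef` reads and writes someRef.obj.
operator fun <T> Ref<T>.getValue(thisRef: Any?, property: KProperty<*>): T = obj
operator fun <T> Ref<T>.setValue(thisRef: Any?, property: KProperty<*>, value: T) { obj = value }

fun <T> ref(value: T): Ref<T> = object : Ref<T> { override var obj: T = value }

// Without `by` parameters, swap works on the holders directly.
fun <T> swap(a: Ref<T>, b: Ref<T>) {
    val tmp = a.obj
    a.obj = b.obj
    b.obj = tmp
}

fun main() {
    val aRef = ref(1)
    val bRef = ref(2)
    var a by aRef
    var b by bRef
    swap(aRef, bRef)   // stands in for the proposed swap(&a, &b)
    println("$a, $b")  // 2, 1
    a += 10            // writes through to aRef.obj
    b += 10            // writes through to bRef.obj
    println("${aRef.obj}, ${bRef.obj}") // 12, 11
}
```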

@elizarov
Copy link
Contributor Author

@fvasco To help the developer use immutable objects (see also my post above), we can introduce a new operator to "invoke and assign", very similar to "plus and assign" (with similar pros and cons).

This would be better discussed under https://youtrack.jetbrains.com/issue/KT-44585 (it is listed there as one of the possible syntactic options).

@andersio
Copy link

andersio commented Mar 9, 2021

Speaking from my app and library development experience with Swift, the ambiguity as discussed is perhaps an unfortunate projection of the underlying implementation details of this language feature. Maybe it is caused by the close analogy to data class and its copy().

In my opinion, forcing mutations into a block scope does make the intent to mutate stand out. But it is never going to tackle the "ambiguity" at its root, because the "ambiguity" starts before any mutation can happen: value classes effectively play by the rules of value semantics.

Conceptually, var obj2 = obj1 is where the ambiguity semantically begins. If we use the analogy of a multiverse, every assignment of a value class instance to a new variable signals the need to branch off a new alternative universe to the original instance, and the spin-offs must be independent from the original timeline from that point of branching off.

If we explain it in terms of value semantics, every rvalue is copy-assigned to the lvalue, notwithstanding the compiler potentially optimizing it away.

This is why I find it slightly unfortunate that the document does not build on concepts like reference semantics vs. value semantics, specifically for explaining changes in the language spec to language users. Instead, it seems to be more geared towards implementors, with a wealth of details on implementation approaches.

More specifically, while the document seems to indicate that the initial implementation uses a copy-on-write approach, IMO it can still work well with value semantics + copy-assign being the conceptual model for value classes in the language. As far as I can understand, the procedural outcome should be the same: copy-on-write can be treated as an optimization of copy-assign, provided that there is no intervening optimisation (like rewriting copy-updates into in-place mutations).

I do understand that there might be a desire to set it apart from C/C++/Swift structs, but adopting these established and distinctive concepts as a tool to better contextualize the new feature need not imply optimization constraints.

For example, the document argues that:

A number of languages, like C++, C#, Rust, Go, Swift, and others have a concept of a struct.
[...]
The conceptual model behind value classes is different. A developer’s intent of writing a value class is to explicitly disavow its identity and to declare, upfront, its immutability. This declaration enables compiler optimization such as passing those values directly, but it does not guarantee them.

When we look at Swift, its language specification indeed does establish that:

  1. value types are passed by copy and copied on assignment;
  2. inout references have copy-in, copy-out semantics; and
  3. it also has ABI stability with strong guarantees around flattened memory layout.

While all these might seem to be agreeing with the presented counter argument, the Swift compiler does defy it — many compiler optimizations deviating from these semantics have been implemented and shipped anyways, as long as they did not change the procedural outcome or break the invariants. A couple of examples:

  1. automatically passing immutable small values to function calls in registers;
  2. automatically passing immutable large values to function calls on stack by reference; and
  3. inout reference being a pointer to the backing field memory, as long as the compiler knows it is safe to go direct (e.g. no getter/setter vtable dispatch needed).

So... hmm, what I want to get to is that this ambiguity feels more like a teaching/explaining issue than a trap warranting an in-language solution. I do think bringing in ref vs value semantics, as the new basis to explain normal classes vs value classes, could be helpful in this regard, rather than the current somewhat “a special kind of class” narrative.

After all, AFAIK, no existing declarations except inline classes will be subsumed/auto-converted to value classes. Using a new type of declaration implies the need to learn new semantics. Sounds pretty fair to me.

@elizarov
Copy link
Contributor Author

elizarov commented Apr 9, 2021

UPDATE: Taking into account use-cases where a value class is used as an immutable handle into a mutable data structure and the need to gradually teach developers new features where properties of immutable classes are somehow updated, we've decided to give an explicit name to the corresponding concept -- a "copyable property", and introduce a corresponding copy modifier both for "copyable properties" (copy var) and for copying functions (copy fun).

The text was updated to consistently use the corresponding terminology and the copy modifier. Motivation for the change was added to the text, too.

The restriction on the use of var properties for value classes is removed.

See #247

@zhelenskiy
Copy link
Contributor

zhelenskiy commented Apr 11, 2021

@elizarov

UPDATE: Taking into account use-cases where a value class is used as an immutable handle into a mutable data structure and the need to gradually teach developers new features where properties of immutable classes are somehow updated, we've decided to give an explicit name to the corresponding concept -- a "copyable property", and introduce a corresponding copy modifier both for "copyable properties" (copy var) and for copying functions (copy fun).

The text was updated to consistently use the corresponding terminology and the copy modifier. Motivation for the change was added to the text, too.

The restriction on the use of var properties for value classes is removed.

See #247

According to the last commit date, nothing was changed around the date of your message
image

@elizarov
Copy link
Contributor Author

@zhelenskiy

See #247

According to the last commit date, nothing was changed around the date of your message
image

It is in a separate PR #247, to be merged after review.

@DWVoid
Copy link

DWVoid commented May 6, 2021

I would not be able to offer insights into how advanced features are designed, but I can provide a very basic yet quite desperate use case that we are currently facing.

Background

I am in the process of writing a game server in my free time. In this server, we need to very frequently perform a wide variety of calculations on three-dimensional vectors, ranging from thousands to around a million computations per second.

Current Implementation Method

Naturally, we would want to pack it into an immutable class like this (computation functions omitted):

data class Int3(val x: Int, val y: Int, val z: Int)

But this came with a very large performance cost. JVMs are not able to do proper escape analysis on this, and very simple computations could result in expensive object allocations. So to make this perform better, I have to do the following:

class MutableInt3(var x: Int, var y: Int, var z: Int)

In this class, most computations mutate the receiver. In intensive computations, this gave me a 5x speed-up over the previous version, but the performance is still not ideal, as there is still an additional indirection on every access, and the code is now very error-prone. However, tens of thousands of such objects still fly around between functions every game tick, so I had to introduce a third approach, which is to manually write out each field wherever performance is needed, like this:

var x: Int = init 
var y: Int = init
var z: Int = init

This usually gave me a further 10x speed-up in computation-intensive regions. Note that val is not used here, as it would otherwise be extremely cumbersome. Also, I now have to manually inline the computation functions to avoid object allocations for tuple returns, which bloats the code significantly and hurts readability and maintainability, since there are now a couple dozen implementations of the same code scattered around the code base.
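
To make the three styles concrete, here they are side by side in a minimal sketch. The accumulation loop and the function names (`sumImmutable`, `sumMutable`, `sumFlattened`) are made up for illustration, and the snippet only shows the shape of the code; it is not a benchmark and makes no claim about the speed-ups mentioned above.

```kotlin
// 1. Immutable: every operation returns a new instance.
data class Int3(val x: Int, val y: Int, val z: Int) {
    operator fun plus(o: Int3) = Int3(x + o.x, y + o.y, z + o.z)
}

// 2. Mutable: operations update the receiver in place.
class MutableInt3(var x: Int, var y: Int, var z: Int) {
    fun add(o: MutableInt3): MutableInt3 { x += o.x; y += o.y; z += o.z; return this }
}

fun sumImmutable(n: Int): Int {
    var acc = Int3(0, 0, 0)
    val step = Int3(1, 2, 3)
    repeat(n) { acc += step }   // acc = acc + step, a fresh Int3 each time
    return acc.x + acc.y + acc.z
}

fun sumMutable(n: Int): Int {
    val acc = MutableInt3(0, 0, 0)
    val step = MutableInt3(1, 2, 3)
    repeat(n) { acc.add(step) } // same object mutated each time
    return acc.x + acc.y + acc.z
}

// 3. Manually flattened: the vector is spread into three local variables.
fun sumFlattened(n: Int): Int {
    var x = 0
    var y = 0
    var z = 0
    repeat(n) { x += 1; y += 2; z += 3 }
    return x + y + z
}

fun main() {
    println(sumImmutable(1000)) // 6000
    println(sumMutable(1000))   // 6000
    println(sumFlattened(1000)) // 6000
}
```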

Conclusion

I desperately desire a simple syntax that does compile-time field-inlining for data-class-like structures. Even just simply spreading the fields into the parent scope or function signature whenever possible is better than the current situation of having absolutely nothing. I believe that this could encourage people to box their code in a reusable manner without worrying about absolutely tanked performance.

Additional Notes

I would prefer that the fields in such a construct stay immutable. From my personal experience writing C# and C++, mutable value-class structures actually cause confusion when parallelizing algorithms and working with async computation: under such circumstances it is non-trivial to be sure, based on the structure's signature alone, that any passed-in reference to a value object is safe to use and mutate, and even if we are sure, we still have to rely on pure faith that no other programmer will break the current thread-safety model.

@joffrey-bion
Copy link

joffrey-bion commented May 6, 2021

I think however that this behavior is too surprising: var c: Int // this is the same as 'copy var'

@CLOVIS-AI I agree that it would be less surprising to plainly forbid var without copy in value classes' constructors.
I don't have a strong opinion about dropping the var, though.

This KEEP considers copy fun also, so should we propose a similar operator for functions?
myWallet.balance.draw(42.eur) // what syntax for this line?

@fvasco Actually that's a good point, I'm not sure how we could do that in a nice way.
Anyway, I'm starting to reconsider my point here, based on what @rnett wrote.

Also, consider what would happen if a.b was a normal var but with a type that is a value class. You would have to use :=, but you would be doing standard mutation on a with all the issues that entails.

@rnett This is indeed a good point, I guess mixing regular vars with copy vars will be confusing with either syntax.

@pdvrieze
Copy link

pdvrieze commented Jun 8, 2021

Overall I find the ideas introduced in the keep quite interesting and worthwhile (especially also the builder constructors). On "withers" (awful name - hard to teach unless it involves soul sand and skeleton heads), I think they are useful in working with immutable types (whether value or not).

Using copy makes it explicit, so I know what's going on without having to look at the class definition. Sure, the IDE could give some hints, but we also spend a lot of time reviewing code on platforms such as GitHub, where there's no intelligent code navigation for Kotlin.

In line with the comment above, I think it is important to somehow define a syntax for semantically doing symbol reuse (if there is a reference floating about, it now no longer refers to that object). While copy-on-write semantics are beneficial in some cases, they can still be confusing, and it should be possible to understand the consuming code without knowing that the types involved are immutable/use withers. The usage of := as copying assignment seems to make sense (since it is already an assignment operator in other languages), although I'm not sure what to do with members; perhaps a different member access operator would be better: a.=b.=c="newValue".

A copy function (explicit writing) gets all kinds of issues with nesting and verbosity, I think an operator would be better.

I would expect that in case of value classes, the compiler would optimize as appropriate (if it can elide the wrapper it should also be able to elide copies when setting member properties).

@CLOVIS-AI
Copy link

It is very interesting to compare the behavior of already-existing "value classes" (eg. Int) and the proposal.

We are used to the fact that operators behave very differently on Int:

var a = 5

// This replaces 'a' by another value
// 'a' must be var
a++

compared to normal classes:

val b = Foo()

// This calls 'inc()' on 'b'
// 'b' does not need to be var
b++

Another difference is how it behaves with other variables:

var a1 = 5
var a2 = a1
a1++ // a2 IS NOT modified

var b1 = Foo()
var b2 = b1
b1++ // b2 IS modified, because of references

In my experience, that difference is well-understood by everyone, and although it is a bit strange (unlike Java, it's not immediately clear what is a reference or isn't, especially for beginners), that difference is pretty much standard in all modern programming languages.

String already behaves like this. To my knowledge, it is the only type other than Java primitives that requires being a var to use plusAssign. Because the IDE has pretty good error messages, I don't think it is an issue. Extending that behavior to value classes is probably not a problem for readability, as long as they are used exactly for what they're good for.

I'm currently working on a project with Kotlin React, where deeply immutable data structures are common (almost all the UI state), because React requires that the top-level reference be edited to detect a state modification. This leads to code looking like this:

props.update(someData.copy(field = someData.field.copy(…)))

Replacing these data classes by value classes could bring something like:

props.update(someData.field = …)

I believe that, even if it uses the same syntax as mutable modification, and other variables/references are not updated automatically, this is much more readable and can be learned really easily.
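
For reference, the nested-copy pattern that exists today can be run as follows. This is a minimal sketch with made-up types (`SomeData`, `Field`) standing in for the real UI state; it shows that the update produces a new top-level value while other references keep seeing the old one.

```kotlin
// Hypothetical state types, for illustration only.
data class Field(val label: String, val count: Int)
data class SomeData(val id: Int, val field: Field)

fun main() {
    val someData = SomeData(id = 1, field = Field(label = "clicks", count = 0))
    val alias = someData

    // Today's spelling of a deep update on immutable data.
    val updated = someData.copy(field = someData.field.copy(count = 1))

    println(updated.field.count)        // 1
    println(alias.field.count)          // 0
    println(updated.id == someData.id)  // true
}
```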

@CLOVIS-AI
Copy link

CLOVIS-AI commented Aug 13, 2021

After re-reading the proposal, I am in favor of cov instead of copy var. Maintaining the aesthetic of the three-letters keyword is important enough in my mind to offset the "not perfect" name. Also, it doesn't look like a modifier (public var is still a var, it doesn't do any major semantic change to how it behaves, copy var is a completely different thing).

Using cov, we maintain the mental model that "the last keyword on the declaration line is the one that explains what it really is":
public open val a = 5
When reading the code, in most cases it is possible to ignore all keywords but the last one, and I believe most of us do this unconsciously.

Although cov is not a perfect name, I think it is still the best option. It is also fairly easy to learn: "it stands for copy var, because you use it like var but it behaves like copy".

However, using cov removes the symmetry of copy var & copy fun.

If using the same keyword for both is important, I think this is fine:

value class Foo() {
  cov a = 5
  cov fun foo() {
    a = 6
  }
}

Here, cov would be read as "this is a cov".

From my experience working with ADTs as sealed class + data class, the concept of value interface / value abstract class / value sealed class / value sealed interface is very important. It is too common to write code like:

sealed class R {
  abstract val a: Int
}

data class A(override val a: Int) : R()
data class B(override val a: Int, …) : R()
data class C(override val a: Int, …) : R()

// This is necessary to be able to work with sealed hierarchies of immutable state, currently
fun R.copyA(newA: Int): R = when (this) {
  is A -> copy(a = newA)
  is B -> copy(a = newA)
  is C -> copy(a = newA)
}

// Usage (example using Compose):
@Composable
fun Component(foo: R, update: (R) -> Unit) {
  Text(foo.a)
  Button("Update", onClick = { update(foo.copyA(10)) })
}

This is necessary, because the copy method is compiler-generated, and there is no way to move it higher in the hierarchy.

Instead, something like this could be used:

sealed value class R {
  cov a: Int  // or 'copy var'
}

value class A(override cov a: Int) : R()
value class B(override cov a: Int, …) : R()
value class C(override cov a: Int, …) : R()

// No need for any boilerplate copy function on the sealed class

// Usage
@Composable
fun Component(foo: R, update: (R) -> Unit) {
  Text(foo.a)
  Button("Update", onClick = { update(foo.a = 10) })
}

Indeed, because value classes don't require all their constructor parameters to be attributes (unlike data class), this can be shortened to:

sealed value class R {
  cov a: Int
}

value class A() : R()
value class B(…) : R()
value class C(…) : R()

@joffrey-bion
Copy link

joffrey-bion commented Aug 13, 2021

@CLOVIS-AI While I agree we should consider Int when thinking about this, your examples don't seem to be valid here.

'b' does not need to be var

Yes it does. Same as for the Int variable. You get the compile error Val cannot be reassigned if you declare val b = Foo(), and try to do b++.

Your argument does hold for String and plusAssign, though.

b2 IS modified, because of references

b1 is actually re-assigned with the return value of inc(), and b2's reference doesn't change. The instance b2 points to might be mutated if inc() is implemented incorrectly, that's right, but it's independent of this operator's mechanism. Any other method call could have the same effect.

See this playground example of your code.
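
In case the playground link is not available, here is a minimal sketch of the same behaviour. The `Counter` class is made up for illustration, with `inc()` returning a new instance as the operator convention expects.

```kotlin
class Counter(val n: Int) {
    // inc() returns a new instance and does not mutate the receiver.
    operator fun inc(): Counter = Counter(n + 1)
}

fun main() {
    var b1 = Counter(0)  // must be var: declaring it as val makes b1++ fail with "Val cannot be reassigned"
    val b2 = b1
    b1++                 // b1 is reassigned to the result of b1.inc()
    println(b1.n)        // 1
    println(b2.n)        // 0
    println(b1 === b2)   // false
}
```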

@CLOVIS-AI
Copy link

@joffrey-bion Ah, sorry. It seems I confused ++ (which assigns the return value, and doesn't mutate the object) with += (which mutates the object, without using the return value).

But then I guess it's even more of a proof that this behavior (assign a copy, without mutating the original object, therefore without impacting other references) is no stranger to Kotlin, since that's how ++ behaves in all cases. If I've been able to read code without understanding the difference, I would not be surprised if reading code with value classes & deep copy weren't an issue.

@joffrey-bion
Copy link

@CLOVIS-AI I don't believe ++ supports the point you are making here. To me the "reassigning" behaviour is part of the definition of ++, which is perfectly OK because it's consistent. No matter the type of the variable, it always requires a var and will reassign that var.

What does support your point is the inconsistency in the behaviour of += (primitives and strings VS other objects). And I do find it strange when I think about it, but I have to admit I never had second thoughts so far when I used it on strings. That being said, I almost never use +=, and particularly not on strings.

This inconsistent behaviour is why I was initially suggesting using a distinct operator like :=, for which the definition would be that some variables along the a.b.c path are reassigned, and it would be consistent.
My concern with the proposed syntax is that the = operator would behave differently depending on the type, e.g. a.b = x either changes a or doesn't depending on a's type.
But as pointed out by others, a different operator also has its flaws: no equivalent for copy fun calls, and unclear behaviour for mixed chains of mutable and immutable objects (e.g. a.b.c.d = x where some of these fields are mutable and others are copy vars). So now I'm not convinced either way 😅

@mertceyhan
Copy link

mertceyhan commented Nov 1, 2022

Hey! I was trying to access a value class from Java but the code doesn't compile. Is this an interoperability issue or am I missing something? Kotlin version is 1.7.20

Screenshot 2022-11-01 at 20 29 51

@janvladimirmostert
Copy link

janvladimirmostert commented Nov 1, 2022 via email

@benibela
Copy link

benibela commented Mar 4, 2023

I was trying to use a value class as field, but it does not work:


@JvmInline
value class ValueClass(val i: Int)

class Test {
   @JvmField var v: ValueClass = ValueClass(123)
}

I think it should create the class with an Int-field.

In my project I access the classes with JNI, and it is easier to work with fields than methods in JNI.

@elizarov
Copy link
Contributor Author

elizarov commented Mar 6, 2023

benibela I was trying to use a value class as field, but it does not work.

KT-57130 Inline classes: JvmField support

@arkivanov
Copy link

I've got some questions about copying value classes, in comparison to data classes. I would be happy if someone could answer them and/or take them into account.

1. Reducer-style functions

There is a use case for copying data classes in a reducer-like style:

fun reduce(state: State, msg: Msg): State

Or in some cases I've seen the following pattern (something related to TEA architecture):

fun reduce(state: State, msg: Msg): Pair<State, Cmd>

The proposed solution with value classes is to use functions like copy fun State.update(). While this works in general, I find that the copying reducer style has one important feature. It allows two ways of changing the state, either copy the current one with some properties changed, or create a brand new instance of state with some properties initialized to non-default values. For example:

fun reduce(state: State, msg: Msg): State =
    when (msg) {
        is Msg.Foo -> state.copy(a = 1, b = 2)
        is Msg.Bar -> State(c = 3) // Reset the state with some properties changed
    }

For me it would be pretty important to have this supported with copyable value classes.
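
To spell out the two behaviours with today's data classes, here is a small runnable sketch. The shapes of `State` and `Msg` are assumptions made for illustration (three Int properties defaulting to 0, two message types); the point is that the Foo branch keeps the untouched properties, while the Bar branch resets them to their defaults.

```kotlin
// Hypothetical definitions, for illustration only.
data class State(val a: Int = 0, val b: Int = 0, val c: Int = 0)

sealed interface Msg {
    object Foo : Msg
    object Bar : Msg
}

fun reduce(state: State, msg: Msg): State =
    when (msg) {
        is Msg.Foo -> state.copy(a = 1, b = 2) // keeps the current value of c
        is Msg.Bar -> State(c = 3)             // a and b fall back to their defaults
    }

fun main() {
    val current = State(a = 7, b = 8, c = 9)
    println(reduce(current, Msg.Foo)) // State(a=1, b=2, c=9)
    println(reduce(current, Msg.Bar)) // State(a=0, b=0, c=3)
}
```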

2. Amount of allocations

I couldn't find this in the design notes, so asking here.

When we copy a data class, there is only one allocation produced.

var state = State()
state = state.copy(a = 1, b = 2, c = 3)

With value classes this is going to look like this:

state.a = 1
state.b = 2
state.c = 3

How many allocations will the second variant produce? This may be especially important when deep-copying the state.

3. State consistency

When we copy and re-assign a data class, the variable either points to the old object or a new one. The actual data is always consistent. Moreover, when we use fun reduce(State, Msg): State style, the function is free to copy the state as many times as it needs to, but the returned state is always consistent, and the state variable never points to something inconsistent.

If the state needs to be shared between threads, it can be marked as volatile or placed in AtomicReference. Every thread will see only consistent states. If the state can only be updated on the main thread, but multiple threads can read it, I can write the following.

class Store {
    @Volatile
    var state = State()
        private set

    @MainThread
    fun updateState() {
        state = state.copy(a = 1, b = 2, c = 3)
    }
}

Or if the state can be updated concurrently, I can write the following:

class Store {
    private val _state = AtomicReference(State())
    val state: State get() = _state.get()

    fun updateState() {
        _state.update { it.copy(a = 1, b = 2, c = 3) }
    }
}

  1. Will the copy fun variant have this feature as well? What would it look like?
  2. When exposing a value class via a reactive stream (Observable/Flow), will the consumer only receive the final and consistent state? What would it look like?

4. Generic types

Currently, one may define a generic Store for managing screen states (aka MVI/MVU), that accepts a reducer function for changing the state.

class Store<State, Msg>(
    private val initialState: State,
    private val reducer: (State, Msg) -> State,
) {
    // ...
}

The Store is agnostic to the actual state type, it can be a data class, a normal class, enum, etc.

It looks like, with the introduction of copyable value classes, the developer will have to choose: either use val reducer: (State, Msg) -> State (which is inconvenient for value classes), or use something like val reducer: copy State.(Msg) -> Unit (which only supports value classes).

  1. Will the copy fun approach support generic types and function types for classes?
  2. How can one make it totally agnostic to the actual type (so it works with data classes, normal classes, enums and value classes), yet without boilerplate for value classes (temporary variables)?

@elizarov
Copy link
Contributor Author

elizarov commented Jul 12, 2023

arkivanov 1. Reducer-style functions

Reducer-style functions can be easily rewritten with the help of copying scope functions. The one you'll need here is the copy-retrofitted version of apply, so this function:

fun reduce(state: State, msg: Msg): State =
    when (msg) {
        is Msg.Foo -> state.copy(a = 1, b = 2)
        is Msg.Bar -> State(c = 3) // Reset the state with some properties changed
    }

becomes:

fun reduce(state: State, msg: Msg): State = state.apply { 
    when (msg) {
        is Msg.Foo -> { a = 1; b = 2 }
        is Msg.Bar -> { c = 3 }
    }
}

However, as you'll use copy functions throughout your frameworks, the whole approach to defining reducer-style functions will change. You'll declare them as copy functions on the state instead of using apply to convert from the reducer-style world into the copy-style world:

copy fun State.reduce(msg: Msg) = 
    when (msg) {
        is Msg.Foo -> { a = 1; b = 2 }
        is Msg.Bar -> { c = 3 }
    }

Note that this copy fun State.reduce(msg: Msg) will actually have the same JVM ABI as the original fun reduce(state: State, msg: Msg): State, so it is a binary compatible change for a JVM library.

2. Amount of allocations

The number of allocations will depend on the backend implementation details. With a Valhalla-capable JVM or on a Native backend, for example, updates to value classes will not allocate at all. But even on a pre-Valhalla JVM, in many real-world scenarios it will be possible to optimize state.a = 1; state.b = 2; state.c = 3 back into a single state = state.copy(a = 1, b = 2, c = 3) call with a single allocation during compilation.

3. State consistency

You can create a copy-friendly version of AtomicReference.update extension function combining apply and the original update (so let's call it applyUpdate) and use it like this:

class Store {
    private val _state = AtomicReference(State())
    val state: State get() = _state.get()

    fun updateState() {
        _state.applyUpdate { a = 1; b = 2; c = 3 }
    }
}

4. Generic types

It looks like, with the introduction of copyable value classes, the developer will have to choose: either use val reducer: (State, Msg) -> State (which is inconvenient for value classes), or use something like val reducer: copy State.(Msg) -> Unit (which only supports value classes).

Yes. You'll have to choose between the two signatures for your reducer. But they have the same actual ABI and it will be easy to convert between them using scope functions. As I've noted above, I expect that the second version will become more popular over time as value classes and copy functions become widespread.

@Peanuuutz
Copy link

Peanuuutz commented Sep 22, 2023

I'm a little concerned about the current proposal on name-based construction, because it can definitely get wild and messy when a class mixes properties in the primary constructor, uninitialized properties in the body, and calculated properties initialized in the init block. At the end of the day we can't quite figure out which parameters are required. Just take a look at this (pretty organized) example:

class Game {
    val id: Int
    val name: String
    var version: String
    var vendor: Vendor
    // ... more uninitialized properties ...

    var repr: String
    // ... more calculated properties ...
    
    init {
        repr = /* ... */
    }
    
    class Vendor(
        val id: Int,
        val name: String
        // ... more (maybe even regular) parameters ...
    ) {
        val repr: String
        // ... more calculated properties ...

        init {
            repr = /* ... */
        }
    }
}

val game = Game {
    // What do you expect?
}

It seems too implicit and casual. I do think it's important to limit this feature to just data classes and value classes, because they have a stable place where the core information is gathered, namely the primary constructor.

However, this restriction could seem tedious too. First, there are data-like classes which require regular parameters in the primary constructor. Second, it's quite common to have more stuff than the default behavior, like to add parameter checks and transformations.

By the way, it seems that a builder object is required with the current proposal. For normal classes, this is fine. But what about value classes, a more lightweight class model? Is it appropriate to allocate an object anyway? Probably not. We need a more performant way to achieve this, at least for value classes, especially since it is entirely feasible to flatten multi-field value classes into local variables or function parameters. Please, do not count on the JIT all the time.

All in all, we need to design this feature with three principles in mind:

  1. Stable source of information.
  2. Extensibility.
  3. Potential performance gain.

I'd like to share a design for this entire system.

  1. Builders are classes that have builder modifier.

    Take Vendor for example. The corresponding builder should look like this:

    class Vendor(/* ... */) {
        builder class Builder {
            var id: Int
            var name: String
            // ... more uninitialized properties copied from the primary constructor ...
        }
    }

    Builders can also be value based:

    class Vendor(/* ... */) {
        builder value class Builder {
            copy var id: Int
            copy var name: String
            // ...
        }
    }
  2. Builders are orphan types: they have no superclass (other than Any) and no subclasses, but they may implement interfaces.

    Class hierarchies can violate the first principle. Value classes are orphan types on their own, too.

  3. Builders can only have primary constructor.

    This prevents different constructors from assigning different default values, which would violate the first principle.

  4. The primary constructor of a builder cannot have properties.

    Otherwise it returns to the original issue.

  5. Builders can have uninitialized properties.

    This is intended; the reason for not simply plugging this feature into regular classes is shown at the beginning.

  6. Builders cannot have lateinit properties.

    Use uninitialized property instead.

  7. Builders can have public/internal/private properties.

  8. All the properties that have lower visibility than the builder type must have initial values or custom setters.

    Uninitialized properties are supposed to be assigned in other places, so they must have higher or equal visibility; for lower ones, they can't be accessed from outside, so they must be initialized.

  9. Builders can have one or more init blocks.

    I can't decide on this one. It may violate the first principle, since an init block can assign any property and thereby make it optional, so you can no longer tell at a glance which properties are required.

  10. Builders can have public/internal/private functions.

  11. If a property or function, whether member or extension, directly or indirectly accesses any uninitialized property, it must be inline.

    The reason is given later (in the ABI part). In short, this is required so that the compiler can properly infer that the builder is definitely initialized. The requirement could easily be lifted once contracts on function/lambda execution become more stable.

  12. Builder lambdas are lambdas that have builder modifier.

    typealias BuilderLambda = builder Builder.() -> Unit
  13. Builder lambdas must have a builder as receiver. Context receivers and lambda parameters can be any.

    The main goal of builder lambdas is to provide builder DSL, so the receiver must be a builder.

  14. Uninitialized properties are mapped to required fields in the builder lambdas; properties with default values and higher or equal visibility than the builder type are mapped to optional fields.

  15. Builder lambdas treat builders as uninitialized, so the content of a builder lambda must assign all the required fields, regardless of whether the receiver is actually initialized. Normal lambdas treat builders as initialized.

    This distinguishes the initialization of builders from their normal usage.

  16. Builders must be initialized, either by direct assignments or by being passed as the receiver of a builder lambda, before they are stored in properties, passed into non-inline functions or lambdas as parameters, captured in noinline or crossinline lambdas, or returned from functions or lambdas. Builders are safe to store in local variables.

    This will make sure builders are properly initialized before normal usage.

  17. Builders can be declared as companion when in final classes and value classes.

    class Vendor(/* ... */) {
        companion builder class
    }
  18. The default name for companion builders is "Builder". To change the name, simply append it.

    class Vendor(/* ... */) {
        companion builder class Factory
    }
  19. Companion builders automatically copy all the parameters from the primary constructor.

    Since secondary constructors need to delegate to the primary constructor, and are very limited because of this (this() must happen first), it's not a goal for companion builders to support secondary constructors.

  20. Companion builders can be written without class.

    class Vendor(/* ... */) {
        companion builder
    }
    
    @JvmInline
    value class Value(/* ... */) {
        companion builder
    }

    When written with shorthand, companion builders become value based automatically if the class is also value based.

    This is compatible with custom name too.

    class Vendor(/* ... */) {
        companion builder Factory
    }
  21. If one does not already exist, a companion builder automatically generates a corresponding build function that simply calls the primary constructor of the class with all the parameters.

    class Vendor(/* ... */) {
        companion builder {
            /*
            fun build(): Vendor {
                return Vendor(
                    // ...
                )
            }
             */
        }
    }

    This will make sure that even if the primary constructor is private, the builder still works.

  22. If they do not already exist, companion builders automatically generate corresponding (inline) factory functions in the same file, at the same level as the class.

    class Vendor(/* ... */) {
        companion builder
    }
    
    /*
    inline fun Vendor(block: builder Vendor.Builder.() -> Unit): Vendor {
        return Vendor.Builder().apply(block).build()
    }
     */

    For nested classes, the factory function will be generated as a static function in the outer class.

    class Game(/* ... */) {
        class Vendor(/* ... */) {
            companion builder
        }
        
        /*
        static {
            inline fun Vendor(block: builder Vendor.Builder.() -> Unit): Vendor {
                return Vendor.Builder().apply(block).build()
            }
        }
         */
    }
  23. If they do not already exist, data classes and value classes will generate a companion builder and a copy function with this system. The old generated copy function will be deprecated and eventually removed.

    data class Vendor(/* ... */) {
        /*
        builder class Copy(vendor: Vendor) {
            // ... properties assigned with default values from vendor ...
            
            fun build(): Vendor {
                // ...
            }
        }
        
        inline fun copy(block: builder Copy.(old: Vendor) -> Unit): Vendor {
            val old = this
            return Copy(old).apply { block(old) }.build()
        }
        
        @Deprecated
        fun copy(
            // ...
        ): Vendor {
            // ...
        }
        
        companion builder
         */
    }
  24. The generated copy builder becomes value based if the companion builder is value based.

    data class Vendor(/* ... */) {
        /*
        builder value class Copy
         */
        
        
        companion builder value class // Forced to be value based
    }

    If users want their builders to be lightweight, the same should hold for copy builders.
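
For comparison, the build function and factory function from items 21 and 22 can be roughly approximated in today's Kotlin. This is only a sketch under stated assumptions: nullable properties stand in for the proposed uninitialized properties, so a missing required field is reported at run time by checkNotNull rather than at compile time as the proposal intends, and Vendor carries just the two fields shown in the example above.

```kotlin
// Hand-written approximation of the proposed `companion builder`.
// Assumption: nullable vars emulate "uninitialized" builder properties.
class Vendor private constructor(val id: Int, val name: String) {
    class Builder {
        var id: Int? = null
        var name: String? = null

        // Nested classes may call the private primary constructor,
        // so the builder keeps working even when the constructor is hidden.
        fun build(): Vendor = Vendor(
            id = checkNotNull(id) { "id is required" },
            name = checkNotNull(name) { "name is required" },
        )
    }
}

// Factory function with the same shape as the one sketched in item 22,
// minus the proposed `builder` modifier on the lambda type.
inline fun Vendor(block: Vendor.Builder.() -> Unit): Vendor =
    Vendor.Builder().apply(block).build()

fun main() {
    val vendor = Vendor {
        id = 1
        name = "Acme"
    }
    println("${vendor.id} ${vendor.name}")
}
```

Unlike the value-based builders proposed above, this version always allocates a Builder object unless the JIT optimizes it away, which is exactly the cost the proposal tries to avoid.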

@Davio

Davio commented Apr 5, 2024

Would it be a good idea to make the value class itself implement Comparable if the wrapped value implements it?

E.g. this code currently does not work:

@JvmInline
value class Wrapper(val value: Int)

mutableListOf(Wrapper(2), Wrapper(1)).sort()

The following code works as a workaround:

@JvmInline
value class Wrapper(val value: Int) : Comparable<Wrapper> {
    override fun compareTo(other: Wrapper): Int = value.compareTo(other.value)
}

mutableListOf(Wrapper(2), Wrapper(1)).sort()

Many developers assume that the value class automatically adopts such traits of the wrapped value and basically delegates implementations to the wrapped value.

But Comparable<T> is a strange one since the T here should be of its own type.
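
Another option that leaves the class untouched is to pass the sort key explicitly. A minimal sketch using the standard library's sortBy, with Wrapper declared as in the first snippet above:

```kotlin
@JvmInline
value class Wrapper(val value: Int)

fun main() {
    val wrappers = mutableListOf(Wrapper(2), Wrapper(1))
    // Sort by the wrapped Int without Wrapper implementing Comparable.
    wrappers.sortBy { it.value }
    println(wrappers)
}
```

This keeps the ordering decision at the call site, at the cost of repeating the selector wherever a sort is needed.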

@mgroth0

mgroth0 commented Apr 5, 2024

@Davio what if I wanted my value class to sort in a different way than the inner values? Example:

@JvmInline
value class Version(val version: String): Comparable<Version> {
    override fun compareTo(other: Version): Int = TODO("compare based on semantic versioning")
}

With your proposal, I'm not sure I would feel safe creating value classes. I might always be worrying about whether I remembered to write my custom compareTo or whether I am using the default.
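
To make the concern concrete, here is a minimal sketch of such a custom compareTo. It assumes versions are plain dot-separated non-negative integers (real semantic versioning also defines pre-release and build metadata, which are ignored here), and the parts helper is a hypothetical name introduced only for illustration.

```kotlin
@JvmInline
value class Version(val version: String) : Comparable<Version> {
    // Assumption: every component is a plain integer; toInt() throws otherwise.
    private fun parts(): List<Int> = version.split('.').map { it.toInt() }

    override fun compareTo(other: Version): Int {
        val a = parts()
        val b = other.parts()
        // Compare component by component, treating missing components as 0.
        for (i in 0 until maxOf(a.size, b.size)) {
            val c = a.getOrElse(i) { 0 }.compareTo(b.getOrElse(i) { 0 })
            if (c != 0) return c
        }
        return 0
    }
}

fun main() {
    println(Version("1.10.0") > Version("1.9.0")) // true: 10 > 9 numerically
    println("1.10.0" > "1.9.0")                   // false: '1' < '9' lexicographically
}
```

The two results differ, so an implicit compareTo delegated to the wrapped String would silently give the wrong order here. Note also that in this sketch Version("1.0") and Version("1.0.0") compare as equal but are not equal under equals.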

@Davio

Davio commented Apr 5, 2024

@mgroth0 you can already override the default working of toString for example, so maybe you should also be able to override the default working of Comparable

@CLOVIS-AI

Would it be a good idea to make the value class itself implement Comparable if the wrapped value implements it?

I disagree. I create value classes when I want to create new values with new behavior. If I wanted to keep the same behavior, I would use a typealias.
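
As a small illustration of that distinction, a typealias introduces no new type, so everything the underlying type can do remains available. UserId is a hypothetical name used only for this sketch.

```kotlin
typealias UserId = Int

fun main() {
    val ids: MutableList<UserId> = mutableListOf(2, 1)
    ids.sort()               // compiles: UserId is Int, and Int is Comparable<Int>
    val plain: Int = ids[0]  // no conversion needed: the alias adds no new type
    println(plain)
}
```

The flip side is that a UserId and any other Int are freely interchangeable, so the alias gives no extra type safety.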

@Davio

Davio commented Apr 5, 2024

@CLOVIS-AI you could also create value classes because you want to enforce some requirements, such as a value class which wraps an Int that must be in some range, or a value class which wraps a String that must have a certain format, etc.

You can't do this with a typealias, so there are use cases for value classes that closely want to simulate their wrapped values, but offer something extra.
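
A minimal sketch of the range use case, using an init block with require. Percentage and its 0..100 range are hypothetical choices made for illustration.

```kotlin
@JvmInline
value class Percentage(val value: Int) {
    init {
        // Reject out-of-range values at construction time.
        require(value in 0..100) { "Percentage must be in 0..100, was $value" }
    }
}

fun main() {
    println(Percentage(42).value)
    // Construction with an invalid value throws IllegalArgumentException.
    println(runCatching { Percentage(101) }.isFailure)
}
```

A typealias for Int could not enforce this, since any Int would be accepted wherever the alias is expected.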

@CLOVIS-AI

"a value class which wraps a String that must have a certain format"

I agree that this is a valid use-case, for example, to represent semver tags, date representations, vehicle license plates, etc. However, in all these situations, comparing by textual representation is meaningless. In my opinion, there shouldn't be a default behavior if it's likely to be incorrect.

However, I do agree that the syntax for interface implementation is too verbose in this situation, especially when interface delegation is available to other classes. I created KT-67167 to track this.

@mgroth0

mgroth0 commented Apr 5, 2024

you can already override the default working of toString for example

Good point, but this is exactly why I avoid toString wherever possible. I consider any method such as this with a global default inherently less safe, so that's why I don't really want compareTo to also be in this category

@fvasco

fvasco commented Apr 6, 2024

@Davio, why the Comparable interface only?
Why should a developer expect this specific interface to be implemented implicitly?
What if a value class contains multiple fields?
Moreover, this change is not backward compatible.

Many developers assume that the value class automatically adopts such traits of the wrapped value and basically delegates implementations to the wrapped value.

I consider this a wrong assumption.
I use value classes to define a new type with new behaviour.
For example, a Password must not be Comparable; how should I write that?

@Davio

Davio commented Apr 8, 2024

@fvasco Not the Comparable interface only, but that is one that I had a use case for.
A value class cannot contain multiple fields; in that case you would use a regular (data) class.

I'm not sure whether the change would not be backwards compatible. Value classes that already explicitly implement Comparable would keep that implementation and the ones that don't would implement it implicitly via delegation to the wrapped value.

But I guess it all depends on what your view is of value types:

  • If you want the value type to behave as much as possible as the type it wraps, implementing its interfaces by delegating to it makes sense
  • If you want the value types to behave as little as possible as the type it wraps (because your point of creating the value type is to override this kind of default behavior), then it makes sense to force everything to be explicit

@zhelenskiy
Contributor

@Davio

A value class cannot contain multiple fields

For now.

I'm not sure whether the change would not be backwards compatible.

It wouldn't be. If you mix old libraries with new ones, things will break.

If you want the value type to behave as much as possible as the type it wraps, implementing its interfaces by delegating to it makes sense

Then you probably need a typealias.


Nevertheless, I agree that we should provide some [explicit] way to provide Comparable implementations.

@fvasco

fvasco commented Apr 9, 2024

Hi @Davio,
you missed some of my questions. You seem too committed to your use case.

Not the Comparable interface only, but that is one that I had a use case for.

So, in your example, Wrapper should implement every interface of Int, which means Wrapper should implement Comparable<Int>, not Comparable<Wrapper>.

If I understand your idea well, you want to promote my Password class:

value class Password(private val secret: String) {

  override fun toString() = "****"
}

to

value class Password(private val secret: String): Serializable, Comparable<Password>, CharSequence {

  override fun toString() = "****"

  // all other auto generated methods
}

implicitly and without any notice.
However for my use case, Password must not be Comparable, must not be Serializable and it isn't a CharSequence.
Moreover, the toString() implementation of CharSequence is wrong.

This proposal isn't backward compatible and introduces bugs in my code.

For these reasons, I think that this proposal does not comply with the Principle of least astonishment and the Minus 100 points rule.

I agree with @zhelenskiy, maybe you are looking for a typealias, or you should consider a kapt plugin to auto-implement interfaces.

@Davio

Davio commented Apr 9, 2024

@fvasco I guess that you can find use cases for both arguments, those where it would be a good fit and those where it would not be.

So the question is: would you expect (or want) a value class to have all the characteristics of the type it wraps (and I concede that Comparable is a weird one because the T should always be the type of the class itself, so Comparable<Int> should become Comparable<Wrapper>) or would you expect a value class to only be a way to flatten the wrapped type in the heap without any functionality?

I'm not completely convinced either way, I think there are arguments for both, but whatever is decided is fine. We can implement my use case perfectly by writing explicit implementations.

Typealiases still have the issue that they don't allow any extra functionality such as constructor validation.

But writing a compiler plugin could be a nice middle ground I guess.

@fvasco

fvasco commented Apr 9, 2024

So the question is: would you expect (or want) a value class to have all the characteristics of the type it wraps (...) or would you expect a value class to only be a way to flatten the wrapped type in the heap without any functionality?

@Davio, I cannot answer your question because "Value classes are immutable classes that disavow the concept of identity for their instances".
A value class is a class like any other: it does not wrap another class, nor is it limited to one field.
Nor is a value class a flattened form of another type.

From the referenced KEEP

@JvmInline
value class Complex(val re: Double, val im: Double)

Does Complex wrap Doubles?
Does Complex flatten Doubles?
Can Complex implement Comparable<Complex> in a trivial way?
Complex is a brand new type; it is composed of two Doubles, and it neither wraps nor flattens them.

Same considerations are valid for a single-field value class:

@JvmInline
value class Optional<T : Any>(private val value: T?) {

    fun isEmpty() = value == null

    fun get() = checkNotNull(value)
}

Optional is a type, just like List.

I sincerely suggest that you re-read the referenced KEEP carefully.
This is my opinion.
