CIP-0085? | Sums-of-products in Plutus Core #455

michaelpj · 2023-01-30T12:40:16Z

This CIP proposes a substantial extension to Plutus Core to support sums-of-products, with the aim of realising some substantial performance improvements.

(rendered proposal in branch)

rphair

I'm approving this mainly (and immediately) because any disputes or changes to this would be well above my level. The explanations are good enough that even at a dumb level I feel like I can see why this is important & worth the "costs" that are admitted.

The Path to Active suggests this is fait accompli, and although that in itself shouldn't be grounds for approving the CIP, the proposal establishes that the team is going ahead with this after having considered & tested enough alternatives & confirming that it's the best way forward.

Since it mentions interfacing with the Ledger I also want to ask @JaredCorduan if he & any others in that area could question & comment on the Ledger aspect of the proposal. 🤓

CIP-????/README.md

michaelpj · 2023-01-30T14:20:35Z

The Path to Active suggests this is fait accompli, and although that in itself shouldn't be grounds for approving the CIP, the proposal establishes that the team is going ahead with this after having considered & tested enough alternatives & confirming that it's the best way forward.

That's not quite what I meant to convey! I wouldn't say it's a fait accompli - we have a prototype, but we had to have a prototype in order to know whether it was a good idea or not! We have yet to turn it into a production implementation, and it may be that we'll get input from other people that makes us reconsider things.

This is of course complicated in that it's a CIP by the currently listed reviewer of the Plutus CIPs i.e. me 😅 So yeah, it's the case that I currently think we should accept it.

Also, I guess I could have written the initial version of this CIP when we first had the idea, and then iterated from there. That would have been more open (there were at least two private design docs that preceded this), but I didn't want to do a proper proposal until I had a high-quality argument, which required a lot of performance work in the background (e.g. I actually spent a long time on multi-lambdas before eventually discarding them).

JaredCorduan · 2023-01-30T15:10:00Z

Since it mentions interfacing with the Ledger I also want to ask @JaredCorduan if he & any others in that area could question & comment on that aspect of the proposal.

Interesting details will probably emerge when we go to actually implement this, but as of right now I think that I understand the implications to ledger and I think that it looks like a great idea.

ch1bo

Very interesting proposal, some things are a bit over my level of understanding as well (operational semantics). So my first question is only on how other usages of Data will be affected?

CIP-????/README.md

rvcas · 2023-01-30T15:47:58Z

Yes please

CIP-????/README.md

effectfully · 2023-01-30T15:30:36Z

CIP-????/README.md

+
+### Discussion
+
+#### Why is case not strict?


Will this allow us to make a faster ifThenElse than the builtins-based one?

I wouldn't be surprised if the Scott-encoded boolean case expression was already faster than the builtin version. As far as I remember, we mostly have the builtin one because we need an eliminator for builtin booleans, not because it's necessarily fast!

I wouldn't be surprised if the Scott-encoded boolean case expression was already faster than the builtin version.

Sure, but that one has to go through the delay/force hoops, while case-expressions give us enough leeway to bypass that.

As far as I remember, we mostly have the builtin one because we need an eliminator for builtin booleans, not because it's necessarily fast!

Right... So do we normally use Scott-encoded booleans and only resort to built-in ones when dealing with builtins, which we then/immediately convert to Scott-encoded ones? Can't remember. If that's the case, then we should benefit a little from not Scott-encoding booleans (which I assume we'll get to automatically just by giving up on Scott-encoding in favor of native data types?) and not introducing any ifThenElse, but instead inlining the case at each call site, so that we can remove delay/force noise.

Which makes me wonder if there's a lot more pattern matching that we'd better inline during compilation instead of abstracting it out into a matcher.

Or am I saying nonsense?

Sure, but that one has to go through the delay/force hoops, while case-expressions give us enough leeway to bypass that.

Annoyingly, I think we'll still have to put that in. The problem is that the moment you abstract over the case branches (which we do in the compiler), you get strictness again 🤷 We can probably optimize a lot of it away, though.

Right... So do we normally use Scott-encoded booleans and only resort to built-in ones when dealing with builtins, which we then/immediately convert to Scott-encoded ones?

Yes.

which I assume we'll get to automatically just by giving up on Scott-encoding in favor of native data types?

Yes.

Which makes me wonder if there's a lot more pattern matching that we'd better inline during compilation instead of abstracting it out into a matcher.

We do inline it, but we can't do it until UPLC for annoying type abstraction reasons. We should write some test cases to check that things get as optimized as we like. I have some, but possibly not for this exact case.

As far as I remember, we mostly have the builtin one because we need an eliminator for builtin booleans, not because it's necessarily fast!

It seems like a lost opportunity to not be able to get rid of Builtin types which are sum of products such as booleans. Would it be possible to have a notion of predefined datatypes (such as the sop of booleans), and allow builtin functions to return them?

For the current builtins it seems it would be enough to return a Plutus Core term. Even a Plutus Core value seems enough. All the current (and future) builtins that deconstruct a value could be done with a case. The costing of polymorphic functions should stay the same, as long as we require certain reasonable sanity on future built-ins. But yeah, it is a big change...

current builtins

Uh huh, and how about future builtins? Will we allow builtins to case on SOPs or not? We'll have to decide when someone asks. If we don't, we're inconsistent. Making these kind of incrementally-sensible-but-globally-incoherent decisions is how we ended up with the current builtins :D

Will we allow builtins to case on SOPs or not?

If we want to keep current builtins working, then we'd need casing on a SOP in IfThenElse (even though technically IfThenElse would be redundant as a builtin).

So currently we convert constants of built-in types to their respective Plutus data types via dedicated converter functions generated by the compiler. I think what Mauro proposes amounts to converting arguments of some predefined types into constants within builtins. Imagine a builtin taking a Bool as an example, you'd have to to convert constr 0 to False (or True?) somehow. It would be natural to require such a converter alongside the definition of the predefined data type, but then we're at square one -- what's the point of pushing more context into builtins if you end up doing all the same things.

And the denotation of a builtin does take a Bool or a () or whatever, it doesn't take some arbitrary constr 0 thing. Imagine a cryptographic function doing something different depending on a Bool argument that it has, we will have to convert SoPs to the actual Haskell values.

Bool is probably not too complicated though. Something like [a] is going to be so much more complex.

So I'm not sure I understand how we'd go about getting rid of data types easily expressible in Plutus and actually do less without forbidding a predefined type to serve as the type of an argument of a builtin.

I don't see how this follows. Built-in booleans aren't being removed, no?

I don't see how this follows. Built-in booleans aren't being removed, no?

They are not. And they'll have to stay in order for us to be able to validate old-style scripts regardless. But maybe with SoP we can generate more efficient code by using native Plutus booleans instead of the built-in ones -- that's what we're discussing here.

CIP-????/README.md

effectfully · 2023-01-30T16:25:32Z

CIP-????/README.md

+
+#### Why is case not strict?
+
+This makes it different to all other Plutus Core terms, but we believe it is the expected behaviour: _no_ language has strict case destructuring.


I feel like this opens up an interesting potential meta-issue. Currently we don't support basic short-circuiting boolean operations like && and ||. If somebody wanted to get those, they'd have to use a separate toolchain for generating UPLC code and that toolchain would need to elaborate an expression like x || y to something like

force (ifThenElse x (delay True) (delay y))

which is pretty complex and is awkward to integrate into the surface language in a seamless way, particularly if you're aiming for the general case. And isn't zero-cost!

However elaborating x || y to

case x of True -> True False -> y

is fast and much more convenient and begs for being included in a DSL or even a standalone language with macros or two-stage evaluation or whatever. It might become hard for us to compete with that!

Realistically though, if it becomes a problem we too can special-case && and || and it's less likely someone will need lazy semantics for other stuff. Although in the blockchain context where every bit matters it's generally a good idea to choose tools giving you more control over performance and we've seen how far people are willing to go to improve upon costs.

However elaborating x || y to is fast and much more convenient and begs for being included in a DSL or even a standalone language with macros or two-stage evaluation or whatever.

Note that even today elaborating it to the case expression will get evaluated lazily, just with more overhead. The problem is that our function calls are strict. Indeed, a macro system or something would fix that (and indeed you can easily do that today with some TH if you actually care ). I'm unsure how big a deal it is...

CIP-????/README.md

effectfully

Giving an approval just for this to become mergable, but I'd really like to see how the proposed changes affect evaluation times of validation scripts (as opposed to the nofib benchmarks).

UPD: ah, I don't have write access, so my approval doesn't count.

Quantumplation · 2023-01-30T16:52:23Z

If it would be helpful, SundaeSwap is happy to add you to our Contracts repos (again?) if you want some more real world use cases to test on.

michaelpj · 2023-01-30T17:25:12Z

I'd really like to see how the proposed changes affect evaluation times of validation scripts

Right yes, I was going to do that and forgot!

CIP-????/README.md

ghost · 2023-01-30T17:44:56Z

I would ♡ to see how Hydra contracts benchmark is affected by this proposal: https://hydra.family/head-protocol/benchmarks/transaction-cost
/cc @ch1bo

I am afraid this might be complicated without the ledger interface though 🤔

CIP-????/README.md

KtorZ · 2023-02-01T09:23:58Z

I like the idea; feels like "what Data should have been to be begin with". However, I have few concerns:

The proposal only talks about adding new term constructors but say nothing about removing stuff and uncluttering the core language. I find the current Plutus Core language not so easy to work with and quite confusing for builders that seek on building tools. While the introduction of new terms is great, it also adds more rooms for errors. Now we have 3 ways to represent an integer (as a raw UPLC primitive, as a wrapped data I, or as a wrapped constr). If Data was initially introduced as a work-around in the early days of Plutus - as suggested in the proposal:

The solution that we chose was to pick a single generic structured data type which would be used for data interchange between the ledger and scripts. This is the 'Data' type.

Then the introduction of constr and case make Data (and all its associated builtins, ChooseData, IData, ConstrData, unConstrData etc...) redundant. Since this would already require a new version of Plutus, what about also "cleaning up" the language on the same occasion?
The proposal suggests that this change would be adopted by the ledger as well. Which raises a second concern regarding interoperability. At the moment, the on-chain format used between the Plutus VM and ledger is unspecified. One has to reverse-engineer Haskell code and rewind type-class instances to figure out how is the ledger constructing a script context. Some things even simply rely on conventions from Haskell's library (like the ordering of keys in a map) which I find (a) not practical to work with and (b) potentially harmful. So, if anything is done in the ledger to adopt the sums-of-products, may I strongly suggest to document this interface in a language-agnostic way (i.e. not Haskell). This is fundamentally a data-serialization format. (Note that this is part of what we're trying to achieve with CIP-0057). Right now it only works because the ledger and plutus happen to both be written in Haskell and rely on the same underlying libraries. I argue that there should be a clearer separation between this two components.

michaelpj · 2023-02-01T10:35:50Z

@KtorZ

The proposal only talks about adding new term constructors but say nothing about removing stuff and uncluttering the core language.

Then the introduction of constr and case make Data (and all its associated builtins, ChooseData, IData, ConstrData, unConstrData etc...) redundant.

Sadly I don't think this enables us to get rid of anything. In particular, we can't get rid of Data because we still want it for datums and redeemers. See my comment above, I'll expand on this in the text.

I think Data is the only thing that this proposal even comes close to replacing.

Now we have 3 ways to represent an integer (as a raw UPLC primitive, as a wrapped data I, or as a wrapped constr).

To be honest, I'm surprised that those things are confusing. They are very different things: one is Plutus Core builtin integers; one is a constructor of a specific structured data type; and one is a constructor for Plutus Core datatypes. It doesn't seem more confusing than the difference between, say, the Haskell Integer and the TInteger constructor of the cborg Term type (using that example since Data was originally designed as a quasi-subset of CBOR).

The proposal suggests that this change would be adopted by the ledger as well. Which raises a second concern regarding interoperability. At the moment, the on-chain format used between the Plutus VM and ledger is unspecified.

I think the thing that's problematically unspecified is the translation of the script context into its representation for the script. And yes, I agree that's very problematic and has been a hole for a while. Part of the reason I haven't written the new version down here is that I don't know how the old version works :(

I will however say that I don't think that this CIP makes the situation any worse. So while I'd like to clear that up as part of the ledger specification effort, I would be sad if we blocked this CIP on doing that.

Right now it only works because the ledger and plutus happen to both be written in Haskell and rely on the same underlying libraries.

This doesn't seem true to me? Can you expand on your argument here? I agree it's problematic that the translation is under-specified, but I think that mostly is a problem for someone trying to produce an alternative "ledger driver" for Plutus scripts, rather than anything else?

KtorZ · 2023-02-01T16:43:32Z

@michaelpj To be honest, I'm surprised that those things are confusing. They are very different things: one is Plutus Core builtin integers; one is a constructor of a specific structured data type; and one is a constructor for Plutus Core datatypes. It doesn't seem more confusing than the difference between, say, the Haskell Integer and the TInteger constructor of the cborg Term type (using that example since Data was originally designed as a quasi-subset of CBOR).

It is confusing because they do live at the same level on the same execution engine. Which means that, in practice, a script can actually accept any of the UPLC term as datum or redeemer. One doesn't have to use Data for that, and it just happens that the Haskell codebase only generates values (through ToData) that wraps terms as UPLC's Data. It's easy to violate the abstraction outside of the Haskell realm -- purposely or not.

@michaelpj I will however say that I don't think that this CIP makes the situation any worse. So while I'd like to clear that up as part of the ledger specification effort, I would be sad if we blocked this CIP on doing that.

It doesn't indeed. But that's something I'd love to see happening. Otherwise, it doesn't arguably make the situation much better outside of the Haskell sphere. I wish Plutus and the ledger were written in two different languages, so that we could better enforce (and document) interfaces at the components' boundaries.

@michaelpj This doesn't seem true to me? Can you expand on your argument here? I agree it's problematic that the translation is under-specified.

That's what I meant by "it only works because". "it" referring to: being able to not specify the interface between plutus and the ledger. If the components were not both written in Haskell and were not both relying on the same libraries, how to represent a ScriptContext on-chain would have required an ambiguous specification.

@michaelpj but I think that mostly is a problem for someone trying to produce an alternative "ledger driver" for Plutus scripts, rather than anything else?

Or, alternative compilers targeting UPLC that want to interface with the existing ledger. The ledger will provide script context to a script in one given unspecified format. So if you're producing some code that requires this script context, it becomes tricky. Again, I am simply raising my concerns here. It would be a miss opportunity to introduce this change, entirely rework how the script context is produced by the ledger, and still not document it.

michaelpj · 2023-02-02T13:08:55Z

I wish Plutus and the ledger were written in two different languages, so that we could better enforce (and document) interfaces at the components' boundaries.

If the components were not both written in Haskell and were not both relying on the same libraries, how to represent a ScriptContext on-chain would have required an ambiguous specification.

I don't think we disagree on the facts, but I think I disagree with your implication here. Even if the components had been written in different languages, that would not necessarily have meant we had to specify it better. We could equally well have ended up with implementation-defined behaviour. The thing that makes the difference is whether or not you write down the spec, not the languages IMO.

Or, alternative compilers targeting UPLC that want to interface with the existing ledger. The ledger will provide script context to a script in one given unspecified format. So if you're producing some code that requires this script context, it becomes tricky.

I'm not sure it's the compiler's problem so much as a problem for the user-space libraries that support working with the script context data, but I totally agree that it's problematic.

I've expanded on the acceptance criteria to explicitly call out specification of the script context translation :)

CIP-????/README.md

mjaskelioff · 2023-02-02T19:07:28Z

CIP-????/README.md

+
+### Discussion
+
+#### Why is case not strict?


As far as I remember, we mostly have the builtin one because we need an eliminator for builtin booleans, not because it's necessarily fast!

It seems like a lost opportunity to not be able to get rid of Builtin types which are sum of products such as booleans. Would it be possible to have a notion of predefined datatypes (such as the sop of booleans), and allow builtin functions to return them?

rphair · 2023-04-07T18:13:47Z

don't worry @michele-nuzzi - CIP PR's can stay at Last Check for a while... until we're sure both authors & community are happy as it stands and/or that it represents a usable alternative that can be aimed for in acceptance & implementation.

nielstron · 2023-04-07T21:59:12Z

Thanks for tagging me @michele-nuzzi! I have also read through the proposal before and had similar thoughts but didn't get around to write them down.

Another thing that bugs me is that the current PlutusData implementation yields a very simple way to compare data for equivalence (equalsData) and makes analysis of structured data possible (I.e. case distinction for bytes, int, constr). This is currently not possible on the UPLC level (which in itself is slightly confusing) and would not be solved by this CIP as I see it?

michaelpj · 2023-04-11T15:42:02Z

Status update

This CIP really has two parts. The first part, adding SOPs to PLC, is I think uncontroversial. It allows significantly better code generation for those that want it and has basically no downsides.

The second part is about using SOPs for the script context. This has unfortunately turned out to be more controversial. I had expected that it would be an improvement or at least not worse for everyone. However, there are a few real cases where it makes things worse.

The most serious to my eye is the loss of the way to rapidly compare bits of the script context for equality and hash them. This was something of a "coincidental" feature, but the speed advantages are significant. Annoyingly, there is a very stark tradeoff, since for other functions that work with structured data, using SOPs is much faster.

The ideal solution would be to provide a way to do these operations with SOPs. However, I haven't been able to find a way to do that. We can't even write a homogeneous equality function in untyped PLC because we (deliberately) don't have a way to do "type case" in UPLC. Perhaps a builtin could do it, but it would be wildly different from any builtin we have so far, and very difficult to cost.

So we are on the horns of a dilemma: we can either forgo fast equality and hashing to get a significant speedup across the board; or forgo that speedup to keep our special cases.

So where to go from here? My inclination is to split the CIP into two: SOPs, and script-context-as-SOP. Most of the technical work is in implementing SOPs, and I have been busy getting that in shape for the next HF. I think we can just accept the SOPs CIP.

The script-context-as-SOP CIP is harder. At least it will need benchmarking of some realistic examples using the builtins that would be lost, e.g. the Hydra validator (other examples welcome!). I don't know what to do about a proposal that has significant costs as well as significant benefits. Probably we should not accept it until we have a way to avoid the costs, which requires some more ideas.

I'd like some guidance from the editors (cc @rphair @KtorZ ) on what course of action they recommend.

Comment responses

I would like to precise that only PlutusTX scripts keep decoding entire script contexts;

in my opinion, this is a non-issue since plutarch came out and any other language (plu-ts,Aiken,OpShin,Helios etx.) does not decode the script context other than the fields needed.

Plutus Tx scripts don't need to do it either, they can also just take a Data argument. This is a bit of a myth.

I think I need to rephrase the argument a bit, though: decoding the script context is not just a weird Plutus Tx quirk. Using a representation other than Data is much faster for operations in general (see benchmarks in this CIP, for example). Going through the builtin system multiple times for every pattern match is terrible. So regardless of how you produce your PLC script, you're going to have a faster script if you can process the script context as something other than Data... the problem is just getting it there.

unlifting is not more complicated than reading a json

I think this is a misunderstanding. Unlifting is not about converting Data to anything, it's about interpreting general PLC terms. I'll try and clarify the text to explain the problem more clearly.

I dislike the idea of supporting multiple ways to represent structured data.

They are very different. One of them is native to the language, one of them is a builtin type with a whole suite of separate functions. They're not really very easy to confuse (in my biased opinion). To use your own analogy, you don't find JSON and Javascript classes confusing, even though they're both ways of representing structured data.

Another thing that bugs me is that the current PlutusData implementation yields a very simple way to compare data for equivalence (equalsData) and makes analysis of structured data possible (I.e. case distinction for bytes, int, constr). This is currently not possible on the UPLC level (which in itself is slightly confusing)

(covered the equalsData stuff above, I think)

The ability to distinguish different terms in that way actually adds significant power to the language (it gives us a kind of "type case"), and makes the metatheory more complicated. So its omission is deliberate.

michaelpj · 2023-04-11T16:18:47Z

To see how type-case can be problematic, suppose we had the following function:

isIntegerConstant :: forall a . a -> Bool

Now I can write this:

evilIdentity :: forall a . a -> a
evilIdentity x = if isIntegerConstant x then traceError "haha, no integers!" else x

which is pretty surprising and breaks parametricity.

L-as · 2023-04-11T16:52:21Z

Possible "fix":

Remove ScriptContext entirely. I have never been satisfied with the current integration between the scripts and the ledger.
Scripts have too much power, the ability to inspect anything is not good. This is also what causes the double satisfaction attack.
Taking this into account, if scripts instead returned a list of constraints, we wouldn't have to worry about which format ScriptContext is in at all. The constraints would of course use SoPs, as there isn't any need to check equalities of constraints.

MicroProofs · 2023-04-11T18:46:49Z

Could I get a link to the branch that the code is on? Curious on vm changes so I can add it to Aiken.

Edit: nvm I think it's mpj/products-non-binding.

I see how the vm handles the new terms now.

rphair · 2023-04-11T19:15:55Z

I'd like some guidance from the editors (cc @rphair @KtorZ ) on what course of action they recommend.

@michaelpj I think you make a good argument for splitting the CIP so we can accept the less disputable one straightaway. The technical considerations are beyond me so I'm recommending this on the general principle that it would give the community a solid point of reference while continuing to work on & debate the uncertain part.

It also sounds like forking the proposal would yield one CIP satisfying @KtorZ in both developer & editor hats as per #455 (comment). 🤠 In the meantime taking this out of Last Check at least until such a split is fully considered.

michaelpj · 2023-04-17T10:43:00Z

@rphair should I make two new PRs for that? Or I could reuse this PR for one of them. Either way I'll link to this as a place where discussions have happened.

rphair · 2023-04-17T15:53:26Z

thanks for asking me @michaelpj although @KtorZ @Ryun1 @SebastienGllmt might also indicate their preference.

Since this PR is where the discussion started, I think it's logical to use ("reuse") it as the basis for the uncontested portion, and create one new PR for the delta. Then this "uncontested" portion could be the baseline for the 2nd PR which would then be an update to a CIP-0085 merged in the near future.

michaelpj · 2023-04-20T11:58:47Z

Then this "uncontested" portion could be the baseline for the 2nd PR which would then be an update to a CIP-0085 merged in the near future.

I think I'll just make it a completely separate CIP. It is logically separate after all.

michaelpj · 2023-04-20T12:16:54Z

Pushed to just be SOPs, I think this can go back into final check, it was almost entirely deleting content.

I realised I had the folder name wrong, sorry, so the comments have got a bit detached now, please squash merge this when you do!

CIP for the script context stuff coming soon.

michaelpj · 2023-04-20T14:31:35Z

Here it is: #507

michele-nuzzi · 2023-04-24T15:04:05Z

Since we are splitting the CIP so that CIP 85 only implies the introduction of SOPs in UPLC and the case construct I would still propose an alternative solution.

As stated before I'm against the introduction of the possibility of representing structured data in multiple ways as it is much more error-prone than how working with raw data will ever be.

An alternative solution, if we really want this new SOP representation then the Data representation should not be present at all.

I do like the idea of an ad-hoc construct for matching SOPs but I don't think is enough to justify the introduction of SOPs.

An alternative solution, (assuming it is really needed, even if I'm sure every language provides support for a similar control flow) would be the introduction of a built-in function to select a given continuation based on an index.

Here I call it selector for lack of creativity.

it would take data as the first input and a list of continuations as the second (a list of functions, or lists of delayed terms should be integrated)

note I'm using "list" here but the builtin could also be used with an indefinite number of applications just like the case construct described in the proposal would work

It would work by using the constructor index as the index of the list of continuations to which apply the list of data used as fields.

Even if messy I would still prefer it to have both SOPs and Data representations.

KtorZ · 2023-05-02T14:46:13Z

@michele-nuzzi
As stated before I'm against the introduction of the possibility of representing structured data in multiple ways as it is much more error-prone than how working with raw data will ever be.

An alternative solution, if we really want this new SOP representation then the Data representation should not be present at all.

I do share Michele's opinion here as also stated previously. I see no reason for Data to still exists once we have term-level constructors for SOP? So far, the only argument I've really seen from @michaelpj for the use of Data regards the interfacing between on-chain and the outside world. Although, @michaelpj, I don't buy the argument that using a different representation on-chain than the one "in the transaction" makes it harder for script authors; It makes it (possibly) harder for tool / compiler writers only.

I don't know (m)any people who write UPLC directly; pretty much everyone use a higher-level abstraction framework that generates UPLC code for them. Might it be PlutusTx or Aiken (or [choose-your-weapon]). I think that's a fair thing to ask compiler / framework developers to take care of. Given how the ecosystem has already developed around the use of Data in the transaction building process, it's also a good idea to keep the Data representation at this level; since any modification impacts literally everyone there. Yet, removing Data from the Plutus VM entirely and having various compiler implementors be aware of that Data -> SOP transformation sounds totally reasonable. I think we can even all fit in a room if needs be 😊

Having said that, I am more than happy with the CIP from an editor standpoint and I'll give it a (well-deserved!) checkmark ✔️. I'd like also now to deflect any further discussions regarding Data vs SOP to #507 as these goes together. We can't "remove Data" while it's still used to encode the script context.

rphair · 2023-05-02T15:26:00Z

@KtorZ sounds good... I'd been planning to elbow a "last check" on this PR into the agenda with #507 but it's good to see you've put it on Last Check officially in the agenda.

michaelpj · 2023-05-02T18:17:34Z

An alternative solution, (assuming it is really needed, even if I'm sure every language provides support for a similar control flow) would be the introduction of a built-in function to select a given continuation based on an index.

This exists (well, not the continuation version since builtins can't call functions) in the form of chooseData. You can implement a version with continuations on top of this easily, there's one in the plutus-tx library. But it's slow. One of the primary reasons that we're doing this is performance - I feel like this is the thing that you are missing in this discussion, @michele-nuzzi .

I do share Michele's opinion here as also stated previously. I see no reason for Data to still exists once we have term-level constructors for SOP?

There are a few reasons:

It requires CIP-0097? | Representing the Script Context as a SOP term #507, which has IMO blocking problems because it removes some fast operations that people rely on.
You lose the ability to "type-case", which you can do on Data.
The argument about it matching more closely with the external world.

That is to say: I don't think we would introduce it now if we didn't have it already, but it's hard to remove when people rely on it. Maybe we can still do it, but we'll have to at least get #507 over the line first.

michele-nuzzi · 2023-05-02T18:46:29Z

This exists (well, not the continuation version since builtins can't call functions) in the form of chooseData. You can implement a version with continuations on top of this easily, there's one in the plutus-tx library. But it's slow. One of the primary reasons that we're doing this is performance - I feel like this is the thing that you are missing in this discussion, @michele-nuzzi .

I'm confident that none of the "alternative languages" is relying on chooseData exactly because of its performance. I can talk for plu-ts and I know for sure this isn't the case as shown before in this discussion pointing to the plu-ts implementation.

The implementation assumes the passed data is Constr and automatically destructures it; checks the index and passes the list of data to the continuation.

The proposed implementation is "unsafe" from the typing perspective but that shouldn't bother us otherwise what's the point of a typed IR form of UPLC in the standard compilation process?

I would be surprised if any of the other implementations would use something too different than the implementation based on continuations.

One of the primary reasons that we're doing this is performance

I'm all in for performances but not if it means introducing the possibility of unsafe contracts.

In my opinion, some more data from contracts that are the results of other compilation processes would help the decision of introducing SOPs or not.

In the event the additional data shows an actual improvement in performance, I'd love to support this proposal under the condition that it doesn't introduce additional ambiguity (aka. only SOPs, removing Data).

MicroProofs · 2023-05-02T19:10:30Z

I like SOP for its fast destructuring, but the problem is the blocker on missing out equals data and serialise_data. As a different compromise how about a new builtin that is similar to case. Say case_constr.

Takes in a constant of type data that is a constr and any term. Then it applies all field arguments from constr plus the constr index to that term. Then that single term can handle constr index and whatever fields it has applies.

Now I know this step is computationally more expensive. But since we have a lot of builtins that operate on data, I wonder if operating on data in a more efficient way might be the solution. Even converting script context to SOP does not change that we still need better ways to operate on data because of datum, redeemer, and script context still has data fields in datums, inputs, outputs, and redeemers fields.

*on phone so I'll clean up this comment in a bit.

michele-nuzzi · 2023-05-02T19:23:42Z

Takes in a constant of type data that is a constr and any term. Then it applies all field arguments from constr plus the constr index to that term. Then that single term can handle constr index and whatever fields it has applies.

If I understand correctly something like case_constr would be something like

(lam someConstr
  (lam someContinuation
    [
      (lam unConstrPair
        [
          [
            someContinuation
            [(force (force (builtin fstPair))) unConstrPair]
          ]
          [(force (force (builtin sndPair))) unConstrPair]
        ]
      )
      [(builtin unConstrData) someConstr]
    ]
  )
)

MicroProofs · 2023-05-02T19:42:36Z

Takes in a constant of type data that is a constr and any term. Then it applies all field arguments from constr plus the constr index to that term. Then that single term can handle constr index and whatever fields it has applies.

If I understand correctly something like case_constr would be something like
(lam someConstr
  (lam someContinuation
    [
      (lam unConstrPair
        [
          [
            someContinuation
            [(force (force (builtin fstPair))) unConstrPair]
          ]
          [(force (force (builtin sndPair))) unConstrPair]
        ]
      )
      [(builtin unConstrData) someConstr]
    ]
  )
)

Not quite way less code. I'll post the example in the new CIP PR since this was merged

Edit: Posted in #507

* Sums-of-products * Address some comments * Address more comments * More on benchmarking * More acceptance criteria * Address more comments * More work * More * Update metadata * More benchmarks * More comments * Rename folder * Rework to just be SOPs

Sums-of-products

d2de491

rphair approved these changes Jan 30, 2023

View reviewed changes

CIP-????/README.md Outdated Show resolved Hide resolved

rphair reviewed Jan 30, 2023

View reviewed changes

CIP-????/README.md Outdated Show resolved Hide resolved

CIP-????/README.md Outdated Show resolved Hide resolved

rphair changed the title ~~Sums-of-products in Plutus Core~~ CIP-???? | Sums-of-products in Plutus Core Jan 30, 2023

ch1bo reviewed Jan 30, 2023

View reviewed changes

CIP-????/README.md Outdated Show resolved Hide resolved

effectfully reviewed Jan 30, 2023

View reviewed changes

effectfully approved these changes Jan 30, 2023

View reviewed changes

Address some comments

e1ec86b

ghost reviewed Jan 30, 2023

View reviewed changes

CIP-????/README.md Outdated Show resolved Hide resolved

CIP-????/README.md Outdated Show resolved Hide resolved

ch1bo mentioned this pull request Jan 30, 2023

Snapshot number and utxo hash need not be in the redeemer on close/contest cardano-scaling/hydra#677

Merged

2 tasks

Quantumplation reviewed Jan 31, 2023

View reviewed changes

CIP-????/README.md Outdated Show resolved Hide resolved

effectfully reviewed Jan 31, 2023

View reviewed changes

CIP-????/README.md Outdated Show resolved Hide resolved

CIP-????/README.md Outdated Show resolved Hide resolved

CIP-????/README.md Outdated Show resolved Hide resolved

michaelpj force-pushed the mpj/sums-of-products branch from 3cb2cab to 9ecd61d Compare February 1, 2023 15:56

Address more comments

d744361

michaelpj force-pushed the mpj/sums-of-products branch from 9ecd61d to d744361 Compare February 1, 2023 15:57

michaelpj added 2 commits February 2, 2023 12:54

More on benchmarking

75b37aa

More acceptance criteria

63f12df

mjaskelioff reviewed Feb 2, 2023

View reviewed changes

michaelpj added 2 commits February 3, 2023 11:22

Address more comments

0d458e5

More work

d331050

rphair removed the State: Last Check Review favourable with disputes resolved; staged for merging. label Apr 11, 2023

Rename folder

3960dcb

Rework to just be SOPs

192e149

rphair added the State: Last Check Review favourable with disputes resolved; staged for merging. label Apr 20, 2023

michaelpj mentioned this pull request Apr 20, 2023

CIP-0097? | Representing the Script Context as a SOP term #507

Closed

KtorZ approved these changes May 2, 2023

View reviewed changes

rphair merged commit e635f7d into cardano-foundation:master May 2, 2023

rphair removed the State: Last Check Review favourable with disputes resolved; staged for merging. label May 2, 2023


		#### Why is case not strict?

		This makes it different to all other Plutus Core terms, but we believe it is the expected behaviour: _no_ language has strict case destructuring.

CIP-0085? | Sums-of-products in Plutus Core #455

CIP-0085? | Sums-of-products in Plutus Core #455

Conversation

michaelpj commented Jan 30, 2023 • edited by rphair Loading

rphair left a comment • edited Loading

Choose a reason for hiding this comment

michaelpj commented Jan 30, 2023

JaredCorduan commented Jan 30, 2023 • edited by rphair Loading

ch1bo left a comment

Choose a reason for hiding this comment

rvcas commented Jan 30, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

effectfully Feb 15, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

effectfully left a comment • edited Loading

Choose a reason for hiding this comment

Quantumplation commented Jan 30, 2023

michaelpj commented Jan 30, 2023

ghost commented Jan 30, 2023 • edited by ghost Loading

KtorZ commented Feb 1, 2023

michaelpj commented Feb 1, 2023

KtorZ commented Feb 1, 2023 • edited Loading

michaelpj commented Feb 2, 2023

Choose a reason for hiding this comment

rphair commented Apr 7, 2023 • edited Loading

nielstron commented Apr 7, 2023

michaelpj commented Apr 11, 2023

Status update

Comment responses

michaelpj commented Apr 11, 2023

L-as commented Apr 11, 2023

MicroProofs commented Apr 11, 2023 • edited Loading

rphair commented Apr 11, 2023

michaelpj commented Apr 17, 2023

rphair commented Apr 17, 2023 • edited Loading

michaelpj commented Apr 20, 2023

michaelpj commented Apr 20, 2023

michaelpj commented Apr 20, 2023

michele-nuzzi commented Apr 24, 2023

KtorZ commented May 2, 2023 • edited Loading

rphair commented May 2, 2023

michaelpj commented May 2, 2023

michele-nuzzi commented May 2, 2023

MicroProofs commented May 2, 2023 • edited Loading

michele-nuzzi commented May 2, 2023

MicroProofs commented May 2, 2023 • edited Loading

michaelpj commented Jan 30, 2023 •

edited by rphair

Loading

rphair left a comment •

edited

Loading

JaredCorduan commented Jan 30, 2023 •

edited by rphair

Loading

effectfully Feb 15, 2023 •

edited

Loading

effectfully left a comment •

edited

Loading

ghost commented Jan 30, 2023 •

edited by ghost

Loading

KtorZ commented Feb 1, 2023 •

edited

Loading

rphair commented Apr 7, 2023 •

edited

Loading

MicroProofs commented Apr 11, 2023 •

edited

Loading

rphair commented Apr 17, 2023 •

edited

Loading

KtorZ commented May 2, 2023 •

edited

Loading

MicroProofs commented May 2, 2023 •

edited

Loading

MicroProofs commented May 2, 2023 •

edited

Loading