Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fully dependent pi types. #1933

Closed
wants to merge 1 commit into from
Closed

Conversation

ticki
Copy link
Contributor

@ticki ticki commented Feb 26, 2017

This is one of 3 RFCs. The collection of the pi type RFCs is tracked by issue #1930.

Rendered.

Depends on #1931 and #1932.

@ticki ticki mentioned this pull request Feb 26, 2017
@Diggsey
Copy link
Contributor

Diggsey commented Feb 26, 2017

The "rendered" link points to the previous RFC

@ticki
Copy link
Contributor Author

ticki commented Feb 26, 2017

Fixed.

@Ixrec
Copy link
Contributor

Ixrec commented Feb 26, 2017

As currently written, this RFC doesn't make any real sense to me, possibly because I've never used a language that had "fully dependent types", but it feels like half of the RFC is just missing. The motivation section in particular needs to be more than one sentence; it at least needs a few code examples of what would be possible under this RFC that wouldn't be possible under the other two.

What does "fully dependent type" mean? My current understanding is that array types like [T; N] are dependent types, the two previous RFCs expand this to allow "user-defined dependent types", and this third RFC does...something else. Certain parts of this RFC strongly imply it's talking about types dependent on runtime values, and other parts strongly imply that would be a future extension, so I have no idea what it's actually proposing, much less whether I want it.

Or should we just not worry about this until the first two RFCs are closer to consensus/acceptance?

@comex
Copy link

comex commented Feb 26, 2017

I agree with @lxrec's concern about vagueness, but I'd like to say that if we can somehow get invariant checking to feel like it "just works", maybe something like LiquidHaskell (which I haven't used), it could be a big boost to Rust's safety-plus-performance approach.

For example, as you probably know, a semi-frequent proposal has been a way to safely omit bounds checks by forcing the compiler to prove that they're superseded by previous checks - in other words, an explicit, mandatory version of what's today the job of the LLVM optimizer. That specific use case can sort of be implemented today with lifetime hackery, a special "verified index" structure, etc., but it's very far from just working. Better if we could just say:

impl<T> [T] {
  // const effectively means pure
  const fn len(&self) -> usize { .. }
  fn verified_get(&self, idx: usize} -> &T with idx < self.len() { .. }
}

fn test(list: &mut [i32], idx: usize) {
  // check once upfront
  assert!(idx < list.len());
  // deep in some inner loop that mutates list…
      let val = *list.verified_get(idx);
}

Notably, this really calls for a SMT solver to handle more complex manipulation of array indices, as you'd see in most real programs. If we're going this far with dependent types I think it would be inadvisable to shy away from using one.

@cramertj
Copy link
Member

cramertj commented Feb 26, 2017

This RFC proposes its own method of invariant construction based on explicit loops and conditionals. As stated in the alternatives, it might also be worth considering an MIR-based approach. There's some prior art on this in @Ericson2314's "A Stateful MIR for Rust". Would all such systems necessitate an SMT solver?

# Prerequisite material
The `where` RFC (kept track of
[here](https://github.com/rust-lang/rfcs/issues/1930)) introduces a set of
axioms, defining a simple, constructive logic.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean the with RFC it should be #1932.

@withoutboats withoutboats added the T-lang Relevant to the language team, which will review and decide on the RFC. label Feb 26, 2017
@eternaleye
Copy link

eternaleye commented Feb 27, 2017

@Ixrec: In essence, #1931 and #1932 do not introduce "dependent types" as they're understood in the literature - they introduce what may be called "const-dependent types".

Full dependent types, then, remove the need for the type parameters to be const, and permits the dependent parameters of types to depend on runtime behavior - including, but not limited to, branching, user input, etc. In particular, one might be able to do something like this:

impl<T> [T] {
    fn split_at(&self, len: {l: usize | l <= self.length()}) -> (&[T; len]; &[T; self.length() - len])
}

(I've strawmanned the dependent type with the syntax Coq uses.)

In Coq, this could not be called without first proving that len < self.length() holds (though the Program tactic can automate simple cases of this), but if it is called, the the return type holds.

Similar signatures could apply to Chunks and Windows - returning (impl Iterator<Item=&[T; n]>, &[T]), or even (impl Iterator<Item=&[T; n]>, &[T; self.length() % n])

As a corrolary of len being a runtime parameter, this cannot induce monomorphization - which may place limitations on exactly what return types are permitted. In practice, this will probably only be possible for functions that

  1. Have signatures entirely lacking in runtime-parameterized types (might guard or replace panics)
    • fn split_at(&self, len: {l: usize | l < self.length()}) -> (&[T]; &[T])
  2. Have signatures where all runtime-parameterized types are behind references (richer APIs)
    • fn split_at(&self, len: {l: usize | l <= self.length()}) -> (&[T; len]; &[T; self.length() - len])
  3. Have signatures where all runtime-parameterized types are behind arbitrary type-erasing containers, like Box or Arc (of use for verified implementations of parsers that need parse forests such as GLR, GLL, or RCG; likely other use cases)

in ascending order of implementation difficulty.

@AndyShiue
Copy link

What is the use case of "real" dependent types? Can I provide the type arguments at runtime? I guess no because that would also change the size. Therefore, sigma types still doesn't work, right?

@Ericson2314
Copy link
Contributor

@cramertj To be honest, I am a bit confused by this RFC. My stateful does take a hoare-logic-like approach [Fun fact, it does so quite coincidentally!]. However, the types it ascribes locations per node in the CFG, aren't dependent in any way. Certainly, it's possible to get that formalization of the CFG, and through in some dependent types as it does.

I'm also confused why these RFCs mention SMT solving (I asked this on the original one in the original thread). SMT solving and advanced type theories, while they can be used together (and it's great!), come from somewhat different academic traditions and thus historically aren't. It may seem trite to bring up academic traditions in a purely technical discussion, but my point as whereas these RFCs imply it's easy to accidentally end up needing SMT solving, I'd argue the opposite is true---blindly mimicking prior art is unlikely to make one end up needing SMT by mistake.

@Ericson2314
Copy link
Contributor

@AndyShiue well, if we want to generalize trait objects, we will want real existential types. Also a lot of stuff with e.g. multi-dimensional arrays ought not to be accomplished with monomorphization.

@DemiMarie
Copy link

I am hoping to be able to prove arbitrarily complex properties of my programs, up to and including "This program is correct, with no bugs, according to a certain specification". I would imagine that this would be a huge plus in making Rust more attractive in the safety-critical embedded world: this would allow one to prove that a program has no bugs at all.

@mark-i-m
Copy link
Member

mark-i-m commented Mar 2, 2017

this would allow one to prove that a program has no bugs at all.

Well, technically, you would also need to prove the compiler correct for that...

But I agree! That would be fantastic, and not just for embedded. I think most systems-level code could benefit heavily from that. The key is that it has to be done in a way that is ergonomic for the programmer too.

@aturon aturon self-assigned this Mar 9, 2017
@aturon
Copy link
Member

aturon commented Mar 10, 2017

Thanks @ticki.

The lang team discussed this RFC a bit during the triage meeting today, and consensus continues to be as laid out in the comment on the previous RFC, namely that this extension is getting quite a bit far ahead of things, and we should focus for now solely on a minimal const generics proposal, and the implementation foundation to support it.

While it's certainly fine to discuss the topic, we'd prefer discussion to happen on https://internals.rust-lang.org/ for the time being, in the interest of keeping the RFC PR list pruned to proposals under consideration in the relatively near term. This is a topic worth revisiting after the core has stabilized.

@rfcbot fcp postpone

@rfcbot
Copy link
Collaborator

rfcbot commented Mar 10, 2017

Team member @aturon has proposed to postpone this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@ticki
Copy link
Contributor Author

ticki commented Mar 12, 2017

I'm ok with postponing it. It's a huge change and even small improvements won't change that. Rust is in a state and time where there are other things to focus on at this moment, even though it would be a nice feature.

@aturon
Copy link
Member

aturon commented Apr 17, 2017

Thanks @ticki. I'm going to go ahead and close. We can potentially revisit this topic after the core const generics feature has stabilized and we have experience with it.

@aturon aturon closed this Apr 17, 2017
@LPTK
Copy link

LPTK commented Jun 5, 2017

Just adding to the discussion about SMT solvers. It looks like the author is missing the point about their usefulness, when writing:

It turns out reasoning about invariants is not as hard as expected. Hoare logic allows for this.
One need not SMT-solvers for such a feature. In fact, one can reason from the rules, we have already provided.

For example, say I implement a function that requires its argument x to be strictly positive x > 0 and I want to pass it to a function that requires it to be positive x >= 0. This is obviously safe, but the proposal doesn't say anything about how that safety would be resolved (which ideally should be: automatically, by the compiler).

You can start implementing ad-hoc rules to manipulate knowledge about integers and such, but at this point you are just re-implementing the features of an SMT. Plus, there are tons of such rules you can implement and different heuristics to solve them efficiently, so you probably don't want to go and do that yourself when good tools already exist to do it for you.

The other approach is to treat integers as inductively defined data types and allow explicit reasoning on their structure, like in languages such as Idris based on the Curry-Howard correspondence. However, that typically requires more work from programmers to make basic things work, and with less automation. So it's not completely clear whether such approaches would scale well for every-day Rust users.

@le-jzr
Copy link

le-jzr commented Jun 5, 2017

The problem with just using an SMT solver is that any useful integer analysis is at best NP-complete, and we can't have static analysis that can run for years if you make a complicated program. If validity of a program depends on the analysis, you can only reliably solve some cases, and the algorithm used becomes a de facto part of the language definition.

@LPTK
Copy link

LPTK commented Jun 5, 2017

@le-jzr there are some well-defined subsets supported by SMTs that are decidable, so if SMT solving is made part of the type checker, it should obviously restrict itself to those subsets (so the algorithm is not a part of the language spec). For example, this includes integer arithmetics without variable-variable multiplication/division, which is already a pretty useful subset.

Performance is a valid concern, but what are the alternatives here? I don't see requiring Rust users to write formal proofs as a realistic solution, and even in such explicit approaches performance problems are frequently encountered. If we want some level of automation, SMTs are probably the safest bet as they have been developed with performance as one of their most important goals.

On the other hand, most non-trivial type checkers out there are easily NP-hard. Hindley-Milner is DEXPTIME-complete. Java, Rust and Scala's type systems are even Turing-complete. What matters is to see whether this impacts real-life programs or whether that corresponds mostly to rare pathological cases.

@le-jzr
Copy link

le-jzr commented Jun 5, 2017

I'd say it mostly depends on how exactly it's integrated. My past research suggests that linear code is fairly trivial, but it's loops that are difficult to deal with, as well as most important to get right. You can get around loops by explicit annotations for invariants, but that's a significant concession to ergonomy.

@LPTK
Copy link

LPTK commented Jun 5, 2017

Absolutely, loops (and recursion in general) are where automation becomes very hard (and is, in fact, necessarily incomplete), but that's going to be the case whatever technique you're using. It is still perfectly possible to combine some level of automatic inductive reasoning with the power of SMT solvers. They are not incompatible.

Again, I'd like to see some alternative proposals that look better than that, for Rust's use case. I think even if loops are not handled at all in the first iterations of the system, it's still going to be very useful to be able to express invariants on runtime values and have some level of automation to manipulate them. At least it makes users pause and write assertions or loop invariants in the critical parts where the system cannot prove correctness. Or even, say, enclose the code in an unsafe block.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.