Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fate of the Rand trait #83

Closed
dhardy opened this issue Jan 12, 2018 · 21 comments
Closed

Fate of the Rand trait #83

dhardy opened this issue Jan 12, 2018 · 21 comments
Labels

Comments

@dhardy
Copy link
Owner

dhardy commented Jan 12, 2018

For a while now, I've been planning to replace Rand with some distribution (Default in this branch) for most purposes. The two are roughly equivalent; e.g. the one-tuple (T,) implementation boils down to:

// old Rand
impl<T: Rand> Rand for (T,) {
    #[inline]
    fn rand<R: Rng>(rng: &mut R) -> (T,) {
        (rng.gen(),)
    }
}

// new Disrtibution (`Default` name may change)
impl<T> Distribution<(T,)> for Default where Default: Distribution<T> {
    #[inline]
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> (T,) {
        (rng.gen(),)
    }
}

Why, you ask?

  1. Type-specific value generation now all comes under the Distribution trait, giving more uniform usage (rng.sample(Default) or rng.sample(Range::new(2, 11))) and making generic code easier
  2. Rand::rand does not support dynamic dispatch due to implicit Sized constraint; fixing this would break all Rand implementations, so we have no reason not to introduce a replacement

But, then, what about people using code like let v: (u32, u64) = Rand::rand(rng)?
This is what this issue is about; we could:

  1. Remove Rand completely, forcing users to update (not that hard in many cases; rng.gen() will still work)
  2. Introduce a wrapper trait; this eliminates breakage for users (and in some cases even implementations of Rand), but leaves duplicate functionality
  3. Introduce the above wrapper, with deprecation warnings, and remove before 1.0

Is the last option the best?

There is also rand_derive; we can probably introduce an alternative based on Distribution, but I don't know if we should? Apparently rand_derive has no dependent crates.

@burdges
Copy link

burdges commented Jan 12, 2018

I'd call it Uniform or similar, not Default.

@pitdicker
Copy link

pitdicker commented Jan 13, 2018

If we remove Rand, we are also killing rand_derive because it implements that trait?

I was wondering about rand_derive, and if there was some kind of reason it wouldn't show up with reverse dependencies on crates.io even while it was used. But maybe it doesn't work well?

rand can already take care of the basic types like integers, floats, char and bool. And also tuples, arrays with less then 32 elements (hacky i.m.o.) and Option. The places where you would use rand_derive are newtypes, struct's and maybe unions (I don't believe it works on larger arrays).

  • Do raw random numbers often make sense? Do none of the values in a struct have some kind of sensible range? Or dependency on another? Is it really true that every combination of every value is valid? Are there really no values that may never occur?
  • Does deriving work? Or does the struct contain some type from the standard library, like PhantomData, Result, Cell, Wrapping? Or even some smart pointers?
  • If the above points are true, and you have one or a couple of structs and newtypes for which deriving Rand makes sense, is that reason enough to include an extra dependency, or will you just write the couple of necessary lines yourself?

@pitdicker
Copy link

Thinking about trait names, Rand makes sense to me for all kinds of types, and the rand method is contains uses the most valuable method name in this library.

Distribution mostly makes sense for floats, and with a stretch for all primitive types. But the name does not feel to me like it fits all sorts of structs.

@pitdicker
Copy link

Can we say Rand, Rng, SeedableRng, StdRng and tread_rng are the big things users of rand depend on?

Personally I like option 3 best. With the exception of naming. But I am just thinking out loud...

If we end up removing Rand, and if we could pick a new name instead of Distribution, Rand would be a really nice new name. I think it even makes sense to rename it that way. When rand_core and rand_rngs are split off something like half the code left in rand will be implementing Distribution, enough to make sense for it to use 'our most valuable name' 😄. But then option 3 is unavailable.

Is a deprecation period something we really need? It is not like crates are forced to update. Or are types and traits from rand part of the public API of crates? Hmm, with Rand that may be the case.

A quick look on github: https://github.com/search?p=1&q=%22impl+Rand+for%22&type=Code&utf8=%E2%9C%93. Many, many are just old copies of librand. Starting with the results from page 7 we get into trouble...

@dhardy
Copy link
Owner Author

dhardy commented Jan 13, 2018

Personally I like the name Distribution; it describes the purpose exactly. Rand is a bit unclear just from the name.

I'd call it Uniform or similar, not Default.

Finally, a suggestion I like!

@burdges
Copy link

burdges commented Jan 13, 2018

Of course Uniform is a distribution name, not a sample space name. Range is a sample space, so RangeFull might be the corresponding name there, except it lack a type.

In general, there is an issue with not all distributions making sense for all data types, which messes up some deriving I fear. I think Rand already encountered this.

In this vein, we cannot turn a RangeArgument<T> into a sample space directly because RangeFull lacks a type, so ..::<u64> make no sense, etc., so this fails:

trait SampleSpace {  type Domain;  }
impl<T,S> SampleSpace for S where S: RangeArgument<T> {  type Domain = T;  }  // Fails
trait Distribution<S: SampleSpace> { .. }

Instead, we need an obnoxious PhantomData in Uniform like

trait Distribution<T> {
    /// Samples are independent assuming independent `Rng` samples
    const INDEPENDENT: bool = true;
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T;
}

struct Uniform<T, S>(PhantomData<T>,S);

/// Attempt to infer the type of a `RangeArgument` from context.
fn uniform<T,S: RangeArgument<T>>(S) -> Uniform<T,S> { Uniform<T, S>(PhantomData<T>, S) }

impl<T,S> Distribution<T> for Uniform<T,S> where S: RangeArgument<T> {
    default fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T { .. }
}

@dhardy
Copy link
Owner Author

dhardy commented Jan 13, 2018

All this to allow the following?

let x = rng.sample(Uniform<u32, RangeFull>);

This should extend to something like rng.sample(Uniform<u32, _>::new(1..7)) (specifying type and sample space), but I'm not quite sure whether we can do that (perhaps, using a trick like the new Range code to store parameters).

I'm not convinced it's worth it though, because

  1. Aside from restricted range with uniform sampling over ordered values, I think most distributions will use the full or an implied range (sample space)
  2. For many distributions there is an implied sampling range anyway (e.g. exponential distribution over all positive numbers), so the two concepts are not really independent

(Of course one can have things like log-normal with restricted range, but that is effectively two different distributions, with a choice of clamping values down to the restricted range or re-sampling.)

If the distribution (e.g. Uniform) doesn't have a sample-space parameter, then it doesn't need a T parameter either. I've seen that done, but overall prefer the code we have in my rand branch.

@burdges
Copy link

burdges commented Jan 13, 2018

I see. If rng.sample(..) is going to work, then uniform should be the default and the return type must be inferred from context, so this maybe:

pub trait Distribution<T> {
    /// Samples are independent assuming independent `Rng` samples
    const INDEPENDENT: bool = true;
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T;
}

impl<T,S> Distribution<T> for S where S: RangeArgument<T> {
    default fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T {
        .. maybe destructure and use self.start() and self.end() ..
    }
}

I take it you'll have a symmetric method for Rng too?

trait Rng .. {
    ...
    fn sample<T, D: Distribution<T>>(&mut self, dist: &D) -> T { dist.sample(rng) }
}

@dhardy
Copy link
Owner Author

dhardy commented Jan 13, 2018

Rust doesn't support default parameters so we have rng.gen() for the default (existing method) and rng.sample(distr) for the configurable version.

Yes, it requires a symmetric sample method, but in an extension trait to allow rand-core to be separate (so trait Sample: Rng).

@dhardy
Copy link
Owner Author

dhardy commented Jan 13, 2018

Oh, why do you keep adding Distribution::INDEPENDENT? Just as documentation? Since sample takes &self not &mut self it's hard to support samples which are not independent.

Yes, this is why rand 0.3 has both IndependentSample and Sample, but I don't think it's worth having both, personally (as @GrahamDennis noted, one is a random distribution and the other is a random process).

@burdges
Copy link

burdges commented Jan 13, 2018

Yes, Distribution::INDEPENDENT is dumb. I doubt anyone would use interior mutability in a Distribution anyways.

I strongly dislike this push for senseless crate separation: An extension trait is a clear harm. Separate docs are a clear harm. What value does a separate crate bring?

@dhardy
Copy link
Owner Author

dhardy commented Jan 13, 2018

@burdges I'm not really convinced myself, though it's come up several times, e.g. here.

@dhardy
Copy link
Owner Author

dhardy commented Feb 7, 2018

I see rand_derive now has one reverse dependency: glitter (glit crate).

I'm not sure, but we could go with option 2 and keep Rand around forever in some form; I think that would be enough for rand_derive.

@dhardy
Copy link
Owner Author

dhardy commented Feb 10, 2018

After having had a look at quickcheck's Arbitrary trait, which is a parallel of Rand but with an added shrink method, I do not think rand_derive is useful for this feature. Adding deriving for Arbitrary (including a shrink implementation) might make sense, but that I'll leave to @BurntSushi.

The example mentioned above derive's Rand for some types then uses that to implement Arbitrary, but without implementing shrink (e.g. Expression impl). @BurntSushi do you agree that it would make more sense for this example to implement Arbitrary directly, and perhaps make use of a "derive Arbitrary" macro instead (e.g. for Name)?

I ask because I'm very tempted to mark rand_derive and Rand obsolete (replacing the latter with a Uniform distribution, but not replacing the former), and this use for quickcheck appears to be the main use-case for both.

I guess rust-random#148 is related.

@vks
Copy link

vks commented Feb 16, 2018

I ask because I'm very tempted to mark rand_derive and Rand obsolete (replacing the latter with a Uniform distribution, but not replacing the former), and this use for quickcheck appears to be the main use-case for both.

Sounds good to me! I think rand_derive is only useful for types exclusively composed of integers.

dhardy added a commit that referenced this issue Feb 17, 2018
See #83
Required for following changes affecting Rand
@pitdicker
Copy link

Also 👍 from me. I think we can see quickcheck's s Arbitrary trait as similar and better for the purpose of testing.

@BurntSushi
Copy link

@dhardy Apologies, I'm only just now seeing your ping. I'm afraid I don't really have enough context to understand the question you're asking me. Can you briefly summarize what input you need from me?

@pitdicker
Copy link

I see rand_derive now has one reverse dependency: glitter (glit crate).

I'm not sure, but we could go with option 2 and keep Rand around forever in some form; I think that would be enough for rand_derive.

Nobody is forced to update, and even when glitter wants to update it probably does not need a big change? It only uses #[derive(Rand)] for an enum. It can just generate a number in in the range 0..17 and map that to the enum values.

@dhardy
Copy link
Owner Author

dhardy commented Feb 17, 2018

@BurntSushi Arbitrary is a bit like Rand: it provides a way to create values of type self, given a helper type (Gen or Rng). So it is possible to use Rand and #[derive(Rand)] to (partially) automatically implement the arbitrary generation for structs (etc.) composed of sub-types which already support Rand. Have you considered adding #[derive(Arbitrary)] support along the same lines?

That's partly besides the point; I just discovered that glitter uses rand_derive to implement Arbitrary — but as @pitdicker says it's not a big deal if we drop support for rand_derive.

@BurntSushi
Copy link

@dhardy I don't think I've ever used derive(Rand) personally, and while derive(Arbitrary) is something I've heard people say they want, I've never given much thought to it myself and have no plans to work on it. The hardest bit about it is coming up with the shrinker aspect of Arbitrary. It's not clear there is even a way to automatically derive that.

@dhardy
Copy link
Owner Author

dhardy commented Mar 11, 2018

Rand has been deprecated and Uniform implemented: rust-random#256
Also, rand-derive has been deprecated (even though GH says not merged): rust-random#263

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants