behavioral consistency of calculate() #356
moves one-off helpers into the file they're used in
this should be useful in determining whether a supplied statistic is appropriate given variable types.
also fixes a bug in copying attrs to calculate results: generate -> generated closes #355
adjust line break, use appropriate formatting for t statistic / null mu
some for new functionality, some making up for coverage that we let slip in recent PRs
one was indistinguishable, the other had to do with copying attributes.
Turns out previous behavior wasn't to error when calculating a _distribution_ of statistics without hypothesizing. This will allow calculate to be internally consistent with its handling of null hypotheses.
I kinda do like Thanks much for all the other work here! It's amazing all the little pieces that needed to be put back together and finished up that have somehow trickled their way into the package. I'm glad we have some master tidiers 😀
What I'd really like is tying the infer verbs back to Allen Downey's There Is Only One Test diagram (http://allendowney.blogspot.com/2016/06/there-is-still-only-one-test.html?m=1). Adding
Much appreciated!😄 If we decide that these changes are a go, I'll open up a separate PR/issue with some thoughts on
Wow, that's a lot of good work! Thanks!
Thanks again for great work!
After reviewing, it does look like splitting this PR into at least two parts (utility functions tidying and calculate() update) would lead to a better review.
dispatch to helpers within calculate
avoid eval(parse( in favor of switch
some style improvements
@echasnovski Lots of helpful changes coming from your first round of review! Much appreciated. :-) I hear you on keeping future PRs smaller--will do. Probably should have mentioned that switching out that one I’m not sure if you mean splitting up the PR itself or its review. Totally on board for the latter; feel free to take your time reviewing. We probably could/should nudge the merging of these changes until after the release of 0.5.4 (addressing the vdiffr issues) in the next few days. This PR will be pretty related to coming
That was me thinking out loud, sorry. I meant that some changes (utilities wrapper or rename) need a relatively small amount of targeted conversation but end up cluttering the whole PR. When looking at every change made in the PR I had to determine whether it was an "attention needed" change or a simple move/rename. If it were a separate PR, it would only need a skim through the changes after agreeing in a pre-conversation. Never fully internalized this until actually reviewing a rather big PR :) Yes, making a CRAN release soon is a priority here. These changes are better left to wait until 0.6.0.
All for it, thanks for clarifying + dealing with the clutter in this PR. I'll holler here and merge upstream once 0.5.4 is on CRAN! Starting that release process tonight.
Finally got an opportunity to write code. Made some updates myself, hope you don't mind.
Thanks for catching that negation for
notably, fixes merge conflicts in NEWS
Merge branch 'master' into obs-stat
# Conflicts:
# NEWS.md
This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
This PR makes some changes to `calculate()`, aiming to error more informatively and behave more consistently. Briefly, from `NEWS`:

`stat` argument isn't well-defined for the variables specified

Error informatively with inappropriate `stat`

The most confusing errors I see while teaching with `calculate()` seem to arise predominantly from within the dispatched `calc_impl` methods. As a result, the errors (or incorrect output) appear differently for the same type of mistake: supplying a `stat` that doesn't make sense given the variable types `specify()`ed. For instance, with the current `develop`:

or

or
We caught a few of the more cryptic errors with some conditional logic for special cases (the `check_for_*_stat` functions), though a good bit still slip through. We ought to make sure that `calculate()` errors come through before dispatching to `calc_impl()`.

The new generalized error takes the form:

or
I've tossed around a few different iterations of phrasings for this error, trying to get at "it doesn't make sense" with sensitive but eliciting language. Very much willing to discuss rewordings here. Specifically, are "dichotomous" and "multinomial" the most common phrases here?
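To make the "check before dispatch" idea concrete, here is a minimal base-R sketch. Everything in it is an assumption made for illustration: the `stat_types_sketch` table, the `var_type()` and `check_stat_sketch()` helpers, the type labels, and the error wording are all hypothetical and are not the package's actual `stat_types` contents or messages.

```r
# Hypothetical lookup table of statistics that are well-defined for each
# combination of response/explanatory variable types. The entries are
# illustrative only and are NOT the contents of infer's `stat_types`.
stat_types_sketch <- data.frame(
  resp = c("num", "num", "num", "fct", "fct"),
  expl = c("none", "fct", "num", "none", "fct"),
  stat = c("mean", "diff in means", "correlation", "prop", "diff in props"),
  stringsAsFactors = FALSE
)

# Classify a variable as numeric ("num"), factor-like ("fct"), or absent ("none").
var_type <- function(x) {
  if (is.null(x)) {
    "none"
  } else if (is.numeric(x)) {
    "num"
  } else {
    "fct"
  }
}

# Raise one uniform error, before any statistic-specific code runs, when
# `stat` has no entry for the supplied variable types. The message text is
# made up for this sketch.
check_stat_sketch <- function(stat, response, explanatory = NULL) {
  resp <- var_type(response)
  expl <- var_type(explanatory)
  is_valid <- any(
    stat_types_sketch$resp == resp &
      stat_types_sketch$expl == expl &
      stat_types_sketch$stat == stat
  )
  if (!is_valid) {
    stop(
      "The statistic \"", stat, "\" is not well-defined for a ", resp,
      " response and ", expl, " explanatory variable.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# A numeric response with no explanatory variable passes the check.
check_stat_sketch("mean", response = c(21.0, 22.8, 18.7))
```

Calling `check_stat_sketch("mean", response = c("a", "b"))` would stop with the same style of error regardless of which statistic was requested, which is the consistency being argued for here. The check that this PR actually adds lives in the package itself and may be organized quite differently.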
The implementation in this PR eliminates the need to run ad-hoc checks/transformations on the variable types/existence within the `calc_impl` methods that have caused issues in the past (see my two most recent PRs). Those now take place before dispatching. Further, errors will now be consistent for what is essentially the same mistake: specifying a `stat` that doesn't make sense given the variable(s) specified.

Its logic relies on a `stat_types` tibble in `utils.R`; I would appreciate a close look at what it considers a valid test statistic given variable types.

Calculating observed statistics

The steps to calculating observed statistics seem to vary unnecessarily across test statistics. Some test statistics couldn't be calculated without `generate()`ing first, test statistics differed in their response to unneeded hypothesis information, test statistics with nontrivial null hypotheses differed in the strictness with which they required a null, and we often did not warn in pipelines that were similar to those needed to successfully calculate a test statistic. See my two most recent PRs for reference here, in addition to the following examples.

`calculate()` will now supply a message when the user supplies "too much" information to calculate the given ("untheorized") observed statistic:
In previous infer versions, depending on the statistic, we used to error out or return the original `x` argument here. Some of my more recent PRs adjust some of the latter behavior to return the statistic without messaging for `z` statistics.

On the other hand, not supplying enough information (for "theorized" statistics) results in a warning:

or
In the above case, depending on the statistic, we used to error (Chi-Square GOF), message (one-sample t), or silently report the statistic/error uninformatively (other theorized statistics). I could see the argument for erroring here, though this would be a breaking change for some statistics and wrappers. `calculate()` will behave similarly after `generate()`ing.

Some other changes:

`"two.sided"` as an alias for `"two-sided"` in `*_p_value()` (Allow direction = "two.sided" in get_p_value() #355)

I don't believe this PR breaks any code that used to work... all previous unit tests pass except for rewordings and transitions between messages/warnings/errors. See 508eefa for needed adjustments.
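For the `"two.sided"` alias listed under the other changes, a rough sketch of what such normalization could look like follows. The helper name `normalize_direction_sketch()` is hypothetical and is not a function in the package; it only maps the one alias named in this PR and passes every other value through untouched.

```r
# Hypothetical helper: map the "two.sided" spelling onto "two-sided" and
# leave any other direction value unchanged. Not infer's actual code.
normalize_direction_sketch <- function(direction) {
  if (identical(direction, "two.sided")) {
    return("two-sided")
  }
  direction
}

normalize_direction_sketch("two.sided")
```

Normalizing once at the top of a function means the rest of its logic only has to handle the canonical spelling.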
I don't introduce it in this PR pending more discussion, but I think it's worth mentioning that this allows for a single function to calculate observed statistics (`observe()`?) a la the current `*_stat()` functions. Not sure if this goes against the grain of infer's pedagogical approach.