Null models fail with invalid data #177

Closed
jarioksa opened this issue May 23, 2016 · 2 comments

@jarioksa
Contributor

jarioksa commented May 23, 2016

PR #175 allows building stratified null models. Either species or sites are split into classes, a null model is generated separately for each class, and the simulations are then combined again with smbind. With careless invocation, users can generate classes that have only one element, and then several sampling methods fail, sometimes fatally. Here is an overview of models that fail with only one observation:

  • swap, tswap and curveball go into an infinite loop that cannot be interrupted. The same problem affects the functions that use these swapping functions: swap_count, abuswap_r, abuswap_c. The only way out is to kill R and lose a day's work. Commit 6cc3fd0 will fix this and give a user error instead of an uninterruptible infinite loop.
  • r2dtable fails with a cryptic error message: Error in r2dtable(fill, rf, cf) : invalid argument 'r'. Functions that depend on r2dtable fail similarly: quasiswap_count, swsh_samp, swsh_both, swsh_samp_r, swsh_samp_c, swsh_both_r, swsh_both_c.
  • c0_samp and c0_both with one site, and r0_samp and r0_both with one species, fail with the error message number of items to replace is not a multiple of replacement length. This probably happens because dimensions are dropped.
  • backtrack fails with Error in sample.int(length(x), size, replace, prob) : too few positive probabilities.
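
The r2dtable and backtrack messages in the list above come from base R, so they can be reproduced without vegan. This is a minimal sketch that assumes only base R; the helper err_msg is a hypothetical name used for illustration, and the inputs are made up to mimic a class with a single row.

```r
# Hypothetical helper: return the error message raised by an expression,
# or NA if the expression runs without error.
err_msg <- function(expr) {
  tryCatch({
    expr
    NA_character_
  }, error = function(e) conditionMessage(e))
}

# r2dtable() rejects a marginal vector of length one, which is what a
# class with a single row supplies as row sums.
msg_r2d <- err_msg(r2dtable(1, r = 5, c = c(2, 3)))

# sample.int() cannot draw two items without replacement when only one
# item has a positive probability.
msg_samp <- err_msg(sample.int(3, size = 2, prob = c(1, 0, 0)))

print(msg_r2d)
print(msg_samp)
```

Both calls stop with an ordinary R error, which is why these cases are merely cryptic rather than fatal like the swap loops.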

Many of these are user errors, but the models r00, r0, c0 and the quantitative r00_, r0_, c0_ models based on them can produce different one-item matrices (I haven't checked whether these are valid: they may not be). We do not expect to see this usage at large, but it may well appear with stratified models. The question is whether we should handle these cases more gracefully than with the current error messages. Moreover, we should probably verify that the quantitative models are valid.

One thing that must be fixed is the swap models (swap, tswap, curveball): asking to swap a single row or a single column is a user error, but an uninterruptible infinite loop is unacceptable. I'll change it to a normal error.
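
As a rough illustration of what a normal error could look like, the dimensions can be checked in R before any swapping starts. This is only a sketch: check_swappable and its message are hypothetical and are not taken from commit 6cc3fd0 or from vegan's code.

```r
# Hypothetical guard, not vegan's implementation: swap algorithms work on
# 2 x 2 submatrices, so a matrix with fewer than two rows or two columns
# can never be swapped and should be rejected up front.
check_swappable <- function(m) {
  m <- as.matrix(m)
  if (nrow(m) < 2L || ncol(m) < 2L) {
    stop("swapping needs at least 2 rows and 2 columns, got ",
         nrow(m), " x ", ncol(m), call. = FALSE)
  }
  invisible(TRUE)
}

# A one-row class, as a careless stratification could produce.
one_row <- matrix(c(1, 0, 1), nrow = 1)
res <- tryCatch(check_swappable(one_row),
                error = function(e) conditionMessage(e))
print(res)

# A 2 x 2 matrix passes the check.
ok <- check_swappable(matrix(c(1, 0, 0, 1), nrow = 2))
```

Checking once in R keeps the test out of the compiled loop and leaves the user with an error that can be caught with tryCatch.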

@psolymos
Contributor

psolymos commented May 24, 2016

I added error messages to the r2dtable-based null models in make.commsim (96fbde0).

For c0_samp and r0_samp, the problem was the behaviour of sample(x) when length(x) == 1: R then samples from 1:x, so the result does not match the array dimensions (number of items to replace is not a multiple of replacement length).
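
The sample() behaviour is easy to see in isolation. The snippet below is a sketch with made-up values, not the code used in c0_samp or r0_samp; indexing through sample.int(length(x)) is the usual base R way to permute a vector whose length may be one.

```r
# A single numeric value, e.g. the only index left in a one-element class.
x <- 5

# sample(x) with a length-one numeric x >= 1 is treated as sample(1:x),
# so it returns five values instead of one.
direct <- sample(x)

# Permuting positions instead of values keeps the length of x.
safe <- x[sample.int(length(x))]

length(direct)  # 5
length(safe)    # 1
```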

For c0_both and r0_both the problem was similar, but with another error, related to rmultinom (no positive probabilities). This happened when the row or column sum was 0, so the probability was defined as a numeric of length 0. This can happen when running stratified null models, and it shouldn't produce such an error.

A note about these restrictions has also been added to the help page; see branch nullmod-1rowcol (4617d96).

psolymos added a commit that referenced this issue May 24, 2016
This commit addresses #177
psolymos added a commit that referenced this issue May 24, 2016
@jarioksa jarioksa mentioned this issue May 24, 2016
6 tasks
@psolymos
Contributor

Commit 6cc3fd0 and PR #180 fixed and closed this issue.

jarioksa pushed a commit that referenced this issue Sep 15, 2016
millions of random numbers may be generated with one matrix (e.g.,
in quasiswap) with same dimensions, and it is wasteful to repeat
check for dimensions every time when the dimensions do not change.
This should be checked before entering the loop or in R before
calling C (like it is done now in make.commsim after 96fbde0 and
github issue #177).
jarioksa pushed a commit that referenced this issue Oct 10, 2016
millions of random numbers may be generated with one matrix (e.g.,
in quasiswap) with same dimensions, and it is wasteful to repeat
check for dimensions every time when the dimensions do not change.
This should be checked before entering the loop or in R before
calling C (like it is done now in make.commsim after 96fbde0 and
github issue #177).

(cherry picked from commit f567fa5)