Add some hypothesis test functions #315

Open
wants to merge 3 commits into
base: master

Conversation


@mdahlin mdahlin commented Jan 2, 2025

Adds functions for

The functions are generally based on the scipy versions. I tried to align with the existing formatting and setup of the Fisher test, but wanted to make sure I was on the right track (in terms of structure, level of documentation, desire for this capability, etc.) before adding more tests.

@YeungOnion
Contributor

Sorry that I've taken a bit to reply here.

So far, these are great. We have mentioned the idea of a NaN policy with regard to analytical functions (as opposed to empirical functions), and I think following the scipy approach is good because it matches developer expectations.

We don't yet have enough hypothesis tests to have a sense of a uniform API for them, and having the policy as an argument is useful for establishing one.
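To illustrate the policy-as-argument idea, here is a minimal sketch. The names `NaNPolicy`, `NaNError`, and `apply_nan_policy` are hypothetical and are not taken from this PR or from the crate; the three variants only mirror scipy's `nan_policy` options (`"propagate"`, `"raise"`, `"omit"`).

```rust
/// Hypothetical policy for NaN inputs, mirroring scipy's `nan_policy`
/// options ("propagate", "raise", "omit"). Not the crate's actual API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NaNPolicy {
    /// Leave NaNs in place so they flow through to the result.
    Propagate,
    /// Reject input that contains any NaN.
    Error,
    /// Drop NaNs before computing anything.
    Omit,
}

/// Hypothetical error type for the `NaNPolicy::Error` case.
#[derive(Debug, PartialEq)]
pub enum NaNError {
    ContainsNaN,
}

/// Applies `policy` to a sample before a test statistic is computed.
pub fn apply_nan_policy(data: &[f64], policy: NaNPolicy) -> Result<Vec<f64>, NaNError> {
    match policy {
        NaNPolicy::Propagate => Ok(data.to_vec()),
        NaNPolicy::Error => {
            if data.iter().any(|x| x.is_nan()) {
                Err(NaNError::ContainsNaN)
            } else {
                Ok(data.to_vec())
            }
        }
        NaNPolicy::Omit => Ok(data.iter().copied().filter(|x| !x.is_nan()).collect()),
    }
}
```

Taking the policy as an explicit argument keeps the choice visible at every call site, which is one way a common shape for test functions could emerge.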

I think the direction you're going in is good and I would approve this once I look into why the nightly-dependent workflow in the CI won't compile. I'm open to you continuing on this PR or opening a dependent PR.

However, regarding licensing, to what degree would you say you referred to the scipy source? I don't wish to complicate the license we distribute under, nor do I want to use a license that's not typical for crates on crates.io.

@mdahlin
Author

mdahlin commented Jan 12, 2025

Hey, thanks for the response and feedback.

As for the nightly piece, I hit the same error locally. A day or so later I updated nightly and everything worked just fine, so it seems the issue was specific to that nightly build.


In terms of how much I "referred to the scipy source", it's been a pretty loose reference for the most part, but I'll provide some relevant links in case you want to form your own opinion.

one-way ANOVA F-test

My conclusion: the commonality with scipy is mainly just the function signature, as I used a statsdirect page for the logic.
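For context, the textbook one-way ANOVA F statistic is the ratio of the between-group mean square to the within-group mean square. The sketch below is written from that standard formula; it is not the code in this PR, and `anova_f_statistic` is a hypothetical name.

```rust
/// Computes the one-way ANOVA F statistic from the textbook formula:
/// F = (SS_between / (k - 1)) / (SS_within / (N - k)).
/// Returns `None` if there are fewer than two groups, any group is empty,
/// or there are no within-group degrees of freedom.
pub fn anova_f_statistic(groups: &[&[f64]]) -> Option<f64> {
    let k = groups.len();
    if k < 2 || groups.iter().any(|g| g.is_empty()) {
        return None;
    }
    let n_total: usize = groups.iter().map(|g| g.len()).sum();
    if n_total <= k {
        return None;
    }

    // Mean of all observations pooled together.
    let grand_mean = groups.iter().flat_map(|g| g.iter()).sum::<f64>() / n_total as f64;

    let mut ss_between = 0.0;
    let mut ss_within = 0.0;
    for g in groups {
        let mean = g.iter().sum::<f64>() / g.len() as f64;
        ss_between += g.len() as f64 * (mean - grand_mean).powi(2);
        ss_within += g.iter().map(|x| (x - mean).powi(2)).sum::<f64>();
    }

    let ms_between = ss_between / (k - 1) as f64;
    let ms_within = ss_within / (n_total - k) as f64;
    Some(ms_between / ms_within)
}
```

A p-value would then come from the upper tail of an F distribution with `k - 1` and `N - k` degrees of freedom, which is left out here.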

One Sample t-test

My conclusion: again, mainly just the function signature, as I used the logic from this jpm page.
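For context, the standard one-sample t statistic is the difference between the sample mean and the hypothesized mean, divided by the standard error of the mean. The sketch below follows that textbook formula only; `one_sample_t_statistic` is a hypothetical name and this is not the PR's implementation.

```rust
/// Computes the one-sample t statistic and its degrees of freedom:
/// t = (mean - mu0) / (s / sqrt(n)), with s the sample standard deviation
/// (n - 1 in the denominator). Returns `None` for fewer than two
/// observations or zero sample variance.
pub fn one_sample_t_statistic(data: &[f64], mu0: f64) -> Option<(f64, f64)> {
    let n = data.len();
    if n < 2 {
        return None;
    }
    let n_f = n as f64;
    let mean = data.iter().sum::<f64>() / n_f;
    let variance = data.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n_f - 1.0);
    if variance == 0.0 {
        return None;
    }
    let standard_error = (variance / n_f).sqrt();
    Some(((mean - mu0) / standard_error, n_f - 1.0))
}
```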

Mann Whitney U

Here we'd probably want to look separately at the two main pieces of logic, namely the different methods for calculating the test's statistic.

Exact

My conclusion: These are very different from each other. The scipy version does a lot of 2-D array and matrix operations that I didn't get into in my implementation.

Asymptotic

My conclusion: This is probably the only case worth your review/thoughts. The scipy version is ~10 LOC and the version in this PR is basically a 1:1 copy of those lines. There isn't much room for an alternative implementation here, but I'm happy to rewrite it to avoid any potential issues. Also, for reference, the scipy function cites this section of the Mann-Whitney U Wikipedia article.
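To show how little room the asymptotic method leaves, here is a sketch written directly from the textbook normal approximation with tie correction. It is neither the scipy code nor the code in this PR. The name `mann_whitney_u_asymptotic` is hypothetical, and the sketch assumes no continuity correction and rejects NaN inputs.

```rust
/// Returns `(u1, z)`: the Mann-Whitney U statistic for `x` and its z-score
/// under the normal approximation with tie correction.
/// Assumptions of this sketch: no continuity correction; `None` is returned
/// for empty samples, NaN inputs, or zero variance (all values tied).
pub fn mann_whitney_u_asymptotic(x: &[f64], y: &[f64]) -> Option<(f64, f64)> {
    if x.is_empty() || y.is_empty() {
        return None;
    }
    let (n1, n2) = (x.len() as f64, y.len() as f64);
    let n = n1 + n2;

    // Pool both samples, remembering which values came from `x`.
    let mut pooled: Vec<(f64, bool)> = x
        .iter()
        .map(|&v| (v, true))
        .chain(y.iter().map(|&v| (v, false)))
        .collect();
    if pooled.iter().any(|(v, _)| v.is_nan()) {
        return None;
    }
    pooled.sort_by(|a, b| a.0.total_cmp(&b.0));

    // Walk over runs of tied values, assigning each run its midrank.
    let mut rank_sum_x = 0.0;
    let mut tie_term = 0.0;
    let mut i = 0;
    while i < pooled.len() {
        let mut j = i;
        while j < pooled.len() && pooled[j].0 == pooled[i].0 {
            j += 1;
        }
        let t = (j - i) as f64;
        // The run occupies ranks i+1 ..= j, so its midrank is their average.
        let midrank = (i + 1 + j) as f64 / 2.0;
        let count_x = pooled[i..j].iter().filter(|p| p.1).count() as f64;
        rank_sum_x += midrank * count_x;
        tie_term += t.powi(3) - t;
        i = j;
    }

    let u1 = rank_sum_x - n1 * (n1 + 1.0) / 2.0;
    let mean_u = n1 * n2 / 2.0;
    let var_u = n1 * n2 / 12.0 * ((n + 1.0) - tie_term / (n * (n - 1.0)));
    if var_u <= 0.0 {
        return None;
    }
    Some((u1, (u1 - mean_u) / var_u.sqrt()))
}
```

The mean and variance lines are dictated by the formula itself, so independent implementations of this step tend to look alike.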
