-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dataframe_regression fixture #35
Conversation
Thanks @648trindade! Can you please also update the CHANGELOG? Something like: 2.1.0 (UNRELEASED)
------------------
* `#35 <https://github.com/ESSS/pytest-regressions/pull/35>`__: New ``dataframe_regression`` fixture to check pandas DataFrames directly. |
tests/test_dataframe_regression.py
Outdated
dataframe_regression.check(pd.DataFrame.from_dict({"data1": data2})) | ||
|
||
|
||
def test_n_dimensions(dataframe_regression, no_regen): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I understand this is a consistent message, I find it a bit misleading, because in practice, we want to tell that n-dimensional arrays are not supported. I ask because on num_regression
, the message is Only 1D arrays are supported on num_data_regression fixture
.
Perhaps, at least, tell the user what he did wrong. Something like this:
Only numeric data is supported on dataframe_regression fixture: Column "C" has unsupported data type "Object".
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up mixing things up in this test, sorry. Taking a second look at this I came to the fact that pandas do not accepts numpy n-dimensional arrays within dataframe columns.
So the assertion for 1D arrays inside DataFrameRegressionFixture
will never fail. I think I can remove that assertion and rename this test as @tadeu suggested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in overall, I've just commented some minor details, thanks for the awesome contribution :)
@@ -232,18 +72,6 @@ def check( | |||
except ModuleNotFoundError: | |||
raise ModuleNotFoundError(import_error_message("Pandas")) | |||
|
|||
import functools | |||
|
|||
__tracebackhide__ = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that __tracebackhide__ = True
should be kept
src/pytest_regressions/plugin.py
Outdated
'P': Pa_to_bar(P), | ||
} | ||
), | ||
data_index=positions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation seems wrong here, there's no data_index
in dataframe_regression.check
|
||
def test_usage_workflow(testdir, monkeypatch): | ||
""" | ||
:type testdir: _pytest.pytester.TmpTestdir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just an idea for an issue later, we could use proper type annotations and add mypy
support in precommit/CI
|
||
import sys | ||
|
||
monkeypatch.setattr( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that you just made this test based on the other tests, but I'm curious about why there's this injection of test stuff into sys
(seems weird), perhaps @tarcisiofischer or @nicoddemus know about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea
tests/test_dataframe_regression.py
Outdated
"500 1.20000000000000018 1.10000000000000009 0.10000000000000009", | ||
] | ||
) | ||
# prints used to debug #3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are they still necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing it
tests/test_dataframe_regression.py
Outdated
dataframe_regression.check(pd.DataFrame.from_dict({"data1": data2})) | ||
|
||
|
||
def test_n_dimensions(dataframe_regression, no_regen): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_n_dimensions
→ test_non_numeric_data
?
tests/test_num_regression.py
Outdated
@@ -160,7 +160,7 @@ def test_different_data_types(num_regression, no_regen): | |||
# Smoke test: Should not raise any exception | |||
num_regression.check({"data1": data1}) | |||
|
|||
data2 = np.array(["a"] * 10) | |||
data2 = np.array([True] * 10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this change? Perhaps instead of changing, it could be added as a parameter via parametrize
(test both "a"
and True
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python strings have 'object' dtype inside numpy/pandas. Objects are no longer accepted since we can not test it numerically (or can we? Maybe I'm missing something).
But boolean type is numerical-like, then the dataframe is not rejected before the actual check, and it returns the expected error message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay, but perhaps we could extend it to support simple "value-type" objects such as strings, for an improved usage and for making user lives easier? They would not compare numerically, just for exact matches in these cases. (Could be another PR/issue though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did it! Rolled back the num_regression
test that I changed to use string arrays again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!
tests/test_dataframe_regression.py
Outdated
def test_different_data_types(dataframe_regression, no_regen): | ||
data1 = np.ones(10) | ||
# Smoke test: Should not raise any exception | ||
dataframe_regression.check(pd.DataFrame.from_dict({"data1": data1})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this line be a separate test (if it doesn't already exist?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It refers to the same CSV file than the below check
checks, right? Looks like a kind of 'sanity check'.
But as common cases are well treated by the test_common_cases
test function, I will remove it.
@648trindade, this looks good to be merged. Do you have plans to add more things to it? |
@tadeu Nope, I think it is enough. |
* Reflect changes from ESSS/pytest-regressions#35 * Run stubtest as part of mypy testenv. * Updated config files. * Linting. * Linting * Updated files with 'repo_helper'. (#2) Co-authored-by: repo-helper[bot] <74742576+repo-helper[bot]@users.noreply.github.com> * [pre-commit.ci] pre-commit autoupdate (#1) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Reflect changes from ESSS/pytest-regressions#35 * Linting. * Don't upload coverage. * Updated config files. * Linting. * Updated config files. Co-authored-by: repo-helper[bot] <74742576+repo-helper[bot]@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Fix #21
Hi guys!
Most of the changes here start from the fact that numerical regression was actually using pandas as backend.
Main changes:
DataFrameRegressionFixture
fordataframe_regression
NumericRegressionFixture
now inherits most of its behavior fromDataFrameRegressionFixture
:check
method converts python dicts to pandas DataFrames before callingDataFrameRegressionFixture.check
num_regression
tests and made sure that apd.DataFrame
was being created from the input dictdataframe_regression
there are no tests for filling asymmetric arrays with values, since pandas natively do not allow themnum_regression
test since it was checking types with a string array, and string arrays are no longer allowed (makes sense IMHO)