Add dataframe_regression fixture #35

648trindade · 2020-10-12T23:27:17Z

Fix #21

Hi guys!

Most of the changes here stem from the fact that num_regression was already using pandas as its backend.

Main changes:

- Created DataFrameRegressionFixture for dataframe_regression.
- NumericRegressionFixture now inherits most of its behavior from DataFrameRegressionFixture: its check method converts Python dicts to pandas DataFrames before calling DataFrameRegressionFixture.check.
- I've added some type checking for the inner data types. The following numpy data types are not allowed:
  - timedelta (m)
  - datetime (M)
  - objects (O)
  - zero-terminated bytes (S, a)
  - unicode strings (U)
  - raw data (V)
- For most tests, I just copied the num_regression tests and made sure that a pd.DataFrame was being created from the input dict.
- For dataframe_regression there are no tests for filling asymmetric arrays with values, since pandas does not natively allow them.
- I had to change one num_regression test since it was checking types with a string array, and string arrays are no longer allowed (makes sense IMHO).
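
The dict-to-DataFrame conversion and the dtype check described above could look roughly like the sketch below. It is not the code from this PR: `FORBIDDEN_KINDS` and `find_unsupported_columns` are hypothetical names, and the rejected kinds are simply the ones listed in the description (string arrays were allowed again later in this thread).

```python
import numpy as np
import pandas as pd

# NumPy dtype "kind" codes that the description lists as rejected.
# The "a" type code is an alias for "S", so its kind is also "S".
FORBIDDEN_KINDS = frozenset("mMOSUV")  # hypothetical name


def find_unsupported_columns(data_frame):
    """Return {column name: dtype} for columns whose dtype kind is rejected."""
    return {
        name: dtype
        for name, dtype in data_frame.dtypes.items()
        if dtype.kind in FORBIDDEN_KINDS
    }


# A num_regression-style dict is converted to a DataFrame before checking.
data = {"pressure": np.linspace(1.0, 2.0, 5), "step": np.arange(5)}
frame = pd.DataFrame.from_dict(data)
assert find_unsupported_columns(frame) == {}

# A datetime column has kind "M", so it would be flagged.
stamped = frame.assign(when=pd.date_range("2020-10-12", periods=5))
unsupported = find_unsupported_columns(stamped)
```

Checking `dtype.kind` instead of comparing concrete dtypes keeps the rule independent of bit width (for example, float32 and float64 both have kind "f").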

nicoddemus · 2020-10-13T16:45:07Z

Thanks @648trindade!

Can you please also update the CHANGELOG? Something like:

2.1.0 (UNRELEASED)
------------------

* `#35 <https://github.com/ESSS/pytest-regressions/pull/35>`__: New ``dataframe_regression`` fixture to check pandas DataFrames directly.

tarcisiofischer · 2020-10-13T17:42:23Z

tests/test_dataframe_regression.py

+        dataframe_regression.check(pd.DataFrame.from_dict({"data1": data2}))
+
+
+def test_n_dimensions(dataframe_regression, no_regen):


Although I understand this is a consistent message, I find it a bit misleading, because in practice we want to tell the user that n-dimensional arrays are not supported. I ask because in num_regression the message is: Only 1D arrays are supported on num_data_regression fixture.

Perhaps, at least, tell the user what they did wrong. Something like this:
Only numeric data is supported on dataframe_regression fixture: Column "C" has unsupported data type "Object".

What do you think?
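
A rough sketch of how a per-column message in the suggested format could be produced. The helper name `assert_numeric_columns` and the set of accepted dtype kinds are assumptions made for illustration, not the fixture's actual implementation.

```python
import numpy as np
import pandas as pd


def assert_numeric_columns(data_frame, fixture_name="dataframe_regression"):
    """Hypothetical helper: fail on the first column that is not numeric."""
    for column, dtype in data_frame.dtypes.items():
        # Kinds b/i/u/f/c: bool, signed int, unsigned int, float, complex.
        if dtype.kind not in "biufc":
            raise AssertionError(
                f"Only numeric data is supported on {fixture_name} fixture: "
                f'Column "{column}" has unsupported data type "{dtype}".'
            )


frame = pd.DataFrame({"A": np.ones(3), "C": [object(), object(), object()]})
try:
    assert_numeric_columns(frame)
    message = ""
except AssertionError as error:
    message = str(error)
```

Naming the column and its dtype tells the user which input to fix, instead of only stating the restriction.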

I ended up mixing things up in this test, sorry. Taking a second look, I realized that pandas does not accept n-dimensional numpy arrays within DataFrame columns.
So the assertion for 1D arrays inside DataFrameRegressionFixture will never fail. I think I can remove that assertion and rename this test as @tadeu suggested.
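
The point about pandas refusing n-dimensional arrays in a column can be checked with a small sketch like the one below. `can_build_frame` is a hypothetical helper, and the broad `except Exception` is deliberate because the exception type raised by pandas may differ between versions.

```python
import numpy as np
import pandas as pd


def can_build_frame(columns):
    """Return True if pandas accepts the mapping of column name -> values."""
    try:
        pd.DataFrame.from_dict(columns)
    except Exception:  # exception type may differ between pandas versions
        return False
    return True


one_d = can_build_frame({"data1": np.ones(10)})
two_d = can_build_frame({"data1": np.ones((10, 2))})
```

Since a 2D array never makes it into a DataFrame column, a 1D assertion inside the fixture has nothing left to catch.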

tadeu

Looks good overall. I've just commented on some minor details. Thanks for the awesome contribution :)

tadeu · 2020-10-13T19:36:01Z

src/pytest_regressions/num_regression.py

@@ -232,18 +72,6 @@ def check(
        except ModuleNotFoundError:
            raise ModuleNotFoundError(import_error_message("Pandas"))

-        import functools
-
-        __tracebackhide__ = True


I think that __tracebackhide__ = True should be kept
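
For context, a minimal sketch of what `__tracebackhide__` does in a helper. `check_close` is a hypothetical stand-in for a fixture's check method, not code from this repository.

```python
def check_close(obtained, expected, tolerance=1e-9):
    """Hypothetical helper standing in for a fixture's check method."""
    # pytest skips frames whose locals define __tracebackhide__ = True when it
    # renders a failure, so the report points at the calling test instead.
    __tracebackhide__ = True
    if abs(obtained - expected) > tolerance:
        raise AssertionError(f"{obtained!r} != {expected!r} (tolerance {tolerance})")


def test_pressure():
    # On failure, pytest would show this line rather than check_close internals.
    check_close(1.0 + 1e-12, 1.0)
```

Outside pytest the variable has no effect. It is only read by pytest's traceback filtering.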

tadeu · 2020-10-13T19:45:33Z

src/pytest_regressions/plugin.py

+                    'P': Pa_to_bar(P),
+                }
+            ),
+            data_index=positions,


The documentation seems wrong here: there's no data_index parameter in dataframe_regression.check.

tadeu · 2020-10-13T19:47:57Z

tests/test_dataframe_regression.py

+
+def test_usage_workflow(testdir, monkeypatch):
+    """
+    :type testdir: _pytest.pytester.TmpTestdir


Just an idea for a later issue: we could use proper type annotations and add mypy support in pre-commit/CI.

tadeu · 2020-10-13T19:52:06Z

tests/test_dataframe_regression.py

+
+    import sys
+
+    monkeypatch.setattr(


I know that you just based this test on the other tests, but I'm curious why test stuff is being injected into sys (it seems weird). Perhaps @tarcisiofischer or @nicoddemus know about this.

tadeu · 2020-10-13T19:54:45Z

tests/test_dataframe_regression.py

+            "500  1.20000000000000018  1.10000000000000009  0.10000000000000009",
+        ]
+    )
+    # prints used to debug #3


Are they still necessary?

Removing them.

tadeu · 2020-10-13T19:55:50Z

tests/test_dataframe_regression.py

+        dataframe_regression.check(pd.DataFrame.from_dict({"data1": data2}))
+
+
+def test_n_dimensions(dataframe_regression, no_regen):


test_n_dimensions → test_non_numeric_data?

tadeu · 2020-10-13T19:59:00Z

tests/test_num_regression.py

@@ -160,7 +160,7 @@ def test_different_data_types(num_regression, no_regen):
    # Smoke test: Should not raise any exception
    num_regression.check({"data1": data1})

-    data2 = np.array(["a"] * 10)
+    data2 = np.array([True] * 10)


Why did this change? Perhaps instead of changing, it could be added as a parameter via parametrize (test both "a" and True).

Python strings get the 'object' dtype in pandas (NumPy itself stores them as unicode, 'U'). Objects are no longer accepted since we cannot compare them numerically (or can we? Maybe I'm missing something).
But the boolean type is numerical-like, so the DataFrame is not rejected before the actual check, and it returns the expected error message.

Ah, okay, but perhaps we could extend it to support simple "value-type" objects such as strings, to improve usability and make users' lives easier? They would not be compared numerically, just checked for exact matches in these cases. (Could be another PR/issue though.)

Did it! I rolled back the num_regression test I had changed, so it uses string arrays again.
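
The idea discussed above (tolerance-based comparison for numbers, exact matches for strings) could be sketched as follows. `column_matches` is a hypothetical function written for illustration, and the tolerances are just `np.allclose` defaults. It is not how the fixture compares columns.

```python
import numpy as np


def column_matches(obtained, expected, rtol=1e-5, atol=1e-8):
    """Compare two 1D arrays: tolerance for numbers, exact match otherwise."""
    obtained = np.asarray(obtained)
    expected = np.asarray(expected)
    if obtained.shape != expected.shape:
        return False
    # Kinds i/u/f/c: signed int, unsigned int, float, complex.
    if obtained.dtype.kind in "iufc" and expected.dtype.kind in "iufc":
        return bool(np.allclose(obtained, expected, rtol=rtol, atol=atol))
    # Anything else (for example strings) must match exactly.
    return bool(np.array_equal(obtained, expected))


numbers_ok = column_matches([1.0, 2.0], [1.0, 2.0 + 1e-12])
strings_ok = column_matches(["a"] * 10, ["a"] * 10)
strings_differ = column_matches(["a", "b"], ["a", "c"])
```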

tadeu · 2020-10-13T20:00:16Z

tests/test_dataframe_regression.py

+def test_different_data_types(dataframe_regression, no_regen):
+    data1 = np.ones(10)
+    # Smoke test: Should not raise any exception
+    dataframe_regression.check(pd.DataFrame.from_dict({"data1": data1}))


Could this line be a separate test (if it doesn't already exist?)

It refers to the same CSV file as the check below, right? It looks like a kind of 'sanity check'.
But since the common cases are already covered by the test_common_cases test function, I will remove it.

tadeu · 2020-10-19T12:05:08Z

@648trindade, this looks good to be merged. Do you have plans to add more things to it?

648trindade · 2020-10-19T12:51:22Z

> @648trindade, this looks good to be merged. Do you have plans to add more things to it?

@tadeu Nope, I think it is enough.

Add dataframe_regression fixture

88a014e

nicoddemus requested review from tadeu and tarcisiofischer October 13, 2020 16:43

nicoddemus approved these changes Oct 13, 2020

tarcisiofischer approved these changes Oct 13, 2020

tadeu reviewed Oct 13, 2020

updates dataframe_regression stuff and CHANGELOG

09565bd

tadeu approved these changes Oct 14, 2020

domdfcoding added a commit to domdfcoding/pytest-regressions-stubs that referenced this pull request Oct 14, 2020

Reflect changes from ESSS/pytest-regressions#35

9dfe5ab

Allow str arrays on dataframe/num regression

f30c580

nicoddemus merged commit 899c32d into ESSS:master Oct 19, 2020

domdfcoding added a commit to domdfcoding/pytest-regressions-stubs that referenced this pull request Dec 22, 2020

Reflect changes from ESSS/pytest-regressions#35

9bee617

nicoddemus mentioned this pull request Jan 7, 2021

num_regression fixture now requires pandas to be installed #44

Closed
