Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse Quantity Strings #824

Merged
merged 4 commits into from
Feb 1, 2024
Merged

Conversation

Marenz
Copy link
Contributor

@Marenz Marenz commented Dec 19, 2023

  • Fix missing number in Quantity formatting for small values
  • Add function to allow parsing of Quantity strings

@Marenz Marenz requested a review from a team as a code owner December 19, 2023 17:38
@Marenz Marenz requested a review from llucax December 19, 2023 17:38
@github-actions github-actions bot added part:docs Affects the documentation part:tests Affects the unit, integration and performance (benchmarks) tests part:data-pipeline Affects the data pipeline labels Dec 19, 2023
@llucax llucax added this to the v1.0.0-rc4 milestone Dec 20, 2023
Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise.

Signed-off-by: Mathias L. Baumann <mathias.baumann@frequenz.com>
@github-actions github-actions bot added the part:tooling Affects the development tooling (CI, deployment, dependency management, etc.) label Jan 29, 2024
@Marenz
Copy link
Contributor Author

Marenz commented Jan 29, 2024

Updated and ready for review

@llucax
Copy link
Contributor

llucax commented Jan 30, 2024

CI failing with:

pylint src benchmarks docs examples noxfile.py tests
************* Module frequenz.sdk.timeseries._quantities
src/frequenz/sdk/timeseries/_quantities.py:204:4: R0912: Too many branches (16/12) (too-many-branches)
************* Module tests.timeseries.test_quantities
tests/timeseries/test_quantities.py:582:4: W0212: Access to a protected member _base_value of a client class (protected-access)

Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the second commit could be split, other than that (and the CI failure) LGTM.

@Marenz
Copy link
Contributor Author

Marenz commented Jan 30, 2024

CI failing with:..

haha, I had that fixed, I just didn't add it to the commit XD

@Marenz Marenz force-pushed the parse_quantity branch 2 times, most recently from 1bae2f7 to 1d300b4 Compare January 30, 2024 10:27
@llucax
Copy link
Contributor

llucax commented Jan 31, 2024

Please re-request a review from me when this is ready for another round.

@Marenz Marenz requested a review from llucax January 31, 2024 16:23
Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For "Fix quantity formatting bug with -0/0" it might be good to describe what the bug actually was, is not completely clear to me by just reading the diff.

About the rounding issue when formatting, really tricky, I wonder if it wouldn't be simpler to do a string roundtrip in the formatting function itself, like convert to string, convert again to float, and only then check which exponent is more appropriate, then convert again to string. It might be less efficient (maybe?) but it seems easier to implement and likely more robust.

@pytest.mark.parametrize("quantity_type", [Power, Voltage, Current, Energy, Frequency])
@pytest.mark.parametrize("exponent", [0, 3, 6, 9])
@hypothesis.settings(max_examples=1000)
@hypothesis.seed(42) # Seed that triggers a lot of problematic edge cases
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really want to leave the seed fixed? Isn't it more useful to let it change to potentially catch more random cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know for a fact that there are more cases and solving those requires an unreasonable amount of work (at this point).
@sahas-subramanian-frequenz mentioned an imperfect solution is better than no solution at all and at least here I would agree.

There will be cases were it prints slightly weird, maybe using a sub-optimal unit etc, but the printed result will never be straight out wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that sounds completely reasonable and I agree 100%. Can you add a comment summarizing this here? Because for the uninformed casual reader it will be really hard to figure all of this out, they'll probably just try to remove those two lines and be completely puzzled about why everything breaks now. I know future me will attempt that 😆

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be a pain in the ass about this, but I mean something more like "we are setting these fixed because there are still corner cases that will fail if we allow hypothesis to do random tests, this should be removed when all the corner cases are properly handled".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did see the larger docstring of the function explaining basically that in other words?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, sorry, I expected it in the comments instead of the docstring so my brain stopped parsing after I couldn't find the clarification there.

@Marenz
Copy link
Contributor Author

Marenz commented Jan 31, 2024

About the rounding issue when formatting, really tricky, I wonder if it wouldn't be simpler to do a string roundtrip in the formatting function itself, like convert to string, convert again to float, and only then check which exponent is more appropriate, then convert again to string. It might be less efficient (maybe?) but it seems easier to implement and likely more robust.

Yes, I thought about this particular solution as well. It might be less elegant, but it would be pretty reliable and covering pretty much all possibilities.
Maybe in a future PR?

@Marenz
Copy link
Contributor Author

Marenz commented Jan 31, 2024

For "Fix quantity formatting bug with -0/0" it might be good to describe what the bug actually was, is not completely clear to me by just reading the diff.

Basically it would only print the unit, no number (the strip removed the 0)

@llucax
Copy link
Contributor

llucax commented Jan 31, 2024

Maybe in a future PR?

Sure

Signed-off-by: Mathias L. Baumann <mathias.baumann@frequenz.com>
@Marenz Marenz requested a review from llucax January 31, 2024 17:16
Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last comment about the comment. Also if you can add this "It would only print the unit, no number" to the "Fix quantity formatting bug with -0/0 " commit while you are it, I think it's good info to have in the commit message.

@pytest.mark.parametrize("quantity_type", [Power, Voltage, Current, Energy, Frequency])
@pytest.mark.parametrize("exponent", [0, 3, 6, 9])
@hypothesis.settings(max_examples=1000)
@hypothesis.seed(42) # Seed that triggers a lot of problematic edge cases
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be a pain in the ass about this, but I mean something more like "we are setting these fixed because there are still corner cases that will fail if we allow hypothesis to do random tests, this should be removed when all the corner cases are properly handled".

It would only print the unit, no number.

Signed-off-by: Mathias L. Baumann <mathias.baumann@frequenz.com>
Certain numbers would be rendered wrongly due to rounding in the python formatter, e.g. Voltage.from_volts(999.9999850988388) would render "1000 V".
This results in a faulty round-trip conversion (`num -> str -> num` no longer matching).

Signed-off-by: Mathias L. Baumann <mathias.baumann@frequenz.com>
@Marenz
Copy link
Contributor Author

Marenz commented Feb 1, 2024

Extended the commit description with the "no unit" explanation

@Marenz Marenz added this pull request to the merge queue Feb 1, 2024
@llucax
Copy link
Contributor

llucax commented Feb 1, 2024

Merged via the queue into frequenz-floss:v1.x.x with commit 971d7cf Feb 1, 2024
14 checks passed
@Marenz Marenz deleted the parse_quantity branch February 1, 2024 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
part:data-pipeline Affects the data pipeline part:docs Affects the documentation part:tests Affects the unit, integration and performance (benchmarks) tests part:tooling Affects the development tooling (CI, deployment, dependency management, etc.)
Projects
Development

Successfully merging this pull request may close these issues.

2 participants