Improve `repr`s for key Parcels classes #1693

erikvansebille · 2024-09-06T11:29:22Z

So far, we haven't really worked on the output of print() on objects like Field, FieldSet etc. While there is a ParticleSet.__repr__, even that is pretty clunky and doesn't take into account notebook functionality.

It would be nice to have better print() implementation for at least the following classes


rehamansoor · 2024-09-06T16:25:44Z

Hi,
I’d like to contribute to this issue. Could you please specify how you would like the print() implementation to be improved for the Field, FieldSet, and ParticleSet classes?

erikvansebille · 2024-09-09T09:32:33Z

Thanks for offering to contribute to this issue, @rehamansoor! I don't have a very clear idea yet of what to print, but I was inspired by the HTML version of __repr__ in xarray (see e.g. pydata/xarray#1627). So I am thinking of a table-like, collapsible view that shows the attributes of these objects that are relevant to users.

Since a FieldSet is a collection of Fields, I suggest we start with a Field.__repr__? This should include information on the name, grid, dimension sizes, interpolation_method etc.
@VeckoTheGecko, since you are working on redefining the API, perhaps you also have ideas on which attributes to show in a notebook?

VeckoTheGecko · 2024-09-10T16:13:01Z

Hi @rehamansoor, great to hear that you’re interested in this issue.

Before the wall of text that is about to follow, please let me know if you have any questions about anything below, or if you'd like any clarification. I'm not sure what your Python level is so - if you feel like this is a bit much for where you're at - let me know (there are issues in the codebase that are better suited to new Python developers 😁). I imagine these won't all be fixed in one PR, or by the same person.

Useful reference

TLDR: I think we shouldn't do HTML outputs for now. Let's focus on text reprs

HTML repr outputs would be amazing in theory, but after diving into the implementation in xarray (see the entry point ds._repr_html_() for the curious), it turns out to be complex: it requires custom CSS (vendored in the package itself in a static folder) and HTML generation (including escaping characters to produce valid HTML), and they even use custom decorators to drive the reprs.

import xarray as xr
import numpy as np
import pandas as pd

ds = xr.Dataset(
    data_vars=dict(
        temperature=(["loc", "time"], np.random.randn(2, 4)),
        precipitation=(["loc", "time"], 10 * np.random.rand(2, 4)),
    ),
    coords=dict(
        lat=("loc", [42.25, 42.21]),
        time=pd.date_range("2014-09-06", periods=4),
        reference_time=pd.Timestamp("2014-09-05"),
    ),
    attrs=dict(description="Weather related data."),
)

print(ds._repr_html_())

I think there is quite some room for improvement in our text reprs (we'll need them anyway for scripts). Down the line we can see whether there is a need for HTML ones as well (and investigate how we can better customise IPython notebook output). At the moment the apparent complexity and maintenance burden make me hesitate. Note that customising IPython notebook reprs is easily done with the _repr_html_() method.
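For reference, a minimal sketch of how a text repr and a notebook repr can coexist on one class. DemoField and its attributes are hypothetical stand-ins, not Parcels objects; the only real API assumed here is the _repr_html_() hook that IPython/Jupyter looks for, plus html.escape from the standard library.

```python
import html


class DemoField:
    """Hypothetical stand-in class, not an actual Parcels Field."""

    def __init__(self, name, interp_method):
        self.name = name
        self.interp_method = interp_method

    def __repr__(self):
        # Plain-text repr, used by print() and in scripts.
        return f"DemoField(name={self.name!r}, interp_method={self.interp_method!r})"

    def _repr_html_(self):
        # IPython/Jupyter calls this method, when present, for rich display.
        # Escaping keeps attribute values from being interpreted as HTML.
        return f"<pre>{html.escape(repr(self))}</pre>"
```

Here the HTML version simply wraps the text repr, so a richer layout could be added later without changing what scripts print.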

Starting with the smallest objects to the largest, this is the information (self.values) that would be good to have in the repr for each:

Field
- self. information to include
  - grid
  - mesh
  - allow_time_extrapolation
  - time_periodic
  - gridindexingtype
  - to_write
- notes: pretty formatting with indentation
VectorField
- self. information to include
  - name
  - U
  - V
  - W
- notes: pretty formatting with indentation
FieldSet
- information to include
  - Fields (call .get_fields()). To be formatted as a list going down
ParticleFile
- self. information to include
  - name
  - outputdt
  - chunks
  - create_new_zarrfile
  - (intentionally left out particleset)
- notes: have a simple oneline repr (e.g., ParticleFile(name='my_name', outputdt='...)
ParticleSet
- self. information to include
  - fieldset
  - pclass
  - repeatdt
- Other information to include:
  - number of particles: do len(self)
  - particle information (only first x and last y particles):
    - Currently the __repr__ prints out all particles, which isn’t really useful in the case of thousands of particles (which occurs frequently). The current implementation is return "\n".join([str(p) for p in self]), which can be adapted accordingly. Choose sensible defaults for x and y
- notes: pretty formatting with indentation
Kernel
- Hmmmm, not sure for the time being. No changes needed
Particle and Variable
- @erikvansebille I saw some reprs of the format f"PType<{self.name}>::{self.variables}". Does Parcels dynamically evaluate these in the C-code? Are these worth avoiding changes to?
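The "first x and last y particles" idea for ParticleSet could be sketched roughly as follows. The function name summarize_items and the defaults of 3 are assumptions for illustration only; the function works on any iterable, so it does not depend on Parcels.

```python
def summarize_items(items, n_first=3, n_last=3):
    """Return str() of each item, one per line, eliding the middle of long sequences."""
    items = list(items)
    if len(items) <= n_first + n_last:
        # Short enough to show in full, as the current "\n".join(...) does.
        return "\n".join(str(p) for p in items)
    head = [str(p) for p in items[:n_first]]
    # Guard against n_last == 0, because items[-0:] would return everything.
    tail = [str(p) for p in items[-n_last:]] if n_last else []
    omitted = len(items) - n_first - n_last
    return "\n".join([*head, f"... ({omitted} more) ...", *tail])
```

A __repr__ could then call summarize_items(self) in place of joining every particle.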

I would recommend working on the simple reprs first. Also, at this stage we are not too concerned about users customising the repr output (like the set_options() class in xarray's options.py, which allows setting max_width), but having a sensible output width of 80 characters should be good. Please rely on formatting helper functions to reduce code duplication; you can put them in tools/_formatting.py.

This isn't the be-all and end-all for the reprs. We'll be reworking Parcels internals over the coming months, cleaning things up. But having a starting point to go from would be super helpful.

If there are numpy arrays in the repr, please use the context manager with np.printoptions(threshold=5, suppress=True, linewidth=120): or some other sensible defaults (see docs) so that the output is manageable.
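A small sketch of that context manager in use. The helper name array_summary is made up for illustration; np.printoptions and its threshold, suppress, and linewidth arguments are the real numpy API.

```python
import numpy as np


def array_summary(arr):
    """Return a compact string for an array without changing global print settings."""
    # threshold=5: arrays with more than 5 elements are summarised with "..."
    # suppress=True: small floats are printed without scientific notation
    # linewidth=120: wrap lines at 120 characters
    with np.printoptions(threshold=5, suppress=True, linewidth=120):
        return str(arr)
```

Because the options are applied through a context manager, the user's own numpy print settings are restored as soon as the repr has been built.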

rehamansoor · 2024-09-28T02:46:51Z

Thank you for the detailed breakdown, and I apologize for the delay in replying—I got a bit busy with school. I really appreciate the opportunity to contribute. While I have experience in Python through my college courses, I think these tasks might be a bit complex for me at the moment. Could you suggest an issue that might be a better fit for my current skill level? I’m eager to help and continue improving my abilities.

Thanks again, and I look forward to contributing!

VeckoTheGecko · 2024-09-30T08:23:38Z

Hi @rehamansoor . All good! We do have a few issues that would be better suited:

In #1620 we have the subtask "Remove erroneous calls to os.path in cleanup_remove_files() and clean up function". This is a purely Pythonic task and, once you understand that section of code, should be quite straightforward. We also have #1701, which is just documenting how to add metadata to the particlefile output (for example an entry "contact_email": "email@example.com"). There is also #1511, which is more specific to Parcels itself.

VeckoTheGecko · 2024-10-14T09:33:46Z

@erikvansebille I saw some reprs of the format f"PType<{self.name}>::{self.variables}". Does Parcels dynamically evaluate these in the C-code? Are these worth avoiding changes to?

bump


github-project-automation bot added this to Parcels development Sep 6, 2024

github-project-automation bot moved this to Backlog in Parcels development Sep 6, 2024

erikvansebille added the enhancement, good first issue (Good for new parcels developers), and coding/Python labels Sep 6, 2024

VeckoTheGecko mentioned this issue Sep 10, 2024

Enable pyupgrade, add Grid repr, and other changes #1680

Merged

VeckoTheGecko changed the title ~~Improve print for key Parcels classes~~ Improve reprs for key Parcels classes Oct 24, 2024

VeckoTheGecko added a commit that referenced this issue Oct 24, 2024

Add ParticleFile and Field reprs

78726f6

#1693

VeckoTheGecko mentioned this issue Oct 25, 2024

New and updated reprs for Variable, ParticleFile, Field, VectorField, and ParticleSet #1743

Merged

VeckoTheGecko added a commit that referenced this issue Oct 25, 2024

Add ParticleFile and Field reprs

f3eda29

#1693

VeckoTheGecko mentioned this issue Nov 20, 2024

Update ParticleType repr #1766

Merged

VeckoTheGecko closed this as completed in #1766 Nov 20, 2024

github-project-automation bot moved this from Backlog to Done in Parcels development Nov 20, 2024
