Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance, delete or leave <multi_epidist> #202

Closed
joshwlambert opened this issue Nov 13, 2023 · 12 comments
Closed

Enhance, delete or leave <multi_epidist> #202

joshwlambert opened this issue Nov 13, 2023 · 12 comments
Assignees
Milestone

Comments

@joshwlambert
Copy link
Member

The <multi_epidist> class in {epiparameter} only has one purpose: improve printing when there are a large number of <epidist> entries returned by epidist_db(). From #197 there was some difference of opinion on the utility of such as class and whether it should be kept in the package, or whether it is useful and as it is used for printing the print method could be improved with the use of {cli}. This issue is to discuss which of these options is best for future development.

@joshwlambert joshwlambert added the dev day Issues for the {epiparameter} development day label Feb 13, 2024
@joshwlambert
Copy link
Member Author

Tagging with dev day for discussion during the {epiparameter} development day.

@joshwlambert
Copy link
Member Author

joshwlambert commented Mar 1, 2024

Moving forward with {epiparameter} development it would be good to get a final decision on this. I'm in favour of leaving <multi_epidist> as a minimal class which provides nicer printing but does not take on {cli} as a dependency.

Thoughts @Bisaloo @chartgerink @pratikunterwegs @jamesmbaazam?

@pratikunterwegs
Copy link
Collaborator

Just thinking through a use-case I'm familiar with, passing delay density functions to {cfr}: IIRC, if I wanted to estimate CFR using multiple onset-to-death distributions, say from different studies on the same pathogen, I would have to do

lapply(
 multi_epidist, function(edist) { cfr_*(data, delay_density = \(x) density(edist, x) ) }
)

Does the <multi_epidist> add anything over a list of <epidist>? Equally, does it take anything away, and how much work is it to maintain? I don't mind accidentally printing 100s of epidists in a list once in a while, so to me this is always going to look like a net cost (in maintenance time). Other than that, I don't have strong feelings about removing or retaining it.

@joshwlambert
Copy link
Member Author

Functionally, the addition of <multi_epidist> over a list of <epidist> is minimal. It is primarily for nicer printing and secondarily for some additional S3 dispatch benefits, over the user manually calling lapply() (e.g. get_citation() & is_parameterised()). In terms of maintenance I don't think it's considerable, given it is only instantiated in a single function (epidist_db()) and has simple and minimal methods that are often just looping over the list of <epidist>.

@Bisaloo
Copy link
Member

Bisaloo commented Mar 6, 2024

More than maintenance, my main issue is the increased complexity for new package users.

When approaching epiparameter, they have quite a lot to learn with the custom classes and understanding the differences between the classes is one extra hurdle. If the differences are minor, then it's probably easier to merge the classes.

@joshwlambert
Copy link
Member Author

When approaching epiparameter, they have quite a lot to learn with the custom classes and understanding the differences between the classes is one extra hurdle. If the differences are minor, then it's probably easier to merge the classes.

Perhaps I haven't explained myself clearly, or the package documentation is not clear on this. The intention for the <multi_epidist> class is for users to not be aware of it. As mentioned in a previous comment, it is there purely for printing and some addition benefits of dispatch in the small number of cases where a user will have read in multiple <epidist> objects from epidist_db(). I've purposefully left out documentating <multi_epidist> in the package documentation as I do not want to burden the user to have to learn how to use objects of this class.

If the differences are minor, then it's probably easier to merge the classes.

Which classes are you proposing to merge?

@chartgerink
Copy link
Member

Thanks @joshwlambert for tagging!

  1. Overall, I do not see an issue with retaining an additional internal class to help you achieve the prints that are informative to end-users
  2. Keeping an internal class is preferable to adding a dependency for sure, if this functionality is identified as providing a benefit to the end-user

As the package maintainer, you can decide based on the feedback provided, if you feel you have sufficient feedback.

@joshwlambert
Copy link
Member Author

Removing "devday" label as this issue is not relevant for the Epiverse-Imperial Epiparameter workshop.

@joshwlambert joshwlambert removed the dev day Issues for the {epiparameter} development day label Mar 26, 2024
@joshwlambert
Copy link
Member Author

Another possibility that just occurred to me while working with <bibentry> objects, an <epidist> class could hold multiple epidemiological parameter sets. From the documentation of ?bibentry():

The bibentry objects created by bibentry can represent an arbitrary positive number of references. One can use c() to combine bibentry objects, and hence in particular build a multiple reference object from single reference ones. Alternatively, one can use bibentry to directly create a multiple reference object by “vectorizing” the given arguments, i.e., use character vectors instead of character strings.

I'm not sure I prefer this option, but thought it would be good to put it into the discussion before making a final decision.

@Bisaloo
Copy link
Member

Bisaloo commented Apr 3, 2024

Having a vectorized class can often be a good option and it ultimately depends on whether you expect all your methods & functions acting on epidist objects to be vectorized as well.

Functions such as discretize() or get_citation() will now have to work out of the box on a list of epidists.

@joshwlambert
Copy link
Member Author

I've implemented a improvement to the print.multi_epidist() function in PR #326.

I'l summarise how the new implementation works with respect to the points raised in the comments above:

  • I will keep the <multi_epidist> class in the package as I believe it can be useful and lower the difficulty of handling <epidist> objects when loading in with epidist_db().
  • {cli} is taken on as a package dependency to enable printing symbols and URL formatting.
  • The new print method has a header, body, footer layout and takes inspiration from {tibble} (more specifically {pillar}) to show a preview of the data when the object is large and would flood the console, as well as object metadata.

I will leave PR #326 open until the end of the week so please provide feedback within this time. Once the PR is merged I will close this issue.

@joshwlambert
Copy link
Member Author

PR #326 has been merged into main.

@github-project-automation github-project-automation bot moved this from In Progress to Done in {epiparameter} v0.2.0 Jun 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: Done
Development

No branches or pull requests

4 participants