Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rtree.index.Index: avoid use of args/kwargs #344

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

adamjstewart
Copy link
Collaborator

I spent longer than I would have liked trying to figure out why:

index = Index(properties=Property(dimension=3, interleaved=False))

wasn't working. If we weren't using *args, **kwargs everywhere, I would have immediately figured this out. This PR starts with Index, but eventually I plan to move to other classes as well.

*args, **kwargs should be avoided for the following reasons:

  • Does not report an issue when unknown kwargs are used (my problem)
  • Does not report an issue when misspelled kwargs are used (actually discovered this in the tests)
  • Does not support type hints

Basically the only time to use *args, **kwargs is when you have one function that wraps around another function and you need to pass through all arguments.

At the moment, this change is backwards incompatible since I replace *args with individual parameters and remove support for additional Property passthroughs. I could undo this change if we want to only remove **kwargs and keep things backwards compatible.

def __init__(
self,
filename: str | bytes | None = None,
stream: Any | None = None,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't yet figured out how to type this

arrays = None
basename = None
storage = None
if args:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more guessing games!

r1.add(555, (2, 2))
del r1
self.assertTrue(storage.hasData)

r2 = index.Index(storage, properly=settings, overwrite=False)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actual typo/bug in our tests discovered by removing **kwargs


:param interleaved: True or False, defaults to True.
This parameter determines the coordinate order for all methods that
:param arrays: A tuple of arrays for bulk addition.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't documented

@@ -240,7 +240,7 @@ def test_intersection(self) -> None:
self.assertTrue(0 in self.idx.intersection((0, 0, 60, 60)))
hits = list(self.idx.intersection((0, 0, 60, 60)))

self.assertTrue(len(hits), 10)
self.assertEqual(len(hits), 10)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also a bug

@adamjstewart
Copy link
Collaborator Author

Could use help figuring out the final remaining test failure...

@hobu
Copy link
Member

hobu commented Jan 5, 2025

Could use help figuring out the final remaining test failure...

Options ordering changed? The custom filename test failure must mean an unexpected default is getting used or something? 🤷

@adamjstewart
Copy link
Collaborator Author

That's what I thought too, but that is just a simple test for filename, the first argument. I tried explicitly adding filename= but nothing changed.

@adamjstewart
Copy link
Collaborator Author

First let's discuss whether or not this is something we want and later we can fix things.

@hobu
Copy link
Member

hobu commented Jan 6, 2025

something we want

Custom filenames? Yes, definitely.

The janky kwargs/args configuration of the options? It was the style back in the day. I'm sure it could be remade to be something much better if someone has the time to put into it. I'm certainly not adverse to that, but it probably means making deprecating and then breaking things.

@adamjstewart
Copy link
Collaborator Author

We can convert kwargs to parameters immediately without deprecation. Let me see if I can deprecate args and add parameters, it should actually be relatively easy.

@adamjstewart
Copy link
Collaborator Author

Still need to deprecate (instead of remove) some properties and fix the test.

@FreddieWitherden
Copy link
Contributor

My preference here would be to create a subclass of Index which has a nicer and cleaner interface for construction. It can then map this to the legacy API. This would allow us to retain backwards compatibility and provide a route for code to be gradually updated.

Also, Index is not a great name for a class; from rtree.index import Index also looks a bit weird as an Index could really be anything (database Index, B tree, ...).

@adamjstewart
Copy link
Collaborator Author

@FreddieWitherden that would make it impossible to deprecate the old interface, because any instantiation of the subclass would also instantiate the parent class. The reverse would actually be better, new classes that implement the logic with the old class being a wrapper around those.

However, I still think the current approach is easiest, as users will have the fewest things to change. My only remaining complaint about the current class is that it is highly dynamically typed, making type checking almost impossible. Any attempts to change that would likely result in a completely different library.

@FreddieWitherden
Copy link
Contributor

@FreddieWitherden that would make it impossible to deprecate the old interface, because any instantiation of the subclass would also instantiate the parent class. The reverse would actually be better, new classes that implement the logic with the old class being a wrapper around those.

I'm not entirely sure what you mean? The old interface absolutely can be deprecated and marked internal. Then, after a suitable period of time it can be removed (and the code underpinning the new supported interface updated accordingly). No code using the new interface would need to change (as how it is actually coded is very much an implementation detail so if it provides functionality itself or inherits it from a super class is immaterial to consumers).

However, I still think the current approach is easiest, as users will have the fewest things to change. My only remaining complaint about the current class is that it is highly dynamically typed, making type checking almost impossible. Any attempts to change that would likely result in a completely different library.

The problem with the current class is mostly around construction. The actual query interface is okay. Hence my suggestion around subclassing it and having these subclasses provide more reasonable constructors/static factory methods.

@adamjstewart
Copy link
Collaborator Author

Are you suggesting to create a subclass of Index or an entirely new class that is independent of Index? If the former, we can't raise a warning message about deprecation in Index without also raising it in the subclasses. If the latter, this would be a lot of duplicated code, but I guess it would be fine in principle.

@FreddieWitherden
Copy link
Contributor

Rename Index to _Index. Create a new class Index(_Index) and mark as deprecated. Create a new class Rtree(_Index) (and friends) with a tweaked API that we want to support long term.

@adamjstewart
Copy link
Collaborator Author

@hobu how would you like to proceed here? I think designing new subclasses is outside the scope of what I have time to work on. Could convert many of the args/kwargs to formal inputs though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants