Indexing Domains and DB Levels for Faster & Flexible Lookup #188
Comments
This is related to #189. It would make sense to complete these together, since a more efficient claim lookup will be essential once we have different claim types in the chain.
A relevant comment from the gestalt discovery MR https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/195#note_623220960. This is referring to how we are currently storing a node's entire chain of claims in the
After resolving some issues regarding "self-discovery" in An aspect of
Making the However, given that we're eventually going to be using the sigchain for auditing and provenance use cases, it would make sense for these efficiencies to be implemented directly into the sigchain domain as much as possible. |
There are several situations where you are using leveldb to store a stream of things, keyed to values. While the main key is the primary way of looking up an entry, you may also want to look the entry up via other fields. This is basically DB indexing. I've gotten around this right now by creating additional sublevels and just duplicating those keys. See things like the ACL database where I did it with However, a more general solution is a better idea: something that can be used by all domains that have indexing needs. I can see that sigchain, notifications, gestalts and acl all need something like this. As discussed here: https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/209#note_668300560 with respect to notification invitation search, there are some existing libraries that we can consider and "port" over:
The basic idea is sound, but it seems to be lacking in the garbage collection department. For example, bcomnes/level-idx#19 has no answer, and the source code shows no maintenance of indexes when entries are updated or deleted. It should be easy to reimplement their indexing libraries with proper maintenance, and combined with the transaction system coming from the EFS work, it can all be embedded into the DB class. @emmacasolin in that case, I think you should not bother with indexing just yet. It's a more general problem.
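To make the "proper maintenance" point concrete, here is a minimal sketch of a secondary index that cleans up after itself on update and delete. It is not the js-db API and not how level-idx works: `IndexedStore`, `put`, `del` and `findBy` are hypothetical names, and two in-memory Maps stand in for the primary sublevel and the index sublevel.

```typescript
// Hypothetical in-memory stand-in for a DB with one secondary index.
// `primary` plays the role of the main sublevel, `index` the extra sublevel.
type Row = Record<string, string>;

class IndexedStore {
  private field: string;
  private primary = new Map<string, Row>();
  // indexed field value -> set of primary keys currently holding that value
  private index = new Map<string, Set<string>>();

  constructor(field: string) {
    this.field = field;
  }

  put(key: string, row: Row): void {
    // Remove the stale index entry first so an update leaves no garbage.
    this.unindex(key);
    this.primary.set(key, row);
    const value = row[this.field];
    if (value === undefined) return;
    let keys = this.index.get(value);
    if (keys === undefined) {
      keys = new Set<string>();
      this.index.set(value, keys);
    }
    keys.add(key);
  }

  del(key: string): void {
    // Deleting an entry must also delete its index entry.
    this.unindex(key);
    this.primary.delete(key);
  }

  findBy(value: string): Row[] {
    const keys = this.index.get(value);
    if (keys === undefined) return [];
    const rows: Row[] = [];
    for (const k of Array.from(keys)) {
      const row = this.primary.get(k);
      if (row !== undefined) rows.push(row);
    }
    return rows;
  }

  private unindex(key: string): void {
    const old = this.primary.get(key);
    if (old === undefined) return;
    const value = old[this.field];
    if (value === undefined) return;
    const keys = this.index.get(value);
    if (keys === undefined) return;
    keys.delete(key);
    // Drop empty buckets so the index does not grow without bound.
    if (keys.size === 0) this.index.delete(value);
  }
}
```

In a real DB the primary write and the index write would need to go into the same batch or transaction, otherwise a crash between the two leaves the index inconsistent. That is why this pairs naturally with the transaction system mentioned above.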
The implementation of secondary indexing at This issue can be kept as a separate issue representing the integration work of secondary indexing into:
This will lead to #197.
The discussion in MatrixAI/js-db#1 means that we will only have simple indexing functionality for now. More complex indexes will require a restructure of the underlying DB, potentially just using sqlite3 to avoid rebuilding such low-level structures ourselves, but we might be too deep into leveldb at this point. Not sure; this needs more requirements analysis. I imagine that in the future we're going to need full text indexing too, to help with performance.
Made this issue more general to the concept of indexing across PK.
All indexing is manually done per-domain.
Specification
The sigchain currently uses a linear search to locate a particular claim (for example, relating to a specific identity's cryptolink). This should be extended to improve on this O(n) lookup. For example, we could incorporate a map of cryptolink ID (or the equivalent) to the sequence number in the chain of the most recent change to this cryptolink.
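A rough sketch of that map, to show the shape of the idea rather than the actual sigchain code. The `Claim` type, `SigchainSketch` class and method names are all assumptions made for illustration. Real claims carry signatures and more fields, and the real chain lives in leveldb rather than in an array.

```typescript
// Hypothetical, simplified claim shape (real claims have signatures etc.).
type Claim = { seq: number; linkId: string; payload: string };

class SigchainSketch {
  private claims: Claim[] = [];
  // cryptolink ID -> sequence number of the most recent claim touching it
  private latestByLink = new Map<string, number>();

  addClaim(linkId: string, payload: string): number {
    const seq = this.claims.length + 1; // sequence numbers start at 1 in this sketch
    this.claims.push({ seq, linkId, payload });
    // Keep the index in step with the chain on every append.
    this.latestByLink.set(linkId, seq);
    return seq;
  }

  // Current behaviour as described above: O(n) scan from the newest claim.
  findLatestLinear(linkId: string): Claim | undefined {
    for (let i = this.claims.length - 1; i >= 0; i--) {
      const claim = this.claims[i];
      if (claim !== undefined && claim.linkId === linkId) return claim;
    }
    return undefined;
  }

  // Proposed behaviour: one map lookup, then a direct read by sequence number.
  findLatestIndexed(linkId: string): Claim | undefined {
    const seq = this.latestByLink.get(linkId);
    return seq === undefined ? undefined : this.claims[seq - 1];
  }
}
```

Because the chain is append-only, the index only ever needs a `set` on append and never a delete, so it is cheaper to maintain than a general secondary index.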
Additional context
See: