Make fst work without memory mapping with an arbitrary "fake array" #15
FWIW, this diff is taking this library further away from
@BurntSushi Yes, it does. It would be great if the main fst crate had a way to be used with a generic From your blog article about fst:
This POC proves that it is possible to use your FST crate without memory mapping in a pretty efficient manner, though as it stands I'm sure it reduces performance for the mem-map use case because of the dynamic invocation. Would you be interested in modifying the upstream FST crate so that it can work on non-slices? There is no need to actually implement a caching system, just to make it possible to hook into the read requests if needed. I'm sure it can be done without any performance loss compared to the way it works now, though I don't know enough about Rust perf to say what the best approach is.
@phiresky It does seem like a very good use case to satisfy, but I'd probably want to look at a design first before doing so. And yes, performance is definitely a concern, but so is complexity of implementation and API complexity. There are other practical problems... e.g., While I might agree that it's a good idea and am OK adding something like it, the time and energy I have available to devote towards it is pretty limited and that might be a problem for y'all. So in that sense, I understand going your own way with Anyway, this is a really cool change. It's awesome to see FST working without relying on mmap. I had thought it would be a lot harder than this.
@BurntSushi @phiresky I am not merging this. @phiresky's original use case is to fetch the data from an HTTP server or IPFS. We developed an sstable-based dictionary that is better suited to that use case, and we will open source a cleaned-up version. The blocks may actually be small fsts instead of sstables, but the idea is the same. We'll probably only do 2 levels, because of the properties of the storage we are working with: high latency but decent throughput.
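For readers following along, here is a rough sketch of what a two-level, block-based dictionary could look like. It is not the implementation mentioned above: `TwoLevelDict`, its fields, and the `block_size` parameter are hypothetical names, keys are assumed to be unique, and the blocks live in memory here, whereas in the high-latency setting each block would be fetched with a single request.

```rust
/// Hypothetical two-level dictionary (illustrative only, not the real design).
pub struct TwoLevelDict {
    /// Level 1: the first key of every block, small enough to keep in memory.
    index: Vec<String>,
    /// Level 2: sorted (key, value) blocks. Held in memory for this sketch;
    /// remotely, one lookup would fetch exactly one block.
    blocks: Vec<Vec<(String, u64)>>,
}

impl TwoLevelDict {
    /// Sort the entries and cut them into blocks of at most `block_size` items.
    /// Keys are assumed to be unique.
    pub fn build(mut entries: Vec<(String, u64)>, block_size: usize) -> Self {
        assert!(block_size > 0, "block_size must be positive");
        entries.sort();
        let blocks: Vec<Vec<(String, u64)>> =
            entries.chunks(block_size).map(|c| c.to_vec()).collect();
        // `chunks` never yields an empty chunk, so `b[0]` is always valid.
        let index = blocks.iter().map(|b| b[0].0.clone()).collect();
        TwoLevelDict { index, blocks }
    }

    /// Look up `key`: binary search the index, then search a single block.
    pub fn get(&self, key: &str) -> Option<u64> {
        // Number of blocks whose first key is <= `key`.
        let n = self.index.partition_point(|first| first.as_str() <= key);
        if n == 0 {
            return None; // `key` sorts before every stored key
        }
        let block = &self.blocks[n - 1];
        block
            .binary_search_by(|(k, _)| k.as_str().cmp(key))
            .ok()
            .map(|i| block[i].1)
    }
}
```

With only two levels, a lookup costs one in-memory search plus one block read, which suits storage where each request is slow but a single request can return a fairly large block.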
A better description is here: quickwit-oss/tantivy#1067
This allows fst to work without having the whole index in memory, and on systems that do not support memory mapping (e.g. WASM).
To make this viable for everything else, the dynamic invocation needs to be removed again or put behind a compile-time flag.
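As a rough illustration of the trade-off, the sketch below contrasts static and dynamic dispatch over a "fake array" abstraction. The trait `ByteSource`, the wrapper `CountingSource`, and both `read_u32_le*` functions are hypothetical names made up for this example; they are not part of the fst crate or of this diff.

```rust
use std::cell::Cell;

/// Hypothetical "fake array": anything that can serve byte-range reads.
pub trait ByteSource {
    fn total_len(&self) -> usize;
    /// Copy `buf.len()` bytes starting at `offset` into `buf`.
    fn read_into(&self, offset: usize, buf: &mut [u8]);
}

/// In-memory or memory-mapped data: a read is a plain slice copy.
impl<'a> ByteSource for &'a [u8] {
    fn total_len(&self) -> usize {
        <[u8]>::len(self)
    }
    fn read_into(&self, offset: usize, buf: &mut [u8]) {
        buf.copy_from_slice(&self[offset..offset + buf.len()]);
    }
}

/// A wrapper that hooks into every read request. It only counts reads here;
/// a cache or a network fetch could sit in the same place.
pub struct CountingSource<S> {
    inner: S,
    reads: Cell<usize>,
}

impl<S: ByteSource> CountingSource<S> {
    pub fn new(inner: S) -> Self {
        CountingSource { inner, reads: Cell::new(0) }
    }
    pub fn reads(&self) -> usize {
        self.reads.get()
    }
}

impl<S: ByteSource> ByteSource for CountingSource<S> {
    fn total_len(&self) -> usize {
        self.inner.total_len()
    }
    fn read_into(&self, offset: usize, buf: &mut [u8]) {
        self.reads.set(self.reads.get() + 1);
        self.inner.read_into(offset, buf);
    }
}

/// Static dispatch: compiled separately for each `S`, so the slice case
/// can be inlined down to a direct memory copy.
pub fn read_u32_le<S: ByteSource>(src: &S, offset: usize) -> u32 {
    let mut buf = [0u8; 4];
    src.read_into(offset, &mut buf);
    u32::from_le_bytes(buf)
}

/// Dynamic dispatch: a single compiled copy, with a virtual call per read.
pub fn read_u32_le_dyn(src: &dyn ByteSource, offset: usize) -> u32 {
    let mut buf = [0u8; 4];
    src.read_into(offset, &mut buf);
    u32::from_le_bytes(buf)
}
```

Making the reader generic over the source type is one way the slice path could keep its current cost, since the compiler specializes it per type. The price is a type parameter that spreads through the API, which is the kind of complexity concern raised in the discussion.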