-
-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prefer using an Arc<[u8]> instead of an Arc<Vec<u8>> #97
Comments
This is a great issue, thanks. If you have other suggestions for breaking changes, please share them. Now is a good time. Unfortunately, the way I do development in FOSS isn't terribly conducive to collaboration because I have a terrible time context switching. So I end up doing things in short spurts. As for why I chose ... but looking at std, I actually don't see an easy (and safe) way to build an The extra copy does indeed seem unfortunate, particularly given that FSTs tend to be quite large. With all that said, my plan at the moment is to adopt what tantivy's fork of Does this assuage your concern? |
No worries, that was just the only reason (IIRC) I forked the library and step back because it is not pleasant to work with a fork. You said that there were no safe way to construct an Anyway, you will change the API to make it better, so don’t take my comment into account. Thank you for your work :) PS: will you fix #92 to make it safe to read an FST from any slice lifetime? |
Yes? That's also when the From impl for Vec was added. Both were added a year after I added the Arc<Vec> APIs.
Hmm. Maybe I wasn't being clear. Making Fst generic over AsRef<[u8]> will fix that, no? |
Yeah, sorry, I am not an early morning guy :) I think that the parametric type with |
This change was not nearly as bad as I thought it would be. Instead of trying to provide all the different possible ways to store some bytes, we instead make FSTs maximally flexible by accepting any type that can cheaply provide byte slice. This should resolve a number of issues with folks constructing FSTs in ways that weren't supported by the old constructors. As a bonus, we no longer need to directly depend on a specific implementation of memory maps. Conveniently, the `Mmap` type in the `memmap` crate already implements `AsRef<[u8]>`, so using memory maps is as simple as let mmap = memmap::Mmap::map(&File::open("something.fst").unwrap()); let fst = Fst::new(mmap).unwrap(); Fixes #92, Fixes #94, Fixes #97
This change was not nearly as bad as I thought it would be. Instead of trying to provide all the different possible ways to store some bytes, we instead make FSTs maximally flexible by accepting any type that can cheaply provide byte slice. This should resolve a number of issues with folks constructing FSTs in ways that weren't supported by the old constructors. As a bonus, we no longer need to directly depend on a specific implementation of memory maps. Conveniently, the `Mmap` type in the `memmap` crate already implements `AsRef<[u8]>`, so using memory maps is as simple as let mmap = memmap::Mmap::map(&File::open("something.fst").unwrap()); let fst = Fst::new(mmap).unwrap(); Fixes #92, Fixes #94, Fixes #97
This change was not nearly as bad as I thought it would be. Instead of trying to provide all the different possible ways to store some bytes, we instead make FSTs maximally flexible by accepting any type that can cheaply provide byte slice. This should resolve a number of issues with folks constructing FSTs in ways that weren't supported by the old constructors. As a bonus, we no longer need to directly depend on a specific implementation of memory maps. Conveniently, the `Mmap` type in the `memmap` crate already implements `AsRef<[u8]>`, so using memory maps is as simple as let mmap = memmap::Mmap::map(&File::open("something.fst").unwrap()); let fst = Fst::new(mmap).unwrap(); Fixes #92, Fixes #94, Fixes #97
This change was not nearly as bad as I thought it would be. Instead of trying to provide all the different possible ways to store some bytes, we instead make FSTs maximally flexible by accepting any type that can cheaply provide byte slice. This should resolve a number of issues with folks constructing FSTs in ways that weren't supported by the old constructors. As a bonus, we no longer need to directly depend on a specific implementation of memory maps. Conveniently, the `Mmap` type in the `memmap` crate already implements `AsRef<[u8]>`, so using memory maps is as simple as let mmap = memmap::Mmap::map(&File::open("something.fst").unwrap()); let fst = Fst::new(mmap).unwrap(); Fixes #92, Fixes #94, Fixes #97
This change was not nearly as bad as I thought it would be. Instead of trying to provide all the different possible ways to store some bytes, we instead make FSTs maximally flexible by accepting any type that can cheaply provide byte slice. This should resolve a number of issues with folks constructing FSTs in ways that weren't supported by the old constructors. As a bonus, we no longer need to directly depend on a specific implementation of memory maps. Conveniently, the `Mmap` type in the `memmap` crate already implements `AsRef<[u8]>`, so using memory maps is as simple as let mmap = memmap::Mmap::map(&File::open("something.fst").unwrap()); let fst = Fst::new(mmap).unwrap(); Fixes #92, Fixes #94, Fixes #97
This change was not nearly as bad as I thought it would be. Instead of trying to provide all the different possible ways to store some bytes, we instead make FSTs maximally flexible by accepting any type that can cheaply provide byte slice. This should resolve a number of issues with folks constructing FSTs in ways that weren't supported by the old constructors. As a bonus, we no longer need to directly depend on a specific implementation of memory maps. Conveniently, the `Mmap` type in the `memmap` crate already implements `AsRef<[u8]>`, so using memory maps is as simple as let mmap = memmap::Mmap::map(&File::open("something.fst").unwrap()); let fst = Fst::new(mmap).unwrap(); Fixes #92, Fixes #94, Fixes #97
This change was not nearly as bad as I thought it would be. Instead of trying to provide all the different possible ways to store some bytes, we instead make FSTs maximally flexible by accepting any type that can cheaply provide byte slice. This should resolve a number of issues with folks constructing FSTs in ways that weren't supported by the old constructors. As a bonus, we no longer need to directly depend on a specific implementation of memory maps. Conveniently, the `Mmap` type in the `memmap` crate already implements `AsRef<[u8]>`, so using memory maps is as simple as let mmap = memmap::Mmap::map(&File::open("something.fst").unwrap()); let fst = Fst::new(mmap).unwrap(); Fixes #92, Fixes #94, Fixes #97
In a previous comment it seemed you were ready to probably introduce breaking changes to the library.
I found out that an fst (and thus a set and a map) can be created from shared bytes, those are represented by an
Arc
ofVec<u8>
, I was wondering why you chose to use this type instead of anArc<[u8]>
.One of the advantage of an
Arc<[u8]>
is that it can be used every whereAsRef<[u8]>
is needed, anArc<Vec<u8>>
doesn't implement this trait.The only advantage I see in this choice is for performance reason that will be rarely triggered: constructing an
Arc<Vec<u8>>
from an existing vec is cheaper as it will not copy the entirety of the bytes,Arc<[u8]>
does.I will stop bothering you, and let you work on more interresting features now :)
I just wanted to make sure to break things when possible and clean the library a little bit more if this makes any sense, of course.
The text was updated successfully, but these errors were encountered: