Use bitarray as underlying data type #260
Closed
scott-griffiths started this conversation in Ideas
Replies: 2 comments
-
Now available as a beta release:
-
... and now released properly in version 4.1.
-
The biggest problem with bitstring, as I see it, is speed.
It already does most of what I want it to do, and I'm pretty happy with the API. The main reason people find it unsuitable for their task is that it just isn't fast enough. Fundamentally this is because it has always been a pure Python module, and Python just doesn't have the bit-level methods needed. Possible solutions:
Option 1 is ruled out mainly because it's too much work, and I wouldn't enjoy programming in C. I've had a few goes with Cython and it seems a good tool, but it would still be a lot of work to get a good enough speed-up. That leaves the third option, which is what I'm now looking at...
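As a rough illustration of the speed point, here is a minimal sketch of what reading an unaligned field looks like when only bytes and integers are available. The function name read_uint and its parameters are made up for this example and are not bitstring's actual internals. Every read needs a byte slice, a big-integer conversion, a shift and a mask, all executed by the interpreter.

```python
def read_uint(data: bytes, bit_offset: int, bit_length: int) -> int:
    """Read an unsigned integer of bit_length bits starting at bit_offset.

    Bits are numbered from the most significant bit of the first byte.
    Hypothetical helper for illustration only.
    """
    # Find the smallest run of whole bytes that covers the requested bits.
    first_byte = bit_offset // 8
    last_byte = (bit_offset + bit_length + 7) // 8
    # Turn that run into one Python integer.
    chunk = int.from_bytes(data[first_byte:last_byte], "big")
    # Drop the unwanted bits after the field, then mask off those before it.
    trailing = last_byte * 8 - (bit_offset + bit_length)
    return (chunk >> trailing) & ((1 << bit_length) - 1)


data = bytes([0b10110011, 0b01010101])
aligned = read_uint(data, 0, 8)    # whole first byte
unaligned = read_uint(data, 3, 7)  # 7 bits straddling both bytes
```

A container written in C can do the same shifting and masking without creating intermediate Python objects for each step.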
Replacing internal storage with bitarray objects.
The bitarray package has existed since about the same time as bitstring, and they perform a very similar role. Their APIs are rather different and I think it's fair to say that bitstring has more features, but there's no denying that bitarray is much faster as it's coded in C. By replacing the internal representation of the bitstring with a bitarray object we can keep the API and functionality of bitstring but gain (most of) the speed of bitarray.
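The idea of keeping the API while swapping the storage can be sketched as below. This is only an illustration under stated assumptions: MiniBits and store_factory are hypothetical names, not bitstring's real classes, and the built-in list and bytearray stand in for the storage so the sketch runs without extra packages. The assumption is that a bitarray.bitarray could be passed as the factory in the same way, since it is also built from an iterable of 0/1 integers and supports len(), iteration and slicing; that is not exercised here.

```python
class MiniBits:
    """Hypothetical, much-reduced stand-in for a bitstring-style class."""

    def __init__(self, binary, store_factory=list):
        # The storage only has to accept an iterable of 0/1 ints and
        # support len(), iteration and slicing.
        self._factory = store_factory
        self._store = store_factory(int(c) for c in binary)

    def __len__(self):
        return len(self._store)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slicing is delegated to the storage, then re-wrapped.
            new = MiniBits("", self._factory)
            new._store = self._store[key]
            return new
        return bool(self._store[key])

    @property
    def bin(self):
        # Binary string view, built from whatever the storage yields.
        return "".join(str(b) for b in self._store)

    @property
    def uint(self):
        # Unsigned integer interpretation of the bits.
        return int(self.bin, 2)


a = MiniBits("101100")                           # list-backed
b = MiniBits("101100", store_factory=bytearray)  # bytearray-backed
same_api = (a.bin == b.bin) and (a.uint == b.uint) and (a[2:5].bin == b[2:5].bin)
```

The public methods never mention the storage type, so callers see the same behaviour whichever backend is used; only the speed of the delegated operations changes.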
Hopefully this is a win-win - users of bitstring get a speed boost and bitarray implicitly gains bitstring's users!
I've been trying this out on the bitarray_test branch. It's already passing the bulk of the unit tests - some work is needed for file-based bitstrings and LSB0 bit numbering but the main stuff works.
New features:
- A tobitarray method that gives either the bitarray.bitarray or bitarray.frozenbitarray storage back to the user.
- Initialisation from a bitarray or frozenbitarray, either as a keyword or auto initialised.
My rough plan is that this will become version 4.1 of bitstring some time in 2023.
Possible issues
I'm interested to hear any views on this - good or bad. Feel free to try out the bitarray_test branch if you're feeling brave.