zlib Module #6069
Thanks for doing this! I have not looked at the rest of it yet, but we want to call this zlib, not uzlib. If there are incompatible differences with the CPython API (that is, if it's not just a subset), then we would want to change them to match the CPython API.
So call everything zlib instead of uzlib. An example of this renaming without the shared-bindings/shared-module changes is re, which is ure in MicroPython.
Thanks for working on this @gamblor21! I tested the functionality of this successfully with this:
I'm still a bit unclear on how the wbits parameter works, but I found that removing the 10-byte gzip header from the data and then using any negative value (tested -1 to -100) allows the data in the example to decompress. My interpretation of the MicroPython docs is that positive values may be meant to work with data that does include the header. I wasn't able to get any positive values to work in the script above, though, with or without the header bytes in the data.
Will do. Just a FYI for you (and others) the underlying library itself is called |
I tried out this branch again after newest commits on Feather TFT ESP32-S2. Confirmed that the name is now |
Thanks for the PR @gamblor21!
It looks like the uzlib API isn't grounded in the real zlib API. The CPython API doesn't have an IO wrapper class at all. The decompress object doesn't really have the stream API.
However, gzip has a GzipFile class and a gzip.decompress() function. I think that's the best approach for making this CPython compatible. Rename to gzip and rework the API to match. https://docs.python.org/3/library/gzip.html
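For readers unfamiliar with the CPython gzip API mentioned here, the following is a minimal sketch of the two entry points, run under CPython. The sample payload is made up for illustration, and nothing here is claimed about what the CircuitPython module in this PR supports.

```python
import gzip
import io

# Illustrative payload (an assumption, not data from this PR).
original = b"CircuitPython " * 8
compressed = gzip.compress(original)

# One-shot API: the whole compressed buffer goes in, bytes come out.
assert gzip.decompress(compressed) == original

# File-like API: GzipFile wraps any binary file object and supports
# incremental reads, so the full output need not be requested at once.
with gzip.GzipFile(fileobj=io.BytesIO(compressed), mode="rb") as gz:
    first = gz.read(7)   # read only the first 7 decompressed bytes
    rest = gz.read()     # read whatever remains

print(first)
```

GzipFile is the piece that behaves like a stream: it only needs a readable binary file object underneath it.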
I'll take a look a bit more in depth at the
Just as an FYI, I did this as a quick change in the hope that the extmod could work as-is, so depending on the extent of the rework this may take a while, as it is low on my priority list at the moment.
My current code accesses response.content, which I originally thought was a stream. But the docs list it as bytes, so perhaps I misunderstood what it was. I've only taken a quick look, but I didn't see anything in gzip that seems to have the stream functionality. I do think being able to decompress a stream would be good functionality to have. I can see CPython not benefiting from it as much: with far more RAM available, it is less likely to encounter compressed data that doesn't fit fully into RAM with plenty left over. Maybe, since the streaming functionality differs from the CPython API, we could make a new module name for it? It also doesn't seem to exist inside of gzip, as far as I can tell.
Or we could remove the streaming functionality entirely to match the CPython API more closely? I think it seems nice to have, but maybe I overestimate its usefulness. As far as the
This code runs and has the same output in both CPython and CircuitPython (built from this branch):
import zlib
data = b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03EPAn\xc2@\x0c\xbc\xe7\x15\xd6\x9e\tJ\x02\xa2Jn\xfd\x00\x87\xaa\\ZU\xc8\r.\xac\xbaI\xc0\xeb\xd0\xa2(\x7f\xafw\x13\xca\x9e<c\xcf\xd8\xb3C\x02`Zl\xc8T`\x98\xd0\x89\xd5z\x11X\x8f\xcd\xd9\xd9\xf6\xa8\x9da\x8c\xcc\xa5\'\xbe\x05\xa8@aC\xc2\xb6\xf6J\xbcGB)\x96\nk\xb1W\xdaybo"\xfd\xb1\x98\xc7\xf17e\xf2\xbd\x93 \xc93}\xda\x98\x9c\xd5\n\x1f\xc6\xf7{\x9e\xa3\x15D/x\xb1\xc7\x93\xc0\xb6\xfb1\xb3\xdf\x81|\xcd\xf6,\xb6k\xc3\xf0\xb6o>\x89\xa1\xfb\x82>\xce\xd7=3\xb5\xe2np\xb5\xde\x8a\x06\x01t\x0e\xb4$\xbf4\xff\x9b\x0f\x187O\t\x86{\x8e)\xc4>*;\x0e\xf7\x9aU\x96\x97e6E\x1a\x939\x96\x91N\xd0\xf9\xc7\x17\t~S\xbbG\t\x8a"+\x8a4+\xd2\xbc|\xcd\x9f\xaau^\x15\xeb\xe5f\xb5y3\xc9\xf8\x07\xbb\x92\xbdow\x01\x00\x00'
data_body = data[10:]
decompressed = zlib.decompress(data_body, -15)
print(decompressed)
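On the streaming question raised in this comment: CPython's zlib module does offer incremental decompression through zlib.decompressobj(), even though it has no file-style wrapper. The sketch below runs under CPython only; the helper name decompress_chunks, the 32-byte chunk size, and the payload are all assumptions for illustration, and it says nothing about what this branch implements.

```python
import gzip
import zlib


def decompress_chunks(chunks, wbits=31):
    """Yield decompressed pieces from an iterable of compressed chunks.

    Hypothetical helper. wbits=31 (16 + 15) tells zlib to expect a gzip
    header and trailer around the deflate stream.
    """
    decomp = zlib.decompressobj(wbits=wbits)
    for chunk in chunks:
        out = decomp.decompress(chunk)
        if out:
            yield out
    tail = decomp.flush()  # emit anything still buffered
    if tail:
        yield tail


# Simulate data arriving in small pieces, as it might from a socket.
payload = bytes(range(256)) * 64
gz = gzip.compress(payload)
pieces = [gz[i:i + 32] for i in range(0, len(gz), 32)]

result = b"".join(decompress_chunks(pieces))
assert result == payload
```

Joining the pieces at the end is only for the check; a memory-constrained caller would consume each yielded piece and discard it.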
Ya, totally get that. We do have a third option of picking a non-CPython name for the existing API. I'd avoid Can't the file API be used for streams? I wonder how the real requests library handles gzip. |
If a new name were selected would it be for the stream API only and I tried poking around in https://github.com/psf/requests to figure out how it's handled in there, but I'm coming up empty. I tried some I am a bit perplexed to not to find much with |
I'd be ok if they were split into two modules. |
Did some quick research. First off from what has been mentioned I can have just I also looked quickly into how CPython works and what it uses. The full zlib is almost 100K, so that is a non-starter. I think once things are set up, future work could look for a smaller zlib-compatible library (zlib may already have some smaller options; I'm not sure). At a later date, more of the CPython functionality could then be included.
I believe DecompIO can be made into
We're not sure what wbits is used for, so I'd just drop it and make it So, I still think we want to make this |
I will take a look into the DecompIO being able to read files. Just need some time to figure out how that all works but I don't see it being too hard. I did read some information on what One thing just to keep in mind (for anyone reading this in the future) the CPython |
I did find some information about wbits (putting it here for future reference if needed):
From https://stackoverflow.com/questions/3122145/zlib-error-error-3-while-decompressing-incorrect-header-check |
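As a companion to the wbits notes, here is a minimal sketch of how the CPython zlib documentation describes the value ranges: 9 to 15 expects a zlib header and trailer, -9 to -15 expects a raw deflate stream with no header, 16 + (9 to 15) expects gzip framing, and 32 + (9 to 15) auto-detects zlib or gzip. This reflects CPython behavior; whether the module in this PR handles every mode the same way is not something the sketch establishes. The payload is made up.

```python
import gzip
import zlib

payload = b"hello zlib " * 20  # illustrative data

# zlib container (2-byte header, Adler-32 trailer): positive wbits 9..15.
zlib_data = zlib.compress(payload)
from_zlib = zlib.decompress(zlib_data, 15)

# gzip container: add 16 to the window size, so 16 + 15 = 31.
gzip_data = gzip.compress(payload)
from_gzip = zlib.decompress(gzip_data, 31)

# Auto-detect zlib or gzip framing: add 32, so 32 + 15 = 47.
from_auto = zlib.decompress(gzip_data, 47)

# Raw deflate with no header or trailer: negative wbits.
compressor = zlib.compressobj(wbits=-15)
raw_data = compressor.compress(payload) + compressor.flush()
from_raw = zlib.decompress(raw_data, -15)

assert from_zlib == from_gzip == from_auto == from_raw == payload
```

This matches the earlier observation in the thread that stripping the 10-byte gzip header and passing a negative wbits works: what remains is then treated as a raw deflate stream.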
Hi, I'm pretty interested in using decompression in CircuitPython. What remaining steps need to happen for this to make it into the next release? If there are things I can help with, let me know. |
Are there any specific parts of this you are looking for? Short term I want to get at least |
Hi @gamblor21 — thanks for the update! I'm implementing over-the-air software installation for a MatrixPortal M4, and my hope is to be able to unpack a downloaded zip file full of Python files. I know that zipfile support might be a long way off, but even having basic decompression (such as |
100a2bc to 1d470c2
@gamblor21 is this self-contained as it is? Should we go ahead and merge this to get part of the functionality? If you could make an issue (or edit an existing one) with a task list (use |
Yes, this should be fully self-contained. Worst case if I missed something
Will do in the next day or so. |
OK, let's merge what we have and I'll describe it as "preliminary" and/or "experimental". Thanks!
@gamblor21 I re-read @tannewt's comments, and it sounds like he would like it all wrapped up in |
@dhalbert My understanding based on Scott's comment from 2/3 (quoted) was that it was okay to split the It's possible that my interpretation is incorrect though. I know it was also discussed in In the Weeds during one or two of the meetings, but I'm not certain which dates those were, and I think at least one of them was a week when Scott wasn't present.
Got it, thanks, I was not following it that closely. I'll go ahead, since this is useful functionality, and |
partial implementation now matches discussion, we believe
Hi, as @FoamyGuy said we did discuss it during in the weeds a few weeks ago. I looked and |
Moving the MicroPython extmod/uzlib to the CircuitPython shared-bindings/module pattern. Provides zlib decompression functionality.