Add fill method to displayio.Bitmap #2756
Looks like You could also swap the loop order and do y as the outside loop to reduce execution of |
Are |
For Good question about |
Apologies, I didn't check second diff as I thought it was just doc changes. |
No worries. First commit was "copy, paste, does it work?". Second commit had some minor refactoring. |
Normally in Python lower-bounds are inclusive and upper-bounds are exclusive, so the slice [1:4] is the elements 1, 2, and 3. I hope that the same is chosen for x2/y2 in this PR. |
This approach is fine with me.
Do you want to speed it up further? You could pack the value into one uint32_t at the start and then fill the memory. That would reduce the number of memory operations.
@tannewt That's funny, I just went for a stroll and thought of the same thing for the special but common easy case of I've not looked at exactly how the packing works, will one 32-bit chunk work for all cases? For 8 palette colours represented by 3 bits, does it spill between two 32-bit chunks, i.e. would three different 32-bit values need to be written in? |
@tannewt Sure, might as well make it as fast as possible while we're here. Guess I have the same general question as @kevinjwalters. |
bits_per_value is always a power of 2 (so 1, 2 or 4 bits). Inherent in the computation here: https://github.com/adafruit/circuitpython/blob/master/shared-bindings/displayio/Bitmap.c#L66 You may accidentally set values past the end of the row but that is fine because the stride (aka how long a row is) is rounded up to the nearest word boundary. |
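To make the stride point concrete, here is a minimal sketch in C. It is not the CircuitPython implementation: `values_per_word` and `stride_words` are hypothetical helper names, and the sketch assumes `bits_per_value` divides 32 evenly, as described above.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper (not from the CircuitPython source):
// how many pixel values fit in one 32-bit word.
uint32_t values_per_word(uint32_t bits_per_value) {
    // Assumption from the discussion: the bit depth divides 32 evenly,
    // so a value never straddles two words.
    assert(bits_per_value != 0 && 32 % bits_per_value == 0);
    return 32 / bits_per_value;
}

// Hypothetical helper: number of 32-bit words needed for one row,
// rounded up to the next whole word.
uint32_t stride_words(uint32_t width, uint32_t bits_per_value) {
    assert(bits_per_value != 0 && 32 % bits_per_value == 0);
    uint32_t row_bits = width * bits_per_value;
    return (row_bits + 31) / 32;
}
```

With a 10-pixel-wide, 4-bit bitmap, a row needs 40 bits, so the stride is 2 words and the last 24 bits of each row are padding. Writing whole words across the buffer only ever touches that padding, never the next row's pixels.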
Something like this? Tested this and seems to work.
// build the packed word
uint32_t word = 0;
for (uint8_t i=0; i<32 / self->bits_per_value; i++) {
    word |= (value & self->bitmask) << (32 - ((i+1)*self->bits_per_value));
}
// copy it in
for (uint32_t i=0; i<self->stride * self->height; i++) {
    self->data[i] = word;
}
EDIT - oh, and is indeed faster :)
Adafruit CircuitPython 5.1.0-92-gdc7574684-dirty on 2020-04-10; Adafruit CLUE nRF52840 Express with nRF52840
>>> import fill_test
Fill 1
1.32199
Fill 2
0.0059967
>>> |
@caternuson yup! Looks right to me. |
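For anyone who wants to try the packing idea outside the firmware, here is a standalone sketch in C. The function names `pack_word` and `fill_words` are made up for illustration and the bitmap struct is replaced by plain arguments. It assumes the bit depth divides 32 evenly. It shifts from the low end rather than the high end as the snippet above does; since every slot holds the same value, the resulting word is identical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: build a 32-bit word with `value` repeated in every slot.
uint32_t pack_word(uint32_t value, uint32_t bits_per_value) {
    // Assumption: the bit depth divides 32 evenly.
    assert(bits_per_value != 0 && 32 % bits_per_value == 0);
    // Mask for one value; handle 32 separately to avoid an out-of-range shift.
    uint32_t bitmask = (bits_per_value == 32)
        ? UINT32_MAX
        : ((UINT32_C(1) << bits_per_value) - 1);
    uint32_t word = 0;
    for (uint32_t i = 0; i < 32 / bits_per_value; i++) {
        word |= (value & bitmask) << (i * bits_per_value);
    }
    return word;
}

// Hypothetical helper: write the packed word into every word of the buffer.
// word_count corresponds to stride * height in the snippet above.
void fill_words(uint32_t *data, size_t word_count, uint32_t word) {
    for (size_t i = 0; i < word_count; i++) {
        data[i] = word;
    }
}
```

The speedup comes from doing one store per 32-bit word instead of a read-modify-write per pixel, which is consistent with the timing difference reported above.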
@tannewt OK, pushed those changes. I think the last remaining question is the treatment of the upper bounds for the dirty area. This? (current code)
self->dirty_area.x2 = self->width;
self->dirty_area.y2 = self->height;
or this?
self->dirty_area.x2 = self->width - 1;
self->dirty_area.y2 = self->height - 1; |
Looks like the original code was correct with |
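A small sketch of why the exclusive upper bound is convenient, assuming the dirty area follows the half-open convention mentioned earlier in the thread. The `area_t` struct and both functions are hypothetical stand-ins, not the displayio types.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical area type: x1/y1 inclusive, x2/y2 exclusive (assumed convention).
typedef struct {
    int16_t x1, y1, x2, y2;
} area_t;

// Area covering a whole width x height bitmap.
area_t full_area(int16_t width, int16_t height) {
    area_t a = {0, 0, width, height};
    return a;
}

// With exclusive upper bounds the size is a plain difference, no +1 needed.
int32_t area_pixel_count(const area_t *a) {
    assert(a->x2 >= a->x1 && a->y2 >= a->y1);
    return (int32_t)(a->x2 - a->x1) * (int32_t)(a->y2 - a->y1);
}
```

Under this convention, setting `x2 = width - 1` would leave the last column and row outside the area, so a full fill would mark one column and one row too few.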
Current version looks correct. Thanks!
This is functional at least. I basically just stole from the pixel set code and loopified it.
Test program:
Results: