Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[spi_flash] handle reentrance gracefully #4397

Merged
merged 1 commit into from
Mar 23, 2021
Merged

Conversation

tyomitch
Copy link
Collaborator

Report failure instead of deadlocking the device

@tyomitch
Copy link
Collaborator Author

Looks like an HTTPS misconfiguration is blocking the CI

Copy link
Member

@gamblor21 gamblor21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look fine to me, but did not test directly.

while (!common_hal_busio_spi_try_lock(&supervisor_flash_spi_bus)) {}
common_hal_digitalio_digitalinout_set_value(&cs_pin, false);
static bool flash_enable(void) {
if (common_hal_busio_spi_try_lock(&supervisor_flash_spi_bus)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth retrying X amount of times even? I know adafruit_bus_device had a routine to catch background tasks and keep trying for the lock, I'm not familiar enough to say for sure this needs it just a thought.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be that instead of the busy-wait in https://github.com/adafruit/Adafruit_CircuitPython_BusDevice/blob/master/adafruit_bus_device/spi_device.py#L73, it's better to throw an exception when the SPI is locked? Then the user would have control over how many times, if at all, to retry.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looping over common_hal_busio_spi_try_lock() in C will not succeed if it doesn't work the first time: If foreground Python code has locked the bus, it has to be able to proceed in order to unlock the bus. I don't think we ever un/lock the bus from an interrupt context.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jepler doesn't the same apply to adafruit_bus_device? Or may run_background_tasks() lock SPI in one tick, and unlock it in another?

@tyomitch
Copy link
Collaborator Author

Ping @tannewt who added the loop in question in 9d91111

@tannewt
Copy link
Member

tannewt commented Mar 19, 2021

I added it three years ago... Have you reproduced an issue with this or just suspect a problem?

I think the assumption is that it'd be incorrect for this code to occur when the spi bus is locked. Having this be a loop allows you to detect the error with a debugger when developing.

@tyomitch
Copy link
Collaborator Author

I added it three years ago... Have you reproduced an issue with this or just suspect a problem?

My particular use case was attempting a file access in an interrupt handler, and observing that it locks up the board if the interrupt happens during screen update.

I think the assumption is that it'd be incorrect for this code to occur when the spi bus is locked. Having this be a loop allows you to detect the error with a debugger when developing.

There is nothing to prevent this code from being called when the SPI bus is locked: interrupts notwithstanding, the Python user can lock the bus via busio.SPI, then try a file operation, and he would have no indication whatsoever as to why the board locked up.

@hierophect
Copy link
Collaborator

@tannewt It seems the most likely place for this to be encountered in practice is with the Meowbit, which shares the SPI bus between the flash and the screen (I assume this is where you first encountered the problem, @tyomitch). Specifically, when a module tries to access the flash while the screen is updating, it causes an irrecoverable hang. Then you need to figure out how to manually wipe the NOR flash somehow, like introducing an intentional exception into your build. I just ran into it with the AudioPWMIO PR - it's not fun to deal with.

@hierophect
Copy link
Collaborator

If there are unwanted wider consequences for this change, and since the Meowbit is just one board (and a historically annoying one, for this same reason), this could be folded into an optional mpconfigboard macro so it doesn't apply to other boards.

tannewt
tannewt previously approved these changes Mar 22, 2021
Copy link
Member

@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Code that calls transfer should already be able to handle failures. Thanks! Looks like it just needs a final merge.

Report failure instead of deadlocking the device
Copy link
Collaborator

@hierophect hierophect left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jepler jepler merged commit f8cea3b into adafruit:main Mar 23, 2021
@jepler
Copy link
Member

jepler commented Mar 23, 2021

Thank you @tyomitch !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants