atmel-samd: Fix CDC #294

Merged
merged 1 commit into from
Sep 29, 2017
Conversation

tannewt
Member

@tannewt tannewt commented Sep 29, 2017

Fixes CDC by making sure we define `CONF_USB_COMPOSITE_CDC_ACM_EN` before including ASF4 files.

@tannewt tannewt added this to the 3.0 Beta milestone Sep 29, 2017
@tannewt tannewt requested a review from dhalbert September 29, 2017 19:15
@dhalbert dhalbert merged commit bac841d into adafruit:master Sep 29, 2017
jepler pushed a commit to jepler/circuitpython that referenced this pull request Nov 3, 2024
Since the very beginning, the stm32 port (first called stm, then stmhal, now
stm32) has had a special keyboard interrupt feature which works by using
PendSV to break out of any running code.  This preemptive ctrl-C was added
long ago in commit 01156d5.

The stm32 port still uses that code, and currently does this:

- If ctrl-C is received on UART or USB then `mp_sched_keyboard_interrupt()`
  is called (like all other ports) to set a flag for the VM to see, and
  then the VM (or any loop calling `mp_handle_pending(true)`) will
  eventually handle the `KeyboardInterrupt` exception, raising it via NLR.

- If another ctrl-C is received while the existing scheduled keyboard
  interrupt is still pending (ie the VM has not yet processed it) then a
  special hard NLR jump will activate, that preempts the calling code.
  Within the PendSV interrupt the stack is adjusted and an NLR jump is made
  to the most recent `nlr_push()` location.  This is like a normal NLR
  except it is called from an interrupt context and completely annihilates
  the code that was interrupted by the IRQ.
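
The two-stage behaviour in the list above can be modelled with a small Python sketch. This is a toy simulation, not the stm32 port's C code: `CtrlCModel`, `on_ctrl_c` and `vm_poll` are hypothetical names, the preemptive path is represented only by a returned string rather than a real PendSV/NLR jump, and clearing the pending flag on the preemptive path is an assumption of the model.

```python
class CtrlCModel:
    """Toy model of two-stage ctrl-C handling (all names are hypothetical)."""

    def __init__(self):
        # True while a scheduled KeyboardInterrupt has not yet been
        # handled by the VM.
        self.pending = False

    def on_ctrl_c(self):
        """Called when ctrl-C arrives on UART or USB."""
        if self.pending:
            # Second ctrl-C before the VM handled the first one: the old
            # code preempted whatever was running (modelled as a string).
            self.pending = False  # assumption: preemption consumes the flag
            return "preemptive"
        # First ctrl-C: only set a flag for the VM to notice later.
        self.pending = True
        return "scheduled"

    def vm_poll(self):
        """Stands in for the VM, or a loop calling mp_handle_pending(true)."""
        if self.pending:
            self.pending = False
            raise KeyboardInterrupt


model = CtrlCModel()
first = model.on_ctrl_c()   # flag is set, nothing is preempted
second = model.on_ctrl_c()  # flag still set, so the preemptive path is taken
```

In this model, removing the preemptive feature amounts to deleting the `if self.pending:` branch in `on_ctrl_c`, so that every ctrl-C only sets the flag.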

The reason for the preemptive interrupt was to handle ctrl-C before the VM
was able to handle it.  Eventually a mechanism (that's in use today by all
ports) was added to the VM and runtime to be able to check for pending
interrupts.  Then the stm32 port was updated to use this mechanism, with a
fallback to the old preemptive way if a second ctrl-C was received (without
the first one being processed).

This preemptive NLR jump is problematic because it can interrupt
long-running instructions (eg load multiple, usually used at the end of a
function to restore registers and return).  If such an instruction is
interrupted the CPU remembers that with some flags, and can resume the
long-running instruction when the interrupt finishes.  But the preemptive
NLR does a long jump to different code at thread level and so the
long-running instruction is never resumed.  This leads to a CPU fault.

This fault has been previously reported in issues adafruit#3807 and adafruit#3842 (see also
issue adafruit#294).  It's now possible to easily reproduce this problem, since
commit 69c25ea.  Running the test suite
over and over again on any stm32 board will eventually crash the board (it
can happen on a PYBv1.x, but it happens more regularly on PYBD-SF2/6).

The point is, a skipped test now soft resets the board and so the board
must run `boot.py` again.  The test runner may then interrupt the execution
of `boot.py` with the double-ctrl-C that it sends (in `tools/pyboard.py`,
`enter_raw_repl()`) in order to get the board into a known good state for
the next test.  If the timing is right, this can trigger the preemptive
PendSV in an unfortunate location and hard fault the board.
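
As a rough illustration of the double ctrl-C, here is a sketch that writes the bytes to a recording stand-in for a serial port. It is not the code from `tools/pyboard.py`: `RecordingPort` and `interrupt_running_code` are hypothetical names, and the exact byte sequence (a carriage return followed by two ctrl-C bytes) is an assumption.

```python
CTRL_C = b"\x03"  # ASCII ETX, the byte a terminal sends for ctrl-C


class RecordingPort:
    """Hypothetical stand-in for a serial port; it only records writes."""

    def __init__(self):
        self.written = bytearray()

    def write(self, data):
        self.written += data
        return len(data)


def interrupt_running_code(port):
    # Two ctrl-C bytes sent back to back.  With the old stm32 code, the
    # second byte could arrive while the first interrupt was still
    # pending, and so trigger the preemptive path.
    return port.write(b"\r" + CTRL_C + CTRL_C)


port = RecordingPort()
sent = interrupt_running_code(port)
```

Because the two bytes arrive so close together, the VM often has no chance to process the first interrupt before the second one is received, which is what makes the test runner a reliable trigger.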

The fix in this commit is to just remove the preemptive NLR jump feature.
No other port has this feature, and it's not needed: ctrl-C works very well
on those ports.  Preemptive NLR jump is a very dangerous thing (eg it may
interrupt and break out of an external SPI flash operation when reading
code from a filesystem) and, as the hard faults above show, is buggy.

With this commit, stm32 boards no longer hard fault when running the test
suite (but this does leave an issue: the tests can still interrupt `boot.py`
with a single ctrl-C; that will be fixed separately).

An alternative to this commit would be to clear the CPU state for the
long-running instruction as suggested in issue adafruit#3842.  But it's much
simpler to just remove this code, which is now unnecessary and can have
other problems as per issue adafruit#294.

Signed-off-by: Damien George <damien@micropython.org>
2 participants