Resolves Timed out waiting for RSA operation #256

nathanjel · 2017-01-17T22:13:27Z

Resolves Timed out waiting for RSA operation abort while using SSL connectivity with mbedtls

…nnectivity with mbedtls

projectgus · 2017-01-17T23:36:10Z

Hi nathanjel,

Thanks for debugging this, sorry I didn't reply earlier to the issue..

I admit when I wrote the "fundamental problem with the driver" comment I was thinking that if the bignum operation didn't complete within 2 seconds, this would be a driver bug. However it sounds like here what is happening is that the bignum ISR is being starved out for more than 2 seconds (so the interrupt completed, but it hasn't had a chance to run on CPU and flag the semaphore.)

The change you sent is very illuminating. LOWMED is the default interrupt priority, if you don't provide any level flags. So it seems like setting IRAMONLY is what makes the difference here. This is the correct flag, given the handler is entirely implemented in IRAM.

However, it's still a little odd - I don't really understand why the ISR could remain unserviced for two seconds when the handler is in flash, and not when it's in IRAM, even though the priority is nominally the same.

You said your program makes intensive use of a lot of features, do any of those features involve erasing or writing to flash (or use the NVS API or anything else that might be writing to flash in the background)?

Angus

nathanjel · 2017-01-18T00:56:06Z

Hi,

Indeed, my code does use spi_flash_* functions every few minutes, to save accumulated data and eventually update configuration (it's a hi-tech dimmer, needs to keep state and record energy consumption) Actually, to make my stability tests more robust, for testing I changed the code so that it writes a flash page every two seconds. In normal usage scenarios it might be every few minutes up to every few days, depending on frequency of user interactions.

I am using my own filesystem over spi_flash_* for it (https://github.com/nathanjel/ffs) and I also had to tweak a bit the whole solution reg. task core affinity (see #258), otherwise the flash functions would abort() every few usages.

To be 100% honest - my fix suggestion was intuitive - obviously I have less exposure to ESP32 internals than the team behind the esp-idf, but i thought it was good to share what worked in my case.

Marcin

igrr · 2017-01-18T00:59:56Z

Do you happen to run any tasks at the highest priority level in your system (23)?

nathanjel · 2017-01-18T01:19:08Z

No, I do not. I use the lower priority numbers. Below is how I start all my tasks.

#define TASK_PRIO_REGULAR 	(ESP_TASK_MAIN_PRIO+1)
#define TASK_PRIO_ELEVATED 	(ESP_TASK_MAIN_PRIO+2)
#define TASK_PRIO_IO	    (ESP_TASK_MAIN_PRIO+3)

#define CORE_PIN_STATUS		tskNO_AFFINITY
#define CORE_PIN_NET		tskNO_AFFINITY
#define CORE_PIN_IO			tskNO_AFFINITY
#define CORE_PIN_IOC		tskNO_AFFINITY
#define CORE_PIN_NR			tskNO_AFFINITY
#define CORE_PIN_MOOD		tskNO_AFFINITY

/home/nathan/xnode/SNESP32/main/iohandle.c:
  483  
  484  	// we got config in, so start sending control data
  485: 	xTaskCreatePinnedToCore(PowerControlTaskRunner, "PC", OSI_STACK_SIZE_SMALL, NULL, TASK_PRIO_IO, NULL, CORE_PIN_IOC);
  486  
  487  	for (;;) {

/home/nathan/xnode/SNESP32/main/main.c:
   60      // start tasks
   61      ESP_LOGI("MAIN", "Starting tasks");
   62:     xTaskCreatePinnedToCore(StatusTask, "*Status",
   63                  OSI_STACK_SIZE_TYPICAL,
   64                  NULL, TASK_PRIO_REGULAR, NULL, CORE_PIN_STATUS);
   65      
   66:     xTaskCreatePinnedToCore(NetworkTask, "*Net",
   67                  OSI_STACK_SIZE_LARGE,
   68                  NULL, TASK_PRIO_ELEVATED, NULL, CORE_PIN_NET);
   69  
   70:     xTaskCreatePinnedToCore(IOTask, "*IO",
   71                  OSI_STACK_SIZE_TYPICAL,
   72                  NULL, TASK_PRIO_IO, NULL, CORE_PIN_IO);
   73  
   74:     xTaskCreatePinnedToCore(MoodNodeTask, "*Mood",
   75                  OSI_STACK_SIZE_TYPICAL,
   76                  NULL, TASK_PRIO_REGULAR, NULL, CORE_PIN_MOOD);

igrr · 2017-01-18T01:28:49Z

Okay, I see. Actually this issue dovetails nicely with the other one about spi_flash, and I think I see what the cause of both issues is. Will do some tests today and update you later.

nathanjel · 2017-01-18T07:58:56Z

This happens after I enable optimization.

Guru Meditation Error: Core  0 panic'ed (Interrupt wdt timeout on CPU0)
Register dump:
PC      : 0x40083650  PS      : 0x00060034  A0      : 0x80083072  A1      : 0x3ffb0b50  
A2      : 0x3ffb1d68  A3      : 0x3ffd3c90  A4      : 0x0000001d  A5      : 0x3ffb0b7c  
A6      : 0x00000001  A7      : 0x3ffd3f0c  A8      : 0xb33fffff  A9      : 0xb33f0000  
A10     : 0x00060023  A11     : 0x00060023  A12     : 0xb33fffff  A13     : 0x00000000  
A14     : 0x00060023  A15     : 0x00000000  SAR     : 0x00000018  EXCCAUSE: 0x00000005  
EXCVADDR: 0x00000000  LBEG    : 0x4000c349  LEND    : 0x4000c36b  LCOUNT  : 0x00000000  

Backtrace: 0x40083650:0x3ffb0b50 0x40083072:0x3ffb0b70 0x400816ed:0x3ffb0bb0

Entering gdb stub now.
$T06#ba

and GDB


Remote debugging using /dev/ttyUSB1
0x40083650 in rsa_complete_isr (arg=0x3ffb1d68 <uart_spinlock+8>) at /home/nathan/xnode/esp-idf/components/mbedtls/port/esp_bignum.c:71
71	        portYIELD_FROM_ISR();
(gdb) bt
#0  0x40083650 in rsa_complete_isr (arg=0x3ffb1d68 <uart_spinlock+8>) at /home/nathan/xnode/esp-idf/components/mbedtls/port/esp_bignum.c:71
#1  0x40083072 in uart_rx_intr_handler_default (param=0x3ffd3c60) at /home/nathan/xnode/esp-idf/components/driver/./uart.c:488
#2  0x400816ed in _xt_medint3 ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb)

nathanjel · 2017-01-28T21:06:18Z

The spi_flash issue solution closed this particular PR and related issues.

milamber-ls · 2018-12-03T03:34:11Z

is the pull request accepted? in dev 3.2 idf I still see
static void rsa_isr_initialise() { if (op_complete_sem == NULL) { op_complete_sem = xSemaphoreCreateBinary(); esp_intr_alloc(ETS_RSA_INTR_SOURCE, 0, rsa_complete_isr, NULL, NULL); } }

Resolves Timed out waiting for RSA operation abort while using SSL co…

3b529b9

…nnectivity with mbedtls

nathanjel mentioned this pull request Jan 17, 2017

esp_bignum aborts randomly when in task with no CPU affinity / Timed out waiting for RSA operation (op_reg 0x3ff02810 int_reg 0x1) #239

Closed

nathanjel mentioned this pull request Jan 18, 2017

[TW#12230] gettimeofday not reentrant when used with RTC registers only / "waiti" called and triggers watchdog #219

Closed

nathanjel closed this Jan 28, 2017

0xFEEDC0DE64 pushed a commit to 0xFEEDC0DE64/esp-idf that referenced this pull request May 5, 2021

Add FS, SD and SD_MMC (espressif#256)

e625b3b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Resolves Timed out waiting for RSA operation #256

Resolves Timed out waiting for RSA operation #256

nathanjel commented Jan 17, 2017

projectgus commented Jan 17, 2017 •

edited

Loading

nathanjel commented Jan 18, 2017 •

edited

Loading

igrr commented Jan 18, 2017

nathanjel commented Jan 18, 2017

igrr commented Jan 18, 2017

nathanjel commented Jan 18, 2017

nathanjel commented Jan 28, 2017

milamber-ls commented Dec 3, 2018

Resolves Timed out waiting for RSA operation #256

Resolves Timed out waiting for RSA operation #256

Conversation

nathanjel commented Jan 17, 2017

projectgus commented Jan 17, 2017 • edited Loading

nathanjel commented Jan 18, 2017 • edited Loading

igrr commented Jan 18, 2017

nathanjel commented Jan 18, 2017

igrr commented Jan 18, 2017

nathanjel commented Jan 18, 2017

nathanjel commented Jan 28, 2017

milamber-ls commented Dec 3, 2018

projectgus commented Jan 17, 2017 •

edited

Loading

nathanjel commented Jan 18, 2017 •

edited

Loading