Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolves Timed out waiting for RSA operation #256

Closed
wants to merge 1 commit into from
Closed

Resolves Timed out waiting for RSA operation #256

wants to merge 1 commit into from

Conversation

nathanjel
Copy link
Contributor

Resolves Timed out waiting for RSA operation abort while using SSL connectivity with mbedtls

@projectgus
Copy link
Contributor

projectgus commented Jan 17, 2017

Hi nathanjel,

Thanks for debugging this, sorry I didn't reply earlier to the issue..

I admit when I wrote the "fundamental problem with the driver" comment I was thinking that if the bignum operation didn't complete within 2 seconds, this would be a driver bug. However it sounds like here what is happening is that the bignum ISR is being starved out for more than 2 seconds (so the interrupt completed, but it hasn't had a chance to run on CPU and flag the semaphore.)

The change you sent is very illuminating. LOWMED is the default interrupt priority, if you don't provide any level flags. So it seems like setting IRAMONLY is what makes the difference here. This is the correct flag, given the handler is entirely implemented in IRAM.

However, it's still a little odd - I don't really understand why the ISR could remain unserviced for two seconds when the handler is in flash, and not when it's in IRAM, even though the priority is nominally the same.

You said your program makes intensive use of a lot of features, do any of those features involve erasing or writing to flash (or use the NVS API or anything else that might be writing to flash in the background)?

Angus

@nathanjel
Copy link
Contributor Author

nathanjel commented Jan 18, 2017

Hi,

Indeed, my code does use spi_flash_* functions every few minutes, to save accumulated data and eventually update configuration (it's a hi-tech dimmer, needs to keep state and record energy consumption) Actually, to make my stability tests more robust, for testing I changed the code so that it writes a flash page every two seconds. In normal usage scenarios it might be every few minutes up to every few days, depending on frequency of user interactions.

I am using my own filesystem over spi_flash_* for it (https://github.com/nathanjel/ffs) and I also had to tweak a bit the whole solution reg. task core affinity (see #258), otherwise the flash functions would abort() every few usages.

To be 100% honest - my fix suggestion was intuitive - obviously I have less exposure to ESP32 internals than the team behind the esp-idf, but i thought it was good to share what worked in my case.

Marcin

@igrr
Copy link
Member

igrr commented Jan 18, 2017

Do you happen to run any tasks at the highest priority level in your system (23)?

@nathanjel
Copy link
Contributor Author

No, I do not. I use the lower priority numbers. Below is how I start all my tasks.

#define TASK_PRIO_REGULAR 	(ESP_TASK_MAIN_PRIO+1)
#define TASK_PRIO_ELEVATED 	(ESP_TASK_MAIN_PRIO+2)
#define TASK_PRIO_IO	    (ESP_TASK_MAIN_PRIO+3)

#define CORE_PIN_STATUS		tskNO_AFFINITY
#define CORE_PIN_NET		tskNO_AFFINITY
#define CORE_PIN_IO			tskNO_AFFINITY
#define CORE_PIN_IOC		tskNO_AFFINITY
#define CORE_PIN_NR			tskNO_AFFINITY
#define CORE_PIN_MOOD		tskNO_AFFINITY
/home/nathan/xnode/SNESP32/main/iohandle.c:
  483  
  484  	// we got config in, so start sending control data
  485: 	xTaskCreatePinnedToCore(PowerControlTaskRunner, "PC", OSI_STACK_SIZE_SMALL, NULL, TASK_PRIO_IO, NULL, CORE_PIN_IOC);
  486  
  487  	for (;;) {

/home/nathan/xnode/SNESP32/main/main.c:
   60      // start tasks
   61      ESP_LOGI("MAIN", "Starting tasks");
   62:     xTaskCreatePinnedToCore(StatusTask, "*Status",
   63                  OSI_STACK_SIZE_TYPICAL,
   64                  NULL, TASK_PRIO_REGULAR, NULL, CORE_PIN_STATUS);
   65      
   66:     xTaskCreatePinnedToCore(NetworkTask, "*Net",
   67                  OSI_STACK_SIZE_LARGE,
   68                  NULL, TASK_PRIO_ELEVATED, NULL, CORE_PIN_NET);
   69  
   70:     xTaskCreatePinnedToCore(IOTask, "*IO",
   71                  OSI_STACK_SIZE_TYPICAL,
   72                  NULL, TASK_PRIO_IO, NULL, CORE_PIN_IO);
   73  
   74:     xTaskCreatePinnedToCore(MoodNodeTask, "*Mood",
   75                  OSI_STACK_SIZE_TYPICAL,
   76                  NULL, TASK_PRIO_REGULAR, NULL, CORE_PIN_MOOD);

@igrr
Copy link
Member

igrr commented Jan 18, 2017

Okay, I see. Actually this issue dovetails nicely with the other one about spi_flash, and I think I see what the cause of both issues is. Will do some tests today and update you later.

@nathanjel
Copy link
Contributor Author

This happens after I enable optimization.

Guru Meditation Error: Core  0 panic'ed (Interrupt wdt timeout on CPU0)
Register dump:
PC      : 0x40083650  PS      : 0x00060034  A0      : 0x80083072  A1      : 0x3ffb0b50  
A2      : 0x3ffb1d68  A3      : 0x3ffd3c90  A4      : 0x0000001d  A5      : 0x3ffb0b7c  
A6      : 0x00000001  A7      : 0x3ffd3f0c  A8      : 0xb33fffff  A9      : 0xb33f0000  
A10     : 0x00060023  A11     : 0x00060023  A12     : 0xb33fffff  A13     : 0x00000000  
A14     : 0x00060023  A15     : 0x00000000  SAR     : 0x00000018  EXCCAUSE: 0x00000005  
EXCVADDR: 0x00000000  LBEG    : 0x4000c349  LEND    : 0x4000c36b  LCOUNT  : 0x00000000  

Backtrace: 0x40083650:0x3ffb0b50 0x40083072:0x3ffb0b70 0x400816ed:0x3ffb0bb0

Entering gdb stub now.
$T06#ba

and GDB


Remote debugging using /dev/ttyUSB1
0x40083650 in rsa_complete_isr (arg=0x3ffb1d68 <uart_spinlock+8>) at /home/nathan/xnode/esp-idf/components/mbedtls/port/esp_bignum.c:71
71	        portYIELD_FROM_ISR();
(gdb) bt
#0  0x40083650 in rsa_complete_isr (arg=0x3ffb1d68 <uart_spinlock+8>) at /home/nathan/xnode/esp-idf/components/mbedtls/port/esp_bignum.c:71
#1  0x40083072 in uart_rx_intr_handler_default (param=0x3ffd3c60) at /home/nathan/xnode/esp-idf/components/driver/./uart.c:488
#2  0x400816ed in _xt_medint3 ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) 


@nathanjel
Copy link
Contributor Author

The spi_flash issue solution closed this particular PR and related issues.

@nathanjel nathanjel closed this Jan 28, 2017
@milamber-ls
Copy link

is the pull request accepted? in dev 3.2 idf I still see
static void rsa_isr_initialise() { if (op_complete_sem == NULL) { op_complete_sem = xSemaphoreCreateBinary(); esp_intr_alloc(ETS_RSA_INTR_SOURCE, 0, rsa_complete_isr, NULL, NULL); } }

0xFEEDC0DE64 pushed a commit to 0xFEEDC0DE64/esp-idf that referenced this pull request May 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants