Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when accessing flash while using RMT and -Os compiler flag (IDFGH-7974) #9487

Closed
kriegste opened this issue Aug 2, 2022 · 11 comments
Closed
Assignees
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally

Comments

@kriegste
Copy link

kriegste commented Aug 2, 2022

  • Module or chip used: ESP32 rev0 on TTGO
  • IDF version: 4.4.1
  • Build System: Standard VSCode installation (Windows) as per the tutorial
  • Power Supply: USB

Problem Description

When compiled with -Os, using RMT and accessing the flash at the same time results in a crash (Cache disabled but cached memory region accessed).

The problem is in
https://github.com/espressif/esp-idf/blob/release/v4.4/components/hal/esp32/include/hal/rmt_ll.h
The functions there are not really "inline" when compiled with -Os. Thus they end up in flash unreachable during flash operations.

Solution: Use __attribute__((always_inline)) (I think you defined it FORCE_INLINE_ATTR).

I've seen in uart_ll.h the low level functions are inlined properly. But e.g. in i2c_ll.h they are not either! I haven't tried anything with I2C or looked into all the other ll files yet, but maybe other peripherals suffer from a similar bug. Whoever fixes rmt_ll.h should check the other components, too.

Code to reproduce this issue

#include <string.h>
#include "driver/rmt.h"
#include "nvs_flash.h"

void app_main(void) {
	rmt_config_t tx = {
		.channel = RMT_CHANNEL_0,
		.gpio_num = 20,
		.mem_block_num = 1,
		.clk_div = 40,
		.tx_config.idle_output_en = true,
		.rmt_mode = RMT_MODE_TX,
	};
	rmt_config(&tx);
	rmt_set_source_clk(tx.channel, RMT_BASECLK_APB);
	rmt_driver_install(tx.channel, 0, ESP_INTR_FLAG_IRAM);

	rmt_item32_t items[512] = {};
	for (int i = 0; i < 512; i++) {
		items[i].duration0 = 1;
		items[i].level0 = 0;
		items[i].duration1 = 1;
		items[i].level1 = 1;
	}
	rmt_write_items(RMT_CHANNEL_0, items, 512, false);

	while (true) {
		uint32_t crashme;
		esp_flash_read(NULL, (void *)&crashme, 0, sizeof(crashme));
	}
}

Debug Logs

Guru Meditation Error: Core  0 panic'ed (Cache disabled but cached memory region accessed).

Core  0 register dump:
PC      : 0x400e41b0  PS      : 0x00060034  A0      : 0x40082578  A1      : 0x3ffb0b60
0x400e41b0: rmt_ll_get_tx_err_interrupt_status at C:/Users/***/esp/esp-idf/components/hal/esp32/include/hal/rmt_ll.h:307

0x40082578: _xt_lowint1 at C:/Users/***/esp/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1111

A2      : 0x3ffb1298  A3      : 0x00000000  A4      : 0x00000080  A5      : 0x00000001
A6      : 0x00000001  A7      : 0x00000020  A8      : 0x80082d2a  A9      : 0x00000020
A10     : 0x3ff56000  A11     : 0x3ffb55f0  A12     : 0xffffffff  A13     : 0x00000000
A14     : 0x3ffb12a4  A15     : 0x00000000  SAR     : 0x00000008  EXCCAUSE: 0x00000007
EXCVADDR: 0x00000000  LBEG    : 0x40082079  LEND    : 0x40082081  LCOUNT  : 0x00000027
0x40082079: esp_timer_impl_get_counter_reg at C:/Users/***/esp/esp-idf/components/esp_timer/src/esp_timer_impl_lac.c:118 (discriminator 2)

0x40082081: esp_timer_impl_get_counter_reg at C:/Users/***/esp/esp-idf/components/esp_timer/src/esp_timer_impl_lac.c:128 (discriminator 1)



Backtrace:0x400e41ad:0x3ffb0b600x40082575:0x3ffb0ba0 0x4008a059:0x3ffb4eb0 0x400840ca:0x3ffb4f30 0x400d4245:0x3ffb4f70 0x400e4874:0x3ffb57c0
0x400e41ad: esp_efuse_utility_get_number_of_items at C:/Users/***/esp/esp-idf/components/efuse/src/esp_efuse_utility.c:241

0x40082575: _xt_lowint1 at C:/Users/***/esp/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1111

0x4008a059: spi_flash_chip_generic_detect_size at C:/Users/***/esp/esp-idf/components/spi_flash/spi_flash_chip_generic.c:119

0x400840ca: esp_flash_read at C:/Users/***/esp/esp-idf/components/spi_flash/esp_flash_api.c:832

0x400d4245: app_main at C:\Users\***\esp\rmt_inline_crash\build/../main/main.c:29 (discriminator 1)

0x400e4874: main_task at C:/Users/***/esp/esp-idf/components/freertos/port/port_common.c:129
@espressif-bot espressif-bot added the Status: Opened Issue is new label Aug 2, 2022
@github-actions github-actions bot changed the title Crash when accessing flash while using RMT and -Os compiler flag Crash when accessing flash while using RMT and -Os compiler flag (IDFGH-7974) Aug 2, 2022
@espressif-bot espressif-bot added Status: Selected for Development Issue is selected for development Status: In Progress Work is in progress and removed Status: Opened Issue is new Status: Selected for Development Issue is selected for development labels Aug 3, 2022
@suda-morris
Copy link
Collaborator

Thanks for reporting, we can reproduce the issue with your code. Will fix it ASAP.
BTW, calling RMT functions while the flash cache is disabled is not guaranteed to be safe, as these public APIs are not placed in IRAM.

@kriegste
Copy link
Author

kriegste commented Feb 4, 2023

How certain are you that the issue is really fixed?

In IDF 4.4.4 I still get a

Guru Meditation Error: Core 0 panic'ed (Cache disabled but cached memory region accessed).

0x40163211: rmt_ll_enable_tx_err_interrupt at esp-idf/components/hal/esp32/include/hal/rmt_ll.h:252
0x400825f1: shared_intr_isr at esp-idf/components/esp_hw_support/intr_alloc.c:407
0x40082ff5: _xt_lowint1 at esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1114
0x4000c050: ?? ??:0
0x40008544: ?? ??:0
0x40086d35: delay_us at esp-idf/components/spi_flash/spi_flash_os_func_app.c:148
0x4009b932: spi_flash_chip_generic_wait_idle at esp-idf/components/spi_flash/spi_flash_chip_generic.c:455
0x4009b841: spi_flash_chip_generic_page_program at esp-idf/components/spi_flash/spi_flash_chip_generic.c:253
0x4009b34e: spi_flash_chip_generic_write at esp-idf/components/spi_flash/spi_flash_chip_generic.c:278
0x40086c40: esp_flash_write at esp-idf/components/spi_flash/esp_flash_api.c:907
0x400f41d1: esp_partition_write at esp-idf/components/spi_flash/partition.c:466
0x400f306d: esp_ota_write at esp-idf/components/app_update/esp_ota_ops.c:240

@suda-morris
Copy link
Collaborator

rmt_ll_enable_tx_err_interrupt is called rmt_driver_install which is not expected to run in ISR. Another place can be rmt_set_err_intr_en Maybe you're calling rmt_set_err_intr_en in the ISR callback?

@kriegste
Copy link
Author

kriegste commented Feb 6, 2023

That is strange. I do not use rmt_set_err_intr_en at all. rmt_driver_install is only called at start and rmt_driver_uninstall is not used at all.

I suspect that -Os makes the dump unreliable and it's in fact another function like rmt_ll_get_tx_end_interrupt_status, rmt_ll_clear_tx_end_interrupt, etc. which are called from the isr and not properly inlined.

@kriegste
Copy link
Author

kriegste commented Feb 6, 2023

rmt_ll_enable_tx_err_interrupt and
rmt_ll_enable_rx_err_interrupt
are basically the same function. Could it be that the compiler combines them (uses only one of them)? -Os does crazy things. But also rmt_ll_enable_rx_err_interrupt is used only in rmt_set_err_intr_en, rmt_driver_uninstall and rmt_driver_install (via rmt_hal_rx_channel_reset)...

Anyway, if I inline rmt_ll_enable_tx_err_interrupt using __attribute__((always_inline)) the crash seems to go away...

@kriegste
Copy link
Author

kriegste commented Feb 6, 2023

...but a new crash appears:

0x401631e9: rmt_ll_enable_rx_end_interrupt at esp-idf/components/hal/esp32/include/hal/rmt_ll.h:246
0x40082ff5: _xt_lowint1 at esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1114
0x40008544: ?? ??:0
0x40086d35: delay_us at esp-idf/components/spi_flash/spi_flash_os_func_app.c:148
0x4009b932: spi_flash_chip_generic_wait_idle at esp-idf/components/spi_flash/spi_flash_chip_generic.c:455
0x4009b841: spi_flash_chip_generic_page_program at esp-idf/components/spi_flash/spi_flash_chip_generic.c:253
0x4009b34e: spi_flash_chip_generic_write at esp-idf/components/spi_flash/spi_flash_chip_generic.c:278
0x40086c40: esp_flash_write at esp-idf/components/spi_flash/esp_flash_api.c:907
0x400f41e1: esp_partition_write at esp-idf/components/spi_flash/partition.c:466
0x400f307d: esp_ota_write at esp-idf/components/app_update/esp_ota_ops.c:240

@kriegste
Copy link
Author

kriegste commented Feb 6, 2023

I did what I did back in August with version 4.4.1: I properly inlined all functions in
https://github.com/espressif/esp-idf/blob/release/v4.4/components/hal/esp32/include/hal/rmt_ll.h
using __attribute__((always_inline)).

Now I'm monitoring if the problem is gone again.

@kriegste
Copy link
Author

kriegste commented Feb 8, 2023

No problems so far after inlining.

What were the changes you made to the 4.4 branch?
Looking at the history of these files makes me think the fix might have somehow gotten lost.
https://github.com/espressif/esp-idf/commits/release/v4.4/components/hal/esp32/include/hal/rmt_ll.h
https://github.com/espressif/esp-idf/commits/release/v4.4/components/driver/rmt.c

@emmpe
Copy link

emmpe commented Feb 10, 2023

I'm encountering a similar problem (slightly different trace) in version 4.4.2 and 4.4.4. I'm calling rmt_driver_install with the ESP_INTR_FLAG_IRAM flag.
Using attribute((always_inline)) on rmt_ll_get_tx_err_interrupt_status seems to help me.
But when looking into the driver I noticed that it looks like the tx_buf is allocated in the wrong way in rmt_translator_init().

if (p_rmt_obj[channel]->intr_alloc_flags & ESP_INTR_FLAG_IRAM) {
p_rmt_obj[channel]->tx_buf = (rmt_item32_t *)malloc(block_size);
} else {
p_rmt_obj[channel]->tx_buf = (rmt_item32_t *)heap_caps_calloc(1, block_size, MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT);
}

should it not be the other way around?

if (!(p_rmt_obj[channel]->intr_alloc_flags & ESP_INTR_FLAG_IRAM)) {
p_rmt_obj[channel]->tx_buf = (rmt_item32_t *)malloc(block_size);
} else {
p_rmt_obj[channel]->tx_buf = (rmt_item32_t *)heap_caps_calloc(1, block_size, MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT);
}

I don't know if this solves any issues yet, but it does not look correct to me.

@kriegste
Copy link
Author

Please reopen.

@suda-morris suda-morris reopened this Feb 14, 2023
@espressif-bot espressif-bot added Status: In Progress Work is in progress and removed Resolution: Done Issue is done internally Status: Done Issue is done internally labels Feb 14, 2023
@suda-morris
Copy link
Collaborator

@emmpe Thanks for reporting, yeah, it's an issue, we will fix it ASAP and backport to release/4.4

@kriegste You're correct, the fix is not backported to 4.4 release yet (it should exist since 5.0). We will backport this fix to release/4.4 soon.

@espressif-bot espressif-bot added Status: Reviewing Issue is being reviewed Resolution: Done Issue is done internally Status: Done Issue is done internally and removed Status: In Progress Work is in progress Status: Reviewing Issue is being reviewed labels Feb 14, 2023
espressif-bot pushed a commit that referenced this issue Feb 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally
Projects
None yet
Development

No branches or pull requests

4 participants