CAN bus goes to Bus Off after ECU startup. (IDFGH-13672) #14548

Closed
HDLA-BG opened this issue Sep 11, 2024 · 16 comments
Labels
Resolution: NA Issue resolution is unavailable Status: Done Issue is done internally

Comments

@HDLA-BG

HDLA-BG commented Sep 11, 2024

IDF version.
esp-idf-v5.0.6

Espressif SoC revision.
ESP32S3

Operating System used.
Windows

How did you build your project?
Command line with idf.py

If you are using Windows, please specify command line type.
PowerShell

Development Kit.
Custom Board

Power Supply used.
External power supply

What is the expected behavior?
After an ECU reset, CAN BusOFF should not be triggered.

What is the actual behavior?
After an ECU reset, CAN goes into the BusOFF state, which causes all other devices to try to recover.
It also stops running simulations.

Steps to reproduce.

  1. Run CAN simulation which sends a lot of CAN frames (~40% bus load).
  2. Restart the device
  3. The simulation goes into the BusOFF state.

CAN configuration in the project:

static const twai_timing_config_t t_config = TWAI_TIMING_CONFIG_250KBITS();
static twai_filter_config_t f_config = TWAI_FILTER_CONFIG_ACCEPT_ALL();
static const twai_general_config_t g_config = {
    .mode = TWAI_MODE_NORMAL,
    .tx_io = 37,
    .rx_io = 38,
    .clkout_io = TWAI_IO_UNUSED,
    .bus_off_io = TWAI_IO_UNUSED,
    .tx_queue_len = 40,
    .rx_queue_len = 40,
    .alerts_enabled = TWAI_ALERT_NONE,
    .clkout_divider = 0,
    .intr_flags = ESP_INTR_FLAG_LEVEL2,
};

We found that the issue doesn't exist in IDF v5.0.4 and first occurs in v5.0.5.
We also identified what causes the problem, in twai_configure_gpio():

    //Set TX pin
    gpio_conf.mode = GPIO_MODE_OUTPUT | (io_loop_back ? GPIO_MODE_INPUT : 0);
    gpio_conf.pin_bit_mask = 1ULL << tx;
    gpio_config(&gpio_conf);
    esp_rom_gpio_connect_out_signal(tx, twai_controller_periph_signals.controllers[controller_id].tx_sig, false, false);

The relevant line is gpio_conf.mode = GPIO_MODE_OUTPUT | (io_loop_back ? GPIO_MODE_INPUT : 0);

If I change it like this:

    //Set TX pin
    gpio_conf.mode = GPIO_MODE_DISABLE | (io_loop_back ? GPIO_MODE_INPUT : 0);
    gpio_conf.pin_bit_mask = 1ULL << tx;
    gpio_config(&gpio_conf);
    esp_rom_gpio_connect_out_signal(tx, twai_controller_periph_signals.controllers[controller_id].tx_sig, false, false);

If gpio_conf.mode is set to GPIO_MODE_DISABLE, the issue disappears. We still see some error frames due to unacknowledged frames, but the CAN doesn't go to BusOFF.

I don't understand why this change fixes the issue.
Can you give me a hint about what could cause it, and maybe a workaround, since this code exists even on the master branch?

@espressif-bot espressif-bot added the Status: Opened Issue is new label Sep 11, 2024
@github-actions github-actions bot changed the title CAN bus goes to Bus Off after ECU startup. CAN bus goes to Bus Off after ECU startup. (IDFGH-13672) Sep 11, 2024
@diplfranzhoepfinger
Contributor

@HDLA-BG
Author

HDLA-BG commented Sep 11, 2024

Yes, we have BusOFF recovery.
But the problem is that CAN goes into BusOFF, the CAN simulation in CANoe stops, and because of our device
the other devices have to perform BusOFF recovery.

@diplfranzhoepfinger
Contributor

2. Restart the device

How do you restart?

By a RESET, or by a software-triggered restart, like this?

    printf("Restarting now.\n");
    fflush(stdout);
    esp_restart();

@diplfranzhoepfinger
Contributor

Or do you switch the power off and on,
so that the transceiver also loses power?

@HDLA-BG
Author

HDLA-BG commented Sep 11, 2024

No, I reset the device using the reset button on the ESP-Prog.
Also, the issue doesn't exist on 5.0.4 :(

@diplfranzhoepfinger
Contributor

No I reset the device with using reset button on esp-prog.

Also the issue doesn't exist on 5.0.4 :(

That is very important information; it suggests this really is a SW issue.

@HDLA-BG
Author

HDLA-BG commented Sep 11, 2024

I just tried with the twai_network_master example,
and the result is the same:
[image]

With the change in twai.c, the CAN doesn't go to BusOFF.

@wanckl
Collaborator

wanckl commented Sep 13, 2024

@HDLA-BG Can you clarify the issue?

  • Who is using the ESP32-S3?
  • What is the ECU and what is the simulator? Are they using the ESP32-S3?
  • A bus doesn't go to bus-off; devices on the bus can go to bus-off.

So do you mean that an ESP32-S3 restart makes other devices go bus-off, or that other devices restarting make the ESP32-S3 go bus-off?

If it is the first one, I know the reason. If it is the second one, please check the other devices.

@HDLA-BG
Author

HDLA-BG commented Sep 13, 2024

Hello @wanckl

Thank you for your help.

  • Our device uses the ESP32-S3. The device is connected to a CAN network to which multiple other devices are connected (they don't use an ESP32).
  • We most often test our software with PC simulations such as CanEasy or Vector CANoe.
  • Sorry if I didn't explain it clearly enough. When the ESP32-S3 starts (due to an intended or unintended reset), other devices in the network go to BusOFF and recover from it almost immediately. But CAN simulations normally don't have automatic BusOFF recovery enabled, so they go into the BusOFF state, and the simulation has to be restarted to continue testing.
    Btw. this is how we found the issue.

Unfortunately, some devices have a BusOFF DTC (diagnostic trouble code), so connecting our device to the network might lead to DTCs being stored on other devices.

The schematic is here; maybe it can help:
[image]

Please let me know if you need any additional information.

@diplfranzhoepfinger
Contributor

@HDLA-BG Can you clarify the issue?

  • Who is using the ESP32-S3?

Our ECU.

  • What is the ECU and what is the simulator? Are they using the ESP32-S3?

The ECU uses the ESP32-S3 and the simulator uses Vector tools. The other ECUs can be anything: an ESP32 or some other CPU.

  • A bus doesn't go to bus-off; devices on the bus can go to bus-off.

Yes!

So do you mean that an ESP32-S3 restart makes other devices go bus-off?

True!

Or that other devices restarting make the ESP32-S3 go bus-off?

No.
We have bus-off recovery, so that would not be an issue in the first place.

If it is the first one, I know the reason.

Good. Please tell us, or tell us where to send beer in exchange.

If it is the second one, please check the other devices.

@diplfranzhoepfinger
Contributor

Please let me know if you need any additional information.

Yes, everything is explained well now.
I think you can understand it.

@diplfranzhoepfinger
Contributor

@Dazza0 maybe for you

@Dazza0
Contributor

Dazza0 commented Sep 14, 2024

@HDLA-BG @diplfranzhoepfinger

I think what's likely happening here is that when the ESP32-S3 resets, the TX pin is driven LOW momentarily. This results in the bus being driven to the DOMINANT state, which causes error states in the other devices on the bus. An easy way to verify this is to use a logic analyzer or oscilloscope and observe the TX pin of the ESP32-S3 when a reset occurs.
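
For context on why a forced DOMINANT level can push other transmitters all the way to bus-off, here is a minimal host-side sketch of the CAN fault-confinement counters. It assumes the standard rules (the transmit error counter, TEC, goes up by 8 per transmit error; a node is error passive from 128 and bus-off from 256) and ignores the decrement on successful transmissions. The names `node_state_from_tec` and `tx_errors_until_bus_off` are made up for this illustration and are not part of ESP-IDF.

```c
/* Thresholds and step size from the CAN fault-confinement rules (assumed standard values). */
#define TEC_STEP_ON_TX_ERROR 8    /* TEC increases by 8 for each transmit error */
#define TEC_ERROR_PASSIVE    128  /* TEC >= 128: node is error passive */
#define TEC_BUS_OFF          256  /* TEC >= 256: node is bus-off */

typedef enum {
    NODE_ERROR_ACTIVE,
    NODE_ERROR_PASSIVE,
    NODE_BUS_OFF
} node_state_t;

/* Map a transmit error counter value to the node's fault-confinement state. */
static node_state_t node_state_from_tec(unsigned tec)
{
    if (tec >= TEC_BUS_OFF) {
        return NODE_BUS_OFF;
    }
    if (tec >= TEC_ERROR_PASSIVE) {
        return NODE_ERROR_PASSIVE;
    }
    return NODE_ERROR_ACTIVE;
}

/* Count the consecutive transmit errors a node starting at TEC = 0 needs to reach bus-off. */
static unsigned tx_errors_until_bus_off(void)
{
    unsigned tec = 0;
    unsigned errors = 0;
    while (node_state_from_tec(tec) != NODE_BUS_OFF) {
        tec += TEC_STEP_ON_TX_ERROR;
        errors++;
    }
    return errors;
}
```

Under these assumptions, `tx_errors_until_bus_off()` returns 32, so a node that is actively transmitting (as in a simulation at ~40% bus load) needs only a short burst of consecutive transmit errors to end up in bus-off.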

@wanckl Please verify this is the case and run a git bisect

@wanckl
Collaborator

wanckl commented Sep 18, 2024

@HDLA-BG @diplfranzhoepfinger

Yes, @Dazza0's description is the reason; I only discovered this recently.

Will fix ASAP.

@HDLA-BG
Author

HDLA-BG commented Sep 18, 2024

@wanckl that is great.
Thanks for your help.

I am wondering, how are you going to fix it?

@wanckl
Collaborator

wanckl commented Sep 30, 2024

@HDLA-BG
Hi, your GPIO_MODE_DISABLE change is almost right. You can try replacing the function in twai.c with the following:

#include "esp_private/gpio.h"
#include "soc/io_mux_reg.h"
static void twai_configure_gpio(int controller_id, gpio_num_t tx, gpio_num_t rx, gpio_num_t clkout, gpio_num_t bus_status)
{
    // assert the GPIO number is not a negative number (shift operation on a negative number is undefined)
    assert(tx >= 0 && rx >= 0);

    //Set RX pin
    gpio_func_sel(rx, PIN_FUNC_GPIO);
    gpio_set_pull_mode(rx, GPIO_FLOATING);
    gpio_input_enable(rx);
    esp_rom_gpio_connect_in_signal(rx, twai_controller_periph_signals.controllers[controller_id].rx_sig, false);

    //Set TX pin
    gpio_func_sel(tx, PIN_FUNC_GPIO);
    gpio_set_pull_mode(tx, GPIO_FLOATING);
    esp_rom_gpio_connect_out_signal(tx, twai_controller_periph_signals.controllers[controller_id].tx_sig, false, false);

    //Configure output clock pin (Optional)
    if (clkout >= 0 && clkout < GPIO_NUM_MAX) {
        gpio_set_pull_mode(clkout, GPIO_FLOATING);
        gpio_func_sel(clkout, PIN_FUNC_GPIO);
        esp_rom_gpio_connect_out_signal(clkout, twai_controller_periph_signals.controllers[controller_id].clk_out_sig, false, false);
    }

    //Configure bus status pin (Optional)
    if (bus_status >= 0 && bus_status < GPIO_NUM_MAX) {
        gpio_set_pull_mode(bus_status, GPIO_FLOATING);
        gpio_func_sel(bus_status, PIN_FUNC_GPIO);
        esp_rom_gpio_connect_out_signal(bus_status, twai_controller_periph_signals.controllers[controller_id].bus_off_sig, false, false);
    }
}
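
One possible reading of why the order of operations matters, sketched as a toy host-side model rather than driver code. It assumes that the GPIO output register holds 0 after reset, that an undriven TX line is pulled HIGH (for example by the transceiver's TXD input), and that routing the TWAI TX signal also enables the output. All names here (`tx_pin_model_t`, `pin_level`, `sequence_drives_dominant`) are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

typedef enum { SRC_GPIO_OUT_REG, SRC_TWAI_TX } pin_source_t;

typedef struct {
    bool output_enabled;   /* pad output driver on or off */
    pin_source_t source;   /* which signal is routed to the pad */
    bool gpio_out_reg;     /* GPIO output register level (assumed 0 after reset) */
    bool twai_tx_idle;     /* idle TWAI TX level: HIGH (recessive) */
    bool pulled_high;      /* assumed: undriven line is pulled HIGH externally */
} tx_pin_model_t;

/* Level seen on the TX line for the current model state (true = HIGH). */
static bool pin_level(const tx_pin_model_t *p)
{
    if (!p->output_enabled) {
        return p->pulled_high;
    }
    return (p->source == SRC_TWAI_TX) ? p->twai_tx_idle : p->gpio_out_reg;
}

/* Walk through the setup steps; return true if the line was LOW (dominant) at any step. */
static bool sequence_drives_dominant(bool enable_output_before_routing)
{
    tx_pin_model_t p = { false, SRC_GPIO_OUT_REG, false, true, true };
    bool saw_low = !pin_level(&p);

    if (enable_output_before_routing) {
        /* Modelled on gpio_config() with GPIO_MODE_OUTPUT: the output is
         * enabled while the pad still follows the GPIO output register. */
        p.output_enabled = true;
        saw_low = saw_low || !pin_level(&p);
    }

    /* Modelled on esp_rom_gpio_connect_out_signal(): the pad now follows TWAI TX. */
    p.source = SRC_TWAI_TX;
    p.output_enabled = true;
    saw_low = saw_low || !pin_level(&p);

    return saw_low;
}
```

In this model, `sequence_drives_dominant(true)` reports a LOW phase while `sequence_drives_dominant(false)` does not. That matches the observation that not enabling the output before the TWAI signal is routed avoids the glitch, but checking the real pin with a logic analyzer remains the reliable test.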

@espressif-bot espressif-bot added Status: In Progress Work is in progress and removed Status: Opened Issue is new labels Oct 9, 2024
@espressif-bot espressif-bot added Status: Reviewing Issue is being reviewed Status: Done Issue is done internally Resolution: NA Issue resolution is unavailable and removed Status: In Progress Work is in progress Status: Reviewing Issue is being reviewed labels Oct 28, 2024