Flashing yellow and red pixels after some period of working hours #519

Open
feitzi opened this issue Mar 26, 2024 · 30 comments
Labels
bug Something isn't working

Comments

@feitzi

feitzi commented Mar 26, 2024

Bug report

Describe the bug

My Ulanzi with Awtrix works fine for a few hours with HA over MQTT. After 3-6 hours the Wi-Fi disconnects and two pixels on the left side (one yellow and one red) start flashing.
After a restart (only possible via the hardware buttons) everything works fine for the next few hours.

Wireless strength is really good and the Awtrix has a fixed IP address on my DHCP Server.

Additional information

  • Devices involved:
    • Model: Ulanzi Awtrix Smart Pixel Clock 2882 (TC001)
    • awtrix3 version: 0.96

To Reproduce

Running the Awtrix for some hours.

Screenshots

20240326_195652.jpg

@feitzi feitzi added the bug Something isn't working label Mar 26, 2024
@feitzi feitzi changed the title [BUG] Flashing yellow and red pixels after some period of working hours Mar 26, 2024
@haterakathegrinch1

Can confirm this. I thought it was a hardware issue, but since my error shows exactly the same pixel and the same behaviour, it seems more like a software issue. Sadly, in terms of bug tracking, the issue is not reproducible at the moment.

@Blueforcer
Owner

That's not a software issue.
Red pixel means no Wi-Fi connection, Yellow pixel means no MQTT connection.

@feitzi
Author

feitzi commented Apr 1, 2024

If it's not a software problem, what else could it be? Everything works perfectly after a restart. And mqtt via home assistant works seamlessly with other devices. The same applies to wifi.
Another point I've noticed in the last few days: when the error occurs, sometimes the screen seems to freeze for a few minutes.

To me it seems to be a software error. I will reflash my Awtrix.
Is there any way to retrieve logs and publish them here?

@Blueforcer
Owner

Blueforcer commented Apr 1, 2024

MQTT tries to reconnect, which isn't asynchronous, and Awtrix freezes during that time.

You can set debug_mode in dev.json and listen with a serial terminal.

If you think that's a software bug, then please give me instructions on how to reproduce this error. You must have configured something special, otherwise there would be more such reports with over 5000 users.

Another possibility would of course be that the ESP is damaged.
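
For anyone who wants to try the debug_mode suggestion above, here is a minimal sketch that generates the file content. It assumes debug_mode is a boolean key in dev.json; where the file has to be placed on the device and which serial settings to use are not stated in this thread, so check the AWTRIX 3 documentation for both.

```python
import json

# Assumption: dev.json accepts a boolean "debug_mode" key, as the comment above suggests.
dev_settings = {"debug_mode": True}

# Serialize to the text that would be saved as dev.json.
dev_json_text = json.dumps(dev_settings, indent=2)
print(dev_json_text)
```

Reading the debug output afterwards needs a serial terminal on the USB port, which is outside the scope of this sketch.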

@luebbe
Contributor

luebbe commented Apr 3, 2024

I have seen the yellow pixel on Awtrix too on a few rare occasions.
Is it possible that your router/repeater sometimes switches the Wi-Fi channel? I have the impression that some ESP devices have problems reconnecting in that case, depending on the firmware (not Awtrix AFAICT). At least some of my ESP8266 devices behave this way, and I have implemented a watchdog to reboot them if reconnecting takes too long. But I'm using an async MQTT library (asyncmqtt in older projects, espmqtt in newer projects).
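
The watchdog idea above can be sketched roughly as follows. This is not luebbe's firmware code (which is not shown in this thread), only the decision logic written in Python; the class name, the 60 second timeout, and the injectable clock are assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class ReconnectWatchdog:
    """Signals that a reboot is due once the link has been down for timeout_s seconds."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.down_since: Optional[float] = None  # None while the link is up

    def update(self, connected: bool) -> bool:
        """Call periodically with the current link state; returns True when a reboot is due."""
        now = self.clock()
        if connected:
            self.down_since = None  # link is healthy, reset the timer
            return False
        if self.down_since is None:
            self.down_since = now  # first time we notice the outage
        return now - self.down_since >= self.timeout_s


# Demo with a fake clock so the example runs instantly.
fake_now = [0.0]
watchdog = ReconnectWatchdog(timeout_s=60, clock=lambda: fake_now[0])
results = []
for t, connected in [(0, True), (10, False), (40, False), (75, False)]:
    fake_now[0] = t
    results.append(watchdog.update(connected))
print(results)  # [False, False, False, True]
```

On a real device the True result would trigger a restart; here it is only reported, so the logic can be checked without hardware.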

@Blueforcer
Owner

There are also problems with ESP32 together with Unifi routers.

@vitaha83

vitaha83 commented Apr 3, 2024

I have the same problem!

@vitaha83

vitaha83 commented Apr 3, 2024

How do I get back to the awtrix3 firmware version 0.95? Perhaps there will be no problems with this firmware!

@Blueforcer
Owner

How do I get back to the awtrix3 firmware version 0.95? Perhaps there will be no problems with this firmware!

https://github.com/Blueforcer/awtrix3/releases

but there was no change to wifi in 0.96

@Ysbrand

Ysbrand commented Apr 5, 2024

Can I join? I see the same issue: after a few hours, sometimes a day, I lose the connection to Awtrix. This issue has existed for a longer time and is, in my opinion, not related to the most recent Awtrix FW versions.

I have Omada APs but I'm using UniFi switches and routers (not sure if it is related, but I see a mention of ESP32 issues with UniFi).

@vitaha83

vitaha83 commented Apr 5, 2024

Perhaps the hang is caused by RAM? The breaks are where the connection was lost and the clock rebooted!
Please watch your "Free ram" parameter in Home Assistant
2024-04-05_23-09-42

@feitzi
Author

feitzi commented Apr 7, 2024

I also monitored the ram usage and I confirm that it seems to be a memory leak problem.
Here is the free ram consumption:
Screenshot_20240407_134746_Home Assistant.jpg

At 13:44 the yellow led starts to flash.
It's definitely an issue with the software and not with the hardware.

@vitaha83

vitaha83 commented Apr 7, 2024

How free RAM depends on the uptime of the clock:
If the value of "Free ram" drops below 40,000 B, the clock stops working!
2024-04-07_18-34-57
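
To put a number on a trend like the one in these graphs, one option is to fit a straight line to the "Free ram" history and estimate when it will cross the 40,000 B mark reported above. The sketch below is a plain least-squares fit; the sample values are made up for illustration and are not taken from anyone's screenshots, and the 40,000 B threshold is an observation from this thread, not an official limit.

```python
from typing import List, Optional, Tuple

LOW_RAM_BYTES = 40_000  # threshold observed in this thread, not an official limit


def leak_rate(samples: List[Tuple[float, float]]) -> float:
    """Least-squares slope of free RAM (bytes) over time (seconds); negative means RAM is shrinking."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    num = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one timestamp")
    return num / den


def seconds_until_low(samples: List[Tuple[float, float]],
                      threshold: float = LOW_RAM_BYTES) -> Optional[float]:
    """Extrapolate from the last sample; returns None if free RAM is not shrinking."""
    slope = leak_rate(samples)
    if slope >= 0:
        return None
    _, last_ram = samples[-1]
    return max(0.0, (last_ram - threshold) / -slope)


# Made-up history: (seconds since boot, free RAM in bytes).
history = [(0, 130_000), (3600, 110_000), (7200, 90_000)]
remaining = seconds_until_low(history)
print(f"leak rate: {leak_rate(history):.2f} B/s, about {remaining / 3600:.1f} h until 40,000 B")
```

A steady negative slope across several hours points at a leak, while a flat line with occasional dips looks more like the healthy devices described elsewhere in this thread.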

@luebbe
Contributor

luebbe commented Apr 7, 2024

Interesting find. I'll take a look at uptime vs ram usage as well.

@Blueforcer
Owner

Make sure not to trigger HA discovery entities very often. The library used has a memory leak. It's better to work with raw MQTT or HTTP API commands.

@Ysbrand

Ysbrand commented Apr 8, 2024

Hi,

I'm not doing any automatic discoveries in HA, everything is manual (MQTT wise). Is there any chance that we can reboot the AWTRIX3 automatically when running out of resources (and obviously a mechanism that prevents more than 1 automatic reboot every .. hours)?
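
Until something like this exists in the firmware, the same idea could be run externally, for example from a small script next to Home Assistant. The sketch below only covers the decision logic: reboot when free RAM is low, but never more than once per interval. The names, the 40,000 B threshold, and the 6 hour interval are assumptions; the reboot action is passed in as a callable, so whether it uses the device's HTTP or MQTT reboot command is left open (check the AWTRIX 3 API documentation for the exact endpoint or topic).

```python
import time
from typing import Callable, Optional


class RebootGuard:
    """Triggers reboot() when free RAM is low, at most once per min_interval_s."""

    def __init__(self, low_ram_bytes: int, min_interval_s: float,
                 reboot: Callable[[], None],
                 clock: Callable[[], float] = time.monotonic):
        self.low_ram_bytes = low_ram_bytes
        self.min_interval_s = min_interval_s
        self.reboot = reboot
        self.clock = clock
        self.last_reboot: Optional[float] = None  # time of the last reboot we triggered

    def check(self, free_ram: int) -> bool:
        """Feed the latest free RAM reading; returns True if a reboot was triggered."""
        if free_ram >= self.low_ram_bytes:
            return False
        now = self.clock()
        if self.last_reboot is not None and now - self.last_reboot < self.min_interval_s:
            return False  # too soon after the previous reboot, hold off
        self.last_reboot = now
        self.reboot()
        return True


# Demo with a fake clock and a stand-in reboot action that just records the time.
fake_now = [0.0]
reboots = []
guard = RebootGuard(low_ram_bytes=40_000, min_interval_s=6 * 3600,
                    reboot=lambda: reboots.append(fake_now[0]),
                    clock=lambda: fake_now[0])
outcomes = []
for t, ram in [(0, 120_000), (3600, 35_000), (7200, 30_000), (30_000, 38_000)]:
    fake_now[0] = t
    outcomes.append(guard.check(ram))
print(outcomes, reboots)  # [False, True, False, True] [3600, 30000]
```

The rate limit matters because a device that is low on RAM right after booting would otherwise end up in a reboot loop.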

@luebbe
Contributor

luebbe commented Apr 8, 2024

I've been running Awtrix 0.96 for a few weeks and haven't had any problems so far.

I checked my HA logs for the Awtrix free ram and apart from one peak, where it goes down to 60K about a week ago, it is consistently between 120-130K free ram.

Are you running custom applications? Maybe turn them off for a day and check if the problem goes away?
Other things that come to mind:

  • Gif animations?
  • Sounds?

@vitaha83

vitaha83 commented Apr 9, 2024

Yes, after disabling the automations in Home Assistant, the clock stopped losing RAM, so it should run for a long time without restarts ...
But that's not interesting!
I bought the clock specifically for the AWTRIX 3 firmware, to work with Home Assistant!
Will the library be improved with the operation of RAM in the firmware?
photo_2024-04-09_12-56-17

@luebbe
Contributor

luebbe commented Apr 9, 2024

This is the free ram of my Awtrix 0.96 over the course of a week:

grafik

Apart from the few occasional peaks, it's pretty stable between 120K and 130K free ram.

I have three automations running on HA that continuously push data to Awtrix. (Outside temp, Solar power and Octoprint status).
They all have Text, maybe a progress bar and static icons, no gifs.

How many automations are you running in HA? If you enable them one after the other and check if there is one specific automation that causes the memory leak, we could try to investigate.

Otherwise your question:

Will the library be improved with the operation of RAM in the firmware?

is just straining the capabilities of our crystal ball... ;-)

@Blueforcer
Owner

I don't think there will be any update from the creator of the lib. The last update was 2 years ago.
The goal is to completely remove the HA discovery shit and make an Awtrix integration in HA (HACS). But I don't have enough skills for that.

@vitaha83

vitaha83 commented Apr 11, 2024

Hello, friends!
I managed to get the clock running stably by editing my automations in Home Assistant:
2024-04-11_18-05-40

  1. I added these parameters to every automation:
data:
  qos: 0
  retain: false
  2. To work with the indicators, I changed the automation:
    2024-04-11_18-06-04

it was:

- service: light.turn_on
  entity_id: light.awtrix_..._indicator_1

changed:

- service: mqtt.publish
  data:
    qos: 0
    retain: false
    topic: awtrix_.../indicator1
    payload: >-
      {"color":[255,0,0]}

it was:

- service: light.turn_off
  entity_id: light.awtrix_..._indicator_1

changed:

- service: mqtt.publish
  data:
    qos: 0
    retain: false
    topic: awtrix_.../indicator1
    payload: >-
      {"color":[0,0,0]}

Thanks Lübbe Onken! ;)
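
For anyone publishing from a script instead of a Home Assistant automation, the same messages can be built like this. It is a minimal sketch that mirrors the topic and payload shape from the YAML above; the prefix awtrix_demo is a placeholder for your own device prefix, the check for indicators 1 to 3 is an assumption about the device, and actually sending the message needs an MQTT client, which is not shown.

```python
import json
from typing import Dict, Tuple


def indicator_message(prefix: str, index: int, rgb: Tuple[int, int, int]) -> Dict[str, object]:
    """Build a raw MQTT message for an indicator, using qos 0 and retain false as in the post above."""
    if index not in (1, 2, 3):  # assumption: the device exposes indicators 1 to 3
        raise ValueError("indicator index must be 1, 2 or 3")
    if any(not 0 <= c <= 255 for c in rgb):
        raise ValueError("colour components must be in 0..255")
    return {
        "topic": f"{prefix}/indicator{index}",
        "payload": json.dumps({"color": list(rgb)}, separators=(",", ":")),
        "qos": 0,
        "retain": False,
    }


# "awtrix_demo" is a hypothetical prefix; replace it with your device's MQTT prefix.
turn_on = indicator_message("awtrix_demo", 1, (255, 0, 0))
turn_off = indicator_message("awtrix_demo", 1, (0, 0, 0))
print(turn_on["topic"], turn_on["payload"])    # awtrix_demo/indicator1 {"color":[255,0,0]}
print(turn_off["topic"], turn_off["payload"])  # awtrix_demo/indicator1 {"color":[0,0,0]}
```

This bypasses the light entities created by HA discovery entirely, which is the part of the setup the owner identified as leaking.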

@luebbe
Contributor

luebbe commented Apr 12, 2024

@vitaha83 that is a very interesting solution. I wonder how the change to qos:0, retain:false can have such a big impact on Awtrix, since Awtrix is just a consumer of the message.

This sounds a bit like a problem in the MQTT library used by Awtrix.
QoS 0 means "fire and forget" (from the point of view of Home Assistant). QoS 1 means that the receiver has to acknowledge every message it receives with a PUBACK.
Is it possible that Awtrix builds up a stack of PUBACK messages that it never gets rid of when qos > 0?

I assume that retain: false/true doesn't affect the memory consumption on awtrix.

@Ysbrand

Ysbrand commented Apr 15, 2024

Changing the automations solved my issue as well.
Apparently the light.turn_off and light.turn_on services do things differently, while calling the mqtt.publish service is handled better by the Awtrix logic.

@luebbe
Contributor

luebbe commented Apr 17, 2024

I don't think there will be any update from the creator of the lib. The last update was 2 years ago. The goal is to completely remove the HA discovery shit and make an Awtrix integration in HA (HACS). But I don't have enough skills for that.

I don't know how to implement a HA integration for Awtrix, but I took a look at the HA autodiscovery library that you are using, found it too big and clumsy and rolled my own. If you want, I can take a look at the HA autodiscovery.
Maybe we should also take a look at the way mqtt is handled in Awtrix, since this thread indicates that memory is leaking.

@Blueforcer
Owner

MQTT itself doesn't have a memory leak; it only leaks when you access the entities from HA discovery. That's already tested.

As far as I can see, the library builds the MQTT payload string at runtime; maybe the data is never released.

We have some experienced users on Discord. I'm pretty sure that together we can build an integration which just uses MQTT or, even better, HTTP. This would have the advantage that we can free Awtrix from some code and have more free RAM.

@DrRSatzteil

I do experience similar problems though I only get the blinking yellow indicator (lost mqtt connection I guess). I can still see the device from my router but I cannot ping the device anymore.

I think that it did not start right after the upgrade to 0.96 but after I started using the lifetime property of apps. Could this be related? Might just be coincidence of course... I'm not using Home Assistant and no HA Autodiscovery.

@michapixel

I think it's a power-related problem that appears when you power the ESP and the LEDs from the onboard USB. I also experienced strange hangs, the red and yellow LEDs, extremely low Wi-Fi signal, reconnect errors, and so on.

But since I was redesigning my case anyway, I added an extra USB plug, as shown here:
hardware_basis

https://pixelit-project.github.io/hardware.html#wiring-guide

And boom: WiFi good, no system hangs, no quirky leds etc.
But rebooting always reboots into the "system-menu" :) but i can live with that for now.

@Blueforcer
Owner

But rebooting always reboots into the "system-menu" :) but i can live with that for now.

Then you have connected your middle button wrong. Needs to be active low.

@michapixel

how would i do that?

@heimchemiker

I had the same problem. I had one custom app with a "lifetime" of one hour (3600s). I have removed this lifetime and switched to deleting the app through my home automation after the hour, and so far this seems to have solved the problem. (The "lifetime" didn't work anyway in my case, see Issue 335)
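
The workaround above can be sketched like this: the home automation side remembers when each custom app was created and, once an app is older than the limit, sends a delete message for it. The sketch assumes that custom apps live under a custom/<name> topic below the device prefix and that an empty payload removes the app; both should be checked against the AWTRIX 3 custom app documentation. The prefix and app names are placeholders.

```python
from typing import Dict, List, Tuple


def custom_app_topic(prefix: str, app_name: str) -> str:
    # Assumption: custom apps are addressed as <prefix>/custom/<app name>.
    return f"{prefix}/custom/{app_name}"


def expiry_messages(created_at: Dict[str, float], now: float,
                    max_age_s: float, prefix: str) -> List[Tuple[str, str]]:
    """Return (topic, payload) pairs that delete every app at least max_age_s old.

    Assumption: publishing an empty payload to the app's topic removes the app.
    """
    return [(custom_app_topic(prefix, name), "")
            for name, created in sorted(created_at.items())
            if now - created >= max_age_s]


# Hypothetical apps with their creation times in seconds.
apps = {"laundry": 0.0, "doorbell": 3000.0}
to_delete = expiry_messages(apps, now=3600.0, max_age_s=3600.0, prefix="awtrix_demo")
print(to_delete)  # [('awtrix_demo/custom/laundry', '')]
```

Doing the expiry outside the device means no per-app timers have to be kept alive in the firmware.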
