zh:ember:uart:ash: Received ERROR from adapter, with code=ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT #23761
Comments
It has happened today twice :c I see some of
|
You are getting different errors here, so, at first glance, it would point to the adapter being unstable. Did you look into removing possible sources of interference for the USB? (I see you are on a Pi, so, powered USB hub, extension cable, etc...) Note: September 1st release should also help on some edge-cases interference issues (2.4GHz ones). |
Thank you very much @Nerivec! I will make sure that power supply isn't the problem - I need to find somewhere a 5V 3A power supply, currently I'm using strong phone charger. The dongle is connected directly to the RPi. Also I'll be waiting for the September 1st release to check it will help somehow :D Thank you so much for your reply @Nerivec. PS PPS |
Interference can result in various errors that don't actually have any meaning because they are randomly triggered. Once this is improved/fixed, then we can see what errors remain, if any. The Pi is known for causing USB troubles. Make sure to use a USB extension cord to connect the adapter to it, so you can place it farther away from the Pi. A powered USB hub (a USB hub with its own power supply, so it doesn't draw on the Pi) also fixed several problems for other users in the past. As for your router, if you haven't already, make sure your 2.4GHz WiFi and your ZigBee use very different channels. Usually, channel 20 or 25 for ZigBee is your best bet. You can also check channel usage around you with ember-zli.
|
Now I'm using the v1.40.0-1 of zigbee2mqtt and the problem has happened again... :/. In a period of 2 hours (before the fatal error occurred) I got five times And these were the last errors:
Before I try to test the USB cable extension / powered USB hub, perhaps I should reflash my adapter with firmware with a different baud rate? Currently I'm using the 230400 version, should I use the 115200 version? What do you think @Nerivec? The rest of the log ("duplicates" removed)
|
I don't think the baudrate would create this particular issue, but 115200 is definitely more tested than any other. |
@Nerivec could you please tell me what these errors mean? Maybe I can find out how to fix this too, haha
After these errors zigbee2mqtt stopped working. Now I am using a better power supply with a better cable (I still don't have the real, strong 5V 3A one, but it worked for so long with the old type :c really strange), but the baud is still The rest of the log
PS |
/** The direction flag in the frame control field was incorrect. */
ERROR_WRONG_DIRECTION = 0x32,
/**
* The truncated flag in the frame control field was set, indicating there was not enough memory available to
* complete the response or that the response would have exceeded the maximum EZSP frame length.
*/
ERROR_TRUNCATED = 0x33,
/**
* The overflow flag in the frame control field was set, indicating one or more callbacks occurred since the previous
* response and there was not enough memory available to report them to the Host.
*/
ERROR_OVERFLOW = 0x34,
From https://github.com/Koenkk/zigbee-herdsman/blob/master/src/adapter/ember/enums.ts#L755 But as mentioned before, with the varying errors, it looks like interference, which may result in errors that would not necessarily mean anything (except something interfered 😛).
Try to temporarily remove that device from the network if you can, see if it works better after that. That device creates some strange behaviors for sure. |
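For anyone trying to read these codes out of a log, here is a minimal sketch of a lookup built only from the three values quoted above. The names `ErrorCodeExcerpt` and `describeErrorCode` are hypothetical and not part of zigbee-herdsman; the real enum in enums.ts has many more members.

```typescript
// Only the three members quoted in this thread; the real enum is much larger.
enum ErrorCodeExcerpt {
    ERROR_WRONG_DIRECTION = 0x32,
    ERROR_TRUNCATED = 0x33,
    ERROR_OVERFLOW = 0x34,
}

// Hypothetical helper: turn a raw numeric code into a readable label.
function describeErrorCode(code: number): string {
    // Numeric enums have a reverse mapping (value -> name) at runtime.
    const name: string | undefined = ErrorCodeExcerpt[code];
    const hex = "0x" + code.toString(16);
    return name === undefined ? "UNKNOWN (" + hex + ")" : name + " (" + hex + ")";
}

console.log(describeErrorCode(0x33));
```

Codes outside the excerpt come back as UNKNOWN, so the helper is only as complete as the list it was given.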
So after a week of testing I have unplugged the ZY-M100-S_2 and everything seems to work pretty good! I still have the problem that I have to disconnect and reconnect the adapter after rebooting the Raspberry Pi, but I was able to switch to the previously used power supply two days ago and everything still works fine! In the next two weeks I plan to temporarily run my HA with Z2M on an old laptop (with a better processor than RPI, and with 4GB of RAM instead of the 1GB it currently has). And then I will try again to use the occupancy sensor.
|
Quick note, not sure what step you are on, but make sure to update to 7.4.4 too, it seems to have fixed several problems for a few users. |
I was getting |
I get the following error every time I restart or start z2m. zh:ember:uart:ash: Received ERROR from adapter while connecting, with code=ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT. If I reboot the coordinators before starting the z2m instance, there is no error. Running latest firmware, latest z2m on 3 separate instances. There are 0 interference issues; all 3 installs are SLZB06M. |
Stumbling upon the same issue all the time and troubleshooting. When this error happens the only way to put network back online is to restart Home Assistant, restarting Z2M just gives me same error instantly.
Fortunately I have two locations with exactly the same zigbee configuration, and after recent changes on one of them the error happens once every 1-2 days. The other location has been stable for 4 days already. So maybe some external interference is in play. Will report about new developments in my networks. EDIT: ok, one mistake about the sameness of the two setups. In the "stable" location I had SLZB06M coordinator firmware from the .dev branch and in the unstable one it was v2.3.6. Upgraded to latest v2.6.8.dev16 and will see. |
For me, this was solved by changing my WiFi channel. I was having lots of ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT errors, which were causing Zigbee2MQTT to restart. I could reproduce it almost always by recreating the map. I found I had the Ember coordinator on Zigbee channel 25 and my 2.4GHz WiFi on channel 11. These channels overlap according to the diagram above. Moving my WiFi to channel 4 mostly solved the problem: errors and restarts are significantly less frequent. However, what I'm unclear on, @Nerivec, is why a timeout should require Zigbee2MQTT to restart? There are people on HA's forum having problems too: https://community.home-assistant.io/t/error-messages-trying-to-ota-update-devices-using-z2m-on-a-ha-yellow/788913/2 |
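To put numbers on the channel change described above, here is a rough sketch that computes how far apart a Zigbee channel and a 2.4GHz WiFi channel are, using the nominal channel plans (Zigbee channels 11-26 at 2405 + 5 × (n − 11) MHz, WiFi channels 1-13 at 2407 + 5 × n MHz). The function names are made up for illustration. It only compares center frequencies; real WiFi transmissions are roughly 20 MHz wide and spill beyond that, so treat the result as a guide, not a guarantee.

```typescript
// Center frequency (MHz) of a 2.4GHz Zigbee (IEEE 802.15.4) channel, 11..26.
function zigbeeCenterMHz(channel: number): number {
    if (Math.floor(channel) !== channel || channel < 11 || channel > 26) {
        throw new RangeError("Zigbee channel must be an integer from 11 to 26");
    }
    return 2405 + 5 * (channel - 11);
}

// Center frequency (MHz) of a 2.4GHz WiFi channel, 1..13.
function wifiCenterMHz(channel: number): number {
    if (Math.floor(channel) !== channel || channel < 1 || channel > 13) {
        throw new RangeError("WiFi channel must be an integer from 1 to 13");
    }
    return 2407 + 5 * channel;
}

// Distance between the two center frequencies, in MHz.
function separationMHz(zigbeeChannel: number, wifiChannel: number): number {
    return Math.abs(zigbeeCenterMHz(zigbeeChannel) - wifiCenterMHz(wifiChannel));
}

console.log(separationMHz(25, 11)); // Zigbee 25 vs WiFi 11
console.log(separationMHz(25, 4));  // Zigbee 25 vs WiFi 4
```

With these formulas, Zigbee 25 sits 13 MHz from the center of WiFi 11 but 48 MHz from the center of WiFi 4, which is consistent with the improvement reported after the move.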
Whenever an ERROR frame is received from the adapter, it means the adapter has entered a "failed state", after which, the only recourse is to restart the stack. In a lot of cases, it's only about keeping proper synchronization between the adapter and the application. As for the error itself, most of the reports here look like interference indeed (you may not realize how bad it can get in some situations, and not necessarily from your environment, but neighbors too; the SLZB06M also seems particularly susceptible), but the adapter should handle this better nonetheless. I suspect a firmware issue when the radio comes under stress, but we'll have to wait for silabs on that one... |
@Nerivec so a restart of just Z2M should be enough to bring the network back online? Why this issue is so annoying for me is that Home Assistant restarts the Z2M plugin, but it just keeps crashing in a loop with the same error, which exhausts the watchdog restart counter, and the network fails permanently. Manual restarts of Z2M by me, a few hours later when I realize the network is down, still lead to the error. The only way is to restart Home Assistant as a whole. This is the strange part for me. |
@szwacz And that happens with SLZB06M (TCP-based)? If it were with a USB device, I'd say something goes wrong with the USB on HA, but I assume the TCP adapter is not connected directly to the HA machine? |
@Nerivec you are correct, I use it via ethernet cable. |
It is strange that it requires a full HA restart to fix it though. Since you upgraded your network, I assume you have the old adapter laying around somewhere? Can you use it with Ember ZLI to scan for networks + channels, to see what the environment is like around you for reference (preferably closer to the new adapter)? https://github.com/Nerivec/ember-zli/wiki/Stack#scan-network Can you also take a look at the CPU usage on your machine around the time these errors happen? |
@ortofan what core firmware version do you have on the SLZB06M? Versions around 2.5.6 are known to have a bug that slows down the ESP side. That is reflected in the Zigbee side with timeouts. |
v2.6.8.dev16 and Zigbee: FWIW I have 5-8 NOUS A1Z Smart plugs on all my networks (I run 3 z2m instances all with the same coordinator and firmware). This is an invaluable plug for me because it has a very unique feature: overvoltage protection. I frequently have over 260V AC mains on the phase of the house, as a large number of solar installations are present in my hood. INSANE, I know, but the power company won't fix it for a while. Till then NOUS and a few UPS save my important gear, as the plug turns off instantly when over 258V. |
@Nerivec Thank you for that explanation. Indeed I have NOUS A1Z in my network. So this issue should be separate from the one here. Thankfully a failure of the network just happened, so I had the chance to observe it better. The hunch about CPU usage is an interesting one, because I use a Raspberry Pi 3B+. First attaching logs of when the failure happened. I discovered it after half an hour, and just clicked "start" in the Z2M addon options again. This is memory and CPU usage recorded during the events from the log file above. After restarting the whole HA everything went back to normal. Indeed maybe I should try upgrading my Raspberry.... |
This happens to me as well. Once or twice a day my Zigbee network crashes. As far as I can see it often happens when it's under "high" load (as in a lot of devices are asked to do things at the same time). I can also reproduce the issue by trying to push an update to one of my Hue lamps. It will fail because of the crash. Adapter: Sonoff ZBDongle-E Logs
|
@Nerivec I also did channel scan as you asked (man this is nice tool! :) )
My network is on channel 26 (moved there few weeks ago out of desperation). Didn't detect different zigbee networks than mine. I also ordered more powerful raspberry and will try run HA with more RAM. |
I'm getting this error daily now, since switching from a slae.sh stick to the 06M. I'm using channel 25 and my own WiFi networks are on 1 and 6. Some neighbors are on 11, but it is not particularly crowded here and these houses are well isolated. Unfortunately I'm at a loss on what I can do to improve this. I'm running the latest firmware on the 06M as well. |
Ok, I can report major improvements. Upgraded from Raspberry Pi 3B+ 1GB RAM to Raspberry Pi 4B 4GB RAM. On old setup (pi3): New setup (pi4): This is night and day. I don't know if processing power or RAM was the issue, but since I had two identical installations and both improved this way after the upgrade, I can pretty safely say that if you use Home Assistant with Zigbee2MQTT, aim for a Pi 4 with at least 2 gigs of RAM. The |
@szwacz this is very good to know. I have the same RPi Model 3B+ 1GB hardware and ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT errors every couple of minutes, leading to Zigbee2MQTT restarts. It's also running openHAB 4.2, which has become unresponsive for sometimes a minute at a time since installing Z2M and connecting a handful of very talkative routers. Like you, I figured I needed to upgrade. I considered an RPi 4B or 5B, but for just over half the money I've bought an off-lease HP EliteDesk 800 G3 Mini with 16GB RAM and 256 NVMe. Based on your investigation, I think my hunch that this will at least improve the timeouts might pay off! |
With this in mind I increased the resources available to my z2m VM; it sits in a low-resource Alpine container. I increased it to 2 cores and 1GB RAM, however it still crashed about a day later, just like before. What I've done now is increase the ACK timeout delay to 2000, disable OTA for some devices (these were a bit spammy) and make some changes to the TCP keepalive options in my VM. |
Just a sidenote, but from the HA spec sheet, they seem to recommend a Pi 4 or above. That would be for Home Assistant alone; once you add add-ons and integrations, you increase the requirements a little every time (not so little for anything AI related). Z2M alone should use around 100MB of RAM (likely ~70 in most cases), and a small amount of CPU. It's not quite clear yet what could be essentially resulting in serial interference when the resources become low, but we've had enough reports to confirm it appears to be (HA). Some work is being done to reduce the requirements of Z2M a bit further, but they are already pretty low, so in most cases, the other "resource takers" should be looked at more closely (or the hardware specs indeed). @Kuchiru can you expand on the exact changes you applied to ACK & TCP? (If you have some before/after logs, that would be great too!) Also, are you running the original core firmware on the 06M, or an esphome firmware? |
Hi @Nerivec, I'm running 2.6.8 dev21 for both logs with the latest dev firmware available for zigbee, 202411xx I believe. Changes in sysctl for tcp keepalive:
I've attached the logs as well. For the before log I had debug enabled in the hopes of catching something extra; unfortunately I did not. |
That setting does not exist (i.e. this does nothing). There is a rather large spike in "stale neighbor" around midnight in your after log (https://nerivec.github.io/z2m-ember-helper/). |
Good to know, I’ll remove it!
I've looked through the log but did not find any mention of NEIGHBOR_STALE. Nothing is set specifically to change after midnight though, so I'm not sure what is causing this. Did you see any specific devices that were stale? |
Yes. I read those guidelines, but I already had a Pi 3 lying around so I just gave it a try. The thing is that everything was working fine, except that once in a while a cascading crash due to the error discussed in this thread happened. Since all troubleshooting materials about zigbee always focus on WiFi or other signal interference and firmware updates, this is what I was troubleshooting for the first few weeks. Nothing was telling me that a performance bottleneck might be the culprit. |
@Kuchiru |
Yeah it bothers me too. Household activity obviously comes to a halt around midnight, but all routers remain powered, so I'm not sure what's happening here. Z2M did crash every morning shortly after we wake up; however, since the TCP changes and disabling OTA requests for some devices it has not crashed and has remained up for about 3 days now. I've had ChatGPT analyze my logs to find any offenders that are spamming the network and reconfigured some reporting as well for good measure. Edit: z2m started crashing again. I've created a Home Assistant automation to restart my z2m instance in Proxmox when it disconnects. This works around the issue but is sub-optimal to say the least. |
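For those scripting a restart workaround like the one above, here is a minimal sketch of spacing out restart attempts with exponential backoff, so a crash loop does not burn through all retries within a few seconds. It is not how Home Assistant's watchdog or Zigbee2MQTT behave; `restartDelayMs`, `restartSchedule` and the 5 s / 5 min values are assumptions chosen purely for illustration.

```typescript
// Hypothetical helper: delay before restart attempt number `attempt` (0-based).
// Doubles each time, starting at baseMs, and never exceeds capMs.
function restartDelayMs(attempt: number, baseMs: number = 5000, capMs: number = 300000): number {
    const exponent = Math.max(0, Math.floor(attempt));
    return Math.min(capMs, baseMs * Math.pow(2, exponent));
}

// Hypothetical helper: the list of delays for the first `attempts` restarts.
function restartSchedule(attempts: number): number[] {
    const delays: number[] = [];
    for (let i = 0; i < attempts; i++) {
        delays.push(restartDelayMs(i));
    }
    return delays;
}

console.log(restartSchedule(4));
```

A wrapper script would wait for the returned delay before each restart and reset the attempt counter once the instance has stayed connected for a while.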
What happened?
My ZigBee network just stopped working. And I got this error message:
zh:ember:ezsp: Found no buffer in queue but ASH layer sent signal that one was available.
EDIT:
It seems that this is the main error message
zh:ember:uart:ash: Received ERROR from adapter, with code=ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT.
It is also worth mentioning that when I restart Home Assistant, Z2M somehow causes HA to reboot again after the reboot, but this time Z2M is not running. I had to switch autostart of Z2M to false, because otherwise I end up in a reboot loop. To fix all this I must:
Unplug the adapter > reboot Home Assistant > start Zigbee2MQTT (it will fail, ok, there is no adapter plugged in) > start Zigbee2MQTT again > wait some time (2 to 4 s) > plug in the adapter.
From now on everything is working perfectly!
What did you expect to happen?
I expected that everything should work smoothly as always 😁
How to reproduce it (minimal and precise)
I don't know how to reproduce it. It has happened a second time.
Zigbee2MQTT version
1.39.1
Adapter firmware version
7.4.3
Adapter
Sonoff ZBDongle-E with ember
Setup
HA on a Raspberry Pi 3 (rpi3-64)
Debug log
Expand to see