Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sudden battery drain #59

Open
airmaxx opened this issue Nov 8, 2024 · 24 comments
Open

Sudden battery drain #59

airmaxx opened this issue Nov 8, 2024 · 24 comments

Comments

@airmaxx
Copy link

airmaxx commented Nov 8, 2024

I'm experiencing a sudden battery drain, after 1-2 weeks of running just fine. Any idea what could be the cause, or how I can debug it?

Screenshot_20241108-121454

@lanrat
Copy link
Owner

lanrat commented Nov 8, 2024

What does your battery voltage graph look like?

The battery % is calculated based on the voltage, and every battery will be a little different. I calibrated this for the batteries I had for testing, but yours may have a different charge/discharge curve.

@airmaxx
Copy link
Author

airmaxx commented Nov 18, 2024

Here's the voltage graph for the same time period.

Bildschirmfoto 2024-11-18 um 15 50 14

@elratt0r
Copy link
Contributor

elratt0r commented Nov 18, 2024

Could you also show the boots for the same timerange?

Assumptions:
a) the esp is booting more often than the assumed sleep interval
b) the battery you're using has a very "non-linear" discharge curve

About the battery: I am using a LiPo, 3.7V, 1200mAh (from Eckstein, ZB07002) and that gives the following discharge curve. You see a steep(er) drop at around 3.45V, that's more or less the cut-off voltage for my battery.

homeplate_bat_30days
homeplate_bat_detail

@airmaxx
Copy link
Author

airmaxx commented Nov 18, 2024

How can I see the boots? Are they published via MQTT to Home Assistant?

My battery is specified as Rated voltage: 3.7V, capacity: 3000mAh, 11.1Wh, it shouldn't start discharging so fast at around 4.0V.

I'll keep an eye on it, it's currently approaching 4V again (after about 10 days of use).

@elratt0r
Copy link
Contributor

Okay, with that specs it's really odd that it empties that rapid.

For comparsion: Mine usually lasts ~1 month, with sleep cycle set to 20 minutes and a weekday speed up between 6 and 9 o'clock to 5 min intervals.

Yes, it's published via MQTT under homeassistant/sensor/<hostname>/boot/state -> {"boot": <value>}

You should also be able to see it in HA:
homeplate_ha_diag

@klopyrev
Copy link

I want to report that I'm seeing the exact same issue. It runs fine for a while. At some point, several days or weeks later, the battery suddenly drains.

I have an identical Inkplate 10 in another room running an older version of the code, and I see no such problem.

image

@airmaxx
Copy link
Author

airmaxx commented Nov 20, 2024

Interesting that it happens on different levels. At the moment I'm waiting to see the next event. Did you check the temperature at the moment of the battery discharge? It was higher at the time of the battery discharge.
Bildschirmfoto 2024-11-20 um 08 21 27

@elratt0r
Copy link
Contributor

elratt0r commented Nov 20, 2024

@klopyrev Could you provide the commit hash the old version is running on?

@airmaxx Good catch! My assumption would be, that the ESP32 is not going to sleep (anymore)

Is any of you using the feature to modify the sleep times via MQTT?
(homeplate/homeplate/activity/run -> {"action": "hass", "refresh": "300"})

@klopyrev
Copy link

@elratt0r I checked the version date printed in the main.cpp file, and I think both my devices have been running the new version (I forgot I reflashed the older device). So, good chance it's not new code.

I just reflashed both devices again to make sure. I'll monitor to see what happens.

@hillbillypolenta
Copy link

hillbillypolenta commented Nov 20, 2024

Also experiencing this. Looking at a few cycles wild guess there may be an issue with the voltage calibration as a pretty typical LiPo discharge curve can be observed over the span of (in my use case) 1 month before it drops off a cliff at a reported 3.8-3.9 volts or so. After recovering it reported 3.2-ish volts.

Unfortunately I was not logging boot reason but I have turned all of those on now and will see if there's anything to report there (with reference to the configurable sleep duration feature which I am using)

Screenshot 2024-11-21 005137

Using the 3000mAh battery included with the Inkplate 10

@lanrat
Copy link
Owner

lanrat commented Nov 20, 2024

I know this may be difficult, but getting the device's serial logs while it tries to go into sleep would help. However this would need to be done while not providing any power over USB in order to not affect the battery readings. This could be done by using a USB cable that has the VCC line cut, or an external UART connected directly to the UART pins on the ESP32.

@klopyrev
Copy link

This is from using an external UART adapter connected to the ESP32 TX / RX pins.

I can leave this setup connected to my PC until the issue reoccurs.

13:19:33.884 > ets Jul 29 2019 12:21:46
13:19:33.885 >
13:19:33.885 > rst:0x1 (POWERON_RESET),boot:0x12 (SPI_FAST_FLASH_BOOT)
13:19:33.885 > configsip: 0, SPIWP:0xee
13:19:33.885 > clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
13:19:33.885 > mode:DIO, clock div:2
13:19:33.885 > load:0x3fff0030,len:1184
13:19:33.885 > load:0x40078000,len:13192
13:19:33.885 > load:0x40080400,len:3028
13:19:33.885 > entry 0x400805e4
13:19:34.889 >
13:19:34.889 >
13:19:34.889 > [SETUP] starting, version(Nov 20 2024, 08:47:13) boot: 0
13:19:34.931 > ESP Chip information:
13:19:34.931 > This is esp32 chip with 2 CPU core(s), WiFi/BT/BLE, silicon revision 3, 4MB external flash
13:19:34.931 > Minimum free heap size: 4415055 bytes
13:19:34.931 > Heap summary for capabilities 0x00000C06:
13:19:34.931 > Totals:
13:19:34.931 > free 0 allocated 0 min_free 0 largest_free_block 0
13:19:34.931 > [ACTIVITY] startActivity(1) put into queue
13:19:35.042 > [STATUS] Boot 1 normal
13:19:37.163 > [SETUP] Battery: 100% (4.24v)
13:19:37.741 > [SETUP] starting button task
13:19:37.741 > [INPUT] Input monitoring not supported by this platform
13:19:37.741 > [SETUP] starting time task
13:19:37.741 > [TIME] Internal Clock and RTC differ by -1732126773 seconds. local(3) RTC(1732126776)
13:19:37.741 > [TIME] local time (3) 19:00:03 Wednesday, December 31 1969
13:19:37.741 > [TIME] ERROR: RTC time is too far in past. RTC likely has wrong value!
13:19:37.741 > [SETUP] starting WiFi task
13:19:37.741 > [WIFI] Connecting...
13:19:37.741 > [SETUP] starting OTA task
13:19:37.741 > [SETUP] starting MQTT task
13:19:37.741 > [MQTT] starting MQTT
13:19:37.741 > [SETUP] starting sleep task
13:19:37.741 > [SETUP] starting activities task
13:19:37.741 > [ACTIVITY] starting activity: 1
13:19:37.741 > [SLEEP] delaying sleep for 15 seconds
13:19:37.852 > [WIFI] Connected to AP successfully!
13:19:37.853 > [WIFI] IP address: 192.168.4.38
13:19:37.868 > [STATUS] WiFi connected
13:19:37.932 > [WIFI] Connected: 192.168.4.38
13:19:37.964 > [TIME] Syncing RTC to NTP
13:19:37.993 > [MQTT] Connecting to MQTT...
13:19:37.993 > [OTA] OTA ready
13:19:37.993 > [SLEEP] delaying sleep for 20 seconds
13:19:37.993 > [STATUS] Downloading image...
13:19:38.025 > [MQTT] Connected to MQTT.
13:19:38.025 > [MQTT] Session present: 0
13:19:38.025 > [MQTT] Subscribing at QoS 2, packetId: 1
13:19:38.025 > [MQTT] Sending MQTT Config
13:19:38.025 > [MQTT] Subscribe acknowledged: packetId: 1 qos: 2
13:19:38.072 > [MQTT] Connected
13:19:39.931 > [IMAGE] Downloading image: http://192.168.4.3:5000/2/
13:19:39.931 > [ 5506][E][WiFiClient.cpp:313] setSocketOption(): fail on -1, errno: 9, "Bad file number"
13:19:39.931 > [TIME] Internal clock was adjusted by 1732126776 seconds
13:19:39.931 > [TIME] Internal RTC was adjusted by 1 seconds
13:19:39.931 > [TIME] NTP UNIX time Epoch(1732126779)
13:19:39.931 > [TIME] Timezone offset: (-18000) -5 hours
13:19:39.931 > [TIME] synced local UNIX time Epoch(1732126779) 13:19:39 Wednesday, November 20 2024
13:19:40.265 > [IMAGE] Download done
13:19:40.282 > [STATUS] Rendering image...
13:19:45.257 > [IMAGE] Image render ready
13:19:45.258 > [IMAGE] displaying....
13:19:47.283 > [IMAGE] displaying done.
13:19:47.740 > [MQTT] sending status update
13:19:47.741 > [MQTT] Sending MQTT State: [homeassistant/sensor/homeplate2/boot/state] {"boot":1,"activity_count":1,"boot_reason":"normal","sleep_duration":1200}
13:19:47.932 > [MQTT] Sending MQTT State: [homeassistant/sensor/homeplate2/wifi_signal/state] {"signal":-45}
13:19:47.948 > [MQTT] Sending MQTT State: [homeassistant/sensor/homeplate2/temperature/state] {"temperature":24}
13:19:47.981 > [MQTT] Sending MQTT State: [homeassistant/sensor/homeplate2/battery/state] {"voltage":4.2,"battery":100}
13:19:58.752 > [ACTIVITY] startActivity(0) put into queue
13:19:58.752 > [ACTIVITY] starting activity: 0
13:19:58.752 > [SLEEP] prepping for sleep
13:19:58.752 > [MQTT] Stopping and disconnecting...
13:19:58.752 > [MQTT] Disconnected from MQTT.
13:19:58.752 > [WIFI] Stopping and disconnecting...
13:19:58.752 > [WIFI] Disconnected from WiFi access point
13:19:58.752 > [WIFI] WiFi lost connection. Reason: 8
13:19:58.895 > [SLEEP] entering sleep for 1200 seconds (20 min)

@lanrat
Copy link
Owner

lanrat commented Nov 21, 2024

@klopyrev Thanks for the logs.

It looks like its reporting the battery is at 100% when you took these. is that is correct?

Additionally, do you know if it stayed in sleep mode for the 20 minutes? or did it wake early for anything?

@klopyrev
Copy link

Yeah, the device was fully charged at the beginning.

Here are the logs from a few hours: https://pastebin.com/fADMkRPj

@lanrat
Copy link
Owner

lanrat commented Nov 22, 2024

I don't see anything out or the ordinary in the logs. And your reported battery drain looks normal too.

@airmaxx
Copy link
Author

airmaxx commented Nov 27, 2024

It happend again. This time i logged "boot count", yet it does not show anything. Sleep duration is set in config.cpp - I do not use MQTT for it.

Bildschirmfoto 2024-11-27 um 09 02 47 Bildschirmfoto 2024-11-27 um 09 02 32 Bildschirmfoto 2024-11-27 um 09 02 07 Bildschirmfoto 2024-11-27 um 09 04 07

@lanrat
Copy link
Owner

lanrat commented Nov 27, 2024

@airmaxx , it may be unrelated to the battery drain issue, but it looks like your homeplate crashed towards the end of the boot count graph.

I'm thinking on how to go about debugging this.

@klopyrev
Copy link

I'm still running with my inkplate connected to my PC with the external UART adapter. I'll keep it running until the issue happens. It make take some time. 2 of the times the drain happened it was after about a week. 1 of the times it was after about a month.

@airmaxx
Copy link
Author

airmaxx commented Nov 27, 2024

it looks like your homeplate crashed towards the end of the

I am not 100% sure about this - I pressed the button to (re?)start Homeplate at that time because I thought it was already crashed - but it seems to have continued to run while losing a lot of battery. I have now restarted it with only 35% battery to see if the behaviour is related to battery level/voltage.

@lanrat
Copy link
Owner

lanrat commented Nov 27, 2024

From a fresh boot, the boot count counter starts at 0, and is incremented every time the homepalte wakes from sleep. It only resets to 0 when doing a fresh boot (not waking from sleep).

@elratt0r
Copy link
Contributor

elratt0r commented Nov 27, 2024

Bildschirmfoto 2024-11-27 um 09 02 47

Could you provide your sanitized config.h and also config.cpp?

Bildschirmfoto 2024-11-27 um 09 02 32

More confusing I find, that the sudden battery drain does not seem to correlate with the changed sleep interval,
neither with a raise in boot cycles.

@klopyrev
Copy link

My Inkplate which I had connected to the PC for the last several weeks finally reached the point where the battery suddenly drained. Unfortunately, the only thing that was saved in the logs was a very long list of null symbols, more than 20000 of them. I'm going to try again, but this time save everything into a file rather than using the history of the console.

@klopyrev
Copy link

klopyrev commented Jan 6, 2025

I got the logs captured this time: https://public.klopyrev.net/minicom.cap

@elratt0r
Copy link
Contributor

elratt0r commented Jan 6, 2025

I got the logs captured this time: https://public.klopyrev.net/minicom.cap

27939:[2025-01-04 15:34:20] CORRUPT HEAP: Bad head at 0x3ffdf970. Expected 0xabba1234 got 0x3ffdab48
* First crash, but recovered
31908:[2025-01-05 11:10:29] CORRUPT HEAP: Bad head at 0x3ffdf970. Expected 0xabba1234 got 0x3ffdab48
* Second crash, but recovered
35118:[2025-01-06 03:03:07] [  2779][E][WiFiClient.cpp:313] setSocketOption(): fail on -1, errno: 9, "Bad file number"
* Next crash, probably while downloading the png
35243:[2025-01-06 03:23:19] [  2817][E][WiFiClient.cpp:313] setSocketOption(): fail on -1, errno: 9, "Bad file number"
* From here on it seems to be missing WiFi or a deadlock in the starting activity:
35351:[2025-01-06 03:43:31] [ACTIVITY] starting activity: 1
35352:[2025-01-06 03:43:31] [SLEEP] delaying sleep for 15 seconds
35353:[2025-01-06 03:43:31] ets Jul 29 2019 12:21:46
35354:[2025-01-06 03:43:31] 
35355:[2025-01-06 03:43:31] rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
...
35590:[2025-01-06 03:43:46] [SETUP] Battery: 16% (3.53v)
...
35603:[2025-01-06 03:43:46] [SETUP] starting activities task
35604:[2025-01-06 03:43:46] [ACTIVITY] starting activity: 1
* what the hell did the CPU did in this 90mins???
35605:[2025-01-06 05:11:43] ets Jul 29 2019 12:21:46
...
35626:[2025-01-06 05:11:45] [SETUP] Battery: 0% (3.31v)

Well, odd for sure. The crashes when downloading the PNG I also encountered with mine, usually when screenshotter and plate collide in file access (rare but happened).

Thank you very much for your effort! I'll see if I can at least track down the heap problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants