elasticsearch container quits during stack build, but can be started again in docker #202

wkhchow · 2022-05-10T21:33:21Z

When building a new stack with the latest opendrr-api branch (update_psra_apr2022), the Elasticsearch container quits for some reason while the database is still building. You can still start it again in Docker before the build reaches the stage that needs it, and the build process then continues.

kibana-opendrr_1 | 2022-05-10T20:55:22.837488770Z {"type":"log","@timestamp":"2022-05-10T20:55:22+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to ConnectionError: connect ECONNREFUSED 172.18.0.2:9200 error"}


anthonyfok · 2022-05-10T21:57:49Z

The following log (saved by @wkhchow using docker compose logs -t) may hopefully offer some clues:

opendrr_may_10_2002.log

anthonyfok · 2022-05-10T22:51:33Z

Will, Drew and I brainstormed a little on Slack earlier today, and here are some initial observations/thoughts/ideas:

Drew thinks it could be a resource issue, e.g. the Elasticsearch container ran out of memory.
Anthony agrees; the size of our dataset is growing after all, and it seems Elasticsearch itself is taking up quite a bit of RAM.
Will wonders why it is the Elasticsearch container and not the Python script (in the python-env container) that ran out of memory (and exited with signal 9).
There is no way to shut down Elasticsearch (a Java server) remotely from another container, not since Elasticsearch 2.x (see https://stackoverflow.com/questions/17191539/how-to-stop-shut-down-an-elasticsearch-node), so it seems unlikely that the (random?) shutdown is caused by any errors in our bash, Python or PostgreSQL scripts.
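One way to test the out-of-memory theory might be to ask Docker what it recorded for the stopped container. The sketch below is only an illustration: describe_exit_code and report_container_exit are hypothetical helper names, and the container name has to be supplied by the caller (docker ps -a lists the actual names). It relies on the convention that an exit code above 128 means the process was terminated by signal (code - 128), so 137 corresponds to signal 9.

```shell
#!/bin/bash
# Sketch only. describe_exit_code and report_container_exit are hypothetical helper names.

# Translate a container exit code into a human-readable cause.
# Codes above 128 conventionally mean "terminated by signal (code - 128)".
describe_exit_code() {
  local code="$1"
  if (( code == 0 )); then
    echo "clean exit"
  elif (( code > 128 )); then
    echo "terminated by signal $(( code - 128 ))"
  else
    echo "exited with error code ${code}"
  fi
}

# Print the exit code and OOMKilled flag that Docker recorded for one container.
# The container name is an assumption supplied by the caller; find it with `docker ps -a`.
report_container_exit() {
  local name="$1" code oom
  code=$(docker inspect --format '{{.State.ExitCode}}' "${name}") || return 1
  oom=$(docker inspect --format '{{.State.OOMKilled}}' "${name}") || return 1
  echo "${name}: $(describe_exit_code "${code}"), OOMKilled=${oom}"
}
```

Calling report_container_exit with the name of the stopped Elasticsearch container would print one summary line. OOMKilled=true would support the memory theory; if it is false, the host kernel log (dmesg) may still be worth checking.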

Perhaps something like the following crude CPU/RAM monitoring script can be extended to record Docker container status etc. to give us more clues?

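#!/bin/bash
# Crude CPU/RAM monitor: every 15 seconds, log load, memory and the given user's processes.

# EC2 instance type as reported in the kernel log (empty when not on Amazon EC2)
instance_type=$(sudo dmesg -t | grep 'Amazon EC2' | cut -f4 -d' ' | sed -e 's/\/.*//')
user="$1"   # user whose processes are logged
logfile="${HOME}/logs/log_$(date +"%Y-%m-%d_%H-%M-%S")_${instance_type}_${user}_cpu-ram-process.log"
mkdir -p "$(dirname "${logfile}")"   # tee fails if the logs directory is missing

while true; do
  ( date; uptime; free -h; ps auxwww | grep "^${user}" ; echo) | tee -a "${logfile}"
  sleep 15
done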

Docker Compose logs do not seem to record when a container stopped abnormally. Maybe that start/stop information is shown on the running console? Perhaps script ("make typescript of terminal session", from bsdutils) or a similar utility could be used to capture those start/stop messages?
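As a possible way to record container status alongside the CPU/RAM figures, the sketch below appends one timestamped snapshot of every container (running or stopped) to a log file. It assumes the docker CLI is on PATH; log_container_status is a hypothetical name, not something from the opendrr-api repository.

```shell
#!/bin/bash
# Sketch only. log_container_status is a hypothetical helper name.

# Append one timestamped snapshot of all containers to the given log file.
log_container_status() {
  local logfile="$1"
  {
    date
    if command -v docker >/dev/null 2>&1; then
      # -a includes stopped containers; their Status field reads like "Exited (137) 5 minutes ago".
      docker ps -a --format '{{.Names}}: {{.Status}} ({{.Image}})'
    else
      echo "docker CLI not found; skipping container snapshot"
    fi
    echo
  } >> "${logfile}" 2>&1
}
```

Calling log_container_status "${logfile}" inside a 15-second monitoring loop would show, to within one interval, when a container moved from "Up" to "Exited". Running docker events --filter 'event=die' in a separate terminal might be another option for capturing the stop as it happens.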

This is a temporary workaround to ensure the elasticsearch-opendrr service stays up even if it failed or got stopped for whatever abnormal reasons. See #202 for details.

wkhchow added the Bug (Something isn't working) label May 10, 2022

wkhchow assigned anthonyfok, drotheram and wkhchow May 10, 2022
