elasticsearch container quits during stack build, but can be started again in docker #202

Open
wkhchow opened this issue May 10, 2022 · 2 comments
Labels
Bug Something isn't working

Comments

Contributor

wkhchow commented May 10, 2022

When building a new stack with the latest opendrr-api branch (update_psra_apr2022), the Elasticsearch container quits for some reason while the database is still building. You can still start it again in Docker before the build reaches the stage that uses it, and the build process then continues.

kibana-opendrr_1 | 2022-05-10T20:55:22.837488770Z {"type":"log","@timestamp":"2022-05-10T20:55:22+00:00","tags":["warning","plugins","licensing"],"pid":8,"message":"License information could not be obtained from Elasticsearch due to ConnectionError: connect ECONNREFUSED 172.18.0.2:9200 error"}
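
To tell a container that has exited apart from one that is merely slow to accept connections, something like the following could be run on the Docker host. This is only a sketch: the container name `elasticsearch-opendrr` is an assumption taken from the service name (the name Docker Compose generates will likely differ), and the helper names `refused_endpoint` and `restart_if_stopped` are made up for illustration.

```shell
#!/bin/bash
# Hypothetical container name; check `docker ps -a` for the real one.
ES_CONTAINER="${ES_CONTAINER:-elasticsearch-opendrr}"

# Read log lines on stdin and print the host:port of each refused connection,
# as in the Kibana "connect ECONNREFUSED ..." warning.
refused_endpoint() {
  grep -o 'ECONNREFUSED [0-9.]*:[0-9]*' | cut -d' ' -f2
}

# Start the container again if Docker reports that it is not running.
restart_if_stopped() {
  local running
  running=$(docker inspect --format '{{.State.Running}}' "$1") || return 1
  if [ "$running" != "true" ]; then
    echo "$(date): $1 is not running, starting it again"
    docker start "$1"
  fi
}

# Example (not run automatically):
#   docker compose logs -t | refused_endpoint | sort | uniq -c
#   restart_if_stopped "$ES_CONTAINER"
```

This automates the manual restart described above; it does not explain why the container stopped.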

wkhchow added the Bug (Something isn't working) label May 10, 2022
@anthonyfok
Member

The following log (saved by @wkhchow using docker compose logs -t) may offer some clues:

Member

anthonyfok commented May 10, 2022

Will, Drew and I brainstormed a little on Slack earlier today, and here are some initial observations/thoughts/ideas:

  • Drew thinks it could be a resource issue, e.g. the Elasticsearch container ran out of memory.
  • Anthony agrees; our dataset keeps growing, after all, and Elasticsearch itself seems to take up quite a bit of RAM.
  • Will wonders why it is the Elasticsearch container, and not the Python script (in the python-env container), that ran out of memory (and exited with signal 9).
  • There has been no way to shut down Elasticsearch (a Java server) remotely from another container since Elasticsearch 2.x (see https://stackoverflow.com/questions/17191539/how-to-stop-shut-down-an-elasticsearch-node), so it seems unlikely that the (random?) shutdown is caused by any errors in our bash, Python or PostgreSQL scripts.
  • Perhaps something like the following crude CPU/RAM monitoring script can be extended to record Docker container status etc. to give us more clues?
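    #!/bin/bash
    # Crude CPU/RAM monitor: every 15 seconds, log the load, the memory usage
    # and the processes of one user. Usage: <this script> USER
    
    # EC2 instance type as parsed from the kernel log (empty if no match).
    instance_type=$(sudo dmesg -t | grep 'Amazon EC2' | cut -f4 -d' ' | sed -e 's/\/.*//')
    user="${1:?usage: $0 USER}"
    mkdir -p "${HOME}/logs"
    logfile="${HOME}/logs/log_$(date +"%Y-%m-%d_%H-%M-%S")_${instance_type}_${user}_cpu-ram-process.log"
    
    while true; do
      ( date; uptime; free -h; ps auxwww | grep "^${user}" ; echo ) | tee -a "${logfile}"
      sleep 15
    done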
  • Docker Compose logs do not seem to record when a container stopped abnormally. Maybe that start/stop information is shown on the running console? Perhaps script ("make typescript of terminal session", from bsdutils) or a similar utility can be used to capture those start/stop messages?
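
One way the monitoring idea might be extended to Docker, as a sketch rather than a tested solution: `docker inspect` can report each container's state, including whether Docker saw it killed by the out-of-memory killer, and `docker events` can record start/stop/die/oom events without capturing the console. The function names are hypothetical. An exit code of 137 corresponds to 128 + 9, i.e. signal 9 (SIGKILL).

```shell
#!/bin/bash
# Describe a container exit code. By shell convention, codes above 128
# mean "terminated by signal (code - 128)", so 137 means signal 9.
describe_exit_code() {
  local code="$1"
  if [ "$code" -gt 128 ] 2>/dev/null; then
    echo "killed by signal $((code - 128))"
  else
    echo "exited with status ${code}"
  fi
}

# Print one status line per container, including stopped ones.
log_container_states() {
  local id
  for id in $(docker ps -aq); do
    docker inspect --format \
      '{{.Name}} status={{.State.Status}} oom_killed={{.State.OOMKilled}} exit_code={{.State.ExitCode}} finished_at={{.State.FinishedAt}}' \
      "$id"
  done
}

# Stream container lifecycle events as JSON, one per line (blocks until interrupted).
log_container_events() {
  docker events --filter type=container \
    --filter event=start --filter event=stop \
    --filter event=die --filter event=oom \
    --format '{{json .}}'
}

# Example (not run automatically):
#   log_container_events >> "${HOME}/logs/docker-events.log" &
#   while true; do ( date; log_container_states; echo ) | tee -a "${HOME}/logs/docker-states.log"; sleep 15; done
```

If the Elasticsearch container shows `oom_killed=true` or an exit code of 137 after it stops, that would support the out-of-memory theory; any other exit code would point elsewhere.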

anthonyfok added a commit that referenced this issue May 11, 2022
This is a temporary workaround to ensure the elasticsearch-opendrr service
stays up even if it failed or got stopped for whatever abnormal reasons.

See #202 for details.
wkhchow pushed a commit that referenced this issue May 12, 2022
This is a temporary workaround to ensure the elasticsearch-opendrr service
stays up even if it failed or got stopped for whatever abnormal reasons.

See #202 for details.
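
The diff of that commit is not shown in this thread, so the following is only a guess at the kind of workaround meant: a Docker restart policy. In a Compose file this would be the `restart:` key on the service; for a container that already exists it can be set from the shell with `docker update`. The function name and the container name in the example are hypothetical.

```shell
#!/bin/bash
# Apply a Docker restart policy to an existing container.
# Valid policies: no, always, unless-stopped, on-failure[:max-retries].
set_restart_policy() {
  local container="$1" policy="$2"
  case "$policy" in
    no|always|unless-stopped|on-failure|on-failure:[0-9]*) ;;
    *) echo "unknown restart policy: ${policy}" >&2; return 2 ;;
  esac
  docker update --restart "$policy" "$container"
}

# Example with a hypothetical container name (not run automatically):
#   set_restart_policy elasticsearch-opendrr unless-stopped
```

A restart policy only brings the container back after it stops; it hides the symptom and does not address whatever causes Elasticsearch to exit.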