This means that if we accidentally start a pod with LOAD_DATA=yes, we
should have about an hour to re-run it with LOAD_DATA=no before the
database gets deleted.
Closes #842.
We currently wipe the Solr database as the first step in the init pod:
translator-devops/helm/name-lookup/templates/scripts-config-map.yaml
Line 15 in 3333b46
Yesterday we ended up with a weird state in ITRB Prod where the Solr pod wasn't restarting because of a configuration issue. Once we got it restarted, it started in download mode -- so the first thing it did was wipe the Solr database!
Given the weird state, it's not clear to me that we could have restarted in LOAD_DATA=no mode and avoided wiping the database. But if we move this deletion line later in the script, that at least becomes an option, so hopefully we can catch a mistake like this before any data is lost.
I think the right move would be to download this file to a separate directory. This could be a separate mount point, which we can hopefully configure to be InitContainer-only so that we don't need to lock up 400G for the Solr pod. We can then wipe the main database only after the download is complete.
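To make the proposed ordering concrete, here is a minimal sketch in Python. It is not the actual init script (that lives in the config map referenced above and is not reproduced here). The function name `refresh_solr_data`, its parameters, and the `download` callback are all hypothetical; only the `LOAD_DATA` yes/no switch comes from this issue.

```python
import shutil
from pathlib import Path
from typing import Callable


def refresh_solr_data(
    data_dir: Path,
    staging_dir: Path,
    download: Callable[[Path], None],
    load_data: str,
) -> bool:
    """Replace the contents of data_dir with a fresh download.

    Hypothetical sketch: the existing data is wiped only after the
    download into staging_dir has finished without raising.
    Returns True if the data was replaced, False if it was left alone.
    """
    # LOAD_DATA=no: leave the existing database untouched.
    if load_data.lower() != "yes":
        return False

    # Start from an empty staging directory (possibly a separate mount).
    shutil.rmtree(staging_dir, ignore_errors=True)
    staging_dir.mkdir(parents=True)

    # Slow step. If this raises, or the pod is restarted with
    # LOAD_DATA=no in the meantime, data_dir has not been modified.
    download(staging_dir)

    # Only now delete the old data. Clear the directory's contents rather
    # than the directory itself, since it may be a mount point.
    data_dir.mkdir(parents=True, exist_ok=True)
    for entry in data_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()

    # Move the downloaded files into place.
    for entry in staging_dir.iterdir():
        shutil.move(str(entry), str(data_dir / entry.name))
    return True
```

The point of this ordering is that the destructive step comes last: a failed download, or a pod started by mistake in download mode, leaves the old database in place for as long as the download takes.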