Redundant restarts #278

widhalmt · 2023-09-28T12:37:33Z

We have handlers restarting services and thus creating lag that can lead to timeouts in our checks.

I think we can reduce the number of restarts, especially for Elasticsearch, and speed up the checks dramatically.

One issue I found so far is that in #137 we agreed on not restarting the cluster after the change, but we still notify the handler.

I'll look into the roles and try to remove any redundant restart.

widhalmt · 2023-09-28T12:38:15Z

This might also be connected to #252.

We don't need to restart Elasticsearch after this task. Everything is set in a similar task earlier. This one only changes the start behaviour to a safer one (not reinitializing the cluster). The change only matters during restarts, so whenever Elasticsearch is restarted, the new version will be used. fixes #278

Restarting Elasticsearch takes quite a while and may lead to connection issues as well as sync issues, so keeping restarts to a minimum is important. These changes make sure that, even when the `Restart Elasticsearch` handler is notified, it will only restart if Elasticsearch was running before. If there's a fresh start (after reconfiguration), we don't need to restart again.

The same goes for Logstash and Kibana. Some restarts of these tools happen fairly fast, but others (like after fresh installs or updates) trigger internal jobs that should not be interrupted by another restart. Beats restart very fast, and as far as I know there's no big downside to restarting them right after the first start, so I didn't include them in the change.

Additionally, this PR makes sure some tasks in `verify.yml` of the full stack are only run when the service to be checked is actually running on this node. This helps with spreading services over nodes to save resources. Since GitHub-hosted runners are quite low on resources, we can't run every service on every node in a cluster setup anymore. So this PR makes sure that only Elasticsearch runs everywhere and the others are spread out.

Caches get cleared after every role during a Molecule test. This helps with saving resources, too. Elasticsearch still wouldn't sync all shards because of full volumes, so the watermarks for Elasticsearch are set to extremely high values so that the cluster can at least get into sync.

fixes #278 fixes #141 fixes #194
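The restart guard described above could look roughly like the following. This is only a minimal sketch, not the code from the PR: the variable name `example_elasticsearch_freshstart`, the template name `elasticsearch.yml.j2` and the config path are assumptions made for illustration. The idea is to register the result of the task that starts the service and to skip the handler when that task had to start the service itself.

```yaml
# Illustrative playbook; names marked "hypothetical" are not taken from the roles.
- name: Sketch of a restart handler that skips fresh starts
  hosts: all
  become: true

  tasks:
    - name: Write Elasticsearch configuration
      ansible.builtin.template:
        src: elasticsearch.yml.j2          # hypothetical template name
        dest: /etc/elasticsearch/elasticsearch.yml
      notify: Restart Elasticsearch

    - name: Start Elasticsearch
      ansible.builtin.service:
        name: elasticsearch
        state: started
      # "changed" is true only if the service was not running and had to be started.
      register: example_elasticsearch_freshstart   # hypothetical variable name

  handlers:
    - name: Restart Elasticsearch
      ansible.builtin.service:
        name: elasticsearch
        state: restarted
      # Skip the restart when the service was freshly started in this run,
      # because it already picked up the new configuration.
      when: not example_elasticsearch_freshstart.changed
```

With this shape, a node where Elasticsearch was already running still gets its restart after a config change, while a node that just started the service for the first time does not restart a second time.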

widhalmt added the bug and feature labels Sep 28, 2023

widhalmt self-assigned this Sep 28, 2023

widhalmt mentioned this issue Sep 28, 2023

Set Debian 11 as new default distro for molecule #277

Merged

widhalmt mentioned this issue Sep 28, 2023

Remove redundant restart #279

Merged

widhalmt mentioned this issue Oct 16, 2023

Elasticsearch config #288

Merged

widhalmt closed this as completed in #279 Oct 16, 2023
