ops issues #290

Evanfeenstra · 2024-09-09T17:16:23Z

crashing ec2

limit memory per container, and total docker limit?
rate limiting in traefik
logs outside sometimes - CloudWatch. also make sure to have good logs
log rotation if it's local

ip addresses changing

static IPs on lightning nodes ($3/month)
could still have a load balancer (for domains) that forwards to traefik

better logs in swarm UI

Evanfeenstra · 2024-09-09T17:27:03Z

superadmin

creating swarms
restarting EC2
update route53 stuff

tomsmith8 · 2024-09-09T17:31:12Z

@Evanfeenstra could you prioritise setting docker and container limits?

@kevkevinpal could you prioritise migrating the btc graph, updating the GitHub Actions pipeline and deprecating the non-swarm ec2 instances?

Next up then would be setting up CloudWatch?

Evanfeenstra · 2024-09-09T18:39:43Z

just merged a per-container memory limit, set it once and it applies to every container

84ab225

It's global_mem_limit in the yaml config file, it's a number in bytes
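
Since the limit is given in bytes, it is easy to be off by a factor of 1024 when writing it by hand. A minimal sketch of the conversion, assuming the value is a plain unsigned integer; the helper name `gib_to_bytes` is hypothetical and not part of sphinx-swarm:

```rust
/// Number of bytes in one GiB (2^30).
const BYTES_PER_GIB: u64 = 1024 * 1024 * 1024;

/// Hypothetical helper: convert a whole number of GiB into the byte count
/// that a bytes-based setting such as global_mem_limit would expect.
/// Returns None if the multiplication would overflow u64.
fn gib_to_bytes(gib: u64) -> Option<u64> {
    gib.checked_mul(BYTES_PER_GIB)
}

/// Hypothetical helper: render the value as a "key: value" line.
/// Where exactly the key sits in the yaml file is an assumption here.
fn yaml_line(gib: u64) -> Option<String> {
    gib_to_bytes(gib).map(|bytes| format!("global_mem_limit: {}", bytes))
}
```

For example, `gib_to_bytes(2)` gives 2147483648, so a 2 GiB per-container limit would be written as that number.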

Evanfeenstra · 2024-09-09T19:00:50Z

@tobi-bams here's a new SetGlobalMemLimit cmd, maybe u can add a frontend for it? https://github.com/stakwork/sphinx-swarm/blob/master/src/cmd.rs#L152

tobi-bams · 2024-09-09T20:04:51Z

> @tobi-bams here's a new SetGlobalMemLimit cmd, maybe u can add a frontend for it? https://github.com/stakwork/sphinx-swarm/blob/master/src/cmd.rs#L152

Yea, sure I can.

Evanfeenstra · 2024-09-11T16:29:36Z

log rotation: https://github.com/stakwork/sphinx-swarm/releases/tag/v0.4.98

tomsmith8 · 2024-09-20T09:55:38Z

Update all swarms to m5.large or higher.

Do not use T-family (burstable) instances: running out of CPU credits during spikes causes machines to become unavailable.
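
To make the CPU credit concern concrete, here is a rough model of how long a burstable instance can sustain load before its credit balance hits zero. It assumes one CPU credit equals one vCPU at 100% for one minute and that credits accrue at a fixed hourly rate. The function name and the numbers in the usage note are illustrative assumptions, not measurements from these swarms:

```rust
/// Rough estimate of how many minutes a burstable instance can sustain a
/// given load before its CPU credit balance is exhausted.
///
/// Assumptions (illustrative model only):
/// - one credit = one vCPU at 100% utilization for one minute
/// - credits are earned at a constant `earn_per_hour` rate
/// - `utilization` is the average per-vCPU load in the range 0.0..=1.0
///
/// Returns None when credits are earned at least as fast as they are spent,
/// meaning the balance never runs out under this load.
fn minutes_until_depleted(
    balance: f64,
    earn_per_hour: f64,
    vcpus: f64,
    utilization: f64,
) -> Option<f64> {
    let spent_per_minute = vcpus * utilization;
    let earned_per_minute = earn_per_hour / 60.0;
    let net_drain = spent_per_minute - earned_per_minute;
    if net_drain <= 0.0 {
        None
    } else {
        Some(balance / net_drain)
    }
}
```

With an assumed balance of 576 credits, 24 credits earned per hour and 2 vCPUs pinned at 100%, the balance drains at about 1.6 credits per minute and lasts roughly 360 minutes. After that the instance is throttled to its baseline, which fits the unavailability described above. Fixed-performance types such as m5.large have no credit balance to exhaust.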

tomsmith8 · 2024-09-20T09:59:31Z

@Evanfeenstra any updates on keeping logs?

not deleting and keeping locally
future -> stream logs to something like cloudwatch

Evanfeenstra assigned Evanfeenstra, kevkevinpal, tobi-bams, tomsmith8 and gonzaloaune Sep 9, 2024
