Skip to content

Commit

Permalink
Feature: Add readyz subcommand for Docker healthcheck (#270)
Browse files Browse the repository at this point in the history
  • Loading branch information
axllent committed Mar 30, 2024
1 parent 83c70aa commit a805567
Show file tree
Hide file tree
Showing 5 changed files with 86 additions and 2 deletions.
2 changes: 2 additions & 0 deletions Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -25,4 +25,6 @@ RUN apk add --no-cache tzdata

EXPOSE 1025/tcp 1110/tcp 8025/tcp

HEALTHCHECK --interval=15s CMD /mailpit readyz

This comment has been minimized.

Copy link
@mbrodala

mbrodala Apr 2, 2024

Maybe this could be reduced a bit? Currently it will take 15 seconds until this service is marked healthy, thus a service which depends_on (Compose) this, will need to wait at least 15 seconds.

Also consider adding --start-period=Xs (default: 0s) if you know roughly how long Mailpit needs to start. It may also make sense to adjust --start-interval=Ys (default: 5s) if the startup should be quick but may take 1-2 seconds longer.

In any case thanks a lot for adding this!

This comment has been minimized.

Copy link
@axllent

axllent Apr 3, 2024

Author Owner

@mbrodala Yes, I suppose it could be reduced.

There are a few things to note here:

  1. --start-interval (interval to attempt healthchecks at during the start period) is fairly new to the official Docker server, and is not supported in some Docker "variations" (forks/clones). Mine (docker.io on ubuntu) for instance returns an error unknown flag: start-interval - so this is not an option if there is the potential to break things.
  2. --start-period (start period for the container to initialize before the retries starts to count down) doesn't mean it checks quicker, it just introduces a delay to the start-interval results to allow for slow docker environments to start.

The only way I think I can safely reduce this initial upstart time is to lower --interval from 15s to something like 5s, which is a very ugly way to do this (there are a lot of complains about this) which led to the new API (--start-interval). Reducing it to 5s means it'll be healthy after 5 seconds, but it will also continue to check every 5 seconds.

The problem here is that Mailpit is effectively "ready" almost instantly when Docker starts, meaning this could even be 2s, however to fire the check every 2 seconds in order to get a quicker "ready status" just seems stupid to me.

So two questions for you:

  1. Would 5 seconds be acceptable?
  2. Have you tried to manually set the healthcheck arguments in your docker-compose file, and if so, what worked well for you?

This comment has been minimized.

Copy link
@mbrodala

mbrodala Apr 3, 2024

Yes, our intention was to have Mailpit quickly report its fast startup and then have regular "slow" check intervals. We didn't notice that the start-* flags are that new that they are not available everywhere yet. In this case the current default makes sense.

We also try to reduce the maintenance on our side if there is a usable upstream solution. We already have a healthcheck in place for Mailpit but like to drop it now that there's one by default. That would answer your 2nd question.

As for the 1st: yes, this would give us a fast startup. But at the same time too frequent checks for everyone without a custom healthcheck so I can understand that this may not be a good default.

This comment has been minimized.

Copy link
@axllent

axllent Apr 4, 2024

Author Owner

I would love this option too as it is silly to have to wait a full cycle of "interval" before the first test is run, and maybe --start-interval solves just that (it is difficult to understand exactly how it is supposed to behave without actually being able to try it), but as I said I cannot add that as a default health check option as this will completely break docker on some user's systems.

I have also given the "5s interval" more thought, and I don't like the idea of running the health check every 5 seconds. It is not because it puts load on the Mailpit server, but rather that every 5 seconds Docker will run /mailpit readyz, meaning the OS has to open & execute the binary (unnecessary disk i/o and CPU cycles).

I fully understand your reasoning for wanting to simplify your maintenance, but I don't really see a realistic solution to this other than you having your own health check options in place. I hope you can understand and agree with my point of view?

This comment has been minimized.

Copy link
@mbrodala

mbrodala Apr 8, 2024

Of course, as already stated I understand the reasoning behind the current setup. We'll try to use the default wherever possible and use a custom healthcheck if necessary. :-)


ENTRYPOINT ["/mailpit"]
75 changes: 75 additions & 0 deletions cmd/readyz.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
package cmd

import (
"crypto/tls"
"fmt"
"net/http"
"os"
"path"
"strings"
"time"

"github.com/axllent/mailpit/config"
"github.com/spf13/cobra"
)

var (
useHTTPS bool
)

// readyzCmd represents the healthcheck command
var readyzCmd = &cobra.Command{
Use: "readyz",
Short: "Run a healthcheck to test if Mailpit is running",
Long: `This command connects to the /readyz endpoint of a running Mailpit server
and exits with a status of 0 if the connection is successful, else with a
status 1 if unhealthy.
If running within Docker, it should automatically detect environment
settings to determine the HTTP bind interface & port.
`,
Run: func(cmd *cobra.Command, args []string) {
webroot := strings.TrimRight(path.Join("/", config.Webroot, "/"), "/") + "/"
proto := "http"
if useHTTPS {
proto = "https"
}

uri := fmt.Sprintf("%s://%s%sreadyz", proto, config.HTTPListen, webroot)

conf := &http.Transport{
IdleConnTimeout: time.Second * 5,
ExpectContinueTimeout: time.Second * 5,
TLSHandshakeTimeout: time.Second * 5,
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
}
client := &http.Client{Transport: conf}

res, err := client.Get(uri)
if err != nil || res.StatusCode != 200 {
os.Exit(1)
}
},
}

func init() {
rootCmd.AddCommand(readyzCmd)

if len(os.Getenv("MP_UI_BIND_ADDR")) > 0 {
config.HTTPListen = os.Getenv("MP_UI_BIND_ADDR")
}

if len(os.Getenv("MP_WEBROOT")) > 0 {
config.Webroot = os.Getenv("MP_WEBROOT")
}

config.UITLSCert = os.Getenv("MP_UI_TLS_CERT")

if config.UITLSCert != "" {
useHTTPS = true
}

readyzCmd.Flags().StringVarP(&config.HTTPListen, "listen", "l", config.HTTPListen, "Set the HTTP bind interface & port")
readyzCmd.Flags().StringVar(&config.Webroot, "webroot", config.Webroot, "Set the webroot for web UI & API")
readyzCmd.Flags().BoolVar(&useHTTPS, "https", useHTTPS, "Connect via HTTPS (ignores HTTPS validation)")
}
2 changes: 1 addition & 1 deletion cmd/root.go
Original file line number Diff line number Diff line change
Expand Up @@ -89,7 +89,7 @@ func init() {
rootCmd.Flags().BoolVarP(&logger.VerboseLogging, "verbose", "v", logger.VerboseLogging, "Verbose logging")

// Web UI / API
rootCmd.Flags().StringVarP(&config.HTTPListen, "listen", "l", config.HTTPListen, "HTTP bind interface and port for UI")
rootCmd.Flags().StringVarP(&config.HTTPListen, "listen", "l", config.HTTPListen, "HTTP bind interface & port for UI")
rootCmd.Flags().StringVar(&config.Webroot, "webroot", config.Webroot, "Set the webroot for web UI & API")
rootCmd.Flags().StringVar(&config.UIAuthFile, "ui-auth-file", config.UIAuthFile, "A password file for web UI & API authentication")
rootCmd.Flags().StringVar(&config.UITLSCert, "ui-tls-cert", config.UITLSCert, "TLS certificate for web UI (HTTPS) - requires ui-tls-key")
Expand Down
5 changes: 5 additions & 0 deletions internal/storage/database.go
Original file line number Diff line number Diff line change
Expand Up @@ -115,6 +115,11 @@ func Close() {
}
}

// Ping the database connection and return an error if unsuccessful
func Ping() error {
return db.Ping()
}

// StatsGet returns the total/unread statistics for a mailbox
func StatsGet() MailboxStats {
var (
Expand Down
4 changes: 3 additions & 1 deletion server/handlers/k8sready.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,14 @@ package handlers
import (
"net/http"
"sync/atomic"

"github.com/axllent/mailpit/internal/storage"
)

// ReadyzHandler is a ready probe that signals k8s to be able to retrieve traffic
func ReadyzHandler(isReady *atomic.Value) http.HandlerFunc {
return func(w http.ResponseWriter, _ *http.Request) {
if isReady == nil || !isReady.Load().(bool) {
if isReady == nil || !isReady.Load().(bool) || storage.Ping() != nil {
http.Error(w, http.StatusText(http.StatusServiceUnavailable), http.StatusServiceUnavailable)
return
}
Expand Down

0 comments on commit a805567

Please sign in to comment.