Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Application health status check #3379

Merged
merged 7 commits into from
Sep 3, 2024
Merged

Application health status check #3379

merged 7 commits into from
Sep 3, 2024

Conversation

sebr72
Copy link
Contributor

@sebr72 sebr72 commented Aug 21, 2024

No description provided.

@sebr72 sebr72 marked this pull request as draft August 21, 2024 18:22
@sbrunner sbrunner self-requested a review August 22, 2024 07:19
@sebr72 sebr72 force-pushed the healthstatus branch 2 times, most recently from b4133ad to bdcd26b Compare August 28, 2024 16:20
@sebr72 sebr72 marked this pull request as ready for review August 28, 2024 16:20
Copy link
Member

@sbrunner sbrunner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, the test should not test the increasing of the queue, it's not a pertinent metric.

Expected test:
WaitingJobsCount == 0 => True
LastExecutedJobTimestamp > now - FloatingWindow => True
otherwise => False

The pseudocode we wrote together was better, but I don't remember it!

@sbrunner
Copy link
Member

sbrunner commented Aug 29, 2024

Examples of the responses of the /metrics/healthcheck endpoint:

  1. When we call the healthy method:
{
    "application": {
        "healthy": true,
        "message": "sbr test.",
        "duration": 0,
        "timestamp": "2024-08-29T13:44:58.398Z"
    }
}

Response code: HTTP code 200.

  1. When we call the unhealthy method:
{
    "application": {
        "healthy": false,
        "message": "sbr test.",
        "duration": 0,
        "timestamp": "2024-08-29T13:44:58.398Z"
    }
}

Response code: HTTP status code 200 (or 500 when we set JAVA_OPTS to -DhttpStatusIndicator=true)

  1. If we raise an exception:
{
    "application": {
        "healthy": false,
        "message": "sbr test.",
        "error": {
            "type": "java.lang.RuntimeException",
            "message": "sbr test.",
            "stack": [
                "org.mapfish.print.metrics.ApplicationStatus.check(ApplicationStatus.java:15)", 
                "com.codahale.metrics.health.HealthCheck.execute(HealthCheck.java:374)", 
                "com.codahale.metrics.health.HealthCheckRegistry.runHealthChecks(HealthCheckRegistry.java:184)", 
                "com.codahale.metrics.servlets.HealthCheckServlet.runHealthChecks(HealthCheckServlet.java:177)", 
                "com.codahale.metrics.servlets.HealthCheckServlet.doGet(HealthCheckServlet.java:146)", 
                "javax.servlet.http.HttpServlet.service(HttpServlet.java:529)", 
                "javax.servlet.http.HttpServlet.service(HttpServlet.java:623)", 
                "com.codahale.metrics.servlets.AdminServlet.service(AdminServlet.java:153)", 
                "javax.servlet.http.HttpServlet.service(HttpServlet.java:623)", 
                "org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:199)", 
                "org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)", 
                "org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51)", 
                "org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:168)", 
                "org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)", 
                "com.thetransactioncompany.cors.CORSFilter.doFilter(CORSFilter.java:209)", 
                "com.thetransactioncompany.cors.CORSFilter.doFilter(CORSFilter.java:244)", 
                "org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:168)", 
                "org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)", 
                "com.codahale.metrics.servlet.AbstractInstrumentedFilter.doFilter(AbstractInstrumentedFilter.java:112)", 
                "org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:168)", 
                "org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:352)", 
                "org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:117)", 
                "org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:83)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", 
                "org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:126)", 
                "org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:120)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", 
                "org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:131)", 
                "org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:85)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", "org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:100)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", 
                "org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:164)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", "org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", 
                "org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilterInternal(BasicAuthenticationFilter.java:168)",
                "org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", 
                "org.springframework.security.web.header.HeaderWriterFilter.doHeadersAfter(HeaderWriterFilter.java:90)", 
                "org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:75)", 
                "org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", 
                "org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:62)", 
                "org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", 
                "org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:117)", 
                "org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:87)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", 
                "org.springframework.security.web.access.channel.ChannelProcessingFilter.doFilter(ChannelProcessingFilter.java:133)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", 
                "org.springframework.security.web.session.DisableEncodeUrlFilter.doFilterInternal(DisableEncodeUrlFilter.java:42)", 
                "org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)", 
                "org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)", 
                "org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:225)", 
                "org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:190)", 
                "org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:354)", 
                "org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:267)", 
                "org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:168)", 
                "org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)", 
                "org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)", 
                "org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)", 
                "org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:168)", 
                "org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)", 
                "org.mapfish.print.servlet.RequestSizeFilter.doFilter(RequestSizeFilter.java:40)", 
                "org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:168)", 
                "org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)", 
                "org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:168)", 
                "org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:90)", 
                "org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482)", 
                "org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:130)", 
                "org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:93)", 
                "org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)", 
                "org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:346)", 
                "org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:388)", 
                "org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)", 
                "org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:936)", 
                "org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1791)", 
                "org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)", 
                "org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1190)", 
                "org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)", 
                "org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63)", 
                "java.base/java.lang.Thread.run(Thread.java:829)"
            ]                
        },
        "duration": 0,
        "timestamp": "2024-08-29T13:46:38.000Z"
    }
}

Response code: HTTP code 200.

@sbrunner
Copy link
Member

sbrunner commented Aug 29, 2024

Then for me, we have two choices:

1st:
Green: healty method
Yellow: unhealty method
Red: throw an exception

2nd:
Green: healty method
Yellow: healty method with a message that starts with e.g.: WARNING:
Red: unhealty method
And keep the exception for the unexpected behavior

Personally, I prefer the second one :-)

@sebr72 sebr72 closed this Aug 29, 2024
@sebr72 sebr72 reopened this Aug 29, 2024
@sebr72
Copy link
Contributor Author

sebr72 commented Aug 29, 2024

We have 3 states:
Green, Yellow and Red. For Healthy, Sick, Dying

Comparison of 2 options:
In both cases
Green: healthy method, Http status automatically set to 200
Red: Unexpected exceptions, Http status automatically set to 500

1st:
Result.healthy() is Green, Result.unhealthy() is Orange, Exceptions are red
Yellow: unhealthy method, Http status automatically set to 200
Red: throw an exception, Http status automatically set to 500

2nd:
Result.healthy() is Green, Result.healthy() is Orange (with details in message), Result.unhealthy() is red (unexpected exceptions are red too)
Yellow: healthy method with a message that starts with an agreed Keyword (e.g.: "WARNING:") , Http status automatically set to 200
Red: unhealthy method, Http status is set to 500 (not automatically)

Pros and cons:
1st:
Clients will have to parse the healthy status to know if it is healthy
We do not have to set the JAVA_OPTS to -DhttpStatusIndicator=true, hence one less thing to document and do wrong.
There is no special cases for exceptions, hence less code. Unless the developer sees one that requires a special case.
The developer can manually create an exception when the MFP server is in Red.
You cannot have a 500 without an exception.

2nd:
A special keyword like WARNING will have to be defined to identify Yellow state, and our clients will have to parse it.
If the JAVA_OPTS to -DhttpStatusIndicator=true, is forgotten, the red status becomes a 200, which is dangerous.
You can have a 500 http status without an exception

@pmauduit
Copy link
Contributor

We have 3 states:
Green, Yellow and Red. For Healthy, Sick, Dying

Since I have been asked for my insight on this topic, I felt 2 different usecases for the endpoint:

  • kubernetes probes: they actually don't care of the "intermediate" state, even if we can tweak the threshold and the number of hits which would fail, it needs a more binary answer than the 3 states: either it's green (green), either it's failing (red). From what I am aware of, I think http probes in k8s are mainly based on the http status code (200 or something else), not the content of the response.
  • developer who will investigate in case of "yellow" / sick. It means something external to kubernetes will need to parse the endpoint and warn people when getting yellow for further investigations.

The main question is in which side to put the "Sick" state: either on "healthy" or on "dying" side. I would be in favor of the first suggested solution, because setting an extra java property at startup sounds to me as a special configuration case framework-wise, which is not supposed to be the default behaviour, and probably for a good reason. "Sick" also sounds to me more like a "Yes, but ..." with extra investigation to perform, but the service should still be available but like in a degraded state, so maybe it makes sense to still consider it as OK.

@sebr72 sebr72 requested a review from sbrunner August 30, 2024 09:37
@sbrunner
Copy link
Member

sbrunner commented Sep 3, 2024

@pmauduit for info: followup of the discussion: #3393

Copy link
Member

@sbrunner sbrunner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed for tests :-)

@sebr72 sebr72 merged commit bd3ce4a into master Sep 3, 2024
9 checks passed
@sebr72 sebr72 deleted the healthstatus branch September 3, 2024 12:36
@geo-ghci-int geo-ghci-int bot added this to the 3.31.0 milestone Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants