Friday, August 11, 2017

How do Kubernetes and its pods behave regarding SIGTERM, SIGKILL and HTTP request routing?

Introduction

During a recent project we saw that HTTP requests were still arriving in pods (Spring Boot MVC controllers) even though Kubernetes' kubelet had told the pod to exit by sending it a SIGTERM.
Not nice, because those HTTP requests that still get routed to the (shutting down) pod will most likely fail, since the Spring Boot Java process, for example, has already closed all its connection pools.

See this post (also shown below) for an overview of the Kubernetes architecture, e.g. regarding kubelets.


Analysis

The process for Kubernetes to terminate a pod is as follows:
  1. The kubelet always sends a SIGTERM before a SIGKILL.
  2. Only when a pod does not exit within the grace period (default 30 seconds) after the SIGTERM does the kubelet send a SIGKILL.
  3. Kubernetes keeps routing traffic to a pod until the readiness probe fails, even after the pod has received a SIGTERM.
So for a pod there is always an interval between receiving the SIGTERM and the next readiness probe request for that pod. In that period, requests can (and most likely will) still be routed to that pod, and (business) logic can still be executed in the terminating pod.
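
On the JVM side, the SIGTERM from step 1 arrives as the start of the normal JVM shutdown sequence. A minimal sketch in plain Java, without any framework, of what that means for the pod's process (the class name is illustrative):

public class SigtermDemo {

    public static void main(String[] args) throws InterruptedException {
        // The JVM translates SIGTERM (kill -15 <pid>) into its normal shutdown
        // sequence, which runs the registered shutdown hooks.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.out.println("SIGTERM received, cleaning up...");
            // Any cleanup here must finish well within the grace period
            // (default 30 seconds); SIGKILL cannot be caught or handled.
        }));
        // Keep the process alive until it is signalled.
        Thread.currentThread().join();
    }
}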

This means that after the SIGTERM has been sent, the readiness probe must start failing as soon as possible to prevent the SIGTERMed pod from receiving more HTTP requests. But even then there will be a (small) period of time during which requests can still be routed to the pod.
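
One way to make the readiness probe fail quickly is to back the readiness endpoint with a flag that is flipped as soon as the Spring context starts closing. Below is a minimal sketch, assuming Spring Boot MVC; the /readiness path is an assumption and must match whatever path the pod's readinessProbe is configured to call. Note that this only narrows the window described above, it does not close it.

import java.util.concurrent.atomic.AtomicBoolean;

import org.springframework.context.event.ContextClosedEvent;
import org.springframework.context.event.EventListener;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ReadinessController {

    private final AtomicBoolean ready = new AtomicBoolean(true);

    // ContextClosedEvent is published when the SIGTERM-triggered shutdown of the
    // Spring context starts; from then on the readiness endpoint reports failure.
    @EventListener(ContextClosedEvent.class)
    public void onShutdown() {
        ready.set(false);
    }

    // Point the pod's readinessProbe (httpGet) at this path.
    @GetMapping("/readiness")
    public ResponseEntity<String> readiness() {
        return ready.get()
                ? ResponseEntity.ok("READY")
                : ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE).body("SHUTTING DOWN");
    }
}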

A solution would be to gracefully shut down the web server inside the pod's process (in this case Spring Boot's embedded web server) immediately after receiving the SIGTERM. This way any requests that are still routed to the pod before the readiness probe fails will fail anyway, i.e. no more requests are accepted.
So you would still have some failing requests getting passed on to the pod, but at least no business logic will be executed anymore.
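
Below is a sketch of one way to do this, assuming a Spring Boot 1.x application with the embedded Tomcat (the class names and the 20-second timeout are illustrative assumptions, not details from the project described above): when ContextClosedEvent is published at the start of the shutdown, the Tomcat connector is paused so no new requests are accepted, and in-flight requests get a bounded amount of time to finish.

import java.util.concurrent.Executor;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.apache.catalina.connector.Connector;
import org.springframework.boot.context.embedded.EmbeddedServletContainerCustomizer;
import org.springframework.boot.context.embedded.tomcat.TomcatConnectorCustomizer;
import org.springframework.boot.context.embedded.tomcat.TomcatEmbeddedServletContainerFactory;
import org.springframework.context.ApplicationListener;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.event.ContextClosedEvent;

@Configuration
public class GracefulShutdownConfig {

    // Pauses the Tomcat connector on shutdown so no new requests are accepted,
    // then waits (bounded) for in-flight requests to finish.
    static class GracefulShutdown implements TomcatConnectorCustomizer, ApplicationListener<ContextClosedEvent> {

        private volatile Connector connector;

        @Override
        public void customize(Connector connector) {
            this.connector = connector;
        }

        @Override
        public void onApplicationEvent(ContextClosedEvent event) {
            connector.pause(); // stop accepting new requests right after SIGTERM
            Executor executor = connector.getProtocolHandler().getExecutor();
            if (executor instanceof ThreadPoolExecutor) {
                ThreadPoolExecutor threadPool = (ThreadPoolExecutor) executor;
                threadPool.shutdown();
                try {
                    // Stay well below Kubernetes' grace period (default 30 seconds).
                    threadPool.awaitTermination(20, TimeUnit.SECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    @Bean
    public GracefulShutdown gracefulShutdown() {
        return new GracefulShutdown();
    }

    @Bean
    public EmbeddedServletContainerCustomizer tomcatCustomizer(GracefulShutdown gracefulShutdown) {
        return container -> {
            if (container instanceof TomcatEmbeddedServletContainerFactory) {
                ((TomcatEmbeddedServletContainerFactory) container)
                        .addConnectorCustomizers(gracefulShutdown);
            }
        };
    }
}

The timeout must stay well below the grace period, otherwise the kubelet's SIGKILL will cut the cleanup short. Newer Spring Boot versions (2.3 and later) offer equivalent behaviour out of the box via the server.shutdown=graceful property.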

This and other options/considerations are discussed here.





2 comments:

Anubhav said...

Good post, but I would like to point out a correction.

When kube-api receives a pod delete call, it signals the kubelet to delete the pod.

At the same time, the endpoint controller is also watching for delete events in kube-api, and when a pod is deleted it updates the endpoint object by removing the endpoint of the pod being deleted.

Anubhav said...

When a pod is being deleted, the endpoint controller, which is watching kube-api, removes the pod's endpoint from the endpoint object.

This operation is independent of the readiness probe.