Current state
I’m exposing a backend service’s APIs, which lives on a Kubernetes cluster on Google Cloud, using a Kubernetes Ingress. The backend is a Tomcat application that exposes client-facing APIs through port 8080 and its internal endpoints (e.g. health checks) through a different port, 8085. The internal APIs are not exposed to the “outside world”. Each port also serves its APIs under a different context path:
Client-Facing APIs:
port: 8080
context-path: /api/*
Internal APIs:
port: 8085
context-path: /management/
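As a side note, this dual-port, dual-context setup is typical of Spring Boot’s separate management server (the actuator-style health path used later in the probes hints at this). Purely as an illustration, and not taken from the actual project, such a split could be configured along these lines:

```yaml
# Hypothetical Spring Boot application.yml producing this port/path split
server:
  port: 8080                 # client-facing APIs
  servlet:
    context-path: /api       # exposed as /api/*
management:
  server:
    port: 8085               # internal endpoints on a separate port
  endpoints:
    web:
      base-path: /management # health check lives under this path
```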
Transposed into Kubernetes terms, this configuration maps to three resources: an Ingress, a NodePort Service, and a Deployment (we will consider the app stateless).
ingress.yaml
```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress
spec:
  rules:
    - http:
        paths:
          - path: /api/*
            backend:
              serviceName: backend-service
              servicePort: 8080
...
```
Service.yaml
```yaml
apiVersion: v1
kind: Service
metadata:
  name: service
spec:
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
      name: http
    - port: 8085
      targetPort: 8085
      protocol: TCP
      name: management
  selector:
    app.kubernetes.io/name: backend-service
    app.kubernetes.io/instance: test
```
Deployment.yaml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: backend-service
      app.kubernetes.io/instance: test
  template:
    metadata:
      labels:
        # must match spec.selector.matchLabels
        app.kubernetes.io/name: backend-service
        app.kubernetes.io/instance: test
    spec:
      containers:
        - name: backend-service
          image: my/backend-service:1.0.0
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: management
              containerPort: 8085
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /management/health
              port: management
            initialDelaySeconds: 60
            timeoutSeconds: 3
          readinessProbe:
            httpGet:
              path: /management/health
              port: management
            initialDelaySeconds: 60
            timeoutSeconds: 3
```
Nothing really fancy happening here; the most “extraordinary” things are the custom liveness and readiness probes, which point to the health check endpoint on the management context.
The problem
Google Kubernetes Engine (GKE for short) creates a default health check to verify the state of the backend services; if they are unhealthy, it stops letting clients connect to them and returns a 502 response instead.
Default GKE Health check
The health check feature is pretty cool and it’s also pretty smart. If a readiness probe is defined for your deployment, the health check will be configured based on that. Everything sounds good in theory, right? Well, it isn’t exactly like that.
The health check configuration is picked up from the readiness probe only if the health check endpoint is exposed on the same port as the one the Ingress accesses. It sounds more confusing than it is, so let me visualize it for you:
Ingress-Service-Deployment Configuration
As you can notice, there is a direct connection between the Ingress and the `/api/*` endpoints through the BE Service. When I tried to set up a custom health check pointing to the correct node port, it was reset to the default one after a couple of minutes. The reason is exactly that the Ingress does not have access to the health check endpoint (the ports are mapped differently). For security reasons, all the endpoints under the `/management/*` path are not exposed to the internet; they can be accessed only from within the Kubernetes cluster or from the node they live on, so exposing port 8085 to the Ingress was not an option.
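To make the constraint concrete: for GKE to derive its health check from the probe, the probe would have to target the same container port that the Ingress-facing Service port maps to, something like the sketch below (the `/api/health` path is hypothetical and does not exist in this setup):

```yaml
# Hypothetical probe GKE *would* pick up: it targets the same
# container port (8080) that the Ingress-facing Service port maps to.
readinessProbe:
  httpGet:
    path: /api/health   # hypothetical endpoint on the serving port
    port: http          # the containerPort named "http" (8080)
```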
The Solution
The first option was, of course, to create a custom health check endpoint on the `/api/*` context, but that would just bypass the health check mechanisms already in place. The final option was… drum roll… multi-container pods.
Inside the same pod, we now run two containers instead of one: the backend API, untouched, and an NGINX instance acting as a reverse proxy. How would that work?
Multi-container Pod with NGINX Reverse Proxy
The diagram looks rather confusing, so let’s take it step by step:

- the port exposed by the Service to the Ingress is now `80` instead of `8080`;
- inside the `BE` pod, 2 containers are now running instead of 1;
- the `BE App` container still exposes the same 2 ports: `8080` for the `/api/*` context and `8085` for the `/management/*` context;
- the `Nginx` container exposes only one port: `80`;
- the `Nginx` server acts as a reverse proxy: the `/` endpoint returns a `200 OK` response, and `/api/*` endpoints are proxied to the `BE App` container.
What does the configuration look like now?
ingress.yaml
```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress
spec:
  rules:
    - http:
        paths:
          - path: /api/*
            backend:
              serviceName: backend-service
              servicePort: 80
...
```
Service.yaml
```yaml
apiVersion: v1
kind: Service
metadata:
  name: service
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
      name: http
    - port: 8085
      targetPort: 8085
      protocol: TCP
      name: management
  selector:
    app.kubernetes.io/name: backend-service
    app.kubernetes.io/instance: test
```
Deployment.yaml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: backend-service
      app.kubernetes.io/instance: test
  template:
    metadata:
      labels:
        # must match spec.selector.matchLabels
        app.kubernetes.io/name: backend-service
        app.kubernetes.io/instance: test
    spec:
      containers:
        - name: backend-service
          image: my/backend-service:1.0.0
          imagePullPolicy: Always
          ports:
            - name: api
              containerPort: 8080
              protocol: TCP
            - name: management
              containerPort: 8085
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /management/actuator/health
              port: management
            initialDelaySeconds: 60
            timeoutSeconds: 3
          readinessProbe:
            httpGet:
              path: /management/actuator/health
              port: management
            initialDelaySeconds: 60
            timeoutSeconds: 3
        - name: nginx
          image: nginx:1.17.5
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
          volumeMounts:
            - name: nginx-configuration
              mountPath: /etc/nginx/nginx.conf
              subPath: nginx.conf
      volumes:
        - name: nginx-configuration
          configMap:
            name: nginx-config
```
ConfigMap.yaml
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
  labels:
    app.kubernetes.io/name: nginx-config
    app.kubernetes.io/instance: test
data:
  nginx.conf: |
    user  nginx;
    worker_processes  1;

    error_log  /var/log/nginx/error.log warn;
    pid        /var/run/nginx.pid;

    events {
        worker_connections  1024;
    }

    http {
        include       /etc/nginx/mime.types;
        default_type  application/octet-stream;

        log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                          '$status $body_bytes_sent "$http_referer" '
                          '"$http_user_agent" "$http_x_forwarded_for"';

        access_log  /var/log/nginx/access.log  main;

        sendfile           on;
        keepalive_timeout  65;

        upstream backend-service {
            server 127.0.0.1:8080;
        }

        server {
            listen 80;

            location /api {
                return 302 /api/;
            }

            location /api/ {
                proxy_pass http://backend-service;
                proxy_redirect off;
            }

            location / {
                return 200;
            }
        }
    }
```
A new YAML file popped up, in the form of a ConfigMap. It contains the NGINX configuration, which overrides the default one from the Docker image. Most of it is boilerplate; the important bits are the upstream definition and the three `location` blocks:
- define an upstream called `backend-service` that points to localhost on port `8080` (this is how containers within the same pod reach each other);
- calls to `/api` are redirected to `/api/` (adding the trailing `/`);
- calls to `/api/` are proxied to the `backend-service` upstream; since `proxy_pass` is used without a URI part, the original `/api/...` path is forwarded to the backend unchanged, matching the Tomcat context;
- calls to `/` return a `200 OK`.
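To sanity-check the routing rules from inside the cluster, a throwaway pod along these lines can be handy (the `service` host name comes from the Service manifest above; the curl image and tag are just an example):

```yaml
# Hypothetical in-cluster smoke test for the nginx routing rules
apiVersion: v1
kind: Pod
metadata:
  name: proxy-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: curl
      image: curlimages/curl:7.72.0
      command:
        - sh
        - -c
        - |
          # "/" should answer 200 directly from nginx
          curl -s -o /dev/null -w "/     -> %{http_code}\n" http://service/
          # "/api" should be 302-redirected to "/api/"
          curl -s -o /dev/null -w "/api  -> %{http_code}\n" http://service/api
          # "/api/..." should reach the backend through the proxy
          curl -s -o /dev/null -w "/api/ -> %{http_code}\n" http://service/api/
```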