Wednesday, October 27, 2021

Service to Service call patterns in Google Cloud - GKE

This is a series of posts that will explore service to service call patterns in some of the application runtimes in Google Cloud. This specific post will explore GKE without using a service mesh and the next post will explore GKE with Anthos Service Mesh.

Set Up

The set-up is simple, two applications - caller and producer are hosted on the application runtime with caller making a http request to the producer. An additional UI is packaged with the caller that should make it easy to test the different scenarios.

The producer is special, a few faults can be injected into the producers response based on the post body from the caller:

  1. An arbitrary delay
  2. A specific response http status code

These will be used for checking how the runtimes behave under faulty situation.

GKE Autopilot Runtime

The fastest way to get a fully managed Kubernetes cluster in Google Cloud is to spin up a GKE Autopilot cluster. Assuming such a cluster is available, the service to service call pattern is through the abstraction of a Kubernetes service and looks something like this:

A manifest file which enables this is the following:

Once a service resource is created, here called "sample-producer" for instance, a client can call it using the services FQDN - sample-producer.default.svc.cluster.local. In my sample, the caller and the called are in the same namespace, for such cases calling by just the service name is sufficient.

A sample service to service call and its output in a simple UI looks like this:

A few things to see here:
  1. As the request flows from the browser to the caller to the producer, the headers are captured at each stage and presented. There is nothing special with the headers so far, once service meshes come into play they start to get far more interesting.
  2. The delay does not do anything, the browser and the caller end up waiting no matter how high the delay.
  3. Along the same lines, if the producer starts failing, caller continues to send requests down to the service, instead of short circuiting it.


Service to service call in a Kubernetes environment is straightforward with the abstraction of a Kubernetes service resource providing a simple way for clients to reach the instances hosting an application. Layering in a service mesh provides a great way for the service to service calls to be much more resilient without the application explicitly needing to add in libraries to handle request timeouts or faulty upstream services. This will be the topic of the next blog post.