This Lab logically follows the previous steps required to provision and curate a Kubernetes cluster. Please review them before proceeding. If you are in doubt, feel free to contact me directly via https://www.linkedin.com/in/citurria/
Testing BookInfo app with Circuit Breaker based policy
The third and last test in the Service Mesh uses a Circuit Breaker based pattern. It further protects our microservices when certain conditions occur, for example by preventing an unexpected flood of requests from overwhelming the microservices in the service mesh.
We might decide to throttle or simply reject new incoming requests once the number of concurrent incoming HTTP requests reaches a certain threshold.
For demonstration purposes, we are going to set rules that allow a maximum of 1 request at a time. If more than 1 request comes in at once, the extra requests will be prevented from entering the mesh.
Go back to your PuTTY session and inspect the destination-policy-productpage-cb.yaml file.
- Line 2: This time, we are creating a Destination Policy instead of a routing rule.
- Lines 7 to 10: This rule applies when calling the Product page.
- Lines 11 to 20: Circuit breaker parameters, allowing a maximum of 1 incoming connection at any time.
Note: The Istio Ingress Controller automatically manages and keeps track of these counters.
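For reference, the file looks roughly like the sketch below. It follows the legacy config.istio.io/v1alpha2 DestinationPolicy schema that this version of Istio uses; the metadata name and the exact parameter values are assumptions, so check them against your local copy of the file.

```yaml
# Sketch only: names and values are assumptions, verify against the actual file.
apiVersion: config.istio.io/v1alpha2
kind: DestinationPolicy
metadata:
  name: productpage-circuit-breaker   # hypothetical name
spec:
  destination:
    name: productpage                 # the rule applies to the Product page
  circuitBreaker:
    simpleCb:
      maxConnections: 1               # at most 1 concurrent connection
      httpMaxPendingRequests: 1       # at most 1 queued (pending) request
      httpMaxRequestsPerConnection: 1
      sleepWindow: 2m
      httpConsecutiveErrors: 1
      httpDetectionInterval: 1s
```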
Before we proceed, let’s make sure there is no other previous destination rule left in the environment.
istioctl get destinationpolicies -o yaml
As a second precaution, let’s force-reset all Istio Ingress Controller network counters. One way to do this is by calling an admin API on the underlying Envoy network proxy that lives inside the “Istio Ingress” Deployment -> “Istio-Ingress” Container.
Because we have done Labs 1 and 2, we know that we can easily use the Weave Scope Dashboard to drill down into the “Istio-Ingress” Container and attach a shell to it, so that we can run commands such as curl to call the internal Envoy APIs. Pretty cool huh?
- Once attached to the Container shell, call the Envoy admin API (port 15000) to reset all counters:
curl -X POST http://localhost:15000/reset_counters
- Call another Envoy API to see the current stats. Most counters should now be set to zero:
curl http://localhost:15000/stats
- Before we apply the Destination Policy rule, let’s test the normal behaviour. For this we are going to use a test utility called gobench to generate API request traffic against the Product page.
First, however, we need to locate the API endpoint where the Product Page is exposed, so that we can target it with the gobench command from the PuTTY command line.
A simple way to get it is using kubectl and locating the “Istio-Ingress” exposed Port.
Go back to the PuTTY session (not Weave Scope anymore) and retrieve all services:
kubectl get services --all-namespaces
Note: We could call the internal IP (e.g. 10.104.217.62) on port 80, or simply localhost (127.0.0.1) on port 32651, which is the external port to which this service is mapped.
Both options are equivalent!
Now, let’s install gobench load test utility (I used version 1.10.2).
wget https://dl.google.com/go/go1.10.2.linux-amd64.tar.gz
tar -C /usr/local -xzf go1.10.2.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
GOPATH=/tmp/gb/ go get github.com/valyala/fasthttp
GOPATH=/tmp/gb/ go get github.com/cmpxchg16/gobench
cp /tmp/gb/bin/gobench /usr/local/go/bin/
Now, let’s simulate traffic.
gobench -u http://[SERVER]:[PORT]/productpage -k=true -c 2 -r 20
That will generate 20 requests on each of 2 concurrent sessions (40 in total), where [SERVER]:[PORT] is the endpoint used to expose the Istio-Ingress Controller. For example:
gobench -u http://127.0.0.1:32651/productpage -k=true -c 2 -r 20
- The result should be 40 hits and 40 successful requests (or almost 40; 1 or 2 could fail because of contention).
Now, apply the Destination Policy
istioctl create -f /tmp/mgt/env/envSvcMesh/meshDemo/destination-policy-productpage-cb.yaml
Before running the load test again, go back to the Weave Scope Dashboard page -> Istio-Ingress container shell window and reset the counters once again, so that we get a clean view of the circuit breaker.
The command is the same as we did before:
curl -X POST http://localhost:15000/reset_counters
Now, run the gobench command again to simulate the load, but this time let’s increase the concurrency to force our circuit breaker to trip:
gobench -u http://[SERVER]:[PORT]/productpage -k=true -c 4 -r 20
That will generate 20 requests on each of 4 concurrent sessions (80 in total), where [SERVER]:[PORT] is the endpoint used to expose the Istio-Ingress Controller. For example:
gobench -u http://127.0.0.1:32651/productpage -k=true -c 4 -r 20
- This time, due to the policy, only about half of the requests were served. Only 1 request was allowed at a time, so an incoming request got through only if the previous one had already completed by the time it arrived; simultaneous requests were rejected by the circuit breaker.
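To build intuition for why the served/rejected split depends on timing, here is a toy shell sketch. It is not Envoy’s actual algorithm, and the wave-based timing is an assumption: a limiter admits at most 1 in-flight request per wave of perfectly simultaneous arrivals, so at concurrency 4 it rejects 3 out of every 4 requests. In the real test some requests arrive only after the previous one has finished, which is why roughly half got through.

```shell
#!/bin/sh
# Toy model of a circuit breaker with a limit of 1 in-flight request.
# Requests arrive in waves of $concurrency simultaneous requests.
concurrency=4
waves=20
served=0
overflowed=0
for wave in $(seq "$waves"); do
  in_flight=0                         # nothing is in flight when a wave starts
  for req in $(seq "$concurrency"); do
    if [ "$in_flight" -lt 1 ]; then   # under the limit: admit the request
      in_flight=$((in_flight + 1))
      served=$((served + 1))
    else                              # limit reached: trip the breaker
      overflowed=$((overflowed + 1))
    fi
  done
done
echo "served=$served overflowed=$overflowed"
```

With these numbers the sketch prints served=20 overflowed=60; relaxing the simultaneous-arrival assumption moves the split toward the roughly 50/50 result observed in the real test.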
- To see the number of times that we tripped the circuit breaker, go back to the Weave Scope Dashboard -> Istio-Ingress container shell window and this time send a stats API to Envoy proxy:
curl http://localhost:15000/stats | grep productpage
- Lots of Envoy statistics and counters will show. Have a special look at rq_pending_overflow – that indicates the number of times the policy had to be enforced and the circuit breaker tripped. In this case, 39 times.
- To finish, let’s produce more sustained traffic, so that we can analyse time-series data and visualise it in Grafana. This data is scraped by Prometheus, which queries the Istio Control Plane every few seconds.
gobench -u http://[SERVER]:[PORT]/productpage -k=true -c 4 -r 500000
That will generate 500,000 requests on each of 4 concurrent sessions, where [SERVER]:[PORT] is the endpoint used to expose the Istio-Ingress Controller. For example:
gobench -u http://127.0.0.1:32651/productpage -k=true -c 4 -r 500000
- Open a new browser tab and click on the Grafana bookmark.
- Grafana uses Prometheus as its data source, and Prometheus scrapes the Istio Control Plane every few seconds as network data flows through the Service Mesh Envoy proxies.
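As an illustration of the kind of query behind those Grafana charts, older Istio releases exported a metric named istio_request_count to Prometheus. The metric and label names depend on your Istio version, so treat this query as an assumption to adapt:

```
sum(rate(istio_request_count[1m])) by (destination_service)
```

Pasting a query like this into the Prometheus or Grafana query box shows the per-service request rate that the dashboards are built on.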
Feel free to spend some time analysing the resulting data.
Depending on the amount of load being generated, you will see lots of HTTP errors. That basically means we can push the current microservices deployment to its breaking point. Then, try scaling the BookInfo microservices out and see how the results improve.
I hope you found this blog useful. If you have any question or comment, feel free to contact me directly at https://www.linkedin.com/in/citurria/
Thanks for your time.