Deployment Strategies for Microservices
Why deployment strategies matter
One of the biggest challenges in developing cloud native applications today is increasing the speed and frequency of your deployments. With a microservices approach, developers are already designing completely modular applications that allow multiple teams to write and deploy changes to an application simultaneously.
Shorter and more frequent deployments offer the following benefits:
- Reduced time-to-market.
- Customers can take advantage of features faster.
- Customer feedback flows back into the product team faster, which means the team can iterate on features and fix problems faster.
- Higher developer morale with more features in production.
But with more frequent releases, the chances of negatively affecting application reliability or customer experience can also increase. This is why it’s essential for operations and DevOps teams to develop processes and manage deployment strategies that minimize risk to the product and customers.
In the microservices world there are a few different ways to release an application, and you have to choose the right strategy carefully to make your infrastructure resilient:
- recreate: terminate the old version and release the new one
- ramped: release a new version on a rolling update fashion, one after the other
- blue/green: release a new version alongside the old version then switch traffic
- canary: release a new version to a subset of users, then proceed to a full rollout
- a/b testing: release a new version to a subset of users in a precise way (HTTP headers, cookie, weight, etc.). This doesn’t come out of the box with Kubernetes; it implies extra work to set up a smarter load-balancing system (Istio, Linkerd, Traefik, custom NGINX/HAProxy, etc.).
- shadow: release a new version alongside the old version. Incoming traffic is mirrored to the new version and doesn't impact the response.
In this post, we’ll take a look at each strategy and see what type of application would fit best for it.
Microservices deployment strategies
Application deployment becomes more complicated as the product grows, and deploying a new application version may mean deploying new infrastructure code with it. A deployment strategy determines the deployment process and is defined by the deployment configuration a user provides when hosting the application on Kubernetes. Some of these strategies are described below:
Recreate
A deployment defined with a strategy of type Recreate will terminate all the running instances and then recreate them with the newer version. There is no need for intelligent routing or traffic-switching mechanisms, as only one version exists at a time.
This is not a true zero-downtime strategy: it involves a stipulated downtime and is more like an old-fashioned deployment/installation process. What the approach offers in return is cleanliness and conceptual simplicity, since at no time do you have to manage more than one application version in parallel.
In this very simple type of deployment with Kubernetes, all of the old pods are killed at once and replaced at once with the new ones. A full example and steps to deploy can be found at https://www.devopsforward.com/kubernetes-deployment-strategies/
Traffic is directed entirely to App-V2 only once App-V1 is fully deleted and App-V2 is rolled out.
This strategy is mostly appropriate for non-critical systems where downtime (which depends on both the shutdown and the boot duration of the application) is acceptable and comes at no significant cost.
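As a minimal sketch, the strategy is declared directly on the Deployment object; the app name and image tag below are illustrative, not taken from the post:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  strategy:
    type: Recreate        # all v1 pods are terminated before any v2 pod starts
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:2.0.0   # bumping this tag triggers the recreate rollout
        ports:
        - containerPort: 8080
```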
Rolling (or Ramped) Deployment
Rolling deployments, also known as incremental deployments, are release patterns where the application is gradually deployed to the machines one at a time or in batches. App-V2 is slowly rolled out, replacing App-V1 instances one after the other, until all the V2 instances are rolled out and App-V1 is turned off. Ramped deployments are great when the new versions are backward compatible, both API- and data-wise.
The rolling deployment is the standard, default deployment strategy in Kubernetes. It works by slowly replacing pods of the previous version of your application with pods of the new version, one by one, without any cluster downtime. A rolling update waits for new pods to become ready (via your readiness probe) before it starts scaling down the old ones. If there is a problem, the rolling update can be aborted without bringing the whole cluster down.
The number of simultaneous deployment targets in a rolling deployment is referred to as the window size. A window size of one deploys to one target at a time, and that deployment must finish before another starts. Kubernetes Deployments support the rolling-update parameters below to tune the rollout speed and process:
- Max surge: how many instances to add in addition to the current amount.
- Max unavailable: how many instances may be unavailable during the rolling update procedure.
Traffic is gradually shifted to V2 until V2 is fully rolled out. Based on the maxUnavailable value you set, there is no downtime at any point, because at least one instance is always serving traffic. A sketch of these parameters follows.
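Here is a minimal sketch of those two parameters on a Deployment, assuming ten replicas and a /healthz readiness endpoint (both assumptions, not from the post):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2         # up to 12 pods may exist during the rollout
      maxUnavailable: 1   # at least 9 pods stay available at all times
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:2.0.0
        readinessProbe:   # old pods are scaled down only after new pods pass this
          httpGet:
            path: /healthz
            port: 8080
```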
Blue/Green (or Red/Black) Deployments
In a blue/green deployment strategy (sometimes referred to as red/black), the current version of the application (blue) and the new version (green) get deployed at the same time. While both are deployed, users only have access to blue, whereas green is available to your QA team for test automation on a separate service or via direct port-forwarding.
After testing that the new version meets the requirements, we update the Kubernetes Service object that plays the role of load balancer to send traffic to the new version by replacing the version label in the selector field.
Traffic is shifted to V2 all at once, the moment the traffic destination is switched, as sketched below.
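A minimal sketch of that switch, assuming both Deployments label their pods with a version label (the names and labels are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: v2.0.0   # previously v1.0.0; changing this one label moves all traffic
  ports:
  - port: 80
    targetPort: 8080
```

Rolling back is just as immediate: set the version label in the selector back to v1.0.0.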
Canary Deployments
Canary deployments are a bit like blue/green deployments, but they are more controlled and use a more progressive, phased-in delivery approach. A number of strategies fall under the umbrella of canary, including dark launches and A/B testing.
A canary is used when you want to test some new functionality, typically on the backend of your application. Traditionally you may have had two almost identical servers: one that serves all users, and another with the new features that is rolled out to a subset of users and then compared. When no errors are reported, the new version can gradually roll out to the rest of the infrastructure.
While this strategy can be done just using Kubernetes resources by replacing old and new pods, it is much more convenient and easier to implement this strategy with a service mesh like Istio.
As an example, you could have two different manifests checked into Git: the GA, tagged 0.1.0, and the canary, tagged 0.2.0. By altering the weights in the Istio VirtualService manifest, you manage the percentage of traffic that reaches each of the two deployments.
For a step-by-step tutorial on implementing canary deployments with Istio, see GitOps Workflows with Istio.
Traffic is gradually shifted, at an incremental rhythm, to V2 until V2 is fully rolled out and all the canary tests have passed, as sketched below.
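As a sketch of that weighted split (the host names, subset names, version labels, and weights are assumptions, not taken from the tutorial), the routing could be expressed with an Istio DestinationRule and VirtualService:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-app
spec:
  host: my-app
  subsets:
  - name: ga            # stable release, tagged 0.1.0
    labels:
      version: 0.1.0
  - name: canary        # candidate release, tagged 0.2.0
    labels:
      version: 0.2.0
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
  - my-app
  http:
  - route:
    - destination:
        host: my-app
        subset: ga
      weight: 90        # raise or lower these weights to control the canary's share
    - destination:
        host: my-app
        subset: canary
      weight: 10
```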
Canary deployments with Weaveworks Flagger:
A simple and effective way to manage a canary deployment is by using Weaveworks Flagger.
With Flagger, the promotion of canary deployments is automated. It uses Istio or App Mesh to route and shift traffic, and Prometheus metrics for canary analysis. Canary analysis can also be extended with webhooks for running acceptance tests, load tests or any other type of custom validation.
Flagger takes a Kubernetes deployment and optionally a horizontal pod autoscaler (HPA) to create a series of objects (Kubernetes deployments, ClusterIP services and Istio or App Mesh virtual services) to drive the canary analysis and promotion.
By implementing a control loop, Flagger gradually shifts traffic to the canary while measuring key performance indicators like HTTP request success rate, average request duration, and pod health. Based on the analysis of these KPIs, a canary is either promoted or aborted, and the analysis result is published to Slack. For an overview and demo, see Progressive Delivery for App Mesh.
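A sketch of a Flagger Canary resource wiring these pieces together; the target name, intervals, and thresholds are illustrative assumptions (see the Flagger docs for the authoritative schema):

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-app
spec:
  targetRef:                    # the Deployment Flagger will control
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  service:
    port: 80
  analysis:
    interval: 1m                # how often the KPIs are checked
    threshold: 5                # abort after 5 failed checks
    maxWeight: 50               # stop shifting at 50% canary traffic
    stepWeight: 10              # shift 10% more traffic per iteration
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99                 # abort if success rate drops below 99%
      interval: 1m
    - name: request-duration
      thresholdRange:
        max: 500                # abort if latency exceeds 500ms
      interval: 1m
```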
Dark deployments or A/B Deployments
A dark deployment is another variation on the canary (that incidentally can also be handled by Flagger). The difference between a dark deployment and a canary is that dark deployments deal with features in the front-end rather than the backend as is the case with canaries.
Another name for dark deployment is A/B testing. Rather than launch a new feature for all users, you can release it to a small set of users. The users are typically unaware they are being used as testers for the new feature, hence the term “dark” deployment.
A/B testing is really a technique for making business decisions based on statistics, rather than a deployment strategy. However, it is related and can be implemented using a canary deployment so we will briefly discuss it here.
With the use of feature toggles and other tools, you can monitor how your users interact with the new feature: whether it converts them, whether they find the new UI confusing, and other types of metrics.
Flagger and A/B deployments:
Besides weighted routing, Flagger can also route traffic to the canary based on HTTP match conditions. In an A/B testing scenario, you'll be using HTTP headers or cookies to target a certain segment of your users. This is particularly useful for front-end applications that require session affinity. Find out more in the Flagger docs.
In addition to distributing traffic amongst versions based on weight, you can precisely target a given pool of users based on a few parameters (cookie, user agent, etc.). This technique is widely used to test the conversion of a given feature and only roll out the version that converts the most.
Istio, like other service meshes, provides a finer-grained way to subdivide service instances with dynamic request routing based on weights and/or HTTP headers.
Version B is released only to the subset of users who match a specific condition, as sketched below.
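An illustrative sketch with Istio, assuming a DestinationRule that defines version-a and version-b subsets (the header match and all names are assumptions):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
  - my-app
  http:
  - match:
    - headers:
        user-agent:
          regex: ".*Mobile.*"   # users matching this condition get version B
    route:
    - destination:
        host: my-app
        subset: version-b
  - route:                      # everyone else keeps getting version A
    - destination:
        host: my-app
        subset: version-a
```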
Shadow
Shadowing is a deployment pattern where production traffic is asynchronously copied to a non-production service for testing. This is particularly useful for testing production load on a new feature. A full rollout of the application is triggered when stability and performance meet the requirements. App-V2 receives a copy of the real-world traffic alongside App-V1 and doesn’t impact the responses users see.
For Kubernetes, this requires Istio, which enables mirroring of traffic through the istio-ingressgateway.
Version B receives a copy of the real-world traffic alongside version A without impacting the response, as sketched below.
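A sketch of mirroring in an Istio VirtualService, assuming version-a and version-b subsets defined in a DestinationRule (the names and percentage are illustrative):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
  - my-app
  http:
  - route:
    - destination:
        host: my-app
        subset: version-a   # users' responses come only from version A
      weight: 100
    mirror:
      host: my-app
      subset: version-b     # version B receives a fire-and-forget copy
    mirrorPercentage:
      value: 100.0          # mirror all requests to the shadow
```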
Conclusion
There are different ways to deploy an application. When releasing to development/staging environments, a recreate or ramped deployment is usually a good choice. When it comes to production, a ramped or blue/green deployment is usually a good fit, but proper testing of the new platform is necessary. If you are not confident in the stability of the platform, or in what the impact of releasing a new software version could be, then a canary release should be the way to go: you let the consumer test the application and its integration with the platform. Last but not least, if your business requires testing a new feature amongst a specific pool of users (for example, all users accessing the application with a mobile phone are sent to version A, while all users accessing via desktop go to version B), then you may want to use the A/B testing technique, which, by using a Kubernetes service mesh or a custom server configuration, lets you target where a user should be routed depending on some parameters.
I hope this was useful, if you have any questions/feedback feel free to comment below.