Most applications nowadays are scalable, distributed, and deployed to some cloud infrastructure. Updates to these applications are also pushed to production at a frenetic rate.

Most upgrades need to have these characteristics:
— Zero downtime of the production environment.
— Ability to roll back the changes (if required).

In this post we will look into a few such strategies:
— Rolling deployment
— Blue-Green deployment
— Canary deployment
— Shadow deployment

1.0 Rolling deployment

  • It is a phased deployment, where the new version of the application is deployed on the production servers one at a time, in a rolling fashion.
  • So for some time, user traffic is served by both the new and the old version of the code, which co-exist in the live production environment.
  • Pros
    — Zero downtime
    — Easy to set up, with no additional infrastructure cost
  • Cons
    — No way to control how user traffic is split between old and new nodes
    — API/code needs to be backward compatible (especially database schema changes), as both versions of the code must be able to work in production at the same time.
    — Rollback is slow, as the nodes have to be reverted one at a time.
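
The rolling procedure described above can be sketched in a few lines of Python. This is only a minimal simulation under stated assumptions: `Node`, `rolling_deploy` and the `is_healthy` callback are hypothetical names made up for illustration, and setting `node.version` stands in for whatever the real deploy step is.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Node:
    """Hypothetical production server running one version of the app."""
    name: str
    version: str


def rolling_deploy(
    nodes: List[Node],
    new_version: str,
    is_healthy: Callable[[Node], bool],
) -> bool:
    """Upgrade nodes one at a time; revert the upgraded nodes if a check fails."""
    upgraded: List[Tuple[Node, str]] = []  # (node, previous version), in order
    for node in nodes:
        previous = node.version
        node.version = new_version  # stand-in for the real deploy step
        upgraded.append((node, previous))
        # While the loop runs, old and new versions serve traffic side by side.
        if not is_healthy(node):
            # Rollback also happens node by node, which is why it is slow.
            for done, old_version in reversed(upgraded):
                done.version = old_version
            return False
    return True
```

For example, with three nodes on "v1", calling `rolling_deploy(nodes, "v2", check)` leaves every node on "v2" if `check` passes each time, and puts every node back on "v1" as soon as one check fails.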

2.0 Blue-Green deployment

  • A complete new copy of the production environment is created, and the new application version is deployed on it.
  • Both the new and the old system continue to run in parallel, with all user traffic still going to the old application.
  • When the application team has confidence in the stability of the new environment, the load balancer/router switches all user traffic to the new application.
  • Both applications may use the same database backend, or use a replicated database, so that both environments are always in sync.
  • Pros
    — Production serves only one active version of the application at a time.
    — Rollback can be done instantly by routing all user traffic back to the older application.
  • Cons
    — Infrastructure cost (as a complete replica of the production environment needs to be set up).
    — More operational overhead, as two environments need to be monitored for some time.
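
To make the switch-over concrete, here is a small sketch of the routing logic, assuming a single router that knows about two environments. The class `BlueGreenRouter` and its method names are hypothetical and only illustrate the idea; a real setup would do this in the load balancer or DNS configuration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BlueGreenRouter:
    """Hypothetical router that sends all traffic to exactly one environment."""
    environments: Dict[str, str]  # environment name -> deployed version
    active: str = "blue"

    def idle(self) -> str:
        """Name of the environment that currently receives no user traffic."""
        return "green" if self.active == "blue" else "blue"

    def deploy_to_idle(self, version: str) -> None:
        """Install a new version without touching live traffic."""
        self.environments[self.idle()] = version

    def switch(self) -> None:
        """Flip all traffic to the other environment (go-live or rollback)."""
        self.active = self.idle()

    def route(self, request: str) -> str:
        """Describe where a request is served."""
        version = self.environments[self.active]
        return f"{request} -> {self.active} ({version})"
```

A release is then `deploy_to_idle("v2")` followed by `switch()`, and a rollback is just one more `switch()`, which is why it is instant compared to a rolling deployment.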

3.0 Canary deployment

  • It is similar to blue-green deployment, with the difference that only a small set of users is switched over to the new application at first.
  • Instead of making the new application version completely live, a subset of users is gradually moved to the new application.
  • Users to be routed to the new application can be chosen
    — at random, e.g. route 5% of user traffic to the new app
    — based on geographical location or IP address range
    — or maybe released only to a set of internal users, etc.
  • Only when the subset of users has tested the new release version sufficiently is the application made live for everyone.
  • Pros & Cons: almost the same as blue-green deployment, with the added con of a very slow upgrade to the new application version (time to go live).
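
One possible way to pick the canary users is sketched below. It assumes each request carries a user id, and it hashes that id into a bucket so the same user always lands on the same version. The function names, the `internal_users` parameter and the choice of SHA-256 are assumptions made for this illustration, not part of any particular router.

```python
import hashlib
from typing import AbstractSet


def canary_bucket(user_id: str) -> int:
    """Map a user id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def choose_version(
    user_id: str,
    canary_percent: int,
    internal_users: AbstractSet[str] = frozenset(),
) -> str:
    """Return "new" for canary users and "old" for everyone else."""
    if user_id in internal_users:
        return "new"  # internal users always get the new release
    # Users whose bucket falls below the threshold form the canary group.
    return "new" if canary_bucket(user_id) < canary_percent else "old"
```

Raising `canary_percent` step by step (say 5, then 25, then 100) moves more users over without moving anyone back, because a user's bucket never changes.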

Since the main idea of canary deployment is to expose only a subset of users to the new application, it does not always need to be done by creating a completely new environment.

Canary deployment can also be done “instance-based”, where the new version of the app is deployed on selected application instances, and all requests from the subset of users are redirected to these new instances.

The diagram below depicts another way we can do canary deployments.

The term “canary deployment” comes from an old coal mining technique.

Coal mines often contained dangerous gases like carbon monoxide, which are hard for humans to detect and can harm the miners. Canaries were found to be more sensitive to these gases and a more effective indicator, as they showed more visible signs of distress.

Miners would therefore take canaries into the mine as early detectors to ensure the miners’ safety.

4.0 Shadow deployment

  • In shadow deployment, a new shadow environment is created for the new version of the application (just like in blue-green deployment).
  • However, the entire user traffic is routed to both the new and the old application: every incoming request to the older application is forked (copied) and also sent to the newer application.
  • Pros
    — can do performance testing of the new application under actual production load
    — can delay the actual rollout of the new application till it is fully stable
  • Cons
    — infrastructure cost increases
    — complex to set up
    — have to avoid side effects, e.g. if a payment request is routed to both applications, the payment should not be deducted twice.
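
The forking of requests and the side-effect problem can be sketched as follows. This is a simplified, synchronous illustration: it assumes that only the old application's response goes back to the user, and that the new application honours a `dry_run` flag to skip side effects. The names `make_shadowing_handler`, `make_payment_handler` and `dry_run` are hypothetical, and a real setup would more likely mirror traffic asynchronously at the proxy level.

```python
from typing import Any, Callable, Dict, List, Tuple

Request = Dict[str, Any]
Response = Dict[str, Any]
Handler = Callable[[Request], Response]


def make_shadowing_handler(
    primary: Handler,
    shadow: Handler,
    mismatches: List[Tuple[Request, Response, Any]],
) -> Handler:
    """Serve from `primary` and send a copy of each request to `shadow`."""
    def handle(request: Request) -> Response:
        response = primary(request)
        try:
            # The copy is flagged so the shadow app can skip side effects.
            shadow_response = shadow({**request, "dry_run": True})
        except Exception as exc:
            # A failing shadow must never affect the real user.
            mismatches.append((request, response, exc))
        else:
            if shadow_response != response:
                mismatches.append((request, response, shadow_response))
        return response  # the user only ever sees the primary's answer
    return handle


def make_payment_handler(ledger: List[int]) -> Handler:
    """Toy payment app: records a charge unless the request is a dry run."""
    def handler(request: Request) -> Response:
        if not request.get("dry_run", False):
            ledger.append(request["amount"])  # the side effect to protect
        return {"status": "ok", "charged": request["amount"]}
    return handler
```

With an old and a new payment handler wired together this way, a request for an amount of 10 is charged once in the old application's ledger and never in the new one, while any difference in the two responses ends up in `mismatches` for later inspection.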