Our Ruby on Rails deployment journey

Written by Antoine | Feb 22, 2022 6:46:00 AM

While there are plenty of blog posts and technical articles explaining how to deploy a Rails application onto any number of different combinations of platforms, we found limited advice on why to choose one approach over another. This post focuses on the high-level considerations that guided the evolution of our tooling and environment, highlighting the benefits and limitations we experienced as we progressed.

(Also note that the considerations in this post apply equally to applications written in other languages and frameworks.)

 

 Classic deployment

The most common type of deployment, which we refer to as “classic”, consists of copying the application code onto a bare-metal server or VM, then running it via the operating system's service manager. The steps involved typically look like this on a Linux-based server:

  • connect to the server via SSH
  • pull the latest code from the Git repository
  • run bundle install to install/update gems
  • run the rails db:migrate task to update the database schema
  • define or update a systemd service that runs rails server
  • restart the service to stop the previous rails server process and start a new one running the latest code

These steps are usually automated with tools like Capistrano or Mina, with the help of Foreman or Procsd to manage systemd services for each component of the application (web server, background job processor, message queue listener, etc.).
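For illustration, here is a minimal sketch of what such an automated deployment could look like with Mina, based on its standard Rails template; the hostname, repository, paths and service name are placeholders, and the exact tasks vary per project:

    # config/deploy.rb (placeholder values throughout)
    require 'mina/rails'
    require 'mina/git'

    set :application_name, 'myapp'
    set :domain, 'app1.example.com'           # server reached over SSH
    set :deploy_to, '/var/www/myapp'
    set :repository, 'git@github.com:example/myapp.git'
    set :branch, 'main'

    task :deploy do
      deploy do
        invoke :'git:clone'                   # pull the latest code
        invoke :'deploy:link_shared_paths'
        invoke :'bundle:install'              # install/update gems
        invoke :'rails:db_migrate'            # update the database schema
        invoke :'rails:assets_precompile'
        invoke :'deploy:cleanup'

        on :launch do
          # restart the systemd service so the new code is served
          command 'sudo systemctl restart myapp-web.service'
        end
      end
    end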

This server needs to have the correct version of Ruby installed, as well as any dependencies such as the MySQL client library, OpenSSL, etc.

Once deployed, the main concern of running a production application is maintaining reliability and performance. The most straightforward way to address both is to run multiple instances of the application across several servers, with a load balancer set up in front to spread incoming requests between them.

Both Capistrano (natively) and Mina (via a plugin) accordingly allow deploying to multiple servers at once. Since this can extend the time it takes to complete all deployment steps, they also provide locking mechanisms to prevent concurrent deployments from conflicting.
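With Capistrano, for instance, adding servers is mostly a matter of listing them in the relevant stage file; a sketch with placeholder hostnames and roles:

    # config/deploy/production.rb (hostnames and roles are placeholders)
    server 'web1.example.com',    user: 'deploy', roles: %w[app web]
    server 'web2.example.com',    user: 'deploy', roles: %w[app web]
    server 'worker1.example.com', user: 'deploy', roles: %w[app sidekiq]

Each cap production deploy run then executes the deployment steps on every listed server, with roles controlling which tasks run where.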

This “classic” deployment is fast and straightforward, and it got us a long way during the first few years of building Streem, using mina tasks triggered by engineers directly from their machines. However, it does have a number of limitations that created growing friction as our systems and team expanded.

 Challenges

1. Consistency

Having a set of specific servers onto which the application is deployed means the OS, runtime and dependencies on each of them need to be kept in sync to avoid any “drift” that could cause different behaviour on one server but not the others.

Even with the help of tools such as Ansible and a team effort to keep all our applications running on the same latest Ruby version, this can become a burden with a non-trivial number of servers.

Additionally, as the application codebase grows larger over time, stopping the previous version and starting the new one under load can take several seconds, during which multiple servers, or even processes on the same server, might run different versions of the application. The load balancer has limited visibility into the state of the rollout when deciding where to route incoming requests, as it relies on a simple health check every few seconds.

Making sure all code changes are backward-compatible and configuring a web server like Puma with phased restarts can help mitigate this, but we still experienced the occasional dropped request, as well as gem-loading issues due to memory shared between processes.
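For reference, a cluster-mode Puma configuration that allows phased restarts could look like the sketch below; the worker and thread counts are illustrative only, and note that phased restarts require cluster mode and cannot be combined with preload_app!:

    # config/puma.rb (illustrative values)
    workers 4                 # cluster mode is required for phased restarts
    threads 5, 5

    # No preload_app! here: during a phased restart workers are replaced one
    # at a time, and each new worker must load the freshly deployed code itself.
    prune_bundler             # lets new workers pick up an updated bundle

    state_path 'tmp/pids/puma.state'
    activate_control_app      # enables pumactl to talk to the running server

A phased restart can then be triggered with bundle exec pumactl -S tmp/pids/puma.state phased-restart, or by sending SIGUSR1 to the Puma master process.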

2. Isolation and sizing

Another issue is the difficulty of sizing servers to balance performance and stability against costs, given the difference in resources used by each application component at different times.

For example, the load on our background job processor, Sidekiq, is highly variable because content is ingested in irregular batches, while user requests to the web server instances increase steadily during working hours, and our message queue listeners are always busy but consume very little CPU.

Setting up a dedicated VM for each component makes maintenance and deployment very tedious, while squeezing everything onto one big VM puts all of them at risk whenever an issue arises (a misbehaving process, a broken dependency update, etc.). Auto-scaling VMs is possible but difficult to set up and limited in its ability to respond to rapid change.

3. Flexibility

Last but not least, one of the key elements of our goal to move fast is the ability for engineers to easily deploy previews of any work in progress for product managers and stakeholders to test. This requires keeping even more servers ready for engineers to deploy to without conflicts, which is another drain on maintenance time and cost.

Solutions

Thankfully the last decade has seen new approaches to solve these challenges mature and become more widely available to engineering teams.

Containerization

First, the advent of software containers, where the OS, runtime libraries and application code are all bundled into one image, solves the consistency challenge by making sure the exact same code and dependencies run across any number of servers.

Docker is the most popular tool for this. The Dockerfile that defines the image configuration is itself versioned alongside the application to make sure every change is recorded and reviewed.

Images are stored in a registry, from which they can be pulled by deployment servers that use them to start containers running the application processes. They can also be used for development, to ensure consistency across engineers' machines.

Orchestration

Deploying containers to run and expose multiple isolated processes, with appropriate resources and dynamic scaling in response to usage, is the responsibility of an orchestrator.

Kubernetes has become the de facto standard for orchestration thanks to the support of all major cloud providers. We use it via the fantastic managed Google Kubernetes Engine (GKE). Kubernetes intelligently schedules containers depending on the capacity of the available VMs, and can also automatically provision or delete VMs as application processes are resized or temporary deployments come and go.

Processes for each component of our backend application are started from the same Docker image, with fine-grained memory and CPU allocations and smart autoscaling behaviour, such as scaling Sidekiq workers by the number of jobs in their queues thanks to kube-sidekiq-autoscale.
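As a toy illustration of the idea (not the actual kube-sidekiq-autoscale implementation), the metric driving this kind of autoscaling can be derived from Sidekiq's own API; the target of roughly 100 jobs per worker and the bounds below are made-up numbers:

    # Derive a desired replica count from the number of enqueued Sidekiq jobs.
    require 'sidekiq/api'

    enqueued = Sidekiq::Stats.new.enqueued
    desired  = (enqueued / 100.0).ceil.clamp(1, 20)   # aim for ~100 jobs per worker
    puts "would scale the sidekiq deployment to #{desired} replicas"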

Advanced load-balancing capabilities make it easy to run multiple replicas of the same process in parallel for different purposes (web, mobile, admin, etc.), providing extra flexibility and visibility, or to run different code branches and environments for previewing changes.

Example of backend component processes running on GKE

 

Modern semi-automated deployment

Whenever a change is merged into our code repository, Cloud Build prepares a new Docker image and pushes it to Container Registry.

As we appreciated the flexibility of manual deployment, we stuck with our old friend Mina and extended it by building the mina-kubernetes plugin, which wraps the krane gem, itself calling the official Kubernetes CLI kubectl under the hood.

Krane lets us define Kubernetes resources using .yml.erb templates that receive variables such as the location of our Docker repository, which Docker image to use, the Rails environment and the Rails credentials key. It then deploys them in a controlled manner to a given namespace on the destination Kubernetes cluster.

Kubernetes resources for production deployment

Definition of the web component deployment

mina-kubernetes provides simple tasks such as mina kubernetes:deploy, which pushes all the resources to a given Kubernetes cluster. It also makes it easy to deploy the image built from a given branch onto a dynamically created namespace for preview testing.
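The end result is similar to rendering the ERB templates with krane and handing them to krane deploy, which drives kubectl and waits for the rollout to complete. A simplified illustration of that pipeline (not the plugin's actual internals; the project, namespace and context names are placeholders):

    # Render the .yml.erb templates with the chosen image, then deploy them.
    image = "gcr.io/our-project/backend:#{ENV.fetch('REVISION', 'latest')}"

    system(
      "krane render -f config/deploy/production --bindings=image=#{image} | " \
      "krane deploy production gke_our-project_australia-southeast1_our-cluster --filenames -"
    ) or abort('Deployment failed')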

 

Closing notes

Throughout the years we’ve also considered other solutions such as Heroku, AWS Elastic Beanstalk and GCP App Engine, but found the trade-offs between cost, flexibility, complexity and proprietary lock-in to be in favour of Kubernetes.

We’ve come across multiple claims that Kubernetes is a very complex system that is overkill for startups, but since most of that complexity is abstracted away by Google Kubernetes Engine, we’ve been able to leverage many of its benefits with less work than the classic deployment approach required.

While it’s not perfect and always a work in progress (one notable improvement would be to fully automate the deployment pipeline so that no human intervention is required after a code merge), we find this deployment process convenient and flexible, which is what matters most to us as we deploy code changes multiple times a day to the different applications that make up Streem’s backend.

Interested in joining the Streem team? Check out our Careers page.