Part 5: Docker Swarm with HAProxy for highly available services

Ensuring your services are up and running when building monoliths is usually straightforward. If you keep the server(s) on which the monolith is deployed running, all should be well. Moving to a microservice architecture, however, quickly reveals the flaws in this approach. You need a platform onto which you can deploy a microservice and let the platform's internal intelligence handle high availability and scaling.

Docker Swarm vs Kubernetes vs (X)

We are not oblivious to the fact that the majority of DevOps engineers have, by now, gathered in the town square, clenching their pitchforks and chanting all kinds of ancient spells after reading the heading. But as with every choice we make, many factors must be considered. Chief among them was 'time to production.' We needed a solution through which developers, who at the time were on very tight timelines, could deploy microservices in the form of standalone jars to a production environment. And we needed to do this fast. The simplicity with which we could deploy a Docker Swarm to serve this need fit our client's pressured timelines. Given more time, we will probably transition to Kubernetes, but at the time of writing, we have not encountered a workload that Docker Swarm could not manage.

Simplicity gives you speed

As mentioned, setting up a Docker Swarm cluster is not difficult, and an abundance of how-tos available online will get you up and running in no time. In this instance, we installed a six-node cluster: three manager nodes and three worker nodes. Three managers are needed to maintain a Raft quorum and avoid split-brain situations: with three, the cluster can lose one manager and still have a majority of two, so three is the minimum sensible manager count. Worker nodes, by contrast, can be scaled to however many we want and need; for this project, no more than three were required. We also run a Keepalived instance in MASTER-BACKUP-BACKUP mode across the three manager nodes. This gives us a single floating IP that GitLab can connect to in order to deploy the services.
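As a rough sketch of this setup, the bootstrap commands and the manager-side Keepalived configuration look something like the following. The addresses, interface name, and password are placeholders, not our actual values:

    # on the first manager node
    docker swarm init --advertise-addr 10.0.0.11
    # print the join commands to run on the remaining managers and workers
    docker swarm join-token manager
    docker swarm join-token worker

    # /etc/keepalived/keepalived.conf on each manager (sketch)
    vrrp_instance SWARM_MANAGERS {
        state MASTER              # BACKUP on the other two managers
        interface eth0            # placeholder interface name
        virtual_router_id 51
        priority 150              # lower values (e.g. 100, 50) on the backups
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass s3cret      # placeholder
        }
        virtual_ipaddress {
            10.0.0.100/24         # the floating IP that GitLab targets
        }
    }

Whichever manager holds the highest priority and is alive claims the floating IP; if it goes down, VRRP moves the IP to the next manager, so GitLab never needs to know which node is currently in charge.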

The type of services deployed

For this project, the customer needed to deploy consumer and producer applications for the Kafka environment. These were merely Java jars that needed a place to run; in other words, they were not exposing any ports on which they would serve data to clients. The jars were compiled on predefined Docker containers running specific Maven and Java versions. Our tests showed that we could reduce the deployment time to about three minutes by caching the Maven artefacts, measured from the moment code is pushed to GitLab to the moment Docker Swarm registers the service as deployed. This is a significant improvement over previous deployment strategies for this particular customer.
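As an illustration of the caching trick, GitLab CI can keep the local Maven repository between pipeline runs along these lines. The image tag and cache key here are placeholders, and MAVEN_OPTS is Maven's standard mechanism for relocating its local repository into the project directory so GitLab can cache it:

    # .gitlab-ci.yml (sketch)
    variables:
      # put the local Maven repo inside the project dir so it is cacheable
      MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"

    cache:
      key: "$CI_PROJECT_NAME-maven"
      paths:
        - .m2/repository

    build-jar:
      stage: build
      image: maven:3.8-openjdk-11   # placeholder; we pin specific Maven/Java versions
      script:
        - mvn -B package

With a warm cache, the job skips re-downloading dependencies, which is where most of the saved minutes come from.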

What about a service exposing ports and data?

Once you start deploying services like a web server, a Tomcat server, or any API service exposed on a certain port, you need a point of entry into the Docker Swarm environment through which your clients can reach the service. The best way to do this is with an ingress load balancer. At HBPS, we are very familiar with HAProxy as a load balancer. It has served us well in the past, so naturally, our go-to load balancer in this case was also HAProxy. There is an excellent write-up of the deployment strategies for injecting HAProxy into a Docker Swarm environment. After reading through the documentation, we decided that the best choice for this customer was to run HAProxy on a single node (at a time), to reduce the amount of east-west traffic it would be hammered with in the other deployment models.

HAProxy was set up to use the Docker DNS service for service discovery and to route incoming traffic to the relevant backend based on the incoming request headers.

At the time of writing, we were working on a strategy and deployment to employ the HAProxy Data Plane API to make changes to the HAProxy backends. We were also working on getting Keepalived to work with the HAProxy container. The idea behind the latter is that a Keepalived check script verifies whether the HAProxy container is running on the worker node: if it is, Keepalived attaches the floating IP to that node; if not, it enters the BACKUP state and leaves well enough alone.
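A minimal sketch of the DNS-based discovery piece follows, assuming a hypothetical Swarm service named my-api listening on port 8080 and a hostname of api.example.com; 127.0.0.11 is Docker's embedded DNS resolver inside the container:

    # haproxy.cfg (sketch)
    resolvers docker
        nameserver dns1 127.0.0.11:53    # Docker's embedded DNS
        resolve_retries 3
        timeout resolve 1s
        timeout retry   1s
        hold valid      10s

    frontend http_in
        bind *:80
        # route on the Host header of the incoming request
        acl host_api hdr(host) -i api.example.com
        use_backend be_my_api if host_api

    backend be_my_api
        balance roundrobin
        # pre-allocate five server slots and fill them via DNS lookups
        server-template api- 5 my-api:8080 check resolvers docker init-addr libc,none

For the Keepalived side, the check we were working towards amounts to something like the script below, referenced from a vrrp_script block (the script path is hypothetical):

    vrrp_script chk_haproxy {
        script "/usr/local/bin/check_haproxy.sh"   # exits 0 only if HAProxy runs here
        interval 2
        fall 2
    }

    #!/bin/sh
    # /usr/local/bin/check_haproxy.sh (hypothetical)
    docker ps --filter name=haproxy --filter status=running --quiet | grep -q .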

Putting this whole process together

The workflow is quite simple. A developer pushes code to their producer/consumer project in GitLab. The push triggers a rebuild of the Docker image containing their jar and, once the build completes successfully, the GitLab pipeline deploys it to the Docker Swarm cluster. A script written by HBPS then takes the name of the newly deployed service and feeds it into another pipeline that updates the HAProxy configuration. As mentioned earlier, this last part was still in development at the time of writing.
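Sketched out, with the registry path, service name, addresses, and credentials all placeholders, the deploy stage looks roughly like this; the final command shows the general shape of an HAProxy Data Plane API request rather than our finished implementation:

    # deploy job in .gitlab-ci.yml (sketch)
    deploy:
      stage: deploy
      script:
        # point the Docker CLI at the Keepalived floating IP of the managers
        - export DOCKER_HOST=tcp://10.0.0.100:2376
        # roll the service to the freshly built image (create it on first deploy)
        - docker service update --with-registry-auth --image registry.example.com/team/consumer:$CI_COMMIT_SHA consumer
        # hypothetical follow-up pipeline step: register the service as a
        # server in an HAProxy backend through the Data Plane API
        - >
          curl -u admin:password -X POST
          -H "Content-Type: application/json"
          -d '{"name": "consumer1", "address": "consumer", "port": 8080}'
          "http://10.0.0.100:5555/v2/services/haproxy/configuration/servers?backend=be_my_api&version=1"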

A note on monitoring/management of Docker Swarm

During our research, we came across a very powerful piece of software called Portainer. It can manage Docker Swarm clusters, standalone Docker instances, and Kubernetes clusters; in fact, it can also create Kubernetes clusters on your behalf, which we are aiming to try soon. We got in contact with the Portainer sales team and were impressed by both their open-source approach and the pricing of the pro version of the software. We will be engaging with them on future projects where we deploy Docker Swarm.

Conclusion

We decided to use Docker Swarm mainly due to time constraints, but we don't regret it. More work goes into service discovery with Docker Swarm, but we felt it was still worth it. We retain complete control over the environment and were able to deliver a production platform in a matter of days.
