You’re a budding new startup that’s joining the technology scene in 2020 (or frankly, in the last 5 years) and you’re architecting your CI pipeline. You’re looking to move fast, and you want to go with what’s familiar. You work with someone who’s a Bash ninja, a Makefile enthusiast, and a connoisseur of Groovy, so Jenkins seems like the logical choice, but I’m here to tell you it doesn’t have to be that way. Here are ten reasons why the DevOps engineers pitching Jenkins are buying themselves job security, and why they don’t want you to know there are better solutions to your problems. Don’t fall into the ‘everyone uses Jenkins’ trap. Most people use Jenkins, and everyone hates it.
For all intents and purposes, the Jenkins UI hasn’t changed in over a decade. Look at screenshots of Jenkins, Hudson, CloudBees, or whatever variant of Jenkins you’ve used. There’s the Blue Ocean project, but it’s in a constant state of limbo: half the features of classic Jenkins haven’t been ported over in the past 3 years, so you’re continuously going back and forth between the two UIs.
Gone are the vestiges of a “full stack” engineer. Engineers want to write either frontend code or backend code. With Jenkins, you’re forced to learn another language that’s specific to your CI system and can’t be run locally. Every Jenkins tutorial begins with determining whether the examples are written in declarative pipeline syntax or scripted pipeline syntax. After you’ve found the appropriate example, go to your favorite pipeline and paste your Groovy into the replay section over and over until you get a passing run. Success? Copy and paste your replay into a PR, and don’t bother with linting, because it’s a monumental effort.
So you’ve doubled your engineering headcount, your Jenkins queue fills up, and your engineers are clamoring that their PRs are taking too long to run. You’re on AWS and you start googling how to spin up new workers dynamically, and lo and behold you stumble upon plugins such as the Swarm plugin and the EC2 plugin. The EC2 plugin is relatively easy to install, and you provide it with credentials so it can create new workers. Everything seems fine on day 0: engineers are happy with how the workers are spinning up and down, and the queue is humming. In fact, you set the plugin up to use spot instances, so you’re saving the company money on cloud costs for Jenkins workers. A week goes by and everyone is more than pleased with your efforts. Suddenly, you’re intermittently losing connectivity to the workers the EC2 plugin provisioned, and they’re not reconnecting. You restart Jenkins and it doesn’t recall which workers it provisioned, so now you have orphaned EC2 instances in your account that you have to clean up manually. The spot prices on your instances have changed and your spot requests are no longer being fulfilled. This plugin hasn’t changed in 5 years, you remark.
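To make the orphan cleanup less manual, here is a minimal sketch of the comparison you end up doing by hand: the instances your cloud account reports versus the workers Jenkins still knows about. The record shape, the `role` tag, and the 15 minute grace period are all assumptions made up for illustration; fetching the instance list and the node list from AWS and Jenkins is left out.

```python
from datetime import datetime, timedelta, timezone


def find_orphaned_workers(instances, known_nodes, now, grace=timedelta(minutes=15)):
    """Return IDs of running worker instances that Jenkins no longer tracks.

    instances:   iterable of dicts with hypothetical keys "id", "state",
                 "tags" (dict) and "launched" (timezone-aware datetime).
    known_nodes: set of instance IDs the Jenkins master still lists as workers.
    grace:       instances younger than this are skipped, since they may
                 still be registering with the master (assumed value).
    """
    orphans = []
    for inst in instances:
        if inst["tags"].get("role") != "jenkins-worker":  # hypothetical tag
            continue
        if inst["state"] != "running":
            continue
        if now - inst["launched"] < grace:
            continue
        if inst["id"] not in known_nodes:
            orphans.append(inst["id"])
    return sorted(orphans)


now = datetime(2020, 6, 1, 12, 0, tzinfo=timezone.utc)
instances = [
    {"id": "i-aaa", "state": "running", "tags": {"role": "jenkins-worker"},
     "launched": now - timedelta(hours=3)},
    {"id": "i-bbb", "state": "running", "tags": {"role": "jenkins-worker"},
     "launched": now - timedelta(hours=2)},
    {"id": "i-ccc", "state": "running", "tags": {"role": "jenkins-worker"},
     "launched": now - timedelta(minutes=5)},  # too young to judge
    {"id": "i-ddd", "state": "running", "tags": {"role": "web"},
     "launched": now - timedelta(days=1)},  # not a Jenkins worker
]

orphans = find_orphaned_workers(instances, known_nodes={"i-aaa"}, now=now)
print(orphans)  # ['i-bbb']
```

The grace period matters: without it, a worker that just booted and hasn’t registered yet would be flagged and terminated by whatever acts on this list.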
So you want to back up your Jenkins configuration because you’re a responsible DevOps engineer. You begin researching the topic only to realize that, in 2020, no one has established best practices in this area. There’s a hodgepodge of plugins out there that seem to accomplish the task, but they aren’t well maintained, lack many basic features, and don’t back up to the cloud. You create a versioned S3 bucket to periodically push to and enable EBS snapshotting in case all else fails. Isolating which change hosed your Jenkins installation is still a stab in the dark.
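If you end up rolling your own backup, the core of it is roughly the sketch below: collect the XML configuration under your Jenkins home directory into a tarball and skip the bulky, reproducible directories. The list of directories to skip is an assumption you should adjust for your own installation, and the upload to the versioned S3 bucket is deliberately left out.

```python
import tarfile
from pathlib import Path

# Assumption: these directories hold build output and checkouts, not config.
# Caveat: a job literally named "builds" or "workspace" would be skipped too.
SKIP_DIRS = {"workspace", "builds"}


def config_files(jenkins_home):
    """Yield (absolute, relative) paths of XML config files under jenkins_home."""
    home = Path(jenkins_home)
    for path in sorted(home.rglob("*.xml")):
        relative = path.relative_to(home)
        # Only the parent directories are checked, not the file name itself.
        if path.is_file() and SKIP_DIRS.isdisjoint(relative.parts[:-1]):
            yield path, relative


def backup_config(jenkins_home, archive_path):
    """Write a gzip-compressed tarball of config files; return the archived names."""
    names = []
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for path, relative in config_files(jenkins_home):
            tar.add(str(path), arcname=relative.as_posix())
            names.append(relative.as_posix())
    return names


if __name__ == "__main__":
    import tempfile

    # Build a tiny fake Jenkins home so the sketch can run anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp) / "jenkins_home"
        for name in ("config.xml", "jobs/app/config.xml", "jobs/app/builds/1/build.xml"):
            target = home / name
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text("<xml/>")
        print(backup_config(home, Path(tmp) / "jenkins-config.tar.gz"))
```

Because the archive only contains small text files, keeping every version in the bucket is cheap, and diffing two archives at least narrows down which config changed between a working and a broken Jenkins.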
Every Jenkins user wants to install or update their favorite plugin because a new feature came out that they must have. A plugin upgrade scares DevOps engineers because it’s a single point of failure: there isn’t a simple way of reverting a plugin that has a large dependency graph, and plugins are infrequently upgraded as it is. The process entails restarting Jenkins during off-peak times, snapshotting your Jenkins volumes to allow easy reverts, and spot checking a sample of Jenkins pipelines to confirm that things work as expected. The next day, you wake up to Slack messages from engineers who are unable to run their tests after a pipeline plugin upgrade, because you didn’t examine the changelogs for all ten of its dependencies. You haven’t lived until you’ve manually uploaded an HPI to pin a plugin version that’s compatible with your master. Eventually, the only upgrade strategy that seems reasonable is spinning up a new master and having your developers migrate over at their leisure.
You work in an engineering organization of 100 engineers, and your Jenkins goes down. You scramble to restore a snapshot to the EC2 instance where your Jenkins master runs. Your postmortem action items involve investigating how to run a Jenkins master in high availability mode. You dive deep into the topic only to realize that the architecture of Jenkins allows just a single master, requiring you to duct tape solutions together, such as a standby failover Jenkins backed by an EFS volume. Trying this solution results in slow git clones for your monorepo as you realize EFS isn’t as performant as you expected, and you delve into running a git reverse proxy that creates git reference clones in your EFS volume to speed up the process. This seems like too much work, so you begin investigating potential enterprise solutions and realize they’re splitting masters by team or running hot-warm, all of which are essentially a band-aid over the problem.
Tuning your JVM and Speeding up your Instance
Occasionally, your Jenkins master starts slowing down because of the volume of builds that you’re running. Engineers report that pages are loading slowly, regular actions are taking several seconds, and webhook events are delayed. You begin by looking at the usual suspects (utilization, saturation, and errors) and don’t observe anything out of the ordinary for a Java application. Diving deep into the rabbit hole of Java, you discover that as your Jenkins master grows, it’s imperative to tune its JVM flags, adjust the heap size, tune the pipeline durability settings, and switch over to a provisioned IOPS SSD volume.
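As a rough illustration of the heap sizing part, the sketch below derives a set of JVM options from the memory on the host. The flags themselves are standard HotSpot options, but the 50% heap fraction and the choice of G1 are assumptions for the example, not a recommendation from the Jenkins project; the right values depend on your build volume and what else runs on the box.

```python
def jenkins_java_opts(host_memory_mb, heap_fraction=0.5):
    """Build a list of JVM options for a Jenkins master.

    host_memory_mb: total memory on the host, in megabytes.
    heap_fraction:  share of host memory given to the heap (assumed default).
    """
    if not 0 < heap_fraction < 1:
        raise ValueError("heap_fraction must be between 0 and 1")
    heap_mb = int(host_memory_mb * heap_fraction)
    return [
        f"-Xms{heap_mb}m",  # initial heap equals max heap to avoid resizing
        f"-Xmx{heap_mb}m",
        "-XX:+UseG1GC",
        "-XX:+HeapDumpOnOutOfMemoryError",
    ]


opts = jenkins_java_opts(16384)
print(" ".join(opts))
# -Xms8192m -Xmx8192m -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError
```

Leaving the other half of memory outside the heap is intentional in this sketch: the JVM needs room for metaspace and thread stacks, and the operating system needs page cache for all the disk I/O Jenkins does.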
There’s a faction of engineers that think placing something in Kubernetes will make it more robust or easier to maintain. “Oh, there’s a Helm chart for Jenkins” is a refrain I’ve often heard. At first glance, running Docker in Docker seems like a completely feasible solution, but when you’re bringing up containers that hang or refuse to be killed, your kubelet will quickly become unresponsive. When your kubelet dies in the middle of the day and the EBS volume won’t reattach to the new Jenkins master pod, you’ll have wasted an hour of engineering time before realizing the pod came up in a different AZ than your EBS volume.
Emulating your CI
Reproducing your Jenkins CI setup is a near impossible feat. You’ll often find yourself trying to determine whether you have an actual bug, as opposed to a Groovy script error or some disparity with your AMI. Unless you’re running everything in Docker, there’s no “runner” that lets you emulate the steps your Jenkins pipeline takes on your local box.
Changes across Multiple Projects
Let’s assume that your DevOps team didn’t have the foresight to create shared Jenkins pipelines before you began all your projects. At a certain point, several projects will consist of nearly identical boilerplate, kept in sync through tons of copy and paste. Eventually, you’ll run into a problem where you have to change one line in multiple Jenkinsfiles. You’ll embark upon the task of reducing this code sprawl, and realize every file has its own snowflake changes around the place where this line is used. Shepherding this change has gone from a simple find and replace to essentially writing a poor man’s syntactic parser. With other CI solutions that use YAML or JSON, the same change can be made with an off-the-shelf parser.
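To show why structured configuration makes this kind of change easier, here is a small sketch that bumps one setting across several projects using nothing but a JSON parser. The config schema (`steps`, `name`, `image`, `run`) is made up for illustration and doesn’t correspond to any particular CI tool.

```python
import json


def set_step_image(config_text, step_name, new_image):
    """Return (updated_config_text, number_of_steps_changed).

    Only steps whose "name" matches step_name are touched; every other
    field, including per-project snowflake settings, is left as parsed.
    """
    config = json.loads(config_text)
    changed = 0
    for step in config.get("steps", []):
        if step.get("name") == step_name and step.get("image") != new_image:
            step["image"] = new_image
            changed += 1
    return json.dumps(config, indent=2), changed


# Hypothetical per-project configs; "web" has a snowflake test command.
projects = {
    "api": '{"steps": [{"name": "test", "image": "python:3.7", "run": "pytest"}]}',
    "web": '{"steps": [{"name": "lint", "image": "node:12", "run": "npm run lint"},'
           ' {"name": "test", "image": "python:3.7", "run": "pytest -x"}]}',
}

for name, text in projects.items():
    updated, changed = set_step_image(text, "test", "python:3.8")
    projects[name] = updated
    print(name, changed)
```

The snowflake differences (the extra `lint` step, the `-x` flag) survive untouched because the edit targets a field in a data structure rather than a pattern in free-form Groovy.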
You deserve better: there are multiple open source projects such as Buildkite, Drone, Concourse, Jenkins X, or GoCD. These tools bring a modern approach to CI by offering a declarative syntax, sane storage options (by storing your configuration and runs in a database), the ability to reproduce your CI runs locally, plugins for common CI scenarios, horizontal worker scaling, and frameworks that let you add any in-house abstractions you may require. If you’re not interested in rolling your own, there are SaaS solutions such as CircleCI, Travis CI, Harness, GitLab CI, and TeamCity, and cloud platform solutions like Cloud Build. All offer more fine-grained control of your pipeline with easier configuration and more maintainable platforms.