DevOps is a topic very near and dear to me. Its something that I helped organizations with a lot as an App Modernization Consultant in the Cognizant Microsoft Business Group. However, I find it ubiquitous that DevOps is misunderstood or misrepresented to organizations.
What is DevOps?
In the most simplest sense, DevOps is a culture that is focused on collaboration that aims to maximize team affinity and organizational productivity. While part of its adherence is the adoption of tools that allow team to effectively scale, at its core its a cultural shift to remove team silos and emphasize free and clear communication. One could argue, as Gene Kim does in his book The Phoenix Project, that the full realization of DevOps is the abolishment of IT departments; instead IT is seen as a resource embedded in each department.
From a more complex perspective, the tenants of DevOps mirror the tenants of Agile and focus on small iterations allowing organizations (and the teams within) to adjust more easily to changing circumstances. These tenants (like with Agile) are rooted in Lean Management which was borne out of the Toyota Production System (TPS) (link) which revolutionized manufacturing and allowed Toyota to keep pace with GM and Ford, despite the later two being much larger.
The Three Ways
DevOps culture carries forth from TPS the three ways which govern how work flows through the system, how it is evaluated for quality, and how observations upon that work inform future decisions and planning. For those familiar with Agile, this should, again, sound familiar – DevOps and Agile share many similarities in terms of doctrine. A great book for understanding The Three Ways (also authored by Gene Kim) is The DevOps Handbook.
The First Way
The First Way focuses on maximizing the left to right flow of work. For engineers this would be the flow of a chance from conception to production. The critical idea to this Way is the notion of small batches. We want teams to consistently and quickly sending work through flows and to production (or some higher environment as quickly as possible). Perhaps contrary to established thought, the First Way stresses that the faster a team moves the higher their quality.
Consider, if a team works on a large amount of changes (100), does testing, and then ultimately deploys the change the testing and validation is spread out across these 100 changes. Not only are teams at the mercy of a quality process which must be impossibly strict, if a problem does occur, the team must sort out WHICH of the 100 changes caused the problem. Further, the sheer size of the deployment would likely make rollback very difficult, if not impossible. Thus, the team may also be contending with downtime or limited options to ensure the problem does not introduce bad data.
Now consider, if that same team deployed 2 changes. The QA team can focus on a very narrow set of testing and if something goes wrong, diagnosing is much easier given the smaller size. Further, the changes could likely be backed out (or turned off) to prevent the introduction of bad data into the system.
There is a non-linear relationship between the size of the change and the potential risk of integrating the change – when you go from a ten-line code change to a one-hundred-line code change, the risk of something going wrong is more than 10x higher, and so forthRandy Shoup, DevOps Manager, Google
Smaller batch sizes can help your teams move faster and get their work in front of stakeholders more efficiently and quickly, doing so induces better communication between the team and their users, which ultimately help each side get what they want out of the process. There is nothing worse than going off in a corner for 4 months, building something, and having it fall short of the needs of the business.
The Second Way
Moving fast is great but is only part of the equation. Like so much in DevOps (and Agile) the core learnings are defined in a way that is supplementary to each other. The Second Way emphasizes the need for fast feedback cycles or more directly, is aimed at ensuring that the speed is supported by automated and frequent quality checks.
The Second Way is often tied to a concept in DevOps called shift-left, shown by the graph visual below:
It is not uncommon for organizations embracing a siloed approach to Quality Assurance to start QA near the end of a cycle, generally to ensure they can validate the complete picture. While this makes sense, it value is misplaced. I would ask anyone who has built or tested software how often this process ends up being a bottleneck (reasons be damned) in delivery? If you are like most clients I have worked with, the answer is Yes and Always.
The truth is, such a model does not work if we want teams to move with speed and quality. Shift Left therefore emphasizes that it is the people at the LEFT who need to do the testing (in the case of engineering that would be the developers). The goal is to discovered a problem as quickly as possible so that it can be corrected building on the ubiquitous understanding that the earlier a problem is found the cheaper it is to fix.
To but it bluntly, teams cannot make changes to systems, teams, or anything is there is not a sense of validation to know what they did worked. For engineering, we can only know something is working if we can test that is working, hence the common rule for high-performing teams that no problem should ever occur twice.
I cannot overstate how important these feedback cycles are, especially in terms of automation. Especially in engineering, giving developers confidence that IF they make a mistake (and they will) it will get caught before it gets to production is HUGE. Without this confidence, the value provided by The First Way will be limited.
And equally critical to creating the cycles is UNDERSTANDING the means of testing and what use case is best tested by what. Here is an image of the Testing Pyramid which I commonly use with clients when explaining feedback cycles for Engineering.
For those wondering where manual testing goes – it is at the very top and has the fewest number. Manual tests should be transitioned to an automated tool.
A final point I want to share here, DevOps considers QA as a strategic resource NOT a tactical one. That is, high functioning teams do NOT expect QA persons to do the testing, these individual are expected to ORGANIZE the testing. From this standpoint, they would plan out what tests are needed and ensure the testing is happening. In some cases, they may be called on to educate developers on what tests fit certain use cases. Too often, I have seen teams view QA as the person who must do the testing – this is false and only encourages bottlenecking. Shift-left is very clear that DEVELOPERS need to do the majority of testing since they are closer to a given change than QA.
The Third Way
No methodology is without fault and it would be folly to believe there is a prescriptive approach to anything that fits any team. Thus, The Third Way stresses that we should use metrics to learn about and modify our process. This includes how we work as well as how our systems work. The aim is to create a generative culture that is constantly accepting of new ideas and seeks to improve itself. Teams embracing this Way apply the scientific method to any change in process and work to build high-trust. Any failure is seen, not as a time to blame or extort, but rather to learn and evolve.
“The only sustainable competitive advantage is an organization’s ability to learn faster than the competition”Peter Senge – Founder of the Society for Organizational Learning
For any organization the most valuable asset is the knowledge of their employees for only through this knowledge can improvements be made that enable their products to continue to produce value for customers. Put another way:
Agility is not free. It’s cost is the continual investment to ensure teams can maintain velocity. I have seen software engineering department leads ask, over and over, why is the team not hitting their pre-determined velocity. Putting aside the fallacy of telling a team what speed they should work at, velocity is not free. If I own a sports car, but I perform not maintenance on it, soon it will drive the same as a typical consumer sedan. Why?
“.. in the absence of improvements, processes do NOT stay the same. Due to chaos and entropy, processes actually degrade over time”Mike Rother – Toyota Kata
No organization, least of all engineering, can hope to achieve its goals if it does not continually invest in the process of reaching those goals. Teams which do not perform maintenance on themselves are destined to fail and, depending on the gravity of the failure, the organization could lose more than just money.
In Scrum, teams will use the Sprint Retrospective to call to attention things which should be stopped, started, and continued as a way to ensure they are continually enhancing their process. However, too often, I have seen these same teams shy away from ensuring, in each sprint, there is time taken to remove technical debt or add some automation, usually because they must hit a target velocity or a certain feature. This completely gets away from the spirit of Agile and DevOps.
Its about culture
Hopefully, despite my occasional references to engineering, you can understand that The Three Ways are about culture and about embracing many lessons learned from manufacturing about how to effectively move work through flows. DevOps is an extremely deep topic that, regrettably, often gets boiled down to a somewhat simplistic question of “Do you have automated builds?”. And yes, automation is a key to embracing DevOps but less important than establishing the cultural norms to support it. Simply having an automated build is indifferent if all work must be pass through central figures or certain work is given into silos where the timeline is no longer the teams.
The topic of DevOps is well covered, especially if you are a fan of Gene Kim. I recommend these books to help understand DevOps culture better – I list them in order of quality:
- The DevOps Handbook (Gene Kim) – Amazon
- The Phoenix Project (Gene Kim et al) – Amazon
- Effective DevOps (Davis and Daniels) – Amazon
- The Unicorn Project (Gene Kim) – Amazon
- Accelerate (Forsgren et al) – Amazon
Thank you for reading