This is part 4 of a series of installments of our white paper: How to create an Automate-first culture without losing control. Missed part one, two and three? You can read part one here, part two here, and part three here. We will be releasing the entire paper in installments on the blog - but if you want to read the full piece - you can download it here.
In the last blog, we concluded with the question: how do you accelerate the change management process in line with the modern digital economy and its demand for new features? In this final installment of the series, we explore the answer to this question, and then outline the key questions that listeners asked during the session.
Part 4: How do you leverage automation to accelerate change management?
‘Data is your buddy’
Data is how you make smoother changes to your processes. The performance of automation should be measured in a quantifiable way. For example, are you achieving the throughput that you want? This gives you an objective viewpoint on whether or not the introduction of automation has been successful.
The first step is identifying the objectives of your initiative, before then identifying the results and thresholds you need to see to confirm success or failure:
- You want to make sure it works as it did previously (unless you were specifically looking for change)
- You want to make sure that new features and functionality work as expected. To verify this, you need to be able to monitor them, and test them as required.
Once you have the data in place, and you can measure it in real time, you can demonstrate whether you meet expectations and degrade only within acceptable thresholds. This shortens Change Advisory Board (CAB) reviews and minimizes the number of people who need to be involved. It is definitive: you can now approve or reject based on data and, crucially, without discussion. Moreover, if something does fail, the conversation becomes even more valuable because it involves an evaluation of both business and technology. Conversations look like this: ‘Okay this is lower by X - what do we want to do?’, ‘Does the value of functionality exceed the pain?’, ‘Let’s go forward as long as this is fixed in 48hrs, which is better than going back to the drawing board’.
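As a rough illustration of approving or rejecting on data, the sketch below compares measured values against a baseline and an agreed tolerance. The metric names, baselines, and percentages are invented for the example, and a real gate would read them from your own monitoring rather than from hard-coded values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Acceptable degradation for one metric (all names are illustrative)."""
    metric: str
    baseline: float       # value measured before the change
    max_drop_pct: float   # how far below baseline is still acceptable


def evaluate_change(thresholds, observed):
    """Return a decision ("approve" or "reject") and the list of breaches."""
    breaches = []
    for t in thresholds:
        value = observed[t.metric]
        drop_pct = (t.baseline - value) * 100 / t.baseline
        if drop_pct > t.max_drop_pct:
            breaches.append((t.metric, round(drop_pct, 1)))
    return ("reject" if breaches else "approve"), breaches


# Hypothetical thresholds agreed before the change, and values measured after it.
thresholds = [
    Threshold("orders_per_minute", baseline=1000.0, max_drop_pct=5.0),
    Threshold("checkout_success_rate", baseline=0.99, max_drop_pct=1.0),
]
observed = {"orders_per_minute": 920.0, "checkout_success_rate": 0.989}

decision, breaches = evaluate_change(thresholds, observed)
print(decision, breaches)
```

A "reject" here does not have to end the conversation. The list of breaches is exactly the ‘this is lower by X’ input that the business and technology discussion needs.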
Don’t bend tools out of shape
Many organizations trip up by attempting to bend one of their many tools to perform in a way that is beyond (or completely different to) what it was designed to do. It’s extremely tempting to extend the usable life of a product that is specialized, well-used, well-liked, and well-understood. Sometimes you can get away with it, but often you can’t, and it can cause complications and risk. For effective automation, you need to ensure you get the right tools for the right job.
This is where we go back to the goal state of the ‘orchestrator of orchestrators’ which aligns and works across existing toolsets. As we mentioned previously, it avoids the need for ‘rip and replace’ automation, and by bringing in a solution for cross-tool automation and orchestration, it also prevents the tendency to bend existing tools out of shape to force coordination or collaboration.
The session concluded with the following questions on automation:
Q: What is legacy?
A: A lot of companies that start today have no requirement for a data center or comms room. They’re building directly into the cloud, not running their own infrastructure. When the term ‘legacy’ is used, it typically describes companies that have a pre-existing set of technology that they’re running themselves, where the responsibility for care and due process is on them. The maintenance processes for cloud-native services are usually administered by the provider, e.g. AWS, Azure, etc.
If you are running a hybrid environment, it’s likely you have some legacy that needs consideration. That said, there is a notion that cloud-native or startups don’t have any legacy, and this is simply not universally true. While it is the case that they will have significantly less legacy infrastructure, depending on how they grow, the pace of acquisition, and how they maintain their environment, there is still a high possibility of acquiring legacy IT (whether on-premises or in the cloud) that poses challenges and demands attention.
Q: How do we make sure we’re automating the right piece/place of the change process, rather than the low-hanging fruit, or the ‘fun to do’ pieces?
A: It’s very tempting to pick the ‘easy bit’ when it comes to automation. In our experience, we’ve certainly seen this a number of times: people pick a bottleneck, automate one part of it and move the bottleneck on, or they choose the ‘fun bit’. There’s nothing wrong with doing it as a start, and the practice of starting small rather than going into a massive exercise requiring significant documentation is very sensible.
Ultimately, you can’t pick and choose for too long in isolation, as it gets you into a tight spot. This is also where teams are likely to use tools beyond their capabilities because it’s easy, and begin the process of stretching solutions out of shape because it’s convenient. It’s easy to start with what you promise yourself is a ‘prototype’ and quickly lead yourself down a blind alley. Again, data is your friend in terms of knowing where the right place to go is. Make sure you have a full view of your environment, interdependencies, bottlenecks, and an understanding of your ‘brittle’ areas, gather your requirements and practice change management based on this data as outlined above. Keep the question of ‘why are we automating?’ at the forefront:
- If the answer is to remove human error, it’s important to understand your history quantitatively and qualitatively: where does the process break down? Where are you getting the errors/issues? Then, count how many fewer errors you get as you automate and, if you’re not seeing the reduction you need, it’s time to reassess.
- If the answer is to reduce time, it’s important that you know how long each step of the process is taking. Use your data to decide where to automate/augment your process to reduce time. Then, once you’ve implemented it, time it, to make sure it’s effective.
- If the answer is to reduce costs, don’t forget that this is also related to time. Here, Jim gives us an anecdote as an example: He was working for a bank, having come in as Head of Run Ops. He was told that they had test scripts for all of their apps. He thought at first that he was extremely lucky - but then they gave him a 600-page book, a physical script. This meant that the test cycle was incredibly expensive: it would either take one person six months, or they could have a team of 50 people poking and prodding the system.
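The measurements described in the points above can be sketched in a few lines. The step names, timings, and error counts below are invented for illustration. The idea is simply to rank steps by how long they take, and then to check the reduction after automating.

```python
from statistics import mean

# Hypothetical timings in minutes for each step of a change process,
# one entry per past run.
step_durations = {
    "raise_request": [10, 12, 9],
    "manual_regression_test": [240, 300, 270],
    "approval": [60, 45, 75],
    "deploy": [30, 25, 35],
}


def rank_steps_by_mean_duration(durations):
    """Return (step, mean minutes) pairs, slowest step first."""
    averages = {step: mean(runs) for step, runs in durations.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


def percent_reduction(before, after):
    """Percentage by which a measurement fell after automating."""
    return (before - after) * 100 / before


ranking = rank_steps_by_mean_duration(step_durations)
slowest_step, slowest_mean = ranking[0]
print(f"Slowest step: {slowest_step} ({slowest_mean} min on average)")

# After automating, measure again and compare (figures are assumptions).
time_saved_pct = percent_reduction(before=slowest_mean, after=54)
errors_saved_pct = percent_reduction(before=40, after=10)
print(f"Time reduced by {time_saved_pct}%, errors reduced by {errors_saved_pct}%")
```

If the measured reduction falls short of the target you set at the start, that is the signal to reassess where you are automating.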
When it comes to achieving all three benefits, it’s about understanding where we are lining up resources across both human and machine elements. If you do this effectively, execution time comes down, and errors, costs, and duration reduce - but in order to both execute and validate this, getting the metrics is key.
Q: How do you get visibility?
A: It is really important to have a UI for the business. The alternative is looking at a developer’s green screen to see what’s coming down the line and, as you can imagine, this is far from ideal from a visibility or practical standpoint, nor is it scalable.
With a collaborative, comprehensive, intelligible UI, automation is no longer relegated to the technical corners of the business, and it can readily involve other teams. For example, take a release process: you can clearly build and see when you need to involve Marketing, Client-facing teams, or Support, and what’s required from each team. Add in-application communications, so that Marketing can send a Slack message when their comms are ready, or Support are notified via email about knowledge base articles, and automation is now running the collaborative process. Without visualization in a UI, it’s easy to miss the context or the importance of the interdependencies, for example if an update is just a point in time in an instant message, or if the end-to-end process lives in five or six tools. Having a full overview of the whole piece is invaluable to visibility, as you can now isolate and identify what’s not going well, and aggregate quantitatively when things are taking longer than expected.
Q: You talked about automating for unplanned events, but what about ‘planned unplanned events’ like Incident Management or Disaster Recovery?
A: Disaster Recovery should be approached as a planned event - you just don’t know the start date. For most organizations, the decision to move to the DR site is the last resort, as teams are worried about the risk. To remedy this, practice and familiarity with both your plan and the automation involved are essential. If you get to the point where you’re able to do it regularly and confidently, it becomes the first port of call, and gives you time to look and see what’s happened.
For incident management there are two types of incident you must consider:
- An incident where you know what’s gone wrong. For example, you get a notification that a server is down, or a system is dead. These are the incidents you can automate fully. You just need to have the building blocks: a process, a plan, and a run-through of your steps.
- The other half is when you have no idea what’s causing the issue. It might be intermittent, or perhaps there are lots of alerts, and you can’t figure out what the trigger point is. This is when you need a human orchestrator who leverages the support of automation. For example, an app is made up of X number of components - health checking all of them should be automated. You could use people for this, but you won’t get the same rapid response or efficiency, as it takes more time to organize and mobilize people. The problem with a manual, people-first approach is that in a world where there is the expectation for full uptime, every minute of downtime is a minute where your customer could be going to your competitor. As a result, you no longer have the luxury of a ‘cold start’ for incidents.
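As a minimal sketch of automated health checking across components, the example below runs every check in parallel and reports which components are unhealthy. The component names and check functions are placeholders. In practice, each check would call a real endpoint or probe.

```python
from concurrent.futures import ThreadPoolExecutor


def check_all(components):
    """Run every health check in parallel; a check that raises counts as unhealthy."""
    def run(item):
        name, check = item
        try:
            return name, bool(check())
        except Exception:
            return name, False

    with ThreadPoolExecutor(max_workers=len(components) or 1) as pool:
        return dict(pool.map(run, components.items()))


def _queue_check():
    # Placeholder for a probe that fails, e.g. a timed-out connection.
    raise TimeoutError("queue did not respond")


# Hypothetical components of one application and their checks.
components = {
    "web": lambda: True,
    "database": lambda: True,
    "queue": _queue_check,
}

results = check_all(components)
unhealthy = sorted(name for name, ok in results.items() if not ok)
print(unhealthy)
```

The human orchestrator then starts from a list of failing components rather than from a ‘cold start’, and can spend their time on diagnosis instead of mobilizing people to check each part by hand.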
Do you need an ‘orchestrator of your orchestration tools’?
If you’re looking to build effective automation into your existing resilience and release programs, Cutover can help extend the value of the tools you’re already using - enabling collaboration, execution, and orchestration across tools, teams, and timelines. See our automation runbooks in action, by scheduling a demo with our team.