Managing Failure in DevOps for SAP
Few of us like to fail. OK, there’s a certain subset of entrepreneurs who carry a copy of The Lean Startup in their back pocket at all times and are adamant that failure is the only way to learn (it’s character-building!), but most of us try to avoid it, usually for some pretty solid reasons.
The trouble is, there’s no getting away from the topic of failure when we talk about DevOps. It comes up pretty often. You may have heard one or more of these phrases, for example:
“Move fast and break things.” OK, I think we’re beyond this one. Even Facebook doesn't use it any more. But it does capture how some people still see modern software development.
“Fail fast; fix fast.” Hmmm. Less contentious, though possibly still scary. There are definitely plenty of people who see this outcome as part of DevOps. But do we want failure as a planned result?
“A culture of continuous experimentation and learning.” Ah. This one, taken from the ‘3 ways of DevOps’, is a bit of a trick. After all, how are we going to learn if we don’t fail sometimes? In which case, doesn’t continuous learning sound like we might be failing a lot? Well, not necessarily - see below.
I’m sure there are more in the same vein.
But more importantly, what does this attitude mean for DevOps in SAP? Can we afford to accept that we’ll get things wrong in mission-critical systems, when the business might stop operating if we do?
Well, obviously not. But I’m going to argue that’s the wrong way of thinking about failure in the first place. A DevOps mindset doesn’t mean we’re happy to break stuff all the time because we think velocity is more important than software quality. Rather it’s about how we balance our perception of risk against business value, how we manage risk in order to minimize it, and how we’re set up to respond if the worst does happen.
(I might also argue that there’s no real distinction here anyway. After all, a lot of (maybe most?) SAP deployments result in errors and incidents even where the term DevOps hasn’t been so much as whispered in a Basis manager’s ear. But that’s another story.)
So to put this another way, how can we move fast without breaking things, and fix them fast on the occasions when we do?
Here are a few recommendations.
1. Recognize that failure isn’t always technical
We begin with that DevOps mindset question. Talk to technical people like developers and solution architects, and when you say ‘failure’ the automatic assumption is that something technical has been broken. That’s fair enough, I guess, given that technical issues are what they spend much of their lives working on.
But failure in DevOps can be about outcomes, not just code. Failure might mean shipping a feature that didn’t work the way users expected. Or one that didn’t deliver the outcome they were looking for. Or one that just wasn’t engaging enough for them to use at all.
That’s where something like ‘fail fast; fix fast’ can be a positive statement of customer-focused intent rather than mere acceptance of risk. Contrast this with ‘move fast and break things’, for example. The ability to quickly address ‘failures’ in DevOps is what allows us to adopt that culture of continuous learning and experimentation - in SAP as much as anywhere else.
2. Break down your requirements
If you’ve adopted DevOps you should have this in mind anyway, but it’s fundamental to the idea of moving fast and safely. You’re setting yourself up for failure if you try to deliver more quickly and often but don’t adjust your approach to requirements accordingly. You’ll either fail to deliver on time (which makes the whole exercise rather pointless) or you’ll be stuck in the bad world of moving fast and breaking things.
This isn’t easy, especially when large chunks of functionality have to be delivered. Big requirements need to be split into smaller pieces of work, or ‘user stories’, that can each be delivered and tested on their own. Many organizations have adopted specialist backlog management tools like Atlassian Jira to help them manage these user stories in an effective and consistent manner. Such tools should be connected to your SAP DevOps automation to ensure SAP teams also get the benefit.
3. Make sure quality doesn’t wait until after development
I didn’t want to write ‘shift left’ again here because if you’re investigating DevOps you’ll see that phrase a lot (on this site as much as anywhere else). But to be fair, that’s because it’s an important part of a DevOps approach. If you can identify issues sooner you will not only be able to address them more quickly and easily, but you’ll also avoid the often significant delays created by repeated handoffs between Dev, Basis and QA teams. Plus you’ll stop distracting the QA team with avoidable issues, so they can focus on the important stuff!
This will mean making some changes to your process. Mandatory peer reviews are a very good start. You might want to consider Test Driven Development. Baking in some unit testing is probably a sound idea. The dev system is a good point to check for code quality and adherence to standards, too. And so on. The right SAP DevOps automation will enable you to do most of this automatically, and at least track the rest.
4. Be sure the right people are saying yes
The whole idea of shift left is arguably a bit pointless - and at the very least undermined - if the people making changes are able to approve them regardless of the results of whatever checks you build in. But it’s surprisingly common for deployments to QA to have significantly less rigor than those further down the line; developers approving their own code is hardly unheard of and it’s a risky approach, particularly if people feel under pressure to deliver fast in the new DevOps approach.
You need a way to ensure correct segregation of duties when it comes to approvals across your SAP landscape, including in development systems.
More than that though, you really want a process where the right individuals sign off every change at each stage according to project, functional area, SAP system, and so on. You might even have a specific process for changes flagged by your DevOps automation as risky. You almost certainly will for emergency changes (which also need to be safe and audited). For the best combination of speed and safety you should be able to build custom workflows for different types of change, which automatically assign the right approvers and notify them at the appropriate point.
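To make the idea of routed approvals a little more concrete, here is a minimal sketch in Python. It is not how any particular product works: the `Change` fields, the routing table and every user name are hypothetical, invented purely for illustration. It shows approvers being chosen by SAP system and functional area, a separate path for emergency changes, an extra reviewer for changes flagged as risky, and segregation of duties enforced by removing the author from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    change_id: str
    author: str
    system: str            # target system, e.g. "QA" or "PROD" (illustrative)
    functional_area: str   # e.g. "FI" or "SD" (illustrative)
    emergency: bool = False
    risky: bool = False


# Hypothetical routing table: (system, functional area) -> approvers.
APPROVERS = {
    ("QA", "FI"): ["alice", "bob"],
    ("PROD", "FI"): ["carol", "alice"],
}
EMERGENCY_APPROVERS = ["dave"]   # assumed dedicated emergency approver
RISK_REVIEWERS = ["erin"]        # assumed extra reviewer for risky changes


def required_approvers(change):
    """Return the people allowed to approve this change."""
    if change.emergency:
        pool = list(EMERGENCY_APPROVERS)
    else:
        pool = list(APPROVERS.get((change.system, change.functional_area), []))
    if change.risky:
        pool += [r for r in RISK_REVIEWERS if r not in pool]
    # Segregation of duties: nobody approves their own change.
    return [person for person in pool if person != change.author]


def can_approve(change, approver):
    """True only if the approver is on the routed list for this change."""
    return approver in required_approvers(change)
```

An empty list means nobody is currently allowed to approve, which in a real workflow would be escalated rather than waved through. A change written by `alice` for QA could only be approved by `bob`, for example, even though `alice` is normally an approver for that system.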
5. Stop making technical deployment errors!
Overwrites, overtakes, sequencing and dependencies. There’s a whole category of technical deployment-related issues that can stop Production systems from working, even if the code being delivered is gold standard. Which is why many firms spend so much time and effort trying (and usually failing, to some degree) to get them right.
But if you want to move at high speed you simply can’t afford to spend hours or even days poring over spreadsheets to build the right level of cutover confidence. This is the kind of thing that DevOps - and especially more advanced CI/CD pipelines - demands you automate. Customers using our DevOps automation have been known to completely eliminate unplanned downtime.
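As a rough illustration of the kind of check that can replace the spreadsheet, the Python sketch below flags one of these issues: an older change being imported after a newer one that touches the same object, so the older version ends up on top. The `Transport` structure, the `release_seq` field and the IDs are assumptions made for this example, not a real SAP or vendor API, and a real tool would work from actual transport and object metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transport:
    transport_id: str
    release_seq: int      # order of release from development (assumed field)
    objects: frozenset    # names of the objects the transport contains


def find_overtakes(import_order):
    """Find pairs where an older transport is imported after a newer one
    that shares at least one object with it.

    Returns a list of (older_id, newer_id, shared_object_names) tuples.
    """
    issues = []
    for i, first in enumerate(import_order):
        for second in import_order[i + 1:]:
            shared = first.objects & second.objects
            # 'first' goes in before 'second'; if 'first' is the newer
            # release, the older 'second' will overwrite the shared objects.
            if shared and first.release_seq > second.release_seq:
                issues.append(
                    (second.transport_id, first.transport_id, sorted(shared))
                )
    return issues
```

Running a check like this before every import, rather than once at cutover, is what turns sequencing from a manual review into an automated gate.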
6. Have an escape plan
Sometimes things just happen. People make mistakes. Production systems might have unique config. Stuff gets hard-coded, even when it shouldn’t. The most rigorous development and delivery process might still contain opportunities for things to go wrong, albeit far, far less often.
But if you’re deploying far, far more often, even that slight chance of failure per deployment adds up over time. That’s why Mean Time to Restore (MTTR) is a more recognized DevOps metric than Mean Time to Failure. If we’re going to move at pace we mustn’t be held back by the fear of unintended outcomes.
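To put a rough number on how a slight chance of failure adds up, assume (purely for illustration) that every deployment fails independently with the same small probability p. The chance of at least one failure across n deployments is then 1 - (1 - p)^n. The figures below are made up, and real deployments are neither independent nor equally risky, so treat this as a back-of-the-envelope sketch rather than a model.

```python
def chance_of_any_failure(p_per_deploy, deployments):
    """Probability of at least one failure in a number of deployments,
    assuming each one fails independently with the same probability."""
    return 1 - (1 - p_per_deploy) ** deployments


# Illustrative numbers only: a 1% chance of failure per deployment.
monthly_releases = chance_of_any_failure(0.01, 12)    # roughly 11% in a year
daily_releases = chance_of_any_failure(0.01, 250)     # roughly 92% in a year
```

With the same quality per deployment, moving from monthly to daily releases makes at least one failure in a year close to certain, which is why the speed of recovery matters so much.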
So how can your DevOps process accommodate failure when it can’t be avoided? You need the ability to roll back SAP production deployments quickly, similar to the way in which Git-based non-SAP pipelines would revert to the most recent good build.
Automation is the key to managing failure
If you’ve got this far, the role for automation in helping you is probably pretty clear. It’s fair to say that a manual DevOps pipeline would be exceptionally unusual for other software, and the requirement in SAP is no different. You just need the right SAP-specific automation, like ActiveControl from Basis Technologies. Our team of experts are ready to discuss how it could help you to manage risk, avoid failures of all kinds and deal with the ones that do occur, so why not get in touch?