How do you measure DevOps Success?

Any successful campaign to win people over to doing things in a DevOps style usually rests on a promise that it is a better way.  So how do you prove you are seeing benefits as you start to implement your plan?  How do you quantify all the unlogged hours you used to put in doing things manually, now that you have automated them?  Sure, you and maybe your boss will notice that you are down to 80 hours a week, but what about the people at the top, the ones who ultimately need to be convinced to let you keep spending time on it?

It's not that hard to show if you have a change and incident management process of some type.  ITIL, for instance, builds in some basic but useful stats you can track, such as the number of incidents over time or the number of incidents in a period after a change.  If you are not that formal, then hopefully you are still tracking issues, bugs, and trouble tickets.  If you aren't even doing that, then you need to establish some change management system.  It can be as simple as a bug tracker or emails to the people involved with the change.  I have also seen it done in some monster-sized spreadsheets.  If you don't think you need one, go back and read my previous post about the need for change management.
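To make the "incidents in a period after a change" stat concrete, here is a minimal sketch in Python. It assumes you can export change and incident timestamps from whatever tracker you use. The function name, the seven-day window, and the sample data are all made up for illustration.

```python
from datetime import datetime, timedelta


def incidents_after_change(changes, incidents, window_days=7):
    """Count the incidents opened within window_days after each change.

    changes:   dict mapping a change ID to the datetime it was applied
    incidents: list of datetimes at which incidents were opened
    """
    window = timedelta(days=window_days)
    counts = {}
    for change_id, changed_at in changes.items():
        counts[change_id] = sum(
            1
            for opened_at in incidents
            if changed_at <= opened_at <= changed_at + window
        )
    return counts


# Hypothetical export from a ticketing system.
changes = {
    "CHG-1": datetime(2024, 1, 1),
    "CHG-2": datetime(2024, 2, 1),
}
incidents = [
    datetime(2024, 1, 2),
    datetime(2024, 1, 5),
    datetime(2024, 1, 20),
    datetime(2024, 2, 3),
]

print(incidents_after_change(changes, incidents))
# {'CHG-1': 2, 'CHG-2': 1}
```

Tracked month over month, a falling count per change is the kind of number the people at the top can read at a glance.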

You don't always get results from the first couple of things you change; it depends on where you start.  One of my favorite things to automate first is the account creation process.  I just hate having to log in to more than a system or two to add a person, but that time is hard to show when it's just part of your routine.  A change management ticketing system makes it visible.  Without automation, the ticket shows that it took you two days to add two people to the servers alongside your other work.  With automation I normally spend only a few minutes adding users, so the time the ticket is open drops to less than a day.  If you do, say, ten user changes a week, you will have some serious time savings to show.  Even one or two changes can be an incredible time saver.
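The ticket comparison above can be turned into a number with a few lines of Python. This is only a sketch: it assumes each ticket gives you an opened and a closed timestamp, and the dates below are invented to roughly match the "two days" versus "less than a day" example.

```python
from datetime import datetime
from statistics import mean


def mean_hours_open(tickets):
    """Average time a ticket stayed open, in hours.

    tickets: list of (opened, closed) datetime pairs
    """
    return mean(
        (closed - opened).total_seconds() / 3600 for opened, closed in tickets
    )


# Hypothetical account-creation tickets handled by hand.
manual = [
    (datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 6, 9, 0)),     # 48 hours
    (datetime(2024, 3, 11, 9, 0), datetime(2024, 3, 13, 15, 0)),  # 54 hours
]
# Hypothetical tickets for the same task after automating it.
automated = [
    (datetime(2024, 4, 1, 9, 0), datetime(2024, 4, 1, 13, 0)),    # 4 hours
    (datetime(2024, 4, 8, 9, 0), datetime(2024, 4, 8, 11, 0)),    # 2 hours
]

before = mean_hours_open(manual)
after = mean_hours_open(automated)
print(f"before: {before} h, after: {after} h, saved per ticket: {before - after} h")
# before: 51.0 h, after: 3.0 h, saved per ticket: 48.0 h
```

Keep in mind that this measures how long the ticket sat open, not the hands-on time, so it shows turnaround for the requester as much as effort saved for you.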

The other big way change management can show your leaders that they are getting what they paid for from your automation is by tracking change-related incidents.  This plays a lot more into the "we are only human" problem that automation can help you solve.  A simple configuration step that gets forgotten may not show itself right away.  For example, not resetting the Java heap size properly causes your company's website to crash at lunch every day.  It's a simple fix and restart, but it impacts a lot of people.  Adding lines to an automated script that apply the setting, so the problem cannot happen again, will increase customer confidence.  Most importantly, it will let you have an uninterrupted lunch!
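As a rough idea of what those added lines might look like, here is a small Python sketch that forces the JVM maximum heap flag (-Xmx) to a known value in a string of Java options. The function name and the 2g value are assumptions for illustration only; where your options actually live and what size you need depend on your setup.

```python
def ensure_max_heap(java_opts, max_heap="2g"):
    """Return java_opts with exactly one -Xmx flag, set to max_heap."""
    # Drop any existing -Xmx flag, whatever value it carries.
    kept = [opt for opt in java_opts.split() if not opt.startswith("-Xmx")]
    # Append the value we want every deployment to end up with.
    kept.append(f"-Xmx{max_heap}")
    return " ".join(kept)


# Hypothetical options left behind by a manual change.
print(ensure_max_heap("-Xms512m -Xmx256m -server"))
# -Xms512m -server -Xmx2g
```

Running it a second time on its own output changes nothing, which is what you want from a step that runs on every deployment.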