Anders Bengtsson has done it again and written another stunning blog post about validating your runbook designs.
This ties in nicely with a discussion I was having with a customer yesterday, in which I stole a comment made at the Best of MMS 2011 UK event. There, Adam Hall (I'm sure it was Adam; if not, I apologise to whoever it was) coined a phrase to the effect that a day of design roughly equates to an hour in the runbook designer.
The point I was trying to get across to the customer is that setting up SCORCH (or, to some extent, Opalis) is relatively simple, and you can dive into the console and knock up runbooks very quickly too. Even so, it's very important to take a step back and map out your process fully first, exploring all angles. Only then should you create the runbook from that plan and, as Anders says, build it out with resilience, checking and logging.
Back to the initial point...
Anders now has a trilogy of posts that I would really recommend reading. Even if you feel really confident creating runbooks, I'd bet you don't use half of these methods as fully as Anders recommends.
Post 1: Fault Tolerance in Runbooks
Post 2: Building a log for Runbooks
Post 3: Validate your Runbook Design (He also includes a runbook which automates some checking!)