Backing out of a release

Nobody likes to delay a scheduled release. However, risk management is a big part of release management, and sometimes it just makes sense to abort and reschedule. When deploying applications to a production environment, introducing unknowns can potentially have widespread impact. In some cases, the negative impact can far outweigh the benefits of troubleshooting and tweaking to push an application update out.

Following are some tips for helping to mitigate the possible negative consequences of a failed production deployment. This posting primarily applies to deploying web based applications to a production server environment hosting multiple applications.

Use a maintenance window – A key step in mitigating risk and minimizing user impact is establishing a “maintenance window”. Identify when server traffic is typically low, and block out a time frame for deploying applications or application updates. Regardless of whether the build/deployment/release is automated or not, it’s useful to have a set time when changes to the system occur. A release window adds a certain level of objectivity and predictability to system changes. It offers a boundary to risk exposure in that risk is minimized by literally limiting the scope of the time you’re systems are being changed. An additional benefit is the ability to notify your users that this time is when system updates occur, and they needn’t worry if they experience issues while attempting to use online resources. Depending on your user base, this can be helpful in reducing unnecessary trouble tickets or bug reports.

Ideally, most everything would be automated, happen almost instantly, and you’d have elegant, intelligent maintenance pages/handlers to gracefully redirect users to a “this application is being updated” page, but that’s a topic for another posting. Then again, ideally, app deployments would just auto-magically happen all by themselves, all the dependencies would always be in place, and config files would never mysteriously have the QA values still in them (“I could have sworn I ran the ANT task to update that file..”). Wouldn’t that be cool?

OK, so you’ve got a maintenance window – now what? How to decide when to investigate, troubleshoot, and make the darn application work right whether it wants to or not? When to say “Uncle”, back out of the release, and hang your head in shame?

Estimate time to complete – Accurately estimating how long a particular task or automated job takes to complete is essential to managing smooth application deployments. Deciding when to back out all comes down to accurate estimates of how long tasks take to complete. Consider the following when coming up with estimates: how long does the actual deployment take if nothing goes awry? how long would it take to double check some basic dependencies or configuration files? How long would it take to back out and undo all changes and restore the system to it’s pre-release state (worst case scenario)?

The rest is simple – if you’re deep the in release window and things are bumpy, ask yourself, “Can I still backout of this before the window closes?”. This is your “point of no return”. If the answer is yes, keep troubleshooting and get things sorted out. If it’s a close call, take your medicine and back out.

If you have somehow managed to get past the point of no return, well, you better make sure you get everything running ok (without introducing new problems!) before the window closes, or else you might as well just start updating your resume.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: