Sunday, December 23, 2007

Deployment: December 23, 2007

The last deployment was a success. Well… sort of. We added some new functionality that pulled additional information from a back-end system. Unfortunately, that system wasn’t able to handle the extra load. And because of the Christmas season, all of the systems were being taxed more heavily than usual, so there was a danger that this extra load would be enough to start causing crashes.

So it was decided to do an “emergency deployment” and remove the functionality. The back-end system would be going through a change the following month that would allow it to handle the load, so the next version of my app would have the functionality reinstated.

It sucks that we had to do a deployment on the weekend before Christmas—when I was supposed to be on holidays—but that’s the way it goes, sometimes.

1:30AM: We all logged onto the conference bridge, and confirmed we were ready to go.

1:35–1:40: There were no database changes, this time around, so all we had to do was redeploy the application itself. We did so.

1:40–1:50: The application was back up, and we did a quick Sanity Test to ensure it was working. The Sanity Test passed.

1:50–2:05: The client did their Landing Tests. Again, the tests passed.

Sunday, December 2, 2007

Deployment: December 2, 2007

Finally. We finally got this thing deployed. After all of the false starts and deferrals, it seems anti-climactic to have such a short post for this release, but the fact is, when we finally got a chance to deploy this thing, it went off without a hitch.

1:30AM: We all logged onto the conference bridge, and confirmed we were ready to go.

1:35: We shut down the two applications that we had to deploy for this release: the “front-end” app and the “back-end” app.

1:40: We backed up the database for the front-end app, and began the deployment of the back-end app. I can’t stress enough the importance of having a good, solid deployment plan, so that you can execute tasks in parallel like this, and not worry about losing track of who’s doing what!

1:45: The backup finished for the database, so our Database Analyst (DBA) began executing the new database scripts.

1:45: As the DBA executed the DB scripts, the back-end app was taking a bit longer than expected to come back up.

1:50: The back-end app came back up, and the DBA finished executing the scripts. We did our Sanity Test for the back-end app.

2:00: Sanity tests for the back-end app passed. We now began the deployment of the front-end app. (Because it depends on the back-end app, we had to ensure that the back-end app was up and running properly, before bothering to deploy the new version of the front-end app.)

2:05: The front-end app finished deploying, and we began our Sanity testing. At this point, we were about an hour ahead of schedule.

2:20: Sanity testing finished. We now got the clients to begin their Landing tests. We actually had to call some people, and get them to join the bridge early, since we were still ahead of schedule.

2:20–5:00: We performed Landing tests. We turned up two defects, but they were deemed minor enough that we could leave the release in, until a fix could be found.

Saturday, December 1, 2007

Deployment: December 1, 2007

The investigations into the back-end system have been completed, and they believe the problems were caused by the hardware load balancer for the back-end system. They’re making the change on the morning of December 1st, which means that we’re being deferred yet again.

Assuming that all goes well with the changes to the load balancer, we’ll go in Saturday night/Sunday morning, meaning December 2nd. We’ll have a go/no-go call at 5:00 PM Saturday afternoon, to make the decision.

And just to make everything even more fun, the email servers were down all day Friday, so updates couldn’t be sent out. We were all left waiting around to see what would happen.

I’m almost afraid to ask what else can go wrong with this release.

“Load Balancer” Defined/Explained

For high-availability systems, we usually want to cluster our servers. That is, instead of having one very powerful server, we might split the processing across two or more servers. Requests can be processed by any of the servers in the cluster. That way, if any one of the servers crashes, the other servers can handle the load until the broken server is fixed.

However, most client applications can’t deal with a cluster; they need a single place to send their requests. So in order to enable clustering, there usually needs to be a load balancer put in place. The client applications only know the address/location of the load balancer, and the load balancer takes care of forwarding their requests to the servers in the cluster.

[Diagram: Load Balancer]

Depending on your needs, you may use a software load balancer or a hardware load balancer. A software load balancer is simply a program running on an existing server, whereas a hardware load balancer is a dedicated networking device that does nothing but balance traffic across the servers.
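To make that a bit more concrete, here’s a minimal sketch of what a software load balancer does, written in Python purely for illustration. (The language, backend addresses, and port number are all made up for the example; they have nothing to do with the systems in this release.) It rotates through the servers in the cluster round-robin style, forwarding each incoming request to the next server in line. A real load balancer would also handle health checks, retries, persistent connections, and so on; this sketch just shows the core idea: clients connect to one address, and the balancer spreads the work across the cluster.

import itertools
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical cluster members -- any of them can serve any request.
BACKENDS = ["http://app-server-1:8080", "http://app-server-2:8080"]
backend_cycle = itertools.cycle(BACKENDS)  # simple round-robin rotation


class LoadBalancerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Pick the next server in the rotation and forward the request to it.
        backend = next(backend_cycle)
        try:
            with urllib.request.urlopen(backend + self.path, timeout=5) as resp:
                body = resp.read()
                self.send_response(resp.status)
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)
        except OSError:
            # If the chosen server is down, a real balancer would retry the
            # next one in the rotation; here we just report the failure.
            self.send_error(502, "Backend unavailable")


if __name__ == "__main__":
    # Clients only ever connect to this one address/port; they never need to
    # know how many servers sit behind it.
    HTTPServer(("", 8000), LoadBalancerHandler).serve_forever()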

Deployment: November 30, 2007

The investigations into the back-end system’s crash were inconclusive. Still a no-go for our deployment. Again, maybe it’ll happen Friday night/Saturday morning, but otherwise, it’ll be Saturday night/Sunday morning.

Deployment: November 29, 2007

The back-end system we were depending on deployed successfully on Tuesday morning, so we were scheduled to go in Wednesday night/Thursday morning. Everything was set, and we’d had all of our go/no-go meetings.

Unfortunately, the same back-end system crashed Wednesday afternoon. We had to cancel, pending investigation into what caused the crash.

Assuming the investigation went well, we’d go in the next day.