CSA Explains… Disaster Recovery
While many of us are considering how to automate more business processes, create more intelligent appliances, and integrate more technology into our society, a few firms are working on the question of what to do when these systems fail. At the CSA Explains… Disaster Recovery session on 25 October 2002, David Chapa of DataStaff, Brian Wolfe of Laurus Technologies, and David Dickerson of Veritas described the varying levels of service available through different Disaster Recovery approaches.
The business objective of Disaster Recovery is to manage system outages. For individuals, this often means backing up data to a Zip drive or a networked storage disk. For enterprises, though, the process is complicated by larger applications, larger databases, higher transaction volumes, lower tolerance for lost data, and demands for quicker restores. To address these issues, Disaster Recovery must be planned and managed at the enterprise level.
According to Mr. Chapa, Disaster Recovery should be defined as restoring a business to a predetermined operating level. The first question to ask, for instance, is: what are the service-level expectations following an outage? The Disaster Recovery plan should include contact lists, task lists, and an operational manual. Managers should keep in mind that the company’s best DBAs may not be available when disaster strikes.
Mr. Chapa went on to introduce some of the issues with a Back-up and Restore approach to Disaster Recovery. The goal of the back-up process is to take a snapshot of the data in a way that is fast, compressed, and light on bandwidth. While back-up is important, it is restorability that defines success. Some firms find that they have backed up years of data to tape, but that it would require an inordinate amount of time to pull the data off the tape and restore the system to an operational state. While tape drives have been improving, some applications require storing the data on a separate disk drive to provide the necessary speed. Technologically, the industry has been evolving by solving each successive bottleneck in this business process.
Mr. Wolfe followed with an introduction to the Replication approach to Disaster Recovery. Replication is defined as geographically distributing identical data sets and keeping that data synchronized. There are three basic methods of replication: synchronous, asynchronous, and hybrid adaptive. In the synchronous approach, data is stored in the replica database at the same time it is stored in the application database. This is a form of mirroring the data. In the asynchronous approach, data is stored in the application database and is then sent to the replica database to be stored. The process is completed when the replica database sends back a message stating that it has successfully committed the data to storage. In the hybrid adaptive approach, software controls which of the two approaches is used at any given time.
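The difference between the first two methods can be illustrated with a minimal Python sketch. It is not drawn from the presentation or from any vendor's product: the class names, the in-memory lists standing in for databases, and the ship_pending step are all hypothetical. It assumes the usual distinction that a synchronous write waits for the replica's acknowledgement before returning, while an asynchronous write returns immediately and the data is shipped afterwards.

```python
from collections import deque


class Replica:
    """Hypothetical stand-in for the remote replica database."""

    def __init__(self):
        self.log = []

    def commit(self, record):
        self.log.append(record)
        return True  # acknowledgement returned to the primary


class SyncPrimary:
    """Synchronous: a write returns only after the replica has committed it."""

    def __init__(self, replica):
        self.log = []
        self.replica = replica

    def write(self, record):
        self.log.append(record)
        return self.replica.commit(record)  # the caller waits on this step


class AsyncPrimary:
    """Asynchronous: a write returns at once; records are shipped later."""

    def __init__(self, replica):
        self.log = []
        self.replica = replica
        self.pending = deque()  # FIFO queue keeps the original write order

    def write(self, record):
        self.log.append(record)
        self.pending.append(record)  # queued, not yet on the replica
        return True

    def ship_pending(self):
        while self.pending:
            self.replica.commit(self.pending.popleft())


sync_replica, async_replica = Replica(), Replica()
sync_primary = SyncPrimary(sync_replica)
async_primary = AsyncPrimary(async_replica)

for rec in ["a", "b", "c"]:
    sync_primary.write(rec)
    async_primary.write(rec)

# Number of records the asynchronous replica is missing before shipping.
lag_before_ship = len(async_primary.log) - len(async_replica.log)
async_primary.ship_pending()
```

In this sketch the synchronous replica is always current, while the asynchronous replica trails the application database until the queue is shipped. A hybrid adaptive scheme would switch between the two write paths under software control.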
While both synchronous and asynchronous approaches preserve write-order fidelity (writing the same data in the same sequence to different databases), the asynchronous approach has the drawback that the replicated database may lack the most current data. The trade-off that forces Disaster Recovery teams to accept asynchronous replication is latency. As the geographical distance between the application database and the replica database increases, latency is introduced into the system. (Latency is the lag between hitting save and being able to take the next step.) For most enterprise applications, end users will not accept systems with 5-second latencies. As such, replication over long distances must resort to asynchronous methods.
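A rough calculation shows how distance turns into latency for synchronous writes. The sketch below is an illustration only, not a figure from the presentation: it assumes signals travel through optical fiber at roughly 200,000 km per second, counts only propagation delay (ignoring switching, protocol overhead, and the time the replica needs to commit), and uses hypothetical function names and example values.

```python
# Approximate signal speed in optical fiber (about two-thirds of the speed
# of light in a vacuum). This is an assumed round figure.
FIBER_KM_PER_S = 200_000


def round_trip_ms(distance_km):
    """Propagation delay for one write and its acknowledgement, in ms."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


def sync_delay_ms(distance_km, sequential_writes):
    """Total wait if each write must be acknowledged before the next begins."""
    return round_trip_ms(distance_km) * sequential_writes


# Hypothetical example: a replica 1,000 km away.
one_write = round_trip_ms(1000)
many_writes = sync_delay_ms(1000, 500)
```

Under these assumptions a single synchronous write to a replica 1,000 km away waits about 10 ms on propagation alone, and an operation that issues 500 such writes one after another waits about 5 seconds. Real systems add further delay on top of this floor, which is why the delay grows quickly with distance.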
Mr. Dickerson concluded the discussion by introducing the topic of moving the entire application off site. Due to recent changes in the rules governing financial institutions, banks must be able to withstand a metropolitan outage without losing their systems. To accomplish this goal, firms are moving to “global clustering”, a method of replicating the entire application at different sites around the world. While companies can do this with in-house resources, Veritas has a product specific to this new market demand.
One of the best aspects of this presentation was observing how well the speakers coordinated their content. Each of the speakers and presenting companies has a much broader offering than what they were able to describe. Yet the participants worked together to give the audience both breadth and depth on this otherwise hot but obscure topic.