Operations that critically need to recover quickly from IBM i (iSeries/AS400) server downtime want Disaster Recovery Replication Solutions. These solutions allow users to recover in minutes, not hours, provided they make the proper choices and clearly understand the limitations of each solution.
There Are 3 Basic IBM i (iSeries/AS400) Server HA Solutions
When it comes to Disaster Recovery Replication Solutions, you can pick from Remote Journaling, Disk Controller/iASP Replication, or SAN-to-SAN Replication. The intent of all 3 solutions is to have a mirror copy of the production data on a target system to access in the event of production server downtime.
Remote Journaling leverages IBM i journaling technology to transmit journal entries, which record each change to an object, from the production server to the target server.
Disk Controller/iASP Replication requires 2 things: 1) a duplicate server and 2) a mirrored environment set up in iASPs (independent ASPs). In this architecture, the production system sends a mirrored copy of the data to a second server as though it were writing to a second array of disks. The disk controllers transmit every block that contains changed data to the mirrored server.
As you may already know, a block is huge compared to a changed object, so you will need a lot of bandwidth for this approach. More about this later.
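To make the block-versus-change difference concrete, here is a rough Python sketch. Everything in it is an illustrative assumption rather than a measurement: the function name, the 2,000 updates per second, the 200-byte change size, and the 4 KB block size are all hypothetical. It also assumes the worst case for block replication, where every update lands in a different block.

```python
def required_mbps(updates_per_second: float, bytes_per_update: float) -> float:
    """Return the sustained bandwidth, in megabits per second, for a workload."""
    bits_per_second = updates_per_second * bytes_per_update * 8
    return bits_per_second / 1_000_000


# Hypothetical workload; none of these figures come from a real system.
UPDATES_PER_SECOND = 2_000
CHANGE_BYTES = 200     # assumed size of one transmitted change
BLOCK_BYTES = 4_096    # assumed size of one disk block

change_level = required_mbps(UPDATES_PER_SECOND, CHANGE_BYTES)
block_level = required_mbps(UPDATES_PER_SECOND, BLOCK_BYTES)

print(f"change-level replication: {change_level:.2f} Mbps")
print(f"block-level replication:  {block_level:.2f} Mbps")
print(f"amplification:            {block_level / change_level:.1f}x")
```

With these made-up numbers the block-level approach needs roughly 20 times the bandwidth. Your own ratio depends on how large your changes are and how often several of them fall in the same block, so plug in figures from your own system before drawing conclusions.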
SAN-to-SAN Replication makes snapshot copies of data and transmits them from the production SAN to the target SAN. Snapshots are scheduled at intervals, anywhere from every 24 hours to, say, every 4 hours.
So What Do You Need To Know About Each Approach?
The 2 key factors to consider with SAN-to-SAN Replication are 1) interval and 2) data integrity.
If you snapshot your system every 24 hours when all users are off and the system is quiesced (paused or dedicated for snapshot), your system should be completely saved prior to replication.
On the other hand, if you snapshot every 4 hours while the system is in production, not all data will be properly saved. First, some of the system changes will still be in the processor or disk cache when the snapshot is taken, so they will not be captured or sent. Second, some data could be in transit between temporary locations (memory, cache, etc.) and disk, resulting in damaged data being transmitted.
Further, the snapshot data has to be prepared before it can be used for recovery. Even after those steps, if data is damaged or missing, you have to accept that.
The questions to ask include: 1) can I live with losing the work done since the last scheduled snapshot, 2) can I live with the delay to recover, and 3) can I live with some of my system being incomplete?
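The first question is easy to put a number on. The Python sketch below estimates how much work is lost if the production server fails at a given moment. The function name and the sample times are hypothetical, and the sketch assumes snapshots run exactly on schedule and that the most recent one is complete and usable, which, as noted above, is not guaranteed for snapshots taken while the system is in production.

```python
from datetime import datetime, timedelta


def data_loss_window(failure_time: datetime,
                     first_snapshot: datetime,
                     interval: timedelta) -> timedelta:
    """Return the time elapsed between the latest scheduled snapshot and the failure."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if failure_time < first_snapshot:
        raise ValueError("failure happened before the first snapshot")
    # Whatever is left over after the last whole interval is unprotected work.
    return (failure_time - first_snapshot) % interval


# Hypothetical schedule: snapshots start at midnight, failure at 11:30.
midnight = datetime(2024, 1, 1, 0, 0)
failure = datetime(2024, 1, 1, 11, 30)

print("4-hour snapshots: ", data_loss_window(failure, midnight, timedelta(hours=4)))
print("24-hour snapshots:", data_loss_window(failure, midnight, timedelta(hours=24)))
```

In this example the 4-hour schedule loses three and a half hours of work and the 24-hour schedule loses eleven and a half. The worst case for any schedule is just under one full interval, before counting the time needed to prepare the snapshot for recovery.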
SAN-to-SAN Replication requires 2 compatible SANs. You will want to understand the cost of acquisition, the cost of setup, and the cost of ongoing support (yikes, this is REAL pricey).
Disk Controller/iASP Replication has its own set of considerations that need to be assessed.
While Disk Controller/iASP Replication is often presented as simple mirroring that requires little-to-no administration, there are other factors to consider.
First, Disk Controller/iASP Replication requires your system and data files to be set up in iASPs. This is a tedious job that takes 100-200+ hours. Initial setups can range from $20,000 to $40,000+, even for smaller systems.
If you have third-party software, you need to check with your supplier to determine whether their software is certified to run in a Disk Controller/iASP Replication solution.
Second, Disk Controller/iASP Replication requires that you have 2 servers at the same OS level. The target server needs to be big enough to mirror the production server. This means that your target server or cloud host can be a lot more expensive.
Third, while replicating there is no way to view your transmitted data to validate its integrity. You get to find out when you use the target server system…and only after you have completed all the recovery steps. You may have as many as 43 recovery steps to prepare and validate the target system.
Fourth, how long does it take to complete the required recovery steps for Disk Controller/iASP Replication? 2-4 hours? Longer? How much effort is involved? These are important factors to investigate.
Fifth, Disk Controller/iASP Replication requires enormous bandwidth to work because of the size and volume of data. We have seen cases where 50-100+ Mbps is needed. In addition, some users need to pay extra to be sure there is no “noise” on the lines to interfere with the transmission. How much does this cost? In my experience, the bandwidth costs more than our hosting fees, by factors of 150%-300%. Quite simply, bandwidth is expensive.
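To get a feel for what those link speeds mean in practice, the Python sketch below converts a volume of changed data into transmission time. The function name and the 100 GB volume are hypothetical. The sketch assumes decimal gigabytes, a link dedicated entirely to replication, and no protocol overhead, so real transfers will take longer.

```python
def hours_to_send(gigabytes: float, link_mbps: float) -> float:
    """Return the hours needed to push a data volume through a link at full speed."""
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    bits = gigabytes * 1_000_000_000 * 8          # decimal gigabytes to bits
    seconds = bits / (link_mbps * 1_000_000)      # megabits per second to bits per second
    return seconds / 3600


# Hypothetical volume of changed blocks to replicate: 100 GB.
CHANGED_GB = 100

for mbps in (50, 100):
    print(f"{CHANGED_GB} GB over {mbps} Mbps: {hours_to_send(CHANGED_GB, mbps):.1f} hours")
```

If the changed data cannot be sent as fast as it is produced, the target falls further and further behind, which is why the link has to be sized for the peak change rate and not the average.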
Remote Journaling leverages IBM i remote journaling. Much of the “heavy lifting” is done by the IBM i firmware so the overhead of Remote Journaling does not impact production.
Remote Journaling forces object updates to be saved to disk so they do not linger in cache, where they might otherwise be lost due to unexpected downtime.
Remote Journaling transmits the changed objects to the target, where the data is validated for integrity before it is applied to the database. This step makes sure your target data is complete and accurate.
Of the 3 options, Remote Journaling data transmission uses the least amount of bandwidth. Because only changed objects are transmitted, bandwidth requirements are a fraction of the requirements of the other two options.
Once the network changes are in place, Remote Journaling recovery can be completed in 5-15 minutes.
The software licensing for Remote Journaling costs more than Disk Controller/iASP Replication and less than the SANs for SAN-to-SAN Replication. Setup effort for Remote Journaling is about the same as or less than that for Disk Controller/iASP Replication or SAN-to-SAN Replication.
Remote Journaling has several BIG advantages: faster recovery, lower bandwidth that translates into lower cost, and lower total cost of ownership.
For my money, if I had to pick one, I would go with Remote Journaling.