Tale of the Tape: Beware of Wind Quality
By Henry Newman
A little more than a year ago, a customer told me his company was going to write tapes remotely over an OC-3 connection. The tape drive being used was the StorageTek T9940B, which has a native transfer rate of 30 MB/sec and can support up to about 68 MB/sec with compression. The average compression this customer was seeing was about 1.5 to 1, so he was running the local tape drives at about 45 MB/sec. After factoring in packetization and contention, we figured that we would be very lucky to see 15 MB/sec on this OC-3 line, and would likely see about 6 MB/sec. I cautioned him that this was a bad idea.
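To make the mismatch concrete, here is a rough back-of-the-envelope sketch of the figures above. The drive rates and the 15 and 6 MB/sec estimates come from the story; the 155.52 Mbit/sec figure is the raw OC-3 line rate before any protocol overhead, and the function names are made up for illustration.

```python
# Rough check of how far an OC-3 link falls short of what the drive needs.
OC3_LINE_RATE_MBIT = 155.52   # raw SONET OC-3 line rate, Mbit/sec (before overhead)
NATIVE_RATE_MB = 30.0         # T9940B native rate, MB/sec
COMPRESSION_RATIO = 1.5       # the customer's average compression


def drive_demand_mb(native_mb, ratio):
    """Data rate the host must deliver to keep the drive streaming."""
    return native_mb * ratio


def line_ceiling_mb(line_mbit):
    """Upper bound on link throughput in MB/sec (8 bits per byte)."""
    return line_mbit / 8.0


demand = drive_demand_mb(NATIVE_RATE_MB, COMPRESSION_RATIO)
ceiling = line_ceiling_mb(OC3_LINE_RATE_MBIT)

for label, link_mb in [("raw line rate", ceiling),
                       ("very lucky", 15.0),
                       ("likely", 6.0)]:
    share = link_mb / demand
    print(f"{label}: {link_mb:.1f} MB/sec, {share:.0%} of the {demand:.0f} MB/sec the drive needs")
```

Even at the raw line rate, the link delivers well under half of what the drive needs to keep streaming.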
Tape wind quality can be a problem on local Fibre Channel networks, but the problem is exacerbated when replicating directly to tape remotely.
In this installment in our series on data replication, we hope to shed some light on the little known issue of wind quality posed by high-end tape media (LTO-I, LTO-II, IBM 3590 and STK 9840/9940). Lower-end tape can have the same problem, but more often than not, since the tape is running at a slower speed, users can keep up with the tape drive performance.
Marginal tape wind quality
Over the long term, the data on the last two tapes has a much greater potential to have problems with data reliability. As you know from previous articles (see Preparing for a Disaster and Preparing for a Disaster, Part 2), on a per-device basis, high-end tape has a bit error rate about two orders of magnitude lower than Fibre Channel disks and about three orders of magnitude lower than SATA devices, but all that changes with poor wind quality.
Almost all tape drives are designed to stream data to the tape. If the drive cannot be streamed, it constantly starts and stops. This start-stop action combined with the tape threading within the drive can cause the tape to wind on the reel improperly, creating "scattered" winds. This happens because when the data stream is interrupted, the drive must stop and back up to reposition the read/write heads before starting again. This results in slight tension transients that allow the tape to wander ever so slightly. This is true for every drive type and vendor, be it AIT, DLT or LTO. It is not an issue of the quality of the drive hardware or media (although high-quality media can reduce the occurrence), but a fundamental issue with all tape drives when two smooth surfaces come together at high speed.
Those familiar with magnetic tape have seen this problem since the days of reel-to-reel tape, and knowledgeable media experts have warned about the issue for years. Over time, the compressive forces within the tape pack will result in plastic distortion of the tape edges associated with these scattered winds, and increase error occurrence.
What Can Be Done About Tape Wind Quality?
Writing a tape that has poor wind quality, putting it on a shelf for five years or so and then trying to read it could be a disaster. Most of the software packages I know of cannot solve the wind quality problem, since they would have to do the following:
- Write the tape.
- Reposition the tape to EOT (End of Tape).
- Rewind slowly to BOT (Beginning of Tape).
Besides the extra time this takes, it would be highly impractical for an HSM or backup vendor to know the special commands needed to do this for every tape drive and then keep up with firmware updates. More importantly, the software would have to know that the tape was being remotely written.
The problem isn't necessarily limited to remotely written tapes. If you have a tape drive that can write at almost 70 MB/sec and you do not have the Fibre Channel HBA bandwidth, the memory bandwidth, PCI bus bandwidth, and/or the RAID bandwidth, the problem could happen anytime, anywhere. Therefore, the issue could be a local one too. Although most modern tape drives have some capability to adjust tape speed to the data stream, there are limitations to this feature.
Anyone architecting a system must ensure that the tape drives run at the rated speed with the compression of data. This is true for helical tape drives such as AIT, DTF and others, and linear tape drives such as DLT, LTO, IBM 3590, StorageTek and so on.
Running at the rated speed with compression is sometimes hard because HSM and backup applications are not necessarily designed for asynchronous double-buffered I/O (reading into one buffer while writing to the tape from another buffer asynchronously). On top of that, at 70 MB/sec, two tape drives can use up most of the bandwidth of a single 2 Gb Fibre Channel HBA. In addition, the RAID device must support streaming reads at 70 MB/sec for each tape drive that is writing. This could be a problem for lower-end RAID controllers, or where the data in question is residing on a device configured at RAID-1. Even the fastest 15K disk drives cannot sustain 70 MB/sec for just the outer cylinders.
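For readers unfamiliar with the technique, the following is a minimal sketch of double-buffered I/O, not how any particular HSM or backup product does it. It assumes ordinary Python file-like objects for the source and the destination; the function name, block size and queue depth are illustrative choices. A reader thread fills buffers while the main thread drains them, so the reads overlap the writes instead of alternating with them.

```python
import io
import queue
import threading


def double_buffered_copy(src, dst, block_size=256 * 1024, depth=2):
    """Copy src to dst while a background thread reads ahead.

    `depth` bounds how many filled buffers may wait for the writer.
    """
    filled = queue.Queue(maxsize=depth)
    errors = []

    def reader():
        try:
            while True:
                block = src.read(block_size)
                filled.put(block)      # waits if the writer has fallen behind
                if not block:          # empty read marks end of input
                    return
        except Exception as exc:       # hand the failure to the writer side
            errors.append(exc)
            filled.put(b"")

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    total = 0
    while True:
        block = filled.get()           # waits if the reader has fallen behind
        if not block:
            break
        dst.write(block)
        total += len(block)

    thread.join()
    if errors:
        raise errors[0]
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 1_000_000)
    sink = io.BytesIO()
    print(double_buffered_copy(source, sink), "bytes copied")
```

If the writer regularly finds the queue empty, the source cannot keep up, and a tape drive at the far end would be starting and stopping rather than streaming.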
Tape Wind Quality Is Key For Long-Term Archiving
Anyone designing a system must account for every tape drive in it. The architecture for streaming to tape must consider the whole data path:
RAID controller->Fibre Channel Switch->HBA->PCI Bus->File System->PCI Bus->HBA->Fibre Channel Switch->Tape Drive
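One simple way to reason about this path is that the slowest stage sets the rate the tape drive actually sees. The sketch below illustrates that idea only. Every number in it is a hypothetical placeholder to be replaced with measured sustained rates, except the 70 MB/sec per-drive figure used earlier in this article, and the stage list is simplified from the path above.

```python
# Find the slowest stage feeding the tape drive and see how many drives
# that stage could keep streaming. All rates are hypothetical, in MB/sec.
PER_DRIVE_MB = 70.0   # per-drive rate with compression, from the article

PATH = [
    ("RAID controller", 120.0),
    ("Fibre Channel switch", 200.0),
    ("HBA", 200.0),
    ("PCI bus", 400.0),
    ("File system", 90.0),
]


def bottleneck(stages):
    """Return the (name, rate) pair of the slowest stage."""
    return min(stages, key=lambda stage: stage[1])


def max_streaming_drives(stages, per_drive_mb):
    """Number of drives the slowest stage can feed at full rate."""
    _, rate = bottleneck(stages)
    return int(rate // per_drive_mb)


if __name__ == "__main__":
    name, rate = bottleneck(PATH)
    drives = max_streaming_drives(PATH, PER_DRIVE_MB)
    print(f"Slowest stage: {name} at {rate:.0f} MB/sec")
    print(f"Drives that can stream at {PER_DRIVE_MB:.0f} MB/sec: {drives}")
```

With these placeholder numbers the file system, not the tape drive or the HBA, is the limit, which is the kind of surprise that measuring each stage is meant to catch.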
Tape wind quality is especially important if you plan to put the tapes on the shelf for long-term archiving. If the tapes that are being poorly wound are used daily, then it is likely that the poor wind will be less of an issue, since the tapes are being wound and unwound and the tension issues occur in different places (assuming they do not leave the library or are properly handled between uses). This is not to say that this is a good thing, but it's less of a bad thing.
If you are going to put a tape on the shelf for five years with poor wind quality, the bit-error rate will go up significantly, and you might not be able to read your data even though the tape is rated for a 30-year shelf life. Who can you blame? You probably can't blame the tape vendor or the tape drive vendor. Can you blame the manufacturer of the software? Not likely. Remote replication of tapes makes the concept of virtual tape an important one when trying to move the data to the off-site facility, with data written to disk before it is written to tape.
Relieving the tension transients in the tape pack that lead to poor wind quality is a necessary step for maximizing the likelihood of a successful read of any archived tape. Can this be accomplished with a background utility that can be invoked prior to assigning a tape to archive? Not with today's hardware and software utilities. The individual SCSI commands exist, but they are not available in any user-accessible routine. The extra time taken (several minutes per tape) on re-tension can pay dividends if the data on the tape becomes needed, and by that time, it could be the only recourse.
We've spent the last few articles discussing issues surrounding remote replication of disk and tape data and the use of backup and HSM software. At this point, it should be clear that you cannot directly write tapes over the network and expect high reliability after long-term archiving - unless you ensure that the tapes are wound correctly. Even with an OC-48 connection, you can still have issues with TCP/IP congestion, TCP/IP retries and a myriad of other potential problems that could prevent the tape drive from streaming.
I have learned a great deal about how tapes should be used and stored from experts in the field, and I hope you have too. For further information, a search on Google for tape + wind + quality turned up a number of resources on wind quality for both data and audio tapes.
Back to the customer at the beginning of the story. How did that company resolve its dilemma? Fortunately, they took my advice and did disk-to-disk-to-tape instead, thereby avoiding a tape wind quality problem.