arl: faqs: backup: using tapes


Last updated 20030707T0900


Why should one not use compressed tars when backing up to tape?

When data is compressed with Unix compress, gzip, or bzip2, decompression needs the whole content intact, because the compressor changes its rotors continuously as it works through the input.

Whole content means that the data read back from tape must match the data written to it 100%, i.e. with no errors.

Rotor means all the keys and definitions the compression program is using at a given point (for example its dictionary and code tables).

When a rotor is lost, all data after that point is lost. A single error on the tape can therefore make the rest of the tape content unreadable, because the rotor cannot be rebuilt after the error point.
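The effect described above can be tried without a tape drive. The following is a minimal sketch, assuming a POSIX shell with tar, gzip, dd, od and seq available; the temporary directory, the file names and the helper flip_byte are hypothetical, and a regular file stands in for the tape. It inverts one byte in the middle of a gzip-compressed tar and checks whether the stream still passes gzip's integrity test.

```shell
#!/bin/sh
# Sketch only: paths, file names and flip_byte are made up for illustration.
work=$(mktemp -d)
mkdir "$work/data"
i=1
while [ "$i" -le 20 ]; do
    # 20 small text files with slightly different content
    seq "$i" $((i + 2000)) > "$work/data/file$i.txt"
    i=$((i + 1))
done

# One compressed stream for the whole archive (the risky layout).
tar cf - -C "$work" data | gzip -9 > "$work/packed.tar.gz"

# Count the entries that can be listed while the stream is intact.
before=$(gzip -dc "$work/packed.tar.gz" | tar tf - | wc -l | tr -d ' ')

# Invert every bit of the byte in the middle of a file,
# simulating a single bad spot on the tape.
flip_byte() {
    size=$(wc -c < "$1")
    off=$((size / 2))
    old=$(dd if="$1" bs=1 skip="$off" count=1 2>/dev/null | od -An -tu1 | tr -d ' ')
    new=$((old ^ 255))
    printf "\\$(printf '%03o' "$new")" |
        dd of="$1" bs=1 seek="$off" conv=notrunc 2>/dev/null
}
flip_byte "$work/packed.tar.gz"

# gzip -t decompresses the stream and verifies it without writing output.
if gzip -t "$work/packed.tar.gz" 2>/dev/null; then
    gz_status=intact
else
    gz_status=damaged
fi

# Count what tar can still list from the damaged stream.
after=$(gzip -dc "$work/packed.tar.gz" 2>/dev/null | tar tf - 2>/dev/null | wc -l | tr -d ' ')

echo "compressed archive is $gz_status"
echo "entries listable before damage: $before, after damage: $after"
rm -rf "$work"
```

How many entries remain listable after the damage depends on where the bad byte lands, so the script prints the count rather than promising a particular number.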

Typical fatal errors:
  • tar cvzf - . | dd of=tape device
  • tar cvf - . | gzip -9 | dd of=tape device
  • it works, why should I care!
    It may work for a while, but when the tape has been used again and again (wear), or gets physically damaged somehow, the problems begin. This is usually the very moment one needs the tape backup content!!! (What did Murphy say?).
  • it worked using gzip, and when I tested the tape, it worked!
    It may work now, but that does not guarantee it will work tomorrow. Some small error, even one caused by cosmic radiation, might damage the content enough to leave a tape full of garbage.
  • but bzip2 is able to correct (single) errors
    Do not rely on it. You can test it yourself. I created a single error (changing the letter d to the letter t inside a tar compressed with bzip2 -9) and bzip2recover did not fix the error.

Rules of thumb when using compression
  • compress only single files
    This allows tar to skip over the bad data on tape to the next tar file header, so only one point of the backup is lost, not the whole backup.
  • use the tape drive's native compression with uncompressed tars
    Native compression is not based on large blocks, so a single error does not mean catastrophe.
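The first rule can be sketched as follows. This is only an illustration under stated assumptions: a POSIX shell with tar, gzip, find, dd, od and seq, hypothetical file names, and a regular file (tape.img) standing in for the tape device. Each file is gzipped on its own, the result is written as a plain uncompressed tar, and one byte inside the first member is then damaged on purpose.

```shell
#!/bin/sh
# Sketch only: all paths are made up; tape.img stands in for the tape device.
work=$(mktemp -d)
mkdir "$work/stage" "$work/restore"
i=1
while [ "$i" -le 20 ]; do
    seq "$i" $((i + 2000)) > "$work/stage/file$i.txt"
    i=$((i + 1))
done

# Compress every file individually (file1.txt becomes file1.txt.gz, ...).
find "$work/stage" -type f -exec gzip -9 {} +

# Write an uncompressed tar of the already compressed files.
(cd "$work/stage" && tar cf "$work/tape.img" *.gz)

# Damage one byte 200 bytes into the first member's data
# (the first 512 bytes of the archive are that member's tar header).
off=712
old=$(dd if="$work/tape.img" bs=1 skip="$off" count=1 2>/dev/null | od -An -tu1 | tr -d ' ')
printf "\\$(printf '%03o' $((old ^ 255)))" |
    dd of="$work/tape.img" bs=1 seek="$off" conv=notrunc 2>/dev/null

# Restore everything, then test each compressed file on its own.
tar xf "$work/tape.img" -C "$work/restore" 2>/dev/null
good=0
bad=0
for f in "$work/restore"/*.gz; do
    if gzip -t "$f" 2>/dev/null; then
        good=$((good + 1))
    else
        bad=$((bad + 1))
    fi
done
echo "readable files: $good, damaged files: $bad"
rm -rf "$work"
```

On a real system the image file would be replaced by the tape device and the staging directory by a copy of the data to back up. The damage stays confined to the file it hit, because every other file carries its own independent compressed stream.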


© arl