Forum Discussion

andrew_mcc1
7 years ago

NBU allows overwrite of tapes from another domain?

I have a customer with two NBU domains (two separate Master Servers), both with tape backups that are sent off-site. It seems that if they recall an unexpired tape for a restore and put it in the library on the “wrong” site, it gets put into the scratch pool once the Robot Inventory and volume configuration have run, which means it is very likely to be overwritten. Is this normal behaviour?

Customer is concerned an operator error is likely to lead to data loss. Apart from the media write-protect switch, is there a way to stop NBU treating unknown NBU tapes as available new media? I think the Robot Inventory Advanced Options could help but I suspect this could also be error prone?

NBU seems to leave tar, CPIO etc. tapes alone by default, and I always thought it did the same for NBU tapes if it wasn't sure. Thanks, Andrew

  • NBU would treat the tape as new media and overwrite it.

    If you consider this:

    NBU writes to a tape - AA1234

    The images on the tape expire, the tape moves back to scratch

    NBU catalog contains no data about images on the tape (as they expired)

    ___________________

    You add another tape into the library, it has data on it from another NBU domain, tape BB1234

    You inventory the library, NBU adds the tape and puts it in the scratch pool

    NBU catalog contains no data about images on the tape

    Consider AA1234 and BB1234, from NBU viewpoint they are the same ...  no images in the catalog and 'data' on the tapes.

    If NBU refused to write to tape BB1234 (as it was added from another system), it would also refuse to write to AA1234. You can see how this would be an issue: NBU would be unable to overwrite any tape that had expired and was valid for re-use.

    The way around this is to use completely separate barcodes - AAxxxx for one site and BBxxxx for another, so it is clear where tapes go.
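The reasoning above can be sketched as a toy model. This is only an illustration of the "catalog view" described in this reply, not NBU code; the `Domain` class, its methods, and the image ID are hypothetical names made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """Toy model of one domain's catalog: media ID -> unexpired image IDs."""
    name: str
    catalog: dict = field(default_factory=dict)

    def inventory(self, media_id):
        # A tape discovered by inventory is added with no image records.
        self.catalog.setdefault(media_id, set())

    def is_overwritable(self, media_id):
        # Assumption for this sketch: the only question asked is whether
        # THIS catalog holds unexpired images for the tape.
        return media_id in self.catalog and not self.catalog[media_id]


site_a = Domain("master_server_A")
site_a.catalog["AA1234"] = {"img_001"}  # site A wrote a backup to its own tape
site_a.catalog["AA1234"].clear()        # the images expire
site_a.inventory("BB1234")              # tape from the other domain is injected

print(site_a.is_overwritable("AA1234"))
print(site_a.is_overwritable("BB1234"))
```

In this model both tapes look identical: neither has image records in the local catalog, so nothing distinguishes the expired local tape from the foreign one.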

     

    • andrew_mcc1
      Level 6

      Thanks for this, but I'm a bit unclear on this point:

      "Consider AA1234 and BB1234, from NBU viewpoint they are the same"

      I would have thought they are different as there is an existing Media ID record for AA1234 whereas BB1234 is completely unknown so may contain valid backup images. I would expect there could be at least an option to Freeze or Suspend unknown tapes with valid NBU headers if they are detected. This would be unlike new media which will be unlabeled and can safely have a Media ID generated and put into use.

      Also, does Phase 1 or 2 Import avoid this issue? Unfortunately the customer has tapes in the same barcode range across both sites.

      Andrew 

      • mph999
        Level 6

        Once you inventory the library, BB1234 becomes 'known' to the system (it appears in the list of media); let's call this system master_server_A. At this point the tape has been given a media ID, so it is no longer an unknown tape.

        It contains images (from the other system, master_server_B) on the tape, but there are no images in the NBU catalog for the tape on master_server_A.

        If you then consider an expired tape from master_server_A (AA1234), it has no images in the catalog, and thus from the 'catalog view' it is the same as BB1234. Both tapes have data, but no images in the catalog.

        If you were to use either tape, they would both be overwritten.

        The NBU tape header is checked to be sure the media ID matches what it should be, but no other checks are made, and there is no way to tell from the tape header if the tape 'really' contains valid data that should not be overwritten.

        Phase 1/2 import does not really resolve the issue - the images would become 'valid' again, so the tape would not be overwritten until the images next expire. At a guess, a phase 1 takes 15 or so minutes and a phase 2 several hours if the tape is full - not really an option.

        If the media IDs on system A use different characters of the barcode than system B (first 6 instead of last 6), then this would stop the tape being overwritten. However, it would also make it much harder to move and import (phase 1/2) tapes into the system if that ever was a requirement.
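A minimal sketch of why different media ID generation could protect a foreign tape, assuming barcodes longer than six characters. The rule names `first6` and `last6`, the function names, and the barcode `BB1234L4` are hypothetical; this models the header comparison described above, not the actual NBU implementation.

```python
def media_id_from_barcode(barcode, rule):
    """Derive a 6-character media ID from a barcode (rule names are hypothetical)."""
    if rule == "first6":
        return barcode[:6]
    if rule == "last6":
        return barcode[-6:]
    raise ValueError(f"unknown rule: {rule}")


def header_matches(barcode, writer_rule, reader_rule):
    # Media ID recorded in the tape header by the domain that wrote the tape.
    recorded = media_id_from_barcode(barcode, writer_rule)
    # Media ID the local domain assigns to the same barcode at inventory.
    expected = media_id_from_barcode(barcode, reader_rule)
    return recorded == expected


barcode = "BB1234L4"  # hypothetical 8-character barcode

print(header_matches(barcode, "last6", "last6"))   # both domains use the same rule
print(header_matches(barcode, "last6", "first6"))  # domains use different rules
```

With the same rule on both systems the header check passes and the tape looks usable. With different rules the recorded ID (`1234L4`) and the expected ID (`BB1234`) disagree, which is the mismatch the header check would catch.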

         

  • I have set up my Media Rules on each server so that if the 'wrong' tape is put in a tape silo, it gets assigned a Media Type that we don't use. That is, we use HCART for our LTO4 tapes that start with "T". If any "W" tape (from the other silo) gets injected, that tape gets defined as DLT. We don't have any DLT drives, so the tape won't get used (and I have time to fix the problem the next morning).
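The idea in this reply can be sketched as a prefix lookup. This is a toy model only: the rule table, the function names, and the example barcodes are hypothetical, and the real rules are configured in NBU rather than in code.

```python
# Hypothetical rule table for the silo that owns the "T" tapes.
BARCODE_RULES = [
    ("T", "HCART"),  # our own LTO4 tapes
    ("W", "DLT"),    # tapes from the other silo: a media type with no drives here
]

# Media types for which this site actually has drives.
DRIVE_TYPES = {"HCART"}


def media_type_for(barcode):
    """Return the media type of the first rule whose prefix matches, else None."""
    for prefix, media_type in BARCODE_RULES:
        if barcode.startswith(prefix):
            return media_type
    return None


def can_be_mounted(barcode):
    # A tape can only be used if some local drive matches its media type.
    return media_type_for(barcode) in DRIVE_TYPES


print(can_be_mounted("T00012"))  # local tape
print(can_be_mounted("W00034"))  # tape from the other silo
```

The foreign tape still enters the media list, but because no drive matches its assigned type it cannot be selected for a backup. As the next reply notes, this only helps when the two sites use distinguishable barcode prefixes.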

    • andrew_mcc1
      Level 6

      Thanks for this. Yes, that seems a good plan; however, the customer has tapes in the same barcode range across both sites. I guess their options are write-protecting media going offsite, plus the Inventory Robot Advanced Setting to put newly injected media into the None Volume Pool. However, there still remains the possibility of operator error, which is their real concern.

      Andrew

      • Marianne
        Level 6

        Operator error should be eliminated through proper training, policies and procedures, along with documented consequences. 

        Honestly - how does it happen that Master1 needs tape XXXXX for a restore, the tape comes onsite, and the operator 'accidentally' inserts it into the Master2 tape library? Do they not immediately realize that the restore tape is still reported as 'non-robotic' on Master1?

        Your customer will also need to take responsibility to somehow split the media - slowly phase out the shared barcodes and move/duplicate to dedicated barcodes.
        Some years ago we had a customer that went as far as purchasing different colour tapes for the different robots, along with different range labels.

        Another customer had a large STK tape library that was shared amongst more than one master. The operators thought it good to 'borrow' tapes from other environments when they ran out. 
        After some real data loss due to overwritten tapes, they were called in and had the consequences explained to them. Only at this point did the 'accidents' stop.