Forum Discussion

  • One of the "joys" of deduplication is that you never know what you will get until you do it. The only way to know is to put some data down and see. You can "guesstimate" based on similar data: if you already have VM backups, for example, you can assume additional VMs will get about the same dedupe. Since my Data Domain allows me to have 256 separate file systems, I create new targets as much as possible, so each of my Oracle DBs has its own target, and the VM and Isilon backups are grouped similarly.

    As you can see from my data below, I get a wide variance. It averages about 8x for the whole thing.

    Name            24hr Pre  24hr Post  24hr Dedupe x(%)  7 Day Pre  7 Day Post  7 Day Dedupe x(%)
    nb              43579.7   1063.6     41.0(97.6)        424127.6   11709       36.2(97.2)
    veeam           287       25.7       11.2(91.0)        5381.3     287.5       18.7(94.7)
    branch          7.7       1.6        4.7(78.9)         571.1      31.1        18.4(94.6)
    db1             13.3      11.3       1.2(14.8)         1547.3     134.3       11.5(91.3)
    db10            73.5      27.9       2.6(62.1)         38250.6    5666.8      6.7(85.2)
    db11            0.8       0.3        2.7(62.7)         556.7      92.4        6.0(83.4)
    db12            357.4     87.8       4.1(75.4)         67197.2    992         67.7(98.5)
    db13            24.9      8.6        2.9(65.5)         93.6       31.5        3.0(66.4)
    db14            0.9       0.4        2.1(52.5)         167.6      35.1        4.8(79.1)
    db2             20.9      1          21.4(95.3)        182.9      8.3         22.1(95.5)
    db3             189.6     188.9      1.0(0.4)          4892.2     1569.9      3.1(67.9)
    db4             344.4     263.2      1.3(23.6)         17418.9    6220.2      2.8(64.3)
    db6             660.5     241.6      2.7(63.4)         1422       548.6       2.6(61.4)
    db7             25.2      10.6       2.4(58.0)         24864.8    429.9       57.8(98.3)
    db8             2.1       0.5        4.4(77.2)         10255.1    43.2        237.6(99.6)
    db9             0.4       0.1        3.3(69.6)         6.6        1.3         5.0(79.9)
    isilon.2        4.8       1          4.8(79.1)         383.4      9.9         38.9(97.4)
    isilon.3        74        46.5       1.6(37.2)         19388.4    2076.6      9.3(89.3)
    isilon.4        794.3     565.2      1.4(28.8)         6499.5     4571        1.4(29.7)
    isilon.5        28.1      8.2        3.4(70.9)         3047.3     72.4        42.1(97.6)
    isilon.6        68.8      20.4       3.4(70.3)         8075.6     195.2       41.4(97.6)
    isilon.7        5.4       2.1        2.6(60.8)         2249.6     95.6        23.5(95.7)
    isilon.archive  1.7       1.6        1.1(7.5)          447.5      20.1        22.2(95.5)
    mssql           14453.6   4872.4     3.0(66.3)         78632.2    33669.1     2.3(57.2)
    nasfs           264.4     76.7       3.4(71.0)         1604.4     402.7       4.0(74.9)
    nasfs           0         0          157.8(99.4)       0.1        0           143.9(99.3)
    nasfs2          445.7     3.3        133.6(99.3)       3444       48.4        71.2(98.6)
    racdb           -         -          -                 434.3      11.2        38.9(97.4)
    splunk          1778.3    375.4      4.7(78.9)         90958.4    2625.3      34.6(97.1)
    nbu             5648.6    280.3      20.2(95.0)        81975      3309.3      24.8(96.0)

                     Pre-Comp     Post-Comp    Global-Comp  Local-Comp  Total-Comp (Reduction %)
    Currently Used:  3587575.7    444407.1     -            -           8.1x (87.6)
    Last 7 days      894075.2     74907.7      7.5x         1.6x        11.9x (91.6)
    Last 24 hrs      69156.2      8186.4       6.3x         1.3x        8.4x (88.2)
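
    For anyone wondering how the factor and the percentage in the figures above relate: the factor appears to be simply pre divided by post, and the percentage is the share of the pre size that was saved. Here is a minimal sketch of that arithmetic, assuming that reading is correct. The function name `dedupe_stats` is made up for illustration, and because the posted sizes are already rounded, some rows will not reproduce to the last digit.

```python
def dedupe_stats(pre, post):
    """Return (factor, reduction_percent) for pre- and post-dedupe sizes.

    Returns None when either size is zero, because the ratio is undefined.
    """
    if pre <= 0 or post <= 0:
        return None
    factor = pre / post                 # e.g. 8.1 means "8.1x"
    reduction = (1 - post / pre) * 100  # percent of the pre size saved
    return round(factor, 1), round(reduction, 1)


# A few (pre, post) pairs copied from the 24-hour columns posted above.
rows = {
    "nb": (43579.7, 1063.6),
    "veeam": (287.0, 25.7),
    "db3": (189.6, 188.9),
}

for name, (pre, post) in rows.items():
    print(name, dedupe_stats(pre, post))
```

    The same arithmetic reproduces the "Currently Used" line: 3587575.7 / 444407.1 is roughly 8.1x, or about 87.6% saved.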

     

  • Hi,

    The purpose of deduplication is to eliminate duplication. In other words, if two clients have the same chunk of data, as much of the OS would be, for instance, then only one copy is stored. So you can't really ask such a question, because servers 1, 2, 3, ... 10000 might all have the same data and it would be stored only once.

    In NetBackup (if you're using MSDP) you can see how much data was sent to the server and how much was stored, i.e. the before and after sizes, but that figure does not imply that the stored data is used only by that specific job or server. If another client sends the same data and the resulting hash is the same, it also gets a message saying sent X, stored Y, but clearly Y is going to be shared.
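
    To make the sharing concrete, here is a toy sketch of a hash-keyed chunk store. It is only an illustration of the general idea, not how MSDP works internally: the `ChunkStore` class, the fixed 4-byte chunk size, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept once, keyed by hash."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def backup(self, data, chunk_size=4):
        """Store data in fixed-size chunks and return (bytes_sent, bytes_newly_stored)."""
        sent = stored = 0
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            sent += len(chunk)
            if digest not in self.chunks:
                # Only chunks the store has never seen consume new space.
                self.chunks[digest] = chunk
                stored += len(chunk)
        return sent, stored


store = ChunkStore()
os_image = b"AAAABBBBCCCC"  # stand-in for data both clients share

print("client 1:", store.backup(os_image + b"DDDD"))
print("client 2:", store.backup(os_image + b"EEEE"))
```

    Client 2 sends just as much as client 1 but stores far less, purely because it ran second. Swap the order and the numbers swap too, which is why the per-job "stored" figure says little about what a given client actually owns.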

    Hope that helps.

    • Tape_Archived
      Moderator

      As RiaanBadenhorst suggested, it's hard to determine how much data each client deduped independently.

      WolfgangSwie, you may try this: find the backup size of the clients you are interested in and apply the dedupe factor (the factor for the entire MSDP) to it, i.e. divide each size by the pool-wide ratio. You can then share that estimate with your customers or groups if you need to. Not a breakthrough suggestion :) but an option you can try.
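
      A rough sketch of that estimate, assuming the factor is expressed as a ratio such as 8x, so applying it means dividing the backup size by it. The client names, sizes, and the 8.0 factor below are hypothetical placeholders, and `estimate_stored` is a made-up helper name.

```python
def estimate_stored(front_end_size, pool_factor):
    """Rough stored-size estimate: backup size divided by the pool-wide dedupe factor."""
    if pool_factor <= 0:
        raise ValueError("pool_factor must be positive")
    return front_end_size / pool_factor


# Hypothetical per-client backup sizes, all in the same unit.
clients = {"client_a": 500.0, "client_b": 1200.0}
pool_factor = 8.0  # hypothetical pool-wide ratio, i.e. "8x"

for name, size in clients.items():
    print(f"{name}: ~{estimate_stored(size, pool_factor):.1f} stored (estimate)")
```

      This spreads the pool-wide savings evenly across clients, so it is only an approximation: a client whose data dedupes poorly will really consume more than its estimate.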