A Gentle Approach to Storage Expiry
In this article I'll explain how to introduce storage expiry into your Enterprise Vault environment so that it has the least possible impact on the environment and on your users.
What is Storage Expiry?
Storage expiry is essentially the process of removing items that are beyond their required 'keep until' period. These periods are determined by the retention categories applied to the items which have been archived in your environment.
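To make the 'keep until' idea concrete, here is a minimal sketch in Python. It is not Enterprise Vault's own logic: the category names, the RETENTION_YEARS table and the helper functions are all hypothetical, and it assumes the retention period is counted from the date the item was archived.

```python
from datetime import date
from typing import Optional

# Hypothetical retention categories: name -> retention period in years.
# None models a "keep forever" category that never expires.
RETENTION_YEARS = {"1 year": 1, "3 years": 3, "7 years": 7, "Forever": None}


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def keep_until(archived_on: date, category: str) -> Optional[date]:
    """Last day the item must be kept, or None if it is kept forever."""
    years = RETENTION_YEARS[category]
    return None if years is None else add_years(archived_on, years)


def is_eligible_for_expiry(archived_on: date, category: str, today: date) -> bool:
    """An item is eligible only once today is past its 'keep until' date."""
    limit = keep_until(archived_on, category)
    return limit is not None and today > limit
```

For example, is_eligible_for_expiry(date(2015, 1, 1), "3 years", date(2018, 1, 2)) is true, while the same item checked one day earlier is not yet eligible.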
Why run Storage Expiry?
Many people have had Enterprise Vault for several years, and hopefully they have taken time to set up proper retention categories. Please, please, please make sure in any organisation that you look after that the team don't fall into the trap of just archiving everything and keeping it forever. The Enterprise Vault team need to work closely with legal and compliance groups to make sure that items are kept as long as they are needed.
Just as important as keeping items for the correct period of time is being able to prove that items were in fact deleted / removed-from-storage when their retention expired. This is the bit that many Enterprise Vault administrators fall down on. So, really poor environments don't define any retention, they just keep everything forever. Moderately poor environments define a nice simple set of retention categories and correctly add items to those retention categories, but then never run storage expiry.
Second to the compliance issues is the simple fact that storage expiry will help reduce the storage needs of your Enterprise Vault environment (provided you're not using Enterprise Vault collections). If no storage expiry is implemented then the storage requirements for your environment will just grow and grow, without any kind of restraint. Not many organisations have bottomless amounts of money, so it makes sense to make sure that the storage being used is also not bottomless.
Hopefully after reading some of this information you'll be in a position to run storage expiry, to correctly apply the deletion-part of your retention policy. But don't just plunge straight in!
How not to run Storage Expiry
Let's say you have defined three retention categories: 1 year, 3 years, and 7 years. Let's also assume that you have been archiving now for 8 years, you've read a little bit about storage expiry, and you've decided that the nagging emails from your compliance/legal team are finally going to lead you to take action.
You turn on storage expiry to run that very same evening.
This is bad.
It can be very, very bad!
The first time that storage expiry runs in this situation a huge load will be generated on your storage devices, on your Enterprise Vault servers, and in particular on the SQL servers hosting your Directory and Vault Store databases. The storage expiry process itself will be slow, simply due to the sheer volume of data that needs to be processed. It will take many runs of the storage expiry process before it has properly caught up with your retention categories.
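To get a feel for the size of that first-run backlog, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that items were archived at a steady rate over the whole period; the function name first_run_backlog is hypothetical and a real environment will not be this tidy.

```python
from typing import Dict


def first_run_backlog(years_archiving: float,
                      retention_years: Dict[str, float]) -> Dict[str, float]:
    """Fraction of each category's items already past retention on the first run.

    Assumes a constant archiving rate, so the expired share of a category is
    simply the expired span of time divided by the total span of archiving.
    """
    return {
        name: max(0.0, years_archiving - keep) / years_archiving
        for name, keep in retention_years.items()
    }


# The scenario from the text: 8 years of archiving, three retention categories.
backlog = first_run_backlog(8, {"1 year": 1, "3 years": 3, "7 years": 7})
for name, fraction in backlog.items():
    print(f"{name}: {fraction:.1%} of items eligible on the first run")
```

Under that assumption, 87.5% of the 1 year items, 62.5% of the 3 year items and 12.5% of the 7 year items are all waiting to be processed the first evening storage expiry runs.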
Of course, in environments where storage expiry wasn't really considered at the design stage, and where it isn't turned on until several years after data should have been expired, there might be an unavoidable backlog to work through in those first few runs of storage expiry.
Some of these things are avoidable though! Or at the very least you can lessen the impact of the process.
Best way to run Storage Expiry
A much better way to run storage expiry, if you have the chance, is to plan it in from the very beginning. That's right, start running storage expiry well before any of the data is even 1 year old. Of course nothing will be eligible for expiry, but it will have become such an ingrained part of the Enterprise Vault operations that it'll be second nature when it actually starts to process items and delete them. Also, by starting this way, when the process comes to delete items for 'real' the first time there will not be such a huge amount of data to process.
As we saw above, the things to take into consideration are:
- Run it outside of any mailbox archiving window
Storage Expiry tends to be a heavyweight beast when it comes to database activity, and it's largely accessing the same tables as mailbox archiving would be. For these reasons it is best to keep the two aspects of your archiving-world separate. Don't let storage expiry run during, or overlap into, mailbox archiving time.
- Give it as big a space as possible
As I just mentioned, it is heavyweight in terms of the load it places on your systems, so giving it as big a window as possible is the preferred way to run it. If you can, let it run all day Saturday and Sunday. Two big windows like this will allow for most, if not all, of the storage deletion to take place.
You are aiming for the gentle approach.
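Before committing to a schedule, the two points above can be sanity-checked with a small sketch like the following. It is plain Python, not an Enterprise Vault tool: the window times are assumptions made up for the example, and windows are modelled as (start day, start hour, length in hours) with Monday as day 0.

```python
HOURS_PER_WEEK = 7 * 24


def window_hours(start_day: int, start_hour: int, length_hours: int) -> set:
    """Hour-of-week slots covered by one window, wrapping past Sunday midnight."""
    first = start_day * 24 + start_hour
    return {(first + offset) % HOURS_PER_WEEK for offset in range(length_hours)}


def schedule_hours(windows) -> set:
    """All hour-of-week slots covered by a list of windows."""
    slots = set()
    for start_day, start_hour, length_hours in windows:
        slots |= window_hours(start_day, start_hour, length_hours)
    return slots


# Assumed example: mailbox archiving runs 20:00-06:00, Monday to Friday evenings.
archiving = [(day, 20, 10) for day in range(5)]

# First attempt: storage expiry all day Saturday and all day Sunday.
expiry_naive = [(5, 0, 24), (6, 0, 24)]
clash = schedule_hours(archiving) & schedule_hours(expiry_naive)

# Second attempt: start at 06:00 on Saturday and run through to Sunday midnight.
expiry_adjusted = [(5, 6, 42)]
clash_adjusted = schedule_hours(archiving) & schedule_hours(expiry_adjusted)

print(len(clash), "overlapping hours with the naive schedule")
print(len(clash_adjusted), "overlapping hours with the adjusted schedule")
```

With these made-up times the Friday evening archiving run spills into the first six hours of Saturday, so the naive weekend window overlaps it; starting storage expiry at 06:00 on Saturday removes the overlap while still leaving a 42 hour window.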
Things to avoid
The major things to avoid are:
- Not planning storage expiry properly
Hopefully with the information gleaned here you'll be able to factor this into your new Enterprise Vault implementations, and introduce it to existing environments too.
- Archiving schedules
Make sure you keep the storage expiry and mailbox archiving routines well separated. In some organisations this is relatively easy: mailbox archiving happens 5 evenings a week, therefore storage expiry can take place all day long on Saturday and Sunday. For other organisations which run 24x7, finding the 'right' time will be more tricky, but not impossible. It's just a matter of studying for a while what other activities are taking place in your environment, and when the user load is being placed on it too.
Summary
Storage Expiry in Enterprise Vault is the process of removing items from the Enterprise Vault system once their retention times have been reached. It's an essential part of the overall Enterprise Vault environment in order to meet both regulatory and compliance requirements as well as helping ensure that the storage requirements don't grow without limit forever. Proper implementation of retention categories, based on business needs, along with the appropriate deletion of archived items 'when it is time' is essential.
How do you implement retention and storage expiry? Let me know in the comments below...