Michael Petrov
Co-Founder, CEO
Implementation and Approach to a New Project.


Imagine a client that needs shared NAS storage, specifically a CIFS share, with a total capacity requirement of 4 TB.



The client actively stores "working" files on the share, and those files are actively accessed for about 6 months. After that, the files are used only for occasional look-ups and other rare cases.

Like many companies, this client is very serious about its SLAs, emphasizing that these files and their data must not be lost. Additionally, the client requires the ability to recover files after corruption, accidental changes, deletions, or any other operational mistakes its employees may make.


The original implementation was built on the client's previous enterprise storage technology, an EMC Celerra.


For this client, we created 1 TB of "hot" storage on fast SAS disks; this saved money by not putting all of the data on fast disks. We also created 3 TB of SATA storage to hold files older than 6 months.


We set up the following maintenance/backup jobs:

1.       We created a PowerShell script that moved all files 6 months and older from the "hot" disks/volumes to the "archive" disks/volumes. The script also removed WRITE permission on those files for all users in the client's organization, so archived files could not be modified.

2.       We ran incremental backups of the "hot" volume every 3 hours using Backup Exec.

3.       We created a monthly full backup to tapes that were pulled and stored offsite.

4.       We also had a standby Celerra SAN, where we used CIFS-to-CIFS mirroring to meet the disaster-recovery requirements.
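The archival job in step 1 was a PowerShell script running against CIFS shares. As an illustration of the same logic, here is a minimal Python sketch; the mount points, the 6-month threshold expressed in seconds, and the permission-stripping via POSIX mode bits are assumptions for the sketch (the real script manipulated Windows ACLs):

```python
import os
import shutil
import stat
import time

HOT_ROOT = "/mnt/hot"          # hypothetical mount point of the "hot" volume
ARCHIVE_ROOT = "/mnt/archive"  # hypothetical mount point of the "archive" volume
AGE_LIMIT = 6 * 30 * 24 * 3600  # roughly 6 months, in seconds

def archive_old_files(hot_root=HOT_ROOT, archive_root=ARCHIVE_ROOT,
                      age_limit=AGE_LIMIT, now=None):
    """Move files not modified for `age_limit` seconds to the archive
    volume, preserving the directory tree, then mark them read-only."""
    now = time.time() if now is None else now
    moved = []
    for dirpath, _dirnames, filenames in os.walk(hot_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            if now - os.path.getmtime(src) < age_limit:
                continue  # still a "hot" file, leave it alone
            rel = os.path.relpath(src, hot_root)
            dst = os.path.join(archive_root, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.move(src, dst)
            # Strip write permission so archived files cannot be modified.
            mode = os.stat(dst).st_mode
            os.chmod(dst, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
            moved.append(rel)
    return moved
```

Note that the sketch preserves the hot volume's directory tree as it exists at archival time, which is exactly what later causes trouble when the hot tree is reorganized.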


Most of the requests we received for the "hot" volume were about finding and restoring overwritten or lost files.

To do so, we used the incremental backups to trace the latest good version of the files in question. Of course, this had its limits: restores were only accurate to within a 3-hour window, since that was how often the incremental backups ran.
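Tracing the latest good version across incremental backups can be modeled simply: treat each incremental run as a catalog keyed by its timestamp and scan newest-first. This is an illustrative model only, not Backup Exec's actual catalog format or API:

```python
def latest_good_version(catalogs, path, before):
    """Return (timestamp, entry) for the most recent backup of `path`
    taken at or before the `before` timestamp, or None if no
    incremental run ever captured that path.

    `catalogs` maps a run timestamp to a dict of {path: file entry},
    one dict per incremental backup run.
    """
    for ts in sorted(catalogs, reverse=True):   # newest run first
        if ts <= before and path in catalogs[ts]:
            return ts, catalogs[ts][path]
    return None
```

The gap between the returned timestamp and the requested time is the potential data loss; with incrementals every 3 hours, that gap is up to 3 hours, which is exactly the limitation described above.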


The Problem, or Problems I should say:

1.       The backup window was 3 hours, with a single incremental backup taking up to one hour. Reason: the volume contained more than half a million folders and more than 5 million files (do not ask me why...). And yes, we did use VTLUs, NDMP, and Celerra snapshots.

2.       From time to time the client changed the directory-tree structure of the "hot" volume, and those changes were not (and could not be) propagated to the "archive" volume. When a client employee then looked for a file in the "archive", he or she could not find it, because the search was based on the file's current location in the "hot" tree while the "archive" tree no longer matched. This created unnecessary restoration requests and troubleshooting. We therefore started logging every file's archival progress, which in turn meant the client needed a GUI to look up where each file had actually been archived.

* In every single case we were able to find the file in the archive and point to the directory structure change.
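A minimal sketch of such an archival log, assuming a simple CSV format with one row per archived file (the post does not describe the real log format or the GUI, so all names here are hypothetical):

```python
import csv
import os

def record_archival(log_path, hot_path, archive_path):
    """Append one row per archived file: where it lived on the "hot"
    volume at archival time, and where it landed in the archive."""
    with open(log_path, "a", newline="") as f:
        csv.writer(f).writerow([hot_path, archive_path])

def find_in_archive(log_path, filename):
    """Return every (original hot path, archive path) pair whose file
    name matches, so a user can locate a file even after the hot
    directory tree has been reorganized."""
    hits = []
    with open(log_path) as f:
        for hot_path, archive_path in csv.reader(f):
            if os.path.basename(hot_path) == filename:
                hits.append((hot_path, archive_path))
    return hits
```

A GUI over this log only needs to search by file name and display both paths, which is what resolved the "cannot find it in the archive" tickets.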


So the technology setup using EMC Celerra worked, but it had its limitations. Those limitations pushed the client into a project to find "archival" software, which meant interviewing backup companies, and the client was in turn dragged away from the original concept by so-called "solution providers". It certainly created a situation where we felt we had to offer something better.


Proposed Solution:

Since the client had moved to the VNX platform, we offered the following solution to address all of these limitations.

1.       Create one big disk pool containing SSD, SAS, and SATA disks, for a total of 4 TB: 1 TB on fast disks and 3 TB on slow disks. This is not a huge expense, as disks are much bigger today. We then enable FAST technology on this pool so that the EMC VNX can move data between fast and slow disks automatically. This solves the archiving problem: if movement between fast and slow disks happens in the background, there is no need for the archival script. We can keep all files in one volume and never archive anything, and if the client changes the folder-tree structure, there is nothing to synchronize.

2.       When we analyzed the actual usage of the backups, we realized they were effectively serving as version control; the backups existed to cover for the lack of a real version-control system. If we had something like SVN, SharePoint, or SourceSafe, we could easily have avoided the incremental backups. So for the "backup" solution we proposed regular EMC snapshots. When the client needs to restore a file, we mount a snapshot as a separate NAS share and give access to the particular user, so the user can see exactly how the data looked at the "suspected" time. We can take as many of those snapshots as we want, and with enough storage we could keep them for a whole year.
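The snapshot-based restore flow above boils down to picking the snapshot taken just before the "suspected" time and mounting it for the user. A minimal sketch of the selection step, assuming snapshots are identified by plain timestamps (actual EMC snapshot naming and mounting are done through the array's own tools):

```python
import bisect

def snapshot_for(snapshots, suspected_time):
    """Pick the most recent snapshot taken at or before the suspected
    time; that snapshot would then be mounted read-only as a separate
    share for the user to browse. Returns None if no snapshot is old
    enough to cover the requested time."""
    snaps = sorted(snapshots)
    i = bisect.bisect_right(snaps, suspected_time)
    return snaps[i - 1] if i else None
```

With snapshots taken on a regular schedule, the worst-case distance between the suspected time and the mounted snapshot equals the schedule interval, the same trade-off as with the old 3-hour incremental window.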


After we presented this solution, the client's CEO and CIO both agreed that this was the way to go. We are currently planning the implementation.

We will post updates on the development and results of the project as it progresses.




Zezar on 7/26/2012 7:52:18 AM

I keep code in SCM, either SVN or increasingly git. Mail is with IMAP and Google. I do use offlineimap to back up mail from IMAP to disk. I still need to get Gmail's POP to do the same, but I am a little worried they won't play together nicely. As for my multimedia: I've recently bought myself an NSLU2. I plan to put as large a USB hard drive as possible on there and back up to it. As for backing that up, I hope LVM2 will suffice. Otherwise, perhaps another similar setup with rsync.

