Category Archives: Backup

EMC World 2011 – Las Vegas – day 1

So after the first day at EMC World, what marvels of technology have been announced?
What groundbreaking nuggets of geeky goodness are there to report? So, first things first: VPLEX! Looks like they may have cracked it.. active/active storage over synchronous distances. Geoclusters will never be the same again!! There was also a slightly ambiguous announcement around integration with open source Hadoop (more to follow on that).

What was the message of the day though? What was this year’s theme? This year EMC are talking about big data and the cloud. Clearly the recent acquisitions of Isilon and Greenplum have planted EMC’s head firmly back in the clouds. Greenplum gives end users the ability to scale out database architectures for data analytics to mammoth scale, with Greenplum’s distributed node architecture and massively parallel processing capabilities. To be frank, learning about the technology was borderline mind numbing, but my god it’s a cool technology. Then we have large scale-out NAS with Isilon and its OneFS system, giving the ability to present massive NAS repositories and scale NAS on a large scale. So obviously, EMC are talking about big data.

I also had the opportunity to sit in on an NDA VNX/VNXe session and what they’re going to do is….    aaah, I’m not that stupid. But needless to say, there are some nice additions on the way, the usual thing with higher capacity smaller footprint drives and getting more IO in less U space, but also some very cool stuff on the way which will enable EMC to offer a much cheaper entry point for compliance ready storage..  watch this space.

In true style, EMC threw out some interesting IDC-touted metrics, further justifying the need to drive storage efficiencies and reiterating the fact that there will always be a market for storage. So, our digital universe currently consists of 1.2 zettabytes of data, of which 90% is unstructured, and that figure is predicted to grow 44-fold over this decade. Also, 88% of Fortune 500 companies have to deal with botnet attacks on a regular basis and have to contend with 60 million malware variants. So, making this relevant, the 3 main pain points of end users are: firstly our age-old friend budget, then explosive data growth, and securing data.
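To put those IDC numbers into perspective, here is a quick back-of-envelope sketch. It assumes the 44-fold growth applies to the 1.2 zettabyte total and that "this decade" means ten years of steady compound growth; neither assumption comes from IDC, so treat the output as illustrative only.

```python
# Back-of-envelope only. Assumptions (mine, not IDC's): the 44x growth
# applies to the 1.2 ZB total and compounds evenly over ten years.
current_zb = 1.2
growth_factor = 44
years = 10
unstructured_share = 0.90

projected_zb = current_zb * growth_factor          # size at the end of the decade
annual_growth = growth_factor ** (1 / years) - 1   # implied compound annual growth
unstructured_now_zb = current_zb * unstructured_share

print(f"Projected size: {projected_zb:.1f} ZB")
print(f"Implied annual growth: {annual_growth:.0%}")
print(f"Unstructured today: {unstructured_now_zb:.2f} ZB")
```

Under those assumptions the implied growth works out at roughly 46% a year, which goes some way to explaining why data growth sits right next to budget on the pain point list.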

So how have EMC addressed these? Well, budget is always a fun one to deal with, but with efficiencies in storage by way of deduplication, compression, thin provisioning and auto-tiering of data, end users should get more bang for their buck. Also, with EMC easing up on the reins with pricing around Avamar and the low entry point of VNXe, this should help the case. Explosive data growth is again tackled with deduplication, compression, thin provisioning and auto-tiering of data, but also now with more varied ways of dealing with large volumes of data, with technologies such as Atmos, Greenplum and Isilon. Then there is the obvious acquisition of RSA to tie in with the security message, albeit that has had its challenges.

I was also recently introduced to the concept of a cloud architect certification track and the concept of a Data Scientist (god knows, but I’ll find out). So I went over to the proven professionals lounge and had a chat with the guys that developed the course. Essentially it gives a good foundation for steps to consider when architecting a company’s private cloud, around storage, virtualisation, networking and compute. If you’re expecting a consolidated course which covers the storage consolidate courseware, the Cisco DCNI2 and DCUCD courses and VMware Install, Configure, Manage, then think again, but it does set a good scene as an overlay to understanding these technologies. It also delves into some concepts around cloud service change management and control considerations and the concept of a cloud maturity model (essentially EMM, but more cloud specific). I had a crack at the practice exam and passed with 68%. Aside from not knowing the specific cloud maturity terms and EMC-specific cloud management jargon, anyone with knowledge of servers, Cisco Nexus and networking, plus virtualisation, shouldn’t have too many issues, but you may want to skim over the video training package.

There was also a nice shiny demo from the Virtual Geek, Chad Sakac, showing the new Ionix UIM 2.1 with vCloud integration, using CSC’s cloud service to demonstrate not only the various subsets of multi-tenancy, but also mobility between disparate systems. When they integrate with public cloud providers such as Amazon EC2 and Azure, then things will really hot up, but maybe we need some level of cloud standards in place? But we all know the problem with standards: innovation gives way to bureaucracy and slows up. Then again, with recent cloud provider issues, maybe it couldn’t hurt to enforce a bit of policy which allows the market to slow up a little and take a more considered approach to the public cloud scenario.. who knows?

Anyway.. watch this space..  more to come


Interestingevan on

A few weeks ago, myself and a few others were asked by Chris Mellor at The Register to provide our thoughts on whether replication could replace backup. Take a look at the link below to see the article:

More on the Commvault Cloud Connector

Following my previous post on Commvault catching the cloud bug, here’s a bit more information on the ins and outs of the Commvault Cloud Connector.

Firstly.. how much does it cost? Here’s the good news.. it’s free!! You use it in conjunction with standard disk licences or CDSO licences. Typically people are using this option for auxiliary (secondary or tertiary) copies of data. If you have existing backup-to-disk licences, you simply need to upgrade to Service Pack 4 of Simpana 8 and download the March 2010 update pack; the connector is in there. The only difference in setting up the maglib for cloud storage is that you have to input the username and password provided by your cloud storage provider. Currently this only supports Amazon S3 and Microsoft Azure, although Commvault have a big sales event coming up soon, so we’ll see if they announce any other supported providers there.

Remember that if you are interested in backing up or archiving to the cloud, you will still need standard disk or CDSO licences. If you want to do your primary backups to the cloud, you will still need a disk staging area locally if you want to use dedupe (advanced disk/CDSO), as deduplication will need to occur locally. Also, if you want to run auxiliary copies of deduped backup jobs, you will need to implement silo storage to facilitate this.

Commvault caught the cloud bug

In light of the fact that the storage community as a whole now has its head in the cloud, Commvault have decided to join the party. Commvault’s well-known claim to fame is that the same common platform is used for backup, archive, eDiscovery and replication. Commvault Simpana traditionally supports media such as disk or tape (as most backup vendors do), but they’ve now created a storage connector to support “cloud” storage as the backup media of choice. Commvault Simpana currently works with API sets from the likes of Amazon S3 and Microsoft Azure. Integration for EMC Atmos and Iron Mountain is soon to follow, according to Commvault’s press release.

Jeff Echols, CommVault’s director of cloud solutions, says:

The practical use of cloud storage for us is as an extension – a potential way to move unstructured data out of the data center. Simpana treats the cloud as just another tier of storage. Conceptually, it looks like another target.

Customers can take advantage of Simpana’s built-in features including data deduplication, encryption and eDiscovery in conjunction with the cloud as a “farline” tier of storage. Customers can also add Simpana Search to index data prior to sending it to a cloud provider.


For those of you who are unfamiliar with Commvault, please see the below document and video from Commvault.


For the full Commvault press release regarding this, click here.

EMC Avamar – Deduplication in Backup

With all the backup products in the market, how do you choose which product is suitable for any given requirement? Well, for this post, I shall introduce you to EMC Avamar. Avamar Technologies was acquired by EMC back in 2006 and provides efficiency in backup via deduplication at block level.

Avamar is predominantly positioned today as an appliance, with the Avamar software pre-installed on a Dell PowerEdge 2950. In the same fashion as any other backup product, agents are deployed on the systems to be backed up..  nothing new there. The intelligence comes into play where deduplication is concerned. Avamar agents will keep track of blocks which have been backed up and only send changed blocks of data over the network. This has a few benefits:

  1. Capacity efficiency. If you only change a word in a previously backed up document, you only back up the changed blocks..  not the whole document again.
  2. Network utilisation. End users become accustomed to the fact that their network will be hit hard during a backup window. With products like VMware, server sprawl is rife and you can end up really hammering your network. With Avamar only backing up changed data, network utilisation during backups is dramatically reduced.
  3. Remote offices. Many companies have remote offices dotted around with piddly little links. Block level changes will be significantly smaller than file level incremental changes, so bandwidth issues aren’t always as apparent with Avamar.
  4. Avamar plays best with customer data that has large commonalities (i.e. file data, OS library files, etc.). Fewer commonalities (i.e. database volumes, where the rate of change is greater) will mean a lower dedupe ratio.
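To make the "only send changed blocks" idea a bit more concrete, here is a minimal sketch of source-side deduplication. It is not Avamar's implementation: the fixed 4 KB block size, the `BackupServer` class and the `backup`/`restore` helpers are all hypothetical names made up for illustration, and a real product would chunk data far more cleverly.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen for illustration


class BackupServer:
    """Toy stand-in for the backup target: stores each unique block once."""

    def __init__(self):
        self.store = {}  # block hash -> block contents

    def has(self, digest):
        return digest in self.store

    def put(self, digest, block):
        self.store[digest] = block


def backup(data, server):
    """Send only blocks the server has not seen; return (recipe, bytes_sent)."""
    recipe, bytes_sent = [], 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if not server.has(digest):
            server.put(digest, block)
            bytes_sent += len(block)
        recipe.append(digest)  # the recipe lists the hashes needed for restore
    return recipe, bytes_sent


def restore(recipe, server):
    """Rebuild the original data from its list of block hashes."""
    return b"".join(server.store[digest] for digest in recipe)


server = BackupServer()

# Four distinct 4 KB blocks stand in for a previously unseen document.
original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
_, first_sent = backup(original, server)

# Change a single byte inside the third block and back up again.
edited = bytearray(original)
edited[2 * BLOCK_SIZE + 10] = 0xFF
recipe, second_sent = backup(bytes(edited), server)

print(f"First backup sent {first_sent} bytes, second sent {second_sent} bytes")
```

The first backup ships all four blocks, while the second ships only the one block that changed. Note that fixed-size blocks only cope well with in-place edits like this one; insert a byte near the start of a file and every later block shifts, which is one reason real dedupe engines tend not to rely on naive fixed offsets.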


Avamar appliances can be sold as single nodes (in which case you need a replicated pair of single nodes for EMC to support the solution) or as a RAIN solution which works in much the same way that RAID does. You have a parity node, capacity nodes and a spare node.
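For anyone who hasn't looked at how parity works, here is a small sketch of the generic single-parity XOR idea that RAID uses, applied to "nodes" rather than disks. How Avamar actually lays out parity across a RAIN grid isn't covered here, so take this purely as an illustration of the principle; the node contents and helper names are invented.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings together."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity(chunks):
    """Single parity: the XOR of all chunks."""
    return reduce(xor_bytes, chunks)


# Invented contents for three capacity nodes (all the same length).
capacity_nodes = [b"node-A__", b"node-B__", b"node-C__"]
parity_node = parity(capacity_nodes)

# Simulate losing the second node, then rebuild it from the survivors
# plus the parity node (for example onto the spare).
survivors = [capacity_nodes[0], capacity_nodes[2]]
rebuilt = parity(survivors + [parity_node])

print(rebuilt == capacity_nodes[1])
```

The point is simply that any one lost node can be recalculated from the others, which is why a parity node plus a spare lets the grid ride out a single node failure.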


If you come across an Avamar opportunity and want to have any level of accuracy in terms of the size of appliance required, these are the questions the reseller needs to be asking.

Capacity Questions

  • How much of the data is file data?
  • How much data is database data?
  • Is any data VMFS (VMware system)? If so, how much?
  • How much data is email data?
  • Is there any mail archive data? If so, how much?

For each of the above, what are the following:

  • Number of daily backups being retained
  • Number of weekly backups being retained
  • Number of monthly backups being retained (I would advise not retaining more than 3 months of data, as it becomes a very costly solution)
  • What is the daily rate of change for each of the above (% approx)?
  • What is the projected annual growth of data (% approx)?
  • How many sites are being backed up?
  • Data being backed up per site
  • Size of link between sites
Plus the obvious questions around how much they’re looking to spend, as the smallest change in an Avamar config can have potentially large cost implications.
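As a rough illustration of why the per-site data, rate of change and link size questions matter, here is a naive sketch of how long a nightly backup might take over a remote link. This is not EMC's sizing method: the function name and the example figures (500 GB, 2% daily change, 10 Mbps link) are made up, and it ignores dedupe savings, protocol overhead and anything else using the link.

```python
def nightly_transfer(site_data_gb, daily_change_pct, link_mbps):
    """Return (changed data in GB, hours to send it over the link).

    Naive estimate: assumes all changed data is sent, the whole link is
    available, and decimal units (1 GB = 8,000 megabits).
    """
    changed_gb = site_data_gb * daily_change_pct / 100
    changed_megabits = changed_gb * 8 * 1000
    seconds = changed_megabits / link_mbps
    return changed_gb, seconds / 3600


# Hypothetical remote office: 500 GB of data, 2% daily change, 10 Mbps link.
changed_gb, hours = nightly_transfer(500, 2, 10)
print(f"{changed_gb:.0f} GB changed, about {hours:.1f} hours to transfer")
```

Even with made-up numbers, it shows how quickly a small link becomes the limiting factor, and why getting approximate change rates per site up front saves a lot of pain later.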

See the below video for a more in-depth whiteboard session, courtesy of EMC: