
Friday, October 27, 2006

'Seeing What's Next' in the Storage Industry, Part I

I'm reading Clayton Christensen's latest book, Seeing What's Next. This is the third book in a series, after The Innovator's Dilemma and The Innovator's Solution. It builds on the theories from those two but, as he says in the preface, "Seeing What's Next shows how to use these theories to conduct an 'outside-in' analysis of how innovation will change an industry." Perfect. Let's do our homework by applying this to the storage industry - specifically array subsystems and the associated data services and storage networking products.

Clayton says that to identify significant change in an industry, look at three customer groups: Non-consumers, Undershot customers, and Overshot customers. I see all three of these in the storage industry.

Non-consumers and Storage
Clayton defines these as potential customers who are not buying today's products because they either can't afford them or for some reason lack the ability to use existing products or apply them to the problem they are trying to solve. Instead they go without, hire someone else to do the job, or cobble together their own 'less-than-adequate' solution. These customers can be important because they are where a new technology that appears sub-optimal to existing customers can mature to the point where it becomes attractive to the mainstream. Clayton calls this a 'new-market disruptive innovation.'

I see two such groups for storage subsystems and data services. One group has always been there - the small office/home office market. Most of these customers are still not buying $10k NAS or RAID servers, installing them at home, backing them up, remote mirroring, and so on. Instead they put data on a single HDD and maybe manually back up to a second drive (cobbling together a less-than-adequate solution), or they use an SSP such as Google or Apple's iDisk (hiring someone else to do it). A few players, such as Stonefly, and Zetera with its storage-over-IP technology, are pushing into this space, but I'm personally not seeing anything that justifies calling it a significant new-market disruption.

The more interesting group that meets the definition of non-consumers is the new group of High-Performance Technical Computing (HPTC) users. These users have big high-performance computing jobs to do, but their research budgets don't allow them to buy multimillion-dollar mainframes and storage subsystems. They have figured out how to build their own parallel computers using lots of commodity x64 rack servers harnessed together with customized Linux operating systems. Part of customizing Linux to run these highly parallelized compute jobs is using the Lustre file system. Lustre runs its own object storage protocol on top of commodity IP interconnects, so the object-based storage can effectively be shared among many compute nodes. Then, on the storage side, they cobble together their own storage servers from commodity hardware running Linux with the added Lustre target-side components.
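To make the object-layout idea concrete, here's a toy sketch - my own illustration, not Lustre code (Lustre itself is a kernel-level file system written in C, and its real stripe units default to around 1 MB) - of the core idea: a file is cut into fixed-size stripe units and assigned round-robin across several object storage targets (OSTs), so many compute nodes can read and write different parts of the same file in parallel.

```python
STRIPE_SIZE = 4  # bytes per stripe unit (tiny here for demonstration only)

def stripe(data: bytes, num_osts: int) -> list[bytes]:
    """Cut data into stripe units and assign them round-robin to OSTs."""
    osts = [bytearray() for _ in range(num_osts)]
    for i in range(0, len(data), STRIPE_SIZE):
        osts[(i // STRIPE_SIZE) % num_osts].extend(data[i:i + STRIPE_SIZE])
    return [bytes(o) for o in osts]

def reassemble(osts: list[bytes], total_len: int) -> bytes:
    """Client-side view: interleave the stripe units back into the file."""
    out = bytearray()
    offsets = [0] * len(osts)
    unit = 0
    while len(out) < total_len:
        ost = unit % len(osts)
        out.extend(osts[ost][offsets[ost]:offsets[ost] + STRIPE_SIZE])
        offsets[ost] += STRIPE_SIZE
        unit += 1
    return bytes(out)

# A 16-byte "file" striped across 3 OSTs: units ABCD, EFGH, IJKL, MNOP
# land on OST0, OST1, OST2, OST0 respectively.
parts = stripe(b"ABCDEFGHIJKLMNOP", 3)
print(parts)  # [b'ABCDMNOP', b'EFGH', b'IJKL']
```

The point of the layout is that a large sequential read fans out across all the OSTs at once, which is how commodity servers on commodity IP networks add up to supercomputer-class throughput.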

Much of this solution would be considered 'not good enough' by many mainstream enterprise storage customers - Ethernet is considered too slow, availability and data integrity are not sufficient, and it requires an assemble-it-yourself Linux storage array. As Clayton's books describe, however, this is exactly how many disruptive innovations start. In addition, the sustaining enhancements needed to make this acceptable to mainstream datacenters are in process in the open-source community and associated standards bodies. These include RPC over RDMA and 10G Ethernet, which will exceed current FC performance; improved availability and reliability through enhancements to NFS and pNFS; and sustaining Lustre enhancements driven by Cluster File Systems, Inc. I've talked to several managers of large datacenters who are interested in migrating key applications such as Oracle and web services from large mainframes to scalable rack servers and who are watching and waiting for the storage and SAN technology to support that migration. So, this one clearly looks like a new-market disruption in process.

Overshot Customers
These are customers for whom existing products more than meet their goals and who are not willing to pay more for new features. The customers in the soon-to-be-extinct segment called 'mid-range storage' meet this definition. They just want reliable RAID storage and MAYBE a few basic data services such as snapshots. I've talked to several ex-midrange customers who know there are low-end arrays that meet their reliability and availability goals, and they just want to know how cheaply they can get them.

The other giveaway that overshot customers exist is the rapid growth of companies supplying what used to be considered not-good-enough technology. We see this in the growth of NetApp and NAS storage. NAS has been considered sub-optimal for datacenter storage for several reasons: no RDMA, limited path failover, and data that can't be seamlessly migrated between, or striped across, NAS servers. NetApp's growth proves that more and more customers find these limitations acceptable. In parallel, the NFS and pNFS enhancements in process will remove these reliability and availability restrictions. So, I'm betting this is a low-end disruption that will just keep growing.
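To illustrate the path-failover gap in particular, here's a hypothetical sketch (the function names and addresses are my own invention, not any vendor's API) of the kind of client-side retry-across-redundant-paths logic that enterprise multipath stacks provide and that classic single-mount NAS access lacked:

```python
def request_with_failover(paths, do_request):
    """Try each network path in order; return the first successful response."""
    last_error = None
    for path in paths:
        try:
            return do_request(path)
        except ConnectionError as err:
            last_error = err  # this path is down; fail over to the next one
    raise RuntimeError("all paths failed") from last_error

# Simulated environment: the first path is unreachable, the second responds.
def fake_request(path):
    if path == "10.0.0.1":
        raise ConnectionError("path unreachable")
    return f"data from {path}"

print(request_with_failover(["10.0.0.1", "10.0.0.2"], fake_request))
# prints "data from 10.0.0.2"
```

With only one path to a NAS server, that first ConnectionError is an application outage; the pNFS work referenced above pushes this kind of multipath awareness into the protocol itself.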

Undershot Customers
Undershot customers are those who have trouble getting the job done with the products available today and who would be willing to pay more for added features that help. These are the customers companies love: they can invent new features and charge more for them. Storage has lots of undershot customers, driven by the need to comply with information laws while at the same time consolidating storage on the storage network and keeping it 100% available. The storage industry is putting a lot of energy into claims that it can add features to help these undershot customers.

The problem, though, as described in The Innovator's Solution (chapter 5) and in my post on Disruption and Innovation in Data Storage, is that sometimes a component designed to a modular interface can't effectively solve new problems because of the restrictions of a now-outdated standard interface. The inventors of the block interface standard never planned for today's data management problems or for the kind of highly intelligent, networked storage we have today. In a case like this, the competitive advantage shifts to an integrated company that can invent a new proprietary interface and provide new functionality on both sides of it (in this case, both the storage server AND the data client/app server). This has been a common theme of mine in this blog. Designing a storage system to meet these new requirements requires a new interface between app server and storage or, to state it in Clayton's terms, the competitive advantage has swung to companies with an interdependent architecture that integrates multiple layers of the stack across a proprietary interface.

I'm going to leave you with that thought for today and continue in Part II. In the meantime, think about EMC and the host software companies it has been acquiring, Microsoft and its move to the storage side with VDS, or Oracle and its investment in an array company (Pillar).