2009-03-09

Object Storage, Part 3 - Explicit and Implicit Policies

Once metadata becomes an intrinsic part of each stored object, application-specified metadata provides a rich vocabulary through which applications can communicate with the underlying storage system, and through which administrators can manage the stored objects.

As part three of the object storage series of posts, this entry covers the ability to specify explicit and implicit policies that give both the application and the administrator control over how data is managed, and that drive additional value in the storage subsystem. This entry builds on the last entry, Object Storage - Metadata, which introduced the importance of metadata and why it applies to storage and to object storage in particular.

More Than Just a Bit Bucket

When people first think about storage, they think about bits. And that's fundamentally what storage systems do. They take your bits, keep them, and give them back to you. But if that's all a storage system does, it's pretty dumb, as there is a lot more to what applications need than just storing bits.

Storage is also more than the storage system and applications. The storage administrator is an important player in enterprise storage too, and is often charged with goals that may or may not agree with the desires of the application.

So, if storing bits is "Dumb Storage", what is "Intelligent Storage"? Well, an easy way to answer is to list some of the many things that applications and administrators want their storage system to do for them:
  • Index
  • Protect
  • Share
  • Compress
  • Replicate
  • Distribute
  • Archive
  • Cache
  • Tier
  • Version
And this list is just the beginning.

Explicit Policies

For the higher-level behaviours desired by administrators and applications to be fulfilled, they first need to be communicated to the storage system. And metadata fulfils this role perfectly.

If an application wishes for a given piece of stored data to be protected such that only that application can access it, it needs only to attach metadata to the object that indicates this intent, and trust that the storage system will honour its request. This agreement between the application and the storage system is the contract of functionality.

Want multiple copies? Add metadata. Want it shredded on delete? Add metadata. Want index keywords? Add metadata. Etc.

Thus, through a vocabulary of well-defined metadata that it will honour, the capabilities of a storage system can be advertised to an application. And the sum of this metadata forms an explicit policy, specified by the application to the storage system, as an atomic part of the stored object.
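As a rough illustration of the idea, the sketch below models an explicit policy as metadata attached to an object at store time. Everything in it is hypothetical: the directive names (`copies`, `shred_on_delete`, `index_keywords`, `owner_only`), the `StoredObject` type, and the `store` function are invented for this example and do not come from any particular storage product.

```python
from dataclasses import dataclass, field

# Hypothetical vocabulary of directives that this imaginary storage
# system advertises and promises to honour.
SUPPORTED_DIRECTIVES = {"copies", "shred_on_delete", "index_keywords", "owner_only"}


@dataclass
class StoredObject:
    data: bytes
    # The explicit policy travels with the object as part of its metadata.
    metadata: dict = field(default_factory=dict)


def store(data: bytes, **directives) -> StoredObject:
    """Accept an object only if every directive is in the advertised vocabulary."""
    unknown = set(directives) - SUPPORTED_DIRECTIVES
    if unknown:
        # Refusing unknown directives keeps the contract honest: the system
        # never silently ignores an intent that it cannot fulfil.
        raise ValueError(f"unsupported directives: {sorted(unknown)}")
    return StoredObject(data=data, metadata=dict(directives))


# Want multiple copies? Want it shredded on delete? Want index keywords?
invoice = store(
    b"...",
    copies=2,
    shred_on_delete=True,
    index_keywords=["invoice", "2009"],
)
```

Rejecting unknown directives is one possible design choice here. A real system might instead store unrecognised metadata untouched and simply not act on it.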

Implicit Policies

But this isn't the only way that policies can work. While the application knows best from its perspective, it is only one small and limited part of an enterprise. There are larger forces at work: desires to ensure that data is not lost in a disaster, to reduce costs, to meet legal obligations, and to manage information over time and space.

Enter the storage administrator.

As with explicit policies, implicit policies are built around metadata. But instead of an intent being stated explicitly as metadata directives, an implicit policy maps an intent onto a collection of objects with common characteristics.

Let's imagine that the storage administrator wishes to ensure that critical financial documents are protected against a site disaster, and are retained for a minimum of ten years. The administrator can create an implicit policy that says:

For all objects with metadata indicating that they are financial documents, make two copies, one remote, and keep them for ten years.

Once again, the metadata is key. The metadata might be a path in a file system, or a document type, or the division within an organization. Regardless, the administrator now has a tool to take subsets of stored data, limited only by their imagination and the available metadata, and make things happen.
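The financial-documents rule above could be sketched roughly as follows. This is only an illustration under assumed names: `ImplicitPolicy`, `actions_for`, the `document_type` metadata key, and the action names are all hypothetical, and a real system would likely support richer matching than exact key/value equality.

```python
from dataclasses import dataclass


@dataclass
class ImplicitPolicy:
    name: str
    match: dict    # metadata key/value pairs an object must carry
    actions: dict  # what the storage system should do for matching objects

    def applies_to(self, metadata: dict) -> bool:
        # An object is selected when it carries every required key/value pair.
        return all(metadata.get(key) == value for key, value in self.match.items())


# The administrator's rule: two copies, one remote, kept for ten years.
finance = ImplicitPolicy(
    name="Finance",
    match={"document_type": "financial"},
    actions={"copies": 2, "remote_copies": 1, "retention_years": 10},
)


def actions_for(metadata: dict, policies: list) -> dict:
    """Collect the actions of every policy whose selector matches the object."""
    merged = {}
    for policy in policies:
        if policy.applies_to(metadata):
            merged.update(policy.actions)
    return merged


report = {"document_type": "financial", "division": "EMEA"}
memo = {"document_type": "memo"}

print(actions_for(report, [finance]))  # the Finance actions apply
print(actions_for(memo, [finance]))    # no policy matches, so no actions
```

Note that neither object says anything about copies or retention. The behaviour comes entirely from the administrator's rule and the descriptive metadata the objects already carry.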

And unlike explicit policies, implicit policies can be changed without having to change the metadata of the stored objects. By combining both types of policies, an application can specify metadata (an explicit policy) that selects which implicit policy is to be used. And all of these approaches can be combined: An application may say that this object "Must be high performance", that the retention is "governed by implicit policy named Finance", and say nothing about replication.
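One way to picture the combination is a small resolver that works out the effective setting for each attribute. The sketch below is an assumption-laden illustration, not how any specific product behaves: the attribute names, the `{"policy": ...}` reference convention, the system defaults, and the precedence order (explicit value, then named implicit policy, then default) are all invented for the example.

```python
# Hypothetical implicit policies, maintained by the administrator.
POLICIES = {"Finance": {"retention_years": 10, "copies": 2}}

# Hypothetical system defaults, used when nothing else is specified.
DEFAULTS = {"performance": "standard", "retention_years": 0, "copies": 1}


def resolve(metadata: dict, policies: dict = POLICIES, defaults: dict = DEFAULTS) -> dict:
    """Compute the effective setting for every known attribute."""
    effective = {}
    for attribute, default in defaults.items():
        value = metadata.get(attribute, default)
        if isinstance(value, dict) and "policy" in value:
            # The object defers this attribute to a named implicit policy.
            value = policies[value["policy"]][attribute]
        effective[attribute] = value
    return effective


# "Must be high performance", retention governed by the implicit policy
# named Finance, and nothing said about replication.
object_metadata = {
    "performance": "high",
    "retention_years": {"policy": "Finance"},
}

print(resolve(object_metadata))
# {'performance': 'high', 'retention_years': 10, 'copies': 1}
```

If the administrator later changes the Finance policy's retention to seven years, `resolve` returns the new value for the same object metadata, which is the point made above: the implicit policy changes without touching the stored objects.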

As can be imagined, this approach to storage management is very powerful, and allows the value of storage to be expressed in terms of business values to an organization. In our next entry, we will look at some concrete examples of explicit and implicit policies, and see how these are often implemented in object storage systems.
