2009-08-28

More Industry Alignment on Object Storage

In a discussion on one of EMC's blog entries, "The Future Doesn't have a File System", Paul Carpentier, of Centera fame, reiterated the need for an industry-wide, lightweight web-based standard for object-based storage access.

His initial thinking was as follows:

1. Unique identifiers; 128 bit, hex representation proposed
2. Object = immutable [Content + Metadata]; content is free format, metadata is free format, XML recommended
3. Simple access protocol; HTTP proposed; non-normative client libraries optional
4. READ and READ METADATA operation; (READ gets metadata and content)
5. WRITE and DELETE operation
6. Small set of standardized XML policy metadata constructs re service level, compliance, life cycle; TBD
7. Persisted Distributed Hash Table to allow variable identifier mapping; 128 bit to 128 bit; HTTP accessed

What is interesting is the degree to which this proposal is aligned with the work being done by the Storage Networking Industry Association in its Cloud Storage Technical Working Group. This working group is creating a new standard called the Cloud Data Management Interface, which is intended to provide a standardized method for access and management of cloud data using a lightweight RESTful access method.

While the draft standard is not quite released to the public, let's take a quick peek at how it compares to Mr. Carpentier's thoughts:

1. Unique identifiers; 128 bit, hex representation proposed

SNIA is proposing to use XAM XUIDs for identifiers, which allows vendors to innovate and define how their identifiers are composed, while still ensuring global uniqueness and the ability for any object ID to be managed by any vendor's system.

While a basic 128-bit identifier, such as a UUID, is simpler, it does not provide strong guarantees that it will be unique across cloud vendors, and this is critical for emerging cloud models such as cloud migration, federation, peering and interchange.
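To make the simpler option concrete, here is a minimal Python sketch of a 128-bit identifier in hex form, as in point 1 of Mr. Carpentier's list. It uses a random UUID from the standard library and says nothing about the XUID format; the function name is made up for illustration.

```python
import uuid

def new_object_id() -> str:
    """Return a random 128-bit identifier as 32 lowercase hex characters."""
    # uuid4() is random, so uniqueness is probabilistic, not coordinated
    # between vendors.
    return uuid.uuid4().hex

object_id = new_object_id()
print(object_id)
```

Nothing in such an identifier records who issued it, which is the gap a vendor-aware scheme is meant to close.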

2. Object = immutable [Content + Metadata]; content is free format, metadata is free format, XML recommended

While some vendors (such as Bycast) will implement the proposed standard by using immutable objects, the standard includes the optional ability to modify both object content and metadata for existing objects, without changing the object identifier.

Metadata will include both user-generated items and system-generated items, and will be represented using XML or JSON.
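As a rough illustration of the JSON option, the sketch below keeps user-generated and system-generated items apart and round-trips them through JSON. All field names here are hypothetical and are not taken from the draft standard.

```python
import json

# Hypothetical field names; the draft standard may name and group these differently.
object_metadata = {
    "user": {"project": "archive-2009", "owner": "alice"},
    "system": {"size_bytes": 1024, "created": "2009-08-28T00:00:00Z"},
}

def to_json(metadata: dict) -> str:
    """Serialize metadata to a stable, human-readable JSON string."""
    return json.dumps(metadata, indent=2, sort_keys=True)

encoded = to_json(object_metadata)
decoded = json.loads(encoded)  # a client would parse the same text back
print(encoded)
```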

3. Simple access protocol; HTTP proposed; non-normative client libraries optional

SNIA is using RESTful principles and the HTTP protocol as a foundation for the standard, and simplicity is a key design goal. Almost every part of the standard is optional, and the client can discover what parts of the standard are supported by any given implementation.

Client libraries that provide simplified language mappings are anticipated, but the goal is to enable full use of the standard with ordinary HTTP libraries.

4. READ and READ METADATA operation; (READ gets metadata and content)

The HTTP GET and HEAD operations map to these functions.

5. WRITE and DELETE operation

The HTTP PUT and DELETE operations map to these functions.
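The mapping of the four operations onto HTTP methods can be sketched with nothing more than Python's standard HTTP library. The URL layout, host name and helper name below are assumptions for illustration only; the requests are built but never sent.

```python
import urllib.request

# Operation names come from Mr. Carpentier's list; the HTTP methods are the
# ones named in the text above.
HTTP_METHOD = {
    "READ": "GET",
    "READ METADATA": "HEAD",
    "WRITE": "PUT",
    "DELETE": "DELETE",
}

def build_request(operation, base_url, object_id, body=None):
    """Build (but do not send) the HTTP request for a storage operation."""
    # Assumed layout: one URL per object, formed from a base URL and the ID.
    url = f"{base_url.rstrip('/')}/{object_id}"
    return urllib.request.Request(url, data=body, method=HTTP_METHOD[operation])

req = build_request("READ METADATA", "http://cloud.example.com/objects", "0123abcd")
print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen would perform the call against a real endpoint.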

If a cloud does not support mutable objects, then the cloud storage provider can indicate this to a client via the capabilities discovery interface, and any attempts to modify an existing object would fail.
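A client could consult the discovered capabilities before attempting an update. The sketch below assumes the capabilities have already been fetched into a dictionary; the key "modify_objects" and both names in the code are hypothetical, not taken from the draft standard.

```python
class ImmutableObjectError(Exception):
    """Raised when an update targets a cloud that only stores immutable objects."""

def check_can_modify(capabilities: dict, object_exists: bool) -> None:
    """Raise if writing would modify an existing object on an immutable cloud."""
    # "modify_objects" is a hypothetical capability name; a missing key is
    # treated as "not supported".
    if object_exists and not capabilities.get("modify_objects", False):
        raise ImmutableObjectError("this cloud does not support modifying objects")

capabilities = {"modify_objects": False}
check_can_modify(capabilities, object_exists=False)  # creating a new object is fine
```

Checking first only saves a round trip; the server would still reject the update on its own.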

6. Small set of standardized XML policy metadata constructs re service level, compliance, life cycle; TBD

SNIA is actively working on standardizing a set of "Data System Metadata", which allows a client to specify the level of service it desires from a cloud. Examples include maximum latency and degree of replication.

7. Persisted Distributed Hash Table to allow variable identifier mapping; 128 bit to 128 bit; HTTP accessed

This is outside of what the standard is proposing, but by using the included queue data object functionality, vendors can add functionality such as lookups and transformations. This allows extension by vendors in a standardized way, and allows them to take advantage of much of the common infrastructure provided by the standard.


In summary, I would encourage everyone who is interested in cloud storage or in the industry to take a look at the work that the SNIA is doing, and to get involved!
