Vanishing Into the Infrastructure

2010-06-01

Not the End, Just the End of a Beginning

Today, I am proud to announce my new blog as part of the NetApp blogging community: Objects in Context.

I am very pleased to join the NetApp team, and to have this opportunity to start writing a second chapter. January 2010 marked the passage of a decade since I first started working on object storage, and we've seen many milestones, from the first large-scale deployments to the development of industry standards. Over the next few years, I fully expect things to accelerate. After all, the Internet changes everything, and storage is just starting to catch up.

Moving forward, I will be repurposing this blog to discuss topics such as information visualization, radio design, iPad development and other similar non-work-related subjects. I encourage all my readers to continue to follow my adventures in object storage over at the NetApp blog. We've got many exciting things in store, and I can assure you that in this case, one plus one is far greater than two.

2010-03-30

CDMI Functional Areas

With the 1.0 release of the Cloud Data Management Interface standard quickly wrapping up, below is my summary of the major functional areas covered in the 1.0g draft:

Object Access by Name - Sections 8 and 9

Cloud storage clients can store, list and retrieve objects by name. This is the most common method for accessing objects, and allows objects to be placed into "containers" that group like objects together, in much the same way as directories in a filesystem.

For more details, see CDMI Tutorial - Basic Input/Output.

Object Access by ID - Sections 8 and 9

Cloud storage clients can also store and retrieve objects by Object ID. Every named object also has an Object ID, but not all objects have a name. Objects created without a name can only be accessed by Object ID. Storing Object IDs in a database is more efficient than storing URIs, and this mode of object access is better suited to ID-based or query-based access.

Data System Metadata - Section 16.4

Data System Metadata is a set of special metadata items that are interpreted by the cloud in order to allow the cloud storage client to specify the level of service, protection, placement and other characteristics for stored objects.

For more details, see CDMI Tutorial - Data Management, Part 1.

Serialization - Section 15

Serialization allows objects and containers to be transformed into a portable data object that can be deserialized back into the original objects and containers. This is useful for archiving and for system-to-system transport.

Domains - Section 10

Domains specify administrative control within a CDMI cloud. They define how users are mapped to permissions, specify how authentication and authorization are delegated, and provide usage summaries.

Queues - Section 11

Queues are special data objects that store multiple values with first-in, first-out access semantics. Queues represent a key technology for connecting together applications in the cloud, as they enable reliable inter-process communication with underlying persistent data storage.

Query - Section 11.1.3

Query is implemented as a queue interface that allows a client to express a query in a standardized manner, and allows a query engine to return the results as a CDMI data object in a specific format.

Notification - Section 11.1.1

Notification builds on top of queues, and allows a client to subscribe to a client-defined set of notifications about operations performed against the storage cloud.

Audit - Section 11.1.2, 17

Audit builds on top of queues, and allows a client to subscribe to a client-defined set of log messages about system operations.

Exports - Section 13

Exports allow a client to specify and control how access to named objects is provided through network file protocols. Exports also allow a container to be exported as a block device.

Snapshots - Section 14

Snapshots allow a client to specify that access to a set of named objects and containers should be preserved.

Retention - Section 18

Retention and Hold are a set of functions that allow a client to specify that an object may not be modified or deleted, and for how long the restrictions must remain in place.

2010-02-24

The World's Shortest CDMI Implementation

The CDMI standard is built around the concept of "capabilities", which describe the functionality a given cloud storage system provides to clients. As a result, a system can implement only a small subset of the standard and still be compliant.

So, to demonstrate this, I present to you the world's shortest* CDMI implementation, written in Ruby:

# World's smallest CDMI implementation, or, how to say NO in CDMI.
require 'socket'
require 'openssl'

listener = TCPServer.new('', 2000)

# Set up TLS
ssl_context = OpenSSL::SSL::SSLContext.new()
# File.read returns the PEM text and closes the file, rather than leaving a handle open.
ssl_context.cert = OpenSSL::X509::Certificate.new(File.read("cdmi_server.cert"))
ssl_context.key = OpenSSL::PKey::RSA.new(File.read("cdmi_server.key"))
ssl_listener = OpenSSL::SSL::SSLServer.new(listener, ssl_context)

while (connection = ssl_listener.accept)
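 # Read the request headers up to the blank line; stop if the client disconnects.
 request = String.new
 while(request.index("\n\n") == nil)
  line = connection.gets
  break if line.nil?
  # Accept CRLF line endings from standard HTTP clients as well as bare LF.
  request << line.sub("\r\n", "\n")
 end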

 print "-- CLIENT REQUEST -------------------------------------------------------------\n"
 print request
 print "-------------------------------------------------------------------------------\n"

 if(request.index("GET ") == 0)
  uri = request.slice(request.index(" ") + 1, request.length)
  uri = uri.slice(0, uri.index(" "))
  
  if(uri == "/")
   connection.puts("HTTP/1.1 200 OK\nContent-Type: application/vnd.org.snia.cdmi.container+json\nX-CDMI-Specification-Version: 1.0\n\n{\"objectURI\" : \"/\", \"objectID\" : \"AABwbQAQgTpfe4qRBsyCCw==\", \"parentURI\" : \"/\", \"capabilitiesURI\" : \"/cdmi_capabilities/\", \"completionStatus\" : \"Complete\", \"metadata\" : {}, \"childrenrange\" : \"0-0\", \"children\" : [\"cdmi_capabilities/\"]}")
  elsif(uri == "/cdmi_capabilities" || uri == "/cdmi_capabilities/" )
   connection.puts("HTTP/1.1 200 OK\nContent-Type: application/vnd.org.snia.cdmi.capabilities+json\nX-CDMI-Specification-Version: 1.0\n\n{\"objectURI\" : \"/cdmi_capabilities/\", \"objectID\" : \"AABwbQAQnP7GJT2muKDelQ==\", \"parentURI\" : \"/\", \"capabilities\" : {\"cdmi_security_https_transport\" : \"true\", \"cdmi_read_metadata\" : \"true\", \"cdmi_list_children\" : \"true\"}, \"childrenrange\" : \"\", \"children\" : []}")
  else
   connection.puts("HTTP/1.1 404 Not Found\n\n")
  end
 else
  connection.puts("HTTP/1.1 501 Not Implemented\n\n")
 end
    
 connection.close
end

* It's a little longer than it could be, because it is written for readability. Removal of comments, code tightening, etc., is left as an exercise for the reader.

Now, since this uses TLS, you'll need a TLS-compatible client. Here's a test client for this purpose:

# Test client for a minimal CDMI implementation
require 'socket'
require 'openssl'

socket = TCPSocket.new('localhost', 2000)

ssl_context = OpenSSL::SSL::SSLContext.new()
ssl_socket = OpenSSL::SSL::SSLSocket.new(socket, ssl_context)
ssl_socket.sync_close = true
ssl_socket.connect

ssl_socket.puts("GET #{ARGV[0]} HTTP/1.0")
ssl_socket.puts("accept: application/vnd.org.snia.cdmi.object+json")
ssl_socket.puts("X-CDMI-Specification-Version: 1.0")
ssl_socket.puts("")

print "-- SERVER RESPONSE ------------------------------------------------------------\n"
while line = ssl_socket.gets
 print line
end
print "-------------------------------------------------------------------------------\n"

In order to run these, you need an X.509 certificate and key. You can generate these using OpenSSL. Follow the instructions below for Apache, then instead of step 5, copy the certificate to "cdmi_server.cert", and copy the key to "cdmi_server.key".

http://www.akadia.com/services/ssh_test_certificate.html
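
On the client side, the capabilities object returned by a server like the one above can be inspected before attempting an operation. The following is a minimal sketch, assuming the response body has already been separated from the HTTP headers. The helper name supports? is hypothetical, the JSON is a trimmed copy of the capabilities response from the server above, and "cdmi_create_dataobject" is used only as an example of a name that server does not advertise.

```ruby
require 'json'

# Trimmed copy of the capabilities body returned by the server above.
body = '{"objectURI" : "/cdmi_capabilities/", "capabilities" : ' \
       '{"cdmi_security_https_transport" : "true", ' \
       '"cdmi_read_metadata" : "true", "cdmi_list_children" : "true"}}'

# Hypothetical helper: a capability counts as offered only if it is present
# and set to the string "true", as in the response above.
def supports?(capabilities, name)
  capabilities[name] == "true"
end

capabilities = JSON.parse(body).fetch("capabilities")

# Advertised by the server above.
puts supports?(capabilities, "cdmi_list_children")
# Not advertised, so a client should not attempt the operation.
puts supports?(capabilities, "cdmi_create_dataobject")
```

Treating an absent capability the same as one that is not "true" keeps the client conservative: it only uses functionality the server has explicitly said it provides.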

2010-02-23

A CDMI Tutorial - Data Management, Part 1

As we covered in the first part of this tutorial, the SNIA Cloud Data Management Interface provides a RESTful mechanism for the basic storage and retrieval of data. However, the core of the standard is focused on data management: how objects are stored, delivered, placed, protected and more.

This post is the second in a series on CDMI. The series covers the following areas:

Basic Input/Output
Data Management, Part 1 (This post)
Data Management, Part 2
Advanced Input/Output
Cloud-to-Cloud Interactions
Queues and Query
Authentication and Access Control
Billing and Accounting

Data Management and Metadata

CDMI storage systems see the world as a tree of objects and containers. Objects store data, and containers contain child objects and containers. Regardless of whether the data is stored via HTTP or via traditional protocols such as NFS or CIFS, all data represented through the CDMI protocol is fundamentally seen in terms of objects and containers.
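
In the example listings in these posts, a container's children are returned by name, and child containers carry a trailing slash. A minimal sketch of splitting a listing on that convention follows. The convention is inferred from the examples here rather than quoted from the standard, and the names are sample data.

```ruby
# Children as they might be listed for a container. Names ending in "/" are
# taken to be child containers; this is an assumption based on the examples.
children = ["Financials/", "CDMI_Spec.pdf", "hello.txt"]

# partition returns two arrays: entries matching the block, then the rest.
containers, objects = children.partition { |name| name.end_with?("/") }

puts "containers: #{containers.inspect}"
puts "objects: #{objects.inspect}"
```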

Management of stored data is enabled through metadata. Every container and object can have metadata associated with it. Metadata in CDMI is organized into three general categories: user metadata, storage system metadata, and data system metadata.

User Metadata

User metadata is set directly by CDMI clients, or indirectly through the extended metadata interfaces of other access protocols. For example, in NFS, extended attributes can be mapped to CDMI user metadata items. User metadata items are arbitrary, and are not interpreted by the storage system.

Storage System Metadata

Storage system metadata are generated by the CDMI storage system, and provide read-only access to information about the stored data that is managed by the storage system. The creation time of an object is a good example of a storage system metadata item.

Data System Metadata

Data system metadata are provided by a CDMI client, or specified through an out-of-band management interface, and determine how the stored data should be managed. For example, data system metadata can specify the degree of replication, or an encryption level desired to protect data while stored on disk.

It is through the specification of data system metadata that CDMI enables the management of how data should be stored.

Example Object Metadata

As an example, let's assume that we have a CDMI-enabled storage system that also provides a NFS share, and we've stored some documents onto it.

We can then connect to the system via CDMI, and access the CDMI metadata of the "Documents" container:

GET /Documents/ HTTP/1.1
Host: cloud.example.com
Content-Type: application/vnd.org.snia.cdmi.object+json
X-CDMI-Specification-Version: 1.0

HTTP/1.1 200 OK
Content-Type: application/vnd.org.snia.cdmi.container+json
X-CDMI-Specification-Version: 1.0

{
  "objectURI" : "/Documents/",
  "objectID" : "AABwbQAQ8ypO85j/ml8TZQ==",
  "parentURI" : "/",
  "accountURI" : "/cdmi_accounts/default_account/",
  "capabilitiesURI" : "/cdmi_capabilities/container/",
  "completionStatus" : "Complete",
  "metadata" : {
    "user.DOSATTRIB": "0x20",
    "cdmi_ctime" : "2009-12-29T12:43:32.479832Z",
    "cdmi_atime" : "2010-01-02T16:12:53.521983Z",
    "cdmi_mtime" : "2010-01-02T16:12:53.521983Z",
    "cdmi_acount" : "52",
    "cdmi_mcount" : "12",
    "ACL" : {
      "acetype" : "0x00",
      "identifier" : "jdoe",
      "aceflags" : "0x03",
      "acemask" : "0x000F005F",
      "acetime" : "2009-12-29T12:43:32.479832Z" 
    },
    "cdmi_data_redundancy": "2",
    "cdmi_immediate_redundancy": "2",
    "cdmi_infrastructure_redundancy": "2",
    "cdmi_geographic_placement": [
        "US" 
    ],
    "cdmi_encryption": "AES_256_CBC",
    "cdmi_data_redundancy_billed": "2",
    "cdmi_immediate_redundancy_billed": "2",
    "cdmi_infrastructure_redundancy_billed": "1",
    "cdmi_geographic_placement_billed": [
        "US" 
    ],
    "cdmi_encryption_billed": "AES_256_CTR" 
  },
  "childrenrange" : "1-3",
  "children" : [
    "Financials/",
    "CDMI_Spec.pdf",
    "hello.txt" 
  ]
}

There are quite a few metadata items here, and it looks complex, but it's not as bad as it first appears. To see what these metadata items mean, and how they are used to manage stored data, let's review them individually:

User Metadata Items

The directory (container in CDMI speak) has a single user metadata item:

      "user.DOSATTRIB": "0x20",

This user metadata item is an extended attribute that indicates the archive bit is set in the DOS mode of a directory. In this case, the item was created when the directory was created by Samba, and the storage server presents extended attributes on the filesystem as user metadata items.

Storage System Metadata Items

The directory has five storage system metadata items, and an ACL, which is an example of a more complex storage system metadata item:

        "cdmi_ctime" : "2009-12-29T12:43:32.479832Z",
        "cdmi_atime" : "2010-01-02T16:12:53.521983Z",
        "cdmi_mtime" : "2010-01-02T16:12:53.521983Z",
        "cdmi_acount" : "52",
        "cdmi_mcount" : "12",
        "ACL" : {
            "acetype" : "0x00",
            "identifier" : "jdoe",
            "aceflags" : "0x03",
            "acemask" : "0x000F005F",
            "acetime" : "2009-12-29T12:43:32.479832Z" 
        },

In this example, the first three metadata items contain the creation, last access and last modification times, respectively, and the remaining two show the number of accesses and the number of modifications since creation. The ACL metadata specifies the access control restrictions for the folder, and is based on NFSv4 ACLs.
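
Since the ACL is based on NFSv4 ACLs, the "acemask" value can be read as a set of permission bits. The sketch below decodes the mask from the example using the access mask constants defined for NFSv4 in RFC 3530. That CDMI uses exactly the same bit assignments is an assumption made for illustration, and decode_acemask is a hypothetical helper name.

```ruby
# Access mask bits as defined for NFSv4 ACLs (RFC 3530). Assumption: the
# CDMI "acemask" uses the same bit assignments.
ACE_MASK_BITS = {
  0x00000001 => "READ_DATA",
  0x00000002 => "WRITE_DATA",
  0x00000004 => "APPEND_DATA",
  0x00000008 => "READ_NAMED_ATTRS",
  0x00000010 => "WRITE_NAMED_ATTRS",
  0x00000020 => "EXECUTE",
  0x00000040 => "DELETE_CHILD",
  0x00000080 => "READ_ATTRIBUTES",
  0x00000100 => "WRITE_ATTRIBUTES",
  0x00010000 => "DELETE",
  0x00020000 => "READ_ACL",
  0x00040000 => "WRITE_ACL",
  0x00080000 => "WRITE_OWNER",
  0x00100000 => "SYNCHRONIZE"
}.freeze

# Hypothetical helper: returns the names of the bits set in a hex mask string.
def decode_acemask(hex_string)
  mask = Integer(hex_string, 16)
  ACE_MASK_BITS.select { |bit, _name| (mask & bit) != 0 }.values
end

# The mask granted to "jdoe" in the example above.
puts decode_acemask("0x000F005F")
```

For a container, the NFSv4 definitions give the first three bits directory meanings (list directory, add file, add subdirectory); the sketch prints only the file-oriented names to stay short.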

Data System Metadata Items

Finally, we have the data system metadata items. These items are specified by a CDMI client, or an out-of-band management application, and specify how the data in the "Documents" folder should be managed:

"cdmi_data_redundancy": "2",
"cdmi_immediate_redundancy": "2",
"cdmi_infrastructure_redundancy": "2",
"cdmi_geographic_placement": [
    "US"
],
"cdmi_encryption": "AES_256_CBC",

The first data system metadata item, "cdmi_data_redundancy", indicates how many independent copies of the stored data should be kept. In this case, it has been set to two, which means that two copies of the data should be stored. Likewise, "cdmi_immediate_redundancy" indicates that two copies should be provided synchronously, "cdmi_infrastructure_redundancy" indicates that the two copies should be located in separate failure domains, "cdmi_geographic_placement" indicates that the copies should remain within the United States, and "cdmi_encryption" indicates that AES with a 256-bit key in cipher block chaining (CBC) mode should be used to protect the data.

It is important to note that data system metadata items express the desired management behaviour. This is separate from the actual management behaviour. In order to indicate to a client what management behaviours are actually being provided, CDMI provides a matching series of data system metadata items, ending with the suffix "_billed":

"cdmi_data_redundancy_billed": "2",
"cdmi_immediate_redundancy_billed": "2",
"cdmi_infrastructure_redundancy_billed": "1",
"cdmi_geographic_placement_billed": [
    "US"
],
"cdmi_encryption_billed": "AES_256_CTR",

In this case, the billed metadata indicates that the storage system is able to meet the requested data redundancy and the requested immediate redundancy, but is not able to provide the requested infrastructure redundancy or the requested encryption method. This allows a client to discover whether the requested data system services are being provided.
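
A client can make this comparison mechanically by pairing each requested item with its "_billed" counterpart. The following is a minimal sketch using the two sets of values shown above. The helper name unmet_requests is hypothetical, and it uses plain equality, so a system that billed more than was requested would also be reported as a mismatch.

```ruby
# Requested data system metadata, as set on the "Documents" container.
requested = {
  "cdmi_data_redundancy" => "2",
  "cdmi_immediate_redundancy" => "2",
  "cdmi_infrastructure_redundancy" => "2",
  "cdmi_geographic_placement" => ["US"],
  "cdmi_encryption" => "AES_256_CBC"
}

# What the storage system reports it is actually providing.
billed = {
  "cdmi_data_redundancy_billed" => "2",
  "cdmi_immediate_redundancy_billed" => "2",
  "cdmi_infrastructure_redundancy_billed" => "1",
  "cdmi_geographic_placement_billed" => ["US"],
  "cdmi_encryption_billed" => "AES_256_CTR"
}

# Hypothetical helper: names of requested items whose billed value differs.
def unmet_requests(requested, billed)
  requested.reject { |name, value| billed["#{name}_billed"] == value }.keys
end

puts unmet_requests(requested, billed)
```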

Data Management Summary

So, putting this all together, data system metadata is specified directly or inherited from parent containers. The data system metadata specifies the desired data system services for stored content. The system tries to accomplish the requested data system services, and indicates what services are actually being provided in the corresponding billed data system metadata items.

All of the data system metadata items specified for the "Documents" container will also be applied to the child "Financials" container, unless overridden by a data system metadata value specified there. For example, if the "Financials" container has the following data system metadata:

"cdmi_data_redundancy": "3",

Only the redundancy is changed; all other items are inherited from the parent. Thus, the billed metadata values would be:

"cdmi_data_redundancy_billed": "3",
"cdmi_immediate_redundancy_billed": "2",
"cdmi_infrastructure_redundancy_billed": "1",
"cdmi_geographic_placement_billed": [
    "US"
],
"cdmi_encryption_billed": "AES_256_CTR",

Currently all data system metadata items are inherited, and there is no way to override an inherited value except by specifying a new data system metadata item.
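
The inheritance rule can be modelled as a merge from the root of the tree down to the item in question, with values set lower in the tree replacing inherited ones. This is a sketch of the rule as described here, not of how any particular storage system implements it. The helper name effective_metadata is hypothetical, and the values are the requested items from the "Documents" example.

```ruby
# Data system metadata set directly on each container.
documents = {
  "cdmi_data_redundancy" => "2",
  "cdmi_immediate_redundancy" => "2",
  "cdmi_infrastructure_redundancy" => "2",
  "cdmi_geographic_placement" => ["US"],
  "cdmi_encryption" => "AES_256_CBC"
}
financials = { "cdmi_data_redundancy" => "3" }

# Hypothetical helper: takes the metadata set on each level, ordered from
# the outermost container inward, and returns the values that apply at the
# innermost level. Hash#merge lets the later (deeper) value win.
def effective_metadata(*levels)
  levels.reduce({}) { |inherited, own| inherited.merge(own) }
end

effective = effective_metadata(documents, financials)
puts effective["cdmi_data_redundancy"]   # overridden on "Financials"
puts effective["cdmi_encryption"]        # inherited from "Documents"
```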

This concludes a quick overview of the basics of data management and metadata in CDMI. In the next part of the tutorial, we will discuss the various data system metadata services defined in CDMI.