2008-12-15

XAM Active Objects

The SNIA XAM standard provides a comprehensive object model and interface for the storage of dynamic data objects. The standard provides methods by which collections of data (known as XSets) can be created and deleted, data can be stored, retrieved and manipulated inside these XSets, and queries can be performed to locate XSets that match specified characteristics.

While the first version of the standard was under development, I worked with Jared Floyd of Permabit to ensure that the XAM Job model was architected in such a way that it would include all of the components required to support active objects, like those supported in ADE.

What are XAM Active Objects?

This is best illustrated by an example:

Let us create a hypothetical XSet that includes an XStream (binary data) that contains the bytecodes defining a Java program. When this XSet is committed, it can automatically start executing. These active objects can be executed either by a sidecar system that attaches to the XAM Storage System (XSS) and becomes aware of new active objects via the standard XAM query facilities, or by the XSS itself, as an intrinsic part of the storage system.
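The sidecar variant can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: ToyStore, XSet, sidecar_poll and the field names app.active.code and app.active.state are all hypothetical and are not part of the XAM API, and Python source text stands in for the Java bytecodes described above.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class XSet:
    """Toy stand-in for an XSet: named fields plus named binary streams."""
    xuid: str
    fields: dict = field(default_factory=dict)
    streams: dict = field(default_factory=dict)


class ToyStore:
    """In-memory stand-in for an XSS (hypothetical, not the XAM API)."""

    def __init__(self):
        self._xsets = {}
        self._ids = count(1)

    def commit(self, fields, streams=None):
        xuid = f"xuid-{next(self._ids)}"
        self._xsets[xuid] = XSet(xuid, dict(fields), dict(streams or {}))
        return xuid

    def query(self, predicate):
        return [x for x in self._xsets.values() if predicate(x)]


def sidecar_poll(store):
    """Find committed active objects that have not run yet and run them."""
    pending = store.query(
        lambda x: "app.active.code" in x.streams
        and x.fields.get("app.active.state") == "new"
    )
    started = []
    for xset in pending:
        xset.fields["app.active.state"] = "running"
        namespace = {}
        # Python source stands in for Java bytecode; no sandboxing is done here.
        exec(xset.streams["app.active.code"].decode("utf-8"), namespace)
        namespace["run"](store, xset)
        xset.fields["app.active.state"] = "done"
        started.append(xset.xuid)
    return started


store = ToyStore()
code = b"def run(store, xset):\n    xset.fields['app.result'] = 6 * 7\n"
xuid = store.commit({"app.active.state": "new"}, {"app.active.code": code})
ran = sidecar_poll(store)
```

A real sidecar would also need the security isolation discussed later; the sketch runs the embedded code with no restrictions at all.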

The Java program would be executed in the security context of the XSet that contains it, and thus it can access local data stored within its own XSet, and optionally perform XAM operations based on its security credentials. This allows it to read other XSets, create new XSets and perform queries to discover XSets.

For example, an active XAM object could remain resident within the storage system, performing queries for specific types of XSets (for example, PDF objects), and convert them into a newer version of the PDF format. Such a model would allow dynamic format conversion of archived content, just by loading in a new XSet. This model would also be useful for analysis, where a large number of XSets need to be analysed or data-mined in the background.
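One pass of such a resident converter might look like the following sketch. It assumes a toy archive of plain dictionaries rather than real XSets, and the keys xuid, mime and content are hypothetical. The "conversion" is a placeholder that only rewrites the version header at the start of the stream; a real converter would rewrite the document itself.

```python
TARGET = b"%PDF-1.7"  # assumed target version for this illustration


def convert_pdf(data):
    """Placeholder conversion: only rewrites the 8-byte version header."""
    return TARGET + data[len(TARGET):]


def migrate_pdfs(xsets):
    """One pass of a resident converter over dict-based toy XSets."""
    converted = []
    for xset in xsets:
        if xset.get("mime") != "application/pdf":
            continue  # the query only selects PDF objects
        if xset["content"].startswith(TARGET):
            continue  # already in the target format
        xset["content"] = convert_pdf(xset["content"])
        converted.append(xset["xuid"])
    return converted


archive = [
    {"xuid": "a", "mime": "application/pdf", "content": b"%PDF-1.4\nbody"},
    {"xuid": "b", "mime": "text/plain", "content": b"hello"},
    {"xuid": "c", "mime": "application/pdf", "content": b"%PDF-1.7\nbody"},
]
first = migrate_pdfs(archive)
second = migrate_pdfs(archive)
```

Because the pass skips anything already in the target format, the active object can keep re-running its query without converting the same content twice.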

Because XAM Storage Systems will often be distributed, multiple active objects can easily be parallelized across the compute infrastructure that makes up the system, and parallel computing patterns are easy to implement, as one active object can create XSets that act as child active objects (the equivalent of performing a fork in UNIX). As code can be bundled with data, this model will also enable the creation of data-driven MIMD and SIMD parallel data processing systems.
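The fork-style pattern can be sketched as a parent that splits its input across child objects and then combines their results. Everything here is an assumption made for illustration: the children are plain dictionaries, spawn_children and child_task are hypothetical names, and a local thread pool stands in for the distributed compute infrastructure of a real XSS.

```python
from concurrent.futures import ThreadPoolExecutor


def child_task(child):
    """Work done by one child active object: total bytes in its share."""
    return sum(len(blob) for blob in child["inputs"])


def spawn_children(blobs, n_children):
    """Parent step: create one toy child XSet per slice of the input."""
    return [
        {"parent": "parent-1", "inputs": blobs[i::n_children]}
        for i in range(n_children)
    ]


blobs = [b"a" * n for n in range(1, 11)]  # ten toy streams of 1..10 bytes
children = spawn_children(blobs, 3)

# The same code runs over different data in each child (a SIMD-like pattern).
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(child_task, children))

total = sum(partials)
```

Giving each child its own code as well as its own data would turn the same structure into the MIMD case.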

XSets become process contexts, complete with inputs, code, state and outputs, with the full ability to discover and access data within the storage system, create new data, spawn child processes and report status information to external systems. Because XSets belong to user contexts, processing quotas, limits on computing resources, and other policies can be applied using the same methods as other XAM policies.
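As a rough illustration of a per-user processing quota, the sketch below charges abstract "compute units" against the user who owns an active object. CpuQuota, QuotaExceeded and the unit of accounting are hypothetical; how such a policy would actually be expressed in XAM is not defined here.

```python
class QuotaExceeded(Exception):
    """Raised when a charge would push a user past their limit."""


class CpuQuota:
    """Tracks toy compute units consumed per owning user."""

    def __init__(self, limits):
        self.limits = dict(limits)
        self.used = {user: 0 for user in limits}

    def charge(self, user, units):
        # Refuse the charge rather than allow the limit to be exceeded.
        if self.used[user] + units > self.limits[user]:
            raise QuotaExceeded(f"{user} would exceed {self.limits[user]} units")
        self.used[user] += units


quota = CpuQuota({"alice": 10})
quota.charge("alice", 6)
try:
    quota.charge("alice", 5)  # 6 + 5 exceeds the limit of 10
    blocked = False
except QuotaExceeded:
    blocked = True
```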

Part of the Standard?

All this could be implemented today as a vendor-specific extension to the standard. However, adding this capability to the XAM standard would require work to be done in the following areas:
  • Active object language profiles would have to be defined, since multiple languages including Java, .NET and interpreted languages such as Ruby could all be supported.
  • Language bindings would need to be standardized to allow the programs embedded in the active objects to perform XAM operations on the XSS they are resident in.
  • Standard job-style status reporting would be beneficial to allow standardized active object execution status monitoring.
  • Policies for computing-related aspects, such as resource usage and quotas, would need to be defined.
  • Requirements for security isolation would need to be defined.

Given that none of these would require changes to the core XAM standard, this could easily be added as an optional part of the standard, much like Level 2 query, which provides full-text search within XStreams.

In summary, the XAM standard provides a foundation upon which a rich data-driven distributed computing system can easily be created. This opens up many intriguing possibilities, and would be relatively easy to formally add to the standard.
