2008-12-23

Concurrency and Complexity

HP's Russ Daniels understands a critical issue facing the next generation of software development:

"If you think about whatever those skills are that are necessary to be able to program a computer, we don't understand what they are very well, but we know that they're not very widely distributed. So you end up with a relatively small portion of the population who can write a program successfully. Then you start introducing certain kinds of abstractions: if you introduce recursion, you lose some people; if you introduce pointers, you lose some people; if you introduce event-driven models, you lose some people; when you introduce concurrency, you lose virtually everybody.

So we create these programming languages that support concurrency and then you find that there's a lock and an unlock around every method, because otherwise people hurt themselves. It presents an interesting dilemma—we need to be able to get more stuff done, we need to figure out how to do stuff in parallel, but we're pretty bad at doing that."

One of the keys to solving this complexity problem is to make the mechanics of parallelism safe, automatic, and transparent to the programmer. Ultimately, this approach needs to extend far beyond just parallelism, as we see the same failings in the areas of security, fault tolerance, fault recovery, dynamic scaling, and resource economics.

An ADE Example

In order to see how this could work, let's take a quick look at an example of how ADE transparently provides concurrency and protection of resources that must be accessed serially:

Let's imagine that we're tasked with building a simple LAN switch, and we want to provide an IP address resolution for connected Ethernet hosts. If we want to send an IP packet to a given computer, we first need to find what the MAC address of the computer with a given IP address is. This is performed using the Address Resolution Protocol (ARP), which was originally documented in IETF RFC 826.

At a simplified level, the protocol works as follows:
  1. A host which wants to talk with a specific IP address sends an ARP broadcast request asking for the MAC address that traffic should be addressed.
  2. The switch intercepts the broadcast, and looks up the IP address in its local cache. If the IP to MAC address mapping is found, it responds to the originally requesting host with the ARP response.
  3. If the mapping is not found in the local cache, the switch broadcasts the ARP request to all other ports.
  4. If the switch receives a ARP response, it adds the IP to MAC mapping to its local cache, and forwards the response to the originally requesting host.
This is equivalent to a simple database insert/lookup problem, and in traditional concurrent programming models, errors can easily be introduced if care is not taken to lock the local cache while performing modifications.

In contrast, because ADE handles concurrency by forcing all component interactions to be atomic transactions, no locks are required and the system can be proven to be correct and deadlock free.

In this example, we are going to consider three actors: The ARP Lookup Process, the ARP Cache Process and the ARP Discovery Process. These actors have the following roles:
  • The ARP Lookup Process receives ARP lookup requests from the network hosts, asks the ARP Cache Process if the mapping is known, asks the ARP Discovery Process for the mapping if not known, and responds to the requesting host.
  • The ARP Cache Process looks up mappings in the local cache, and adds mappings when asked.
  • The ARP Discovery Process sends ARP lookup requests to the network, and when a mapping is discovered, it asks the ARP Cache Process to remember them.
In this model, there would be one instance of the ARP Lookup Process per ARP lookup packet received on the network, one instance of the ARP Cache Process, and one instance ARP Discovery Process per cache miss.

In a busy network, there could be dozens of ARP requests being processed, and many concurrent requests that may require changes to be made to the ARP cache. Even if ARP discovery is adding a new mappings at the same time that a new request is looking up a mapping, there is no possibility that the cache can become corrupt, because all access to the cache is automatically serialized by virtue of the entities and their messages that are exchanged.

This is the key aspect of ADE that simplifies the creation of distributed systems. Because concurrency is automatic, when problems are decomposed into collections of interacting actors, any parallelism inherent in the problem will be automatically utilized.

2008-12-15

XAM Active Objects

The SNIA XAM standard provides a comprehensive object model and interface for the storage of dynamic data objects. The standard provides methods by which collections of data (known as XSets) can be created and deleted, data can be stored, retrieved and manipulated inside these XSets, and queries can be performed to locate XSets that match specified charactertistics.

While the first version of the standard was under development, I worked with Jared Floyd of Permabit to ensure that the XAM Job model was architected in such a way that it would include all of the components required to support active objects, like those supported in ADE.

What are XAM Active Objects?

This is best illustrated by an example:

Let us create a hypothetical XSet that includes an XStream (binary data) that contains the bytecodes defining a Java program. When this XSet is committed, it can automatically start executing. The entity that executes these active objects can either be performed by a sidecar system that attaches to the XAM Storage System (XSS), and becomes aware of new active objects via the standard XAM query facilities, or the execution of these active objects can be an intrinsic part of the XSS.

The Java program would be executed in the security context of the XSET it is contained within, and thus it can access local data stored within it's own XSET, and optionally perform XAM operations based on its security credentials. This allows it to read other XSets, create new XSets and perform queries to discover XSets.

For example, an active XAM object could remain resident within the storage system, performing queries for specific types of XSETs (for example, PDF objects), and convert them into a newer version of the PDF format. Such an model would allow dynamic format conversion of archived content, just by loading in a new XSET. This model is also be useful for analysis, where a large number of XSETs need to be analysed or datamined in the background.

Because XAM Storage Systems will often be distributed, multiple active objects can easily be parallelized across the compute infrastructure the makes up the system, and parallel computing patterns are easy to implement, as one active object can create XSETs that act as child active objects (the equivalent of performing a fork in UNIX). As code can be bundled with data, this model will also enable the creation of data-driven MIMD and SIMD parallel data processing systems.

XSETs become process contexts, complete with inputs, code, state and outputs, with the full ability to discover and access data within the storage system, create new data, spawn child processes and report status information to external systems. Because XSETs belong to user contexts, adding processing quotas, limits on computing resources, and other policies can use the same methods as used for other XAM policies.

Part of the Standard?

All this could be implemented today as a vendor-specific extension to the standard. However, adding this capability to the XAM standard would require work to be done in the following areas:
  • Active object language profiles would have to be defined, since multiple languages including Java, .net and interpreted languages such as ruby could all be supported.
  • Language bindings would need to be standardized to allow these programs embedded in the active objects to be perform XAM operations on the XSS they are resident in.
  • Standard job-style status reporting would be beneficial to allow standardized active object execution status monitoring.
  • Policies for computing-related aspects, such as resource usage and quotas would need to be defined.
  • Requirements for security isolation would need to be defined.

Given that none of these would require changes to the core XAM standard, this could easily be added as an optional part of the standard, much like Level 2 query, which provides full-text search within XStreams.

In summary, the XAM standard provides a foundation upon which a rich data-driven distributed computing system can easily be created. This opens up many intriguing possibilities, and would be relatively easy to formally add to the standard.