Blobstore

Blob store

An implementation of a "blob store" for Emulab experiments is available. Its basic aim is to provide persistent data storage that is easy to access from many kinds of experimental nodes. The original motivation was to hold configuration information for "black box" nodes, but the design is intended to suit other uses too, especially in environments where shared filesystems might be unavailable or inappropriate.

Key assumptions

Some of the decisions made in meetings included the following:

  • Blobs are mutable. (The contents of a blob can be changed without affecting its ID.)
  • We will support multiple transport methods, including Frisbee and HTTP. The server will advertise a list of available transports (which might vary by blob), and the client is permitted to select from among whichever are listed.
  • At least the following "scopes" for blobs will be available: "global" (system provided, for all users), "globally shared/anonymous" (user provided, but readable by all users), "private" (user provided, accessible only to the appropriate project/experiment).
  • We will provide support for caching, including on the client side.
  • It will be possible to operate the system without any client-assigned names for blobs. (That does not rule out subsequently allowing client-assigned names, of course. We are refraining from designing in client names right now, partly because of namespace/collision issues.)
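
The transport negotiation described above (the server advertises a list, the client picks from it) can be sketched as follows. This is only an illustration: the function name pick_transport and the space-separated list format are assumptions, not part of the blob store protocol.

```shell
# pick_transport PREFERRED ADVERTISED
# Both arguments are space-separated transport names (hypothetical format).
# Prints the first preferred transport that the server also advertises;
# returns 1 if the two lists have nothing in common.
pick_transport() {
    for want in $1; do
        for have in $2; do
            if [ "$want" = "$have" ]; then
                printf '%s\n' "$want"
                return 0
            fi
        done
    done
    return 1
}

# Client prefers https, then http; server offers frisbee and http for this blob.
pick_transport "https http" "frisbee http"
```

Iterating over the client's list in the outer loop keeps the client's priority order in control of the result, whatever order the server advertises in.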

Implementation details

Some of the consequences of those assumptions, and initial implementation decisions:

  • Blobs are stored analogously to disk images: somewhere under /usr/testbed/... for system blobs, and /proj/foo/... for user blobs.

  • The initial user interface for manipulating the contents of blobs is simply the filesystem. This does not preclude adding other means of blob manipulation later (e.g. to support users without accounts on ops).

  • The mapping of blob IDs to underlying files is potentially many-to-one.

  • For HTTP, access control is implemented by requiring the experiment key as part of the URL.

  • The server supports comparisons of cryptographic hashes of blobs. This allows both simple cache validity checking and detection of a blob being modified during transfer. Where users can manipulate the backing store directly, the server can use the file mtime as a heuristic for determining whether its own hash is up to date.
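
The mtime heuristic from the last item can be illustrated with a small sketch. It is not the server's actual code: the ".hash" sidecar file, the function name blob_hash, and the choice of SHA-256 via sha256sum are all assumptions made for the example.

```shell
# blob_hash FILE
# Prints a hash of FILE. The hash is recomputed only when the cached copy is
# missing or FILE has a newer mtime than the cache (the heuristic described above).
# Assumptions: cache lives in FILE.hash; sha256sum is available.
blob_hash() {
    file=$1
    cache=$1.hash
    if [ ! -e "$cache" ] || [ "$file" -nt "$cache" ]; then
        sha256sum < "$file" | cut -d' ' -f1 > "$cache"
    fi
    cat "$cache"
}

# Usage (not run here):
#   blob_hash /proj/example/user/myblob
```

Because this is only a heuristic, a change that leaves the file's mtime no newer than the cached hash would go unnoticed in this sketch.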

Interface

Creating a blob

The desired contents of a blob should be stored in a file under /proj on ops, and can be modified at any time. To promote the file to a blob, use the mkblob command (on ops). This needs to be done only once, even if the file is changed, though it is permissible to run it multiple times. (Running mkblob more than once will create multiple logical blobs, each backed by the same physical file.) For example:

echo "Example blob contents" > /proj/example/user/myblob
/usr/testbed/bin/mkblob /proj/example/user/myblob

Upon success, a UUID identifying the new blob will be written to standard output.
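
Since the UUID is the only handle on the blob, it is worth capturing it when the blob is created. A minimal sketch, assuming mkblob prints nothing but the UUID on standard output and exits non-zero on failure (the helper name make_blob and the MKBLOB variable are hypothetical):

```shell
# Path to mkblob as used in the example above; overridable via MKBLOB.
MKBLOB=${MKBLOB:-/usr/testbed/bin/mkblob}

# make_blob FILE
# Promotes FILE to a blob and prints the UUID.
# Fails if mkblob fails or prints nothing (assumed failure behaviour).
make_blob() {
    id=$("$MKBLOB" "$1") || return 1
    [ -n "$id" ] || return 1
    printf '%s\n' "$id"
}

# Usage on ops (not run here):
#   blobid=$(make_blob /proj/example/user/myblob) || echo "mkblob failed" >&2
```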

Retrieving a blob

Any experimental node can retrieve any blob, though permission checks apply. Filesystem permissions on the file backing the blob can be used to allow or deny access to it: if the file is world readable, then any experimental node may read the blob; if the file is group readable, then any node belonging to an experiment in the project corresponding to the file's group may read it.
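
The two access levels described above map onto ordinary chmod calls on the backing file. The helper names below are hypothetical, and the sketch assumes the file's group is already the project's group:

```shell
# Allow any experimental node to read the blob (world readable).
blob_share_all() {
    chmod g+r,o+r "$1"
}

# Allow only nodes in experiments of the file's project group to read the blob.
blob_share_project() {
    chmod g+r,o-r "$1"
}

# Usage on ops (not run here):
#   blob_share_project /proj/example/user/myblob
```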

The simplest way to retrieve a blob onto an experimental node is to use the getblob command, which is invoked as follows:

getblob [-d] [-h] [-o file] [-t transport]... [-v] blobid

    -d              debug mode (verbose)
    -h              help
    -o file         output file name
    -t transport    specify transports, in decreasing priority
                    where transport can include:
                        http    HTTP
                        https   HTTP over TLS/SSL
    -v              show version
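
As an illustration of the options above, the following sketch fetches a blob into a file, listing https before http so that https is tried first. The wrapper name fetch_blob and the GETBLOB variable are hypothetical, and repeating -t once per transport is inferred from the usage line rather than confirmed.

```shell
# Name of the getblob command; assumed to be on PATH, overridable via GETBLOB.
GETBLOB=${GETBLOB:-getblob}

# fetch_blob BLOBID OUTFILE
# Retrieves BLOBID into OUTFILE, preferring https over http.
fetch_blob() {
    "$GETBLOB" -o "$2" -t https -t http "$1"
}

# Usage on an experimental node (not run here):
#   fetch_blob "$blobid" /tmp/myblob
```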

Using getblob will automatically handle caching and authentication. If getblob is not available or suitable for any reason, you can also attempt to retrieve a blob over HTTP(S) manually, by constructing a URL like:

http://www.emulab.net/blob/read/pid/eid/key/blobid

where pid and eid should be replaced by the proper project and experiment IDs, and key should contain the experiment authentication key found in the shared directory /proj/pid/eid/tbdata/webkey. blobid is again the blob UUID.
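
Putting those pieces together by hand might look like the sketch below. The helper name blob_url is hypothetical, as are the project and experiment names in the usage comment; curl is just one possible HTTP client.

```shell
# blob_url PID EID KEY BLOBID
# Prints the read URL for a blob, following the pattern shown above.
blob_url() {
    printf 'http://www.emulab.net/blob/read/%s/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Usage on an experimental node (not run here); "example" and "myexp" are placeholders:
#   key=$(cat /proj/example/myexp/tbdata/webkey)
#   curl -f -o myblob "$(blob_url example myexp "$key" "$blobid")"
```

Note that a manual fetch like this bypasses the caching that getblob provides.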

Removing a blob

If you no longer require a blob, it can be removed by issuing the command rmblob blobid on ops. This does NOT automatically remove the file backing the blob. Blob removal is permitted only for the same user that initially created the blob.
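
If the backing file should go away as well, the two steps can be combined. This is a sketch under assumptions: rmblob is taken to be on PATH and to exit non-zero on failure, and the helper name remove_blob_and_file is hypothetical.

```shell
# Name of the rmblob command; assumed to be on PATH, overridable via RMBLOB.
RMBLOB=${RMBLOB:-rmblob}

# remove_blob_and_file BLOBID FILE
# Removes the blob, then deletes the backing FILE only if rmblob succeeded.
remove_blob_and_file() {
    "$RMBLOB" "$1" && rm -f -- "$2"
}

# Usage on ops (not run here):
#   remove_blob_and_file "$blobid" /proj/example/user/myblob
```

Because several blobs can be backed by the same file, delete the file only once no other blob still depends on it.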