DataONE API - 2.0

Mutability of Metadata

These notes were initiated by DV with responses by RW and MJ around 17 December, 2010.

Notes clarified and consolidated on 08/31/2011 by RW.

SystemMetadata will be modified by CNs, MNs and clients. The CN modifies the SystemMetadata during certain operations, such as MN-CN Synchronization and MN-MN replication. A MN will modify SystemMetadata provenance as an object is updated. Clients and MNs will modify SystemMetadata to reflect new policies regarding replication, access and ownership, and will notify the CN when a MN-MN Replication event has completed.

SystemMetadata Mutability

SystemMetadata has certain elements that, once created, will never change. The immutable set of elements are determined during the MN-CN Synchronization process and are static through the existence of the object. The mutable set of elements are modified due to certain interactions and restrictions placed on which actor in the system may perform the updates.

The Immutable set:

Element Type
identifier Types.Identifier
formatId Types.ObjectFormatIdentifier
size long
checksum Types.Checksum
submitter Types.Subject
dateUploaded Types.DateTime
originMemberNode Types.NodeReference

The Mutable set:

Element Type
rightsHolder Types.Subject
accessPolicy Types.AccessPolicy
replicationPolicy Types.ReplicationPolicy
obsoletes Types.Identifier
obsoletedBy Types.Identifier
dateSysMetadataModified: Types.DateTime
authoritativeMemberNode: Types.NodeReference
replica Types.Replica

REST API

Methods affecting SystemMetadata
Tier REST Function Parameters
Tier 1 PUT /meta/{pid} CNCore.updateSystemMetadata() (session, pid, sysmeta) -> boolean
Tier 2 PUT /owner/{pid} CNAuthorization.setOwner() (session, pid, userId) -> Types.Identifier
Tier 2 PUT /accessRules/{pid} CNAuthorization.setAccessPolicy() (session, pid, accessPolicy) -> boolean
Tier 2 PUT /accessRules/{pid} MNAuthorization.setAccessPolicy() (session, pid, accessPolicy) -> boolean
Tier 3 PUT /object/{pid} MNStorage.update() (session, pid, object, newPid, sysmeta) -> Types.Identifier
Tier 4 POST /notify CNReplication.setReplicationStatus() (session, pid, status) -> boolean
Tier 4 PUT /meta/replication/{pid} CNReplication.updateReplicationMetadata() (session, pid, replicaMetadata) -> boolean
Tier 4 PUT /meta/policy/{pid} CNReplication.setReplicationPolicy() (session, pid, policy) -> boolean

internal only:

externally available through REST API:

Interactions affecting SystemMetadata

The CN is the ultimate arbiter of SystemMetadata changes. There needs to be a clear delineation of responsibility with regard to which processes will interact with the CN store such that the SystemMetadata remains consistent.

MN-CN Synchronization:

The MN-CN Synchronization process will set all the immutable elements the first time an item is created. It will also add items to mutable elements that were provided by the MN:

It will also reset dateSysMetadataModified (Types.DateTime) to the time the object was added.

The MN-CN Synchronization process may also update SystemMetadata by calling the CNCore.updateSystemMetadata internally. It will update the authoritativeMemberNode (Types.NodeReference) or obsoletedBy (Types.Identifier) on the CN when during synchronization of the node listed as the authoritativeMemberNode, it finds those fields have changed been modified, and changes dateSysMetadataModified (Types.DateTime) to reflect the date on the SystemMetadata sent from the MN.

MN-MN Replication:

The MN-MN Replication process running on the CN will call CNReplication.updateReplicationMetadata to modify replica (Types.Replica) on the SystemMetadata to reflect the replica copies available. A MN will call the CNReplication.setReplicationStatus that modifies the replica list replica (Types.Replica) to indicate when a replication from one MN to another has been completed. After each operation, dateSysMetadataModified (Types.DateTime) will be modified to be the date the operation was performed.

Client-CN Interactions:

Clients, either ITK or MNs, may call the following methods on the CN:

The execution of these methods will alter various elements:

A side effect of each of these operations will be an update to
dateSysMetadataModified (Types.DateTime).

Client-MN Interactions:

A Client may call the following:

This operation alone does not have an effect on the CN’s definitive store. A subsequent call to the CN will via func:CNAuthorization.setAccessPolicy will need to be made by the MN.

An object may be updated on an MN. The update mechanism will create a new object that is the descendent of the object updated. The descendant object will have the obsoletes (Types.Identifier) field set while the ancestor object will need an obsoletedBy (Types.Identifier) element added. The synchronization process will update the ancestor’s SystemMetadata with the new value.

Robert’s Notes:

From these interactions, there is no mechanism defined that updates authoritativeMemberNode on the Authoritative MN.

I am uncertain why MNAuthorization.setAccessPolicy() is needed. It would appear to be a proxy of the CNAuthorization.setAccessPolicy. So why not eliminate the MN call and direct all client calls to the CN?

To answer my question about Synchronization updating responsibility: Synchronization should only update the obsoletedBy and authoritativeMemberNode fields of the SystemMetadata from the Authoritative MN (and only the Authoritative MN).