DataONE API - 2.0

Node Identity and Registration

DataONE nodes are of two types, Coordinating Nodes and Member Nodes. Member Nodes are data and metadata providers that serve particular communities and that agree to interoperate with other nodes using the DataONE Service Interface. Coordinating Nodes provide services to each other and to the network of Member Nodes to enable DataONE to function as an integrated federation.

Node Identifiers

Each node in DataONE is assigned a unique, immutable identifier which serves to link all information about the node together in the system. References in various metadata documents in DataONE always utilize this NodeReference, as this will remain constant even as protocols and service endpoints evolve over time. Thus, while the URL endpoint for a node’s services may change over time, possibly even moving across domains, the NodeReference will always be constant. The DataONE NodeReference takes the following form:

NodeReference = urn ":" node ":" identifier
urn           = "urn"
node          = "node"
identifier    = *( idchars )
idchars       = ALPHA / DIGIT / "_"

ALPHA and DIGIT are patterns representing the upper and lower ASCII letters [A-Za-z] and the ASCII digits [0-9], defined in the ABNF standard. Thus, urn:node: is a constant prefix, always in lowercase, and identifier is a short, unique name for the node that is case sensitive. For example, valid NodeReferences might include:

urn:node:KNB
urn:node:DRYAD
urn:node:CN_UCSB

By policy, the length of node identifiers will generally be restricted to 25 characters, inclusive of the urn:node: prefix, and each identifier will be reviewed for appropriateness for the node during the node approval process (see Node Registration below).
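The grammar and length policy above can be checked mechanically. The following is a minimal sketch, not an official DataONE validator: the function and constant names are hypothetical, and it assumes that an identifier must contain at least one character, even though the ABNF as written would also admit an empty one.

```python
import re

# Grammar from the text: the constant lowercase prefix "urn:node:" followed
# by ASCII letters, digits, or "_".
# Assumption: at least one identifier character is required, although the
# ABNF repetition "*" as written would also allow an empty identifier.
NODE_REFERENCE_PATTERN = re.compile(r"urn:node:[A-Za-z0-9_]+")

# Policy limit from the text, counted inclusive of the "urn:node:" prefix.
MAX_NODE_REFERENCE_LENGTH = 25


def is_valid_node_reference(value: str,
                            max_length: int = MAX_NODE_REFERENCE_LENGTH) -> bool:
    """Return True if value matches the NodeReference grammar and length policy."""
    if len(value) > max_length:
        return False
    # fullmatch rejects any leading or trailing characters, including newlines.
    return NODE_REFERENCE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("urn:node:KNB", "urn:node:CN_UCSB", "URN:NODE:KNB", "urn:node:my-node"):
        print(candidate, is_valid_node_reference(candidate))
```

Because the text says the limit is only "generally" applied, the length is a parameter rather than a hard-coded rule. A check like this covers syntax only; uniqueness and appropriateness are still decided during registration and approval.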

In this case, appropriateness means concise, memorable, and durable. In general, the identifier should not contain terms that are likely to change over the very long term, such as implementation details like host names, software service names, and versions. Identifier length is restricted to make identifiers easy for system administrators and other programmers to read, recall, and type. DataONE UIs will make use of the name field of the Node record for display, so the identifier does not have to be meaningful for end-users.

Node Authentication and Contact

To become a Member Node (or Coordinating Node) in DataONE, a node must be authenticated by DataONE so that it can communicate securely with other DataONE nodes. One of the first steps in preparing the node for registration is receiving a DataONE certificate that will be used for negotiating secure connections with other nodes. This certificate is an X.509 certificate that is backed by a cryptographic key. The certificate contains a distinguished name, which is included as the subject field in the node record. Over time, node certificates expire and must be renewed by installing the new certificate on the Member Node and updating the subject field if necessary. The Node record provided in DataONE can contain a list of subjects representing the node, each corresponding to a valid DataONE certificate installed on the node that can be used for authentication.

In addition, every node must have a contact person with whom DataONE can communicate about DataONE operations (such as new node certificates) and policies as needed. This contact person must be registered and verified with DataONE before the node itself is registered.

Node Registration

Registration as a node in the DataONE network is accomplished by registering as a Member Node (or Coordinating Node) through an existing Coordinating Node registration service (see CNRegister.register()). This service takes a Types.Node description as input, including a proposed Types.NodeReference for the node and additional metadata such as the nodeContact. If the NodeReference is syntactically correct and unique, and the nodeContact is a verified account registered with DataONE, then the registration service returns the Types.NodeReference value for the node, which is then permanently assigned and cannot be reused or reassigned. At this point, the Types.Node has been registered but has not yet been approved. The request to become a node will be reviewed by DataONE and, if approved, the node will be added to the list of Nodes in the federation. Once approved, the Node will be able to participate in all synchronization and replication services available in DataONE.

Registration Procedure

Along with the production environment, DataONE maintains other environments of inter-communicating Coordinating and Member Nodes for various testing purposes. Aside from a unique list of nodes, each environment maintains its own sets of data objects, object formats, and user accounts. The registration steps described below pertain to a single environment, so registering a node in a new environment requires running through the procedure in its entirety for that environment.

Step 1: Stand-alone testing

Prior to registration, the node needs to be tested for proper functionality of its services, and proper form of its content. Certain integration tests used by the core team have been deployed to a web server (http://mncheck.test.dataone.org) so member node implementers can test basic services in a stand-alone environment.

Step 2: Content checking

Not every aspect of the node can be checked prior to testing, and some tests take too long to be automated in a web-based platform. Content checking is also best done prior to node registration, to make sure that:

  1. all object formats used by the member node are registered with DataONE.
  2. the member node supports the required checksum algorithms.
  3. the system metadata of each object contains accurate AccessPolicies as per that node's agreement with their submitters.
  4. system metadata RightsHolders are valid subjects, representable by X.509 certificate distinguished names, or a plan is in place to map these accounts to accounts that are representable in such a way.
  5. any other tests determined to be relevant for that node.

This step is best done in close coordination with the DataONE core developer team.
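The first check in the list above (all object formats registered) amounts to a set difference. The sketch below assumes that the two lists of format identifiers have already been gathered by some other means. The function name and the sample identifiers are illustrative only and are not taken from a DataONE tool.

```python
def find_unregistered_formats(object_formats, registered_formats):
    """Return the sorted format identifiers used by objects but not registered.

    object_formats: format identifiers found on the member node's objects.
    registered_formats: format identifiers registered in the target environment.
    Both inputs are assumed to have been collected beforehand.
    """
    return sorted(set(object_formats) - set(registered_formats))


if __name__ == "__main__":
    # Illustrative values only; real lists would come from the node and the
    # environment being registered with.
    used = ["text/csv", "application/x-hypothetical", "text/csv"]
    registered = ["text/csv"]
    print(find_unregistered_formats(used, registered))
```

An empty result means every format in use is already registered. Because each environment keeps its own set of object formats, the comparison would need to be repeated per environment.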

Step 3: Node Registering

Registering the node involves the following steps.

  1. Registering the nodeContact account with the environment via the identity portal. This account must be compatible with CILogon.

Using the portal

  1. go to https://cn-{ENVIRONMENT}.dataone.org/portal

  2. choose your account provider (this step may be bypassed if you have already logged in)

  3. At the My Account tab, fill out the Account Details fields, and click “Register.” (This will register this account and display the subject. If there is no button labeled “Register”, but one labeled “Update”, your account is already registered.)

    The subject displayed is the part within the parentheses, in the format “CN=foo,DC=cilogon,DC=org”, and it is this value that must match what is in the Node record’s subject field.

  2. Submitting a cn.register(Session, Node) request, where the Session parameter contains the certificate of the person making the request, and the Node parameter is, in most cases, the Node record served by the mn.getCapabilities() service call (GET /node). Problems with the node record will be reported back as an exception.

  3. Approving the node.

    1. Notify the DataONE contact person that the node has been registered and is ready for approval.
    2. Review any content checking test results with the node contact.
    3. DataONE will approve the node.
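Before submitting the registration request, it can help to confirm that the subject shown by the portal matches the Node record exactly, as the portal step above requires. The following is a minimal sketch under stated assumptions: the displayed string is assumed to look like "Some Name (CN=foo,DC=cilogon,DC=org)", the Node record's subjects are assumed to be available as a plain list of strings, and both function names are hypothetical.

```python
import re


def extract_subject(displayed: str) -> str:
    """Return the text inside the trailing pair of parentheses.

    Assumes the portal display ends with the subject in parentheses,
    for example "Some Name (CN=foo,DC=cilogon,DC=org)".
    """
    match = re.search(r"\(([^()]*)\)\s*$", displayed)
    if match is None:
        raise ValueError("no parenthesized subject found in: %r" % displayed)
    return match.group(1).strip()


def subject_matches_node_record(displayed: str, node_subjects) -> bool:
    """Return True if the displayed subject exactly equals a Node record subject."""
    return extract_subject(displayed) in node_subjects


if __name__ == "__main__":
    shown = "Some Name (CN=foo,DC=cilogon,DC=org)"
    print(subject_matches_node_record(shown, ["CN=foo,DC=cilogon,DC=org"]))
```

The comparison is a plain string match with no normalization of case, spacing, or attribute order. This choice is deliberately strict: a mismatch found here is cheaper to fix than an exception returned by the registration call.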

Step 4: Functional Integration testing (except in PROD environment)

At this point, the appropriate multi-node functional tests (for synchronization, replication, and updateSystemMetadata) will be run. Tests in this arena are intended to shake out remaining bugs, and will in most cases be done in close coordination with the DataONE core developer team. Success at this step requires a dedicated developer from the member node implementation team for about 1-2 weeks, as bug fixing at this point tends to be sequential.