Tier | REST | Function | Parameters |
---|---|---|---|
Tier 1 | GET /monitor/ping | MNCore.ping() | () -> null |
Tier 1 | GET /log?[fromDate={fromDate}][&toDate={toDate}][&event={event}][&pidFilter={pidFilter}][&start={start}][&count={count}] | MNCore.getLogRecords() | (session, [fromDate], [toDate], [event], [pidFilter], [start=0], [count=1000]) -> Types.Log |
Tier 1 | GET / and GET /node | MNCore.getCapabilities() | () -> Types.Node |
Tier 1 | GET /object/{pid} | MNRead.get() | (session, pid) -> Types.OctetStream |
Tier 1 | GET /meta/{pid} | MNRead.getSystemMetadata() | (session, pid) -> Types.SystemMetadata |
Tier 1 | HEAD /object/{pid} | MNRead.describe() | (session, pid) -> Types.DescribeResponse |
Tier 1 | GET /checksum/{pid}[?checksumAlgorithm={checksumAlgorithm}] | MNRead.getChecksum() | (session, pid, [checksumAlgorithm]) -> Types.Checksum |
Tier 1 | GET /object[?fromDate={fromDate}&toDate={toDate} &formatId={formatId}&replicaStatus={replicaStatus} &start={start}&count={count}] | MNRead.listObjects() | (session, [fromDate], [toDate], [formatId], [replicaStatus], [start=0], [count=1000]) -> Types.ObjectList |
Tier 1 | POST /error | MNRead.synchronizationFailed() | (session, message) -> Types.Boolean |
Tier 1 | GET /replica/{pid} | MNRead.getReplica() | (session, pid) -> Types.OctetStream |
Tier 2 | GET /isAuthorized/{pid}?action={action} | MNAuthorization.isAuthorized() | (session, pid, action) -> boolean |
Tier 2 | POST /dirtySystemMetadata | MNAuthorization.systemMetadataChanged() | (session, pid, serialVersion, dateSysMetaLastModified) -> boolean |
Tier 3 | POST /object | MNStorage.create() | (session, pid, object, sysmeta) -> Types.Identifier |
Tier 3 | PUT /object/{pid} | MNStorage.update() | (session, pid, object, newPid, sysmeta) -> Types.Identifier |
Tier 3 | POST /generate | MNStorage.generateIdentifier() | (session, scheme, [fragment]) -> Types.Identifier |
Tier 3 | DELETE /object/{pid} | MNStorage.delete() | (session, pid) -> Types.Identifier |
Tier 3 | PUT /archive/{pid} | MNStorage.archive() | (session, pid) -> Types.Identifier |
Tier 4 | POST /replicate | MNReplication.replicate() | (session, sysmeta, sourceNode) -> boolean |
Tier 1 | GET /query/{queryEngine}/{query} | MNQuery.query() | (session, queryEngine, query) -> Types.OctetStream |
Tier 1 | GET /query/{queryType} | MNQuery.getQueryEngineDescription() | (session, queryEngine) -> Types.QueryEngineDescription |
Tier 1 | GET /query | MNQuery.listQueryEngines() | (session) -> Types.QueryEngineList |
The MN_core API provides mechanisms for a Member Node to report on the level of service compliance and to specify replication policies. The capabilities information is used in the Member Node registration process by the Coordinating Nodes.
The state of health API provides mechanisms for the monitoring infrastructure to report on the current state of the DataONE infrastructure and for the Coordinating Nodes to track the current operating state of the Member Node.
Tier | REST | Function | Parameters |
---|---|---|---|
Tier 1 | GET /monitor/ping | ping() | () -> null |
Tier 1 | GET /log?[fromDate={fromDate}][&toDate={toDate}][&event={event}][&pidFilter={pidFilter}][&start={start}][&count={count}] | getLogRecords() | (session, [fromDate], [toDate], [event], [pidFilter], [start=0], [count=1000]) -> Types.Log |
Tier 1 | GET / and GET /node | getCapabilities() | () -> Types.Node |
Low level “are you alive” operation. A valid ping response is indicated by a HTTP status of 200. A timestamp indicating the current system time (UTC) on the node MUST be returned in the HTTP Date header.
The Member Node should perform some minimal internal functionality testing before answering. However, ping checks will be frequent (every few minutes) so the internal functionality test should not be high impact.
Any status response other than 200 indicates that the node is offline for DataONE operations.
Note that the timestamp returned in the Date header should follow the semantics as described in the HTTP specifications, http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.18
The response body will be ignored by the caller except in the case of an error, in which case the response body should contain the appropriate DataONE exception.
Use Cases: | |
---|---|
Rest URL: | GET /monitor/ping |
Returns: | Null body or Exception. The body of the message may be ignored by the caller. The HTTP header Date MUST be set in the response. |
Return type: | null |
Raises: |
|
Response
The response should be a valid HTTP response with a blank or arbitrary body. Only the HTTP header information is considered by the requestor. A successful response MUST have a HTTP status code of 200. In case of an error condition, the appropriate HTTP status code MUST be set, and an exception or error information MAY be returned in the response.
Example
Example of ping request and response for a Member Node. Lines prefixed with “>” indicate outgoing information, lines prefixed with “<” show content returned from the server. Lines associated with SSL connection initiation and close are not shown here. Note that the actual response headers may vary; the only required header fields are the first status line and a Date entry. However, to fully support clients that may cache the response, it is recommended that the Expires and Cache-Control headers are returned.
export NODE="https://demo2.test.dataone.org/knb/d1/mn"
curl -k -v "$NODE/v1/monitor/ping"
> GET /knb/d1/mn/v1/monitor/ping HTTP/1.1
> User-Agent: curl/7.21.6 (x86_64-pc-linux-gnu) libcurl/7.21.6 OpenSSL/1.0.0e zlib/1.2.3.4 libidn/1.22 librtmp/2.3
> Host: demo2.test.dataone.org
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Tue, 06 Mar 2012 14:19:59 GMT
< Server: Apache/2.2.14 (Ubuntu)
< Content-Length: 0
< Content-Type: text/plain
<
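The rules above (status 200 means alive, the Date header is mandatory, the body is ignored) can be sketched on the client side as follows. This is a minimal illustration, not part of the DataONE libraries: the function name `interpret_ping` and the plain `dict` of headers are assumptions made for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def interpret_ping(status: int, headers: dict) -> datetime:
    """Return the node's reported time in UTC, or raise if the node looks offline.

    `interpret_ping` is a hypothetical helper; it only inspects the status
    code and the Date header, as the ping description requires.
    """
    if status != 200:
        # Any status other than 200 means the node is offline for DataONE operations.
        raise RuntimeError(f"node offline for DataONE operations (HTTP {status})")
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if "date" not in lowered:
        raise ValueError("ping response is missing the required Date header")
    return parsedate_to_datetime(lowered["date"]).astimezone(timezone.utc)


node_time = interpret_ping(200, {"Date": "Tue, 06 Mar 2012 14:19:59 GMT"})
print(node_time.isoformat())
```

A monitoring client could compare the returned time with its own clock to detect clock drift on the node.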
Retrieve log information from the Member Node for the specified slice parameters.
This method is used primarily by the log aggregator to generate aggregate statistics for nodes, objects, and the methods of access.
The response MUST contain only records for which the requestor has permission to read.
Note that date time precision is limited to one millisecond. If no timezone information is provided, UTC will be assumed.
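A caller preparing fromDate or toDate values might normalize them as in the sketch below, which applies the two rules just stated: values without a timezone are treated as UTC, and precision stops at one millisecond. The helper name `to_dataone_timestamp` is hypothetical.

```python
from datetime import datetime, timezone


def to_dataone_timestamp(value: datetime) -> str:
    """Format a datetime with millisecond precision, expressed in UTC."""
    if value.tzinfo is None:
        # No timezone information: UTC is assumed.
        value = value.replace(tzinfo=timezone.utc)
    # timespec="milliseconds" truncates (does not round) the microseconds.
    return value.astimezone(timezone.utc).isoformat(timespec="milliseconds")


print(to_dataone_timestamp(datetime(2012, 2, 29, 23, 25, 58, 104999)))
```

When the resulting string is placed in a query string it still needs URL encoding (for example with `urllib.parse.urlencode`), because a literal `+` would otherwise be read as a space.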
Access control for this method MUST be configured to allow calling by Coordinating Nodes and MAY be configured to allow more general access.
Rest URL: | GET /log?[fromDate={fromDate}][&toDate={toDate}][&event={event}][&pidFilter={pidFilter}][&start={start}][&count={count}] |
---|---|
Parameters: |
|
Returns: | |
Return type: | |
Raises: |
|
Example
Example of retrieving 3 log records from a Member Node. The xml command is provided by xmlstarlet and is used to format the output.
export NODE="https://demo2.test.dataone.org/knb/d1/mn"
curl -k -s "$NODE/v1/log?start=0&count=3" | xml fo
<?xml version="1.0" encoding="UTF-8"?>
<d1:log xmlns:d1="http://ns.dataone.org/service/types/v1" count="3" start="0" total="1273">
  <logEntry>
    <entryId>1</entryId>
    <identifier>MNodeTierTests.201260152556757.</identifier>
    <ipAddress>129.24.0.17</ipAddress>
    <userAgent>null</userAgent>
    <subject>CN=testSubmitter,DC=dataone,DC=org</subject>
    <event>create</event>
    <dateLogged>2012-02-29T23:25:58.104+00:00</dateLogged>
    <nodeIdentifier>urn:node:DEMO2</nodeIdentifier>
  </logEntry>
  <logEntry>
    <entryId>2</entryId>
    <identifier>TierTesting:testObject:RightsHolder_Person.4</identifier>
    <ipAddress>129.24.0.17</ipAddress>
    <userAgent>null</userAgent>
    <subject>CN=testSubmitter,DC=dataone,DC=org</subject>
    <event>create</event>
    <dateLogged>2012-02-29T23:26:38.828+00:00</dateLogged>
    <nodeIdentifier>urn:node:DEMO2</nodeIdentifier>
  </logEntry>
  <logEntry>
    <entryId>3</entryId>
    <identifier>TierTesting:testObject:RightsHolder_Group.4</identifier>
    <ipAddress>129.24.0.17</ipAddress>
    <userAgent>null</userAgent>
    <subject>CN=testSubmitter,DC=dataone,DC=org</subject>
    <event>create</event>
    <dateLogged>2012-02-29T23:27:40.255+00:00</dateLogged>
    <nodeIdentifier>urn:node:DEMO2</nodeIdentifier>
  </logEntry>
</d1:log>
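A log aggregator needs the slice attributes (start, count, total) and the individual entries from a response like the one above. The following sketch reads them with the Python standard library. The sample document is a shortened copy of the response shown above, and `summarize_log` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example response; only two entries and a few fields are kept.
SAMPLE_LOG = """
<d1:log xmlns:d1="http://ns.dataone.org/service/types/v1" count="2" start="0" total="1273">
  <logEntry>
    <entryId>1</entryId>
    <identifier>MNodeTierTests.201260152556757.</identifier>
    <event>create</event>
    <dateLogged>2012-02-29T23:25:58.104+00:00</dateLogged>
  </logEntry>
  <logEntry>
    <entryId>2</entryId>
    <identifier>TierTesting:testObject:RightsHolder_Person.4</identifier>
    <event>create</event>
    <dateLogged>2012-02-29T23:26:38.828+00:00</dateLogged>
  </logEntry>
</d1:log>
""".strip()


def summarize_log(xml_text: str) -> dict:
    """Return the slice attributes and a list of entries as plain dictionaries."""
    root = ET.fromstring(xml_text)
    # Only the root element is namespace-qualified in the example response;
    # the logEntry children are unqualified, so no prefix is needed here.
    entries = [
        {child.tag: child.text for child in entry}
        for entry in root.findall("logEntry")
    ]
    return {
        "start": int(root.get("start")),
        "count": int(root.get("count")),
        "total": int(root.get("total")),
        "entries": entries,
    }


summary = summarize_log(SAMPLE_LOG)
print(summary["total"], [entry["event"] for entry in summary["entries"]])
```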
Returns a document describing the capabilities of the Member Node.
The response at the Member Node base URL is for convenience only. Clients of Member Nodes SHOULD use the /node URL to retrieve the node capabilities document.
Rest URL: | GET / and GET /node |
---|---|
Returns: | The technical capabilities of the Member Node |
Return type: | |
Raises: |
|
Example
export NODE="https://demo2.test.dataone.org/knb/d1/mn"
curl -k -s "$NODE/v1/node" | xml fo
<?xml version="1.0" encoding="UTF-8"?>
<d1:node xmlns:d1="http://ns.dataone.org/service/types/v1" replicate="true" synchronize="true" type="mn" state="up">
  <identifier>urn:node:DEMO2</identifier>
  <name>DEMO2 Metacat Node</name>
  <description>A DataONE member node implemented in Metacat.</description>
  <baseURL>https://demo2.test.dataone.org:443/knb/d1/mn</baseURL>
  <services>
    <service name="MNRead" version="v1" available="true"/>
    <service name="MNCore" version="v1" available="true"/>
    <service name="MNAuthorization" version="v1" available="true"/>
    <service name="MNStorage" version="v1" available="true"/>
    <service name="MNReplication" version="v1" available="true"/>
  </services>
  <synchronization>
    <schedule hour="*" mday="*" min="0/3" mon="*" sec="10" wday="?" year="*"/>
    <lastHarvested>2012-03-06T14:57:39.851+00:00</lastHarvested>
    <lastCompleteHarvest>2012-03-06T14:57:39.851+00:00</lastCompleteHarvest>
  </synchronization>
  <ping success="true"/>
  <subject>CN=urn:node:DEMO2, DC=dataone, DC=org</subject>
  <contactSubject>CN=METACAT1, DC=dataone, DC=org</contactSubject>
</d1:node>
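A client can use the capabilities document to decide which APIs a node offers before calling them. The sketch below lists the services flagged as available. The sample is a trimmed copy of the document above in which MNStorage has been switched to `available="false"` purely for illustration, and `available_services` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Trimmed and deliberately modified copy of the example node document.
SAMPLE_NODE = """
<d1:node xmlns:d1="http://ns.dataone.org/service/types/v1"
         replicate="true" synchronize="true" type="mn" state="up">
  <identifier>urn:node:DEMO2</identifier>
  <baseURL>https://demo2.test.dataone.org:443/knb/d1/mn</baseURL>
  <services>
    <service name="MNRead" version="v1" available="true"/>
    <service name="MNCore" version="v1" available="true"/>
    <service name="MNStorage" version="v1" available="false"/>
  </services>
</d1:node>
""".strip()


def available_services(node_xml: str) -> set:
    """Return (name, version) pairs for every service marked available."""
    root = ET.fromstring(node_xml)
    return {
        (service.get("name"), service.get("version"))
        for service in root.findall("services/service")
        if service.get("available") == "true"
    }


print(sorted(available_services(SAMPLE_NODE)))
```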
The MNRead API implements methods that enable object management operations on a Member Node.
Tier | REST | Function | Parameters |
---|---|---|---|
Tier 1 | GET /object/{pid} | get() | (session, pid) -> Types.OctetStream |
Tier 1 | GET /meta/{pid} | getSystemMetadata() | (session, pid) -> Types.SystemMetadata |
Tier 1 | HEAD /object/{pid} | describe() | (session, pid) -> Types.DescribeResponse |
Tier 1 | GET /checksum/{pid}[?checksumAlgorithm={checksumAlgorithm}] | getChecksum() | (session, pid, [checksumAlgorithm]) -> Types.Checksum |
Tier 1 | GET /object[?fromDate={fromDate}&toDate={toDate} &formatId={formatId}&replicaStatus={replicaStatus} &start={start}&count={count}] | listObjects() | (session, [fromDate], [toDate], [formatId], [replicaStatus], [start=0], [count=1000]) -> Types.ObjectList |
Tier 1 | POST /error | synchronizationFailed() | (session, message) -> Types.Boolean |
Tier 1 | GET /replica/{pid} | getReplica() | (session, pid) -> Types.OctetStream |
Retrieve an object identified by pid from the node.
The response MUST contain the bytes of the indicated object, and the checksum of the bytes retrieved SHOULD match the SystemMetadata.checksum recorded in the Types.SystemMetadata.
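A client that wants to confirm this can recompute the checksum over the retrieved bytes and compare it with the value in the system metadata. The sketch below assumes the algorithm labels "MD5" and "SHA-1" that appear in the examples in this document; the mapping table and the name `checksum_matches` are illustrative only.

```python
import hashlib

# Assumed mapping from the algorithm labels seen in this document to hashlib names.
HASHLIB_NAMES = {"MD5": "md5", "SHA-1": "sha1"}


def checksum_matches(data: bytes, algorithm: str, expected_hex: str) -> bool:
    """Recompute the digest of `data` and compare it with the recorded value."""
    digest = hashlib.new(HASHLIB_NAMES[algorithm], data).hexdigest()
    # Hex digests are compared case-insensitively.
    return digest == expected_hex.strip().lower()


payload = b"example object bytes"
recorded = hashlib.md5(payload).hexdigest()
print(checksum_matches(payload, "MD5", recorded))
```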
If the object does not exist on the node servicing the request, then Exceptions.NotFound must be raised even if the object exists on another node in the DataONE system.
Also implemented by Coordinating Nodes as CNRead.get().
Use Cases: | |
---|---|
Rest URL: | GET /object/{pid} |
Parameters: |
|
Returns: | Bytes of the specified object. |
Return type: | |
Raises: |
|
Examples
(GET) Retrieve the object with identifier “XYZ332”:
export NODE="https://demo2.test.dataone.org/knb/d1/mn"
curl -k "$NODE/v1/object/XYZ332"
... data ...
(GET) Attempt to retrieve a non-existent object (and show headers in response):
export NODE="https://demo2.test.dataone.org/knb/d1/mn"
curl -D - "$NODE/v1/object/DOESNTEXIST"
HTTP/1.1 404 Not Found
Date: Tue, 06 Mar 2012 15:25:35 GMT
Server: Apache/2.2.14 (Ubuntu)
Content-Length: 196
Vary: Accept-Encoding
Content-Type: text/xml

<?xml version="1.0" encoding="UTF-8"?>
<error detailCode="1800" errorCode="404" name="NotFound">
  <description>No system metadata could be found for given PID: DOESNTEXIST</description>
</error>
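Error bodies such as the NotFound response above carry the exception name, HTTP error code, detail code, and description. One way a client might turn such a body into a Python exception is sketched below. The class `DataONEError` and the function `parse_error` are hypothetical names, not part of any DataONE client library.

```python
import xml.etree.ElementTree as ET


class DataONEError(Exception):
    """Illustrative container for the fields of a DataONE error document."""

    def __init__(self, name, error_code, detail_code, description):
        super().__init__(f"{name} ({error_code}/{detail_code}): {description}")
        self.name = name
        self.error_code = error_code
        self.detail_code = detail_code
        self.description = description


def parse_error(body: bytes) -> DataONEError:
    """Build a DataONEError from the XML body of an error response."""
    root = ET.fromstring(body)
    return DataONEError(
        name=root.get("name"),
        error_code=int(root.get("errorCode")),
        detail_code=root.get("detailCode"),
        description=(root.findtext("description") or "").strip(),
    )


SAMPLE_ERROR = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<error detailCode="1800" errorCode="404" name="NotFound">'
    b"<description>No system metadata could be found for given PID: DOESNTEXIST</description>"
    b"</error>"
)

error = parse_error(SAMPLE_ERROR)
print(error.name, error.error_code, error.detail_code)
```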
Describes the object identified by pid by returning the associated system metadata object.
If the object does not exist on the node servicing the request, then Exceptions.NotFound MUST be raised even if the object exists on another node in the DataONE system.
Use Cases: | |
---|---|
Rest URL: | GET /meta/{pid} |
Parameters: |
|
Returns: | System metadata object describing the object. |
Return type: | |
Raises: |
|
Examples
(GET) Retrieve system metadata from a Member Node for object “XYZ332” which happens to be science metadata (an EML document) that has been obsoleted by a new version with identifier “XYZ333”:
curl http://m1.dataone.org/mn/v1/meta/XYZ332
<?xml version="1.0" encoding="UTF-8"?>
<d1:systemMetadata xmlns:d1="http://ns.dataone.org/service/types/v1">
  <serialVersion>1</serialVersion>
  <identifier>XYZ332</identifier>
  <formatId>eml://ecoinformatics.org/eml-2.1.0</formatId>
  <size>20875</size>
  <checksum algorithm="MD5">e7451c1775461b13987d7539319ee41f</checksum>
  <submitter>uid=mbauer,o=NCEAS,dc=ecoinformatics,dc=org</submitter>
  <rightsHolder>uid=mbauer,o=NCEAS,dc=ecoinformatics,dc=org</rightsHolder>
  <accessPolicy>
    <allow>
      <subject>uid=jdoe,o=NCEAS,dc=ecoinformatics,dc=org</subject>
      <permission>read</permission>
      <permission>write</permission>
      <permission>changePermission</permission>
    </allow>
    <allow>
      <subject>public</subject>
      <permission>read</permission>
    </allow>
    <allow>
      <subject>uid=nceasadmin,o=NCEAS,dc=ecoinformatics,dc=org</subject>
      <permission>read</permission>
      <permission>write</permission>
      <permission>changePermission</permission>
    </allow>
  </accessPolicy>
  <replicationPolicy replicationAllowed="false"/>
  <obsoletes>XYZ331</obsoletes>
  <obsoletedBy>XYZ333</obsoletedBy>
  <archived>true</archived>
  <dateUploaded>2008-04-01T23:00:00.000+00:00</dateUploaded>
  <dateSysMetadataModified>2012-06-26T03:51:25.058+00:00</dateSysMetadataModified>
  <originMemberNode>urn:node:TEST</originMemberNode>
  <authoritativeMemberNode>urn:node:TEST</authoritativeMemberNode>
</d1:systemMetadata>
(GET) Attempt to retrieve system metadata for an object that does not exist:
curl http://cn.dataone.org/cn/v1/meta/SomeObjectID
<?xml version="1.0" encoding="UTF-8"?>
<error detailCode="1800" errorCode="404" name="NotFound">
  <description>No system metadata could be found for given PID: SomeObjectID</description>
</error>
This method provides a lighter weight mechanism than MNRead.getSystemMetadata() for a client to determine basic properties of the referenced object. The response should indicate properties that are typically returned in a HTTP HEAD request: the date last modified, the size of the object, the type of the object (the SystemMetadata.formatId).
The principal indicated by session must have read privileges on the object, otherwise Exceptions.NotAuthorized is raised.
If the object does not exist on the node servicing the request, then Exceptions.NotFound must be raised even if the object exists on another node in the DataONE system.
Note that this method is likely to be called frequently and so efficiency should be taken into consideration during implementation.
Use Cases: | |
---|---|
Rest URL: | HEAD /object/{pid} |
Parameters: |
|
Returns: | A set of values providing a basic description of the object. |
Return type: | |
Raises: |
|
Examples
(HEAD) Retrieve information about the object with identifier “ABC123”:
curl -I http://mn1.dataone.org/mn/v1/object/ABC123
HTTP/1.1 200 OK
Last-Modified: Wed, 16 Dec 2009 13:58:34 GMT
Content-Length: 10400
Content-Type: application/octet-stream
DataONE-ObjectFormat: eml://ecoinformatics.org/eml-2.0.1
DataONE-Checksum: SHA-1,2e01e17467891f7c933dbaa00e1459d23db3fe4f
DataONE-SerialVersion: 1234
(HEAD) An error response to a describe() request for object “IDONTEXIST”:
curl -I http://mn1.dataone.org/mn/v1/object/IDONTEXIST
HTTP/1.1 404 Not Found
Last-Modified: Wed, 16 Dec 2009 13:58:34 GMT
Content-Length: 1182
Content-Type: text/xml
DataONE-Exception-Name: NotFound
DataONE-Exception-DetailCode: 1380
DataONE-Exception-Description: The specified object does not exist on this node.
DataONE-Exception-PID: IDONTEXIST
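The header names in the two examples above are enough for a client to build a small description record without reading a body. The sketch below parses the successful response headers. The function name `parse_describe_headers` and the keys of the returned dictionary are assumptions for illustration; only the header names come from the example.

```python
from email.utils import parsedate_to_datetime


def parse_describe_headers(headers: dict) -> dict:
    """Extract the basic object description from describe() response headers."""
    # HTTP header names are case-insensitive, so normalize before the lookups.
    lowered = {name.lower(): value for name, value in headers.items()}
    # DataONE-Checksum has the form "<algorithm>,<hex digest>" in the example.
    algorithm, _, digest = lowered["dataone-checksum"].partition(",")
    return {
        "format_id": lowered["dataone-objectformat"],
        "size": int(lowered["content-length"]),
        "checksum_algorithm": algorithm.strip(),
        "checksum": digest.strip(),
        "serial_version": int(lowered["dataone-serialversion"]),
        "last_modified": parsedate_to_datetime(lowered["last-modified"]),
    }


SAMPLE_HEADERS = {
    "Last-Modified": "Wed, 16 Dec 2009 13:58:34 GMT",
    "Content-Length": "10400",
    "Content-Type": "application/octet-stream",
    "DataONE-ObjectFormat": "eml://ecoinformatics.org/eml-2.0.1",
    "DataONE-Checksum": "SHA-1,2e01e17467891f7c933dbaa00e1459d23db3fe4f",
    "DataONE-SerialVersion": "1234",
}

info = parse_describe_headers(SAMPLE_HEADERS)
print(info["checksum_algorithm"], info["size"])
```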
Returns a Types.Checksum for the specified object using an accepted hashing algorithm. The result is used to determine if two instances referenced by a PID are identical, hence it is necessary that MNs can ensure that the returned checksum is valid for the referenced object either by computing it on the fly or by using a cached value that is certain to be correct.
Rest URL: | GET /checksum/{pid}[?checksumAlgorithm={checksumAlgorithm}] |
---|---|
Parameters: |
|
Returns: | The checksum value originally computed for the specified object. |
Return type: | |
Raises: |
|
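As an illustration of the REST URL pattern for getChecksum(), the sketch below builds the request URL with the optional checksumAlgorithm parameter. It assumes that the identifier is percent-encoded as a single path segment (identifiers in this document include characters such as `:` and `/`) and that the `/v1` version prefix from the curl examples applies. The host name and the helper `checksum_url` are hypothetical.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def checksum_url(base_url: str, pid: str, checksum_algorithm: Optional[str] = None) -> str:
    """Build the GET /checksum/{pid} URL, adding checksumAlgorithm only when given."""
    # safe="" also encodes "/" so the identifier stays a single path segment.
    url = f"{base_url.rstrip('/')}/v1/checksum/{quote(pid, safe='')}"
    if checksum_algorithm is not None:
        url += "?" + urlencode({"checksumAlgorithm": checksum_algorithm})
    return url


# "mn.example.org" is a placeholder host, not a real Member Node.
print(checksum_url("https://mn.example.org/mn", "hdl:10255/dryad.218/mets.xml", "SHA-1"))
```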
Retrieve the list of objects present on the MN that match the calling parameters. This method is required to support the process of Member Node synchronization. At a minimum, this method MUST be able to return a list of objects that match:
fromDate < SystemMetadata.dateSysMetadataModified
but is expected to also support date range (by also specifying toDate), and should also support slicing of the matching set of records by indicating the starting index of the response (where 0 is the index of the first item) and the count of elements to be returned.
Note that date time precision is limited to one millisecond. If no timezone information is provided, UTC will be assumed.
Access control for this method MUST be configured to allow calling by Coordinating Nodes and MAY be configured to allow more general access.
Use Cases: | |
---|---|
Rest URL: | GET /object[?fromDate={fromDate}&toDate={toDate} &formatId={formatId}&replicaStatus={replicaStatus} &start={start}&count={count}] |
Parameters: |
|
Returns: | The list of PIDs that match the query criteria. If none match, an empty list is returned. |
Return type: | |
Raises: |
|
Example
Retrieve an object list from a member node, and pipe the response through an xml formatter for easier viewing:
curl "https://gmn-dev.test.dataone.org/mn/v1/object?count=5" | xml fo
<?xml version="1.0"?>
<ns1:objectList xmlns:ns1="http://ns.dataone.org/service/types/v1" count="5" start="0" total="12">
<objectInfo>
<identifier>AnserMatrix.htm</identifier>
<formatId>eml://ecoinformatics.org/eml-2.0.0</formatId>
<checksum algorithm="MD5">0e25cf59d7bd4d57154cc83e0aa32b34</checksum>
<dateSysMetadataModified>1970-05-27T06:12:49</dateSysMetadataModified>
<size>11048</size>
</objectInfo>
...
<objectInfo>
<identifier>hdl:10255/dryad.218/mets.xml</identifier>
<formatId>eml://ecoinformatics.org/eml-2.0.0</formatId>
<checksum algorithm="MD5">65c4e0a9c4ccf37c1e3ecaaa2541e9d5</checksum>
<dateSysMetadataModified>1987-01-14T07:09:09</dateSysMetadataModified>
<size>2796</size>
</objectInfo>
</ns1:objectList>
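Because the response reports start, count, and total, a harvesting client can work out whether another slice remains. The sketch below computes the start index for the next request from those attributes. The function name `next_start` is hypothetical, and the sample is reduced to the root element of the response above.

```python
import xml.etree.ElementTree as ET


def next_start(object_list_xml: str):
    """Return the start index of the next slice, or None when nothing remains."""
    root = ET.fromstring(object_list_xml)
    start, count, total = (int(root.get(key)) for key in ("start", "count", "total"))
    following = start + count
    # An empty slice (count == 0) also ends the iteration to avoid looping forever.
    return following if count > 0 and following < total else None


SAMPLE_LIST = (
    '<ns1:objectList xmlns:ns1="http://ns.dataone.org/service/types/v1" '
    'count="5" start="0" total="12"/>'
)

print(next_start(SAMPLE_LIST))
```

With the values from the example (start 0, count 5, total 12) the following request would use start=5, then start=10, after which the list is exhausted.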
This is a callback method used by a CN to indicate to a MN that it cannot complete synchronization of the science metadata identified by pid. When called, the MN should take steps to record the problem description and notify an administrator or the data owner of the issue.
A successful response is indicated by a HTTP status of 200. An unsuccessful call is indicated by a returned exception and associated HTTP status code.
Access control for this method MUST be configured to allow calling by Coordinating Nodes and MAY be configured to allow more general access.
Use Cases: | |
---|---|
Rest URL: | POST /error |
Parameters: |
|
Returns: | A successful response is indicated by a HTTP 200 status. An unsuccessful call is indicated by returning the appropriate exception. |
Return type: | |
Raises: |
|
Called by a target Member Node to fulfill the replication request originated by a Coordinating Node calling MNReplication.replicate(). This is a request to make a replica copy of the object, and differs from a call to GET /object in that it should be logged as a replication event rather than a read event on that object.
If the object being retrieved is restricted access, then a Tier 2 or higher Member Node MUST make a call to CNReplication.isNodeAuthorized() to verify that the Subject of the caller is authorized to retrieve the content.
A successful operation is indicated by a HTTP status of 200 on the response.
Failure of the operation MUST be indicated by returning an appropriate exception.
Use Cases: | |
---|---|
Rest URL: | GET /replica/{pid} |
Parameters: |
|
Returns: | Bytes of the specified object. |
Return type: | |
Raises: |
|
The MNQuery API is an optional API that may be implemented by Member Nodes that intend to support querying the local repository. The actual form of the query is undefined, and it is expected that a small set of well known query engine types will be supported.
Tier | REST | Function | Parameters |
---|---|---|---|
Tier 1 | GET /query/{queryEngine}/{query} | query() | (session, queryEngine, query) -> Types.OctetStream |
Tier 1 | GET /query/{queryType} | getQueryEngineDescription() | (session, queryEngine) -> Types.QueryEngineDescription |
Tier 1 | GET /query | listQueryEngines() | (session) -> Types.QueryEngineList |
Submit a query against the specified queryEngine and return the response as formatted by the queryEngine.
The query() operation may be implemented by more than one type of search engine and the queryEngine parameter indicates which search engine is targeted. The value and form of query is determined by the specific query engine.
For example, the SOLR search engine will accept many of the standard parameters of SOLR, including field restrictions and faceting.
This method is optional for Member Nodes, but if implemented, both getQueryEngineDescription and listQueryEngines must also be implemented.
Note
This method is in DRAFT status and is scheduled for version 1.1 of the APIs
Use Cases: | |
---|---|
Rest URL: | GET /query/{queryEngine}/{query} |
Parameters: |
|
Returns: | The structure of the response is determined by the chosen search engine and parameters provided to it. |
Return type: | |
Raises: |
|
Provides metadata about the query service of the specified queryEngine. The metadata provides a brief description of the query engine, its version, its schema version, and an optional list of fields supported by the query engine.
Note
This method is in DRAFT status and is scheduled for version 1.1 of the APIs
Rest URL: | GET /query/{queryType} |
---|---|
Parameters: |
|
Returns: | A list of fields that are supported by the search index and additional metadata. |
Return type: | Types.QueryEngineDescription |
Raises: |
|
Returns a list of query engines, i.e. supported values for the queryEngine parameter of the getQueryEngineDescription and query operations.
The list of search engines available may be influenced by the authentication status of the request.
Note
This method is in DRAFT status and is scheduled for version 1.1 of the APIs
Rest URL: | GET /query |
---|---|
Parameters: | session (Types.Session) – Session information that contains the identity of the calling user as retrieved from the X.509 certificate provided with the request. The certificate must be traceable to an authority recognized by DataONE, currently CILogon. Transmitted as part of the SSL handshake process. |
Returns: | A list of names of queryEngines available to the user identified by session. |
Return type: | Types.QueryEngineList |
Raises: |
|
Provides mechanisms for Member Nodes to verify access to resources for users (subject). See the document Identity Management and Authenticated Session Management for more details on some authentication options.
Tier | REST | Function | Parameters |
---|---|---|---|
Tier 2 | GET /isAuthorized/{pid}?action={action} | isAuthorized() | (session, pid, action) -> boolean |
Tier 2 | POST /dirtySystemMetadata | systemMetadataChanged() | (session, pid, serialVersion, dateSysMetaLastModified) -> boolean |
Test if the user identified by the provided session has authorization for operation on the specified object.
A successful operation is indicated by a return HTTP status of 200.
Failure is indicated by an exception such as NotAuthorized being returned.
The body of the response is arbitrary and SHOULD be ignored by the caller.
If the action is not authorized, then a NotAuthorized exception MUST be raised.
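On the client side, these rules reduce to a boolean: a 200 status means the action is allowed and the body can be ignored, while a NotAuthorized exception means it is not. The sketch below assumes the exception arrives as an XML error document with a `name` attribute, as in the NotFound examples elsewhere in this document. The function name `is_authorized` and the choice to raise `RuntimeError` for other failures are illustrative.

```python
import xml.etree.ElementTree as ET


def is_authorized(status: int, body: bytes = b"") -> bool:
    """Map an isAuthorized() HTTP response to True or False."""
    if status == 200:
        # The body is arbitrary on success and is ignored.
        return True
    try:
        name = ET.fromstring(body).get("name")
    except ET.ParseError:
        name = None
    if name == "NotAuthorized":
        return False
    # Any other failure (for example NotFound) is not an authorization answer.
    raise RuntimeError(f"isAuthorized failed: HTTP {status}, exception {name!r}")


denied = b'<error detailCode="0" errorCode="401" name="NotAuthorized"><description>denied</description></error>'
print(is_authorized(200, b"ignored"), is_authorized(401, denied))
```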
Note
Should perhaps add convenience methods for “canRead()” and “canWrite()” to verify that a user is able to read / write an object.
Use Cases: | |
---|---|
Rest URL: | GET /isAuthorized/{pid}?action={action} |
Parameters: |
|
Returns: | True if the operation is allowed |
Return type: | boolean |
Raises: |
|
Notifies the Member Node that the authoritative copy of system metadata on the Coordinating Nodes has changed.
The Member Node SHOULD schedule an update to its information about the affected object by retrieving an authoritative copy from a Coordinating Node.
Note that date time precision is limited to one millisecond.
Access control for this method MUST be configured to allow calling by Coordinating Nodes.
Rest URL: | POST /dirtySystemMetadata |
---|---|
Parameters: |
|
Returns: | True if notification was received OK, otherwise an error is returned. |
Return type: | boolean |
Raises: |
|
Tier | REST | Function | Parameters |
---|---|---|---|
Tier 3 | POST /object | create() | (session, pid, object, sysmeta) -> Types.Identifier |
Tier 3 | PUT /object/{pid} | update() | (session, pid, object, newPid, sysmeta) -> Types.Identifier |
Tier 3 | POST /generate | generateIdentifier() | (session, scheme, [fragment]) -> Types.Identifier |
Tier 3 | DELETE /object/{pid} | delete() | (session, pid) -> Types.Identifier |
Tier 3 | PUT /archive/{pid} | archive() | (session, pid) -> Types.Identifier |
Called by a client to add a new object to the Member Node.
The pid must not exist in the DataONE system or should have been previously reserved using CNCore.reserveIdentifier().
The caller MUST have authorization to write or create content on the Member Node.
Use Cases: | |
---|---|
Rest URL: | POST /object |
Parameters: |
|
Returns: | The identifier that was used to insert the document into the system. |
Return type: | |
Raises: |
|
Examples
The outgoing request body must be encoded as MIME multipart/form-data with the system metadata portion and the object as file attachments.
(POST) Create a new object with a given identifier (XYZ33256):
curl -E /tmp/x509up_u502 \
-F "pid=XYZ33256" \
-F "object=@sciencemetadata.xml" \
-F "sysmeta=@sysmeta.xml" \
https://m1.dataone.org/mn/v1/object
HTTP/1.1 200 Success
Content-Type:
Date: Wed, 16 Dec 2009 13:58:34 GMT
Content-Length: 355
XYZ33256
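For clients that do not use curl, the multipart/form-data body described above can be assembled with the Python standard library alone. The sketch below is a simplified encoder and makes several assumptions: the helper name `encode_multipart` is hypothetical, file parts are labelled `application/octet-stream`, and field and file names are assumed to contain no quotes or line breaks. Sending the request with a client certificate is not shown.

```python
import uuid


def encode_multipart(fields: dict, files: dict):
    """Encode form fields and file attachments as multipart/form-data.

    fields maps name -> str; files maps name -> (filename, bytes).
    Returns (content_type_header_value, body_bytes).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]
    for name, (filename, content) in files.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode(),
            b"Content-Type: application/octet-stream",
            b"",
            content,
        ]
    # The closing delimiter carries two trailing hyphens.
    lines += [f"--{boundary}--".encode(), b""]
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)


content_type, body = encode_multipart(
    {"pid": "XYZ33256"},
    {
        "object": ("sciencemetadata.xml", b"<eml/>"),
        "sysmeta": ("sysmeta.xml", b"<systemMetadata/>"),
    },
)
print(content_type.split(";")[0], len(body))
```

The returned content type must be sent as the request's Content-Type header so that the server can find the boundary.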
The system metadata included with the create call must contain values for the elements required to be set by clients (see System Metadata). The system metadata document can be crafted by hand or preferably with a tool such as generate_sysmeta.py which is available in the d1_instance_generator Python package. See documentation included with that package for more information on its operation.
For example, the system metadata document for the example above was generated using the sequence of commands:
<<log on to cilogon.org and download my certificate>>
MYSUBJECT=`python my_subject.py /tmp/x509up_u502`
echo $MYSUBJECT
CN=Dave Vieglais T799,O=Google,C=US,DC=cilogon,DC=org
python generate_sysmeta.py -f sciencemetadata.xml \
-i "XYZ33256" \
-s "$MYSUBJECT" \
-t "eml://ecoinformatics.org/eml-2.0.1" \
> sysmeta.xml
The generated system metadata document contains default information that indicates:
The generated system metadata document is presented below:
<?xml version='1.0' encoding='UTF-8'?>
<ns1:systemMetadata xmlns:ns1="http://ns.dataone.org/service/types/v1">
<identifier>XYZ33256</identifier>
<formatId>eml://ecoinformatics.org/eml-2.0.1</formatId>
<size>22936</size>
<checksum algorithm="MD5">2ec0084d1e11e0d5c9a46ba6a230aa85</checksum>
<submitter>CN=Dave Vieglais T799,O=Google,C=US,DC=cilogon,DC=org</submitter>
<rightsHolder>CN=Dave Vieglais T799,O=Google,C=US,DC=cilogon,DC=org</rightsHolder>
<accessPolicy>
<allow>
<subject>public</subject>
<permission>read</permission>
</allow>
<allow>
<subject>CN=Dave Vieglais T799,O=Google,C=US,DC=cilogon,DC=org</subject>
<permission>changePermission</permission>
</allow>
</accessPolicy>
<replicationPolicy replicationAllowed="true"/>
<dateUploaded>2012-02-20T20:39:19.664495</dateUploaded>
<dateSysMetadataModified>2012-02-20T20:39:19.70598</dateSysMetadataModified>
</ns1:systemMetadata>
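To show how the values in such a document relate to the object bytes, the sketch below builds a document with the same elements as the example above: the size and MD5 checksum are computed from the data, and the access policy grants public read plus changePermission to the submitter. This is not the generate_sysmeta.py tool and its output has not been validated against the DataONE schema. The function name `build_sysmeta` and the `d1` namespace prefix are choices made for this illustration.

```python
import hashlib
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

D1_NS = "http://ns.dataone.org/service/types/v1"


def build_sysmeta(pid: str, data: bytes, format_id: str, subject: str) -> str:
    """Return a system metadata document mirroring the example's element order."""
    ET.register_namespace("d1", D1_NS)
    root = ET.Element(f"{{{D1_NS}}}systemMetadata")

    def add(parent, tag, text=None, **attrs):
        child = ET.SubElement(parent, tag, attrs)
        child.text = text
        return child

    add(root, "identifier", pid)
    add(root, "formatId", format_id)
    add(root, "size", str(len(data)))
    add(root, "checksum", hashlib.md5(data).hexdigest(), algorithm="MD5")
    add(root, "submitter", subject)
    add(root, "rightsHolder", subject)
    policy = add(root, "accessPolicy")
    for who, permission in (("public", "read"), (subject, "changePermission")):
        allow = add(policy, "allow")
        add(allow, "subject", who)
        add(allow, "permission", permission)
    add(root, "replicationPolicy", replicationAllowed="true")
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    add(root, "dateUploaded", now)
    add(root, "dateSysMetadataModified", now)
    return ET.tostring(root, encoding="unicode")


print(build_sysmeta("XYZ33256", b"<eml/>", "eml://ecoinformatics.org/eml-2.0.1", "CN=Example Subject"))
```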
This method is called by clients to update objects on Member Nodes.
Updates an existing object by creating a new object identified by newPid on the Member Node which explicitly obsoletes the object identified by pid through appropriate changes to the SystemMetadata of pid and newPid.
The Member Node sets SystemMetadata.obsoletedBy on the object being obsoleted to the pid of the new object. It then updates SystemMetadata.dateSysMetadataModified on both the new and old objects. The modified system metadata entries then become available in MNRead.listObjects(). This ensures that a Coordinating Node will pick up the changes when filtering on SystemMetadata.dateSysMetadataModified.
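The bookkeeping described for update() can be sketched with an in-memory dictionary standing in for the Member Node's system metadata store. Setting `obsoletes` on the new entry is an assumption based on the field shown in the system metadata example for getSystemMetadata(). The function name `apply_update` and the dictionary layout are hypothetical.

```python
from datetime import datetime, timezone


def apply_update(store: dict, pid: str, new_pid: str, new_sysmeta: dict) -> str:
    """Record that new_pid obsoletes pid and touch both modification dates."""
    if pid not in store:
        raise KeyError(f"object to update not found: {pid}")
    if new_pid in store:
        raise ValueError(f"identifier already in use: {new_pid}")
    now = datetime.now(timezone.utc)
    # The obsoleted object points forward to its replacement.
    store[pid]["obsoletedBy"] = new_pid
    store[pid]["dateSysMetadataModified"] = now
    # The new object points back to the one it replaces (assumed, see lead-in).
    store[new_pid] = dict(
        new_sysmeta,
        identifier=new_pid,
        obsoletes=pid,
        dateSysMetadataModified=now,
    )
    return new_pid


metadata_store = {"XYZ332": {"identifier": "XYZ332"}}
print(apply_update(metadata_store, "XYZ332", "XYZ333", {"formatId": "eml://ecoinformatics.org/eml-2.1.0"}))
```

Because both entries receive a fresh dateSysMetadataModified, a listObjects() call filtered by fromDate would return both of them.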
Use Cases: | |
---|---|
Rest URL: | PUT /object/{pid} |
Parameters: |
|
Returns: | The identifier of the document that is replacing the original, which should be the same as newPid. |
Return type: | |
Raises: |
|
Given a scheme and optional fragment, generates an identifier with that scheme and fragment that is unique.
The message body is encoded as MIME multipart/form-data.
Rest URL: | POST /generate |
---|---|
Parameters: |
|
Returns: | The identifier that was generated |
Return type: | |
Raises: |
|
Todo
Need to provide a list of recommended identifier schemes.
Deletes an object managed by DataONE from the Member Node. Member Nodes MUST check that the caller (typically a Coordinating Node) is authorized to perform this function.
The delete operation will be used primarily by Coordinating Nodes to help manage the number of replicas of an object that are present in the entire system.
The operation removes the object from further interaction with DataONE services. The implementation may delete the object bytes, and in general should do so since a delete operation may be in response to a problem with the object (e.g. it contains malicious content, is inappropriate, or is the subject of a legal request).
If the object does not exist on the node servicing the request, then an Exceptions.NotFound exception is raised. The message body of the exception SHOULD contain a hint as to the location of the CNRead.resolve() method.
Use Cases: | |
---|---|
Rest URL: | DELETE /object/{pid} |
Parameters: |
|
Returns: | The identifier of the object that was deleted. |
Return type: | |
Raises: |
|
Hides an object managed by DataONE from search operations, effectively preventing its discovery during normal operations.
The operation does not delete the object bytes, but instead sets the Types.SystemMetadata.archived flag to True. This ensures that the object can still be resolved (and hence remain valid for existing citations and cross references), though will not appear in searches.
Member Nodes MUST check that the caller is authorized to perform this function.
If the object does not exist on the node servicing the request, then an Exceptions.NotFound exception is raised. The message body of the exception SHOULD contain a hint as to the location of the CNRead.resolve() method.
Rest URL: | PUT /archive/{pid} |
---|---|
Parameters: |
|
Returns: | The identifier of the object that was archived. |
Return type: | |
Raises: |
|
The Replication API provides methods to support CN-directed replication of content between MNs.
Tier | REST | Function | Parameters |
---|---|---|---|
Tier 4 | POST /replicate | replicate() | (session, sysmeta, sourceNode) -> boolean |
Called by a Coordinating Node to request that the Member Node create a copy of the specified object by retrieving it from another Member Node and storing it locally so that it can be made accessible to the DataONE system.
A successful operation is indicated by a HTTP status of 200 on the response.
Failure of the operation MUST be indicated by returning an appropriate exception.
Access control for this method MUST be configured to allow calling by Coordinating Nodes.
Use Cases: | |
---|---|
Rest URL: | POST /replicate |
Parameters: |
|
Returns: | True if everything works OK, otherwise an error is returned. |
Return type: | boolean |
Raises: |
|
Response
The response should be a valid HTTP response with a blank or arbitrary body. Only the HTTP header information is considered by the requestor. A successful response must have a HTTP status code of 200. In case of an error condition, the appropriate HTTP status code must be set, and an exception or error information may be returned in the response.
The outgoing request body must be encoded as MIME multipart/form-data with the system metadata portion as a file attachment and the sourceNode parameter as a form field.
curl -v -X POST "https://localhost:8000/mn/v1/replicate" \
-H "Content-type: multipart/form-data" \
-F "sysmeta=@systemmetadata.xml" \
-F "sourceNode=urn:node:MN_B"
* About to connect() to localhost port 8000 (#0)
* Trying ::1... Connection refused
* Trying fe80::1... Connection refused
* Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 8000 (#0)
> POST /mn/v1/replicate HTTP/1.1
> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
> Host: localhost:8000
> Accept: */*
> Content-Length: 1021
> Expect: 100-continue
> Content-type: multipart/form-data; boundary=----------------------------88ffdd8070e9
>
* Done waiting for 100-continue
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Date: Fri, 14 Jan 2011 22:01:13 GMT
< Server: WSGIServer/0.1 Python/2.6.1
< Content-Type: text/xml
<
<
* Closing connection #0