DataONE API - 2.0

Use Case 25 - Detect Damaged Content

Revisions
View document revision history.
Goal
The system should scan for damaged or defaced data and metadata using a validation process.
Summary

All content added to or incorporated into the DataONE infrastructure has a checksum computed for it, providing a signature specific to the particular sequence of bytes present in that object. Any change to the object will, with overwhelming probability, result in a different checksum being calculated for that object.
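As a minimal sketch of the idea, the following Python snippet computes a checksum over an object's bytes and shows that a one-byte change produces a different value. The function name compute_checksum, the default choice of SHA-256, and the sample data are illustrative assumptions and are not taken from the DataONE specification.

```python
import hashlib


def compute_checksum(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of the given bytes (hypothetical helper)."""
    digest = hashlib.new(algorithm)
    digest.update(data)
    return digest.hexdigest()


# Illustrative sample objects; the second differs from the first by one byte.
original = b"temperature,depth\n12.1,5\n"
altered = b"temperature,depth\n12.1,6\n"

# Identical bytes always give the same checksum.
print(compute_checksum(original) == compute_checksum(original))
# A changed object gives a different checksum.
print(compute_checksum(original) != compute_checksum(altered))
```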

It would be prohibitively expensive to continually recompute checksums for all content contained in the system. Therefore, the system should provide for periodic, random checks that recompute the checksum of selected objects and compare it with the checksum recorded when the object was added.
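A random spot check of this kind could be sketched as follows. This is only an illustration under stated assumptions: the in-memory dictionaries standing in for object storage and for the recorded checksums, the function name spot_check, and the use of SHA-256 are all hypothetical.

```python
import hashlib
import random


def spot_check(store, recorded, sample_size, rng=None):
    """Recompute checksums for a random sample of objects.

    store:    dict mapping identifier -> object bytes (stand-in for storage)
    recorded: dict mapping identifier -> checksum recorded at ingest
    Returns the identifiers whose content is missing or fails comparison.
    """
    rng = rng or random.Random()
    identifiers = list(recorded)
    sample = rng.sample(identifiers, min(sample_size, len(identifiers)))
    damaged = []
    for pid in sample:
        data = store.get(pid)
        if data is None or hashlib.sha256(data).hexdigest() != recorded[pid]:
            damaged.append(pid)
    return damaged


# Illustrative data: record checksums, then corrupt one object.
store = {"obj1": b"alpha", "obj2": b"beta", "obj3": b"gamma"}
recorded = {pid: hashlib.sha256(data).hexdigest() for pid, data in store.items()}
store["obj2"] = b"betA"
print(spot_check(store, recorded, sample_size=3))
```

Sampling keeps the cost of each pass bounded; over repeated passes every object is eventually likely to be checked.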

The system should automatically replace content determined to be incorrect, and system content managers should be notified of such events. Alternatively, bad content could be placed in a queue for semi-automatic processing by the data managers.
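The two handling paths described above (automatic replacement with notification, or queuing for data managers) might be combined as in the sketch below. Everything here is an assumption made for illustration: the idea of sourcing the correct copy from a list of candidate replica copies, the function name handle_damaged, the notify callback, and the use of SHA-256.

```python
import hashlib
from collections import deque


def handle_damaged(pid, expected, replicas, store, review_queue, notify):
    """Replace a damaged object from a verified copy, or queue it for review.

    replicas: iterable of candidate byte strings for this object (hypothetical)
    notify:   callable that receives a message for the content managers
    Returns True if the object was replaced automatically.
    """
    for data in replicas:
        # Only accept a candidate whose checksum matches the recorded value.
        if hashlib.sha256(data).hexdigest() == expected:
            store[pid] = data
            notify(f"{pid}: replaced damaged copy with a verified copy")
            return True
    review_queue.append(pid)
    notify(f"{pid}: no verified copy found; queued for manual review")
    return False


# Illustrative usage with in-memory stand-ins.
good = b"known good bytes"
expected = hashlib.sha256(good).hexdigest()
store = {"obj1": b"damaged bytes"}
queue = deque()
messages = []
handle_damaged("obj1", expected, [b"also bad", good], store, queue, messages.append)
print(store["obj1"] == good, list(queue), messages)
```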

Actors
  • Data manager
  • Data owner
  • Member Nodes
  • Coordinating Nodes
  • Content quality checking service
Preconditions
  • Content present in DataONE
  • Checksums for all content computed and preserved
  • Mechanisms available for spot checks of checksum for any object held at any location in the system
Triggers
  • Bad content is discovered through a failed checksum comparison.
Post Conditions
  • Bad content present in the system has been replaced with known correct objects
  • Data managers are notified
  • Data owners may be notified
../../_images/25_seq.png

Figure 1. Interactions for use case 25, System validates metadata and data