Log Aggregation Overview

See subversion log for revision information.


Logs will need to be aggregated from all Member Nodes and stored at each Coordinating Node (CN). Each Coordinating Node must replicate its logging data to the other Coordinating Nodes. The process described below relies on several technologies: Quartz Scheduler, Hazelcast Data Distribution, Metacat Repository Storage, and the Solr Search platform.

Log Aggregation Scheduling

Log aggregation will be performed once a day per Member Node. There is no fixed time of day at which log records are harvested. Each run harvests all records from the last run up to the beginning of the current day (00:00:00). Consequently, the aggregated logs on the CN may lag by up to 48 hours, depending on when a query is executed and when the last harvest ran: a run early on one day stops at that day's 00:00:00, and the next run may not occur until late on the following day. Quartz will be used in the Log Scheduler class as the mechanism to schedule log harvesting.
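
The harvest window and the resulting lag can be illustrated with a minimal sketch. It is written in Python purely as a neutral notation and does not use Quartz; the function names, the bookkeeping value last_harvested_until, and the example dates are all hypothetical, and time zone handling is left out.

```python
from datetime import datetime, time, timedelta


def harvest_window(last_harvested_until, now):
    """Return the (start, end) range of log records for one harvest run.

    start: where the previous run stopped (hypothetical bookkeeping value)
    end:   00:00:00 of the day on which this run executes
    """
    end = datetime.combine(now.date(), time.min)
    return last_harvested_until, end


def lag_before_next_run(window_end, next_run):
    """Age of the oldest record not yet harvested when the next run starts."""
    return next_run - window_end


# Illustrative dates only: a run at 00:05 on 2 March covers records
# up to 2 March 00:00:00.
start, end = harvest_window(datetime(2012, 3, 1), datetime(2012, 3, 2, 0, 5))

# If the next run happens late on 3 March, records logged just after
# 2 March 00:00:00 stay un-aggregated for almost 48 hours.
lag = lag_before_next_run(end, datetime(2012, 3, 3, 23, 55))
```

In this example the lag is 47 hours 55 minutes, which is why a query against the CN may miss up to roughly two days of recent activity.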

Log Aggregation Distributed Execution

The Log Aggregator will run as a distributed executable class for each Member Node, as well as locally for the Metacat instance that acts as the storage mechanism for the Coordinating Node. The distribution of the execution will be handled by Hazelcast. At any given time, the number of running Log Aggregator executions is capped at the number of CNs multiplied by the maximum allowed per CN.
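
The cap on concurrent executions is simple arithmetic; the sketch below shows how it might be applied when deciding which Member Nodes can be harvested next. This is an assumption about how the limit could be enforced, not a description of the Hazelcast configuration; the function names, node identifiers, and numbers are hypothetical.

```python
def max_concurrent_aggregators(num_cns, max_per_cn):
    """Cluster-wide limit: number of CNs multiplied by the maximum per CN."""
    return num_cns * max_per_cn


def runnable_now(pending_member_nodes, running, num_cns, max_per_cn):
    """Return the pending Member Nodes that can start without exceeding the cap."""
    limit = max_concurrent_aggregators(num_cns, max_per_cn)
    free_slots = max(0, limit - running)
    return pending_member_nodes[:free_slots]


# Hypothetical values: 3 CNs, at most 2 executions each, 4 already running.
limit = max_concurrent_aggregators(3, 2)
pending = ["mn-a", "mn-b", "mn-c", "mn-d", "mn-e"]
to_start = runnable_now(pending, running=4, num_cns=3, max_per_cn=2)
```

With these values the limit is 6, so only two of the five pending Member Nodes can be started; the rest wait for a slot.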

The Log Aggregator task will run a getLogRecords query against a Member Node. It will loop through the Types.Log results, calling Log Publisher for each Types.LogEntry. Each Types.LogEntry is placed on a distributed Hazelcast topic and broadcast to all members of the Hazelcast cluster (including the member that published it).
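
The query-and-publish loop can be sketched as follows. An in-memory class stands in for the distributed Hazelcast topic, so no Hazelcast API is used; InMemoryTopic, aggregate, fake_get_log_records, and the entry fields shown are illustrative assumptions rather than the actual classes or schema.

```python
class InMemoryTopic:
    """Stand-in for a distributed topic: every listener receives every message."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def publish(self, message):
        for callback in self._listeners:
            callback(message)


def aggregate(get_log_records, topic):
    """Query one Member Node and publish each returned log entry to the topic."""
    published = 0
    for entry in get_log_records():
        topic.publish(entry)
        published += 1
    return published


def fake_get_log_records():
    # Hypothetical result of a getLogRecords call against one Member Node.
    return [
        {"entryId": "1", "event": "read"},
        {"entryId": "2", "event": "create"},
    ]


topic = InMemoryTopic()
received_by_publishing_cn, received_by_other_cn = [], []
topic.add_listener(received_by_publishing_cn.append)  # publisher also listens
topic.add_listener(received_by_other_cn.append)

count = aggregate(fake_get_log_records, topic)
```

Because the publishing member is itself a listener, it receives its own entries through the same path as every other CN, so all nodes index from a single code path.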

Log Aggregation Indexing

On each Coordinating Node, the Log Aggregator will run a Log Subscriber that listens for events on the distributed Hazelcast topic. For each Types.LogEntry message received, it will call Log Indexer. Log Indexer will take the Types.LogEntry and create a Solr document, then submit the document to Solr for addition to the Lucene index.
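
The subscriber-to-indexer hand-off can be sketched in the same spirit. A plain dictionary stands in for the Solr index, so no Solr API is used; the class and function names, the document fields, and the choice to key documents by entry identifier are all assumptions made for illustration.

```python
def to_index_document(log_entry):
    """Flatten a log entry into the field/value mapping an index would store."""
    return {
        "id": log_entry["entryId"],
        "event": log_entry["event"],
        "nodeId": log_entry["nodeIdentifier"],
    }


class LogSubscriber:
    """Receives log entries from the topic and hands each one to the indexer."""

    def __init__(self, index):
        self._index = index  # dict standing in for the search index

    def on_message(self, log_entry):
        doc = to_index_document(log_entry)
        # Keying by id means a re-delivered entry overwrites, not duplicates.
        self._index[doc["id"]] = doc


index = {}
subscriber = LogSubscriber(index)
entry = {"entryId": "42", "event": "read", "nodeIdentifier": "mn-a"}
subscriber.on_message(entry)
subscriber.on_message(entry)  # same entry delivered twice
```

Keying each document by a unique identifier keeps indexing idempotent in this sketch, which matters if the same entry could be delivered more than once.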


Figure 1. Sequence diagram of the Log Aggregator process.