Retention management

Retention Management provides operations to manage the lifecycle of documents. Plans can be defined that perform actions on a document such as protecting the document from deletion or modification, or deleting the document if certain criteria are met. The actions in such a plan can be time-based or triggered by defined events.

ImageMaster retention management is compatible with the retention management functionality of Amazon S3, ECS, Hitachi HCP and NetApp SnapLock.

The retention period, which is set via ImageMaster, is propagated to these storage systems’ retention management mechanisms.

Retention marks

The retention mark is a property of each document that defines the actions that can be taken on the document. The retention mark may determine whether a document can be deleted or modified, and it may thus prohibit deletion or check-in operations. Each document has exactly one such mark. A newly created document has a mark that places no restrictions on the document. The retention mark of a document can be modified through the execution of a retention plan.

Retention plan

A retention plan can be seen as a very simple program that is associated with a document and executed over the document's lifetime. A plan is always associated with exactly one document. A plan has a list of steps to execute, which is called a program. Each step consists of an operator to be called and the parameters to be passed to the operator. The parameters are defined as RAQL expressions and are evaluated in a query before being passed to the operator. An optional comment can be defined, which is used for logging the plan execution. A retention plan is associated with a role that is used to call each operator in the program. The role must have the appropriate rights to carry out the actions specified in the program.

Plans can be added by project-specific extensions using the extension point IRetentionAssignmentExt or the addPlan and addPlanTemplate web service operations.

As part of the plan execution, the operator for each step in the program is called with the defined parameters, and the step is removed from the program after successful execution. The execution of each step also modifies the plan state, which defines when the next step in the plan can be executed. Possible states include waiting for a specific time or event. The following plan states exist:

  • RUN: The plan is ready to run.

  • ABORT: The plan has been aborted.

  • FINISH: The plan execution has finished.

  • WAIT_TIME: The plan is waiting until a specific time.

  • WAIT_EVENT: The plan is waiting for a global event to be signaled.

  • WAIT_PERIODIC_EVENT: The plan is waiting for a global and periodic event to be signaled again.

  • WAIT_DOC_EVENT: The plan is waiting for a document-specific event to be signaled for the document to which the plan belongs.

  • WAIT_PERIODIC_DOC_EVENT: The plan is waiting for a document-specific and periodic event to be signaled again for the document.

  • WAIT_DOCUMENT_CHANGE: The plan is waiting for a change in the plan's document.

The following example shows a plan that protects a document from deletion until an event is signaled:

<!-- set a delete protection -->
<step operator="ima:setProtectionTag">
  <parameter name="tag">const("DELETE_PROTECTED")</parameter>
  <parameter name="message">
    const("Deletion message.")
  </parameter>
  <comment>Example step comment</comment>
</step>

<!-- wait for an event -->
<step operator="ima:waitEvent">
  <parameter name="eventId">
    const(ef6a4802-9aad-4634-abd3-5f61c344d968)
  </parameter>
</step>

<!-- remove the delete protection -->
<step operator="ima:setProtectionTag">
  <parameter name="tag">const("NONE")</parameter>
</step>

Figure 577: Example – retention plan
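The execution semantics described above (run steps until the plan waits or finishes, and remove each step only after it succeeded) can be illustrated with a small model. This is a minimal sketch in Python, not ImageMaster code: the Plan class, the step helper and the idea that a step returns the next plan state are assumptions made for illustration. Only the state names and operator names are taken from this section.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class PlanState(Enum):
    RUN = "RUN"
    ABORT = "ABORT"
    FINISH = "FINISH"
    WAIT_TIME = "WAIT_TIME"
    WAIT_EVENT = "WAIT_EVENT"
    WAIT_PERIODIC_EVENT = "WAIT_PERIODIC_EVENT"
    WAIT_DOC_EVENT = "WAIT_DOC_EVENT"
    WAIT_PERIODIC_DOC_EVENT = "WAIT_PERIODIC_DOC_EVENT"
    WAIT_DOCUMENT_CHANGE = "WAIT_DOCUMENT_CHANGE"


# Assumption: a step is modelled as a callable that performs its action
# and returns the state the plan is in afterwards.
Step = Callable[[], PlanState]


@dataclass
class Plan:
    program: List[Step]
    state: PlanState = PlanState.RUN

    def run(self) -> PlanState:
        """Execute steps until the plan waits, aborts or finishes."""
        while self.state is PlanState.RUN and self.program:
            next_state = self.program[0]()  # if this raises, the step stays in the program
            self.program.pop(0)             # removed only after successful execution
            self.state = next_state
        if self.state is PlanState.RUN:     # nothing left to execute
            self.state = PlanState.FINISH
        return self.state


log: List[str] = []


def step(name: str, next_state: PlanState) -> Step:
    """Hypothetical helper: a step that logs its name and yields next_state."""
    def call() -> PlanState:
        log.append(name)
        return next_state
    return call


# Model of the three-step example plan shown above.
plan = Plan([
    step("ima:setProtectionTag DELETE_PROTECTED", PlanState.RUN),
    step("ima:waitEvent", PlanState.WAIT_EVENT),
    step("ima:setProtectionTag NONE", PlanState.RUN),
])

state_after_first_run = plan.run()  # stops at the wait step, one step remains
plan.state = PlanState.RUN          # stands in for the event being signaled
final_state = plan.run()            # executes the last step and finishes
```

In this model the first run leaves the plan in WAIT_EVENT with the final step still in the program; the second run removes the protection and ends in FINISH.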

Plan execution

Retention plans are executed in the following ways:

  • Asynchronously:

    The asynchronous job implemented by the RetentionJobImpl class is responsible for finding plans in the system that can be executed and running them until they reach a waiting or finished state.

  • Synchronously on creation:

    When a new plan is added to a document, it is immediately executed until it reaches a waiting or finished state. This ensures that changes made to the document at the beginning of the plan take immediate effect.

  • Synchronously on certain operations:

    If an operation is executed that depends on the content of a document's retention mark, then the plans of that document are executed synchronously until they reach a waiting or finished state before the retention mark of the document is checked. This ensures that the retention mark of the document represents the current state of the document according to the attached plans. Operations that trigger a plan execution are the deletion of the document or of a revision, and the check-in of new or modified revisions.
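The third case can be pictured as "bring the mark up to date, then check it". The following Python sketch is only a model of that ordering: the Document class, the delete_document function and the representation of a plan as a callable are assumptions, while the tag values DELETE_PROTECTED and NONE come from the example plan earlier in this section.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Document:
    retention_mark: str = "NONE"
    # Assumption: each plan is a callable that runs until it waits or
    # finishes and may update the retention mark while doing so.
    plans: List[Callable[["Document"], None]] = field(default_factory=list)
    deleted: bool = False


def delete_document(doc: Document) -> None:
    # 1. Run the document's plans synchronously ...
    for plan in doc.plans:
        plan(doc)
    # 2. ... and only then check the retention mark.
    if doc.retention_mark == "DELETE_PROTECTED":
        raise PermissionError("document is protected from deletion")
    doc.deleted = True


def expire_protection(doc: Document) -> None:
    """A plan whose waiting period is over and that now removes the protection."""
    doc.retention_mark = "NONE"


def protect(doc: Document) -> None:
    """A plan that has not run yet and sets a delete protection."""
    doc.retention_mark = "DELETE_PROTECTED"


# The stored mark is outdated; running the plan first allows the deletion.
stale = Document(retention_mark="DELETE_PROTECTED", plans=[expire_protection])
delete_document(stale)

# The stored mark is unrestricted, but the pending plan protects the document.
fresh = Document(plans=[protect])
try:
    delete_document(fresh)
    blocked = False
except PermissionError:
    blocked = True
```

Without the synchronous run, both outcomes would be reversed, because the check would see a mark that the asynchronous job had not yet updated.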

Transactions

Each step that is executed in a plan is generally executed in its own transaction. This ensures that a failure during the execution of a step does not cause the effects of all previous steps to be rolled back as well.

An exception is the synchronous execution on the creation of a plan. In this case, the steps in the plan are executed inside the transaction that created the plan. This is necessary to ensure that effects from the surrounding transaction are also visible during plan execution. Specifically, it allows a plan that is added to a newly created document to access that document and to modify properties such as its retention mark. This would be impossible if the plan steps ran in an isolated transaction.
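The difference between the two modes can be demonstrated with any transactional store. The sketch below uses an in-memory SQLite database from the Python standard library purely as a stand-in; the table name, the step helpers and both runner functions are hypothetical and do not reflect how ImageMaster manages its transactions internally.

```python
import sqlite3
from typing import Callable, Iterable, List

Step = Callable[[sqlite3.Connection], None]


def record(name: str) -> Step:
    """Hypothetical step that leaves one visible effect in the database."""
    def step(conn: sqlite3.Connection) -> None:
        conn.execute("INSERT INTO effects (step) VALUES (?)", (name,))
    return step


def failing(conn: sqlite3.Connection) -> None:
    """Hypothetical step that writes something and then fails."""
    conn.execute("INSERT INTO effects (step) VALUES ('partial')")
    raise RuntimeError("operator failed")


def run_isolated(conn: sqlite3.Connection, steps: Iterable[Step]) -> bool:
    """General case: one transaction per step, stop at the first failure."""
    for step in steps:
        try:
            step(conn)
            conn.commit()      # effects of this step are now permanent
        except Exception:
            conn.rollback()    # only the failed step is undone
            return False
    return True


def run_in_caller_transaction(conn: sqlite3.Connection, steps: Iterable[Step]) -> None:
    """Creation case: steps share the caller's transaction; the caller commits."""
    for step in steps:
        step(conn)


def effects(conn: sqlite3.Connection) -> List[str]:
    rows = conn.execute("SELECT step FROM effects ORDER BY rowid")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE effects (step TEXT)")
conn.commit()

# Step 3 fails: steps 1 and 2 survive, step 4 is not attempted.
ok = run_isolated(conn, [record("step 1"), record("step 2"), failing, record("step 4")])
after_isolated = effects(conn)

# Shared transaction: a failure undoes everything the transaction did.
try:
    run_in_caller_transaction(conn, [record("step A"), failing])
    conn.commit()
except RuntimeError:
    conn.rollback()
after_shared = effects(conn)
```

The shared variant trades isolation for visibility: the steps can see uncommitted work of the surrounding transaction, but they also succeed or fail together with it.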

Roles

Each plan has an associated role. This role is used to execute the steps in the plan, regardless of which role triggered the plan execution and whether the execution is asynchronous or synchronous. The plan role is used to perform the following actions:

  • resolving the document of the plan

  • evaluating the parameters for an operator call

  • calling the operator

The role needs to have read access to the document to which the plan is attached and read access to any objects referenced in the operator parameters. Furthermore, it must have the required rights to perform the actions of the operators in the plan.

The role does not need the rights to update the plan during execution. Updating the state and program of the plan is done using the predefined "planRunner" role.

Operator parameters

Operator parameters are specified as RAQL expressions. Each parameter expression must result in exactly one value that is then passed to the operator.

An operator expression can reference the document to which the plan belongs in the following ways:

  • Referencing a column called "plan:document", which contains the document that belongs to the executed retention plan. A galaxy of the same name is available, which contains exactly this one document.

  • Referencing a column called "plan:revision" which contains the latest revision of the plan's document. A galaxy of the same name is also available.

  • For each single-value attribute in the document, a column named after the attribute is available. It contains the attribute value from the document's latest revision, or null if the attribute is empty.

Specifically, each expression is evaluated as if it were placed in the following query:

project(["result", <expression>],
  project(
    ["plan:document", <the document>],
    ["plan:revision", <the latest revision>],
    ["docType.attr1", <value of attribute docType.attr1>],
    ["docType.attr2", <value of attribute docType.attr2>],
    ... further single-value attributes ...,
    empty()))

Operator parameters are accessed by name. The order of the parameters is not relevant.

Example: This expression returns the creation time of the plan's document:

ima:document:creationTime(ref("plan:document"))

Example: This expression returns the value of the attribute "docType.attr1":

ref("docType.attr1")
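The way the columns are made available to a parameter expression can be mimicked outside RAQL. The following Python sketch is a model only: expressions are represented as Python callables instead of RAQL text, the document is a plain dictionary, and the function names, the parameter names "created", "attr1" and "attr2", and the attribute "docType.keywords" are hypothetical. Only the column names "plan:document" and "plan:revision" and the attribute names "docType.attr1" and "docType.attr2" are taken from this section.

```python
from typing import Any, Callable, Dict

Ref = Callable[[str], Any]
Expression = Callable[[Ref], Any]


def build_columns(document: Dict[str, Any]) -> Dict[str, Any]:
    """Build the row of columns an expression can reference."""
    latest = document["revisions"][-1]
    columns: Dict[str, Any] = {
        "plan:document": document,
        "plan:revision": latest,
    }
    for name, value in latest["attributes"].items():
        # Assumption: multi-value attributes are modelled as lists and
        # get no column; None stands for an empty single-value attribute.
        if not isinstance(value, list):
            columns[name] = value
    return columns


def evaluate_parameters(parameters: Dict[str, Expression],
                        document: Dict[str, Any]) -> Dict[str, Any]:
    """Evaluate each named parameter to exactly one value."""
    columns = build_columns(document)

    def ref(name: str) -> Any:
        return columns[name]

    # Parameters are keyed by name, so their order does not matter.
    return {name: expression(ref) for name, expression in parameters.items()}


document = {
    "creationTime": "2024-01-01T00:00:00Z",
    "revisions": [
        {"attributes": {"docType.attr1": "old"}},
        {"attributes": {
            "docType.attr1": "current",
            "docType.attr2": None,
            "docType.keywords": ["a", "b"],
        }},
    ],
}

values = evaluate_parameters(
    {
        "created": lambda ref: ref("plan:document")["creationTime"],
        "attr1": lambda ref: ref("docType.attr1"),
        "attr2": lambda ref: ref("docType.attr2"),
    },
    document,
)
```

The attribute columns are read from the latest revision, so "attr1" evaluates to the value of the second revision in this model, and the empty attribute evaluates to None in place of null.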

Events

Events define conditions that retention plans can wait for before taking further actions. What exactly an event represents is outside the scope of retention management. Calling the signalEvent operation indicates that the condition represented by the event has occurred, and plans that are waiting for this event become ready to run.

Events have a unique id and name, a description and a signal time that indicates when the event has been signaled. An event is initially created unsignaled and once it has been signaled, it cannot go back to being unsignaled.

Events can be created either as global or document-specific events. A global event has one system-wide signal state. A document-specific event has one signal state per document in the system.

Events can be created either as one-time events or as periodic events. A one-time event can be signaled only once (per document). Calling the signalEvent operation a second time for such an event results in an error. A periodic event can be signaled multiple times. The signal time of such an event is the most recent time the event was signaled. If a plan waits for a periodic event, it will wait until the event has been signaled at least once since the plan started waiting.
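The signaling rules for one-time and periodic events can be summarized in a few lines of Python. This is a sketch of the rules as described here, not of the actual implementation: the Event class, the is_ready function and the numeric timestamps are hypothetical, only global events are modelled, and two details are assumptions that the text does not settle (a plan that starts waiting for an already signaled one-time event is treated as ready immediately, and a signal at exactly the time the wait started counts).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    name: str
    periodic: bool = False
    signal_time: Optional[float] = None  # None means "not yet signaled"

    def signal(self, now: float) -> None:
        if self.signal_time is not None and not self.periodic:
            # A one-time event can be signaled only once.
            raise RuntimeError(f"one-time event {self.name!r} was already signaled")
        # For periodic events this keeps the most recent signal time.
        self.signal_time = now


def is_ready(event: Event, wait_started: float) -> bool:
    """Can a plan that started waiting at wait_started continue?"""
    if event.signal_time is None:
        return False
    if event.periodic:
        # Must have been signaled at least once since the wait began.
        return event.signal_time >= wait_started
    return True  # assumption: a signaled one-time event stays signaled


one_time = Event("contract-ended")
one_time.signal(now=10.0)
try:
    one_time.signal(now=11.0)
    second_signal_failed = False
except RuntimeError:
    second_signal_failed = True

periodic = Event("fiscal-year-end", periodic=True)
periodic.signal(now=10.0)
ready_before = is_ready(periodic, wait_started=20.0)  # signal is older than the wait
periodic.signal(now=30.0)
ready_after = is_ready(periodic, wait_started=20.0)   # signaled again since the wait began
```

A document-specific event could be modelled the same way by keeping one signal time per document instead of a single system-wide one.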

Holds

A hold is a special condition that can be placed on a document to halt the normal processing of retention plans. A hold adds its own retention plan that defines what actions are taken for the duration of the hold. As long as the hold is active (its plan is not yet finished or aborted), the execution of all other plans for the document is suspended. Once the plan finishes or is aborted, the hold is automatically lifted. A lifted hold is still present on the document for information purposes, but it is no longer active and no longer has any effect.

Multiple holds can be placed on a document. The most recently added hold is the current hold, and only its plan is executed. A previously added hold is suspended, just like the normal plans on the document, until the current hold is lifted. This means that holds are generally lifted only in the reverse order in which they were added to the document. An exception is the manual cancellation of the retention plan associated with a hold. This causes the hold to be lifted even if it is not the current hold.

When a hold is added to a document, the current retention mark of the document is remembered, and is automatically restored once the hold is lifted. This allows the hold plan to freely modify the retention mark of the document without affecting the suspended normal retention plans.
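The stacking behavior of holds and the save-and-restore handling of the retention mark can be sketched as follows. The Hold and Document classes and their methods are hypothetical, and the hold names are made up. The text does not say what happens to the retention mark when a hold that is not the current one is cancelled, so this model restores the mark only when the current hold is lifted; treat that as an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hold:
    name: str
    saved_mark: str      # retention mark at the time the hold was added
    active: bool = True


@dataclass
class Document:
    retention_mark: str = "NONE"
    holds: List[Hold] = field(default_factory=list)

    def add_hold(self, name: str) -> Hold:
        hold = Hold(name, saved_mark=self.retention_mark)
        self.holds.append(hold)
        return hold

    def current_hold(self) -> Optional[Hold]:
        """The most recently added hold that is still active."""
        for hold in reversed(self.holds):
            if hold.active:
                return hold
        return None

    def normal_plans_suspended(self) -> bool:
        return self.current_hold() is not None

    def lift(self, hold: Hold) -> None:
        was_current = hold is self.current_hold()
        hold.active = False  # the hold stays on the document for information
        if was_current:
            # Assumption: the mark is restored only for the current hold.
            self.retention_mark = hold.saved_mark


doc = Document()
first = doc.add_hold("litigation")       # remembers "NONE"
doc.retention_mark = "DELETE_PROTECTED"  # set by the hold's plan
second = doc.add_hold("audit")           # remembers "DELETE_PROTECTED"
current_name = doc.current_hold().name

doc.lift(second)
mark_after_second = doc.retention_mark   # back to what "audit" remembered
doc.lift(first)
mark_after_first = doc.retention_mark    # back to what "litigation" remembered
```

After both holds are lifted, the document still lists two holds, neither of them active, and the normal retention plans are no longer suspended.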

Plan templates

Retention plans can be predefined as templates to be assigned to individual documents later. The templates are stored in the system document type _PLAN_TEMPLATE. Templates can be manually assigned with the addPlanTemplate method or automatically assigned to new documents when the documents are created.

The _PLAN_TEMPLATE_HOLD boolean attribute defines whether the template is a hold template or not. Assigning a hold template to a document creates a hold on the document, while assigning a non-hold template just adds a normal plan.

The other attributes of the _PLAN_TEMPLATE type define the fields of the plan:

  • _PLAN_TEMPLATE_NAME: the name of the template

  • _PLAN_TEMPLATE_DESCRIPTION: the description of the template

  • _PLAN_TEMPLATE_ROLENAME: the name of the role for plan execution

  • _PLAN_TEMPLATE_PROGRAM: the program for the plan

  • _PLAN_TEMPLATE_ASSIGNMENT: optional assignment information for product extensions to decide if the template should be assigned or not. The retention management itself does not use this information.
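As an illustration of how the template attributes map to a plan, the following Python sketch models a template as a dictionary keyed by the attribute names listed above. Everything else is an assumption: the assign_template function, the dictionary-based document, the template names, the role name "retentionRole", and the idea that the program attribute holds the step XML as text.

```python
from typing import Any, Dict

hold_template: Dict[str, Any] = {
    "_PLAN_TEMPLATE_NAME": "litigation-hold",
    "_PLAN_TEMPLATE_DESCRIPTION": "Protects the document while a case is open",
    "_PLAN_TEMPLATE_ROLENAME": "retentionRole",  # hypothetical role name
    "_PLAN_TEMPLATE_PROGRAM": (
        '<step operator="ima:setProtectionTag">'
        '<parameter name="tag">const("DELETE_PROTECTED")</parameter>'
        "</step>"
    ),
    "_PLAN_TEMPLATE_HOLD": True,
    "_PLAN_TEMPLATE_ASSIGNMENT": None,  # not used by retention management itself
}

# Same fields, but flagged as a normal (non-hold) template.
normal_template = dict(
    hold_template,
    _PLAN_TEMPLATE_NAME="standard-retention",
    _PLAN_TEMPLATE_HOLD=False,
)


def assign_template(document: Dict[str, list], template: Dict[str, Any]) -> Dict[str, Any]:
    """Create a plan from a template and attach it as a hold or a normal plan."""
    plan = {
        "name": template["_PLAN_TEMPLATE_NAME"],
        "role": template["_PLAN_TEMPLATE_ROLENAME"],
        "program": template["_PLAN_TEMPLATE_PROGRAM"],
    }
    target = "holds" if template["_PLAN_TEMPLATE_HOLD"] else "plans"
    document[target].append(plan)
    return plan


document: Dict[str, list] = {"plans": [], "holds": []}
assign_template(document, hold_template)
assign_template(document, normal_template)
```

In this model the same program text ends up either in the list of holds or in the list of normal plans, depending only on the _PLAN_TEMPLATE_HOLD flag.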