ARM manual mode

In ARM manual mode, if a physical storage becomes unavailable, archiving stops immediately for the affected realms by default. An administrator then performs the following steps manually:

  • Put the unavailable storage into maintenance mode to suspend archiving in this storage.

  • Adjust the minCopy value of the affected realm accordingly.

If the realm contains at least one other physical storage and the adjusted minCopy value of the realm permits it, archiving can continue using this spare storage.

  • After reestablishing the availability of the temporarily missing storage, reset the minCopy value.

  • Start a manual recovery process to resynchronize the archives.

Missing documents are read from the spare storage, validated against a checksum, and written to the previously unavailable storage. This reestablishes the redundancy level of two (or more) for all documents.
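The recovery step described above can be illustrated with a minimal sketch. It is not the ARM implementation: the storages are modelled as plain dictionaries, SHA-256 is assumed as the checksum algorithm, and the names sha256_hex and recover_missing are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Checksum helper (SHA-256 is an assumption made for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def recover_missing(source: dict, target: dict, checksums: dict) -> list:
    """Copy documents that exist in source but not in target.

    Each document is validated against its recorded checksum before it is
    written. Returns the IDs of the documents that were copied.
    """
    copied = []
    for doc_id, data in source.items():
        if doc_id in target:
            continue  # already redundant, nothing to do
        if sha256_hex(data) != checksums[doc_id]:
            raise ValueError(f"checksum mismatch for {doc_id}")
        target[doc_id] = data
        copied.append(doc_id)
    return copied
```

Validating before writing means that a corrupted copy in the spare storage is never propagated to the recovered storage.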

Example – maintenance with manual recovery

Suppose realm “Standard” includes two physical archives, A1 and A2, and the minCopy value for “Standard” is 2. At some point, the connection to A1 is lost due to a network failure. The outage of A1 is handled as follows:

  • All archiving stops because A1 is unavailable.

  • The administrator turns on the maintenance mode for the unavailable storage A1 and sets the minCopy value of realm “Standard” to 1.

  • Archiving proceeds: documents are physically written only to the second storage, A2.

  • The connection with the storage A1 is restored.

  • The administrator turns off the maintenance mode for A1 and resets the minCopy value to 2 for realm “Standard”.

  • Documents are again written to both archives, A1 and A2. The binary data of documents archived while the connection to A1 was broken resides only in A2.

  • The administrator starts an ARM recovery process that copies (and verifies) the missing documents’ binary data from A2 to A1.
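The sequence in this example can be replayed with a small toy model. The classes Archive and Realm and the function run_example are hypothetical names made up for this sketch and are not part of ImageMaster. The model is deliberately simplified: archiving stops whenever fewer archives are usable than minCopy requires.

```python
from dataclasses import dataclass, field


@dataclass
class Archive:
    name: str
    reachable: bool = True      # False while the connection is lost
    maintenance: bool = False   # True while the administrator suspends it
    docs: dict = field(default_factory=dict)


@dataclass
class Realm:
    archives: list
    min_copy: int

    def archive(self, doc_id: str, data: bytes) -> list:
        """Write to every usable archive; stop if fewer than min_copy are usable."""
        usable = [a for a in self.archives if a.reachable and not a.maintenance]
        if len(usable) < self.min_copy:
            raise RuntimeError("archiving stopped: fewer usable archives than minCopy")
        for a in usable:
            a.docs[doc_id] = data
        return [a.name for a in usable]


def run_example() -> dict:
    a1, a2 = Archive("A1"), Archive("A2")
    realm = Realm([a1, a2], min_copy=2)
    realm.archive("d1", b"one")                  # normal operation: A1 and A2

    a1.reachable = False                         # network failure on A1
    try:
        realm.archive("d2", b"two")
        stopped = False
    except RuntimeError:
        stopped = True                           # archiving stops

    a1.maintenance, realm.min_copy = True, 1     # administrator intervenes
    during_outage = realm.archive("d2", b"two")  # written to A2 only

    a1.reachable = True                          # connection restored
    a1.maintenance, realm.min_copy = False, 2    # administrator resets
    realm.archive("d3", b"three")                # both archives again

    missing = sorted(set(a2.docs) - set(a1.docs))
    for doc_id in missing:                       # manual recovery A2 -> A1
        a1.docs[doc_id] = a2.docs[doc_id]
    return {"stopped": stopped, "during_outage": during_outage,
            "missing": missing, "in_sync": a1.docs == a2.docs}
```

In this model, only the document archived during the outage is missing from A1 before the recovery, and both archives hold the same documents afterwards.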

Maintenance mode

When an archive is unavailable for a longer period of time or for a planned downtime, set the archive into maintenance mode. An archive in maintenance mode is not accessed in any read or write operation of the ImageMaster system, but file stubs are written to the database so that attachments archived while the archive was unavailable can be recovered. Later, these stubs are used to recover the missing attachments into the archive by creating and running ARM recovery processes. Alternatively, if the auto recover flag is enabled for the realm that includes the archive, the recovery can be done fully automatically by the auto recover mode of ARM.
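The role of the file stubs can be sketched as follows. This is an illustration only: the actual stub format and database schema are not described here, so StubStore, write_document, and recover_from_stubs are hypothetical names, and a stub is assumed to be a pair of archive name and document ID.

```python
from dataclasses import dataclass, field


@dataclass
class StubStore:
    """Stand-in for the database table that holds file stubs."""
    stubs: list = field(default_factory=list)

    def record(self, archive_name: str, doc_id: str) -> None:
        self.stubs.append((archive_name, doc_id))

    def take_for(self, archive_name: str) -> list:
        """Return and remove the document IDs recorded for one archive."""
        taken = [d for a, d in self.stubs if a == archive_name]
        self.stubs = [(a, d) for a, d in self.stubs if a != archive_name]
        return taken


def write_document(doc_id, data, archives, maintenance, stubs):
    """Write to each archive, or record a stub if it is in maintenance mode."""
    for name, store in archives.items():
        if name in maintenance:
            stubs.record(name, doc_id)   # the archive itself is not accessed
        else:
            store[doc_id] = data


def recover_from_stubs(name, archives, source_name, stubs):
    """Copy the documents named by the stubs from a healthy archive."""
    for doc_id in stubs.take_for(name):
        archives[name][doc_id] = archives[source_name][doc_id]
```

Because the stubs name exactly the documents that were skipped, the recovery does not need to compare the full contents of both archives.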

The advantage of setting a planned maintenance mode is that the archive is consistently treated as unavailable for the whole period. Without it, the temporary archive unavailability mechanism (see Temporary archive unavailability mechanism) would try to reach the unavailable archive each time the onlineCheckingInterval elapses, and the archive would temporarily be used in read/write operations again, potentially resulting in repeated errors.

After the unavailability period, maintenance mode has to be explicitly disabled for the archive. As a reminder, when changing the maintenance mode for an archive, check the minCopy value of the affected realms and reset it to its original value.

Maintenance mode can also be used for temporary outages, as described above in the ARM manual mode example.

To switch maintenance mode on or off, or to alter a realm’s minCopy value, use the AdminClient (see [UM AdminClient], chapter General archive properties).

Manual recovery after unexpected data loss

Manual ARM recovery processes can also be used to reestablish a redundancy level of two or more after an unexpected data loss in redundant archive storages. In such a scenario, binary objects created during a predefined time span can be restored from the functioning storage and duplicated into a new storage that replaces the dysfunctional or damaged storage area. For details on how to create, start, and monitor recovery processes, see the ARM system manual [SM ARM].
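A time-span-based restore of this kind might look like the following sketch. It makes no claim about how ARM recovery processes are configured (see [SM ARM] for that). The function restore_time_span is a hypothetical name, and each storage is assumed to map an object ID to a pair of creation time and binary data.

```python
from datetime import datetime


def restore_time_span(source: dict, replacement: dict,
                      start: datetime, end: datetime) -> list:
    """Copy objects created within [start, end] from source to replacement.

    Both storages map object ID -> (creation time, binary data).
    Returns the IDs that were restored, in sorted order.
    """
    restored = []
    for obj_id in sorted(source):
        created, data = source[obj_id]
        if start <= created <= end and obj_id not in replacement:
            replacement[obj_id] = (created, data)
            restored.append(obj_id)
    return restored
```

Objects outside the time span, or already present in the replacement storage, are left untouched, so the function can be run again for the same time span without duplicating work.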