Revision and content

Revision definition

Revisions play an essential role in the basic document management services. Whenever a revision is created, retrieved, or returned by a query, SOAP messages are exchanged whose structure is based on the revision definition. A revision definition generally consists of a metadata element, which contains the attributes, and one or more content elements, which may refer to binary attachments.
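The structure described above can be sketched as a small XML tree. This is only an illustration: the element and attribute names (revision, metadata, attribute, content, attachmentRef) and the helper build_revision_element are hypothetical and are not taken from the actual ImageMaster SOAP schema.

```python
import xml.etree.ElementTree as ET


def build_revision_element(attributes, attachment_refs):
    """Build a hypothetical revision element.

    All element and attribute names are assumptions made for this sketch;
    the real SOAP schema may differ.
    """
    if not attachment_refs:
        # The text describes "one or more" content elements.
        raise ValueError("a revision needs at least one content element")

    revision = ET.Element("revision")

    # One metadata element that carries all attributes of the revision.
    metadata = ET.SubElement(revision, "metadata")
    for name, value in attributes.items():
        attribute = ET.SubElement(metadata, "attribute", name=name)
        attribute.text = str(value)

    # One content element per binary attachment reference.
    for ref in attachment_refs:
        ET.SubElement(revision, "content", attachmentRef=ref)

    return revision


revision = build_revision_element(
    {"title": "Invoice 4711", "author": "jdoe"},
    ["cid:attachment-1"],
)
xml_text = ET.tostring(revision, encoding="unicode")
```

The sketch keeps the attributes and the content references in separate child elements, mirroring the split between metadata and content that the revision definition describes.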

Content and revision separation

In a simple scenario, a revision is linked with exactly one content object. This is usually the case for electronic documents that are completely represented by one binary file. For example, a word processing file may consist of several pages (as rendered by the word processing application), all of which are contained in one “doc” file. In practice it is neither appropriate nor practical to split such an electronic document into several files (or content units in ImageMaster), so the doc file represents one business record associated with one specific revision.

The idea behind the separation of revision and content becomes clear when considering file formats that have no inherent notion of a “page”. A typical example is a scanned paper document, where several pages of the original document result in a collection of several image files [1]. All these files together represent the original document. Each file is mapped to one content object, and the complete original document is represented by the revision, which ties the individual files (content objects) together.
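The mapping for a scanned document can be sketched as follows. The class names ContentObject and Revision, the helper revision_from_scan, and the file names are hypothetical and chosen for illustration only; they do not reflect the product's actual API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentObject:
    """One content object per image file (hypothetical model)."""
    file_name: str


@dataclass
class Revision:
    """The revision ties the content objects of one document together."""
    title: str
    contents: List[ContentObject] = field(default_factory=list)


def revision_from_scan(title, page_files):
    # Map each scanned image file to its own content object,
    # preserving the page order of the original paper document.
    return Revision(title, [ContentObject(name) for name in page_files])


scan = revision_from_scan(
    "Signed contract",
    ["page-001.tif", "page-002.tif", "page-003.tif"],
)
page_count = len(scan.contents)
```

Here the three image files stay separate content objects, while the single Revision instance stands for the complete original document.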

The document model is flexible enough to manage arbitrary content combinations. Technically, a revision may be associated with several content objects of different file types. In certain cases it may be appropriate to define a content mapping which, for example, ties a word processing file together with scanned images and spreadsheet files in one revision. The idea of a “dossier” may thus be realized by one document type. Note that such modeling has implications for queries and access control, which have to be considered in the design phase of the customer-specific project: content parts that are tied to one revision will usually be subject to the same versioning and the same document access restrictions. If individual content parts need separate access control, the scenario may instead require enhanced role and permission modeling with corresponding queries. Content objects can be addressed by their unique content key.
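A minimal sketch of a dossier-style revision, assuming a single set of permitted roles per revision and a lookup by content key. The names DossierRevision, ContentObject, allowed_roles, and the key format are all hypothetical; the actual permission model and key format are not specified here.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class ContentObject:
    content_key: str  # assumed to be unique, as stated in the text
    file_name: str


@dataclass
class DossierRevision:
    """One revision holding content objects of different file types."""
    allowed_roles: Set[str]
    contents: Dict[str, ContentObject] = field(default_factory=dict)

    def add(self, content):
        if content.content_key in self.contents:
            raise ValueError("duplicate content key: " + content.content_key)
        self.contents[content.content_key] = content

    def get(self, content_key, role):
        # The access check is made once at revision level, so every
        # content part shares the same restriction.
        if role not in self.allowed_roles:
            raise PermissionError("role may not read this revision: " + role)
        return self.contents[content_key]


dossier = DossierRevision(allowed_roles={"clerk"})
dossier.add(ContentObject("c-1", "cover-letter.doc"))
dossier.add(ContentObject("c-2", "scan-001.tif"))
dossier.add(ContentObject("c-3", "costs.xls"))

spreadsheet = dossier.get("c-3", "clerk")
```

Because the role check sits on the revision rather than on each content object, restricting only the spreadsheet would not be possible in this sketch; that is the design-phase trade-off the paragraph above points to.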