Recommendation:
The delivery formats should allow:
Explanation: (Hidden) annotations, such as an OCR-ed version of the text, should be possible and should be included to make the documents searchable. Internet links should be possible so that references can point to reviews and/or to their location in the DML, and so that other references, internal and external, can be resolved.
Explanation: ``Multi-page'' is required to allow convenient access to a whole document, i.e., an article or a book (see the section ``Download Units'' below). ``Faithful'' means: the delivery data should be a lossless compression of the raw scanned data.
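The ``faithful'' requirement can be stated as a mechanical check: decoding the delivery data must give back the raw scanned data exactly. The following is a minimal Python sketch of that check. It assumes zlib as a stand-in for whatever lossless codec a real delivery format uses, and the function name is_faithful is hypothetical. For image formats the comparison would be made on decoded pixel data rather than on a raw byte stream.

```python
import zlib


def is_faithful(raw: bytes, delivered: bytes) -> bool:
    """Return True if `delivered` decompresses to exactly `raw`.

    zlib is only an illustrative lossless codec (an assumption of this
    sketch); a real delivery format would use its own decoder here.
    """
    try:
        return zlib.decompress(delivered) == raw
    except zlib.error:
        # Data that cannot be decoded at all is certainly not faithful.
        return False


# Stand-in for raw scanner output.
raw_scan = bytes(range(256)) * 4
delivery = zlib.compress(raw_scan)

print(is_faithful(raw_scan, delivery))
```

A lossy encoding would fail this check, because at least one byte of the decoded data would differ from the raw scan.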
Explanation: It should be possible to look, for example, at the reference page at the end of a 300-page document without downloading all of the preceding 299 pages.
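One common way to fetch a single page is to keep an index of page byte offsets and request only the bytes of the wanted page with an HTTP Range header. The following Python sketch only builds such a header. The offset table page_offsets and the function name page_range_header are hypothetical, and how a concrete format stores its page index is not specified here.

```python
def page_range_header(page_offsets, page):
    """Build an HTTP Range header covering one page (pages are 1-based).

    `page_offsets` is a hypothetical index with one entry per page start
    plus a final entry giving the total file size, so a document with
    n pages has n + 1 entries.
    """
    if not 1 <= page < len(page_offsets):
        raise ValueError(f"page {page} is out of range")
    start = page_offsets[page - 1]
    end = page_offsets[page] - 1  # HTTP byte ranges are inclusive
    return {"Range": f"bytes={start}-{end}"}


# Three pages starting at bytes 0, 100 and 250 in a 400-byte file.
offsets = [0, 100, 250, 400]
print(page_range_header(offsets, 3))  # {'Range': 'bytes=250-399'}
```

With such an index, a viewer that wants the last page sends one small request for that byte range instead of downloading the whole file, provided the server supports range requests.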
Explanation: This makes the use of the documents convenient and easy. E.g., zooming allows the recognition of small indices and other tiny details.
The delivered files should contain
Further Explanations, Examples:
When choosing a file format, it is important to check that knowledge and implementations of the format are sufficiently stable among digitization and software vendors, or that a sufficiently large free software community supports it. Proprietary formats that are supported only by fragile start-ups, and whose conversion or management depends on very specific or lossy software, are to be avoided.
Two file formats currently conform to the requirements above: DjVu [6] and PDF [15].