MOAC California museums working with libraries and archives to increase and enhance access to cultural collections
MOAC REPORT 2003: Table of Contents
  Introduction, Robin Chandler
  Project Manager's Report, Richard Rinehart
  Standards and Best Practices, Guenter Waibel
  Partner Reports
    Bancroft Library, James Eason
    Grunwald Center for Graphic Arts, Layna White
    Hearst Museum of Anthropology, Josh Meehan
    UCR/California Museum of Photography, Steve Thomas

MOAC Standards and Specifications
Guenter Waibel, Digital Media Developer, UC Berkeley Art Museum & Pacific Film Archive

Museums and the Online Archive of California (MOAC) builds on existing standards and their implementation guidelines provided by the Online Archive of California (OAC) and its parent organization, the California Digital Library (CDL). Setting project standards for MOAC consisted of interpreting existing OAC/CDL documents and adapting them to the project's specific needs, while at the same time maintaining compliance with OAC/CDL guidelines. The present overview of the MOAC technical standards references both the OAC/CDL umbrella document and the MOAC implementation / adaptation document at the beginning of each section, as well as related resources which provide more detail on project specifications.

The project implements specifications for digital image production, as well as three interlocking file exchange formats for delivering collections, digital images and their respective metadata. Encoded Archival Description (EAD) XML describes the hierarchy of a collection down to the item-level and traditionally serves for discovering both the collection and the individual items within it. For viewing multiple images associated with a single object record, MOAC utilizes Making of America 2 (MOA2) XML. MOA2 makes the images representing an item available to the viewer through a navigable table of contents; the display mimics the behavior of the analog item by e.g. allowing end-users to browse through the pages of an artist's book. Through the further extension of MOA2 with Text Encoding Initiative (TEI) Lite XML, not only does every single page of the book display in its correct order, but a transcription of its textual content also accompanies the digital images.

Both technical standards and their implementation by the OAC have undergone a number of changes since the outset of the MOAC project in 1999. For example, MOAC started out as an EAD SGML project, yet migrated to XML as soon as the OAC allowed submissions in the emerging format. As a welcome side effect, MOAC has replaced the use of character entities for special characters with UTF-8 (Unicode Transformation Format) encoding, a standard recommended for use with XML. Unicode assigns a unique number to each of the characters covered in its specifications (95,221 as of Unicode 3.2), and a UTF-8 encoded XML document stores those numbers directly instead of relying on named entities. Users will see the correct character displayed as long as their font contains the corresponding glyph.
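The difference between a character entity and a UTF-8 encoded character can be illustrated with a short Python sketch. It is not part of the MOAC toolchain; the sample string is an assumption chosen purely for illustration.

```python
import html

# Sample text using an SGML/HTML-style character entity (illustrative only).
entity_text = "Pen&iacute;nsula"

# Resolve the named entity to the actual Unicode character.
unicode_text = html.unescape(entity_text)

# Unicode assigns the accented character a unique number (code point) ...
code_point = ord(unicode_text[3])

# ... and UTF-8 serializes that number as one or more bytes.
utf8_bytes = unicode_text.encode("utf-8")

print(unicode_text)
print(f"U+{code_point:04X}")
print(utf8_bytes)
```

In the UTF-8 version the document carries the character itself, so no entity declaration is needed for it to be parsed.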

An even more profound paradigm shift in terms of metadata encoding, submission and display comes in the form of the new digital object standard METS (Metadata Encoding and Transmission Standard). Although the CDL has not yet officially adopted METS, information on the most recent developments and how the new format may shape MOAC in the future can be found in the digital objects section of this document. These two instances of fairly significant changes in the project's specifications may serve as a gentle reminder that despite its solid foundation in standards, the MOAC information architecture will continue to face the challenge of an ever-changing technical environment.

2. Digital Image Production

OAC/CDL Standards:

Digital Image Format Standards (July 9, 2001) [PDF]

Best Practices for Image Capture (February 2001) [PDF]

CDL Digital Object Standard: Metadata, Content and Encoding (May 18, 2001) [PDF]


Related Resources:

Daniel L. Johnston. A Simplified Standard Method of Digital Image Tonal Capture for Archival Projects. PICS 2002: IS&T's PICS Conference. Portland, Oregon, 2002. p. 210-213

The MOAC project standard for digital imaging consists of two components: a baseline standard for digital still image master files and derivatives, and a standard for digital imaging metadata. The standard for digital still image masters and derivatives fully complies with the Digital Image Format Standards.

2.1 Digital Still Images and Derivatives



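Master
  Resolution / dimensions: > 600 ppi or 3000 pixels / longest dimension (whichever is possible / yields the bigger file size) OR 20 MB target
  File Format: TIFF (uncompressed)

Submaster
  Resolution / dimensions: < 3000 pixels / longest dimension; color bar cropped out
  File Format: TIFF (uncompressed)

Access
  Resolution / dimensions: > 150 pixels / longest dimension (recommended 640, 800 or 1024)
  File Format: JPEG (compressed)
  Color Target: Optional (recommended No)

Thumbnail
  Resolution / dimensions: = 150 pixels / longest dimension
  File Format: JPEG (compressed)
  Color Target: Optional (recommended No)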

All digital imaging takes place at Gamma 2.2; master files are in RGB color space (access / thumbnails may be in sRGB) at a bit depth of 24bit for color or 8 bit for b/w source materials.

The combined guideline of resolution / pixel dimension / file size for master files ensures that captures from fairly large, fairly small or oddly proportioned sources will yield an adequate amount of data. Some examples might help clarify the need for the differentiated approach: A fairly small object captured at 600 ppi yields a file which will be unsatisfactory for most uses. If the original object is the size of a slide (1.5 x 1 inch), a 600 ppi capture standard results in a file of 900 x 600 pixels or 1.5 MB of data. The same object captured at 3000 pixels per longest dimension will yield a file of 3000 x 2000 pixels or 17 MB of data. Obviously, the second capture comes much closer to fulfilling the promise of "one file, many uses" any master file has to live up to, and it comes close enough to the file size guideline of 20 MB.

On the other hand, the capture of a fairly large object illustrates a situation in which a ppi-based standard alone would demand a capture exceeding hardware capability. If our original material is the size of a handscroll (e.g. 12 x 60 inches), the 600 ppi standard would insist that we capture a file of 7200 x 36000 pixels, or 740 MB in data. However, the same object captured at our second specification of 3000 pixels per longest dimension will only yield a file of 600 x 3000 pixels or 5 MB. In this case, because of the odd proportions of the object, the appropriate target for capture would be our 3rd guideline of approximating 20 MB per image file, yielding about 100 ppi, or about 1200 x 6000 pixels in our example. The 20 MB figure derives from the file size of a 4x5 transparency scanned at 600ppi.

For an uncompressed image at 24 bit color, the following formulas may be used to determine file size:

File Size (KB) = (height in inches x width in inches x ppi x ppi x bit depth) / 8192

or

File Size (KB) = (pixel height x pixel width x bit depth) / 8192

Further divide the result by 1024 for MB.
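The file size formulas can be checked against the slide and handscroll examples with a few lines of Python. This is a minimal sketch; the function names are chosen here for illustration and are not part of any MOAC or OAC/CDL tool.

```python
import math

def file_size_kb(width_px, height_px, bit_depth=24):
    """Uncompressed size in KB: pixel width x pixel height x bit depth / 8192."""
    return width_px * height_px * bit_depth / 8192

def file_size_mb(width_px, height_px, bit_depth=24):
    """Uncompressed size in MB (KB divided by 1024)."""
    return file_size_kb(width_px, height_px, bit_depth) / 1024

def ppi_for_target_mb(width_in, height_in, target_mb=20, bit_depth=24):
    """Capture resolution at which the uncompressed file approximates target_mb."""
    target_bits = target_mb * 1024 * 8192
    return math.sqrt(target_bits / (width_in * height_in * bit_depth))

# Slide (1.5 x 1 inch): 600 ppi versus 3000 pixels on the longest dimension.
slide_600ppi_mb = file_size_mb(900, 600)
slide_3000px_mb = file_size_mb(3000, 2000)

# Handscroll (12 x 60 inches): 600 ppi versus the 20 MB target.
scroll_600ppi_mb = file_size_mb(7200, 36000)
scroll_target_ppi = ppi_for_target_mb(12, 60)

print(round(slide_600ppi_mb, 1), round(slide_3000px_mb, 1))
print(round(scroll_600ppi_mb), round(scroll_target_ppi))
```

The last value comes out at roughly 100 ppi, which matches the handscroll guidance of about 1200 x 6000 pixels.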

For institutions whose imaging significantly exceeds the master file specifications, we also recommend saving a submaster file. The submaster has no color bar, and it has a smaller pixel dimension than the master file (between 2000 and 3000 pixels / longest dimension). This file contains only the salient visual content to be communicated, and it is smaller and more portable than the master file. It may be used (1) to create new access / thumbnail files through batch processing, (2) as the back-end file for a dynamic imaging server or (3) as a source file for most mid-end print uses.

To ensure color accuracy, the project recommends color adjustments with the help of color bars (such as a Kodak Q-13). The color bar provides a target of known values which allows for color correction either immediately after the capture, or at any later date. For more details on how to use color bars in archiving digital image files, please consult Daniel L. Johnston's article ("A Simplified Standard Method of Digital Image Tonal Capture for Archival Projects"). In keeping with OAC/CDL recommendations, most MOAC project partners chose this method for safeguarding color over automatic color management software.

The RGB color values on a Kodak Q-13 at Gamma 2.2 should be the following:


[Table: target RGB values per Kodak Q-13 patch, starting at patch 0 (A); values not recoverable]
Choosing the A, M and B points as reference points for adjusting scanner settings and spot checking individual scans usually yields satisfactory results. You may adjust your scanner by using a tonal curve (or other color manipulation tools in your scanning software) on the scan of a Kodak Q-13 target to approximate the RGB values in the tables. With those settings applied to subsequent scans, the colors should be within acceptable range of the target values, and may be spot-checked after each scan, e.g. with the Color Sampler Tool in Photoshop. We recommend that the color bar be cropped out of access and thumbnail files to maximize screen real estate for content.

Since many institutions already had pre-existing digital images which may not have been captured in compliance with the project specifications, MOAC decided to allow those legacy resources to be submitted. These legacy image files will be re-captured by partner institutions according to project specifications at a later date.

For institutions without a file server, the OAC has offered to host digital image files by special arrangement; otherwise MOAC endorses a de-centralized model in which each institution hosts its own images locally.

2.2 Digital Imaging Metadata

The following table summarizes the minimal technical metadata MOAC project partners agreed to capture. All of the fields may be extracted from TIFF or JPEG file headers using programs with metadata harvesting capability such as iView MediaPro or Canto Cumulus.

Elements required by MOAC are highlighted in grey.

Field: Example

File Type
File Dimensions: 1024 x 1028 pixels
Sub-object Format: image/tiff, image/jpeg etc.
(Lossless) Compression Format
Bit Depth: 1, 8, 24, etc.
Color Space
Resolution: 600 dpi; 400 dpi interpolated to 600 dpi
File Date
File Locator
The technical metadata captured lives as part of a digital object submitted to the OAC. For more details on the fields and a discussion of encoding the technical metadata accompanying each digital file please see section 3.2, Digital Objects, of this document. Note that in the CDL Digital Object Standard, the fields listed above are dispersed among the Content File Inventory, Structural Metadata and Administrative Metadata - Technical.

3. File Exchange Formats

Related Resources:

Günter Waibel. Granular Collections Access. An Information Architecture Informed by Standards. Proceeds from Electronic Imaging & The Visual Arts (EVA), 2001 [HTML]

3.1 Collection Guides: Encoded Archival Description (EAD)

OAC/CDL Standards:

OAC Best Practices Guidelines Version 1.0: Encoding New Finding Aids Using Encoded Archival Description (August 23, 2001) [PDF]

MOAC Implementation / Adaptation

OAC Best Practices Guidelines for Museum Collections Version 1.0: Encoding New Collection Guides Using Encoded Archival Description (May 31, 2002) [PDF]

Related Resources:

Record Export for Art and Cultural Heritage (REACH) [HTML]

Categories for the Description of Works of Art (CDWA) [HTML]

At the heart of MOAC's strategy for integrating museum collections with library and archival holdings sits Encoded Archival Description (EAD), the common information submission package of the OAC. The standard has its origins in the archival community, and the OAC Best Practice Guidelines (BPG) in effect at the outset of the MOAC project had to be adapted in order to accommodate museum data. As a key outcome of the MOAC project, the consortium established a complementary mBPG (Museum Best Practice Guideline) for encoding museum collections.

The most noticeable change between the two guidelines occurred at the item level, where MOAC expands the archival BPG to accommodate the project's agreed-upon descriptive metadata set. Changes at the higher levels of description remain fairly minimal; e.g. the museums reserve the right to repeat certain types of information (such as artists' names and credit lines) at lower levels of the EAD hierarchy. MOAC partners are legally bound to display this type of information whenever a digital image of an item displays. Archival practice, by contrast, requires that information appear at its highest appropriate level in the EAD and not be repeated.

Another instance of divergent implementation of the EAD: unlike archives, museums put less emphasis on the provenance of a particular collection; similar to museum exhibitions, museum collections organize around a theme or an artist rather than the common origin of materials. The museum BPG produces a valid EAD XML file, and the interoperability between files produced with either BPG remains uncompromised. However, in its details museum mark-up may be at odds with some practices of archival description as established by international standards such as ISAD(G).

MOAC uses EAD XML exclusively as a file exchange format for purposes of data integration among cultural heritage communities, without any implications of providing strict archival description. As a semantic differentiation between archival and museum EAD, MOAC now refers to its EAD XML as "Collection Guides" rather than "Finding Aids."

The museum implementation of the EAD makes extensive use of the opportunities provided by the standard for aggregating and contextualizing objects. At the highest level of description, museum EAD collection guides typically include a biography of an artist or the history of a movement as well as an overview detailing the scope of a collection. They break collections into series, which in turn may contain essays about the particular group of objects gathered under the category in question. At the item level, contributors have re-used label copy from exhibitions to provide additional context for an object.

To encode museum data about objects in the EAD, the REACH (Record Export for Art and Cultural Heritage) Element Set was mapped to its appropriate EAD tags. The REACH Element Set emerged out of a project initiated by the Getty Information Institute and Research Libraries Group. REACH created a set of 20 elements that are detailed enough to describe an object for researchers, yet simple and generic enough that these fields would be found in most institutions' collection management systems. REACH was informed by the more complex Categories for the Description of Works of Art (CDWA) and the Museum Educational Site Licensing Project (MESL) element sets.

The table below is based on Chapter G of the mBPG for Museum Collections and maps the 14 REACH elements deemed relevant by MOAC to their respective EAD encoding tags.

The following codes are used to indicate whether the element is required or not:


R    This EAD tag is required at this level in all collection guides.

M    This EAD tag is mandatory when the information is available at this level in a collection guide.

REACH Element

EAD Guideline / Mark-up



If a collection is not subdivided into series and has no file-level descriptions, describe each item-level component using nested Components beginning with <c01>.  If a collection does include series and/or file-level descriptions, begin item-level description with the next available <c0x> component.

  • Set the level attribute on all item-level components used to "item".
  • Use an id attribute unique within the OAC environment on all item-level components. This id will be used to link Digital Objects back to the exact location of their description in the collection guide. Until the OAC/CDL makes a programmatic decision on Permanent Object Identifiers (POIs), the use of an institutional acronym and an Accession Number is recommended.
  • Example: <c02 id="bampfa_1996.49.1" level="item">
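The recommended id convention (institutional acronym plus accession number) can be sketched in Python as follows. This is an illustrative assumption about how a contributor might generate item-level components; the function name and the underscore separator are taken from the example above rather than from any OAC tool.

```python
import xml.etree.ElementTree as ET

def item_component(depth, acronym, accession_number):
    """Build an item-level <c0x> element whose id joins acronym and accession number."""
    # EAD numbered components run from <c01> to <c12>.
    if not 1 <= depth <= 12:
        raise ValueError("component depth must be between 1 and 12")
    tag = f"c{depth:02d}"
    attributes = {"id": f"{acronym}_{accession_number}", "level": "item"}
    return ET.Element(tag, attributes)

component = item_component(2, "bampfa", "1996.49.1")
print(component.tag, component.attrib)
```

Deriving the id from data the institution already maintains keeps it stable across regenerated collection guides, which matters because Digital Objects link back to it.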


Use one <did> as the first subelement within all <c0x> tags.


Electronic Location & Access

If digital surrogates for the item described exist, provide access to those files through a <daogrp> and nested <daoloc> elements referencing either a single access file, or a separate METS / MOA2 document. Do not use <dao>, even if you only reference a single file. Please note that all external resources referred to in an EAD document have to be declared as entity references (<!ENTITY>) after the document type declaration (<!DOCTYPE>). The name of the entity in the Declaration Subset Area (see Chapter B) has to correspond to the entityref attribute. Using a combination of entityref and href provides two paths for the final display system to access the referenced resource (in this case an image).

  • For a simple reference from an inline thumbnail to an access file, follow Example 1. For a reference from an inline thumbnail to a METS object, follow Example 2. For an example of an entity reference, see Example 3.
  • Example 1: <daogrp><daoloc entityref="bampfa_1994.31_1_2" href="images/bampfa_1994.31_1_2.jpg" role="hi-res"></daoloc><daoloc entityref="bampfa_1994.31_1_3" href="" role="thumbnail"></daoloc></daogrp>
  • Example 2: <daogrp><daoloc entityref="bampfa_CC.212_1_2" href="" role="hi-res"></daoloc><daoloc entityref="bampfa_CC.212_1_3" href="" role="thumbnail"></daoloc></daogrp>
  • Example 3: <!ENTITY bampfa_CC.212_1_3 SYSTEM " images/bampfa_CC.212_1_3.jpg" NDATA jpeg>
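Because each entityref has to match an entity declared after the document type declaration, it can help to generate both from the same list. The Python sketch below shows one way to do so; the identifiers and file paths are hypothetical, and the helper functions are illustrative rather than part of any OAC/CDL software.

```python
import xml.etree.ElementTree as ET

def entity_declaration(name, system_id, notation="jpeg"):
    """Return an unparsed-entity declaration for the declaration subset."""
    return f'<!ENTITY {name} SYSTEM "{system_id}" NDATA {notation}>'

def build_daogrp(locations):
    """Turn (entityref, href, role) tuples into a <daogrp> of <daoloc> elements."""
    daogrp = ET.Element("daogrp")
    for entityref, href, role in locations:
        ET.SubElement(daogrp, "daoloc",
                      {"entityref": entityref, "href": href, "role": role})
    return daogrp

def undeclared_entityrefs(daogrp, declared_names):
    """List entityref values that have no matching entity declaration."""
    return [loc.get("entityref") for loc in daogrp.iter("daoloc")
            if loc.get("entityref") not in declared_names]

# Hypothetical identifiers and paths, for illustration only.
files = [
    ("example_0001_2", "images/example_0001_2.jpg", "hi-res"),
    ("example_0001_3", "images/example_0001_3.jpg", "thumbnail"),
]
declarations = [entity_declaration(name, href) for name, href, _ in files]
daogrp = build_daogrp(files)
missing = undeclared_entityrefs(daogrp, {name for name, _, _ in files})

print("\n".join(declarations))
print(ET.tostring(daogrp, encoding="unicode"))
print(missing)
```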

Creator / Maker

Use one <origination> for the person or group responsible for the design or creation of the object. If an individual artist is unknown, this field should contain a designation by school and period or the name of the culture group responsible for the creation of the work. The name should represent the attribution currently accepted by the holding institution.

  • Birth and death dates and country of origin, if known, should go in this field. The ULAN should be used here in conjunction with the normal attribute.
  • Example: <origination><persname role="Creator" normal="Van Gogh, Vincent" source="ULAN"> Vincent Van Gogh, </persname> 1853-1890, Holland </origination>

Object Name / Title

Use one <unittitle> for the name or title given to the object by the creator/maker, curator, or owner. Descriptive titles or names based on classification terms or object type may be included here, and must also be included in the 'Type of Object' field (below).

  • Example: <unittitle>Portrait of DeVille</unittitle>

Date of Creation / Date Range

Use one <unitdate> with the type attribute set to the appropriate value.

  • To include a bulk date, insert the element <unitdate type="bulk"> immediately after the first <unitdate>.
  • Example: <unittitle label="Title">Sarah Smith papers, <unitdate type="inclusive">1930-1975</unitdate> <unitdate type="bulk">(bulk 1955-1970)</unitdate></unittitle>
  • If normalizing <unitdate>s using the normal attribute, use ISO 8601 : 1988(E) for determining the format for the attribute value. The following format is recommended: YYYY-MM-DD or YYYY-MM (using 4 digits for years and hyphens between elements). For example, December 12, 1904 would be normalized as "1904-12-12" and April 1962 as "1962-04". Use a forward slash (/) to separate dates in a range, and use the fullest form of the date at each end of the range. For example, normalize August 8-24, 1986 as 1986-08-08/1986-08-24 and November 1923-March 1924 as 1923-11/1924-03.
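The normalization rules for the normal attribute can be expressed as a small Python sketch. The function names are assumptions made for illustration; only the output formats come from the guideline.

```python
def iso_date(year, month=None, day=None):
    """Format a date as YYYY, YYYY-MM or YYYY-MM-DD, depending on what is known."""
    parts = [f"{year:04d}"]
    if month is not None:
        parts.append(f"{month:02d}")
        if day is not None:
            parts.append(f"{day:02d}")
    return "-".join(parts)

def iso_range(start, end):
    """Join two dates with a forward slash, each in its fullest form."""
    return f"{iso_date(*start)}/{iso_date(*end)}"

print(iso_date(1904, 12, 12))                 # December 12, 1904
print(iso_date(1962, 4))                      # April 1962
print(iso_range((1986, 8, 8), (1986, 8, 24))) # August 8-24, 1986
print(iso_range((1923, 11), (1924, 3)))       # November 1923-March 1924
```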



Use one <physdesc> following the <unittitle> in the item-level <c0x>.


Type of Object

Within <physdesc>, use one <genreform> to classify the object by type. For material culture collections, the classification could be the object name (for example, "chair", "canoe", etc.); fine art institutions should use this field to specify object genre or format (for example "painting", "engravings", etc.). Indicate the use of controlled vocabulary by setting the source attribute to the appropriate value.

  • Example: <genreform source="aat">engravings</genreform>

Style / Period Group / Movement / School

Within <physdesc>, use one <physfacet> with the type attribute set to "cdwa styles-description" to identify a style or period in the history of art. Preferably, these terms are encoded in AAT with the source attribute set correspondingly to "aat".

  • Time periods such as "19th century" should be included in the Date element. Only periods or movements with proper names should be included as <physfacet>.
  • Example: <physfacet encodinganalog="cdwa styles-description" source="aat">Surrealist</physfacet>

Medium / Materials

Within <physdesc>, use another <physfacet> with the type attribute set to "cdwa materials-description" to identify the substance(s) of which the object is made or the technique or process used in its creation. Use of controlled vocabulary, with the source attribute set to the appropriate value, is encouraged.

  • Example: <physfacet encodinganalog="cdwa materials-description" source="aat">oil paintings</physfacet>

Place of Origin / Discovery

Within <physdesc>, use one or more <geogname> with the role attribute set to the appropriate value to indicate the geographic location in which an object was created, or the location at which the object was found. Use the attribute "cdwa creation-place" and "cdwa context-excavationplace" respectively. Use of controlled vocabulary such as TGN is encouraged, with the source attribute set appropriately.

  • This tag may be especially useful for archaeological artifacts, while its use for art objects with known creator information is limited.
  • Example: <geogname encodinganalog="cdwa creation-place" source="tgn"> Ibérica, Península </geogname>


Within <physdesc>, use one <dimensions> for the measurements  of an object. Width, Height, Depth and Circumference should be indicated as w, h, d, c before the actual measurement, as well as the measurement unit (inches or cm).

  • Example: <dimensions>w22 x h37 x d17 cm</dimensions>
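The dimension convention (w, h, d, c prefixes followed by the measurement unit) lends itself to automatic generation from collection management data. The following Python sketch is one hypothetical way to produce the string; the function name and validation rules are assumptions, not project requirements.

```python
ORDER = ("w", "h", "d", "c")  # width, height, depth, circumference

def format_dimensions(unit, **measurements):
    """Return a <dimensions> element string such as 'w22 x h37 x d17 cm'."""
    if unit not in ("cm", "inches"):
        raise ValueError("unit must be 'cm' or 'inches'")
    unknown = set(measurements) - set(ORDER)
    if unknown:
        raise ValueError(f"unknown dimension keys: {sorted(unknown)}")
    # Emit the measurements in a fixed order, each prefixed by its letter.
    parts = [f"{key}{measurements[key]}" for key in ORDER if key in measurements]
    return f"<dimensions>{' x '.join(parts)} {unit}</dimensions>"

print(format_dimensions("cm", w=22, h=37, d=17))
```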

Current Repository

Use one <repository> for the full name of the current repository of the object; include place if appropriate. The repository name has to be formed according to AACR2 (Chapter 24) catalog entry form.

  • Example: <repository>Berkeley Art Museum/Pacific Film Archive</repository>

Current Object ID

Use one <unitid> for the inventory number assigned to the object by the current repository.

  • Preferred use: This field is for the object's accession number or ID number or current inventory number or any unique identifying number as assigned by the current repository.  Inventory numbers or other identifiers that may have been assigned to the object by former owners should be reported in the Notes field.
  • Example: <unitid>1958.5.59</unitid>


Use one <custodhist>, nested in <admininfo>, for the name of the previous owner of an object.

  • Preferred Use: Enter the name of a person, institution, or organization which formerly owned the object (as for example in a credit line), or include current owner's name, especially if different from current repository.
  • Example: <admininfo><custodhist>Gift of Mr. and Mrs. J. Farnsworth </custodhist></admininfo>


Use one or more <odd> elements for a textual description of an object; the object history (associated people, organizations, places, and events in the object's history); distinguishing features; inscriptions/marks; condition; edition/state. Use <odd> for any descriptive text, remarks and comments documenting the object or commenting on it from an interpretive, educational or curatorial perspective. The text has to be specific to the item described in the respective <c0x>; group or collection level descriptions should be placed at the appropriate series or subseries level, and not be repeated in each object's <odd>. Notes may be included in the EAD markup (tagged using the <odd> element), or may reside in external text files linked to from the EAD collection guide (useful for very large texts).

  • Use one <head> with an appropriate heading for the type of content encoded. Multiple <odd> entries are distinguished by their different heads.
  • Types of notes may include, but are not restricted to, description/content, inscriptions/marks, state/edition, transcription, history, condition, reference/bibliography, language, exhibition history.
  • Example: <odd><head>Description</head><p>This work epitomizes the ephemeral and spare quality of Cornell's late shadow-boxes and has been linked in influence to minimalism as much as to other art in this particular medium such as the work of Bruce Conner. This work is loosely based on a story of a 19th century Russian ballerina.....</p></odd><odd><head>Condition</head><p>This work has loose nail joints and fragile pigmentation...</p></odd>



Use one or more <subject> within <controlaccess> for encoding the content or subject matter of the object. Use one <subject> for each term. Indicate the use of AAT or other controlled vocabulary (strongly encouraged) by setting the source attribute correspondingly.

  • Example: <subject source="lcsh">California, Northern; </subject><subject source="lctgm">Events</subject><subject source="lctgm">Fires</subject><subject source="lctgm">Rivers</subject><subject source="lcsh">Sacramento (Calif.)</subject><subject source="lctgm">Settlements</subject>


3.2 Digital Objects

3.2.1 Making of America 2 (MOA2)

OAC/CDL Standards:

CDL Digital Object Standard: Metadata, Content and Encoding (May 18, 2001) [PDF]

Related Resources:

The Making of America II Testbed Project White Paper Version 2.0 (September 15, 1998) [PDF]

The digital object standard Making of America 2 (MOA2) adopted by MOAC serves three broad purposes. First, it allows project members to transfer digital objects to the central repository of the OAC. Second, it provides the means to display all the image files for one record as navigable clusters. In this way, the digital object standard extends the collection guide standard by adding sophisticated display for items represented by more than one image surrogate. Third, the digital object standard ensures the longevity of the digital files created in the project by outlining specifications for minimal technical metadata. In this function, the digital object standard augments the production specifications for digital images.

The UC Berkeley libraries developed MOA2 as an XML document type definition (DTD) during a project of the same name, and the CDL subsequently adopted the mark-up as its digital object standard. MOAC reviewed the standard and, with a few exceptions, implemented the minimal set required by the CDL. The following tables and texts have been adapted from Appendix C: Metadata for Digital Objects of the CDL Digital Object Standard. For more details on the encoding of the fields in MOA2 XML, please refer to the original CDL document.


The columns of the tables are:

1. Feature: a descriptive name of the metadata element

2. Example: examples of this element's content

3. Description/Comments: a definition of this metadata element

Elements required by MOAC are highlighted in grey.

(1) Metadata for the entirety of the Digital Object





Unique identifier reference

urn:ucb:I0182A, 10.1000/I0182A,

This element uniquely identifies a particular digital object


Patrick Breen diary : ms., 1846 Nov. 20‑1847 Mar. 1.

Name or title for object, not necessarily unique, for display to user.


diary, ledger, photoalbum, stereograph, etc.

Class of work of which this digital object is an instance. Analogous to a MARC 655 field.

Descriptive Metadata Reference

An identifier or location for descriptive metadata regarding this object.

Descriptive Metadata Type


The form of descriptive metadata associated with this object.


(2) Content File Inventory

The content file inventory of a digital object contains a listing of all of the files containing digital content derived from the primary source. The metadata elements within the content file inventory contain the most basic information needed to identify, retrieve, and display the content files.






A digital library object may encapsulate several different electronic expressions of the original work which has been digitized in different formats. A version within a digital library object consists of all files necessary to process and display a particular expression to a user (e.g., an SGML transcription + DTD +DSSSL style sheet). Files within a single, root <FileGrp> element constitute a digitized version of the object.

File ID

<File ID="I0182A">

A unique identifier, internal to the object, for referencing this particular File from the Structural Map.

File Type

text/xml, text/XML, image/tiff, etc.

Used to inform client software regarding the file's data format, and hence what general viewer type will be needed.

File Sequence

23rd of 42 page images

Relative position of a particular file within its encapsulating subset of files.

File Date


The date the file was created expressed as ISO Date Format YYYY-MM-DD

Administrative Metadata Reference

<File ADMID="A125 A137">

This attribute carries information necessary to locate all administrative metadata relevant to this file. In the digital library object, this consists of an IDREF attribute referring to a particular tagged section within the Administrative Metadata portion of the digital library document.

File Use


Used to describe generic instances of an image.

File Dimensions

1024 x 1028 pixels

Dimension information such as the resolution offered by the object (i.e., not the captured resolution) may be provided. This element documents the forms of the image object that can be requested from the repository (i.e., in order to assist an intermediary in navigation, manipulation, etc.). For images of all types (i.e., bitonal and continuous tone), this is resolution and pixel dimensions. The element is not applicable for text.

File Locator


A unique identifier or locator which may be used by client software to retrieve the file in question.


(3) Structural Metadata Table

Structural metadata records the structure of the work from which the digital object is derived. The hierarchy created by structural metadata is crucial for display and navigation of a digital object.




Structural Type

Logical, physical, etc.

Structural type is used to indicate whether the internal structure of the object is best described as a logical structure (e.g., this is a diary consisting of entries) or a physical structure (e.g., this is a book consisting of pages).

Structural Divisions/Sub-object Relationships

Parent div of diary, with two child divs of type entry, which are siblings:


<div TYPE='diary'>

   <div TYPE='entry'></div>

   <div TYPE='entry'></div>

</div>

A digital object may be logically divided into parts (e.g., letters in a diary). If resources are made available to support some level of encoding, structural divisions are encoded with the TEI element DIV.  Many of the attributes of the Digital Object will be applicable to the Structural Divisions.

DIVs provide information on sub-object relationships. A diary entry in a diary section (e.g., a year) would have as its parent the section, and would have as siblings the previous and next diary entries. If, for example, it was an unusually long diary entry with sections of its own, its "children" would be the sections within the entry.

Sub-object Type

Table of contents, entry, illustration, etc.

Similar to the genre for an object, sub-object type specifies a class of material of which this sub-object is a particular instance, such as entries in a diary, pages in a photoalbum, etc.

Sub-object sequence

1 - N

Pages require a sequence indicator (e.g., this is the third page in the sequence of pages contained in this book).

Sub-object Label

Page 3, "Fit the Second - The Bellman's Speech"

Name or title for the sub-object, not necessarily unique, for display to user.

Sub-object Format

text/xml, text/XML, image/tiff, etc.

Images of all types (e.g., page images and continuous tone images) require format information. The contents of the Sub-object Format element are coordinated with the Content Type element (see above). While Content Type declares the available formats for a particular "type" of information (e.g., encoded text), the Sub-object Format element refers to these declarations to inform the intermediary of the available formats for the object at hand. For example, a page image may be said to be available as a GIF image, a PDF file, and a TIFF G4 image.

Sub-object reference

<fptr FILEID="I0182A">

This attribute carries information needed to locate the sub-object. In the digital library object, this consists of an IDREF attribute referring to a particular file within the File Inventory section, possibly combined with a reference to a tagged item within a file.
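As an illustration of how a sub-object reference resolves against the file inventory, the sketch below looks up each fptr by its FILEID. It is a simplified example rather than the MOA2 DTD itself: the wrapper element `Obj`, the file names and the function name `files_for_divisions` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical object: a file inventory plus a structural map.
# The wrapper element "Obj" and the file names are placeholders.
SAMPLE = """
<Obj>
  <FileGrp>
    <File ID="I0182A" MIMETYPE="image/tiff"><FLocat LOCTYPE="URL">page3.tif</FLocat></File>
    <File ID="I0182B" MIMETYPE="image/jpeg"><FLocat LOCTYPE="URL">page3.jpg</FLocat></File>
  </FileGrp>
  <StructMap TYPE="physical">
    <div N="3" LABEL="Page 3">
      <fptr FILEID="I0182A"/>
      <fptr FILEID="I0182B"/>
    </div>
  </StructMap>
</Obj>
"""

def files_for_divisions(root):
    """Resolve every fptr to the (MIME type, locator) of the file it points at."""
    inventory = {f.get("ID"): f for f in root.iter("File")}
    resolved = {}
    for div in root.iter("div"):
        entries = []
        for fptr in div.findall("fptr"):
            # A KeyError here signals a reference to a file missing from the inventory.
            file_el = inventory[fptr.get("FILEID")]
            entries.append((file_el.get("MIMETYPE"), file_el.findtext("FLocat")))
        resolved[div.get("LABEL")] = entries
    return resolved

print(files_for_divisions(ET.fromstring(SAMPLE)))
```

Keeping the inventory separate from the structural map means one file can be referenced from several divisions without repeating its technical details.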


(4) Administrative Metadata Table - General

Administrative metadata encompasses all information necessary for objects' long-term use and management. It includes information on the technical features of content files, intellectual property rights information, and source and provenance information.




Administrative Metadata ID

<AdminMD ID="AM183">

A unique identifier, internal to a digital library object, which allows this metadata to be referenced by other portions of the object.


(5) Administrative Metadata Table - Technical

Technical administrative metadata elements include information necessary to document the technical processes employed in digitizing primary source material.





(Lossless) Compression Format


Type of algorithm needed to decompress the image, with note of software package used to apply the format, and degree/percentage of compression used where options exist.

Bit Depth

1, 8, 24, etc.

Color depth; often needed by the viewer, and it serves as an indication of quality to the user.

Color Space


Color space used, often needed by viewer and indicates whether image was initially created for onscreen display or for pre-press output. (Some color space parameters such as white point may require individual tags).


600 dpi; 400 dpi interpolated to 600 dpi

The settings on the input scanning device (cameras usually measure these in dimensions, other devices in dpi). Note where device does its own interpolation.


(6) Administrative Metadata Table - Rights

Rights administrative metadata elements include information regarding the intellectual property rights relevant to the digital object's storage, transmission and use. Please note that all rights fields in this table relate to digital files, not to the physical object the files represent.






Owner(s) of the copyright on the digital image file, which MAY be the creator of the digital image file, or the person(s) from whom the digital image file was purchased or licensed. It should contain the name(s) of the person(s) from whom copy/distribution and display/transmission rights may be secured. Note: this refers to the copyright on the digital image only, not the work(s) represented in the digital image.

Credit Line

Copyright Berkeley Art Museum, 1978. All rights reserved.

The text required to be displayed whenever the image/data appears.

Copying & Distribution Restrictions

Copy and distribution of this file is prohibited without the express written consent of...

Text that spells out any copyright restrictions pertaining to the copy and distribution of this image file.

Display & Transmission Restrictions

This file may be displayed or transmitted across a network only by person(s) who have signed a license agreement with ...

Text that spells out any copyright restrictions regarding the transmission and display of this image file.


(7) Administrative Metadata Table - Source

Source administrative metadata elements are intended to record all information necessary to determine the origin of the current file.





Source Item ID

a local catalog number plus page number for a book; an accession number (and possibly a page or part number) for a special collections item

A number or alphanumeric string uniquely identifying the source of this file (recursively).

Source Type

Photographic print, slide, manuscript, printed page(s), another digital image

Identifies the material from which the digital file was created: the item on hand, even if it is itself a reformatted version. For example, the scan of a 35mm slide of a painting would be entered here as a 35mm slide.

Physical Dimensions of Source

10.2cm x 18.4cm

Actual physical dimension of source. Needed for appropriate facsimile output.
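To make the link between source dimensions, scanning resolution and facsimile output concrete, here is a small worked sketch. It uses the example values from the tables above (10.2cm x 18.4cm, 600 dpi); the function names are hypothetical, and rounding to whole pixels is a choice made for the example.

```python
CM_PER_INCH = 2.54

def pixels_for_facsimile(width_cm, height_cm, dpi):
    """Pixel dimensions needed to reproduce the source at original size at the given dpi."""
    return (
        round(width_cm / CM_PER_INCH * dpi),
        round(height_cm / CM_PER_INCH * dpi),
    )

def print_size_cm(width_px, height_px, dpi):
    """Physical size, in centimeters, of an image output at the given dpi."""
    return (
        width_px / dpi * CM_PER_INCH,
        height_px / dpi * CM_PER_INCH,
    )

# A 10.2cm x 18.4cm source captured at 600 dpi:
print(pixels_for_facsimile(10.2, 18.4, 600))  # (2409, 4346)
```

Without the recorded physical dimensions, the pixel dimensions alone cannot say at what size the file should be output, which is why the element is captured alongside the resolution.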

3.2.2. Metadata Encoding and Transmission Standard (METS)

Related Resources:

METS Official Website [HTML]

VRA Core 3.0 Website [HTML]

MIX Official Website [HTML]

Over the last year, a Digital Library Federation funded initiative has revised and expanded the MOA2 DTD into an XML schema called the Metadata Encoding and Transmission Standard (METS). The CDL has not yet officially adopted METS as its digital object standard, but the OAC started developing infrastructure around the new standard for their website re-launch OAC 2.0 on July 15th 2002. MOAC assumes that it will continue to submit MOA2 objects to the OAC until an official policy for METS has been adopted.

In the future, METS will offer the opportunity to re-visit the existing digital object standard and the fields recommended for capture. METS makes extensive use of the W3C XML Namespace specification, which allows it to coordinate a number of standards specializing in different aspects of digital object description and management.
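The role of namespaces can be sketched as follows: METS elements live in the METS namespace, while the wrapped descriptive elements belong to a different vocabulary and are told apart by their own namespace. In this example the descriptive namespace URI and its `title` element are hypothetical placeholders, not a real standard, and the output is not validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DESC_NS = "http://example.org/descriptive"  # hypothetical placeholder vocabulary

ET.register_namespace("mets", METS_NS)
ET.register_namespace("desc", DESC_NS)

def build_object(object_id, title):
    """Wrap one descriptive element from a foreign namespace inside a METS dmdSec."""
    mets = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": object_id})
    dmd = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "OTHER"})
    data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    ET.SubElement(data, f"{{{DESC_NS}}}title").text = title
    return mets

doc = build_object("bampfa_1992.4.41", "Re Dis Appearing")
xml_text = ET.tostring(doc, encoding="unicode")

# A consumer can select the descriptive elements by namespace alone.
titles = [el.text for el in ET.fromstring(xml_text).iter(f"{{{DESC_NS}}}title")]
print(titles)
```

Because each vocabulary keeps its own namespace, a descriptive or technical schema can be swapped without changing the METS wrapper around it.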

METS may reference a descriptive metadata standard, and unlike MOAC's current MOA2 implementation, positions digital objects as searchable entities in their own right. A METS object repository provides more direct access to item-level data, and may become the primary access to MOAC data. In this scenario, EAD Collection Guides may cede some of their importance for initial discovery to METS. After having discovered digital images of an object through a search across the METS repository, an end-user may discover related materials by backing out into the greater context of the EAD Collection Guide. For a comparison of the traditional access model and the new functionality provided by METS, please consult Tables 1 and 2. A prime candidate for MOAC adoption as a METS descriptive metadata extension schema would be VRA Core 3.0 XML Schema (currently under development).

Table 1: Traditional Discovery and Navigation through the EAD, with multimedia granularity added through MOA2 / METS

Some of the functionality gained in the shift from MOA2 to METS has already been implemented in the new OAC website. The OAC converted existing MOA2 documents into METS and enriched them with descriptive metadata gleaned from EAD Collection Guides. Furthermore, it generated METS objects from scratch for digital images described solely in Collection Guides. As a result, end-users may now search directly across all the digital surrogates in the OAC through a new image search. Once the OAC/CDL has set an official policy for METS, contributing institutions should expect to generate Digital Objects for any digital surrogate they submit so that it can become part of the image search. Currently MOAC only submits MOA2 XML for complex objects, i.e. objects which are represented by more than one image.

Table 2: Discovery through METS, context through EAD

METS may also reference a number of administrative metadata standards. A prime candidate for adoption as a new Technical Metadata standard would be NISO Metadata for Images in XML Schema (MIX). Whether MOAC can capture technical metadata at this level of sophistication largely depends on the availability of supporting software. Once capture software for camera backs and scanners writes a comprehensive set of metadata to the image file headers or a correlated external text file, this information may be encoded using the MIX standard.

3.3 Transcriptions: Text Encoding Initiative Lite (TEI Lite)


MOAC has started to experiment with TEI (Text Encoding Initiative) Lite XML to deliver transcripts of unique artifacts such as artists' books, mail art and film scripts. TEI began as a research effort of three scholarly societies (the Association for Computers and the Humanities, the Association for Computational Linguistics, and the Association for Literary and Linguistic Computing), and has since turned into an international consortium maintaining the standard.

The current TEI Lite implementation extends the MOA2 object with a transcript which may be displayed side by side with the digital surrogate. In this way, both the physical layout and the intellectual content of a particular page become accessible simultaneously. Extending MOA2 with transcriptions has proved especially helpful for artists' books whose text remains illegible in images produced to the access file specifications.

Please consult Günter Waibel's article "Granular Collections Access. An Information Architecture Informed by Standards" for a more detailed exposé on TEI Lite, as well as the interplay of EAD and MOA2.
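The side-by-side display described in this section depends on knowing which part of the transcript belongs to which page image. A minimal sketch of that pairing, assuming the page-break elements (pb) mark where each page starts, as they do in Appendix C, is given below; the function name `paragraphs_by_page` and the abbreviated sample are illustrative, and the reading that a page's number matches the TAGID used in Appendix B is an assumption rather than documented behavior.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample in the shape of Appendix C; the paragraph content is shortened.
SAMPLE = """
<div1 type="Documentation">
  <pb n="1" id="p1"/>
  <p>Audio</p><p>begin voice #1</p>
  <pb n="2" id="p2"/>
  <p>Audio</p><p>audio translation:</p>
  <pb n="3" id="p3"/>
</div1>
"""

def paragraphs_by_page(div):
    """Group paragraphs under the page break that precedes them."""
    pages = {}
    current = None
    for child in div:
        if child.tag == "pb":
            current = child.get("n")
            pages[current] = []
        elif child.tag == "p" and current is not None:
            pages[current].append(child.text or "")
    return pages

print(paragraphs_by_page(ET.fromstring(SAMPLE)))
```

A page with no paragraphs, such as the floor plan on page 3, still receives an entry, so a viewer can show the image with an empty transcript pane.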

4. Tools for data encoding and image file management

Related Resources:

Günter Waibel. Produce, Publish and Preserve. A Holistic Approach to Digital Assets Management. Spectra: Digital Imaging Special Issue. Laguna Beach, California. Fall 2000; p. 38-44. [PDF]

Margo Dunlap, Johanna Glenny, Günter Waibel. The DAMD Manual [RTF]

As a project conceived around the idea of exchanging cultural heritage data through XML, MOAC had to address the issue of how to mark up vast quantities of data in the specified XML file exchange formats. The UC Berkeley Bancroft Library supplied a Microsoft Access database (GenDB) from which EAD and MOA2 XML could be extracted; however, GenDB turned out to be an uneasy fit for some project members: it only runs on a PC platform and primarily focuses on original cataloging rather than data re-purposing. Since the bulk of the data to be exchanged already lived in the member institutions' collections management systems, BAM/PFA embarked on creating a database tool which could import data from pre-existing systems, enrich it with further information, and then export the result to the file exchange formats according to MOAC project specifications. The resulting database was also meant to keep track of all the digital images of each member institution. At the end of the development, project members had a choice of using either the BAM/PFA Filemaker database or the Access database.

The Digital Assets Management Database (or DAMD) developed at BAM/PFA exports to EAD XML, MOA2 XML and TEI Lite XML. It consists of seven relational FilemakerPro databases, which store information about the institution using the database, collections and their hierarchies, items and their hierarchies, and the associated image surrogates and transcriptions. DAMD provides the glue between sets of pre-existing information. In production, all descriptive metadata is imported into the database from the collections management system; technical metadata is imported from the file headers of the image files.

Data input for collections mostly centers around assigning objects to their home in a collection hierarchy, and entering contextual information such as collection overviews or artists' biographies. If an object has more than one image representation, the MOA2 structure has to be specified; if applicable, textual content needs to be transcribed into the database. While the whole operation may seem overwhelmingly complicated on paper, it actually only takes a few minutes to set up a complex object and plug it into the collection's hierarchy. The reward comes in the form of push-button export to the EAD, MOA2 and TEI Lite: a series of scripts harvest the data and write it to external text files containing the complete, well-formed mark-up for each MOAC file exchange format.
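The export step can be pictured with a short sketch. This is not the DAMD implementation, which consists of FilemakerPro scripts; it is a Python illustration under stated assumptions. The record layout, the file ID scheme and the function name `export_object` are hypothetical, the element names are borrowed from Appendix B, and the output is a reduced skeleton rather than a valid MOA2 document.

```python
import xml.etree.ElementTree as ET

def export_object(record):
    """Serialize one database record into a well-formed, MOA2-like skeleton."""
    root = ET.Element("ArchObj", {"OBJID": record["id"], "LABEL": record["label"]})
    group = ET.SubElement(root, "FileGrp")
    struct = ET.SubElement(root, "StructMap", {"TYPE": "logical"})
    top = ET.SubElement(struct, "div", {"N": "1", "LABEL": record["label"]})
    for seq, page in enumerate(record["pages"], start=1):
        file_id = f"{record['id']}_{seq}"  # hypothetical ID scheme
        ET.SubElement(
            group, "File",
            {"ID": file_id, "MIMETYPE": page["mimetype"], "SEQ": str(seq)},
        )
        div = ET.SubElement(top, "div", {"N": "2", "LABEL": page["label"]})
        ET.SubElement(div, "fptr", {"FILEID": file_id})
    return ET.tostring(root, encoding="unicode")

# Hypothetical record, as it might be assembled from imported catalog data.
RECORD = {
    "id": "obj1",
    "label": "Sample object",
    "pages": [
        {"label": "page 1", "mimetype": "image/jpeg"},
        {"label": "page 2", "mimetype": "image/jpeg"},
    ],
}

xml_text = export_object(RECORD)
ET.fromstring(xml_text)  # parsing succeeds only if the mark-up is well-formed
print(xml_text)
```

Generating the file inventory and the structural map in the same loop keeps every fptr pointing at a file that actually exists in the export.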

Appendix A: Example of an EAD Item-Level Record

This collection guide excerpt represents an item-level description for the script of a video by conceptual artist Theresa Hak Kyung Cha.

<c02 id="bampfa_1992.4.41" level="item">



<daoloc entityref="bampfa_1992.4.41_1_2" href="" role="hi-res"></daoloc>

<daoloc entityref="bampfa_1992.4.41_1_3" href="" role="thumbnail"></daoloc>


<origination><persname>Cha, Theresa Hak Kyung</persname></origination>

<unittitle>Re Dis Appearing</unittitle>




<physfacet type="Materials-Description">Typewritten text on paper, 2 pages. Pencil on paper, 1 page.</physfacet>

<geogname>United States, born Korea</geogname>

<dimensions>h 11 x w 8 -1/2 inches</dimensions>


<repository>Berkeley Art Museum/Pacific Film Archive</repository>



<admininfo><custodhist><p>Gift of the Theresa Hak Kyung Cha Memorial Foundation</p></custodhist></admininfo>



<p>The typewritten text is a 2 page master script/ finished scenario for Cha's video "Re Dis Appearing." Handwritten additions by Cha are made in pencil. The scenario is typed on a Media Communication form, Laney College for a titled TV Production 31/A.</p><p>The pencil on paper is a floor plan for the video.</p><p>A finished or video scenario describes camera action, video switching, audio, narration and stage blocking. The floor plan blocks set positions, lights, etc. The plan accompanies the master script. "Re Dis Appearing" is a black and white video, 3 minutes by Cha. The video includes an aural sequence.</p><p>"Re Dis Appearing" was screened at Worth Ryder Gallery, Berkeley, Ca. 1977. The video was also included in Videotape by Women From Los Angeles Women's Video Centre. Australian Tour 1979-1980.</p>



Appendix B: MOA2 Object

This complete MOA2 XML document represents the digital object referenced by the EAD entry in Appendix A.

<?xml version = '1.0' standalone = 'no'?>


<ArchObj OBJID='bampfa_1992.4.41' LABEL='Cha, Theresa Hak Kyung - Re Dis Appearing'>





<FileGrp VERSDATE='6/29/2001' ADMID='ADM1'>

<File ID = 'bampfa_1992.4.41_1_1' MIMETYPE = 'image/tiff' SEQ = '1' X = '4800' Y = '7428' UNIT = 'PIXELS' CREATED = '10/31/2000' OWNERID = 'bampfa_1992.4.41_1_1.tif' ADMID = 'ADM1' GROUPID = '636' USE = 'ARCHIVE'>

<FLocat LOCTYPE = 'URL'>No Online Storage</FLocat>


<File ID = 'bampfa_1992.4.41_2_1' MIMETYPE = 'image/tiff' SEQ = '2' X = '4800' Y = '7428' UNIT = 'PIXELS' CREATED = '10/31/2000' OWNERID = 'bampfa_1992.4.41_2_1.tif' ADMID = 'ADM1' GROUPID = '637' USE = 'ARCHIVE'>

<FLocat LOCTYPE = 'URL'>No Online Storage</FLocat>


<File ID = 'bampfa_1992.4.41_3_1' MIMETYPE = 'image/tiff' SEQ = '3' X = '4800' Y = '7428' UNIT = 'PIXELS' CREATED = '10/31/2000' OWNERID = 'bampfa_1992.4.41_3_1.tif' ADMID = 'ADM1' GROUPID = '638' USE = 'ARCHIVE'>

<FLocat LOCTYPE = 'URL'>No Online Storage</FLocat>



<FileGrp VERSDATE='6/29/2001' ADMID='ADM2'>

<File ID = 'bampfa_1992.4.41_1_2' MIMETYPE = 'image/jpeg' SEQ = '1' X = '352' Y = '480' UNIT = 'PIXELS' CREATED = '10/31/2000' OWNERID = 'bampfa_1992.4.41_1_2.jpg' ADMID = 'ADM2' GROUPID = '636' USE = 'REFERENCE'>

<FLocat LOCTYPE = 'URL'></FLocat>


<File ID = 'bampfa_1992.4.41_2_2' MIMETYPE = 'image/jpeg' SEQ = '2' X = '351' Y = '480' UNIT = 'PIXELS' CREATED = '10/31/2000' OWNERID = 'bampfa_1992.4.41_2_2.jpg' ADMID = 'ADM2' GROUPID = '637' USE = 'REFERENCE'>

<FLocat LOCTYPE = 'URL'></FLocat>


<File ID = 'bampfa_1992.4.41_3_2' MIMETYPE = 'image/jpeg' SEQ = '3' X = '353' Y = '480' UNIT = 'PIXELS' CREATED = '10/31/2000' OWNERID = 'bampfa_1992.4.41_3_2.jpg' ADMID = 'ADM2' GROUPID = '638' USE = 'REFERENCE'>

<FLocat LOCTYPE = 'URL'></FLocat>



<FileGrp VERSDATE='6/29/2001' ADMID='ADM3'>

<File ID = 'bampfa_1992.4.41_1_3' MIMETYPE = 'image/jpeg' SEQ = '1' X = '110' Y = '150' UNIT = 'PIXELS' CREATED = '10/31/2000' OWNERID = 'bampfa_1992.4.41_1_3.jpg' ADMID = 'ADM3' GROUPID = '636' USE = 'THUMBNAIL'>

<FLocat LOCTYPE = 'URL'></FLocat>


<File ID = 'bampfa_1992.4.41_2_3' MIMETYPE = 'image/jpeg' SEQ = '2' X = '110' Y = '150' UNIT = 'PIXELS' CREATED = '10/31/2000' OWNERID = 'bampfa_1992.4.41_2_3.jpg' ADMID = 'ADM3' GROUPID = '637' USE = 'THUMBNAIL'>

<FLocat LOCTYPE = 'URL'></FLocat>


<File ID = 'bampfa_1992.4.41_3_3' MIMETYPE = 'image/jpeg' SEQ = '3' X = '110' Y = '150' UNIT = 'PIXELS' CREATED = '10/31/2000' OWNERID = 'bampfa_1992.4.41_3_3.jpg' ADMID = 'ADM3' GROUPID = '638' USE = 'THUMBNAIL'>

<FLocat LOCTYPE = 'URL'></FLocat>



<FileGrp VERSDATE = '5/16/2001' ADMID = 'ADMtr'>

<File ID='bampfa_1992.4.41' MIMETYPE = 'text/xml' SEQ='1' CREATED ='5/16/2001' OWNERID = 'bampfa_1992.4.41_tr.sgm'>

<FLocat LOCTYPE = 'URL'></FLocat>



<AdminMD ID = 'ADM1'>




<BitDepth BITS = '24' />





<Owner>The Regents of the University of California, Berkeley


<Credit>The Regents of the University of California, Berkeley




<AdminMD ID = 'ADM2'>




<BitDepth BITS = '24' />





<Owner>The Regents of the University of California, Berkeley


<Credit>The Regents of the University of California, Berkeley




<AdminMD ID = 'ADM3'>




<BitDepth BITS = '24' />





<Owner>The Regents of the University of California, Berkeley


<Credit>The Regents of the University of California, Berkeley




<AdminMD ID = 'ADMtr'>



<Transcriber>Guenter Waibel





<Owner>The Regents of the University of California, Berkeley


<Credit>The Regents of the University of California, Berkeley



<Source SOURCEID='bampfa - 1992.4.41'>



<Details>TEI Lite encoding by DAMD FilemakerPro




<StructMap TYPE = 'logical'>

<div N = '1' LABEL = 'Cha, Theresa Hak Kyung - Re Dis Appearing'>

<div N = '2' LABEL = 'page 1'>

<fptr FILEID = 'bampfa_1992.4.41_1_1' MIMETYPE = 'image/tiff' />

<fptr FILEID = 'bampfa_1992.4.41_1_2' MIMETYPE = 'image/jpeg' />

<fptr FILEID = 'bampfa_1992.4.41_1_3' MIMETYPE = 'image/jpeg' />

<fptr FILEID = 'bampfa_1992.4.41' MIMETYPE = 'text/xml' TAGID = '1' />


<div N = '2' LABEL = 'page 2'>

<fptr FILEID = 'bampfa_1992.4.41_2_1' MIMETYPE = 'image/tiff' />

<fptr FILEID = 'bampfa_1992.4.41_2_2' MIMETYPE = 'image/jpeg' />

<fptr FILEID = 'bampfa_1992.4.41_2_3' MIMETYPE = 'image/jpeg' />

<fptr FILEID = 'bampfa_1992.4.41' MIMETYPE = 'text/xml' TAGID = '2' />


<div N = '2' LABEL = 'floorplan'>

<fptr FILEID = 'bampfa_1992.4.41_3_1' MIMETYPE = 'image/tiff' />

<fptr FILEID = 'bampfa_1992.4.41_3_2' MIMETYPE = 'image/jpeg' />

<fptr FILEID = 'bampfa_1992.4.41_3_3' MIMETYPE = 'image/jpeg' />

<fptr FILEID = 'bampfa_1992.4.41' MIMETYPE = 'text/xml' TAGID = '3' />




Appendix C: Transcription in TEI Lite XML

This complete TEI Lite XML document contains the transcription for the manuscript as referenced by the MOA2 XML in Appendix B.

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE TEI.2 PUBLIC "-//TEI//DTD TEI Lite//EN//1.0" "teilite.dtd">





<title>Cha, Theresa Hak Kyung - Re Dis Appearing</title>


<publicationStmt><p>Berkeley Art Museum/Pacific Film Archive</p></publicationStmt>

<sourceDesc><p>bampfa - Item Id 1992.4.41</p></sourceDesc>





<div1 type='Documentation'>

<pb n='1' id='p1'/>

<p>Audio</p><p>begin voice #1</p><p>où commencer. début.</p><p>fin.</p><p>begin voice #2</p><p>follow in english</p><p>UHM MAH...AH PAH...</p><p>touvé dans un jardin</p><p>le quel</p><p>un bol. un bol du thé.</p><p>thé vert. thé au someil.</p><p>reflété sur l'eau</p><p>un arbre chauve</p><p>des portraits fixes</p><p>des feuilles</p><p>déjà passé</p><p>célébration ici.</p><p>d'un poid du temps</p><p>gout amer</p><p>la dérnière tasse</p><p>déguisé</p><p>masqué</p><p>une camouflage</p><p>pas de jeu</p><p>si, du jeu</p><p>rite. avant l'acte de.</p><p>la voie de mot</p><p>d'une langue du soi</p>

<pb n='2' id='p2'/>

<p>Audio</p><p>et puis, le jardin.</p><p>du the</p><p></p><p>audio translation:</p><p>where to begin. begin.</p><p>end.</p><p>found in a garden</p><p>which</p><p>a bowl. a bowl of tea.</p><p>tea bowl.</p><p>tea green. tea of sleep.</p><p>reflected on water</p><p>the bald tree</p><p>fixed portraits</p><p>leaves</p><p>already past</p><p>passed</p><p>celebration here.</p><p>of a certain time's</p><p>weight</p><p>bitter taste</p><p>last cup</p><p>disguised</p><p>masked</p><p>a camouflage</p><p>not of game</p><p>yes, of game</p><p>rite. before the act of.</p><p>the way of word</p><p>the way of a tongue of</p><p>self</p><p>and then, the garden.</p><p>the tea.</p>

<pb n='3' id='p3'/>