Specifications for submissions of EAD-encoded finding aids and associated materials from museums to the Online Archive of California
Draft for the "Museums and the Online Archive of California" project, with notes on relations between standards recommended for MOAC and other standards used in the field
(Detailed Image recommendations will be added)
1. Markup of item-level records (EAD container lists)
These records will usually be exported from museum collection management databases, where the information is already structured, rather than marked up manually. These specifications can be used to mark up records automatically when exporting from databases, by running scripts on records exported as text, or for manual markup.
Below, each "field" of markup is addressed separately, including notes on the use of related structural and terminology standards, with an example of a completed record at the end. For future use in XML, case does matter, so all tags/elements and attributes should be kept lower case.
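As a rough illustration of the lower-case rule, the following Python sketch normalizes element and attribute names in exported markup. It is regex-based rather than a real SGML/XML parser, and the function name lowercase_markup is hypothetical; it assumes tags do not contain ">" inside attribute values.

```python
import re

# Matches an opening or closing tag: "<", optional "/", element name, rest of tag, ">".
# Declarations such as <!DOCTYPE ...> or <!ENTITY ...> are deliberately not matched.
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w.-]*)([^<>]*)>")
# Matches an attribute name followed by "=".
ATTR_RE = re.compile(r"([A-Za-z][\w.-]*)(\s*=)")


def lowercase_markup(text):
    """Lower-case element and attribute names; values and content are untouched."""

    def fix_tag(match):
        slash, name, rest = match.groups()
        # Split on quoted values so that only text outside quotes is changed.
        parts = re.split(r'("[^"]*"|\'[^\']*\')', rest)
        for i in range(0, len(parts), 2):
            parts[i] = ATTR_RE.sub(
                lambda m: m.group(1).lower() + m.group(2), parts[i]
            )
        return "<" + slash + name.lower() + "".join(parts) + ">"

    return TAG_RE.sub(fix_tag, text)


if __name__ == "__main__":
    print(lowercase_markup('<PERSNAME ROLE="Creator">Vincent van Gogh</PERSNAME>'))
```

Attribute values such as "Creator" are left as they are, since the rule applies only to tag and attribute names.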
Related Standards Used:
REACH Element Set
The number and definitions of the data fields to include in item-level records for this project were derived from the REACH Element Set. REACH (Record Export for Access to Cultural Heritage) was a project initiated by the Getty Information Institute and the Research Libraries Group, and included a number of museum partners, among them a MOAC participant, the Berkeley Art Museum. The REACH project created a set of elements detailed enough to describe an object, yet simple and generic enough that the fields would be found in most museums' collection information systems, even those with minimal cataloging. The order of the fields follows the museum community's convention for wall labels. REACH was designed for just such a use as this: to create a common element set that individual museums could export to a shared system. Thus REACH did not define a syntax (such as an SGML DTD or a MARC-like system) but is purely a semantic set of fields with common definitions, which can be used in different syntaxes and applications (such as here, in a union database of EAD documents). Most of the 20 REACH elements are carried over here verbatim; in only a couple of places were two REACH elements collapsed into one EAD markup field for simplicity and appropriateness. The REACH element set was designed partly as a distillation of the more complex, exhaustive CDWA and MESL element sets, and in relation to the Dublin Core element set.
CDWA (Categories for the Description of Works of Art)
CDWA, a standard developed by the Getty and the museum community, is a full semantic, descriptive standard for describing art objects. "The AITF Categories for the Description of Works of Art are guidelines for formulating the content of art databases. They articulate an intellectual structure for descriptions of objects and images: in this sense they constitute a schematic representation of the requirements and assumptions implicit in the practice of the discipline of art history. The AITF hopes that the Categories will become a standard to which existing art information systems can be mapped and upon which new systems can be developed. By providing a single, encompassing framework for descriptive information about works of art, the Categories are intended to enhance compatibility between diverse systems that wish to share art information. Such a standard will contribute to the integrity and longevity of information transmitted across networks and moved to new systems." (from CDWA website http://www.gii.getty.edu/cdwa/). Here the CDWA is mainly used to add detail and granularity to the markup of item-level information where the EAD DTD is generic and leaves it up to the implementor to define the details. Specifically, the EAD uses the <physdesc> and <physfacet> elements to contain information about the physical aspects of the object. These elements are generic and repeatable (to allow cross-domain use), but they include type and source attributes for indicating exactly what kind of physical description is being marked up, and what the authority source for that type is. Since museum objects have detailed physical information, type is used here to distinguish physfacet/media from physfacet/techniques. The type attribute was added to EAD version 1.0, along with CDWA as an official source, specifically to aid in the markup of museum and object collections.
So, while the REACH element set is used to provide the simplest common ground for the whole record, CDWA is used in the opposite way, to add detail to a specific element when needed. Other authorities used by participating museums, such as ICONCLASS, will be explicit in the markup when appropriate.
AAT (Art & Architecture Thesaurus)
ULAN (Union List of Artist Names)
TGN (Thesaurus of Geographic Names)
LCTGM (Library of Congress Thesaurus for Graphic Materials)
These terminology standards, all developed largely by the Getty, museum and library communities, are staples of the museum community, used in print and database collection systems to provide consistency of terms for access purposes. These authority sources will be used whenever possible inside the marked/fielded data elements. The EAD offers very consistent ways to include both unstructured terms and structured terms, either exclusively or together in a record, as described below.
These fields should be included in the order shown. Each element is named, then defined as it would be found in the museum collection system, and then the EAD markup is shown. All fields are repeatable. This set (based on REACH) is a target or template for markup. However, if a museum lacks data for one or more fields, those fields should simply be included empty. If a museum needs to add more than this number of fields, it should consult with the project team on how best to include them.
Start and End item-level record:
Start with <c0x level="item"> (that is, <c01>, <c02>, etc., depending on how many levels of hierarchy are represented in the finding aid). Whenever you reach the item level, however, the record should begin with <c0x> and contain the elements below.
End with </c0x>
1: Electronic Location & Access
The URL linking the object record to a digital image of the object, or the filename for that digital image. Repeatable for multiple, different images of the object. If you have thumbnails of images as well as larger versions, it is preferable to have the thumbnail be the image that displays on the page and links to the larger image. To enable this, always place the thumbnail image second in markup order.
<daoloc entityref="b1968.79-1a" role="hi-res"></daoloc>
<daoloc entityref="b1968.79-1t" role="thumbnail"></daoloc>
Note that image file names are referred to with entityrefs. Entityrefs should be declared for each image earlier in the EAD document (immediately after <!DOCTYPE.... and before the first <ead> tag) with this syntax:
<!ENTITY b1968.79-1a SYSTEM "1968.79.a.jpg" NDATA JPEG>
It is also permissible to include a second, href-based linking mechanism in the markup, like this:
<daoloc entityref="b1968.79-1a" href="http://www.bampfa.berkeley.edu/images/1968.79.a.jpg" role="hi-res"></daoloc>
<daoloc entityref="b1968.79-1t" href="http://www.bampfa.berkeley.edu/images/1968.79.a.jpg" role="thumbnail"></daoloc>
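When image records are exported from a database, the entity declarations and the matching <daoloc> elements can be produced together. The Python sketch below is one possible way to do this. The function names, the dictionary keys, and the thumbnail filename "1968.79.t.jpg" are hypothetical and are not part of the specification; the entity declaration uses the standard SGML form for an external unparsed entity.

```python
from xml.sax.saxutils import quoteattr


def entity_declaration(entity_name, filename, notation="JPEG"):
    """Return an SGML declaration for an external, unparsed image entity."""
    return "<!ENTITY %s SYSTEM %s NDATA %s>" % (
        entity_name, quoteattr(filename), notation)


def daoloc(entity_name, role, href=None):
    """Return a <daoloc> element; the href attribute is optional."""
    attrs = [("entityref", entity_name)]
    if href:
        attrs.append(("href", href))
    attrs.append(("role", role))
    rendered = " ".join(f"{key}={quoteattr(value)}" for key, value in attrs)
    return f"<daoloc {rendered}></daoloc>"


def image_markup(images):
    """Return (declarations, daolocs) with thumbnails placed last."""
    # Stable sort: non-thumbnails keep their order, thumbnails move to the end.
    ordered = sorted(images, key=lambda image: image["role"] == "thumbnail")
    declarations = [entity_declaration(i["entity"], i["filename"]) for i in ordered]
    daolocs = [daoloc(i["entity"], i["role"], i.get("href")) for i in ordered]
    return declarations, daolocs


if __name__ == "__main__":
    images = [
        # Hypothetical thumbnail filename, for illustration only.
        {"entity": "b1968.79-1t", "filename": "1968.79.t.jpg", "role": "thumbnail"},
        {"entity": "b1968.79-1a", "filename": "1968.79.a.jpg", "role": "hi-res"},
    ]
    declarations, daolocs = image_markup(images)
    print("\n".join(declarations))
    print("\n".join(daolocs))
```

Sorting thumbnails to the end reflects the rule that the thumbnail comes second in markup order, regardless of the order in which the database returns the images.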
2: Creator
The name of a person or corporate entity responsible for the design or creation of the object. Where an individual artist is unknown, this field should contain a designation by school and period, or the name of the culture group responsible for the creation of the work. The name should represent the attribution currently accepted by the holding institution. Birth and death dates and country of origin, if known, should go in this field, after the name. The ULAN may be used here in conjunction with the norm attribute.
<origination><persname> value[Vincent van Gogh, 1853-1890, Holland] </persname></origination>
<origination><persname role="Creator" norm="Van Gogh, Vincent" source="ULAN"> value[Vincent van Gogh, 1853-1890, Holland] </persname></origination>
3: Object Name/Title
The name or title given to the object by the creator/maker, curator, or owner, or the text of a caption that appears with the image as in prints, cartoons, and photographs.
The field for a title or name of the object. Descriptive titles or names based on classification terms or object type should be provided for objects that do not have formal titles.
<unittitle> value[... of DeVille] </unittitle>
4: Date of Creation/Date Range
The year in which the object was created; if the specific year is not known, or if the object was executed over several years, give a date range.
<unitdate> value </unitdate>
<unitdate> value[19th century] </unitdate>
5: Place of Origin/Discovery
The geographical location in which an object was created or, if that is not known, the place where the object was found. Especially useful for archaeological artifacts; less useful for art objects where creator information is known.
<geogname role="Creation-Place" source="CDWA"> value[Iberian Peninsula] </geogname>
6: Materials
The substance(s) of which the object is made.
<physfacet> value[oil on canvas] </physfacet>
CDWA may be used here for specificity.
<physfacet type="Materials-Description" source="CDWA"> value[oil on canvas] </physfacet>
7: Techniques
A term or phrase describing how the object was created.
<physfacet> value[lost wax method] </physfacet>
CDWA may be used here for specificity.
<physfacet type="Materials-Processes" source="CDWA"> value[lost wax method] </physfacet>
8: Dimensions
Measurements associated with any particular dimension of the object.
<dimensions> value[22 x 37 inches] </dimensions>
9: Current Repository Name
The full name of the current repository of the object; include place if appropriate.
<repository> value[Berkeley Art Museum] </repository>
10: Current Object ID Number
The inventory number currently assigned to the object by the current repository.
This field is for the object's accession number or ID number or current inventory number or any unique identifying number as assigned by the current repository. Inventory numbers or other identifiers that may have been assigned to the object by former owners should be reported in the Notes field.
11: Provenance
The name of a previous owner of the object.
Enter the name of a person, institution, or organization that formerly owned the object, or include the current owner's name, especially if different from the current repository.
<custodhist> Gift of Mr. and Mrs. J. Farnsworth </custodhist>
<custodhist> Extended Loan to the J. Wright Estate </custodhist>
12: Notes
Textual description of object; object history: associated people, organizations, places, and events in the object's history; distinguishing features; inscriptions/marks; condition; edition/state. Any descriptive text, remarks and comments documenting the object or commenting on it from an interpretive/curatorial perspective. This text should be limited to what is specific to this item; group or collection level descriptions should be placed above the group of objects being described, and not repeated in each object's <odd> (other descriptive data) field.
<odd><p> This work epitomizes the ephemeral and spare quality of Cornell's late shadow-boxes and has been linked in influence to minimalism as much as to other art in this particular medium such as the work of Bruce Conner. This work is loosely based on a story of a 19th century Russian ballerina..... </p></odd>
Multiple, separate notes should be distinguished by the addition of a <head> sub-element like this:
<odd><head>Description</head><p> This work epitomizes the ephemeral and spare quality of Cornell's late shadow-boxes and has been linked in influence to minimalism as much as to other art in this particular medium such as the work of Bruce Conner. This work is loosely based on a story of a 19th century Russian ballerina..... </p></odd>
<odd><head>Condition</head><p> This work has loose nail joints and fragile pigmentation...</p></odd>
Types of notes may include description/content, inscriptions/marks, state/edition, transcription, history, condition, reference/bibliography, language, and exhibition history.
13: Subject Matter
The content or subject matter of the object. AAT or other authority source for term is encouraged here. Note required "Subject:" in front of text.
<subject>Subject: value[madonna and child] </subject>
14: Type of Object
The classification of the object by type.
This field is for the term(s) that indicate the classification of the object. For material culture collections, this will tend to be the object name (for example, chair, canoe, etc.); fine art institutions should use this field to specify object genre or format (for example, painting, engraving, etc.). AAT or other authority source for term is encouraged here. Note required "Classification/Genre:" in front of text.
<genreform>Classification/Genre: value[engraving] </genreform>
<genreform norm="engravings">Classification/Genre: value[engraving] </genreform>
15: Style/Period
A term identifying a style or period in the history of art.
This field is for the term(s) identifying a style or period whose characteristics are represented by the object. These terms should preferably be in the AAT, except where the AAT is too centered on Western art. Note required "Style/Group:" in front of text.
<physfacet source="CDWA">Style/Group: value[Surrealism] </physfacet>
Example of one item-record marked up
Note some fields are not used, such as <geogname>, which is more appropriate for archaeological artifacts. Some extra elements are inserted, which are not semantically defined since they will not be coming from the museum collection system, but rather are needed only for the EAD markup (<did> and <physdesc> elements).
<daoloc entityref="b1968.79-1a" role="hi-res"></daoloc>
<daoloc entityref="b1968.79-1t" role="thumbnail"></daoloc>
<did>
<origination><persname>Vincent van Gogh, 1853-1890, Holland </persname></origination>
<unittitle>Portrait of Gauguin</unittitle>
<physdesc>
<physfacet type="Materials-Description" source="CDWA">oil on canvas </physfacet>
<physfacet type="Materials-Processes" source="CDWA">Intaglio with spatula </physfacet>
<dimensions>22 x 37 inches</dimensions>
</physdesc>
<repository>Berkeley Art Museum</repository>
</did>
<custodhist><p>Gift of Mr. and Mrs. J. Farnsworth </p></custodhist>
<odd><p>This work was the last painting Van Gogh painted during the short stay of fellow painter Paul Gauguin at their small house in Arles, France. It ... </p></odd>
<subject>Subject: Portrait of Artist Paul Gauguin</subject>
<corpname>Style/Group: Early Expressionist </corpname>
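For museums exporting from a collection management database, the field-by-field markup above could be generated by a short script. The Python sketch below is a minimal illustration, not a project tool: the dictionary keys, the function name item_record, and the depth parameter are hypothetical. It covers only fields whose markup is shown in full above, omits the structural <did> and <physdesc> wrappers, and assumes that a field with no data is emitted as an empty element, with any required label kept.

```python
from xml.sax.saxutils import escape

# (record key, markup template) pairs, in the order the fields are listed above.
# The keys are hypothetical names for columns in a museum's export.
FIELD_TEMPLATES = [
    ("creator", "<origination><persname>{}</persname></origination>"),
    ("title", "<unittitle>{}</unittitle>"),
    ("date", "<unitdate>{}</unitdate>"),
    ("place", '<geogname role="Creation-Place" source="CDWA">{}</geogname>'),
    ("materials", '<physfacet type="Materials-Description" source="CDWA">{}</physfacet>'),
    ("techniques", '<physfacet type="Materials-Processes" source="CDWA">{}</physfacet>'),
    ("dimensions", "<dimensions>{}</dimensions>"),
    ("repository", "<repository>{}</repository>"),
    ("provenance", "<custodhist>{}</custodhist>"),
    ("notes", "<odd><p>{}</p></odd>"),
    ("subject", "<subject>Subject: {}</subject>"),
    ("genre", "<genreform>Classification/Genre: {}</genreform>"),
]


def item_record(record, depth=1):
    """Build one item-level component; depth selects c01, c02, and so on."""
    tag = "c%02d" % depth
    lines = ['<%s level="item">' % tag]
    for key, template in FIELD_TEMPLATES:
        # Missing fields become empty elements; &, < and > are escaped.
        value = escape(record.get(key) or "")
        lines.append(template.format(value))
    lines.append("</%s>" % tag)
    return "\n".join(lines)


if __name__ == "__main__":
    print(item_record({
        "creator": "Vincent van Gogh, 1853-1890, Holland",
        "title": "Portrait of Gauguin",
        "materials": "oil on canvas",
        "dimensions": "22 x 37 inches",
        "repository": "Berkeley Art Museum",
    }))
```

Keeping the templates in a single ordered list makes the field order explicit and easy to check against the specification.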
The object of these specifications is to provide a common format for the images from all of the participating institutions, for viewing inside the finding aid alongside the item record. Very high resolution images for archiving and future use are not covered here, as they are not used in the project directly; it is up to each institution to create and store them itself (however, the TIFF file format is recommended for this purpose).
The common format covers only the following: file format, image viewing gamma, and file pixel dimensions. File naming conventions are covered in the Metadata section. If a museum does not have one size of image (for instance, the high-res), it should simply omit that image and the corresponding EAD markup, and only the two it does have will be displayed.
The specifications which follow are based on those used successfully in the California Heritage Project (UC Berkeley). In addition, the AMICO (Art Museum Image Consortium) Project is recommending a maximum resolution of 1024x768 pixels. The MESL (Museum Education Site Licensing) Project allowed each museum to use a different scheme, but its maximum-sized images typically varied in size from 1284x1876 to 1472x999 pixels and used the JPEG file format, as we recommend. The MESL project in particular conducted end-user surveys to determine the acceptable parameters for its images. We therefore feel that the recommended maximum image size falls within the range being used in museum community projects, which indicates both that such images are acceptable to end users and that museums are able to produce them. The file formats also follow ISO standards that are already used in the museum community, thus reducing duplicative work for museums and making maximum use of their images in multiple venues.
1. Viewing files should be processed to be viewed in a gamma 2.2 viewing environment.
2. Each viewing file should be provided in three different resolutions to meet different viewing needs: "High Resolution," for detailed study, "Medium Resolution," for full-screen viewing, and "Thumbnail," for embedding in finding aids.
3. Pixel dimensions (maximum) for each resolution class:
Thumbnail 192 x 192
(e.g., 128W x 166H)
Medium Resolution 768 x 768
High Resolution 1536 x 1536
4. File Formats for each resolution class:
Medium Resolution JPEG (JFIF)
High Resolution JPEG (JFIF)
Each file format should be tested against a variety of imaging programs and browsers to ensure ... formats should not be used.
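The maximum pixel dimensions in item 3 can be applied when resizing a source image. The Python sketch below assumes that each maximum is a bounding box, that the aspect ratio is preserved, and that images already smaller than the box are not enlarged. This reading is suggested by the "128W x 166H" thumbnail example but is not stated outright. The names MAX_BOX and fit_within are hypothetical.

```python
# Maximum (width, height) in pixels for each resolution class, from item 3.
MAX_BOX = {
    "thumbnail": (192, 192),
    "medium": (768, 768),
    "high": (1536, 1536),
}


def fit_within(width, height, resolution_class):
    """Scale (width, height) to fit the class's box, keeping the aspect ratio."""
    max_width, max_height = MAX_BOX[resolution_class]
    # Use the tighter of the two constraints; a cap of 1.0 prevents enlargement.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    for name in ("thumbnail", "medium", "high"):
        print(name, fit_within(3000, 2000, name))
```

For a 3000 x 2000 pixel source, the medium-resolution viewing file would be 768 x 512 pixels under these assumptions.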