Specifications for submissions
of EAD encoded finding aids and associated materials from
museums to the Online Archive
of California
Draft for the "Museums and the
Online Archive of California" project, with notes on
relations between standards recommended
for MOAC and other standards used in the field
(Detailed Image recommendations will be added)
Contents:
1. Markup for item-level
records
2. Images
1. Markup of item-level records (EAD container lists)
These records will usually be exported from museum collection management databases, where the information is already structured, rather than marked up manually. These specifications can be used to mark up records automatically when exporting from databases, to run scripts on records exported as text, or for manual markup.
Below, each "field"
of markup is addressed separately, including notes on the use of related
structural and terminology standards, with an example of a completed record
at the end. For future use in XML, case does matter, so all tags/elements
and attributes should be kept lower case.
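The lower-case rule can be checked automatically before submission. The following is a minimal sketch in Python, not part of any project tooling: the function name `find_non_lowercase` and both regular expressions are assumptions made for illustration, and the matching is a rough heuristic rather than a real SGML/XML parse.

```python
import re

# Heuristic patterns (assumed for this sketch, not a full parser):
# element names follow "<" or "</"; attribute names sit between whitespace and ="
TAG_RE = re.compile(r'</?([A-Za-z][A-Za-z0-9]*)')
ATTR_RE = re.compile(r'\s([A-Za-z][A-Za-z0-9-]*)="')

def find_non_lowercase(markup):
    """Return element and attribute names that contain upper-case letters."""
    names = TAG_RE.findall(markup) + ATTR_RE.findall(markup)
    return [name for name in names if name != name.lower()]
```

Run over a marked-up record, an empty result means no upper-case element or attribute names were found by this heuristic; declarations such as `<!DOCTYPE` and `<!ENTITY` are deliberately not matched.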
A. Markup of item-level records (EAD container lists)
Related Standards Used:
REACH Element Set
The number and definitions of the data fields to include in item-level records for this project were derived from the REACH Element Set. REACH (Record Export for Art and Cultural Heritage) was a project initiated by the Getty Information Institute and the Research Libraries Group, and included a number of museum partners, including a MOAC participant, the Berkeley Art Museum. The REACH project created a set of elements that were determined to be detailed enough to describe an object and yet simple and generic enough that these fields would be found in most museums' collection information systems, even with minimal cataloging. The order of the fields is determined by the museum community's convention for wall labels. REACH was designed for just such a use as this: to create a common element set that would come from individual museums to be used in a shared system. Thus REACH did not define a syntax (such as an SGML DTD or a MARC-like system) but is purely a semantic set of fields with common definitions, which can be used in different syntaxes and applications (such as here, in a union database of EAD documents). Most of the 20 REACH elements are carried over here verbatim; only in a couple of places were two REACH elements collapsed into one EAD markup field for simplicity and appropriateness. The REACH element set was designed partially as a distillation of the more complex, exhaustive CDWA and MESL element sets, and in relation to the Dublin Core element set.
CDWA (Categories for the Description of Works of Art)
CDWA, a Getty & museum community developed standard, is a full semantic, descriptive standard for describing art objects. "The AITF Categories for the Description of Works of Art are guidelines for formulating the content of art databases. They articulate an intellectual structure for descriptions of objects and images: in this sense they constitute a schematic representation of the requirements and assumptions implicit in the practice of the discipline of art history. The AITF hopes that the Categories will become a standard to which existing art information systems can be mapped and upon which new systems can be developed. By providing a single, encompassing framework for descriptive information about works of art, the Categories are intended to enhance compatibility between diverse systems that wish to share art information. Such a standard will contribute to the integrity and longevity of information transmitted across networks and moved to new systems." (from CDWA website http://www.gii.getty.edu/cdwa/). Here the CDWA is mainly used to add detail and granularity to the markup of item-level information where the EAD DTD is generic and leaves it up to the implementor to define the details. Specifically the EAD uses the <physdesc> and <physfacet> elements to contain information about the physical aspects of the object. These elements are generic and repeatable (to allow cross-domain use), but they include attributes for type and source if one wishes to indicate further detail about exactly what type of physical description is being marked up, and what the authority source for that type is. Since museum objects have detailed physical information, type is used here to define the difference between physfacet/media and physfacet/techniques. The type attribute was added to the EAD version 1.0 along with CDWA as an official source specifically to aid in the markup of museum and object collections.
So, while the REACH element set is used to provide the simplest common ground for the whole record, CDWA is used in the opposite way, to add detail to a specific element when needed. Other authorities used by participating museums, such as ICONCLASS, will be explicit in the markup when appropriate.
AAT (Art & Architecture Thesaurus)
ULAN (Union List of Artist Names)
TGN (Thesaurus of Geographic Names)
LCTGM (Library of Congress Thesaurus for Graphic Materials)
These terminology standards,
all developed largely by the Getty, museum and library communities, are
staples of the museum community - used in print and database collection
systems to provide consistency of terms for access purposes. These authority
sources will be used whenever possible inside the marked/fielded data elements.
The EAD offers very consistent ways to include both unstructured terms
and structured terms, either exclusively or together in a record, as described
below.
Elements/Fields:
These fields should be included in the order shown. Each element is named, then defined as it would be found in the museum collection system, and then the EAD markup is shown. All fields are repeatable. This set (based on REACH) is a target or template for markup. However, if a museum lacks data for one or more fields, those fields should simply be included empty. If a museum needs to add more fields than are listed here, it should consult with the project team on how best to include them.
Start and End item-level record:
Start with <c0x level="item"> (<c01>, <c02>, etc., depending on how many levels of hierarchy are represented in the finding aid). Whenever you get to the item level, however, the record should begin with <c0x> and contain the elements below.
End with </c0x>
1: Electronic Location & Access
Definition:
The URL linking the object record to a digital image of the object, or the filename for that digital image. Repeatable for multiple, different images of the object. If you have thumbnails of images as well as larger versions, it is preferable to have the thumbnail be the image that displays on the page and links to the larger image. To enable this, always place the thumbnail image second in markup order.
<daogrp>
<daoloc entityref="b1968.79-1a"
role="hi-res"></daoloc>
<daoloc entityref="b1968.79-1t"
role="thumbnail"></daoloc>
</daogrp>
Note that image file names are referred to with entityrefs. Entityrefs should be declared for each image earlier in the EAD document (immediately after <!DOCTYPE.... and before the first <ead> tag) with this syntax:
<!ENTITY b1968.79-1a SYSTEM "1968.79.a.jpg" NDATA JPEG>
It is also allowable to include a second, HREF type linking mechanism in the markup like this:
<daogrp>
<daoloc entityref="b1968.79-1a"
href="http://www.bampfa.berkeley.edu/images/1968.79.a.jpg" role="hi-res"></daoloc>
<daoloc entityref="b1968.79-1t"
href="http://www.bampfa.berkeley.edu/images/1968.79.a.jpg" role="thumbnail"></daoloc>
</daogrp>
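When records are exported by script, the entity declarations and the matching <daogrp> can be generated together so that the names always agree. The following Python sketch is illustrative only: the function names `entity_declaration` and `daogrp` are hypothetical, and the default notation name simply follows the JPEG example above.

```python
def entity_declaration(entity_name, filename, notation="JPEG"):
    """Format one unparsed-entity declaration for an image file."""
    return '<!ENTITY {} SYSTEM "{}" NDATA {}>'.format(entity_name, filename, notation)

def daogrp(hi_res_entity, thumbnail_entity):
    """Build a <daogrp> with the hi-res image first and the thumbnail second."""
    lines = [
        "<daogrp>",
        '<daoloc entityref="{}" role="hi-res"></daoloc>'.format(hi_res_entity),
        '<daoloc entityref="{}" role="thumbnail"></daoloc>'.format(thumbnail_entity),
        "</daogrp>",
    ]
    return "\n".join(lines)
```

For example, `entity_declaration("b1968.79-1a", "1968.79.a.jpg")` yields the declaration for the hi-res image, and `daogrp("b1968.79-1a", "b1968.79-1t")` yields the group that refers to it, with the thumbnail placed second as required.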
2: Creator/Maker
Definition:
The name of a person or corporate entity responsible for the design or creation of the object. Where an individual artist is unknown, this field should contain a designation by school and period, or the name of the culture group responsible for the creation of the work. The name should represent the attribution currently accepted by the holding institution. Birth and death dates and country of origin, if known, should go in this field, after the name. The ULAN may be used here in conjunction with the norm attribute.
<origination><persname> value[Vincent van Gogh, 1853-1890, Holland] </persname></origination>
or
<origination><persname role="Creator" norm="Van Gogh, Vincent" source="ULAN"> value[Vincent van Gogh, 1853-1890, Holland] </persname></origination>
3: Object Name/Title
Definition:
The name or title
given to the object by the creator/maker, curator, or owner, or the text
of a caption that appears with the image as in prints, cartoons, and photographs.
Preferred Use:
The field for a title
or name of the object. Descriptive titles or names based on classification
terms or object type should be
provided for objects
that do not have formal titles.
<unittitle> value[Portrait
of DeVille] </unittitle>
4: Date of Creation/Date Range
Definition:
The year in which the object was created; if the specific year is not known, or if the object was executed over several years, give a date range.
<unitdate> value[1897] </unitdate>
or
<unitdate norm="1800-1900">
value[19th century] </unitdate>
5: Place of Origin/Discovery
Definition:
The geographical location in which an object was created or, if that is not known, the place where the object was found. Especially useful for archaeological artifacts; less useful for art objects where creator information is known.
<geogname role="Creation-Place" source="CDWA"> value[Iberian Peninsula] </geogname>
or
<geogname role="Context-Excavationplace"
source="CDWA"> value[Iberian Peninsula] </geogname>
6: Medium/Materials
Definition:
The substance(s) of
which the object is made.
<physfacet> value[oil on canvas] </physfacet>
CDWA may be used here for specificity.
<physfacet type="Materials-Description"
source="CDWA"> value[oil on canvas] </physfacet>
7: Techniques/Process
Definition:
A term or phrase describing
how the object was created.
<physfacet> value[lost
wax method] </physfacet>
CDWA may be used here for specificity.
<physfacet type="Materials-Processes"
source="CDWA"> value[lost wax method] </physfacet>
8: Dimensions
Definition:
Measurements associated
with any particular dimension of the object.
<dimensions> value[22
x 37 inches] </dimensions>
9: Current Repository Name
Definition:
The full name of the
current repository of the object; include place if appropriate.
<repository> value[Berkeley
Art Museum] </repository>
10: Current Object ID Number
Definition:
The inventory number
currently assigned to the object by the current repository.
Preferred Use:
This field is for
the object's accession number or ID number or current inventory number
or any unique identifying number as
assigned by the current
repository. Inventory numbers or other identifiers that may have been assigned
to the object by former
owners should be reported
in the Notes field.
<unitid> 1958.5.59
</unitid>
11: Provenance
Type: Repeatable
Required: No
Definition:
The name of a previous
owner of the object.
Preferred Use:
Enter the name of
a person, institution, or organization that formerly owned the object or
include current owner's name, especially if different from current repository.
<custodhist> Gift of Mr. and Mrs. J. Farnsworth </custodhist>
or
<custodhist> On
Extended Loan to the J. Wright Estate </custodhist>
12: Notes
Definition:
Textual description
of object; object history: associated people, organizations, places, and
events in the object's history;
distinguishing features;
inscriptions/marks; condition; edition/state. Any descriptive text, remarks
and comments documenting the object or commenting on it from an interpretive/curatorial
perspective. For this purpose, this text should be only that which is specific
to this item; group or collection level descriptions should be placed above
the group of objects being described, and not repeated in each object's
<odd> (other descriptive data) field.
<odd><p> This
work epitomizes the ephemeral and spare quality of Cornell's late shadow-boxes
and has been linked in influence to minimalism as much as to other art
in this particular medium such as the work of Bruce Conner. This work is
loosely based on a story of a 19th century Russian ballerina..... </p></odd>
Multiple, separate notes should be distinguished by the addition of a <head> sub-element like this:
<odd><head>Description</head><p> This work epitomizes the ephemeral and spare quality of Cornell's late shadow-boxes and has been linked in influence to minimalism as much as to other art in this particular medium such as the work of Bruce Conner. This work is loosely based on a story of a 19th century Russian ballerina..... </p></odd>
<odd><head>Condition</head><p>
This work has loose nail joints and fragile pigmentation...</p></odd>
Types of notes may
include description/content, inscriptions/marks, state/edition, transcription,
history, condition, reference/bibliography, language, exhibition history,
etc.
13: Subject Matter
Definition:
The content or subject
matter of the object. AAT or other authority source for term is encouraged
here. Note required "Subject:" in front of text.
<subject>Subject:
value[madonna and child] </subject>
14: Type of Object
Definition:
The classification
of the object by type.
Preferred Use:
This field is for
the term(s) that indicate the classification of the object. For material
culture collections, this will tend to be
the object name (for
example, chair, canoe, etc.); fine art institutions should use this field
to specify object genre or format
(for example, painting,
engraving, etc.). AAT or other authority source for term is encouraged
here. Note required "Classification/Genre:" in front of text.
<genreform>Classification/Genre: value[engraving] </genreform>
or
<genreform source="AAT"
norm="engravings">Classification/Genre: value[engraving] </genreform>
15: Style/Period/Group/Movement/School
Definition:
A term identifying
a style or period in the history of art.
Preferred Use:
This field is for the term(s) identifying a style or period whose characteristics are represented by the object. These terms should preferably be in the AAT, except where the AAT is too centered on Western art. Note the required "Style/Group:" in front of the text.
<physfacet type="Styles-Description"
source="CDWA">Style/Group: value[Surrealism] </physfacet>
Example of one item-record marked up
Note some fields are not used, such as <geogname>, which is more appropriate for archaeological artifacts. Some extra elements are inserted, which are not semantically defined since they will not be coming from the museum collection system, but rather are needed only for the EAD markup (<did> and <physdesc> elements).
<c01 level="item">
  <did>
    <daogrp>
      <daoloc entityref="b1968.79-1a" role="hi-res"></daoloc>
      <daoloc entityref="b1968.79-1t" role="thumbnail"></daoloc>
    </daogrp>
    <origination><persname>Vincent van Gogh, 1853-1890, Holland</persname></origination>
    <unittitle>Portrait of Gauguin</unittitle>
    <unitdate>1887</unitdate>
    <physdesc>
      <physfacet type="Materials-Description" source="CDWA">oil on canvas</physfacet>
      <physfacet type="Materials-Processes" source="CDWA">Intaglio with spatula</physfacet>
      <dimensions>22 x 37 inches</dimensions>
    </physdesc>
    <repository>Berkeley Art Museum</repository>
    <unitid>1958.5.59</unitid>
  </did>
  <admininfo>
    <custodhist>
      <p>Gift of Mr. and Mrs. J. Farnsworth</p>
    </custodhist>
  </admininfo>
  <odd>
    <p>This work was the last painting Van Gogh painted during the short stay of fellow painter Paul Gauguin at their small house in Arles, France. It exemplifies....</p>
  </odd>
  <controlaccess>
    <subject>Subject: Portrait of Artist Paul Gauguin</subject>
    <genreform>Classification/Genre: Painting</genreform>
    <corpname>Style/Group: Early Expressionist</corpname>
  </controlaccess>
</c01>
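Since most records will be exported from collection databases rather than typed by hand, a record like the one above can be assembled by a short script. The Python sketch below is one possible approach under stated assumptions, not the project's export tool: the helper names `element` and `item_record` and the dictionary keys (`creator`, `title`, `object_id`, and so on) are hypothetical stand-ins for whatever a museum's system exports. For brevity it leaves out the <daogrp> and the Style/Group field, and it emits an empty element whenever a field has no data.

```python
from xml.sax.saxutils import escape

def element(tag, text, **attributes):
    """Return one element; a missing value still yields an empty element."""
    attrs = "".join(
        ' {}="{}"'.format(name, escape(value, {'"': "&quot;"}))
        for name, value in attributes.items()
    )
    return "<{0}{1}>{2}</{0}>".format(tag, attrs, escape(text or ""))

def item_record(record, level="c01"):
    """Assemble an item-level record from a dict of exported field values."""
    get = record.get  # hypothetical keys; map them to your own export columns
    parts = [
        '<{} level="item">'.format(level),
        "<did>",
        "<origination>" + element("persname", get("creator")) + "</origination>",
        element("unittitle", get("title")),
        element("unitdate", get("date")),
        "<physdesc>",
        element("physfacet", get("medium"), type="Materials-Description", source="CDWA"),
        element("physfacet", get("technique"), type="Materials-Processes", source="CDWA"),
        element("dimensions", get("dimensions")),
        "</physdesc>",
        element("repository", get("repository")),
        element("unitid", get("object_id")),
        "</did>",
        "<admininfo><custodhist>" + element("p", get("provenance")) + "</custodhist></admininfo>",
        "<odd>" + element("p", get("notes")) + "</odd>",
        "<controlaccess>",
        element("subject", "Subject: " + (get("subject") or "")),
        element("genreform", "Classification/Genre: " + (get("object_type") or "")),
        "</controlaccess>",
        "</{}>".format(level),
    ]
    return "\n".join(parts)
```

Calling `item_record({"title": "Portrait of DeVille", "object_id": "1958.5.59"})` returns a complete <c01> record in which the fields without data are present but empty. Characters such as `&` and `<` in the exported text are escaped so that the result remains well-formed.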
2: Images
The object of these specifications is to provide a common format for the images from all of the participating institutions, for viewing inside the finding aid alongside the item record. Very high resolution images for archiving and future use are not covered here, as they are not used in the project directly and are up to each institution to create and store (however, the TIFF file format is recommended for this purpose).
The common format covers only the following: File format, image viewing gamma, and file pixel dimensions. File naming conventions are covered in the Metadata section. If a museum does not have one size of image (for instance the high-res) then they would simply omit that image and the corresponding EAD markup and only the two they do have will be displayed.
The specifications which follow are based on those used successfully in the California Heritage Project (UC Berkeley). In addition, the AMICO (Art Museum Image Consortium) Project is recommending a maximum resolution of 1024x768 pixels. The MESL (Museum Educational Site Licensing) Project allowed each museum to use a different scheme, but the maximum-sized images typically varied in size from 1284x1876 to 1472x999 pixels, and they used the JPEG file format, as we recommend. The MESL project in particular conducted end-user surveys to determine the acceptable parameters for their images. So, we feel that the recommended maximum image size falls into the range being used in museum community projects, which indicates both its acceptance by end users and the ability of museums to produce such images. The file formats also use ISO standards that are used in the museum community, thus reducing duplicative work for museums and making maximum use of their images in multiple venues.
1. Viewing files should be processed to be viewed in a gamma 2.2 viewing environment.
2. Each viewing file should be provided in three different resolutions to meet different viewing purposes: "High Resolution," for detailed study, "Medium Resolution," for full-screen viewing, and "Thumbnail," for embedding in finding aids.
3. Pixel dimensions (maximum) for each resolution class:
Thumbnail 192 x 192 (e.g., 128W x 166H)
Medium Resolution 768 x 768
High Resolution 1536 x 1536
4. File Formats for each resolution class:
Thumbnail GIF
Medium Resolution JPEG (JFIF)
High Resolution JPEG (JFIF)
Each file format should be tested against a variety of imaging programs and browsers to ensure compatibility. Progressive formats should not be used.
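The maximum pixel dimensions above can be read as a square bounding box that each derivative must fit inside while keeping its proportions, which is how the thumbnail example (128W x 166H inside 192 x 192) behaves. The Python sketch below works out target sizes on that reading. The names `MAX_SIDE` and `fit_within` are hypothetical, and the choice never to enlarge an image that is already smaller than the limit is an assumption of this sketch, not a stated project rule.

```python
# Maximum side length, in pixels, for each resolution class listed above.
MAX_SIDE = {"thumbnail": 192, "medium": 768, "high": 1536}

def fit_within(width, height, resolution_class):
    """Scale (width, height) to fit the class limit, keeping the aspect ratio."""
    limit = MAX_SIDE[resolution_class]
    # Assumption: images already within the limit are left at their own size.
    scale = min(limit / width, limit / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For a 3072 x 2048 source scan, `fit_within(3072, 2048, "high")` gives 1536 x 1024 and `fit_within(3072, 2048, "thumbnail")` gives 192 x 128.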