Dublin Core/FINMARC/GILS Crosswalk

Juha Hakala
Automation Unit of Finnish Research Libraries
Helsinki University Library
juha.hakala@helsinki.fi

Last updated: 01/15/98 (version 1.3)

Changes:

Version 1.2: 245 $b specification; generation of 001 for ISO 2709 from DC Resource Identifier data; specification of 021 $c usage; handling of duplicate Resource Identifier tags with the same Scheme information.

Version 1.3: Specification of URN conversion.


Introduction

This document relies heavily on the Dublin Core/MARC/GILS crosswalk document developed by the Library of Congress (see http://lcweb.loc.gov/marc/dccross.html).

The following is a crosswalk between the fifteen elements in the Dublin Core Element Set on the one hand and both FINMARC bibliographic data elements and GILS attributes on the other. The crosswalk may be used in conversion of metadata from some other syntax into MARC. For conversion of MARC into Dublin Core, many fields would be mapped into a single Dublin Core element; this is not entirely covered in this document.

In the Dublin Core to FINMARC mapping, two mappings are provided in some cases. The first is a simple mapping, used when the Dublin Core elements appear without qualifiers. The second is for a more complex description in which the elements have qualifiers. A record may mix the two, but if a particular element is unqualified, the simple mapping for that element should be used. Certain defaults have been assumed as to definitions and qualifiers; if these change, the list will need to be adjusted. This list has been made consistent with the GILS/MARC mapping where possible. Where applicable, subfields are given.

Earlier metadata workshops supported the notion of defining qualifiers and subelements for elements when more complex descriptions are needed, but the list of qualifiers is not entirely agreed upon. Only the most obvious qualifiers have been included now. When the list of qualifiers becomes standardized, this document will be modified and extended as appropriate.

FINMARC fields are listed with field number, then in parentheses field name/subfield name (if both are the same, no subfield name is included). If the value of the indicator is not provided, use a blank (H'20'). The label is a shortened form of the element name. GILS attribute names for each Dublin Core element are also given. Definitions are taken from Dublin Core Metadata Element Set: Reference Description (see http://purl.org/metadata/dublin_core_elements).

Conversion may produce several instances of the same field. This is acceptable for repeatable fields such as 700, 710 and 720, but only one 100 field is allowed in a record.

Dublin Core to FINMARC and GILS Crosswalk

Title

The name given to the resource by the CREATOR or PUBLISHER.

FINMARC:

  • If the element is repeated in the DC record, all titles after the first: 245$r (Parallel Title/Title proper)
  • If the data contains the string " : " or ": ", that string should be converted into " $b ".
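
    The $b rule above can be illustrated with a minimal Python sketch. The function name title_to_245 is hypothetical, and the sketch assumes that every occurrence of the separator is converted, since the rule does not say otherwise; it returns only the field data and does not add any other subfield codes.

```python
import re

# Matches " : " or ": " (an optional space, a colon, then one space).
_TITLE_SEPARATOR = re.compile(r" ?: ")


def title_to_245(title: str) -> str:
    """Return the title data with each " : " or ": " turned into " $b ".

    Assumption: all occurrences are converted. To convert only the first
    one, pass count=1 to sub().
    """
    return _TITLE_SEPARATOR.sub(" $b ", title)


if __name__ == "__main__":
    print(title_to_245("Dublin Core : a crosswalk"))
    print(title_to_245("Dublin Core: a crosswalk"))
```

    A colon that is not followed by a space (for example in "10:30") is left untouched by this pattern.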

    GILS:

    Author or Creator

    The person(s) or organization(s) primarily responsible for the intellectual content of the resource. For example, authors in the case of written documents, artists, photographers, or illustrators in the case of visual resources. Qualifier possible: type.

    FINMARC:

    If the Author element is repeated in the DC record, the data from repeats should go either to 700$a and $h or 710$a (see contributor tag for details) or 720$a and $h, depending on the type values.

    GILS:

    Subject and Keywords

    The topic of the resource, or keywords or phrases that describe the subject or content of the resource. The intent of the specification of this element is to promote the use of controlled vocabularies and keywords. This element might well include scheme-qualified classification data (for example, Library of Congress Classification Numbers or Dewey Decimal numbers) or scheme-qualified controlled vocabularies (such as Medical Subject Headings or Art and Architecture Thesaurus descriptors) as well. Qualifier possible: scheme.

    FINMARC:

    GILS:

    Description

    A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources. Future metadata collections might well include computational content description (spectral analysis of a visual resource, for example) that may not be embeddable in current network systems. In such a case this field might contain a link to such a description rather than the description itself.

    FINMARC:

    GILS:

    Publisher

    The entity responsible for making the resource available in its present form, such as a publisher, a university department, or a corporate entity. The intent of specifying this field is to identify the entity that provides access to the resource.

    FINMARC:

    GILS:

    Other Contributors

    Person(s) or organization(s) in addition to those specified in the CREATOR element who have made significant intellectual contributions to the resource but whose contribution is secondary to the individuals or entities specified in the CREATOR element (for example, editors, transcribers, illustrators, and convenors). Qualifier possible: type.

    FINMARC:

    GILS:

    Date

    The date the resource was made available in its present form. The recommended best practice is an 8-digit number in the form YYYYMMDD as defined by ANSI X3.30-1985. In this scheme, the date element for the day this is written would be 19961203, or December 3, 1996. Many other schemes are possible, but if used, they should be identified in an unambiguous manner. Qualifier possible: type.

    FINMARC:

    GILS:

    Resource Type

    The category of the resource, such as home page, novel, poem, working paper, technical report, essay, dictionary. It is expected that RESOURCE TYPE will be chosen from an enumerated list of types. A preliminary set of such types can be found at the following URL: http://www.roads.lut.ac.uk/Metadata/DC-ObjectTypes.html.

    FINMARC:

    GILS:

    Format

    The data representation of the resource, such as text/html, ASCII, Postscript file, executable application, or JPEG image. The intent of specifying this element is to provide information necessary to allow people or machines to make decisions about the usability of the encoded data (what hardware and software might be required to display or execute it, for example). As with RESOURCE TYPE, FORMAT will be assigned from enumerated lists such as registered Internet Media Types (MIME types). In principle, formats can include physical media such as books, serials, or other non-electronic media.

    FINMARC:

    GILS:

    Resource Identifier

    String or number used to uniquely identify the resource. Examples for networked resources include URLs and URNs (the latter will hopefully replace the former as an identifier in the long run). Other globally unique identifiers, such as International Standard Book Numbers (ISBN) or other formal names, would also be candidates for this element. Qualifier possible: scheme.

    FINMARC:

    When generating exchange records, IDs based on the following systems should also be converted to the 001 tag: NBN, ISBN, ISRC, ISMN and ISRN.

    If the converter is able to check the validity of ISBNs and other established codes, and finds that an ID is not correct, the ID is converted to subfield $c of the respective tag.
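
    As an illustration of such a validity check, the following Python sketch tests the check digit of a 10-digit ISBN and chooses the target subfield. The function names are hypothetical, and the use of $a for a valid ISBN is an assumption made for the example (only the $c case is stated above); other code systems would need their own checks.

```python
def is_valid_isbn10(isbn: str) -> bool:
    """Check a 10-digit ISBN using the standard modulus 11 check digit."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # "X" stands for 10 and is only legal as the check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        # Weights run from 10 (first digit) down to 1 (check digit).
        total += (10 - position) * value
    return total % 11 == 0


def isbn_subfield(isbn: str) -> str:
    """Pick the subfield: $c for an incorrect ISBN, $a (assumed) otherwise."""
    return "$a" if is_valid_isbn10(isbn) else "$c"


if __name__ == "__main__":
    print(isbn_subfield("0-306-40615-2"))
    print(isbn_subfield("0-306-40615-3"))
```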

    If NBN, ISBN, ISRC, ISMN or ISRN data is repeated in the DC record, all occurrences after the first one are ignored.
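
    The rule for repeated identifiers might be sketched in Python as follows. The function name and the representation of identifiers as (scheme, value) pairs are assumptions made for the example; identifiers with other schemes are passed through unchanged, since the rule above does not mention them.

```python
from typing import List, Tuple

# Schemes for which only the first occurrence in the DC record is converted.
SINGLE_OCCURRENCE_SCHEMES = {"NBN", "ISBN", "ISRC", "ISMN", "ISRN"}


def drop_repeated_identifiers(
    identifiers: List[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Keep the first identifier per listed scheme and ignore later repeats."""
    seen = set()
    kept = []
    for scheme, value in identifiers:
        key = scheme.upper()
        if key in SINGLE_OCCURRENCE_SCHEMES:
            if key in seen:
                continue  # a later occurrence of the same scheme is ignored
            seen.add(key)
        kept.append((scheme, value))
    return kept


if __name__ == "__main__":
    record = [
        ("ISBN", "0-306-40615-2"),
        ("URL", "http://example.org/a"),
        ("ISBN", "0-8044-2957-X"),
    ]
    print(drop_repeated_identifiers(record))
```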

    URNs are always based on other identifiers such as NBNs or ISBNs. The basic ID can be extracted from the URN string. For URNs based on NBNs, this is done by stripping the string "URN:NBN:fi-" from the beginning of the URN; the rest may be placed into 015$a.
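
    The NBN case can be sketched in Python as below. The function name is hypothetical, the sample URN value is invented for illustration, and matching the prefix without regard to letter case is an assumption of the sketch rather than part of the rule above.

```python
from typing import Optional

URN_NBN_PREFIX = "URN:NBN:fi-"


def nbn_from_urn(urn: str) -> Optional[str]:
    """Return the part of an NBN-based URN that may be placed into 015$a.

    Returns None when the URN does not start with "URN:NBN:fi-".
    Assumption: the prefix is compared case-insensitively.
    """
    head = urn[: len(URN_NBN_PREFIX)]
    if head.lower() != URN_NBN_PREFIX.lower():
        return None
    return urn[len(URN_NBN_PREFIX):]


if __name__ == "__main__":
    # "fe19981001" is a made-up NBN used only as sample data.
    print(nbn_from_urn("URN:NBN:fi-fe19981001"))
```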

    GILS:

    Source

    The work, either print or electronic, from which this resource is derived, if applicable. For example, an html encoding of a Shakespearean sonnet might identify the paper version of the sonnet from which the electronic version was transcribed.

    FINMARC:

    GILS:

    Language

    Language of the intellectual content of the resource. Where practical, the content of this field should coincide with the Z39.53 three character codes for written languages. Qualifier possible: scheme.

    FINMARC:

    GILS:

    A three-character language code standard is currently under development as ISO 639-2 (not yet available electronically).

    Relation

    Relationship to other resources. The intent of specifying this element is to provide a means to express relationships among resources that have formal relationships to others, but exist as discrete resources themselves. For example, images in a document, chapters in a book, or items in a collection. A formal specification of RELATION is currently under development. Users and developers should understand that use of this element should be currently considered experimental. Possible qualifiers: scheme, type.

    FINMARC:

    GILS:


    Coverage

    The spatial locations and temporal durations characteristic of the resource. Formal specification of COVERAGE is currently under development. Users and developers should understand that use of this element should be currently considered experimental. Possible qualifier: type.

    FINMARC:

    GILS:

    Rights Management

    The content of this element is intended to be a link (a URL or other suitable URI as appropriate) to a copyright notice, a rights-management statement, or perhaps a server that would provide such information in a dynamic way. The intent of specifying this field is to allow providers a means to associate terms and conditions or copyright statements with a resource or collection of resources. No assumptions should be made by users if such a field is empty or not present. Qualifiers possible: URL, URN. Nota bene: not available in the Nordic Metadata template yet (will be added shortly).

    FINMARC:


    GILS:

    Notes

    In addition to the variable length fields listed in the mapping, a FINMARC record will also include a Leader and field 008 (Fixed-Length Data Elements). Certain character positions in each of these fixed length fields of a FINMARC record will need to be coded, although most will generate default values.

    Leader: a fixed field comprising the first 24 character positions (00-23) of each record that provides information for the processing of the record. The following positions should be generated:

    008 Fixed Length Data Elements: Forty character positions (00-39) containing positionally-defined data elements that provide coded information about the record as a whole or about special bibliographic aspects of the item being cataloged. For records originating as Dublin Core, the following character positions are used:

    Uses for mapping Dublin Core to MARC

    A mapping between the elements in the Dublin Core and FINMARC fields is necessary so that conversions between various syntaxes can occur accurately. Once Dublin Core style metadata is widely provided, it might interact with MARC records in various ways such as the following:

    Enhancement of simple resource description record. A cataloging agency may wish to extract the metadata provided in Dublin Core style (presumably in HTML or SGML) and convert the data elements to MARC fields, resulting in a skeletal record. That record might then be enhanced as needed to add additional information generally provided in the particular catalog.

    Searching across syntaxes and databases. Libraries have large systems with valuable information in MARC bibliographic records (which may also be called metadata). Over the past few years, with the expansion of electronic resources on the Internet, other syntaxes have also been considered for providing metadata. The Library of Congress has worked with a group of SGML experts to create a Document-Type Definition (DTD) for MARC, so that conversions can be made between SGML and MARC in a standardized way. It will be important for systems to be able to search metadata in different syntaxes and databases and have commonality in the definition and use of elements.