A Metadata Schema Registry as a Tool to Enhance Metadata Interoperability
Mitsuharu Nagamori and Shigeo Sugimoto
Interoperability is one of the most crucial issues for the metadata and digital library communities. Metadata registries are formal systems that can disclose authoritative information about semantics and data elements to realize semantic interoperability of metadata across domains and cultures. Registries typically store the semantics of metadata elements, maintain information about any local extensions, and provide mappings to other metadata schemas. This article describes the basic requirements and functions for a metadata schema registry. The primary function of a metadata schema registry is to provide reference descriptions of metadata terms for both human users and machines. Based upon our experiences in developing software tools with metadata schema registries, e.g., subject gateways and metadata databases, we have learned that a metadata schema registry has the potential to provide a wider range of services based on metadata schemas. This article also describes some functional extensions to our metadata schema registry in Tsukuba, Japan.
Since the emergence of the World Wide Web in the mid-1990s, metadata has been recognized as a key technology for digital libraries. Metadata is typically defined as "data about other data," a simple definition that embraces a broad range of resources from library catalogs and indexes to thesauri, ratings, reviews, and terms and conditions for use. On the Internet, metadata is designed for tasks ranging from resource description and discovery to archiving, trading, content filtering, resource syndication and information management. This diversity of purpose reflects the variety of information resources available on the Internet, ranging from personal web pages to huge portals for government information, digital libraries, and shopping catalogs, as well as the variety of users ranging from young children to businesses and professionals.
Various communities use metadata on the Internet, such as digital libraries, museums and e-governments. Each community has defined its own metadata standard according to its purpose, e.g., Dublin Core Metadata Element Set (DCMES), Metadata Encoding and Transmission Standard (METS), Metadata Object Description Schema (MODS), and IEEE Learning Object Metadata (LOM). A domain-specific metadata standard meets the needs of its own community. However, such a standard may compromise discovery and reuse of metadata across communities. It is necessary to satisfy both interoperability and domain specificity. Thus, interoperability is one of the important issues for the metadata and digital library communities.
A metadata schema registry (or simply, metadata registry) is a formal system that provides services over metadata vocabularies to users and machines. A metadata schema registry is widely recognized as an important tool not only for sharing information about metadata vocabularies but also for enhancing their reusability. A metadata registry is also known to play an important role in semantic metadata interoperability across communities speaking different languages and over time. Achieving metadata interoperability is fundamental to making information resources shareable and discoverable. This article describes the basic requirements and functions of metadata schema registries. The authors have been developing a metadata schema registry since 1998, as well as developing some software tools for using metadata schema registries. From our experiences, we have found that metadata schema registries have the potential to provide various services based on metadata schemas. This article also describes a functional extension of a metadata schema registry for metadata applications.
A metadata schema defines a framework for representing metadata. In general, a metadata schema includes semantic definitions of terms used in the schema, structural constraints and data structure definitions, and bindings to physical description syntax (such as XML).
A metadata schema consists of the following components:
Library catalogs are one type of metadata. Cataloging rules, in general, include guidelines for catalogers to extract values from resources to create catalogs in addition to the semantic and syntactic components listed above. The definition of metadata schema in this article does not include such guidelines.
Metadata schema descriptions are generally given in RDF (Resource Description Framework) Schema language. RDF is a language for representing information about resources, and RDF Schema is a language for defining RDF properties that express metadata terms. In RDF Schema, every metadata term is given a unique identifier that works as its primary name.
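As a minimal sketch of the idea that each term is identified by a unique identifier and described by RDF statements, the following Python fragment models one Dublin Core term as a small set of subject-predicate-object triples. The namespace URIs are the published RDF, RDF Schema and Dublin Core namespaces; the `describe` helper and the plain-tuple representation are assumptions made for illustration and are not part of any registry software described in this article.

```python
# Published namespace URIs for RDF, RDF Schema and DCMES 1.1.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DC = "http://purl.org/dc/elements/1.1/"

# Each statement is a (subject, predicate, object) triple.
# The subject URI is the unique identifier (primary name) of the term.
triples = [
    (DC + "title", RDF + "type", RDF + "Property"),
    (DC + "title", RDFS + "label", "Title"),
    (DC + "title", RDFS + "comment", "A name given to the resource."),
    (DC + "title", RDFS + "isDefinedBy", DC),
]


def describe(term_uri, statements):
    """Hypothetical helper: collect predicate -> object for one term URI."""
    return {p: o for s, p, o in statements if s == term_uri}


description = describe(DC + "title", triples)
print(description[RDFS + "label"])
```

Because the term URI is the key, two schemas that reuse the same URI are by construction talking about the same term, which is the property a registry relies on.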
Application Profiles are defined as schemas consisting of data elements drawn from one or more namespaces, combined together by implementers, and optimized for a particular local application. In this article, a set of rules that defines structural constraints and syntactic features of a metadata schema is called an Application Profile. An Application Profile provides a framework to adopt one or more element sets in accordance with application requirements. The Dublin Core Metadata Element Set (DCMES) defines the vocabulary of metadata, i.e., terms and their meanings. But in general, DCMES does not specify encoding or syntactic characteristics. An exception is the feature included in Simple Dublin Core that states "Any of the 15 elements is optional and repeatable". In addition, local applications may have domain-specific requirements appropriate to a given domain or application, for example:
These requirements can be defined independently of the vocabulary definitions. Any application can have its own application profile, which specifies a set of metadata vocabulary terms used in the application as well as syntactic or structural features of the particular application. The vocabulary terms could be borrowed from one or more source schemas. More importantly, the application profile could be used to define a mapping from the particular local application's scheme to a global metadata scheme, or schemes, which is crucial for metadata interoperability.
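The separation between vocabulary and application profile can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the class names `ElementRule` and `ApplicationProfile`, the `LOCAL` namespace, the `shelfMark` term and its mapping to `dc:identifier` are all hypothetical and do not come from the authors' registry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

DC = "http://purl.org/dc/elements/1.1/"
LOCAL = "http://example.org/local-terms/"  # hypothetical local namespace


@dataclass(frozen=True)
class ElementRule:
    term: str                      # URI of a term borrowed from some schema
    mandatory: bool = False        # structural constraint set by the profile
    repeatable: bool = True        # structural constraint set by the profile
    maps_to: Optional[str] = None  # optional mapping to a global term


@dataclass
class ApplicationProfile:
    name: str
    rules: List[ElementRule] = field(default_factory=list)

    def validate(self, record: Dict[str, List[str]]) -> List[str]:
        """Check a record (term URI -> list of values) against the rules."""
        problems = []
        allowed = {rule.term for rule in self.rules}
        for term in record:
            if term not in allowed:
                problems.append(f"term not in profile: {term}")
        for rule in self.rules:
            values = record.get(rule.term, [])
            if rule.mandatory and not values:
                problems.append(f"missing mandatory term: {rule.term}")
            if not rule.repeatable and len(values) > 1:
                problems.append(f"term not repeatable: {rule.term}")
        return problems

    def to_global(self, record: Dict[str, List[str]]) -> Dict[str, List[str]]:
        """Re-key a record with maps_to; unmapped terms keep their own URI."""
        result: Dict[str, List[str]] = {}
        for rule in self.rules:
            values = record.get(rule.term, [])
            if values:
                result.setdefault(rule.maps_to or rule.term, []).extend(values)
        return result


profile = ApplicationProfile(
    name="example-catalogue",  # hypothetical profile
    rules=[
        ElementRule(DC + "title", mandatory=True, repeatable=False),
        ElementRule(DC + "subject"),
        ElementRule(LOCAL + "shelfMark", repeatable=False,
                    maps_to=DC + "identifier"),
    ],
)

record = {DC + "title": ["Registry notes"], LOCAL + "shelfMark": ["X-12"]}
print(profile.validate(record))
print(profile.to_global(record))
```

The term definitions themselves never appear in the profile; it only names terms by URI and adds constraints and mappings, which is why the same vocabulary can be reused by several profiles.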
In the digital library and Semantic Web communities, achieving metadata interoperability is fundamental for making information resources shareable and discoverable. The following paragraphs describe the requirements for enhancing metadata interoperability.
(1) Interoperability among different metadata standards
(3) Multiple Languages
A goal of metadata schema registries is to make metadata schemas understandable by both humans and machines, and shareable among user communities. Metadata schema registries have captured the interest of broad metadata communities because of the strong requirements of interoperability and longevity of metadata schemas. ISO/IEC 11179 addresses the semantics of the data, the representations of data, and the registration of the descriptions of the data. ISO/IEC 11179 allows the creation of a shared data environment. The Universal Description, Discovery and Integration (UDDI) registries act as reference points for Web Services that allow for common descriptions and discovery of those services. UDDI is based on XML standards and is platform-independent. ISO/IEC JTC1 SC32 WG2 has been organizing a series of workshops on metadata registries.
The white paper reported by the DELOS Working Group on Registries describes basic concepts of metadata schemas, i.e., metadata vocabulary, layers for metadata interoperability, data model, and so forth. The layered model discussed in the white paper gives a framework for metadata vocabularies.
Beginning in January 2004, the JISC IE Metadata Schema Registry (IEMSR) project started development of a metadata schema registry as a pilot, shared service within the JISC Information Environment. The IEMSR will act as the primary source for authoritative information about metadata schemas recommended by the JISC IE Standards framework. Metadata within the JISC IE is based on the Dublin Core and IEEE LOM standards.
SchemaWeb is a repository for RDF Schemas expressed in the RDF Schema, OWL and DAML+OIL schema languages. It provides a simple directory of RDF Schemas for both human users and machines to search and browse metadata schemas.
Functional Requirements for a Metadata Schema Registry
A metadata schema defines a framework for representing metadata. In order to make metadata shareable and discoverable across user communities and languages, it is necessary to improve the interoperability of metadata schemas. A metadata schema registry stores metadata schemas and serves information about them (e.g., definitions of the metadata schemas, and relationships between metadata terms) to users. A metadata schema registry provides its services not only for human users but also for machines. Human users and machines may have the following requirements:
For human users
The extended functional requirements of a metadata registry are summarized below.
(3) Schema Mapping
(4) Version Management
(5) Multilanguage User Interfaces
(6) API for software tools
The ULIS metadata schema registry developed by the authors provides reference descriptions of metadata terms in multiple languages encoded in RDF Schema. The ULIS metadata schema registry stores DCMES in 22 languages, e.g., English, Japanese, Chinese, Korean and others. We have experimentally stored metadata elements of the Internet Public Library Asia (IPL-Asia) and those of the Nippon Cataloging Rules (NCR) in the ULIS metadata schema registry.
The DCMI Registry Working Group, which was established in December 1999, has been discussing and developing a metadata schema registry. The authors have been involved in the working group since 1998. The DCMI registry, which is in operation, provides authoritative reference descriptions of metadata schemas (Figure 1). The reference descriptions are internally encoded in RDF Schema and translated into 24 different languages. The reference descriptions are presented in a user-friendly form for human users and in RDF Schema for machines. The application program interface is provided based on Web Services protocols, i.e., both SOAP and REST. The description of each metadata term includes a unique name of the term, language-dependent labels, a definition statement of the term, date(s) of issue, type of the term, etc. The DCMI registry is provided as open source software for use by broader communities. As of summer 2005, the DCMI registry has been made available in Germany, China, New Zealand and Tsukuba, Japan in addition to OCLC.
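To make the shape of such a reference description concrete, the following Python sketch models one term with a unique name, language-dependent labels, a definition, a date of issue and a type, and looks up a label with a fallback language. It is an assumption-laden illustration: the `TermDescription` and `Registry` classes, the fallback behaviour, and the sample values (including the Japanese label and the issue date) are hypothetical and are not taken from the DCMI registry's actual data model or API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TermDescription:
    name: str               # unique name (URI) of the term
    labels: Dict[str, str]  # language tag -> language-dependent label
    definition: str         # definition statement
    issued: str             # date of issue (illustrative value)
    term_type: str          # type of the term

    def label_for(self, lang: str, fallback: str = "en") -> Optional[str]:
        """Return the label in `lang`, or the fallback language if absent."""
        return self.labels.get(lang, self.labels.get(fallback))


class Registry:
    """Hypothetical in-memory registry keyed by the term's unique name."""

    def __init__(self) -> None:
        self._terms: Dict[str, TermDescription] = {}

    def register(self, term: TermDescription) -> None:
        self._terms[term.name] = term

    def lookup(self, name: str) -> Optional[TermDescription]:
        return self._terms.get(name)


registry = Registry()
registry.register(TermDescription(
    name="http://purl.org/dc/elements/1.1/title",
    labels={"en": "Title", "ja": "タイトル"},  # illustrative labels
    definition="A name given to the resource.",
    issued="1999-07-02",                       # illustrative date
    term_type="element",
))

title = registry.lookup("http://purl.org/dc/elements/1.1/title")
print(title.label_for("ja"))
print(title.label_for("fr"))  # no French label stored, falls back to English
```

Keeping the labels in a per-language map while the name stays language-neutral is what lets one term description serve both a multilingual user interface and a machine client.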
Functional Extension of the Registry
The primary function of a metadata schema registry is to provide reference descriptions of metadata terms both for human users and machines. From our experiences in developing software tools, e.g., subject gateways and metadata databases, we have learned that a metadata schema registry has the potential to provide a wider range of services based on metadata schemas. We have experimentally developed a few functions to evaluate the feasibility of functional extension of the metadata schema registry. The functions presented below are to be incorporated with the basic functions of the metadata schema registry. The functions are materialized in software tools to support information access across metadata schemas, a software generator based on metadata schemas, and a support tool for developing and maintaining metadata vocabularies.
Experimental Study: A Metadata Schema-Driven Software Tool Generator
From our experiences in developing software tools for metadata applications, we have learned that basic software tools such as a metadata editor and a search tool can be (semi-)automatically derived from metadata schemas. Based on this idea, we have been developing an experimental software tool generator for metadata application systems, which uses schema descriptions of metadata vocabularies and application profiles. This experimental system has a set of built-in primitive functions, e.g., to load/store texts from/to a database, to search text in a database, and so on.
The system produces a software tool from a set of XML documents that specifies the functions and the user interfaces of the software tool. The XML document set is named Application System Description (ASD). An ASD is composed of the following four elements.
Figure 3 shows how the software tool generator produces metadata-driven software, such as a subject gateway, based on an application profile. The metadata instances created and used in the generated software tools conform to the syntactic definition. A function repository, shown in Figure 3, stores primitive functions that are used in metadata-driven software, e.g., editing, browsing and searching metadata. The functions in the function repository are shared by all metadata-driven software tools. Since user interfaces are derived from a metadata schema that includes class definitions of the domain and range of each metadata element, we can choose user interface widgets and built-in functions for an element in accordance with its class definitions.
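The selection of widgets from class definitions can be sketched as a simple lookup from the range class of an element to a widget name. The Python fragment below is a hypothetical illustration and not the authors' generator: the widget names, the `SUBJECT_SCHEME` class, and the ranges assigned to the Dublin Core elements are all assumptions made for the example (DCMES 1.1 itself does not declare these ranges).

```python
DC = "http://purl.org/dc/elements/1.1/"
RDFS_LITERAL = "http://www.w3.org/2000/01/rdf-schema#Literal"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"
SUBJECT_SCHEME = "http://example.org/classes/SubjectScheme"  # hypothetical

# Hypothetical table: range class of an element -> user interface widget.
WIDGET_FOR_RANGE = {
    RDFS_LITERAL: "text_field",
    XSD_DATE: "date_picker",
    SUBJECT_SCHEME: "drop_down_list",
}


def choose_widget(range_class, default="text_field"):
    """Pick a widget for a range class; unknown classes get the default."""
    return WIDGET_FOR_RANGE.get(range_class, default)


def build_form(element_ranges):
    """Map each element URI to a widget, given element URI -> range class."""
    return {
        element: choose_widget(range_class)
        for element, range_class in element_ranges.items()
    }


# Ranges assumed for illustration only.
form = build_form({
    DC + "title": RDFS_LITERAL,
    DC + "date": XSD_DATE,
    DC + "subject": SUBJECT_SCHEME,
    DC + "creator": None,  # no range declared, so the default widget is used
})
for element, widget in form.items():
    print(element, "->", widget)
```

A table-driven choice like this keeps the generator generic: supporting a new kind of element means adding a class-to-widget entry rather than writing a new editor.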
Summary and Future Work
The experimental system shown above is a rather straightforward extension of the metadata schema registry. We have found that the separation of syntactic and semantic features is useful to understand the functionality of the extended functions.
From this study and other related studies, we have learned the following lessons:
1. Baker, T., et al. Principles of Metadata Registries. Available at <http://delos-noe.iei.pi.cnr.it/activities/standardizationforum/Registries.pdf>.
5. Heery, R. and Patel, M. "Application profiles: mixing and matching metadata schemas." Ariadne 25, September, 2000, <http://www.ariadne.ac.uk/issue25/app-profiles/intro.html>.
10. Lee, M. Development of a Software Tool Generator based on Declarative Descriptions of Metadata Schemas and Applications. Master Thesis, Graduate School of Library, Information and Media Studies, University of Tsukuba, Japan, 2005.
11. Lee, W., et al. "A Subject gateway in Multiple Languages: a Prototype Development and Lesson Learned." Proceedings of DC-2003, pp.59-66, Seattle, 2004.
14. Nagamori, M. and Sugimoto, S. "A Metadata Schema Framework for Functional Extension of the Metadata Schema Registry." Proceedings of DC-2004, pp.3-11, Shanghai, 2004.
15. Nagamori, M., et al. "A Multilingual Metadata Schema Registry Based on RDF Schema." Proceedings of DC-2001, pp.209-212, Tokyo, 2001.
20. Sugimoto, S., et al. "Versioning the Dublin Core Across Multiple Languages and Over Time." Proceedings of SAINT 2001 Workshop, pp.151-156, San Diego, 2001.
21. Sugimoto, S. "Metadata Schemas, Models and Tools - Metadata-Oriented Projects at Tsukuba and Lessons Learned for Interoperability." Proceedings of ICDL 2004, pp.690-699, India, 2004.
© Copyright 2006 Mitsuharu Nagamori and Shigeo Sugimoto