ISO 25964: Thesauri and Interoperability with other Vocabularies

Scope

This is an international standard development project of ISO Technical Committee 46 (Information and documentation) Subcommittee 9 (Identification and description). The assigned Working Group (WG8) will revise, merge, and extend two existing international standards:

  • ISO 2788 Guidelines for the establishment and development of monolingual thesauri
  • ISO 5964 Guidelines for the establishment and development of multilingual thesauri

These two intimately linked standards will be updated to reflect the needs of the 21st century, bearing in mind the demand for interoperability in a networked society. All of their existing scope will be retained and refreshed, and the following additional subjects will be added:

  • Guidance on electronic functions and displays
  • Functional specification for software to manage thesauri
  • Interoperability (mapping) between thesauri and other types of vocabulary
  • Data modeling and formats for exchange of thesaurus data

Work

The work will be based on BS 8723 Structured vocabularies for Information Retrieval, a standard in five parts as follows:

  • Part 1: Definitions, symbols and abbreviations;
  • Part 2: Thesauri;
  • Part 3: Vocabularies other than thesauri;
  • Part 4: Interoperability between vocabularies;
  • Part 5: Exchange formats and protocols for interoperability

ISO 25964 will draw on BS 8723, but reorganize the content to fit into two parts. Part 1 will cover all aspects of thesauri, monolingual and multilingual, including a data model and formats/protocols for data exchange. Part 2 will cover interoperability between thesauri and other vocabularies such as classification schemes, taxonomies and ontologies. It will provide guidance on mapping practice and architecture.

Working Group

A Working Group has been established, with members from 13 countries: Bulgaria, Canada, Denmark, Finland, France, Germany, New Zealand, South Africa, Spain, Sweden, UK, Ukraine, and USA. The project is led by Stella Dextre Clarke of the UK.

Status

Part 1 (Thesauri for information retrieval) was issued for a Draft International Standard (DIS) ballot running from 2009-10-26 to 2010-03-26. A draft schema to accompany the standard and a test XML document using the schema are available for public review and comment.

Part 2 (Interoperability with other vocabularies) is in development. It is expected to be released for a Committee Draft ballot in the first quarter of 2010.

ISO 25964 – the international standard for thesauri and interoperability with other vocabularies

Summary description of ISO 25964

ISO 25964. Thesauri and interoperability with other vocabularies
    Part 1: Thesauri for information retrieval
    Part 2: Interoperability with other vocabularies

Part 1 of the standard, published in 2011, covers all aspects of developing a thesaurus, monolingual or multilingual. It has replaced the previous standards ISO 2788 and ISO 5964. To encourage networking interoperability, it includes a data model and an XML schema for data exchange.

Part 2 of the standard, to be published in 2013, covers new ground not previously addressed in any standard. Its main aim is to encourage high quality information retrieval across networked resources that have been indexed with different vocabularies. It explains how to set up mappings between the concepts in such vocabularies, and other forms of complementary use.

Both parts of the standard can be obtained directly from ISO via the ISO Store, or from any of ISO’s member bodies, such as ANSI in the US or BSI in the UK. They are also stocked in a number of public and university libraries. Definitions contained in the standards can be freely viewed at https://www.iso.org/obp/ui/ (search for “25964”).

Development background

ISO 2788 and ISO 5964, the standards for monolingual and multilingual thesauri, were first published in 1974 and 1985 respectively, and only ISO 2788 was subsequently updated. In 2008 it was judged necessary to bring the two together and overhaul them completely, adding substantial new content to cater for the needs of networked information retrieval. A Working Group known as “WG8: Structured Vocabularies” was established under the auspices of ISO Technical Committee 46 (Information and documentation) Subcommittee 9 (Identification and description).

Active members of WG8 between 2008 and 2013 have included:

  • Victor Beloozerov (RU)
  • Sylvie Dalbin (FR)
  • Johan De Smedt (BE)
  • Stella Dextre Clarke (GB)
  • Axel Ermert (DE)
  • F. Javier García Marco (ES)
  • Alan Gilchrist (GB)
  • Michèle Hudon (CA)
  • Daniel Kless (DE)
  • Traugott Koch (DE)
  • Sophie Lessard (CA)
  • Richard Light (GB)
  • Jutta Lindenthal (DE)
  • Marianne Lykke (DK)
  • Tracy Powell (NZ)
  • Esther Scheven (DE)
  • Douglas Tudhope (GB)
  • Bernard Vatant (FR)
  • Leonard Will (GB)
  • Marcia Zeng (US)

The Group is chaired by Stella Dextre Clarke, and the Secretariat is provided by NISO.

While developing the new standard, the group has maintained contact with the people and organizations responsible for related standards. For example, the British Standard BS 8723 [26] was used heavily in preparing the first drafts of ISO 25964. Contact has also been maintained with the committee responsible for the American standard ANSI/NISO Z39.19 [25]. Care has similarly been taken to maintain compatibility with the W3C standard SKOS [29]. (See below.)

More background on the development history can be found on the Further reading and related resources page. See especially [2], [9] and [10].

Maintenance responsibility

Like all ISO standards, each Part of ISO 25964 is subject to review five years after publication. It will be the responsibility of ISO and its member bodies to ensure that the documents are updated as necessary. The NISO Secretariat has established a public e-mail list for the 25964info interest group; by subscribing to this you can submit comments, suggestions and queries about the standard. Any corrections that are agreed will be noted as Errata on this site. See the ISO 25964 Interest Group e-mail list for news of continuing work around the implementation of the standards.

Content of Part 1 (ISO 25964-1)

Scope

This part of ISO 25964 gives recommendations for the development and maintenance of thesauri intended for information retrieval applications, whether monolingual or multilingual. It is applicable to vocabularies used for retrieving information from all types of information resources, irrespective of the media used (text, sound, still or moving image, physical object or multimedia) including knowledge bases and portals, bibliographic databases, text, museum or multimedia collections, and the items within them.

This part of ISO 25964 also provides a data model and recommended format for the import and export of thesaurus data.

Requirements are given for software that manages thesauri, but not for the databases or software used directly in search or indexing applications (although the needs of such applications are anticipated among the recommendations for thesaurus management).

Abbreviated Table of contents

Foreword
Introduction
1 Scope
2 Terms and definitions
3 Symbols, abbreviated terms and other conventions
4 Thesaurus overview and objectives
5 Concepts and their scope in a thesaurus
6 Thesaurus terms
7 Complex concepts
8 The equivalence relationship, in a monolingual context  
9 Equivalence across languages  
10 Relationships between concepts  
11 Facet analysis  
12 Presentation and layout  
13 Managing thesaurus construction and maintenance  
14 Guidelines for thesaurus management software  
15 Data model  
16 Integration of thesauri with applications  
17 Exchange formats  
18 Protocols  
Annex A (informative) Examples of displays found in published thesauri
Annex B (informative) XML Schema for data exchange  
Bibliography  
Index 

Provisions for data exchange

The ISO 25964 XML schema for data exchange is based upon the data model in Clause 15; see [1] and [4]. Although the data model and schema provide for some very sophisticated thesauri, using any or all of the features described in the standard, much of their content is optional. Thus in practice they can be stripped down to work easily for very simple vocabularies. See more explanation at the ISO 25964 Schema webpage, where you can access and download the model and version 1.4 of the schema free of charge, together with documentation and a test document illustrating how the schema may be applied.
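To illustrate how a stripped-down vocabulary might look when most optional features are left out, here is a minimal Python sketch. It is not the ISO 25964 schema: the class name Concept, its fields, and the XML element names (Thesaurus, Concept, PreferredTerm, NonPreferredTerm, Broader, Related) are assumptions chosen for readability. Consult the published schema and its documentation for the real element names and structure.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class Concept:
    """A thesaurus concept: one preferred term, everything else optional."""
    identifier: str
    preferred_term: str
    non_preferred_terms: list[str] = field(default_factory=list)
    broader: list[str] = field(default_factory=list)  # ids of broader concepts
    related: list[str] = field(default_factory=list)  # ids of associated concepts


def to_xml(concepts: list[Concept]) -> str:
    """Serialize concepts to XML. Element names are illustrative, not ISO 25964."""
    root = ET.Element("Thesaurus")
    for c in concepts:
        node = ET.SubElement(root, "Concept", id=c.identifier)
        ET.SubElement(node, "PreferredTerm").text = c.preferred_term
        # Optional features are emitted only when present.
        for term in c.non_preferred_terms:
            ET.SubElement(node, "NonPreferredTerm").text = term
        for target in c.broader:
            ET.SubElement(node, "Broader", ref=target)
        for target in c.related:
            ET.SubElement(node, "Related", ref=target)
    return ET.tostring(root, encoding="unicode")


# A very simple vocabulary: two concepts, one synonym, one hierarchical link.
simple = [
    Concept("c1", "vehicles"),
    Concept("c2", "cars", non_preferred_terms=["automobiles"], broader=["c1"]),
]
xml_text = to_xml(simple)
print(xml_text)
```

Because every field except the identifier and the preferred term defaults to an empty list, a simple vocabulary produces correspondingly small output, which mirrors the idea that optional parts of the model can simply be omitted.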

Content of Part 2 (ISO 25964-2)

Scope

This part of ISO 25964 is applicable to thesauri and other types of vocabulary that are commonly used for information retrieval. It describes, compares and contrasts the elements and features of these vocabularies that are implicated when interoperability is needed. It gives recommendations for the establishment and maintenance of mappings between multiple thesauri, or between thesauri and other types of vocabularies.

Follow the links in the reading list for more detail on mapping practice.
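As a rough illustration of how mappings between two vocabularies could support retrieval, the sketch below stores a few mapping records and uses the equivalence mappings to translate a search term from one vocabulary to another. The Mapping class, the translate_query function, the field names and the sample terms are all hypothetical. The standard gives recommendations on mapping practice, not this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source: str         # term for a concept in the source vocabulary
    target: str         # term for a concept in the target vocabulary
    kind: str           # "equivalence", "hierarchical" or "associative"
    exact: bool = True  # False marks an inexact equivalence


def translate_query(term: str, mappings: list[Mapping],
                    allow_inexact: bool = False) -> list[str]:
    """Return target-vocabulary terms reached from `term` via equivalence mappings."""
    results = []
    for m in mappings:
        if m.source != term or m.kind != "equivalence":
            continue
        if m.exact or allow_inexact:
            results.append(m.target)
    return results


# Hypothetical mappings from vocabulary A to vocabulary B.
example_mappings = [
    Mapping("cars", "automobiles", "equivalence"),
    Mapping("cars", "motor vehicles", "hierarchical"),
    Mapping("cars", "passenger vehicles", "equivalence", exact=False),
]

print(translate_query("cars", example_mappings))
print(translate_query("cars", example_mappings, allow_inexact=True))
```

Restricting translation to exact equivalences by default favors precision. Passing allow_inexact=True trades some precision for recall, a choice that depends on the application.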

Abbreviated Table of contents

Foreword 
Introduction 
1 Scope 
2 Normative references
3 Terms and definitions
4 Symbols, abbreviations and other conventions
5 Objectives and identification
6 Structural models for mapping across vocabularies
7 Types of mapping
8 Equivalence mappings
9 Hierarchical mappings
10 Associative mappings
11 Exact, inexact and partial equivalence
12 Use of mappings in information retrieval
13 Handling pre-coordination
14 Techniques for identifying candidate mappings
15 Managing the data
16 Display of mapped vocabularies
17 Classification schemes
18 Classification schemes used for records management
19 Taxonomies
20 Subject heading schemes
21 Ontologies
22 Terminologies
23 Name authority lists
24 Synonym rings
Annex A (informative) Management of terminological data in support of interoperability
Bibliography
Index

Interoperability with SKOS and other schemas

To reach the goal of interoperability across today’s expanding networks is an immense and infinitely extensible endeavor. Think of it as a jigsaw of standards and protocols, each shaped to interlock with the neighboring pieces. Think of it also as a community effort, in which the developers of each jigsaw piece must collaborate with others to ensure the smooth flow of error-free data.

Especially close neighbors in the Semantic Web jigsaw are ISO 25964 and SKOS [29]. ISO 25964-1 essentially advises on the selection and fitting together of concepts, terms and relationships to make a good thesaurus. SKOS addresses the next step, with recommendations on porting the resultant thesauri (or other ‘simple Knowledge Organization Systems’) to the Web. ISO 25964-2 recommends the sort of mappings that can be established between one KOS and another; SKOS presents a way of expressing these when published to the Web.

To ensure a good fit between the recommendations of these complementary standards, the teams responsible for them have maintained contact throughout. The respective data models are not identical, because ISO 25964 must provide for the needs of all sorts of thesauri (whether for Web use or for other applications) while SKOS [29] must provide for all sorts of KOS (including classification schemes and many others that do not comply with ISO 25964). Despite the differences, however, there is good alignment, making it possible to develop a set of correspondences between components of the data models. Where the basic SKOS data model lacks a construct that corresponds to a feature of the ISO 25964 model, the SKOS-XL [30] model has been used, supplemented by additional proposals where necessary. Care has been taken to avoid any incompatibility with another ongoing project to align SKOS with MADS [27].
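The following sketch shows one way mappings might be written out using SKOS mapping properties in Turtle syntax. The SKOS properties used (skos:exactMatch, skos:closeMatch, skos:broadMatch, skos:narrowMatch, skos:relatedMatch) are part of the W3C SKOS vocabulary. The pairing of each property with a mapping type label is a simplified assumption made for this example, and the example.org URIs and function names are hypothetical. The documented correspondence table remains the authoritative source.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Assumed pairing of mapping type labels with SKOS mapping properties.
SKOS_PROPERTY = {
    "exact_equivalence": "skos:exactMatch",
    "inexact_equivalence": "skos:closeMatch",
    "broader": "skos:broadMatch",
    "narrower": "skos:narrowMatch",
    "related": "skos:relatedMatch",
}


def mapping_to_turtle(source_uri: str, mapping_type: str, target_uri: str) -> str:
    """Render one mapping as a single Turtle triple."""
    prop = SKOS_PROPERTY[mapping_type]  # raises KeyError for unknown types
    return f"<{source_uri}> {prop} <{target_uri}> ."


def to_turtle(mappings: list[tuple[str, str, str]]) -> str:
    """Render a list of (source, type, target) mappings as a Turtle document."""
    lines = [f"@prefix skos: <{SKOS_NS}> ."]
    lines += [mapping_to_turtle(s, t, o) for s, t, o in mappings]
    return "\n".join(lines)


# Hypothetical concept URIs in two vocabularies.
turtle = to_turtle([
    ("http://example.org/a/cars", "exact_equivalence", "http://example.org/b/automobiles"),
    ("http://example.org/a/cars", "broader", "http://example.org/b/vehicles"),
])
print(turtle)
```

Building the triples as plain strings keeps the sketch free of third-party dependencies. A production system would more likely use an RDF library to handle escaping and validation.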

Continuing work and discussion

Based on the documented correspondence table, an RDF schema that provides a machine-readable version of these mappings, as well as of the elements from the ISO 25964 model, is available at http://purl.org/iso25964/skos-thes.

The ISO 25964info interest group provides a forum for discussing the development work and any issues that arise. Anyone can subscribe and view archives. See details at www.niso.org/lists/25964info/.

Committee Roster

Members

Xiaomi An

No. 9 Madian East Road, Haidian District Beijing, 100088 China
Standardization Administration of China

Daniel Kless

The University of Melbourne Department of Information Systems

Marianne Lykke

e-Learning Lab - Center for User-driven Innovation, Learning and Design - Department of Communication

Sun Wei

Institute of Scientific & Technological Information of China (ISTIC)