Information and documentation -- Thesauri and interoperability with other vocabularies-- Part 1: Thesauri for information retrieval and Part 2: Interoperability with other vocabularies
ontology - a formal, explicit specification of a shared conceptualization. NOTE: This definition is attributable to Studer et al., extending an earlier definition by Gruber, and is adopted in this part of ISO 25964 because it is widely accepted in the ontology development community. An ontology typically includes definitions of concepts and specified relationshps between them, set out in a formal way so that a machine can use them for reasoning. This definition excludes thesauri, classification schemes and other structured vocabularies described in this part of ISO 25964, even though these are sometimes described as "lightweight ontologies".
thesaurus - Controlled and structured vocabulary in which concepts are represented by terms, organized so that relationships between concepts are made explicit, and preferred terms are accompanied by lead-in entries for synonyms or quasi-synonyms. NOTE: The purpose of a thesaurus is to guide both the indexer and the searcher to select the same preferred term or combination of preferred terms to represent a given subject. For this reason a thesaurus is optimized for human navigability and terminological coverage of a domain.
controlled vocabulary - prescribed list of terms, headings or codes, each representing a concept. NOTE: Controlled vocabularies are designed for applications in which it is useful to identify each concept with one consistent label, for example when classifying documents, indexing them and/or searching them. Thesauri, subject heading schemes and name authority lists are examples of controlled vocabularies.
Section 21.1.2 "Whereas the role of most of the vocabularies ... is to guide the selection of search/indexing terms, or the browsing of organized document collections, the purpose of ontologies in the context of retrievel is different. Ontologies are not designed for information retrieval by index terms or class notation, but for making assertions about individuals, e.g. about real persons or abstract things such as a process. ..."
Section 21.3 Structural comparison between thesauri and ontologies "One key difference is that, unlike thesauri, ontologies necessarily distinguish between classes and individuals, in order to enable reasoning and inferencing. ... The concepts of a thesaurus and the classes of an ontology represent meaning in two fundamentally different ways. Thesauri express the meaning of a concept through terms, supported by adjuncts such as a hierarchy, associated concepts, qualifiers, scope notes and/or a precise definition, all directed mainly to human users. Ontologies, in contrast, convey the meaning of classes through machine-readable membership conditions. ... The instance relationship used in some thesauri approximates to the class assertion used in ontologies. Likewise, the generic hierarchical relationship ... corresponds to the subclass axiom in ontologies. However, in practice few thesauri make the distinction between generic, whole-part and instance relationships. The undifferentiated hierarchical relationship most commonly found in thesauri is inadequate for the reasoning functions of ontologies. Similarly the associative relationship is unsuited to an ontology, because it is used in a multitude of different situations and therefore is not semantically precise enough to enable inferencing."
Section 21.4.3 Complementary use of a thesaurus and an ontology "Current interest in 'Linked Data' has led to many projects ... that combine the reasoning power of ontologies with the retrieval capabiliites of knowledge organization systems such as thesauri. ... Examples of the following two scenarios are currently found, and in both of them Semantic Web standards such as SKOS and OWL are very often deployed. ...
- Complementary use of a thesaurus and an ontology may be mediated by a metadata element set (also known as a metadata schema). With this approach, the definitions and attributes of the metadata elements should be studied and a model of the application domain developed. An ontology should be developed, in which selected metadata elements are established as classes and/or properties. Typically, the range of certain metadata elements is constrained to the values in a particular controlled vocabulary, such as a thesaurus. This combination potentially allows logical inferencing at the level of the metadata elements, as well as supporting information retrieval via the terms and concepts of the controlled vocabularies. ..."
Section 4.2 "Establishing an appropriate preferred term to represent a particular concept is not always straightforward because a concept can often be expressed in more than one way. Furthermore, in ordinary discourse one term can have mre than one meaning. Vocabulary control is therefore essential, and thesauri are used to achieve this by the following two principal means.
a) Concepts and terms are deliberately restricted in scope to selected meanings. ... b) When the same concept can be expressed by two or more synonyms or quasi-synonyms in the same language, one of these terms is usually selected as the preferred term ...
One consequence of using the measures in a) and b) for vocabulary control is that the resultant language might not correspond to a user's preferences. The thesaurus has an important role in mediating between terms used in discourse and those that function effectively for information retrieval. To achieve the retrieval benefits, users need to accept a degree of artificiality in the controlled vocabulary ..."