Understanding SKOS and the importance of data quality

SKOS, short for Simple Knowledge Organization System, is what its name suggests: simple. It is also a fundamental pillar of effective knowledge organization.

Exploring the Simplicity and Essence of SKOS

SKOS has been designed to facilitate the sharing and linking of knowledge across the web. It harmonizes the structure and applications of various systems like thesauri, taxonomies, classification schemes, and subject heading systems. SKOS explicitly captures these similarities, promoting data and technology sharing across diverse applications.

The fundamental objective of SKOS is to make it easier for people and computers to share and interact with knowledge organization systems via the Semantic Web.

SKOS is built on RDF (Resource Description Framework): its data is expressed as triples, which can be written in any of the RDF syntaxes. SKOS is not a formal knowledge representation language. Instead, it focuses on organizing knowledge.
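To make the triple model more concrete, here is a minimal sketch that stores a few SKOS statements as plain Python tuples of (subject, predicate, object). The ex: identifiers and the objects helper are hypothetical and only serve the illustration; a real project would normally rely on an RDF library rather than tuples.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# The ex: identifiers are hypothetical and used for illustration only.
triples = [
    ("ex:cats", "rdf:type", "skos:Concept"),
    ("ex:cats", "skos:prefLabel", '"cats"@en'),
    ("ex:cats", "skos:inScheme", "ex:Animal_Concept_Scheme"),
]


def objects(subject, predicate):
    """Return every object stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects("ex:cats", "skos:prefLabel"))
```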

SKOS model

In basic SKOS, a Knowledge Organisation System (KOS) is modelled as a set of concepts, which can stand alone or be grouped into concept schemes. Each resource (concept or concept scheme) is identified by a unique URI (Uniform Resource Identifier) and is described by its relationships, properties and documentation.

Diagram: a 'concept' is defined by three elements: its relations, its properties and its documentation.

The different classes for concept and concept scheme:

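Concepts
Fundamental units of meaning or ideas within a knowledge organization system
  +-- skos:Concept

Typing a resource as skos:Concept asserts that it is a concept.

Concept schemes
Aggregation of one or more SKOS concepts
  +-- skos:ConceptScheme
  +-- skos:inScheme

The skos:inScheme property asserts that a given concept is part of a concept scheme.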

Concepts are labelled by lexical strings in one or more natural languages. The expressions used to refer to them in the natural language are their labels.

The meaning of a concept is defined not only by its labels but also by its links with other concepts, the semantic relations.

Diagram: Concept A is defined by its relationships (broader: term B; narrower: term C), its properties (prefLabel: Label 1; altLabel: Label 2) and its documentation (scopeNote: Notes).

The different classes for label and semantic relations:

Labels
Encompasses the various terms used to reference a concept
  +-- skos:prefLabel
  +-- skos:altLabel
  +-- skos:hiddenLabel

A resource can have multiple labels, but at most one preferred label per language. Other labels are alternative labels. Hidden labels are not meant to be displayed; they support text indexing and search, for example by capturing common misspellings.
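The rule that a concept carries at most one preferred label per language is easy to check mechanically. The sketch below assumes labels are held as (property, text, language) tuples, which is a hypothetical in-memory layout; the function name is likewise made up for this example.

```python
from collections import Counter


def duplicate_pref_label_languages(labels):
    """Return the languages that carry more than one skos:prefLabel.

    `labels` is a list of (property, text, language) tuples, a
    hypothetical layout chosen for this sketch.
    """
    counts = Counter(
        lang for prop, _text, lang in labels if prop == "skos:prefLabel"
    )
    return sorted(lang for lang, n in counts.items() if n > 1)


labels = [
    ("skos:prefLabel", "cats", "en"),
    ("skos:prefLabel", "chats", "fr"),
    ("skos:altLabel", "felines", "en"),
    ("skos:prefLabel", "kitties", "en"),  # second English prefLabel: a violation
]
print(duplicate_pref_label_languages(labels))  # ['en']
```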

Semantic Relations
Refers to the various relationships and links between a concept and other concepts
  +-- skos:broader
  +-- skos:narrower
  +-- skos:related

Broader indicates that one concept encompasses a wider, more general scope than another concept, while narrower asserts the opposite, where one concept has a more specific meaning than another. The related relationship indicates a connection or similarity between two concepts, without strictly implying a hierarchical relationship between them.
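As an illustration of how a hierarchy built from broader links can be navigated, the following sketch walks upward from a concept and collects all of its more general concepts. The dictionary layout, the ancestors function and the ex:animals concept are assumptions made for this example. Note that SKOS itself does not declare skos:broader to be transitive; the walk below is an application-level choice.

```python
def ancestors(concept, broader):
    """Collect every concept reachable by following skos:broader links upward.

    `broader` maps a concept to the list of its direct broader concepts.
    """
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in broader.get(current, ()):
            if parent not in seen:  # also protects against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical hierarchy: domestic cats -> cats -> animals.
broader = {
    "ex:domestic_cats": ["ex:cats"],
    "ex:cats": ["ex:animals"],
}
print(sorted(ancestors("ex:domestic_cats", broader)))  # ['ex:animals', 'ex:cats']
```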

Concepts can be grouped into collections, and schemes can be interconnected through various relationships. SKOS provides the flexibility to map concepts across different schemes using hierarchical, associative, or precise equivalence relationships.

The different classes for collection:

Collections
Groupings of concepts
  +-- skos:Collection
  +-- skos:member
  +-- skos:memberList

Concepts can be grouped together into collections, typically when they share a common characteristic. A collection is linked to each of its members with the skos:member property. If the order of the members matters, an ordered collection (skos:OrderedCollection) points to the list of its members with the skos:memberList property.
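The difference between the two properties can be shown with a small sketch that renders a collection as Turtle text. In SKOS, skos:member links a collection to each member, while skos:memberList is used with skos:OrderedCollection and points to an RDF list. The function name and all ex: identifiers are hypothetical.

```python
def collection_to_turtle(uri, label, members, ordered=False):
    """Render a SKOS collection as a Turtle snippet (prefixes not included)."""
    if ordered:
        # Turtle writes an RDF list as a parenthesised sequence; order is kept.
        cls = "skos:OrderedCollection"
        body = f"skos:memberList ( {' '.join(members)} )"
    else:
        # Unordered: one skos:member value per member, sorted for stable output.
        cls = "skos:Collection"
        body = "skos:member " + ", ".join(sorted(members))
    return f'{uri} a {cls} ;\n    skos:prefLabel "{label}"@en ;\n    {body} .'


print(collection_to_turtle("ex:felids", "Felids", ["ex:wildcats", "ex:cats"]))
print(collection_to_turtle("ex:life_stages", "Life stages",
                           ["ex:kitten", "ex:adult_cat"], ordered=True))
```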

SKOS provides several documentation properties for adding informal, human-readable descriptions and notes to a concept.

The different classes for documentation:

Documentation
Non-formal information concerning the meaning of relationships, concepts and labels, along with their evolution over time.
  +-- skos:note
  +-- skos:scopeNote
  +-- skos:definition
  +-- skos:example
  +-- skos:historyNote
  +-- skos:editorialNote
  +-- skos:changeNote

skos:definition provides a complete explanation of a concept’s intended meaning. skos:scopeNote offers a brief explanation of how the concept is meant to be used. skos:example provides illustrative examples. skos:historyNote records historical information about the evolution of a concept. skos:editorialNote holds internal management details. skos:changeNote records updates or changes. skos:note is the general property for any other kind of note or comment.

SKOS allows the establishment of semantic relationships between concepts, which can be hierarchical or associative.

Through its mapping properties, SKOS enables the use and linking of concepts from different systems. Different types of mapping are used, depending on the level of accuracy of the correspondence between the concepts.

The different classes of the mapping properties:

Mapping properties
Mapping between concepts included in different conceptual schemes
  +-- skos:exactMatch
  +-- skos:closeMatch
  +-- skos:broadMatch
  +-- skos:narrowMatch
  +-- skos:relatedMatch

A skos:exactMatch assertion signifies a high degree of confidence that two concepts can be used interchangeably. The skos:closeMatch property indicates that two concepts are similar enough to be used interchangeably in some applications, but not all. Additionally, when dealing with concepts from different concept schemes, you can establish matches using properties analogous to the semantic relationships: skos:broadMatch links a concept to a more general concept in another scheme; skos:narrowMatch links it to a more specific one; and skos:relatedMatch indicates that two concepts are related in some way without the relationship being hierarchical.
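Mapping statements can also be read in the opposite direction: in SKOS, skos:broadMatch and skos:narrowMatch are inverses of each other, while skos:exactMatch, skos:closeMatch and skos:relatedMatch are symmetric. The sketch below derives the reverse statement for a mapping; the a: and b: prefixes stand for two hypothetical concept schemes, and the helper name is an assumption.

```python
# Property to use when a mapping is read in the opposite direction.
# broadMatch and narrowMatch are inverses; the other three are symmetric.
INVERSE = {
    "skos:broadMatch": "skos:narrowMatch",
    "skos:narrowMatch": "skos:broadMatch",
    "skos:exactMatch": "skos:exactMatch",
    "skos:closeMatch": "skos:closeMatch",
    "skos:relatedMatch": "skos:relatedMatch",
}


def inverse_mapping(mapping):
    """Return the statement implied in the opposite direction."""
    subject, prop, obj = mapping
    return (obj, INVERSE[prop], subject)


# Hypothetical concepts taken from two different schemes, a: and b:.
mapping = ("a:cats", "skos:broadMatch", "b:animals")
print(inverse_mapping(mapping))  # ('b:animals', 'skos:narrowMatch', 'a:cats')
```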

What does a SKOS model look like?

Let’s consider an example within a concept scheme about animals containing three concepts: cats, wildcats and domestic cats.

The nature of the relationship between “cats” and “wildcats” in a knowledge organization system like SKOS can vary based on the specific classification or taxonomy employed.

In the following example, we have chosen one such set of relationships ourselves and illustrate it with a simplified textual representation, using Turtle-style prefixes to keep the model compact. This gives us an outline of the following conceptual schema:

  • A concept scheme <ex:Animal_Concept_Scheme> representing the broader context of “animals”.
  • A top-level concept <ex:cats>
    • with <skos:prefLabel>: the preferred label in English, “cats”
    • with <skos:altLabel>: an alternative label in English, “felines”
  • A related concept <ex:wildcats>
  • A narrower concept <ex:domestic_cats>
    • with <skos:prefLabel>: the preferred label in English, “house cats”
    • with <skos:scopeNote>: a human-readable note explaining that this term is “used only for domestic cats”
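The outline above can be turned into actual Turtle with a few lines of code. The sketch below is one possible reading of it, not a definitive model: the ex: namespace IRI is hypothetical, the domestic cats identifier is written ex:domestic_cats because a prefixed name cannot contain a space, and the choice to attach the related and narrower links to ex:cats is our interpretation of the outline.

```python
PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ex": "http://example.org/animals/",  # hypothetical namespace
}

# One (subject, predicate, object) tuple per statement of the outline.
triples = [
    ("ex:Animal_Concept_Scheme", "a", "skos:ConceptScheme"),
    ("ex:Animal_Concept_Scheme", "skos:hasTopConcept", "ex:cats"),
    ("ex:cats", "a", "skos:Concept"),
    ("ex:cats", "skos:inScheme", "ex:Animal_Concept_Scheme"),
    ("ex:cats", "skos:prefLabel", '"cats"@en'),
    ("ex:cats", "skos:altLabel", '"felines"@en'),
    ("ex:cats", "skos:related", "ex:wildcats"),
    ("ex:cats", "skos:narrower", "ex:domestic_cats"),
    ("ex:wildcats", "a", "skos:Concept"),
    ("ex:domestic_cats", "a", "skos:Concept"),
    ("ex:domestic_cats", "skos:prefLabel", '"house cats"@en'),
    ("ex:domestic_cats", "skos:scopeNote", '"used only for domestic cats"@en'),
]


def to_turtle(prefixes, triples):
    """Serialize the triples as simple, one-statement-per-line Turtle."""
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in prefixes.items()]
    lines.append("")
    lines += [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(lines)


print(to_turtle(PREFIXES, triples))
```

Writing one statement per line is more verbose than idiomatic Turtle, which would group statements by subject, but it keeps the serializer trivial.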

Examples of resources based on SKOS

Although SKOS is a simple framework, it is ideal for organizing and sharing knowledge where the requirements of a project or organization match its simplicity. SKOS is often used by museums, libraries, government agencies, and other institutions, providing several key benefits:

Simplifying Knowledge Management: SKOS provides a straightforward way to model hierarchical and associative relationships between concepts, making it easier to structure and manage knowledge.

Enhancing Data Interoperability: SKOS is designed with interoperability in mind, enabling seamless integration and sharing of knowledge organization systems across different applications and platforms.

Semantic Web Compatibility: SKOS aligns well with the principles of the Semantic Web, allowing organizations to participate in the growing ecosystem of linked data and semantic technologies.

The Getty Research Institute uses SKOS to manage and share controlled vocabularies for art and architecture. They’ve created a custom SKOS vocabulary called the Getty Vocabularies, which includes over 1.5 million records covering artists, artworks, and other concepts. The Getty Vocabularies are used by museums, libraries, and other cultural institutions around the world.

The European Union Publications Office uses SKOS to manage the EuroVoc thesaurus in 24 languages, covering the activities and policies of the European Union. EuroVoc is used by EU institutions and other organizations to index and retrieve information about EU policies and activities.

AGROVOC is a multilingual controlled vocabulary, endorsed by the Food and Agriculture Organization of the United Nations (FAO), covering the terminology in agriculture, forestry, fisheries, food, and related domains. It employs SKOS to represent concepts and facilitates interoperability with various knowledge systems and integration into the Semantic Web and Linked Data.

AGROVOC is an essential component of the AgroPortal project, dedicated to agricultural ontologies and vocabularies.

ESCO (European Skills, Competences, Qualifications and Occupations) is a multilingual classification system that uses SKOS to represent and manage its extensive dataset of skills, competences, qualifications, and occupations. It works as a dictionary in 27 languages, describing, identifying and classifying professional occupations and skills relevant to the EU labour market, education and training. It contains over 3,000 occupations and 13,800 skills.

Challenges and Considerations

It’s important to acknowledge that while the SKOS system is straightforward, it does come with its set of intricacies:

  • Semantic Complexity: In its simplicity, SKOS can lack a formal semantic specification, sometimes leading to interpretation ambiguities.
  • Limited Expressivity: SKOS is designed for simple hierarchical and associative relationships. Complex vocabularies may require more expressive ontologies.
  • Inferential Constraints: Unlike advanced ontologies, SKOS does not offer extensive automated inference and reasoning capabilities.
  • Version Control: SKOS doesn’t have built-in versioning mechanisms, requiring external version control practices.
  • Interoperability Challenges: While designed for interoperability, achieving true interoperability often relies on consistent data modelling, URI management, and alignment with Linked Data principles.
  • Scalability Concerns: Managing extensive and complex vocabularies can present scalability challenges in certain SKOS editing environments.
  • Maintenance and Sustainability: Like any knowledge organisation system, SKOS vocabularies require ongoing maintenance, updates and governance to ensure continued relevance and accuracy.

The Role of Data Quality in SKOS Management

Data quality is a crucial aspect when working with SKOS or any other knowledge organization system.

The quality of your SKOS vocabulary directly influences its effectiveness, usability, and even the overall success of a data project.

Data quality is not only about fixing errors; it’s also about proactively preventing them. At Hanami, we’ve integrated the SHACL standard into our user-friendly tool to tackle the nuances of data quality head-on.

Our solution goes beyond identifying issues; it allows you to maintain data integrity and accuracy.

To ensure your SKOS data remains pristine, Hanami can support you with the following:
– Maintaining terminology consistency.
– Verifying the accuracy of hierarchical and associative relationships.
– Implementing validation mechanisms to prevent data entry errors.
– Tracking changes in your SKOS vocabulary.
– Encouraging collaboration while upholding data quality practices.
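To give a feel for what such checks involve, here is a minimal sketch of two common quality checks on a small in-memory vocabulary: concepts without a preferred label, and broader links that point to a concept missing from the vocabulary. This is plain Python written for illustration only; it is neither Hanami’s implementation nor SHACL, and the data layout, function name and ex: identifiers are assumptions.

```python
def find_issues(concepts):
    """Return human-readable issues found in a small in-memory vocabulary.

    `concepts` maps a concept identifier to a dict with two optional keys:
    "prefLabel" (dict of language -> text) and "broader" (list of identifiers).
    """
    issues = []
    for uri, data in concepts.items():
        if not data.get("prefLabel"):
            issues.append(f"{uri}: missing skos:prefLabel")
        for parent in data.get("broader", []):
            if parent not in concepts:
                issues.append(
                    f"{uri}: skos:broader points to unknown concept {parent}"
                )
    return issues


# Hypothetical vocabulary with two deliberate problems on ex:wildcats.
vocabulary = {
    "ex:cats": {"prefLabel": {"en": "cats"}},
    "ex:domestic_cats": {"prefLabel": {"en": "house cats"}, "broader": ["ex:cats"]},
    "ex:wildcats": {"broader": ["ex:felids"]},  # no label, dangling parent
}
for issue in find_issues(vocabulary):
    print(issue)
```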

Hanami SKOS Edition: Addressing Data Quality Challenges

Data quality is an ongoing effort that requires regular assessment of the health of your data.

Hanami SKOS Edition offers a holistic solution that combines all the essential data quality features we offer as standard in the Hanami application, while also being free and ready to download.

Here’s how Hanami SKOS Edition can transform your SKOS data quality management, offering you:

Comprehensive Validation Report: Access an in-depth Validation Report to analyze violations against defined constraints, enabling continuous quality assessment.

Data Quality Scoring System: Benefit from a unique scoring system that helps you prioritize improvements.

Intuitive User Interface: Our interface makes it easy to identify and resolve problems, simplifying complex data quality tasks.

Dual Application of SHACL: Hanami leverages SHACL for both data model description and validation, ensuring unwavering data quality.

This comprehensive approach ensures that your data not only complies with the defined constraints but also aligns with the SKOS data model, resulting in more consistent and reliable data quality.

Data quality is a multi-dimensional task. Accuracy, completeness, timeliness, consistency, and relevancy are just a few of the many aspects that need to be considered. Achieving good quality in one dimension doesn’t guarantee good quality in others. Regular monitoring and maintenance are critical to preventing the degradation of data quality over time. And Hanami is here to help you every step of the way.

SKOS Enhanced by Data Quality

In the landscape of knowledge organization and semantic technologies, SKOS simplifies the complex. But to fully leverage its potential, it is essential to address data quality concerns as soon as possible.

Hanami SKOS Edition seamlessly integrates the power of SHACL to empower you with comprehensive data quality management, ensuring that your SKOS-compliant data is not just simple but of the highest quality.

From comprehensive validation reports to user-friendly interfaces, Hanami is your partner in maintaining the integrity and precision of your SKOS data.

Elevate your knowledge organization with Hanami SKOS Edition today, free and ready to download.
