Metadata
=============================

.. note::

    OntoLearner Metadata is created automatically on GitHub under the ``metadata/`` directory. For ``ontolearner > 1.4.0``, it is also available for download under Releases with each release.

.. hint::

    The metadata release is fully automated through CI/CD, so it is regenerated with each PyPI release.

.. sidebar:: OntoLearner Metadata Exporter Features

    - Generates Dublin Core metadata for each ontology in the library
    - Creates a top-level ``Collection`` resource for OntoLearner
    - Supports RDF/XML serialization in a clean, human-readable format
    - Uses a custom ``ontologizer`` namespace for ontology-specific resources

The ``OntoLearnerMetadataExporter`` is a utility class for generating **Dublin Core (DCMI) metadata** for all ontologies benchmarked in the OntoLearner library. It collects essential metadata: the ontology title and description, creator or authors, license information, format, version, last-updated date, domain and category, and download URL. Additionally, it generates a **top-level collection resource** that describes the entire OntoLearner benchmarking suite.

The output is a **pretty-printed RDF/XML file** compatible with standard semantic web tools and parsers.

**Example RDF structure:**

.. code-block:: xml

    OntoLearner Benchmark Ontologies
    This Dublin Core metadata collection describes ontologies benchmarked in OntoLearner. It includes information such as title, creator, format, license, and version.
    OntoLearner Team
    MIT License
    1.4.0

    NCIt
    NCI Thesaurus (NCIt)
    NCI Thesaurus (NCIt) is a reference terminology that includes broad coverage of the cancer domain...
    OWL
    2023-10-19
    Creative Commons 4.0
    https://terminology.tib.eu/ts/ontologies/NCIT
    Medicine
    Cancer, Oncology
    24.04e

Properties
-------------------------------------

The following table summarizes the key **Dublin Core metadata properties** captured for each ontology in OntoLearner.
It provides a quick overview of the ontology’s identifier, title, description, authorship, format, license, domain, and version information, helping users understand and reference the ontologies consistently.

.. list-table:: **OntoLearner Metadata Properties**
   :header-rows: 1
   :widths: 40 40 40

   * - Property
     - Example
     - Description
   * - ``dc:identifier``
     - NCIt
     - Ontology ID
   * - ``dcterms:title``
     - NCI Thesaurus (NCIt)
     - Ontology full name
   * - ``dcterms:description``
     - NCI Thesaurus (NCIt) is a reference terminology that includes broad coverage of the cancer domain...
     - Detailed ontology description
   * - ``dcterms:creator``
     - NCI
     - Creator / author
   * - ``dcterms:format``
     - OWL
     - Ontology format
   * - ``dcterms:date``
     - 2023-10-19
     - Last updated
   * - ``dcterms:license``
     - Creative Commons 4.0
     - License information
   * - ``dcterms:source``
     - https://terminology.tib.eu/ts/ontologies/NCIT
     - Download or reference URL
   * - ``dcterms:subject``
     - Medicine
     - Domain or category
   * - ``dcterms:hasVersion``
     - 24.04e
     - Ontology version

The following block shows the benchmark collection information. Here, ``dcterms:hasVersion`` gives the library version with which the metadata was released.

.. code-block:: xml

    OntoLearner Benchmark Ontologies
    This Dublin Core metadata collection describes ontologies benchmarked in OntoLearner. It includes information such as title, creator, format, license, and version.
    OntoLearner Team
    MIT License
    1.4.0

Exporter
--------------------

``OntoLearnerMetadataExporter`` is included in the OntoLearner library, so you can generate the metadata and store it locally.

.. code-block:: python

    from ontolearner import OntoLearnerMetadataExporter

    # Initialize exporter
    exporter = OntoLearnerMetadataExporter()

    # Export metadata to RDF/XML
    exporter.export("ontolearner-metadata.rdf")

The above code outputs:

- **File:** ``ontolearner-metadata.rdf``
- **Format:** Pretty-printed RDF/XML
- **Content:** metadata for each ontology

The top-level collection describes the entire OntoLearner benchmark, while each ontology entry includes detailed metadata using Dublin Core and DCTERMS properties.

.. hint::

    **Namespace Bindings:** The exporter uses the following namespaces in the RDF output:

    - ``dc``: ``http://purl.org/dc/elements/1.1/``
    - ``dcterms``: ``http://purl.org/dc/terms/``
    - ``ontologizer``: ``https://ontolearner.readthedocs.io/ontologizer/ontology_modularization.html#``
    - ``rdf``: ``http://www.w3.org/1999/02/22-rdf-syntax-ns#``

.. note::

    - The **Collection resource** always appears first in the RDF/XML output.
    - Individual ontologies are serialized as ``ontologizer:Ontology`` resources.
    - The ``export()`` method automatically reads the OntoLearner library version from the ``VERSION`` file.
    - The RDF/XML output is compatible with standard semantic web tools like **Protégé**, **RDFLib**, and **Apache Jena**.
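
Because the export is plain RDF/XML, it can also be inspected without any semantic web tooling. The following is a minimal sketch using only the Python standard library. It relies on the ``dc``, ``dcterms``, and ``rdf`` namespace URIs listed above; the ``SAMPLE`` document, the ``rdf:Description`` wrapper, and the ``read_records`` helper are assumptions made for illustration and are not taken from a real export, whose element layout may differ.

.. code-block:: python

    import xml.etree.ElementTree as ET

    # Namespace URIs as listed in the exporter's namespace bindings.
    NS = {
        "dc": "http://purl.org/dc/elements/1.1/",
        "dcterms": "http://purl.org/dc/terms/",
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    }

    # Hypothetical sample: the element layout is assumed, not copied from a real export.
    SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:dcterms="http://purl.org/dc/terms/">
      <rdf:Description>
        <dc:identifier>NCIt</dc:identifier>
        <dcterms:title>NCI Thesaurus (NCIt)</dcterms:title>
        <dcterms:format>OWL</dcterms:format>
        <dcterms:hasVersion>24.04e</dcterms:hasVersion>
      </rdf:Description>
    </rdf:RDF>
    """


    def read_records(root):
        """Return one dict per top-level resource that carries a dc:identifier."""
        records = []
        for node in root:
            if node.find("dc:identifier", NS) is None:
                continue  # skip resources without an identifier
            record = {}
            for child in node:
                # Strip the "{namespace}" prefix ElementTree puts on tag names.
                key = child.tag.rsplit("}", 1)[-1]
                # Use the literal text, or fall back to an rdf:resource attribute.
                text = (child.text or "").strip()
                record[key] = text or child.get(f"{{{NS['rdf']}}}resource", "")
            records.append(record)
        return records


    records = read_records(ET.fromstring(SAMPLE))
    print(records[0]["identifier"], records[0]["hasVersion"])

To try this on an exported file, replace ``ET.fromstring(SAMPLE)`` with ``ET.parse("ontolearner-metadata.rdf").getroot()``. The helper only looks for child elements of each top-level resource, so it does not depend on the wrapper element's name.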