.. raw:: html

OntoLearner is a modular, open-source Python framework purpose-built for modern ontology learning (OL)—the semi-automatic construction and enrichment of ontologies from unstructured sources—powered by Large Language Models (LLMs). Rooted in decades of research in semantic web and NLP, OntoLearner integrates the breadth of existing ontology repositories with the reasoning and generative capabilities of state-of-the-art foundation models. Traditional OL systems often struggled with inconsistency, poor vocabulary alignment, and domain fragmentation. OntoLearner addresses these challenges by combining structured access to curated ontologies with LLM-driven enrichment workflows—enabling scalable, cross-domain ontology development grounded in best practices and FAIR principles. Unlike general-purpose NLP or embedding libraries, OntoLearner is designed specifically for ontology engineering and OL research. It offers: .. raw:: html

✅ Cross-domain coverage with leading repositories like BioPortal, OBO Foundry, OLS, LOV, and FAIRsharing.
🤖 LLM-assisted modeling for tasks such as term suggestion, concept typing, taxonomy induction, relation extraction, and ontology enrichment.
🧠 Benchmarking tools to evaluate, compare, and validate LLM-based methods for OL using standardized datasets and metrics.
🔁 Machine-readable ontologies hosted on Hugging Face, optimized for integration into generative AI pipelines with full support for versioning, streaming, and metadata inspection.
🔧 Modular and extensible architecture that seamlessly integrate with existing ontology development environments.

A wide selection of over `200 ontologies `_ are available for immediate use on 🤗 Hugging Face. OntoLearner was created by `Scientific Knowledge Organization (SciKnowOrg group) `_ at `Technische Informationsbibliothek (TIB) `_. Don't hesitate to open an issue on the `OntoLearner repository `_ if something is broken or if you have further questions. Usage ===== .. seealso:: See the `Quickstart `_ for more quick information on how to use OntoLearner. Working with OntoLearner is straightforward: .. sidebar:: Installation You can install *ontolearner* using pip: .. code-block:: python pip install -U ontolearner We recommend **Python 3.10+** and **PyTorch 2.4.0+**. See `installation `_ for further installation options. .. tab:: Ontologizer Module .. code-block:: python from ontolearner import AgrO # 1. Initialize an ontologizer from OntoLearner ontology = AgrO() # 2. Load the ontology automatically from Hugging Face ontology.load() # 3. Extract the learning task dataset data = ontology.extract() print(ontology) # outputs: # ontology_id: AgrO # ontology_full_name: Agronomy Ontology (AgrO) # domain: Agriculture # category: Agronomy # version: 1.0 # last_updated: 2022-11-02 # creator: The Crop Ontology Consortium # license: Creative Commons 4.0 # format: RDF # download_url: https://agroportal.lirmm.fr/ontologies/AGRO?p=summary .. tab:: Learner Module .. code-block:: python from ontolearner import LearnerPipeline, AgrO, train_test_split # Load the AgrO ontology ontology = AgrO() ontology.load() # Extract term-typing instances and split into train and test sets train_data, test_data = train_test_split( ontology.extract(), test_size=0.2, random_state=42 ) # RAG can be configured either by passing both IDs (shown here), # or by passing a prebuilt `rag=` learner object. pipeline = LearnerPipeline( retriever_id='sentence-transformers/all-MiniLM-L6-v2', llm_id='Qwen/Qwen2.5-0.5B-Instruct', hf_token='...', batch_size=32, top_k=5 ) # Run the pipeline: training, prediction, and evaluation in one call outputs = pipeline( train_data=train_data, test_data=test_data, evaluate=True, # Compute metrics like precision, recall, and F1 task='term-typing' # Specifies the task ) # Print final evaluation metrics print("Metrics:", outputs['metrics']) # Print the total time taken for the full pipeline execution print("Elapsed time:", outputs['elapsed_time']) # Print all outputs print(outputs) Citing ======= If you find this repository helpful, feel free to cite our publication `LLMs4OL: Large language models for ontology learning `_: .. code-block:: bibtex @inproceedings{babaei2023llms4ol, title={LLMs4OL: Large language models for ontology learning}, author={Babaei Giglou, Hamed and D’Souza, Jennifer and Auer, S{\"o}ren}, booktitle={International Semantic Web Conference}, pages={408--427}, year={2023}, organization={Springer} } or GitHub repository: .. code-block:: bibtex @software{babaei_giglou_2025_15399783, author = {Babaei Giglou, Hamed and D'Souza, Jennifer and Aioanei, Andrei and Mihindukulasooriya, Nandana and Auer, Sören}, title = {OntoLearner: A Modular Python Library for Ontology Learning with LLMs}, month = may, year = 2025, publisher = {Zenodo}, version = {v1.0.1}, doi = {10.5281/zenodo.15399783}, url = {https://doi.org/10.5281/zenodo.15399783}, } .. toctree:: :maxdepth: 1 :caption: Getting Started :hidden: installation quickstart .. toctree:: :maxdepth: 1 :caption: Ontologizer :hidden: ontologizer/ontology_modularization ontologizer/ontology_hosting ontologizer/new_ontologies ontologizer/metadata ontologizer/metrics .. toctree:: :maxdepth: 1 :caption: Learning Tasks :hidden: learning_tasks/learning_tasks learning_tasks/llms4ol learning_tasks/text2onto .. toctree:: :maxdepth: 4 :caption: Learner Models :hidden: learners/llm learners/retrieval learners/rag learners/llms4ol .. toctree:: :maxdepth: 4 :caption: Benchmarking :hidden: benchmarking/benchmark .. toctree:: :maxdepth: 1 :caption: Package Reference :glob: :hidden: package_reference/ontologizer package_reference/text2onto package_reference/learner package_reference/data_structure package_reference/evaluation package_reference/tools package_reference/utils package_reference/pipeline