Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merging TripleStore and TripleStroreKnowledgeBase, Refactoring and Documentation update #510

Merged
merged 17 commits into from
Feb 10, 2025

Conversation

alkidbaci
Copy link
Collaborator

@alkidbaci alkidbaci commented Jan 16, 2025

Refactoring KnowledgeBase class:

  • Removed arguments ontologymanager_factory, length_metric_factory, length_metric, individuals_cache_size, ind_set, backend_store. Reasons given below, most of them based on 'TODOs' set in the code.
    • ontologymanager_factory was not really used and ontology manager will be deprecated in owlapy.
    • length measurement should be disentangled from KB, related method removed from KB.
    • Caching should not be done by KB, it can be done optionally by the reasoner.
    • ind_set removed because it hinders us scaling large KGs.
  • KnowledgeBase no longer perform caching (related methods removed).
  • Redundant methods: all_individuals_set, get_leaf_concepts, get_least_general_named_concepts are removed.
  • encode_learning_problem disentangled from KB, implementation moved to encode_kb method of class PosNegLPStandard.
  • evaluate_concept disentangled from KB, this method is now moved to quality_funcs.py module
  • get_most_general_classes renamed to most_general_classes
  • added method are_owl_concept_disjoint
  • Redundant overloadings of init removed

Expanded AbstractKnowledgeBase:

  • Added abstract method that should be implemented by any inheriting class of base class AbstractKnowledgeBase. This are methods implemented in KnowledgeBase that weren't part of AbstractKnowledgeBase.

Refactoring of related modules:

  • Arguments of type KnowledgeBase set to AbstractKnowledgeBase where applicable (mainly on learning models)
  • Updated references of removed/changed methods of KB.

Merged TripleStore with TripleStoreKnowlegeBase #451

  • TripleStore now inherits from AbstractKnowledgeBase.
  • TripleStoreKnowledgeBase removed, TripleStore now represents both of them.
  • TripleStoreReasonerOntology removed. TripleStore now contains a reasoner and an ontology, respectively TripleStoreReasoner and TripleStoreOntology.
  • Between conflicting methods (methods with similar name/function) only one is chosen. Some are marked << TO BE DECIDED >> because implementation is slightly different.
  • Implemented all other abstract methods from AbstractKnowledgeBase. Some are marked with << REVIEW >>, the reason explained via comments.
  • Added new method data_property_ranges for TripleStoreReasoner.
  • Code format refactoring, some code was stretching to much (in multiple lines), now folded to fewer lines.

Validation of changes

  • Github action is passing meaning that KB related tests are passing. However, concept learners using KB may potentially work slower because caching is no longer used.
  • Since we don't have tests set for TripleStore, I manually tested different concept learners. Using the script examples/concept_learning_via_triplestore_example.py anyone can do so. Concept learners work fine with a few exceptions:
    • Drill not working on TripleStore because of the issue mentioned at Bug with confusion matrix construction for Drill #509.
    • EvoLearner not working on TripleStore nor KnowledgeBase but the error is unrelated to the changes of this PR.
    • A timeout issue occurred for CELOE when using TripleStore on Mutagenesis. I believe this was still the case even before the changes when using TripleStoreKnowledgeBase and is probably related to the large amount of calls CELOE algorithm makes because of the one-by-one refinement procedure.

Documentation Update

  • Docs became a bit outdated after the above mentioned changes so, they got updated as well. Beside minor updates on each guide, a restructuring of the content was made.
    • TripleStore documentation moved from Concept Learner guide to Knowledge Bases guide.
    • Evaluating a concept (manually constructed concept) is moved from Knowledge Bases guide to a new guide called Evaluating a Class Expression.

@Demirrr Demirrr self-requested a review January 20, 2025 11:13
@alkidbaci
Copy link
Collaborator Author

alkidbaci commented Jan 20, 2025

I have marked some methods on the triple_store.py module with the labels: <<TO BE DECIDED>> and <<REVIEW>>.
@Demirrr for what we communicated earlier, these methods require most of the attention.

@alkidbaci
Copy link
Collaborator Author

After merge with Jean's changes (PR #511), this PR was tested locally because https://files.dice-research.org/ is currently down.
Every test was passing with with the the following errors which are now fixed (details documented in the end of this comment for future reference):

  • An error with EvoLearner (test_evolearner.py, test_example_concept_learning_evaluation.py)
  • An error with Drill (test_drill.py)

@Demirrr I would like to merge this PR but the merging is blocked because it requires 1 approving review.


Error documentation

  • The issue with EvoLearner was emerging from the deap library. During an process within this library the built-in eval method was used. It was noticed that when the code argument used in that method contained operation such as UNION or INTERSECTION caused the error TypeError: 'NoneType' object is not subscriptable. It was not clear at first what was the real issue but apparently enumerations were used to create the functions used in the globals of eval method and and when an actual code had for example a UNION operation (specified by an enumeration) it was not mapped to its value. So basically there was a mismatch on how the enumeration were handled in the code arg and in the global arg. The fix was to use the value of the enumeration object to create the global functions so that they could be mapped correctly. The root cause of this error was possibly from a recent version of deap or how things are rendered in python 3.11 (no further time was spent to investigate this since the error was already fixed).
  • The issue with Drill was emerging from owlapy. During its execution, Drill asked for disjoint classes of OWLNothing and this resulted in an error because this scenario was not handled by the StructuralReasoner. Now this scenario, together with some other scenarios (e.g. when asking for disjoint properties, etc.) are now supported by StructuralReasoner. Therefore, the fix for this error is provided in owlapy not here.

Copy link
Member

@Demirrr Demirrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alkidbaci thank you for the work!

@Demirrr Demirrr merged commit 524d819 into develop Feb 10, 2025
0 of 2 checks passed
@github-actions github-actions bot temporarily deployed to pull request February 10, 2025 08:25 Inactive
@alkidbaci alkidbaci deleted the refactoring branch February 11, 2025 12:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants