[New metadata element]: curation_rule #166

Closed
matentzn opened this issue Apr 18, 2022 · 11 comments · Fixed by #258

Comments

@matentzn
Collaborator

Element id (e.g. creator_id, mapping_tool_version):
(Must be lower case and contain only letters and underscores.)

curation_rule
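The naming constraint in the form above (lower case, only letters and underscores) can be checked mechanically. This is a minimal sketch; the function name `is_valid_element_id` is hypothetical and not part of any SSSOM tooling.

```python
import re

# Pattern for the stated constraint: lower-case letters and underscores only.
ELEMENT_ID_PATTERN = re.compile(r"[a-z_]+")


def is_valid_element_id(element_id: str) -> bool:
    """Return True if the id uses only lower-case letters and underscores."""
    return ELEMENT_ID_PATTERN.fullmatch(element_id) is not None


print(is_valid_element_id("curation_rule"))
print(is_valid_element_id("CurationRule"))
```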

Value data type (e.g. URI, URL, text, xsd:boolean):

EntityReference (URI/CURIE)

Description

See #150 for wider context.

A curation rule is a (potentially) complex condition executed by a human curator that led to the establishment of a mapping. Curation rules often involve complex domain-specific considerations, which are hard to capture in an automated fashion.

Constraints

  • Should only be recorded in conjunction with mapping_justification: HumanCurated.
  • Should be recorded with a term from a controlled vocabulary
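
The two constraints above could be checked per mapping row roughly as follows. This is only a sketch under stated assumptions: the identifier `semapv:ManualMappingCuration` stands in for the "HumanCurated" justification named above, the function name is hypothetical, and "term from a controlled vocabulary" is approximated by a simple CURIE/URI shape test. Because both constraints are phrased as "should", the function returns a warning rather than raising an error.

```python
from typing import Mapping, Optional

# Assumption: identifier standing in for the issue's "HumanCurated" justification.
HUMAN_CURATED = "semapv:ManualMappingCuration"


def check_curation_rule(
    row: Mapping[str, str], human_curated: str = HUMAN_CURATED
) -> Optional[str]:
    """Return a warning message if the row breaks a constraint, else None."""
    rule = row.get("curation_rule", "").strip()
    if not rule:
        return None  # no curation rule recorded, nothing to check
    # Constraint 1: only record a rule alongside a human-curated justification.
    if row.get("mapping_justification") != human_curated:
        return "curation_rule recorded without a human-curated justification"
    # Constraint 2 (approximation): the value looks like a CURIE or URI, not free text.
    if ":" not in rule or " " in rule:
        return "curation_rule should reference a term, not free text"
    return None
```

Passing a different `human_curated` value lets a dataset use whatever identifier its justification vocabulary actually defines.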

Examples

mapping-commons/disease-mappings#16

@saubin78

Has this element been implemented? Will it be available in 1.0?

@matentzn
Collaborator Author

Unfortunately this was not planned for 1.0, but for 1.1. Can you outline why you would need it sooner? What use case are you thinking of using it for?

@matentzn matentzn added this to the 1.1.0 milestone Jun 16, 2022
@saubin78

Hi! No hurry.
We'll discuss this point next week around D2KAB project's use cases and come back to you with further elements. We'll see in particular if values from a controlled vocabulary would fit, not quite sure...

@matentzn
Collaborator Author

We have been curating some examples for these, but I don't think the vocabulary should be standardised - there is a wide space of curation rules, and many can get very specific for any given ontology. Here we are discussing some very concrete examples: mapping-commons/disease-mappings#16

@saubin78

What is the status of curation_rule in the current SSSOM specification? As far as I can see, it has not been introduced in the model. Is there any plan for this?

@saubin78

I am not 100% convinced by the statement "Should only be recorded in conjunction with mapping_justification: HumanCurated."
Wouldn't rules be relevant with e.g. semapv:BackgroundKnowledgeBasedMatching or semapv:LogicalReasoning?

@matentzn
Collaborator Author

Can you give 3-4 examples of specific curation rules you can think of?

@saubin78

saubin78 commented Feb 2, 2023

Hi,

I feel like "mapping rules" could be expressed in an ontology and used by an automatic mapping tool.
In our use case, we are working on manual mappings (so this doesn't really answer your question), but we could imagine expressing them as constraints or axioms in an ontology.

Currently, our curation rules are short texts. For now, they look like this:

Rule 3 is about abiotic stress and tolerance of the plant to metal toxicity.
metal + toxicity (WTO) <-> metal + tolerance  (CO321)

Example: 
Aluminium toxicity (WTO:0000450) <-> Aluminium tolerance (CO_321:0000079)

or this one, which has subrules:

Rule 1 is about biotic stresses. WTO defines general trait classes of response to pests and disease for WTO. CO_321 defines traits about the observable degree of affection. The observation may be done on the whole plant, or a subpart of it. 
We consider that the user of the information retrieval function, given a pathogen or a disease, would like to retrieve all data, independently of the way the affection is observed. As a consequence, the retrieval terms score, incidence, severity, response, index, coefficient, progress curve and resistance are considered similar.

1.1 Score
resistance + ’name of disease’ (WTO) <-> ’name of disease’ + score (CO321)

Example from CO_321 : Septoria score on leaf  (CO_321:1010036)
And from WTO : resistance to Septoria Leaf Blotch (WTO:0000554)

The score is a direct measure on the plant. It is usually a discrete number between 0 and 6 or 10.
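
To make the shape of such a rule concrete, Rule 3 above can be sketched as a label rewrite. This is an illustration only, not how the D2KAB mappings were produced: the function name is hypothetical, and it assumes labels follow the "<metal> toxicity" pattern exactly. Rule 1.1 is deliberately left out, because its example labels ("resistance to Septoria Leaf Blotch" and "Septoria score on leaf") do not line up by simple string rewriting and still need a curator's judgement.

```python
import re
from typing import Optional


def rule_3_target_label(wto_label: str) -> Optional[str]:
    """Rule 3: '<metal> toxicity' (WTO) <-> '<metal> tolerance' (CO_321).

    Returns the candidate CO_321 label, or None if the rule does not apply.
    """
    match = re.fullmatch(
        r"(?P<metal>.+) toxicity", wto_label.strip(), flags=re.IGNORECASE
    )
    if match is None:
        return None
    return f"{match.group('metal')} tolerance"


# Candidate label for the example pair WTO:0000450 <-> CO_321:0000079.
print(rule_3_target_label("Aluminium toxicity"))
```

A curator would still confirm that the candidate label exists in the target ontology before recording the mapping.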

@saubin78

saubin78 commented Feb 2, 2023

In the short term, I'd like to use the "curation_rule" element and find the best way to publish the rules. For now I see two possibilities:

  1. Record the rules in a document that would go with the mapping dataset and reference the rules in the SSSOM representation using the codes (e.g. 1.1 or 2 in my examples above).
  2. Publish each rule as an RDF resource (= nanopublication?) and reference it with its URI in the SSSOM representation of our mappings. But what would be the type for these resources? Maybe SEMAPV could provide a definition of a "MappingRule" class?
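
The two options differ mainly in what ends up in the `curation_rule` column: a short code resolved against an accompanying document, or a full URI. A rough sketch of both follows. The prefix `D2KABRULE` and the base URI under `example.org` are hypothetical placeholders, not published identifiers, and the row shows only the fields relevant here.

```python
from typing import Dict

# Hypothetical prefix and base URI; neither appears in the issue.
RULE_PREFIX = "D2KABRULE"
RULE_BASE = "https://example.org/d2kab/rules/"

# Option 1: rules kept in a document shipped with the mapping set, keyed by code.
RULES: Dict[str, str] = {
    "1.1": "resistance + 'name of disease' (WTO) <-> 'name of disease' + score (CO321)",
    "3": "metal + toxicity (WTO) <-> metal + tolerance (CO321)",
}


def rule_curie(code: str) -> str:
    """Option 1: reference a documented rule by its code, written as a CURIE."""
    if code not in RULES:
        raise KeyError(f"unknown curation rule code: {code}")
    return f"{RULE_PREFIX}:{code}"


def rule_uri(code: str) -> str:
    """Option 2: reference the rule as a resource with its own URI."""
    if code not in RULES:
        raise KeyError(f"unknown curation rule code: {code}")
    return RULE_BASE + code


def mapping_row(subject_id: str, object_id: str, code: str) -> Dict[str, str]:
    """Build a partial mapping row that points at the rule used by the curator."""
    return {
        "subject_id": subject_id,
        "object_id": object_id,
        "curation_rule": rule_curie(code),
    }


print(mapping_row("WTO:0000450", "CO_321:0000079", "3"))
```

If the prefix is declared in the mapping set's prefix map with the same base, the CURIE from option 1 expands to the URI from option 2, so the two options need not be mutually exclusive.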

@matentzn
Collaborator Author

matentzn commented Feb 7, 2023

Thank you for your comments @saubin78 - I am in a deadline mode atm, but will get to this soon!

@matentzn
Collaborator Author

This is still top of my list @saubin78 - sorry about the delay.

matentzn added a commit that referenced this issue Mar 16, 2023
Fixes #166

- [X] `docs/` have been added/updated if necessary
- [X] `make test` has been run locally
- [X] tests have been added/updated (if applicable)
- [ ]

The need for representing specific curation rules is everywhere, see
#166. It will be very difficult, if not impossible, to standardise
curation rules, so I would advocate we leave this totally open for now.
Representing the rules as a resource basically makes them an open ended
enum - which gives us more flexibility for adding structure later.
@saubin78 suggested creating a class "MappingRule" and having curation
rules be instances of mapping rules.

We also should decide reasonably soon if we want to rename curation rule
to mapping rule altogether, if we agree with @saubin78's assertion in
#166 that computational rules can also be curation rules (I think that's
fair!).

2 participants