-
-
Notifications
You must be signed in to change notification settings - Fork 729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DFC] match taxons to dfc product types #11817
[DFC] match taxons to dfc product types #11817
Conversation
As explained here datafoodconsortium/connector-ruby#13 , only the narrowest (more specific one) product type is modelled as a
We also need As a starting point this is where the |
287876b
to
792834d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
I'm tempted to approve but I think that we should change to using a URI instead of a name. That will avoid guessing and allows for name changes.
I also have a pull request on the connector which will replace the guessing of setter method names. I wonder if we need to tweak the import feature there as well:
@@ -11,9 +11,11 @@ | |||
id: 90_000, | |||
supplier: enterprise, name: "Pesto", description: "Basil Pesto", | |||
variants: [variant], | |||
primary_taxon: taxon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you change the spec data, it's good to run swagger to document that change in the same commit.
dfc_name = supplied_product.productType.prefLabels[:en].downcase | ||
taxon = Spree::Taxon.find_by(dfc_name: ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that it would be more reliable to store the URL as dfc_id
instead of the name. Labels can change. And we may want to support additional vocabularies one day.
def self.taxon(supplied_product) | ||
# We use english locale, might need to make this configurable | ||
dfc_name = supplied_product.productType.prefLabels[:en].downcase | ||
taxon = Spree::Taxon.find_by(dfc_name: ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we look for the presence of broader types? If a product comes in as wine and OFN doesn't have wine then it could still be classified as drink, right? That would mean traversing through the broader terms. Potentially, that could be in one SQL query if we can sort by broadness. That may be easier in Ruby. Anyway, this could be an improvement later on.
@@ -0,0 +1,5 @@ | |||
class AddDfcNameToSpreeTaxons < ActiveRecord::Migration[7.0] | |||
def change | |||
add_column :spree_taxons, :dfc_name, :string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagined a new table that can attach a DFC link to anything. But it may be good to start simple like this. We can always generalise later and move the data into a new table.
I had a quick look, and I think it should work as it is but I'll double check that. |
db/schema.rb
Outdated
@@ -924,6 +925,7 @@ | |||
t.boolean "show_api_key_view", default: false, null: false | |||
t.string "provider" | |||
t.string "uid" | |||
t.datetime "terms_of_service_accepted_at" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t.datetime "terms_of_service_accepted_at" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good refactor.
Am I right to assume that this will become easier when the connector has broader and narrower defined?
type = call_dfc_product_type(current_stack) | ||
|
||
id = type.semanticId | ||
@product_types[id] = current_stack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you store the type directly? Then you don't need to resolve the stack when looking up types by id.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure to be honest, it makes so much sense now that you are pointing it out 😄
With the patched skos parser, skos concept now have broader and narrower populated, but they return an array of URI. So it's just easier to do something like this But that means your suggested improvement about checking broader terms should be easily feasible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice improvement. 👍
).sub( | ||
"dfc-pt:", | ||
"https://github.com/datafoodconsortium/taxonomies/releases/latest/download/productTypes.rdf#" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mkllnk I fixed the importer based on your comment here : datafoodconsortium/connector-codegen#12 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
Now we just have to reconcile this with the code generator... 🙃
ab61505
to
e9399ce
Compare
49f569a
to
9a48e7f
Compare
It relies on having dfc_name populated on the given taxon. Matching is as follow: - parse the DFC product types and store in PRODUCT_TYPES if needed - match the dfc_name against PRODUCT_TYPES - call the method returned on the DFC connector
This is to test the product type matching is working properly
Ie: you can now use "Soft drink" instead of "soft_drink", case is also ignored
Fox export only
For import this time
It keeps SuppliedProductBuilder and move all the matching logic to its own class
Doesn't try to load skos concept if we are not dealing with an object that doesn't refers to a skos concept
The given productType.rdf file doesn't give us any context for `dfc-pt`, so there was no reason to do that. We still need to do some substitution in the importer, as some times we are given `dfc-pt` as input data.
The `dfc-b:hasType` value can only be parsed as object id if the context contains: ``` "dfc-b:hasType":{ "@type":"@id" }, ``` The standard context includes this and it's easier to use. Now that the URIs of product types are correctly resolved, we don't need to substitute the URI manually. Also dropped an old unneeded spec for backwards compatibility.
This was missed in a previous refactor
45268fe
to
96a0100
Compare
Currently it's not possibel to compare two `DataFoodConsortium::Connector::SKOSConcept` or two `VirtualAssembly::Semantizer::SemanticObject with` `==`. Related to : https://github.com/assemblee-virtuelle/semantizer-ruby/pull/2/files
Hi @rioug, Before stagingAfter stagingWe now have the field DFC-URI available - note that it's neither called I have also tried the "bread" example. It worked well when I found out that the request requires the variant id, not the product id. I found this misleading because the URL actually has "supplied_product" in it - not sure if this could benefit from some clarification. In the end it worked well. 👍 |
ℹ️ This is a funded feature. Please use the Clockify project with the following name when working on this, including review and test: #9170 OFN DFC Products
What? Why?
Upstream patch has been merged datafoodconsortium/connector-codegen#10, which fix previously blocking issue: datafoodconsortium/connector-ruby#13. We'll have to remove the parser monkey patching once the changes are releases.
To make matching taxon to DFC product type, we added a "dfc id" field so the user can specify which DFC product type a given taxon refers to. "dfc id" are meant to refers to a DFC product type URI defined in https://github.com/datafoodconsortium/taxonomies/releases/latest/download/productTypes.rdf, you can browse the file here . For instance, if you wanted to link the "Bread" category to the "DFC Bread", you would use the following URI : https://github.com/datafoodconsortium/taxonomies/releases/latest/download/productTypes.rdf#bread
The "dfc id" is then used to pick the correct DFC product type when the API returns a product.
Similarly, when creating/updating a product, the DFC api will try to find the appropriate taxon using the taxon's dfc id.
Technical note:
Product types (and other SKOSConcepts) are modelled as Class method on the connector, which return a
SKOSConcept
object. IE: to access the "soft drink" product type you callconnector.PRODUCT_TYPES.DRINK.SOFT_DRINK
. For ease of retrieving the SKOSConcept we want, we first parse the Product type tree and store it inSuppliedProductBuilder::PRODUCT_TYPES
hash,with the path to access it, for instance :
PRODUCT_TYPES.DRINK.SOFT_DRINK.LEMONADE
is stored as :SuppliedProductBuilder::PRODUCT_TYPES = { lemonade: [:DRINK, :SOFT_DRINK] }
What should we test?
As a super admin
NOTE "dfc name" has been renamed "dfc id"
-> dfc id is saved
-> dfc id is saved
Release notes
Changelog Category (reviewers may add a label for the release notes):
The title of the pull request will be included in the release notes.