refactor: uniformalize all arrows
- Bold arrows ==> logic control
- Normal arrows --> data access
- Dashed arrows -.> error/skip conditions
- Added a few connections that were previously missing (minor update)
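
For reference, the three arrow styles read as follows in Mermaid flowchart syntax (a minimal illustrative sketch; the node names here are hypothetical, not taken from the actual diagram):

```mermaid
flowchart TD
    %% bold arrow (==>): logic control
    check{Some decision?} ==> step[Logic step]
    %% normal arrow (-->): data access
    step --> store[(Data store)]
    %% dashed arrow (-.->): error/skip condition
    check -. No .-> err[Error]
```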
nickumia-reisys committed Sep 14, 2023
1 parent 8747afe commit ed2479e
Showing 2 changed files with 42 additions and 40 deletions.
80 changes: 41 additions & 39 deletions docs/dcat.mmd
@@ -55,66 +55,68 @@ flowchart TD
 is_deleted{Is Dataset Deleted?}
 
 %% Algorithm
-gs --> load_remote_catalog
-load_remote_catalog --> validate_conforms_to
-validate_conforms_to-. No .-> error
-validate_conforms_to-->|Yes|catalog_values
+gs ==> load_remote_catalog
+load_remote_catalog ==> validate_conforms_to
+validate_conforms_to == No ==> error
+validate_conforms_to == Yes ==> check_schema_version
 load_remote_catalog --> source_data
 load_remote_catalog --> catalog_values
 catalog_values --> check_schema_version
-check_schema_version-->|No|default_schema_version
-check_schema_version-->|Yes|schema_version
+check_schema_version-- No -->default_schema_version
+check_schema_version-- Yes -->schema_version
 schema_version --> get_existing_datasets
 default_schema_version --> get_existing_datasets
 get_existing_datasets --> existing_datasets
-get_existing_datasets --> is_parent_
-is_parent_-->|Yes|existing_parents
-%% existing_parents --> is_parent_demoted
-is_parent_-->|No|is_parent_demoted
-is_parent_demoted-->|Yes|orphaned_parents
-is_parent_demoted-->|No|is_parent_promoted
-%% existing_datasets --> is_parent_promoted
-is_parent_promoted-->|Yes|new_parents
-is_parent_promoted-->|No|load_config
+get_existing_datasets ==> is_parent_
+is_parent_ == Yes ==> existing_parents
+existing_parents --> is_parent_demoted
+is_parent_ == No ==> is_parent_demoted
+is_parent_demoted -- Yes --> orphaned_parents
+is_parent_demoted == No ==> is_parent_promoted
+existing_datasets --> is_parent_promoted
+is_parent_promoted -- Yes --> new_parents
+is_parent_promoted == No ==> load_config
 load_config --> hc_filter
 load_config --> hc_defaults
-load_config --> is_identifier_both
+load_config ==> is_identifier_both
 is_identifier_both-. Yes .-> error
-is_identifier_both-->|No|for_each_dataset
-for_each_dataset --> dataset_contains_filter
+is_identifier_both == No ==> for_each_dataset
 hc_filter --> dataset_contains_filter
+for_each_dataset ==> dataset_contains_filter
 dataset_contains_filter-. Yes .-> skip
-dataset_contains_filter-->|No|has_identifier
+dataset_contains_filter == No ==> has_identifier
 has_identifier-. No .-> error
-has_identifier-->|Yes|multiple_identifier
+has_identifier == Yes ==> multiple_identifier
 multiple_identifier-. Yes .-> skip
-multiple_identifier-->|No|unique_datsets
+multiple_identifier == No ==> unique_datsets
 unique_datsets --> unique_existing
-unique_existing-->|Yes|hash_exists
-unique_existing-->|Yes|seen_datasets
-unique_existing-->|No|new_pkg_id
-hash_exists-->|Yes|make_upstream_content_hash
-is_active-->|Yes|make_upstream_content_hash
-orphaned_parents-->|Disjunction|make_upstream_content_hash
-new_parents-->|Disjunction|make_upstream_content_hash
-make_upstream_content_hash --> check_hash
+unique_existing == Yes ==> hash_exists
+unique_existing == Yes ==> seen_datasets
+unique_existing == No ==> new_pkg_id
+hash_exists == Yes ==> make_upstream_content_hash
+is_active == Yes ==> make_upstream_content_hash
+orphaned_parents-- Disjunction -->make_upstream_content_hash
+new_parents-- Disjunction -->make_upstream_content_hash
+make_upstream_content_hash ==> check_hash
 check_hash-. Yes .-> skip
-check_hash-->|No|HarvestObjectExtra
+check_hash-- No -->HarvestObjectExtra
 new_pkg_id --> HarvestObjectExtra
 Append__is_collection --> HarvestObjectExtra
 Append__schema_version --> HarvestObjectExtra
 Append__catalog_values --> HarvestObjectExtra
 schema_version --> HarvestObjectExtra
 default_schema_version --> HarvestObjectExtra
 catalog_values --> HarvestObjectExtra
 Append__collection_pkg_id --> HarvestObjectExtra
-is_parent_-->|Yes|Harvest_first
-is_parent_-->|No|Harvest_second
+is_parent_ == Yes ==> Harvest_first
+is_parent_ == No ==> Harvest_second
 HarvestObjectExtra --> Harvest_first
 HarvestObjectExtra --> Harvest_second
-Harvest_first --> for_each_existing
-Harvest_second --> for_each_existing
-for_each_existing --> seen_datasets
-for_each_existing --> is_deleted
+Harvest_first ==> for_each_existing
+Harvest_second ==> for_each_existing
+for_each_existing ==> seen_datasets
+for_each_existing ==> is_deleted
 seen_datasets-. Inverse .-> skip
 is_deleted-. Yes .-> skip
 seen_datasets --> update
-is_deleted-->|No|update
+is_deleted-- No -->update
 update-. exception .-> error
 update --> ge
2 changes: 1 addition & 1 deletion docs/dcat.svg

1 comment on commit ed2479e

@github-actions

Coverage Report
File                  Stmts  Miss  Cover  Missing
harvester
  __init__.py             3     0   100%
harvester/db/models
  __init__.py             5     0   100%
  models.py              53     0   100%
harvester/extract
  __init__.py            19     2    89%
  dcatus.py              11     2    82%
harvester/utils
  __init__.py             0     0   100%
  json.py                22     6    73%
  pg.py                  35     4    89%
  s3.py                  24     6    75%
harvester/validate
  __init__.py             0     0   100%
  dcat_us.py             24     0   100%
TOTAL                   196    20    90%

Tests  Skipped  Failures  Errors  Time
29     0 💤     0 ❌      0 🔥    20.766s ⏱️
