Merge pull request #227 from Riminder/feat/add-profile-json-to-parsing-warehouse

feat: parsing warehouse V2
Riminder · Mar 15, 2024 · 8e6c20e
2 parents 16f03f1 + cf222c1
commit 8e6c20e
Showing 2 changed files with 145 additions and 107 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -34,54 +34,54 @@ BREAKING CHANGE: for users that rely on generate_docs helper ([`77b8da9`](https:
 
 * feat: consider info if exists rather than the parsed info (#221)
 
-* feat: consider info if exists rather than the parsed info
-
-* fix: add case urls in info of profile json
-
-* fix: after jamal review
-
-* fix: remplace cv url by cv binary (#220)
-
-* fix:add some error handling for jobology connector
-
-* fix:jobology flake8 connector
-
-* fix:some type
-
-* fix:regarding jamal review
-
-* fix: remplace cv url by cv binary
-
-* docs: update docs
-
-* fix: flake8 outputs
-
-* fix: jobology catch profile
-
-* docs: update docs
-
-* fix: regarding jamal review
-
-* fix: handle possible error binasciii
-
-* fix: flake8 and docs
-
-* fix: some flake8 output
-
-* fix: correct update date for Jobology connector (#222)
-
-Co-authored-by: the-forest-tree <the-forest-tree@hrflow.ai>
-
-* 4.6.1
-
-Automatically generated by python-semantic-release
-
-* fix: regarding jamal review location=value
-
----------
-
-Co-authored-by: the-forest-tree <65894619+the-forest-tree@users.noreply.github.com>
-Co-authored-by: the-forest-tree <the-forest-tree@hrflow.ai>
+* feat: consider info if exists rather than the parsed info
+
+* fix: add case urls in info of profile json
+
+* fix: after jamal review
+
+* fix: remplace cv url by cv binary (#220)
+
+* fix:add some error handling for jobology connector
+
+* fix:jobology flake8 connector
+
+* fix:some type
+
+* fix:regarding jamal review
+
+* fix: remplace cv url by cv binary
+
+* docs: update docs
+
+* fix: flake8 outputs
+
+* fix: jobology catch profile
+
+* docs: update docs
+
+* fix: regarding jamal review
+
+* fix: handle possible error binasciii
+
+* fix: flake8 and docs
+
+* fix: some flake8 output
+
+* fix: correct update date for Jobology connector (#222)
+
+Co-authored-by: the-forest-tree <the-forest-tree@hrflow.ai>
+
+* 4.6.1
+
+Automatically generated by python-semantic-release
+
+* fix: regarding jamal review location=value
+
+---------
+
+Co-authored-by: the-forest-tree <65894619+the-forest-tree@users.noreply.github.com>
+Co-authored-by: the-forest-tree <the-forest-tree@hrflow.ai>
 Co-authored-by: hrflow-semantic-release <hrflow-semantic-release> ([`5b7997b`](https://github.com/Riminder/hrflow-connectors/commit/5b7997b5080b31de96d074b001369150ac95f596))
 
 
@@ -95,30 +95,30 @@ Co-authored-by: the-forest-tree <the-forest-tree@hrflow.ai> ([`852c6aa`](h
 
 * fix: remplace cv url by cv binary (#220)
 
-* fix:add some error handling for jobology connector
-
-* fix:jobology flake8 connector
-
-* fix:some type
-
-* fix:regarding jamal review
-
-* fix: remplace cv url by cv binary
-
-* docs: update docs
-
-* fix: flake8 outputs
-
-* fix: jobology catch profile
-
-* docs: update docs
-
-* fix: regarding jamal review
-
-* fix: handle possible error binasciii
-
-* fix: flake8 and docs
-
+* fix:add some error handling for jobology connector
+
+* fix:jobology flake8 connector
+
+* fix:some type
+
+* fix:regarding jamal review
+
+* fix: remplace cv url by cv binary
+
+* docs: update docs
+
+* fix: flake8 outputs
+
+* fix: jobology catch profile
+
+* docs: update docs
+
+* fix: regarding jamal review
+
+* fix: handle possible error binasciii
+
+* fix: flake8 and docs
+
 * fix: some flake8 output ([`b41676e`](https://github.com/Riminder/hrflow-connectors/commit/b41676ebab149f4dd1170b34f7baf3641792981a))
 
 
@@ -301,20 +301,20 @@ rule of releasing depending on commit messages ([`16844d9`](https://github.com/R
 
 * Adding new actions 'pull_application_list' and 'push_score_list'.  (#184)
 
-* Adding new actions 'pull_application_list' and 'push_score_list'. They will be used to sync applications (profiles, jobs, statuses) and synchronize scores from HrFlow.ai to a target warehouse
-
-* style: apply black formatting
-
-* test: add new pull_application_list to coherence tests
-
-* fix: use random key for backend test to avoid failure in ci
-It seems that when running multiple ci run in the same time
-race condition can occur and one test can find the result of another
-running in the same time
-
----------
-
-Co-authored-by: thomas <thomas.zhu@hrflow.ai>
+* Adding new actions 'pull_application_list' and 'push_score_list'. They will be used to sync applications (profiles, jobs, statuses) and synchronize scores from HrFlow.ai to a target warehouse
+
+* style: apply black formatting
+
+* test: add new pull_application_list to coherence tests
+
+* fix: use random key for backend test to avoid failure in ci
+It seems that when running multiple ci run in the same time
+race condition can occur and one test can find the result of another
+running in the same time
+
+---------
+
+Co-authored-by: thomas <thomas.zhu@hrflow.ai>
 Co-authored-by: the-forest-tree <the-forest-tree@hrflow.ai> ([`df7d387`](https://github.com/Riminder/hrflow-connectors/commit/df7d3874bee3bf9d991f2d45b991330684ff6c0f))
 
 

diff --git a/src/hrflow_connectors/connectors/hrflow/warehouse/profile.py b/src/hrflow_connectors/connectors/hrflow/warehouse/profile.py
@@ -198,6 +198,46 @@ def merge_info(base: dict, info: dict) -> dict:
     return base
 
 
+def merge_item(base: dict, profile: dict, item: str) -> dict:
+    if not profile.get(item):
+        return base
+
+    base[item] = profile[item]
+    return base
+
+
+def hydrate_profile(profile_parsed: dict, profile_json: dict) -> dict:
+    profile_info = profile_json.get("info", {})
+    profile_enriched = merge_info(profile_parsed, profile_info)
+
+    items_to_merge = [
+        "experiences",
+        "educations",
+        "skills",
+        "languages",
+        "certifications",
+        "interests",
+    ]
+    for item in items_to_merge:
+        profile_enriched = merge_item(profile_enriched, profile_json, item)
+
+    profile_enriched["text"] = profile_json.get("text") or profile_enriched.get("text")
+    profile_enriched["text_language"] = profile_json.get(
+        "text_language"
+    ) or profile_enriched.get("text_language")
+    profile_enriched["experiences_duration"] = (
+        profile_json.get("experiences_duration")
+        if profile_json.get("experiences_duration") is not None
+        else profile_enriched.get("experiences_duration")
+    )
+    profile_enriched["educations_duration"] = (
+        profile_json.get("educations_duration")
+        if profile_json.get("educations_duration") is not None
+        else profile_enriched.get("educations_duration")
+    )
+    return profile_enriched
+
+
 def write_parsing(
     adapter: LoggerAdapter,
     parameters: WriteProfileParsingParameters,
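
The precedence rules in `hydrate_profile` are easy to misread, so here is a minimal standalone sketch of them. `merge_item` follows the hunk above and `hydrate_profile` is condensed into loops but is meant to behave the same way. The body of `merge_info` is not part of this diff, so the version below is a hypothetical stand-in that simply lets non-empty supplied `info` values override the parsed ones; the `Profile` alias and the sample data are also made up for illustration.

```python
from typing import Any, Dict

Profile = Dict[str, Any]  # illustrative alias, not in the original module


def merge_info(base: Profile, info: Profile) -> Profile:
    # ASSUMPTION: stand-in only; the real merge_info body is not in this diff.
    # Here, non-empty values from `info` override the parsed base["info"].
    merged = dict(base.get("info") or {})
    merged.update({key: value for key, value in info.items() if value})
    base["info"] = merged
    return base


def merge_item(base: Profile, profile: Profile, item: str) -> Profile:
    # Same rule as the diff: a missing or empty section never overwrites.
    if not profile.get(item):
        return base
    base[item] = profile[item]
    return base


def hydrate_profile(profile_parsed: Profile, profile_json: Profile) -> Profile:
    enriched = merge_info(profile_parsed, profile_json.get("info", {}))
    sections = (
        "experiences", "educations", "skills",
        "languages", "certifications", "interests",
    )
    for item in sections:
        enriched = merge_item(enriched, profile_json, item)
    # Falsy text values (None, "") fall back to the parsed ones.
    for key in ("text", "text_language"):
        enriched[key] = profile_json.get(key) or enriched.get(key)
    # Durations fall back only when the supplied value is None, so 0 is kept.
    for key in ("experiences_duration", "educations_duration"):
        supplied = profile_json.get(key)
        enriched[key] = supplied if supplied is not None else enriched.get(key)
    return enriched


# Made-up sample data: what parsing returned versus what the caller supplied.
parsed = {
    "info": {"first_name": "Jane", "email": "parsed@example.com"},
    "skills": [{"name": "python"}],
    "experiences": [{"title": "Engineer"}],
    "text": "parsed text",
    "experiences_duration": 4.5,
}
supplied = {
    "info": {"email": "jane@example.com"},
    "skills": [],  # empty: the parsed skills are kept
    "experiences": [{"title": "Staff Engineer"}],  # non-empty: replaces parsed
    "text": "",  # falsy: the parsed text is kept
    "experiences_duration": 0,  # not None: overrides the parsed 4.5
}
result = hydrate_profile(parsed, supplied)
```

Two details are worth noticing in this sketch: the durations use an `is not None` check so that a supplied `0` survives, whereas `text` and `text_language` use `or`, and the parsed dict is updated in place and returned rather than copied.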
@@ -207,9 +247,10 @@ def write_parsing(
     hrflow_client = Hrflow(
         api_secret=parameters.api_secret, api_user=parameters.api_user
     )
-    for profile in profiles:
-        profile_info = profile.get("info", {})
 
+    source_response = hrflow_client.source.get(key=parameters.source_key)
+
+    for profile in profiles:
         if parameters.only_insert and hrflow_client.profile.indexing.get(
             source_key=parameters.source_key, reference=profile["reference"]
         ).get("data"):
@@ -219,6 +260,19 @@ def write_parsing(
             )
             continue
 
+        if profile.get("resume") is None:
+            indexing_response = hrflow_client.profile.indexing.add_json(
+                source_key=parameters.source_key, profile_json=profile
+            )
+            if indexing_response["code"] != 201:
+                adapter.error(
+                    "Failed to index profile with reference={} response={}".format(
+                        profile["reference"], indexing_response
+                    )
+                )
+                failed.append(profile)
+            continue
+
         parsing_response = hrflow_client.profile.parsing.add_file(
             source_key=parameters.source_key,
             profile_file=profile["resume"]["raw"],
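
The new branch in the hunk above sends profiles that have no `resume` straight to `profile.indexing.add_json` and skips parsing for them. The sketch below isolates that routing so it can be run without network access. `FakeIndexing` and `index_profiles_without_resume` are hypothetical names introduced here, the stub response shape is an assumption beyond the `code` field the diff checks, and the returned `indexed` and `to_parse` lists are bookkeeping added for the example (the original only appends to `failed`).

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Tuple

Profile = Dict[str, Any]  # illustrative alias


class FakeIndexing:
    """Hypothetical stub standing in for hrflow_client.profile.indexing."""

    def __init__(self, code: int = 201) -> None:
        self.code = code
        self.calls: List[Profile] = []

    def add_json(self, source_key: str, profile_json: Profile) -> Dict[str, Any]:
        # Record the call and return a canned response with the chosen code.
        self.calls.append(profile_json)
        return {"code": self.code, "message": "stub response"}


def index_profiles_without_resume(
    client: Any, source_key: str, profiles: List[Profile]
) -> Tuple[List[Profile], List[Profile], List[Profile]]:
    """Split profiles into (indexed, failed, to_parse) using the diff's rule."""
    indexed: List[Profile] = []
    failed: List[Profile] = []
    to_parse: List[Profile] = []
    for profile in profiles:
        if profile.get("resume") is None:
            # No resume: index the supplied JSON directly, as in the diff.
            response = client.profile.indexing.add_json(
                source_key=source_key, profile_json=profile
            )
            if response["code"] != 201:
                failed.append(profile)
            else:
                indexed.append(profile)
            continue
        # A resume is present: this profile would go on to parsing.add_file.
        to_parse.append(profile)
    return indexed, failed, to_parse


client = SimpleNamespace(profile=SimpleNamespace(indexing=FakeIndexing(code=201)))
profiles = [
    {"reference": "a", "resume": None},
    {"reference": "b"},  # missing key behaves like None with .get()
    {"reference": "c", "resume": {"raw": b"%PDF", "content_type": "application/pdf"}},
]
indexed, failed, to_parse = index_profiles_without_resume(
    client, "source-key", profiles
)
```

Constructing the stub with `FakeIndexing(code=400)` moves the first two profiles into `failed` instead, mirroring the `failed.append(profile)` path in the diff.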
@@ -236,9 +290,8 @@ def write_parsing(
             )
             failed.append(profile)
             continue
-        source_response = hrflow_client.source.get(
-            key=parameters.source_key
-        )  # Get source to check if sync_parsing is enabled
+
+        # check if sync_parsing is enabled
         if source_response["code"] != 200:
             adapter.warning(
                 "Failed to get source with key={} response={}, won't be able to update"
@@ -248,22 +301,7 @@ def write_parsing(
             )
         elif source_response["data"]["sync_parsing"] is True:
             current_profile = parsing_response["data"]["profile"]
-            profile_result = merge_info(current_profile, profile_info)
-
-            profile_result["text"] = profile.get("text") or profile_result.get("text")
-            profile_result["text_language"] = profile.get(
-                "text_language"
-            ) or profile_result.get("text_language")
-            profile_result["experiences_duration"] = (
-                profile.get("experiences_duration")
-                if profile.get("experiences_duration") is not None
-                else profile_result.get("experiences_duration")
-            )
-            profile_result["educations_duration"] = (
-                profile.get("educations_duration")
-                if profile.get("educations_duration") is not None
-                else profile_result.get("educations_duration")
-            )
+            profile_result = hydrate_profile(current_profile, profile)
 
             edit_response = hrflow_client.profile.indexing.edit(
                 source_key=parameters.source_key,