Commit

fix: replace cv url by cv binary (#220)
* fix: add some error handling for jobology connector

* fix: jobology flake8 connector

* fix: some type

* fix: regarding Jamal review

* fix: replace cv url by cv binary

* docs: update docs

* fix: flake8 outputs

* fix: jobology catch profile

* docs: update docs

* fix: regarding Jamal review

* fix: handle possible binascii error

* fix: flake8 and docs

* fix: some flake8 output
Abdellahitech authored Feb 12, 2024
1 parent 4785086 commit b41676e
Showing 3 changed files with 33 additions and 18 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -135,7 +135,7 @@ We invite developers to join us in our mission to bring AI and data integration
 | **Indeed** | Job Board | 🎯 | | |
 | **Inzojob** | Job Board | 🎯 | | |
 | **Jobijoba** | Job Board | 🎯 | | |
-| [**Jobology**](./src/hrflow_connectors/connectors/jobology/README.md) | Job Board | :white_check_mark: | *21/12/2022* | *08/01/2024* | :x: | :x: | :x: | :x: |
+| [**Jobology**](./src/hrflow_connectors/connectors/jobology/README.md) | Job Board | :white_check_mark: | *21/12/2022* | *10/02/2024* | :x: | :x: | :x: | :x: |
 | **Jobrapido** | Job Board | 🎯 | | |
 | **JobTeaser** | Job Board | 🎯 | | |
 | **Jobtransport** | Job Board | 🎯 | | |
8 changes: 4 additions & 4 deletions src/hrflow_connectors/connectors/jobology/test-config.yaml
@@ -9,7 +9,7 @@ warehouse:
 "lastName":"Mezid",
 "phone":"0611423374",
 "email":"pierre.dupont@mailprovider.com",
-"cvUrl":"https://riminder-documents-eu-2019-12.s3-eu-west-1.amazonaws.com/teams/fc9d40fd60e679119130ea74ae1d34a3e22174f2/sources/e6159e65395995ae9f69a90ce916f7b6f90823dd/profiles/ab1c521d49b73d321bd54f6f6d3a4d81a31835eb/parsing/original.pdf",
+"cvBase64": b"JVBERi0xLjcNCiW1tb...",
 "coverText":"Ceci est le texte de motivation.\nCe champ peut contenir des retours à la ligne.",
 "profilecountry": "France",
 "profileregions": "Ile-de-france",
@@ -29,7 +29,7 @@ actions:
 "lastName":"Mezid",
 "phone":"0611423374",
 "email":"pierre.dupont@mailprovider.com",
-"cvUrl":"https://riminder-documents-eu-2019-12.s3-eu-west-1.amazonaws.com/teams/fc9d40fd60e679119130ea74ae1d34a3e22174f2/sources/e6159e65395995ae9f69a90ce916f7b6f90823dd/profiles/ab1c521d49b73d321bd54f6f6d3a4d81a31835eb/parsing/original.pdf",
+"cvBase64": b"JVBERi0xLjcNCiW1tb...",
 "coverText":"Ceci est le texte de motivation.\nCe champ peut contenir des retours à la ligne.",
 "profilecountry": "France",
 "profileregions": "Ile-de-france",
@@ -53,7 +53,7 @@ actions:
 "lastName":"Mezid",
 "phone":"0611423374",
 "email":"pierre.dupont@mailprovider.com",
-"cvUrl":"https://riminder-documents-eu-2019-12.s3-eu-west-1.amazonaws.com/teams/fc9d40fd60e679119130ea74ae1d34a3e22174f2/sources/e6159e65395995ae9f69a90ce916f7b6f90823dd/profiles/ab1c521d49b73d321bd54f6f6d3a4d81a31835eb/parsing/original.pdf",
+"cvBase64": b"JVBERi0xLjcNCiW1tb...",
 "coverText":"Ceci est le texte de motivation.\nCe champ peut contenir des retours à la ligne.",
 "profilecountry": "France",
 "profileregions": "Ile-de-france",
@@ -78,7 +78,7 @@ actions:
 "lastName":"Mezid",
 "phone":"0611423374",
 "email":"pierre.dupont@mailprovider.com",
-"cvUrl":"https://riminder-documents-eu-2019-12.s3-eu-west-1.amazonaws.com/teams/fc9d40fd60e679119130ea74ae1d34a3e22174f2/sources/e6159e65395995ae9f69a90ce916f7b6f90823dd/profiles/ab1c521d49b73d321bd54f6f6d3a4d81a31835eb/parsing/original.pdf",
+"cvBase64": b"JVBERi0xLjcNCiW1tb...",
 "coverText":"Ceci est le texte de motivation.\nCe champ peut contenir des retours à la ligne.",
 "profilecountry": "France",
 "profileregions": "Ile-de-france",
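
The test payloads in this file now carry the CV as base64 text in `cvBase64` rather than as a URL to fetch. As a rough illustration of how such a value could be produced with the standard library, here is a minimal sketch; the helper name `encode_cv` and the sample bytes are hypothetical and do not come from the connector.

```python
import base64


def encode_cv(raw: bytes) -> str:
    """Return the base64 text form of a CV file's raw bytes (hypothetical helper)."""
    # b64encode returns bytes; decode to ASCII so the value can be placed in JSON.
    return base64.b64encode(raw).decode("ascii")


# Hypothetical stand-in for the first bytes of a PDF file.
sample = b"%PDF-1.7\r\n"
encoded = encode_cv(sample)
print(encoded)

# Round trip: decoding the text gives the original bytes back.
assert base64.b64decode(encoded) == sample
```

In practice `raw` would be the full content of the CV file, read in binary mode.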
41 changes: 28 additions & 13 deletions src/hrflow_connectors/connectors/jobology/warehouse.py
@@ -1,7 +1,8 @@
+import base64
 import typing as t
 from logging import LoggerAdapter

-import requests
+import magic
 from pydantic import Field

 from hrflow_connectors.core import (
@@ -22,25 +23,39 @@ class ReadProfilesParameters(ParametersModel):
     )


+def get_content_type(binary_data: bytes):
+    mime = magic.Magic(mime=True)
+    content_type = mime.from_buffer(binary_data)
+    if not content_type:
+        return "application/octet-stream"
+    return content_type
+
+
 def read(
     adapter: LoggerAdapter,
     parameters: ReadProfilesParameters,
     read_mode: t.Optional[ReadMode] = None,
     read_from: t.Optional[str] = None,
 ) -> t.Iterable[t.Dict]:
     result = {**parameters.profile}
-    cv_url = result["cvUrl"]
-    response = requests.get(cv_url)
-    if response.status_code == 200:
-        result["cv"] = response.content
-        result["content_type"] = response.headers["Content-Type"]
-    elif response.status_code == 400:
-        raise Exception(f"Bad Request {response.text}")
-    else:
-        raise Exception(
-            f"request failed with status code {response.status_code} and message"
-            f" {response.text}"
-        )
+    cv_base64 = result.get("cvBase64")
+
+    if cv_base64 is None:
+        raise ValueError("No base64 string provided for CV.")
+
+    try:
+        binary_data = base64.b64decode(cv_base64)
+    except base64.binascii.Error:
+        padding_needed = 4 - (len(cv_base64) % 4)
+        if padding_needed != 4:
+            cv_base64 += "=" * padding_needed
+        binary_data = base64.b64decode(cv_base64)
+
+    if not binary_data:
+        raise Exception("Error decoding base64 string")
+    content_type = get_content_type(binary_data)
+    result["cv"] = binary_data
+    result["content_type"] = content_type

     return [result]
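
The new `read` path decodes `cvBase64`, retries with `=` padding on failure, and then detects the content type from the decoded bytes. The sketch below shows one possible stricter variant of those two steps using only the standard library. It is not the committed code: `decode_cv` and `sniff_content_type` are hypothetical names, and the signature check is a deliberately tiny stand-in for `magic` that recognises only PDF.

```python
import base64
import binascii
import typing as t


def decode_cv(cv_base64: t.Union[str, bytes]) -> bytes:
    """Decode base64 text, restoring stripped '=' padding if needed."""
    if isinstance(cv_base64, bytes):
        cv_base64 = cv_base64.decode("ascii")
    # Remove whitespace such as line breaks added by some encoders.
    cleaned = "".join(cv_base64.split())
    # Pad to a multiple of 4; -len % 4 is 0 when the length is already aligned.
    cleaned += "=" * (-len(cleaned) % 4)
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        data = base64.b64decode(cleaned, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"cvBase64 is not valid base64: {exc}") from exc
    if not data:
        raise ValueError("cvBase64 decoded to an empty payload")
    return data


def sniff_content_type(data: bytes) -> str:
    """Guess a MIME type from the leading bytes (PDF only in this sketch)."""
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    return "application/octet-stream"


# Unpadded input is accepted because padding is restored before decoding.
cv = decode_cv("JVBERi0xLjcNCg")
print(sniff_content_type(cv))
```

Padding the input up front avoids a second decode attempt inside an exception handler, and accepting both `str` and `bytes` sidesteps a type error when `"="` is appended to a bytes value.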
