Feature request: [feature] RPPS anonymization #14

Open
Pierre-Auguste-Beaucote opened this issue Jul 9, 2024 · 2 comments

Comments

@Pierre-Auguste-Beaucote

Feature type

Additional entity

Description

👋 Congrats on the repo, it is a crucial topic! Do you plan on adding RPPS (doctors' national identifiers) to the entity list?

Preventing the identification of health professionals in health data increases patients' privacy, and it also protects the health professionals' own privacy!

@percevalw
Member

percevalw commented Jul 9, 2024

Hi @Pierre-Auguste-Beaucote!
We have not annotated RPPS ids in our private AP-HP dataset, so we currently have no way of evaluating the RPPS matching performance on real data. That said, in our documents, most RPPS ids seem to follow one of these formats:

  • RPPS = 10000XXXXX
  • or N° RPPS : 10000YYYYY

so adding a regular expression (e.g. in https://github.com/aphp/eds-pseudo/blob/main/eds_pseudo/pipes/pseudonymisation/patterns.py) should do the trick in most cases.
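
As a rough illustration of the kind of regular expression meant here, the sketch below matches the two keyword-prefixed formats listed above. It is standalone and does not reflect how `patterns.py` is actually organised; `RPPS_PATTERN` and `find_rpps` are hypothetical names, and the accepted digit count (10 or 11) is an assumption based on the 10-character examples above that should be checked against the official identifier format.

```python
import re

# Hypothetical pattern, not the one used in eds-pseudo.
# It requires the "RPPS" keyword, so bare digit sequences are ignored.
RPPS_PATTERN = re.compile(
    r"""
    (?:N[°o]\s*)?          # optional "N°" prefix
    RPPS                   # the keyword itself
    \s*[:=]?\s*            # optional ":" or "=" separator
    (?P<rpps>\d{10,11})    # the identifier (length is an assumption)
    (?!\d)                 # do not stop in the middle of a longer number
    """,
    re.VERBOSE | re.IGNORECASE,
)


def find_rpps(text: str) -> list[str]:
    """Return every identifier that follows an RPPS keyword in `text`."""
    return [m.group("rpps") for m in RPPS_PATTERN.finditer(text)]
```

Anchoring on the keyword keeps false positives low, at the cost of missing identifiers that appear without it.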

We can also annotate the fictitious templates and even add hard samples that would be difficult to match with a regular expression:

My doctor national identifier is the following:
10 00000 000
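
To illustrate why such a sample is hard, here is a sketch of a looser pattern that drops the keyword and tolerates spaces between digits. It assumes a 10-digit identifier, as in the sample above, and `LOOSE_DIGITS` and `loose_candidates` are hypothetical names. It does catch the sample, but it also catches any other 10-digit sequence, such as a phone number written in pairs.

```python
import re

# Hypothetical loose pattern: exactly 10 digits, each optionally preceded
# by a single space or tab, with no keyword required.
LOOSE_DIGITS = re.compile(r"(?<!\d)\d(?:[ \t]?\d){9}(?!\d)")


def loose_candidates(text):
    """Return all 10-digit candidates in `text`, with inner whitespace removed."""
    return [re.sub(r"\s", "", m.group()) for m in LOOSE_DIGITS.finditer(text)]
```

Without surrounding context a pattern like this cannot tell an RPPS from a phone number, which is where annotated samples and a learned model would help.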

Do you have a use case in mind and/or some public/private documents that could be used to build/evaluate this matcher?

@Pierre-Auguste-Beaucote
Author

Thanks for the quick answer. The RPPS is sometimes present on medical reports, and always on prescriptions, referral letters, transportation vouchers, as well as most documents that bear a doctor's stamp.

Unfortunately, I don't have such a dataset!
