Files for replicating the main findings of our project for our Special Topics in Natural Language Processing class
Medical Text Simplification
Made by Zachary Schultz and Nick Hankins
Abstract/Motivations:
For our project, we initially attempted to extract keywords from a set of patient transcriptions and extrapolate from that data to form treatment suggestions. However, on closer inspection of the dataset we gathered, it became apparent that the transcriptions already included the treatment, so a model might already be primed by it. At that point, we considered how to use the data we did have in an intuitive way, and decided that it would be interesting to simplify the field-specific jargon of the patient notes. In this case, the field-specific vocabulary is medical terminology related to patient information, diagnoses, and full treatment options. Our aim is to simplify the paragraphs so that what is being said can be understood in layman’s terms. Ultimately, our goal is to see how a model can be fine-tuned for a task that is admittedly complex because of its subjectivity, and how that model can reconcile the various methods with which it can be trained and tested.
The rest of the data can be found at: https://github.com/pmc-patients/pmc-patients