Possible to remove in-text references? #678
Hello, I recently started using Docling, but so far I have been unable to find a way to remove in-text references. For example, given the section of text below, is it possible to remove the numeric references? I plan to use this primarily on academic papers, so simply filtering out all numeric values with something like a regex isn't an option, as I'd like to keep any relevant statistics and other numeric data intact.

Sample text:

If the "16" at the end is an in-text reference to the theoretical Smith et al. paper, is it possible to remove the 16 from the text while leaving all the other numbers intact? If this is not currently possible within Docling, would anyone know of a good workaround? Thanks in advance for any help!
Replies: 1 comment
For anyone interested, I was able to use regex after all. I've tested it on a few examples for flexibility and it seems to work well. Code is below.
Testing functionality
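For readers looking for a starting point, here is a minimal sketch of one regex approach of this kind. It is not the code from the reply above; the function name `strip_numeric_citations` and both patterns are illustrative assumptions. It assumes citations appear either in square brackets (`[16]`, `[3, 4]`, `[3-5]`) or as flattened superscripts glued directly to sentence punctuation (`et al.16`), and it leaves decimals, percentages, four-digit years, and numbers preceded by a space alone.

```python
import re

# Bracketed numeric citations such as [16], [1, 2] or [3-5],
# together with any whitespace immediately before the bracket.
BRACKETED = re.compile(r"\s*\[\d+(?:\s*[-,\u2013]\s*\d+)*\]")

# Assumption: a flattened superscript citation is 1-3 digits (optionally a
# comma- or dash-separated run) glued to punctuation that itself follows a
# letter, e.g. "et al.16" or "theory.16,17", and is followed by whitespace,
# punctuation, or the end of the text. Decimals such as "0.05" are skipped
# because the character before the period is a digit, not a letter.
TRAILING = re.compile(
    r"(?<=[A-Za-z][.,;:)])\d{1,3}(?:[,\u2013-]\d{1,3})*(?=\s|$|[.,;:)])"
)


def strip_numeric_citations(text: str) -> str:
    """Remove numeric in-text citations while keeping other numbers."""
    text = BRACKETED.sub("", text)
    text = TRAILING.sub("", text)
    return text


if __name__ == "__main__":
    sample = (
        "Accuracy rose to 92.5% in 2019 [3, 4], as predicted by "
        "Smith et al.16 The effect size was 0.8 (n = 120)."
    )
    print(strip_numeric_citations(sample))
```

The function works on plain strings, so it can be applied to text after it has been exported from Docling. The patterns are deliberately conservative: a citation separated from the text by a space ("et al. 16") is not removed, and a glued label such as "Fig.2" would be, so it is worth checking the output on a few of your own papers before relying on it.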