[MRG] Add saturation threshold option for low contrast tables #203
Codecov Report
@@            Coverage Diff             @@
##           master     #203      +/-   ##
==========================================
- Coverage   88.26%   87.65%   -0.61%
==========================================
  Files          14       14
  Lines        1542     1555      +13
  Branches      350      351       +1
==========================================
+ Hits         1361     1363       +2
- Misses        127      137      +10
- Partials       54       55       +1
@NoReflex Thanks for the PR! The results look great! I don't have enough image processing and OpenCV background, so I will have to read up on the
@vinayak-mehta cv2.COLOR_BGR2HSV is just a colorspace conversion from BGR (OpenCV's default channel order) to HSV (Hue, Saturation, Value). EDIT: By failing, I just mean that the result will be worse than with the plain option. The code itself is pretty bulletproof; it's just some simple numpy array transformations.
Hey! As camelot is dead, we are trying to build a maintained fork at Do you want to open the PR against that branch so that we can merge your improvement?
I found this problem while trying to parse a table with a low-contrast background color. The -back option didn't work for low-contrast areas such as the last row. So I've added a new option, -color (--process_color_background), which increases the contrast so that the table can be parsed accurately.
Here's camelot (master) result:

Here's my branch with -color option enabled:

As you can see, we add another step which is basically a binary threshold for low saturation vs no saturation.
Now the borders are way more pronounced and camelot has no issue detecting all the rows.
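For readers unfamiliar with the idea, here is a minimal sketch of a binary threshold on saturation, written with plain numpy. It is not the code from this PR: the function name `saturation_mask`, the `threshold` default of 10, and the choice of which side becomes black are all assumptions made for illustration.

```python
import numpy as np


def saturation_mask(bgr, threshold=10):
    """Binarize an image by saturation (hypothetical helper, not camelot API).

    bgr: uint8 array of shape (H, W, 3) in OpenCV's BGR channel order.
    threshold: assumed cutoff on a 0-255 saturation scale.
    Returns a uint8 image: 0 where a pixel is tinted, 255 where it is grey or white.
    """
    pixels = bgr.astype(np.float32)
    value = pixels.max(axis=2)           # HSV "value": the brightest channel
    chroma = value - pixels.min(axis=2)  # spread between brightest and darkest channel
    # Saturation scaled to 0-255; pure black pixels (value == 0) keep saturation 0.
    saturation = np.zeros_like(value)
    np.divide(255.0 * chroma, value, out=saturation, where=value > 0)
    # Binary threshold: any noticeable tint becomes black, everything else white.
    return np.where(saturation > threshold, 0, 255).astype(np.uint8)


# A 2x2 white image with one pale yellow pixel and one black pixel.
img = np.full((2, 2, 3), 255, dtype=np.uint8)
img[0, 0] = (200, 255, 255)  # pale yellow in BGR order
img[1, 1] = (0, 0, 0)        # black
mask = saturation_mask(img)
```

The pale yellow pixel has very little contrast against white in grayscale, but its saturation is clearly non-zero, so the threshold separates it cleanly. With OpenCV available, the saturation channel could instead be taken from `cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[..., 1]`.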