Add support for Vision BatchAnnotateFiles endpoint #966

florian-3ap · 2022-02-28T15:01:30Z

fixes #844

meltsufin

Thanks for the contribution! It looks very reasonable. I just added a few minor comments.

spring-cloud-gcp-vision/src/main/java/com/google/cloud/spring/vision/CloudVisionTemplate.java

meltsufin · 2022-02-28T18:37:57Z

+  public String extractTextFromFile(Resource fileResource, String mimeType) {
+    AnnotateFileResponse response = analyzeFile(fileResource, mimeType, Type.TEXT_DETECTION);
+
+    List<AnnotateImageResponse> annotateImageResponses = response.getResponsesList();


The 10 lines of code below are the same as what we need to do for extractTextFromImage and should be refactored into a shared private utility method.
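A minimal sketch of the kind of refactoring suggested here, assuming the duplicated lines validate a response and pull out its text. `PageResult`, `textOrThrow`, and the error check are hypothetical stand-ins for illustration only; the real template works with `AnnotateImageResponse` objects and its own exception type.

```java
import java.util.List;
import java.util.stream.Collectors;

class TextExtractionSketch {

  // Hypothetical stand-in for one annotation result (one image, or one page of a file).
  static final class PageResult {
    final String text;
    final String errorMessage; // null when the request succeeded

    PageResult(String text, String errorMessage) {
      this.text = text;
      this.errorMessage = errorMessage;
    }
  }

  // Shared helper: return the text of one result, or fail if it has an error and no text.
  private static String textOrThrow(PageResult result) {
    if (result.text.isEmpty() && result.errorMessage != null) {
      throw new IllegalStateException(result.errorMessage);
    }
    return result.text;
  }

  // Image variant: one result in, one string out.
  static String extractTextFromImage(PageResult result) {
    return textOrThrow(result);
  }

  // File variant: one result per page, each passed through the same helper.
  static List<String> extractTextFromFile(List<PageResult> pages) {
    return pages.stream().map(TextExtractionSketch::textOrThrow).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<PageResult> pages =
        List.of(new PageResult("first page", null), new PageResult("second page", null));
    System.out.println(extractTextFromImage(pages.get(0)));
    System.out.println(extractTextFromFile(pages));
  }
}
```

With this shape, both public methods keep their own return types while the validation lives in one place.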

meltsufin · 2022-03-01T18:48:15Z

Would you be able to address this comment?

florian-3ap · 2022-03-01

With the latest changes, the 10 lines below are no longer exactly the same. The return value is now a list of strings instead of a single string. Do you think there are still some lines worth moving into a separate utility method?

meltsufin · 2022-03-02T15:20:03Z

This actually got me thinking: for simplicity of use, maybe we should just return a single concatenated string, separated by newlines or by a customizable separator that can be configured at the class level. What do you think?

florian-3ap · 2022-03-02

In my opinion, this really depends on what the caller does with the extracted text. If you want to process the text page by page, it does not make much sense to merge it into one string. Returning a list of strings leaves it up to the caller how to handle the response.

meltsufin · 2022-03-02

I was just wondering whether the most common use case would be to process only one page anyway. In that case, a list would make the API a bit more verbose to use. Which option seems more convenient for your use case, for example?

florian-3ap · 2022-03-02

I would say that processing one-page PDFs is more common than processing multi-page ones. In our current use case we need to process both single-page and multi-page documents.
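The trade-off discussed in this thread can be sketched as follows. This is only an illustration, not code from the PR: `PageTextJoiner` and its separator field are hypothetical names. It assumes the extraction method returns one string per page, and shows that a caller who wants a single string can still join the pages with a separator of their choice.

```java
import java.util.List;

class PageTextJoiner {

  // Separator configured once per instance, e.g. "\n" (hypothetical class-level setting).
  private final String separator;

  PageTextJoiner(String separator) {
    this.separator = separator;
  }

  // Collapse per-page text into a single string; a one-page list yields that page unchanged.
  String join(List<String> pageTexts) {
    return String.join(separator, pageTexts);
  }

  public static void main(String[] args) {
    // Stand-in for the per-page result of a text extraction call.
    List<String> pages = List.of("text of page one", "text of page two");

    // Page-by-page processing uses the list directly.
    for (int i = 0; i < pages.size(); i++) {
      System.out.println("Page " + (i + 1) + ": " + pages.get(i));
    }

    // Callers who prefer one string join the pages themselves.
    System.out.println(new PageTextJoiner("\n").join(pages));
  }
}
```

Going from a list to a single string is a one-line join, while splitting a concatenated string back into pages is not reliable, which is the argument for keeping the list as the return type.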

zhumin8

Thanks for contributing! Code looks good in general.
My only concern is with extractTextFromFile(): a file can have multiple pages. Details are in the comment below.
Additionally, I would suggest adding an integration test or a case in the existing sample code, if possible.

spring-cloud-gcp-vision/src/main/java/com/google/cloud/spring/vision/CloudVisionTemplate.java

...-cloud-gcp-vision/src/test/java/com/google/cloud/spring/vision/CloudVisionTemplateTests.java

florian-3ap · 2022-03-01T09:49:27Z

I enhanced the spring-cloud-gcp-vision-api-sample with a PDF example. It now works with both single-page and multi-page PDFs.

meltsufin

I think the code looks good, except for one minor comment that's unresolved.

If it's not too much to ask, two more things would be ideal:

Can you add an integration test for the sample endpoint?
Reference documentation in vision.adoc should be updated to describe the new functionality.

Thanks!

zhumin8

The added sample looks great. Made a few minor comments below.
Can you please also update the existing Vision doc to cover the functionality newly added to CloudVisionTemplate? That way more users can discover it.

spring-cloud-gcp-vision/src/main/java/com/google/cloud/spring/vision/CloudVisionTemplate.java

...ng-cloud-gcp-samples/spring-cloud-gcp-vision-api-sample/src/main/resources/static/index.html

docs/src/main/asciidoc/vision.adoc

meltsufin

@florian-3ap Thank you so much for this contribution and your patience with the reviews!

sonarqubecloud · 2022-03-03T16:41:50Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

86.4% Coverage
0.0% Duplication

zhumin8

Doc addition looks great to me! Thank you for contributing!

Add support for Vision BatchAnnotateFiles endpoint

7b52844

meltsufin requested changes Feb 28, 2022

zhumin8 requested changes Feb 28, 2022

return extracted text for all pages and add simple example code

082f62e

florian-3ap added 2 commits March 1, 2022 10:53

remove properties

cd2ec65

create const for error message

c1bf6f4

meltsufin requested changes Mar 1, 2022

zhumin8 requested changes Mar 1, 2022

Add CloudVisionTemplateIntegrationTests and enhance documentation

06e7865

meltsufin requested changes Mar 2, 2022

rename documentation section title and add small example of PDF text detection

ca1a668

meltsufin approved these changes Mar 2, 2022

zhumin8 approved these changes Mar 3, 2022

zhumin8 merged commit 63182a5 into GoogleCloudPlatform:main Mar 3, 2022

kateryna216 added a commit to kateryna216/spring-cloud-gcp that referenced this pull request Oct 20, 2022

Add support for Vision BatchAnnotateFiles endpoint (GoogleCloudPlatform#966)

acbe264

Add support for Vision BatchAnnotateFiles endpoint fixes GoogleCloudPlatform#844

zhumin8 mentioned this pull request May 17, 2023

chore: add missing license header #1872

Merged

zhumin8 added a commit that referenced this pull request May 18, 2023

chore: add missing license header (#1872)

998e8c8

File was created in #966, most recent change #1871.

zhumin8 added a commit that referenced this pull request Jun 5, 2023

chore: add missing license header (#1872)

967acd9

File was created in #966, most recent change #1871.

zhumin8 mentioned this pull request Jun 5, 2023

chore: add missing license header (backport #1872) #1926

Merged

zhumin8 added a commit that referenced this pull request Jun 5, 2023

chore: add missing license header (backport of #1872) (#1926)

cb9098f

File was created in #966, most recent change #1871.
