Add support for Vision BatchAnnotateFiles endpoint (GoogleCloudPlatform#966)

Add support for Vision BatchAnnotateFiles endpoint

fixes GoogleCloudPlatform#844
kateryna216 authored and GitHub committed Mar 3, 2022
1 parent b77c070 commit acbe264
Showing 10 changed files with 342 additions and 7 deletions.
58 changes: 57 additions & 1 deletion docs/src/main/asciidoc/vision.adoc
@@ -7,7 +7,7 @@ Spring Cloud GCP provides:

* A convenience starter which automatically configures authentication settings and client objects needed to begin using the https://cloud.google.com/vision/[Google Cloud Vision API].
* `CloudVisionTemplate` which simplifies interactions with the Cloud Vision API.
- ** Allows you to easily send images to the API as Spring Resources.
+ ** Allows you to easily send images, PDF, TIFF and GIF documents to the API as Spring Resources.
** Offers convenience methods for common operations, such as classifying content of an image.
* `DocumentOcrTemplate` which offers convenient methods for running https://cloud.google.com/vision/docs/pdf[optical character recognition (OCR)] on PDF and TIFF documents.
@@ -120,6 +120,62 @@ public void processImage() {
}
----

=== File Analysis

The `CloudVisionTemplate` allows you to easily analyze PDF, TIFF and GIF documents; it provides the following method for interfacing with Cloud Vision:

`public AnnotateFileResponse analyzeFile(Resource fileResource, String mimeType, Feature.Type... featureTypes)`

**Parameters:**

- `Resource fileResource` refers to the Spring Resource of the PDF, TIFF or GIF object you wish to analyze.
Documents with more than 5 pages are not supported.

- `String mimeType` is the mime type of the fileResource.
Currently, only `application/pdf`, `image/tiff` and `image/gif` are supported.

- `Feature.Type... featureTypes` refers to a var-arg array of Cloud Vision Features to extract from the document.
A feature refers to a kind of image analysis one wishes to perform on a document, such as label detection, OCR, facial detection, etc.
One may specify multiple features to analyze within one request.
A full list of Cloud Vision Features is provided in the https://cloud.google.com/vision/docs/features[Cloud Vision Feature docs].
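
Because only three MIME types are accepted, it can help to reject anything else before calling `analyzeFile`.
The following is a minimal sketch of such a pre-check; the `SupportedMimeTypes` class and its methods are hypothetical and not part of Spring Cloud GCP, and the exact, case-sensitive match is an assumption made for simplicity.

```java
import java.util.Set;

/** Hypothetical helper for illustration; not part of Spring Cloud GCP. */
final class SupportedMimeTypes {

  // The three MIME types listed above as supported by analyzeFile.
  private static final Set<String> SUPPORTED =
      Set.of("application/pdf", "image/tiff", "image/gif");

  private SupportedMimeTypes() {}

  /** Returns true only for an exact match against the supported list. */
  static boolean isSupported(String mimeType) {
    // Set.of(...) rejects null lookups, so guard against null first.
    return mimeType != null && SUPPORTED.contains(mimeType);
  }

  /** Returns the MIME type unchanged, or throws if it is not in the supported list. */
  static String requireSupported(String mimeType) {
    if (!isSupported(mimeType)) {
      throw new IllegalArgumentException("Unsupported MIME type: " + mimeType);
    }
    return mimeType;
  }
}
```

A caller could then pass `SupportedMimeTypes.requireSupported(mimeType)` as the second argument of `analyzeFile`, so that an unsupported type fails locally instead of in the request.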

**Returns:**

- https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileResponse[`AnnotateFileResponse`] contains the results of all the feature analyses that were specified in the request.
For each page of the analyzed document, the response contains an `AnnotateImageResponse` object, which you can retrieve using `annotateFileResponse.getResponsesList()`.
For each feature type that you provide in the request, `AnnotateImageResponse` provides a getter method to get the result of that feature analysis.
For example, if you analyzed a PDF using the `DOCUMENT_TEXT_DETECTION` feature, you would retrieve the results from the response using `annotateImageResponse.getFullTextAnnotation().getText()`.
+
`AnnotateFileResponse` is provided by the Google Cloud Vision libraries; please consult the https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.AnnotateFileResponse[RPC reference] or https://googleapis.dev/java/google-cloud-vision/latest/index.html?com/google/cloud/vision/v1/AnnotateFileResponse.html[Javadoc] for more details.
Additionally, you may consult the https://cloud.google.com/vision/docs/[Cloud Vision docs] to familiarize yourself with the concepts and features of the API.

==== Running Text Detection Example

https://cloud.google.com/vision/docs/file-small-batch[Detect text in files] refers to extracting text from small documents such as PDF or TIFF files.
Below is a code sample of how this is done using the Cloud Vision Spring Template.

[source,java]
----
@Autowired
private ResourceLoader resourceLoader;

@Autowired
private CloudVisionTemplate cloudVisionTemplate;

public void processPdf() {
  Resource imageResource = this.resourceLoader.getResource("my_file.pdf");
  AnnotateFileResponse response =
      this.cloudVisionTemplate.analyzeFile(
          imageResource, "application/pdf", Type.DOCUMENT_TEXT_DETECTION);

  response
      .getResponsesList()
      .forEach(
          annotateImageResponse ->
              System.out.println(annotateImageResponse.getFullTextAnnotation().getText()));
}
----
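
If the per-page text needs a page label, one option is a small formatting step after the call.
The sketch below assumes that the responses arrive in page order (the sample application's result page makes the same assumption); `PageTextFormatter` and `labelPages` are hypothetical names, not part of Spring Cloud GCP, and the input is a plain list of strings such as the text collected from each `AnnotateImageResponse`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Spring Cloud GCP. */
final class PageTextFormatter {

  private PageTextFormatter() {}

  /**
   * Prefixes each entry with a 1-based page number.
   * Assumption: the list is ordered by page, first page first.
   */
  static List<String> labelPages(List<String> pageTexts) {
    List<String> labeled = new ArrayList<>(pageTexts.size());
    for (int i = 0; i < pageTexts.size(); i++) {
      labeled.add("Page " + (i + 1) + ": " + pageTexts.get(i));
    }
    return labeled;
  }
}
```

Keeping this step separate from the Vision call means it can be tested without credentials or network access.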

=== Document OCR Template

The `DocumentOcrTemplate` allows you to easily run https://cloud.google.com/vision/docs/pdf[optical character recognition (OCR)] on your PDF and TIFF documents stored in your Google Storage bucket.
@@ -78,4 +78,15 @@ public ModelAndView extractText(String imageUrl, ModelMap map) {

    return new ModelAndView("result", map);
  }

  @GetMapping("/extractTextFromPdf")
  public ModelAndView extractTextFromPdf(String pdfUrl, ModelMap map) {
    List<String> texts =
        this.cloudVisionTemplate.extractTextFromPdf(this.resourceLoader.getResource(pdfUrl));

    map.addAttribute("texts", texts);
    map.addAttribute("pdfUrl", pdfUrl);

    return new ModelAndView("result_pdf", map);
  }
}
@@ -31,5 +31,17 @@ <h1>Text Extraction</h1>
</form>
</div>

<div>
<h1>Text Extraction PDF</h1>
<p>Read and extract the text from a small PDF (a maximum of 5 pages is supported):</p>
<form action="/extractTextFromPdf">
Web URL of a PDF to analyze:
<input type="text"
name="pdfUrl"
value="https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf" />
<input type="submit" />
</form>
</div>

</body>
</html>
@@ -0,0 +1,37 @@
<!DOCTYPE html>
<html xmlns:th="https://www.thymeleaf.org">
<head>
<title>Google Cloud Vision Results</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>

<style>
.extracted_text {
width: 30em;
min-height: 10em;
font-family: "Lucida Console", Monaco, monospace;
background-color: #f2f2f2;
}

.embed_pdf {
width: 30em;
height: 30em;
}
</style>

<body>
<h1>PDF Analysis Results</h1>

<div th:if="${texts}">
<h3>We think the text inside the pdf is...</h3>
<div th:each="text: ${texts}">
<h2>Page: [[${textStat.index} + 1]]</h2>
<div class="extracted_text">[[${text}]]</div>
</div>
</div>

<div>
<embed th:src="${pdfUrl}" class="embed_pdf" />
</div>
</body>
</html>
@@ -16,15 +16,20 @@

package com.google.cloud.spring.vision;

import com.google.cloud.vision.v1.AnnotateFileRequest;
import com.google.cloud.vision.v1.AnnotateFileResponse;
import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateFilesRequest;
import com.google.cloud.vision.v1.BatchAnnotateFilesResponse;
import com.google.cloud.vision.v1.BatchAnnotateImagesRequest;
import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.Feature.Type;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.ImageContext;
import com.google.cloud.vision.v1.InputConfig;
import com.google.protobuf.ByteString;
import com.google.rpc.Code;
import java.io.IOException;
@@ -41,6 +46,11 @@
 */
public class CloudVisionTemplate {

  public static final String READ_BYTES_ERROR_MESSAGE =
      "Failed to read bytes from provided resource.";
  public static final String EMPTY_RESPONSE_ERROR_MESSAGE =
      "Failed to receive valid response Vision APIs; empty response received.";

  private final ImageAnnotatorClient imageAnnotatorClient;

  public CloudVisionTemplate(ImageAnnotatorClient imageAnnotatorClient) {
@@ -59,6 +69,17 @@ public String extractTextFromImage(Resource imageResource) {
    return extractTextFromImage(imageResource, ImageContext.getDefaultInstance());
  }

  /**
   * Extract the text out of a PDF and return the result as a list of strings, one per page.
   *
   * @param fileResource the PDF one wishes to analyze
   * @return the text extracted from the PDF, as one string per page
   * @throws CloudVisionException if the file could not be read or if text extraction failed
   */
  public List<String> extractTextFromPdf(Resource fileResource) {
    return extractTextFromFile(fileResource, "application/pdf");
  }

  /**
   * Extract the text out of an image and return the result as a String.
   *
@@ -78,6 +99,35 @@ public String extractTextFromImage(Resource imageResource, ImageContext imageCon
    return result;
  }

  /**
   * Extract the text out of a file and return the result as a list of strings, one per page.
   *
   * @param fileResource the file one wishes to analyze
   * @param mimeType the mime type of the fileResource. Currently, only "application/pdf",
   *     "image/tiff" and "image/gif" are supported.
   * @return the text extracted from the file, as one string per page
   * @throws CloudVisionException if the file could not be read or if text extraction failed
   */
  public List<String> extractTextFromFile(Resource fileResource, String mimeType) {
    AnnotateFileResponse response =
        analyzeFile(fileResource, mimeType, Type.DOCUMENT_TEXT_DETECTION);

    List<AnnotateImageResponse> annotateImageResponses = response.getResponsesList();
    if (annotateImageResponses.isEmpty()) {
      throw new CloudVisionException(EMPTY_RESPONSE_ERROR_MESSAGE);
    }

    List<String> result =
        annotateImageResponses.stream()
            .map(annotateImageResponse -> annotateImageResponse.getFullTextAnnotation().getText())
            .collect(Collectors.toList());
    if (result.isEmpty() && response.getError().getCode() != Code.OK.getNumber()) {
      throw new CloudVisionException(response.getError().getMessage());
    }

    return result;
  }

  /**
   * Analyze an image and extract the features of the image specified by {@code featureTypes}.
   *
@@ -117,7 +167,7 @@ public AnnotateImageResponse analyzeImage(
    try {
      imgBytes = ByteString.readFrom(imageResource.getInputStream());
    } catch (IOException ex) {
-      throw new CloudVisionException("Failed to read image bytes from provided resource.", ex);
+      throw new CloudVisionException(READ_BYTES_ERROR_MESSAGE, ex);
    }

    Image image = Image.newBuilder().setContent(imgBytes).build();
@@ -143,8 +193,60 @@ public AnnotateImageResponse analyzeImage(
    if (!annotateImageResponses.isEmpty()) {
      return annotateImageResponses.get(0);
    } else {
-      throw new CloudVisionException(
-          "Failed to receive valid response Vision APIs; empty response received.");
+      throw new CloudVisionException(EMPTY_RESPONSE_ERROR_MESSAGE);
    }
  }

  /**
   * Analyze a file and extract the features of the file specified by {@code featureTypes}.
   *
   * <p>A feature describes the kind of Cloud Vision analysis one wishes to perform on a file, such
   * as text detection, image labelling, facial detection, etc. A full list of feature types can be
   * found in {@link Feature.Type}.
   *
   * @param fileResource the file one wishes to analyze. The Cloud Vision APIs support file formats
   *     described here: https://cloud.google.com/vision/docs/supported-files. Documents with more
   *     than 5 pages are not supported.
   * @param mimeType the mime type of the fileResource. Currently, only "application/pdf",
   *     "image/tiff" and "image/gif" are supported.
   * @param featureTypes the types of analysis to perform on the file
   * @return the results of the file analysis
   * @throws CloudVisionException if the file could not be read or if a malformed response is
   *     received from the Cloud Vision APIs
   */
  public AnnotateFileResponse analyzeFile(
      Resource fileResource, String mimeType, Feature.Type... featureTypes) {
    ByteString imgBytes;
    try {
      imgBytes = ByteString.readFrom(fileResource.getInputStream());
    } catch (IOException ex) {
      throw new CloudVisionException(READ_BYTES_ERROR_MESSAGE, ex);
    }

    InputConfig inputConfig =
        InputConfig.newBuilder().setMimeType(mimeType).setContent(imgBytes).build();

    List<Feature> featureList =
        Arrays.stream(featureTypes)
            .map(featureType -> Feature.newBuilder().setType(featureType).build())
            .collect(Collectors.toList());

    BatchAnnotateFilesRequest request =
        BatchAnnotateFilesRequest.newBuilder()
            .addRequests(
                AnnotateFileRequest.newBuilder()
                    .addAllFeatures(featureList)
                    .setInputConfig(inputConfig)
                    .build())
            .build();

    BatchAnnotateFilesResponse response = this.imageAnnotatorClient.batchAnnotateFiles(request);
    List<AnnotateFileResponse> annotateFileResponses = response.getResponsesList();

    if (!annotateFileResponses.isEmpty()) {
      return annotateFileResponses.get(0);
    } else {
      throw new CloudVisionException(EMPTY_RESPONSE_ERROR_MESSAGE);
    }
  }
}