Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FormRecognizer] Fix an issue accessing undefined object #8748

Merged
merged 4 commits into from
May 15, 2020
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions sdk/formrecognizer/ai-form-recognizer/.vscode/launch.json
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
"<node_internals>/**"
],
"program": "${file}",
"cwd": "${fileDirname}",
"outFiles": [
"${workspaceFolder}/dist/**/*.js"
]
Expand Down
2 changes: 2 additions & 0 deletions sdk/formrecognizer/ai-form-recognizer/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@

## 1.0.0-preview.3 (Unreleased)

- Blank pages in receipt recognition are now handled properly.

## 1.0.0-preview.2 (2020-05-06)

- `FormTrainingClient.listModels()` now works correctly on NodeJs v8.
Expand Down

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -580,7 +580,10 @@ export interface USReceiptItem {
}

// @public (undocumented)
export type USReceiptType = "unrecognized" | "itemized" | "creditCard" | "gas" | "parking";
export type USReceiptType = {
type: "unrecognized" | "itemized" | "creditCard" | "gas" | "parking";
confidence?: number;
};

// @public (undocumented)
export type ValueTypes = "string" | "date" | "time" | "phoneNumber" | "number" | "integer" | "array" | "object";
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ async function main() {

const usReceipt = response.receipts[0];
console.log("First receipt:");
console.log(`Receipt type: ${usReceipt.receiptType}`);
console.log(`Receipt type: ${usReceipt.receiptType.type} with confidence ${usReceipt.receiptType.confidence}`);
console.log(
`Merchant Name: ${usReceipt.merchantName.value} (confidence: ${usReceipt.merchantName.confidence})`
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ async function main() {

const usReceipt = response.receipts[0];
console.log("First receipt:");
console.log(`Receipt type: ${usReceipt.receiptType}`);
console.log(`Receipt type: ${usReceipt.receiptType.type} with confidence ${usReceipt.receiptType.confidence}`);
console.log(
`Merchant Name: ${usReceipt.merchantName.value} (confidence: ${usReceipt.merchantName.confidence})`
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ export async function main() {

const usReceipt = response.receipts[0];
console.log("First receipt:");
console.log(`Receipt type: ${usReceipt.receiptType}`);
console.log(`Receipt type: ${usReceipt.receiptType.type} with confidence ${usReceipt.receiptType.confidence}`);
console.log(
`Merchant Name: ${usReceipt.merchantName.value} (confidence: ${usReceipt.merchantName.confidence})`
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ export async function main() {

const usReceipt = response.receipts[0];
console.log("First receipt:");
console.log(`Receipt type: ${usReceipt.receiptType}`);
console.log(`Receipt type: ${usReceipt.receiptType.type} with confidence ${usReceipt.receiptType.confidence}`);
console.log(
`Merchant Name: ${usReceipt.merchantName.value} (confidence: ${usReceipt.merchantName.confidence})`
);
Expand Down
8 changes: 7 additions & 1 deletion sdk/formrecognizer/ai-form-recognizer/src/models.ts
Original file line number Diff line number Diff line change
Expand Up @@ -504,7 +504,13 @@ export interface USReceiptItem {
totalPrice?: FormField;
}

export type USReceiptType = "unrecognized" | "itemized" | "creditCard" | "gas" | "parking";
export type USReceiptType = {
type: "unrecognized" | "itemized" | "creditCard" | "gas" | "parking";
/**
* Confidence value.
*/
confidence?: number;
}

/**
* United States receipt
Expand Down
24 changes: 17 additions & 7 deletions sdk/formrecognizer/ai-form-recognizer/src/transforms.ts
Original file line number Diff line number Diff line change
Expand Up @@ -397,16 +397,20 @@ function toRecognizedReceipt(result: DocumentResultModel, pages: FormPage[]): Re
const form = toRecognizedForm(result, pages);
return {
recognizedForm: form,
locale: undefined
locale: undefined // in the future service would return locale info
};
}

function toReceiptType(type: FormField): USReceiptType {
if (type.valueType === "string" && type.value === "Itemized") {
return "itemized";
} else {
return "unrecognized";
if (type.valueType === "string") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get a little bit spooked out by this block. The type signature suggests that "gas" and "parking" are also valid values of type, and it feels like we're doing too much to transpose FormField into USReceiptType here. Is there any reason we couldn't return type.valueType in all cases where it's defined and then use "unrecognized" otherwise?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the confidence field on receipt type, it will be included in another PR so we will be returning `{ confidence: 1, type: "itemized" }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh it's in this PR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the valueType for a ReceiptType field should be "string" and the value of the field type.value is the receipt category, which is capitalized and I want to transform it into the camelCase that we normally use in TypeScript.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are fine with capitalized string literals for receipt types then yes we can return them directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm returning type.value which has a string type as stronger receipt type might not be valid here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like TypeScript compiler can correctly narrow down the to the valid string literals. I refactored this to use a switch statement. Hopefully it reads better.

if (type.value === "Itemized") {
return { confidence: type.confidence, type: "itemized" };
} else if (type.value === "CreditCard") {
return { confidence: type.confidence, type: "creditCard" };
}
}

return { confidence: type.confidence, type: "unrecognized"};
}

function toUSReceiptItems(items: ReceiptItemArrayField): USReceiptItem[] {
Expand Down Expand Up @@ -470,7 +474,7 @@ function toUSReceipt(receipt: RecognizedReceipt): ReceiptWithLocale {
return {
locale: "US",
recognizedForm: receipt.recognizedForm,
items: toUSReceiptItems((form.fields["Items"] as unknown) as ReceiptItemArrayField),
items: form.fields["Items"] ? toUSReceiptItems((form.fields["Items"] as unknown) as ReceiptItemArrayField) : [],
merchantAddress: form.fields["MerchantAddress"],
merchantName: form.fields["MerchantName"],
merchantPhoneNumber: form.fields["MerchantPhoneNumber"],
Expand Down Expand Up @@ -506,11 +510,17 @@ export function toReceiptResultResponse(
return common;
}

if (!result.analyzeResult) {
throw new Error("Expecting valid analyzeResult from the service response")
}

const pages = result.analyzeResult!.readResults.map(toFormPage);
return {
...common,
version: result.analyzeResult!.version,
receipts: result.analyzeResult!.documentResults!.map((d) => {
receipts: result.analyzeResult!.documentResults!.filter(d => {
return !!d.fields
}).map((d) => {
const receipt = toRecognizedReceipt(d, pages);
return toReceiptWithLocale({ ...receipt, locale: "US" }); // default to US until service returns locale info.
})
Expand Down
Binary file not shown.
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ describe("FormRecognizerClient NodeJS only", () => {
const usReceipt = response!.receipts![0];
assert.equal(usReceipt.recognizedForm.formType, "prebuilt:receipt");
assert.equal(usReceipt.locale, "US"); // default to "US" for now
assert.equal(usReceipt.receiptType, "itemized");
assert.equal(usReceipt.receiptType.type, "itemized");
assert.equal(usReceipt.locale, "US");
assert.ok(usReceipt.tax, "Expecting valid 'tax' field");
assert.equal(usReceipt.tax!.name, "Tax");
Expand Down Expand Up @@ -185,4 +185,27 @@ describe("FormRecognizerClient NodeJS only", () => {
const usReceipt = response!.receipts![0];
assert.equal(usReceipt.recognizedForm.formType, "prebuilt:receipt");
});

it("recognizes multi-page receipt with blank page", async () => {
const filePath = path.join(ASSET_PATH, "receipt", "multipage_invoice1.pdf");
const stream = fs.createReadStream(filePath);

const poller = await client.beginRecognizeReceipts(stream, "application/pdf", {
includeTextDetails: true
});
await poller.pollUntilDone();
const response = poller.getResult();

assert.ok(response, "Expect valid response object");
assert.equal(response!.status, "succeeded");
assert.ok(
response!.receipts && response!.receipts.length > 0,
`Expect no-empty pages but got ${response!.receipts}`
);
const usReceipt = response!.receipts![0];
assert.equal(usReceipt.recognizedForm.formType, "prebuilt:receipt");
assert.equal(usReceipt.locale, "US"); // default to "US" for now
assert.equal(usReceipt.receiptType.type, "itemized");
assert.equal(usReceipt.locale, "US");
});
}).timeout(60000);
22 changes: 19 additions & 3 deletions sdk/formrecognizer/ai-form-recognizer/test/transforms.spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -14,15 +14,17 @@ import {
toFormTable,
toRecognizeFormResultResponse,
toReceiptResultResponse,
toFormModelResponse
toFormModelResponse,
toRecognizedForm
} from "../src/transforms";
import {
GeneratedClientGetAnalyzeFormResultResponse as GetAnalyzeFormResultResponse,
GeneratedClientGetAnalyzeReceiptResultResponse as GetAnalyzeReceiptResultResponse,
GeneratedClientGetCustomModelResponse as GetCustomModelResponse,
ReadResult as ReadResultModel,
FieldValue as FieldValueModel,
DataTable as DataTableModel
DataTable as DataTableModel,
DocumentResult as DocumentResultModel
} from "../src/generated/models";
import {
StringFieldValue,
Expand Down Expand Up @@ -456,6 +458,19 @@ describe("Transforms", () => {
assert.equal(transformed.rows[2].cells[0].columnSpan, 2);
});

it("toRecognizedForm() should handle empty page", () => {
const original: DocumentResultModel = {
docType: "prebuilt:receipt",
pageRange: [1, 1],
fields: {}
}

const transformed = toRecognizedForm(original, formPages);

assert.ok(transformed, "Expected valid recognized form");
assert.deepStrictEqual(transformed.fields, {}, "expected empty fields in recognzied form");
});

it("toRecognizeFormResultResponse() converts unsupervised response into recognized forms", () => {
const original: GetAnalyzeFormResultResponse = JSON.parse(unsupervisedResponseString);
const transformed = toRecognizeFormResultResponse(original);
Expand Down Expand Up @@ -554,7 +569,8 @@ describe("Transforms", () => {
const receiptResult = toReceiptResultResponse(original);
const usReceipt = receiptResult.receipts![0];

assert.equal(usReceipt.receiptType, "itemized");
assert.equal(usReceipt.receiptType.confidence, 0.692);
assert.equal(usReceipt.receiptType.type, "itemized");
assert.equal(usReceipt.locale, "US");
assert.ok(usReceipt.tax, "Expecting valid 'tax' field");
assert.equal(usReceipt.tax!.name, "Tax");
Expand Down