Entities Integrity URL #1348

Merged
merged 5 commits into from
Feb 19, 2025
86 changes: 86 additions & 0 deletions docs/api.yaml
Original file line number Diff line number Diff line change
@@ -46,6 +46,7 @@ info:
- [RESTORE](/central-api-entity-management/#restoring-a-deleted-entity) endpoint for Entities.
- Entities that have been soft-deleted for 30 days will automatically be purged.
- [Entities Odata](/central-api-odata-endpoints/#id3) now returns `__system/deletedAt`. It can also be used in $filter, $sort and $select query parameters.
- [Integrity](/central-api-openrosa-endpoints/#openrosa-dataset-integrity-api) endpoint for the Entity list.

## ODK Central v2024.3

@@ -9800,6 +9801,12 @@ paths:
* The Manifest will only output information for files the server actually has in its possession. Any missing expected files will be omitted, as we cannot provide a `hash` or `downloadUrl` for them.

* For Attachments that are linked to a Dataset, the value of `hash` is calculated using the MD5 of the last updated timestamp of the Dataset, instead of the content of the Dataset.

[Offline Entities support](https://forum.getodk.org/t/openrosa-spec-proposal-support-offline-entities/48052):

* If an attachment is linked to a Dataset, then the `type="entityList"` attribute is added to the `mediaFile` element.

* `integrityUrl` is also returned for attachments that are linked to a Dataset.
operationId: OpenRosa Form Manifest API
parameters:
- name: projectId
@@ -9844,6 +9851,12 @@ paths:
<hash>md5:a6fdc426037143cf71cced68e2532e3c</hash>
<downloadUrl>https://your.odk.server/v1/projects/7/forms/basic/attachments/question2.jpg</downloadUrl>
</mediaFile>
<mediaFile type="entityList">
<filename>people.csv</filename>
<hash>md5:9fd39ac868eccdc0c134b3b7a6a25eb7</hash>
<downloadUrl>https://your.odk.server/v1/projects/7/forms/basic/attachments/people.csv</downloadUrl>
<integrityUrl>https://your.odk.server/v1/projects/7/datasets/people/integrity</integrityUrl>
</mediaFile>
</manifest>
403:
description: Forbidden
@@ -9857,6 +9870,79 @@ paths:
<OpenRosaResponse xmlns="http://openrosa.org/http/response" items="0">
<message nature="error">The authenticated actor does not have rights to perform that action.</message>
</OpenRosaResponse>
/v1/projects/{projectId}/datasets/{name}/integrity?id={UUIDs}:
get:
tags:
- OpenRosa Endpoints
summary: OpenRosa Dataset Integrity API
description: |-
_(introduced: version 2025.1)_

This is the fully standards-compliant implementation of the Entities Integrity API as described in [OpenRosa spec proposal: support offline Entities](https://forum.getodk.org/t/openrosa-spec-proposal-support-offline-entities/48052).

This returns the `deleted` flag of the Entities requested through the `id` query parameter. If no `id` is provided, all Entities are returned.
operationId: OpenRosa Dataset Integrity API
parameters:
- name: projectId
in: path
description: The numeric ID of the Project
required: true
schema:
type: number
example: "7"
- name: name
in: path
description: The `name` of the dataset being referenced.
required: true
schema:
type: string
example: people
- name: id
in: query
description: Comma-separated UUIDs of the Entities to check. Optional; when omitted, all Entities are returned.
required: false
schema:
type: string
example: 6fdfa3b6-64fb-46cf-b98c-c92b57f914b1,97717278-2bf8-4565-88b2-711c88d66e75
- name: X-OpenRosa-Version
in: header
description: e.g. 1.0
schema:
type: string
example: "1.0"
responses:
200:
description: OK
headers:
X-OpenRosa-Version:
schema:
type: string
content:
text/xml:
example: |
<?xml version="1.0" encoding="UTF-8"?>
<data>
<entities>
<entity id="6fdfa3b6-64fb-46cf-b98c-c92b57f914b1">
<deleted>true</deleted>
</entity>
<entity id="97717278-2bf8-4565-88b2-711c88d66e75">
<deleted>false</deleted>
</entity>
</entities>
</data>
403:
description: Forbidden
headers:
X-OpenRosa-Version:
schema:
type: string
content:
text/xml:
example: |
<OpenRosaResponse xmlns="http://openrosa.org/http/response" items="0">
<message nature="error">The authenticated actor does not have rights to perform that action.</message>
</OpenRosaResponse>
/v1/test/{token}/projects/{projectId}/forms/{xmlFormId}/draft/formList:
get:
tags:
Expand Down
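A minimal client-side sketch of how the documented Integrity endpoint could be consumed. The function names `buildIntegrityRequestUrl` and `parseIntegrityResponse` are hypothetical and not part of this PR. The regex parsing is only meant for the simple response shape shown in the example above; a real client would likely use a proper XML parser.

```javascript
// Hypothetical helper: append the optional `id` filter to an integrityUrl
// taken from the manifest's <integrityUrl> element.
function buildIntegrityRequestUrl(integrityUrl, uuids) {
  // Without an `id` filter the endpoint is documented to report every Entity.
  if (uuids.length === 0) return integrityUrl;
  return `${integrityUrl}?id=${uuids.map((uuid) => encodeURIComponent(uuid)).join(',')}`;
}

// Hypothetical helper: turn a response shaped like the documented example
// into a Map of uuid -> deleted flag. Dependency-free on purpose.
function parseIntegrityResponse(xml) {
  const result = new Map();
  const pattern = /<entity id="([^"]+)">\s*<deleted>(true|false)<\/deleted>\s*<\/entity>/g;
  for (const match of xml.matchAll(pattern)) {
    result.set(match[1], match[2] === 'true');
  }
  return result;
}
```

A client could then drop every locally cached Entity whose UUID maps to `true`.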
22 changes: 20 additions & 2 deletions lib/formats/openrosa.js
@@ -12,6 +12,7 @@

const { mergeRight } = require('ramda');
const { parse, render } = require('mustache');
const { attachmentToDatasetName } = require('../util/util');

////////////////////////////////////////////////////////////////////////////////
// SETUP
@@ -66,6 +67,9 @@ const formManifestTemplate = template(200, `<?xml version="1.0" encoding="UTF-8"
<filename>{{name}}</filename>
<hash>md5:{{openRosaHash}}</hash>
<downloadUrl>{{{domain}}}{{{basePath}}}/attachments/{{urlName}}</downloadUrl>
{{#integrityUrl}}
<integrityUrl>{{{integrityUrl}}}</integrityUrl>
{{/integrityUrl}}
</mediaFile>
{{/hasSource}}
{{/attachments}}
@@ -77,7 +81,10 @@ const formManifest = (data) => formManifestTemplate(mergeRight(data, {
attachment.with({
hasSource: attachment.blobId || attachment.datasetId,
urlName: encodeURIComponent(attachment.name),
isDataset: attachment.datasetId != null
isDataset: attachment.datasetId != null,
integrityUrl: attachment.datasetId ?
`${data.domain}${data.projectPath}/datasets/${encodeURIComponent(attachmentToDatasetName(attachment.name))}/integrity`
: null
}))
}));

@@ -87,5 +94,16 @@ const openRosaErrorTemplate = openRosaMessageBase('error');
parse(openRosaErrorTemplate);
const openRosaError = (message) => render(openRosaErrorTemplate, { message });

module.exports = { createdMessage, formList, formManifest, openRosaError };
const entityListTemplate = template(200, `<?xml version="1.0" encoding="UTF-8"?>
<data>
<entities>
{{#entities}}
<entity id="{{uuid}}">
<deleted>{{deleted}}</deleted>
</entity>
{{/entities}}
</entities>
</data>`);
const entityList = (data) => entityListTemplate(data);
module.exports = { createdMessage, formList, formManifest, openRosaError, entityList };
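To illustrate the output shape of the new `entityList` template without depending on Mustache, here is a plain-JavaScript sketch. `escapeXml` and `renderEntityList` are hypothetical names, and the escaping is only an approximation of what a double-brace Mustache tag does; it is not the code used in this PR.

```javascript
// Hypothetical minimal escaping for values placed in XML attributes or text.
const escapeXml = (value) => String(value)
  .replace(/&/g, '&amp;')
  .replace(/</g, '&lt;')
  .replace(/>/g, '&gt;')
  .replace(/"/g, '&quot;');

// Hypothetical renderer producing the same element layout as the template above.
function renderEntityList(entities) {
  const items = entities.map(({ uuid, deleted }) => [
    `    <entity id="${escapeXml(uuid)}">`,
    `      <deleted>${deleted ? 'true' : 'false'}</deleted>`,
    '    </entity>'
  ].join('\n'));
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<data>',
    '  <entities>',
    ...items,
    '  </entities>',
    '</data>'
  ].join('\n');
}
```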

47 changes: 34 additions & 13 deletions lib/model/query/datasets.js
@@ -426,21 +426,21 @@ const getPublishedBySimilarName = (projectId, name) => ({ maybeOne }) => {
////////////////////////////////////////////////////////////////////////////////
// DATASET METADATA GETTERS

const _getLinkedForms = (datasetName, projectId) => sql`
SELECT DISTINCT f."xmlFormId", coalesce(current_def.name, f."xmlFormId") "name" FROM form_attachments fa
JOIN form_defs fd ON fd.id = fa."formDefId" AND fd."publishedAt" IS NOT NULL
JOIN forms f ON f.id = fd."formId" AND f."deletedAt" IS NULL
JOIN form_defs current_def ON f."currentDefId" = current_def.id
JOIN datasets ds ON ds.id = fa."datasetId"
WHERE ds.name = ${datasetName}
AND ds."projectId" = ${projectId}
AND ds."publishedAt" IS NOT NULL
`;

// Gets the dataset information, properties (including which forms each property comes from),
// and which forms consume the dataset via CSV attachment.
const getMetadata = (dataset) => async ({ all, Datasets }) => {

const _getLinkedForms = (datasetName, projectId) => sql`
SELECT DISTINCT f."xmlFormId", coalesce(current_def.name, f."xmlFormId") "name" FROM form_attachments fa
JOIN form_defs fd ON fd.id = fa."formDefId" AND fd."publishedAt" IS NOT NULL
JOIN forms f ON f.id = fd."formId" AND f."deletedAt" IS NULL
JOIN form_defs current_def ON f."currentDefId" = current_def.id
JOIN datasets ds ON ds.id = fa."datasetId"
WHERE ds.name = ${datasetName}
AND ds."projectId" = ${projectId}
AND ds."publishedAt" IS NOT NULL
`;

const _getSourceForms = (datasetName, projectId) => sql`
SELECT DISTINCT f."xmlFormId", coalesce(fd.name, f."xmlFormId") "name" FROM datasets ds
JOIN dataset_form_defs dfd ON ds.id = dfd."datasetId"
@@ -489,7 +489,6 @@ const getMetadata = (dataset) => async ({ all, Datasets }) => {
};
};


////////////////////////////////////////////////////////////////////////////
// DATASET PROPERTY GETTERS

@@ -665,6 +664,28 @@ const getLastUpdateTimestamp = (datasetId) => ({ maybeOne }) =>
.then((t) => t.orNull())
.then((t) => (t ? t.loggedAt : null));


const canReadForOpenRosa = (auth, datasetName, projectId) => ({ oneFirst }) => oneFirst(sql`
WITH linked_forms AS (
${_getLinkedForms(datasetName, projectId)}
)
SELECT count(1) FROM linked_forms
INNER JOIN (
SELECT forms."xmlFormId" FROM forms
INNER JOIN projects ON projects.id=forms."projectId"
INNER JOIN (
SELECT "acteeId" FROM assignments
INNER JOIN (
SELECT id FROM roles WHERE verbs ? 'form.read' OR verbs ? 'open_form.read'
) AS role ON role.id=assignments."roleId"
WHERE "actorId"=${auth.actor.map((actor) => actor.id).orElse(-1)}
) AS assignment ON assignment."acteeId" IN ('*', 'form', projects."acteeId", forms."acteeId")
WHERE forms.state != 'closed'
GROUP BY forms."xmlFormId"
) AS users_forms ON users_forms."xmlFormId" = linked_forms."xmlFormId"
`)
.then(count => count > 0);

module.exports = {
createPublishedDataset, createPublishedProperty,
createOrMerge, publishIfExists,
@@ -674,5 +695,5 @@ module.exports = {
getProperties, getFieldsByFormDefId,
getDiff, update, countUnprocessedSubmissions,
getUnprocessedSubmissions,
getLastUpdateTimestamp
getLastUpdateTimestamp, canReadForOpenRosa
};
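The access rule encoded in the `canReadForOpenRosa` query can be easier to follow as an in-memory sketch. Everything below is illustrative: the data shapes, the `projectActeeId` field, and the function `canReadLinkedForm` are assumptions made for this example and simplify what the SQL does.

```javascript
// Hypothetical in-memory version of the rule: the actor may read the Entity
// list if at least one non-closed Form linked to the Dataset is covered by a
// role assignment that grants `form.read` or `open_form.read`.
function canReadLinkedForm({ actorId, linkedFormIds, forms, assignments, roles }) {
  const readerRoleIds = new Set(roles
    .filter((role) => role.verbs.includes('form.read') || role.verbs.includes('open_form.read'))
    .map((role) => role.id));

  // Actees on which this actor holds one of the reader roles.
  const grantedActees = new Set(assignments
    .filter((a) => a.actorId === actorId && readerRoleIds.has(a.roleId))
    .map((a) => a.acteeId));

  return forms.some((form) =>
    linkedFormIds.includes(form.xmlFormId) &&
    form.state !== 'closed' &&
    ['*', 'form', form.projectActeeId, form.acteeId].some((actee) => grantedActees.has(actee)));
}
```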
30 changes: 29 additions & 1 deletion lib/model/query/entities.js
@@ -964,6 +964,34 @@ const purge = (force = false, projectId = null, datasetName = null, entityUuid =
SELECT COUNT(*) FROM deleted_entities`);
};

////////////////////////////////////////////////////////////////////////////////
// INTEGRITY CHECK

const idFilter = (options) => {
const query = options.ifArg('id', ids => sql`uuid IN (${sql.join(ids.split(',').map(id => sql`${id.trim()}`), sql`, `)})`);
return query.sql ? query : sql`TRUE`;
};

const _getAllEntitiesState = (datasetId, options) => sql`
SELECT uuid, "deletedAt" IS NOT NULL as deleted
FROM entities
WHERE "datasetId" = ${datasetId}
AND ${idFilter(options)}
UNION
SELECT uuid, deleted FROM (
SELECT jsonb_array_elements_text(details -> 'entityUuids') AS uuid, TRUE as deleted
FROM audits
JOIN datasets ON datasets."acteeId" = audits."acteeId"
WHERE action = 'entity.purge'
AND datasets.id = ${datasetId}
) purged
WHERE ${idFilter(options)}
-- union with not approved
`;

const getEntitiesState = (datasetId, options = QueryOptions.none) =>
({ all }) => all(_getAllEntitiesState(datasetId, options));

module.exports = {
createNew, _processSubmissionEvent,
createSource,
@@ -980,5 +1008,5 @@ module.exports = {
countByDatasetId, getById, getDef,
getAll, getAllDefs, del,
createEntitiesFromPendingSubmissions,
resolveConflict, restore, purge
resolveConflict, restore, purge, getEntitiesState
};
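As a rough in-memory model of what `_getAllEntitiesState` computes: existing Entities report whether `deletedAt` is set, purged UUIDs recovered from `entity.purge` audit details are reported as deleted, and the optional comma-separated `id` filter applies to both halves of the union. The names `parseIdFilter` and `entitiesState` and the input shapes are hypothetical; treating a missing `id` as "no filter" follows the `TRUE` fallback in `idFilter`.

```javascript
// Hypothetical: split the comma-separated `id` argument, trimming whitespace.
// Returns null when no filter was supplied.
function parseIdFilter(idArg) {
  if (idArg == null) return null;
  return new Set(idArg.split(',').map((id) => id.trim()));
}

// Hypothetical in-memory equivalent of the UNION query.
function entitiesState(entities, purgedUuids, idArg) {
  const wanted = parseIdFilter(idArg);
  const keep = (uuid) => wanted === null || wanted.has(uuid);
  const state = new Map();
  // Live and soft-deleted rows still present in the entities table.
  for (const { uuid, deletedAt } of entities) {
    if (keep(uuid)) state.set(uuid, deletedAt != null);
  }
  // Purged rows only survive as UUIDs in the audit log; always deleted.
  for (const uuid of purgedUuids) {
    if (keep(uuid)) state.set(uuid, true);
  }
  return [...state].map(([uuid, deleted]) => ({ uuid, deleted }));
}
```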
19 changes: 18 additions & 1 deletion lib/resources/datasets.js
@@ -8,14 +8,15 @@
// except according to the terms contained in the LICENSE file.

const sanitize = require('sanitize-filename');
const { getOrNotFound } = require('../util/promise');
const { getOrNotFound, reject } = require('../util/promise');
const { streamEntityCsv } = require('../data/entity');
const { validateDatasetName, validatePropertyName } = require('../data/dataset');
const { contentDisposition, success, withEtag } = require('../util/http');
const { md5sum } = require('../util/crypto');
const { Dataset } = require('../model/frames');
const Problem = require('../util/problem');
const { QueryOptions } = require('../util/db');
const { entityList } = require('../formats/openrosa');

module.exports = (service, endpoint) => {
service.get('/projects/:id/datasets', endpoint(({ Projects, Datasets }, { auth, params, queryOptions }) =>
@@ -102,4 +103,20 @@ module.exports = (service, endpoint) => {

return withEtag(serverEtag, csv);
}));

service.get('/projects/:projectId/datasets/:name/integrity', endpoint.openRosa(async ({ Datasets, Entities }, { params, auth, queryOptions }) => {
const dataset = await Datasets.get(params.projectId, params.name, true).then(getOrNotFound);

// Anyone with the verb `entity.list` or anyone with read access on a Form
// that consumes this dataset can call this endpoint.
const canAccessEntityList = await auth.can('entity.list', dataset);
if (!canAccessEntityList) {
await Datasets.canReadForOpenRosa(auth, params.name, params.projectId)
.then(canAccess => canAccess || reject(Problem.user.insufficientRights()));
}

const entities = await Entities.getEntitiesState(dataset.id, queryOptions.allowArgs('id'));

return entityList({ entities });
}));
};
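The authorization order in the new route (try `entity.list` first, fall back to linked-form read access, otherwise reject) can be sketched in isolation. `authorizeIntegrityRead` and its two callback parameters are hypothetical stand-ins for `auth.can` and `Datasets.canReadForOpenRosa`; the point is only that the more expensive fallback runs when the first check fails.

```javascript
// Hypothetical sketch of the two-step check used by the integrity route.
// Both arguments are async predicates supplied by the caller.
async function authorizeIntegrityRead({ canListEntities, canReadLinkedForm }) {
  // Cheap check first: an explicit `entity.list` grant on the Dataset.
  if (await canListEntities()) return 'entity.list';
  // Fallback: read access to any Form that consumes the Dataset.
  if (await canReadLinkedForm()) return 'linked-form';
  const error = new Error('The authenticated actor does not have rights to perform that action.');
  error.status = 403;
  throw error;
}
```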
10 changes: 7 additions & 3 deletions lib/resources/forms.js
@@ -19,7 +19,7 @@ const { sanitizeFieldsForOdata, setVersion } = require('../data/schema');
const { getOrNotFound, reject, resolve, rejectIf } = require('../util/promise');
const { success } = require('../util/http');
const { formList, formManifest } = require('../formats/openrosa');
const { noargs, isPresent, isBlank } = require('../util/util');
const { noargs, isPresent, isBlank, attachmentToDatasetName } = require('../util/util');
const { streamEntityCsvAttachment } = require('../data/entity');
const { md5sum } = require('../util/crypto');

@@ -226,7 +226,7 @@ module.exports = (service, endpoint) => {
.then(getOrNotFound)
.then((form) => auth.canOrReject('form.update', form))
.then((form) => Promise.all([
Datasets.get(params.projectId, params.name.replace(/\.csv$/i, ''), true)
Datasets.get(params.projectId, attachmentToDatasetName(params.name), true)
.then(getOrNotFound)
.then((dataset) => auth.canOrReject('entity.list', dataset)),
FormAttachments.getByFormDefIdAndName(form.draftDefId, params.name).then(getOrNotFound)
@@ -293,7 +293,11 @@ module.exports = (service, endpoint) => {
.then((form) => canReadForm(auth, form))
.then((form) => FormAttachments.getAllByFormDefIdForOpenRosa(form.def.id)
.then((attachments) =>
formManifest({ attachments, basePath: path.resolve(originalUrl, '..'), domain: env.domain })))));
formManifest({ attachments,
basePath: path.resolve(originalUrl, '..'),
domain: env.domain,
projectPath: originalUrl.match(/^\/v1\/(.*\/)?projects\/\d+/)[0] }
)))));

////////////////////////////////////////
// READ-ONLY ATTACHMENT ENDPOINTS
Expand Down
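To see what the `projectPath` regular expression in the manifest handler extracts, here is a small standalone sketch. The wrapper `projectPathOf` is hypothetical, and it returns `null` where the PR code would throw on a non-matching URL (the route itself presumably guarantees a match).

```javascript
// Same pattern as in the diff above: an optional prefix segment (such as a
// draft test token) followed by /projects/<numeric id>.
const PROJECT_PATH = /^\/v1\/(.*\/)?projects\/\d+/;

// Hypothetical wrapper that tolerates URLs the pattern does not match.
function projectPathOf(originalUrl) {
  const match = originalUrl.match(PROJECT_PATH);
  return match === null ? null : match[0];
}
```

For `/v1/projects/7/forms/basic/manifest` this yields `/v1/projects/7`, which is then joined with `/datasets/{name}/integrity`.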
3 changes: 2 additions & 1 deletion lib/util/util.js
@@ -80,12 +80,13 @@ function utf8ToBase64(string) {
// so let's just make our own.
const construct = (Type) => (x, y) => new Type(x, y);

const attachmentToDatasetName = (attachmentName) => attachmentName.replace(/\.csv$/i, '');

module.exports = {
noop, noargs,
isBlank, isPresent, blankStringToNull, sanitizeOdataIdentifier,
printPairs, without, pickAll,
base64ToUtf8, utf8ToBase64,
construct
construct, attachmentToDatasetName
};
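A short sketch of how `attachmentToDatasetName` behaves on a few inputs, and how it feeds into the manifest's `integrityUrl`. The helper is copied from the diff above; `integrityUrlFor` is a hypothetical wrapper that mirrors the template string in `formManifest`.

```javascript
// Copied from lib/util/util.js in this PR: strip one trailing ".csv",
// case-insensitively.
const attachmentToDatasetName = (attachmentName) => attachmentName.replace(/\.csv$/i, '');

// Hypothetical wrapper mirroring how formManifest assembles the URL.
function integrityUrlFor(domain, projectPath, attachmentName) {
  const datasetName = encodeURIComponent(attachmentToDatasetName(attachmentName));
  return `${domain}${projectPath}/datasets/${datasetName}/integrity`;
}
```

Only the final extension is removed, so `a.csv.csv` maps to `a.csv` and `people.csv.bak` is left unchanged.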

3 changes: 2 additions & 1 deletion package-lock.json

3 changes: 2 additions & 1 deletion package.json
@@ -68,6 +68,7 @@
"should": "~13",
"streamtest": "~1.2",
"supertest": "^6.3.3",
"tmp": "~0.2"
"tmp": "~0.2",
"xml2js": "^0.5.0"
}
}