diff --git a/docs/api.md b/docs/api.md
index 4e295c8..59b9607 100644
--- a/docs/api.md
+++ b/docs/api.md
@@ -1,112 +1,165 @@
 # API
 ## /entitymentions GET
+### Parameters
+| Parameter | Type | Description |
+|-----------|--------|------------------|
+| `article` | STRING | The article path |
-The `/entitymentions` endpoint is a **GET** endpoint. When doing a **GET** request to the endpoint, a JSON Array is returned containing all the currently known entitymentions, their indexes and the file they originate from. The format of the JSON array is formatted as follows:
+The `/entitymentions` endpoint is a **GET** endpoint. When doing a **GET** request to the endpoint, a JSON object is returned containing all the currently known entitymentions, their indexes, type, label, iri and the file they originate from. The JSON object is formatted as follows:
 ```JSON
-[
-  {
-    "name": "ENTITY MENTION",
-    "startIndex": INT,
-    "endIndex":INT,
-    "fileName":"FILENAME.EXTENSION"
-  },
-  {
-    "name": "ENTITY MENTION",
-    "startIndex": INT,
-    "endIndex":INT,
-    "fileName":"FILENAME.EXTENSION"
-  }
-]
+{
+  "fileName": STRING,
+  "language": STRING,
+  "sentences": [
+    {
+      "sentence": STRING,
+      "sentenceStartIndex": INT,
+      "sentenceEndIndex": INT,
+      "entityMentions": [
+        {
+          "name": STRING,
+          "type": STRING,
+          "label": STRING,
+          "startIndex": INT,
+          "endIndex": INT,
+          "iri": STRING
+        }
+      ]
+    }
+  ]
+}
 ```
 ### Example Output
-Here is an example of an output from the endpoint. For simplification, only a single file has been processed by the Entity Recognizer and Linker, and just a few of the found entity mentions is shown below:
+Here is an example of an output from the endpoint `/entitymentions?article=test.txt`. For simplification, only a single file has been processed by the Entity Recognizer and Linker:
 ```JSON
-[
-  {
-    "name": "Martin Kjærs",
-    "startIndex": 28,
-    "endIndex": 40,
-    "fileName": "Artikel.txt"
-  },
-  {
-    "name": "Region Nordjylland",
-    "startIndex": 100,
-    "endIndex": 118,
-    "fileName": "Artikel.txt"
-  },
-  {
-    "name": "Aalborg",
-    "startIndex": 285,
-    "endIndex": 292,
-    "fileName": "Artikel.txt"
-  }
-]
+{
+  "fileName": "test.txt",
+  "language": "en",
+  "sentences": [
+    {
+      "sentence": "Hi my name is marc",
+      "sentenceStartIndex": 0,
+      "sentenceEndIndex": 47,
+      "entityMentions": [
+        {
+          "name": "marc",
+          "type": "Entity",
+          "label": "GPE",
+          "startIndex": 14,
+          "endIndex": 18,
+          "iri": "knox-kb01.srv.aau.dk/marc"
+        }
+      ]
+    }
+  ]
+}
 ```
-## articlename/entities GET
+## /entitymentions/all GET
-The `articlename/entities` endpoint is a **GET** endpoint. The articlename in the url has to replaced with a name of an article including .txt. When doing a GET request to the endpoint, a JSON Array is returned containing the currently known entitymentions found in the given article name including their indexes. The format of the JSON array is formatted as follows:
+The `/entitymentions/all` endpoint is a **GET** endpoint. When doing a **GET** request to the endpoint, a JSON array is returned containing all articles with their currently known entitymentions. The JSON array is formatted as follows:
 ```JSON
 [
   {
-    "name": "ENTITY MENTION",
-    "startIndex": INT,
-    "endIndex":INT,
-    "fileName":"FILENAME.EXTENSION"
-  },
-  {
-    "name": "ENTITY MENTION",
-    "startIndex": INT,
-    "endIndex":INT,
-    "fileName":"FILENAME.EXTENSION"
+    "fileName": STRING,
+    "language": STRING,
+    "sentences": [
+      {
+        "sentence": STRING,
+        "sentenceStartIndex": INT,
+        "sentenceEndIndex": INT,
+        "entityMentions": [
+          {
+            "name": STRING,
+            "type": STRING,
+            "label": STRING,
+            "startIndex": INT,
+            "endIndex": INT,
+            "iri": STRING
+          }
+        ]
+      }
+    ]
   }
 ]
 ```
 ### Example Output
-Here is an example of an output from the endpoint when getting for Artikel.txt/entities. For simplification, only a single file has been processed by the Entity Recognizer and Linker, and just a few of the found entity mentions is shown below:
+Here is an example of an output from the endpoint when getting all articles. For simplification, only two files have been processed by the Entity Recognizer and Linker:
 ```JSON
 [
   {
-    "name": "Martin Kjærs",
-    "startIndex": 28,
-    "endIndex": 40,
-    "fileName": "Artikel.txt"
+    "fileName": "test.txt",
+    "language": "en",
+    "sentences": [
+      {
+        "sentence": "Hi my name is marc",
+        "sentenceStartIndex": 0,
+        "sentenceEndIndex": 47,
+        "entityMentions": [
+          {
+            "name": "marc",
+            "type": "Entity",
+            "label": "PERSON",
+            "startIndex": 14,
+            "endIndex": 18,
+            "iri": "knox-kb01.srv.aau.dk/marc"
+          }
+        ]
+      }
+    ]
   },
   {
-    "name": "Region Nordjylland",
-    "startIndex": 100,
-    "endIndex": 118,
-    "fileName": "Artikel.txt"
-  },
-  {
-    "name": "Aalborg",
-    "startIndex": 285,
-    "endIndex": 292,
-    "fileName": "Artikel.txt"
+    "fileName": "test2.txt",
+    "language": "en",
+    "sentences": [
+      {
+        "sentence": "Hi my name is joe",
+        "sentenceStartIndex": 0,
+        "sentenceEndIndex": 47,
+        "entityMentions": [
+          {
+            "name": "Joe",
+            "type": "Entity",
+            "label": "PERSON",
+            "startIndex": 14,
+            "endIndex": 17,
+            "iri": "knox-kb01.srv.aau.dk/joe"
+          }
+        ]
+      }
+    ]
   }
 ]
 ```
-## detectlanguage POST
+## /detectlanguage POST
+This endpoint expects the given request body to contain some input text and returns its language. It uses the [langdetect](https://pypi.org/project/langdetect/) library.
-This endpoint will check the language in the given text.
-Send the text in the request body and it will return the language.
-The given text has to be longer than 4 characters.
-The function will return the lanugage in 2 charaters.
+> **_NOTE:_** The function will return the language as an [ISO 639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) code.
 ### Example
-Request body: The man was walking down the street
-Response: en
+Request body: "The man was walking down the street"\
+Response: en
+
+
+### Constraints
+- The given text has to be longer than 4 characters.
+
+### Supported languages
+`af, ar, bg, bn, ca, cs, cy, da, de, el, en, es, et, fa, fi, fr, gu, he,
+hi, hr, hu, id, it, ja, kn, ko, lt, lv, mk, ml, mr, ne, nl, no, pa, pl,
+pt, ro, ru, sk, sl, so, sq, sv, sw, ta, te, th, tl, tr, uk, ur, vi, zh-cn, zh-tw`
+> **_NOTE:_** See [List of ISO 639-1 codes](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) for more information.
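
Review note: the `/detectlanguage` contract documented here (text longer than 4 characters, a language code from the supported list in the response) could be checked on the client side roughly as sketched below. This is only an illustration under assumptions: the helper names `build_detect_language_request` and `is_supported_language`, the placeholder host `http://localhost:8000`, and the choice to send the text as a plain UTF-8 body are not taken from the documentation, and the real service may expect a different content type.

```python
from urllib.request import Request

# Language codes copied from the "Supported languages" list in the docs.
SUPPORTED_LANGUAGES = frozenset(
    "af ar bg bn ca cs cy da de el en es et fa fi fr gu he "
    "hi hr hu id it ja kn ko lt lv mk ml mr ne nl no pa pl "
    "pt ro ru sk sl so sq sv sw ta te th tl tr uk ur vi zh-cn zh-tw".split()
)

# "Longer than 4 characters" means at least 5 characters.
MIN_TEXT_LENGTH = 5


def build_detect_language_request(base_url: str, text: str) -> Request:
    """Build (but do not send) a POST request for /detectlanguage.

    Assumption: the text is sent as a plain UTF-8 request body. The
    documentation does not state the expected content type.
    """
    if len(text) < MIN_TEXT_LENGTH:
        raise ValueError("text has to be longer than 4 characters")
    return Request(
        url=base_url.rstrip("/") + "/detectlanguage",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def is_supported_language(code: str) -> bool:
    """Return True if the code appears in the documented language list."""
    return code.strip().lower() in SUPPORTED_LANGUAGES


if __name__ == "__main__":
    # Hypothetical host; replace with the actual deployment address.
    request = build_detect_language_request(
        "http://localhost:8000", "The man was walking down the street"
    )
    print(request.get_method(), request.full_url)
    print(is_supported_language("en"))
```

The request object can be passed to `urllib.request.urlopen` once the real host and body format are known; rejecting short input before sending avoids a round trip for text the endpoint is documented to refuse.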