diff --git a/docs/api.md b/docs/api.md
index 4e295c8..59b9607 100644
--- a/docs/api.md
+++ b/docs/api.md
@@ -1,112 +1,165 @@
# API
## /entitymentions GET
+### Parameters
+| Parameter | Type | Description |
+|-----------|--------|------------------|
+| `article` | STRING | The article path |
-The `/entitymentions` endpoint is a **GET** endpoint. When doing a **GET** request to the endpoint, a JSON Array is returned containing all the currently known entitymentions, their indexes and the file they originate from. The format of the JSON array is formatted as follows:
+The `/entitymentions` endpoint is a **GET** endpoint. When doing a **GET** request to the endpoint with the `article` parameter, a JSON object is returned containing the file name and language of the article and, for each sentence, the currently known entity mentions with their indexes, type, label and IRI. The JSON object is formatted as follows:
```JSON
-[
- {
- "name": "ENTITY MENTION",
- "startIndex": INT,
- "endIndex":INT,
- "fileName":"FILENAME.EXTENSION"
- },
- {
- "name": "ENTITY MENTION",
- "startIndex": INT,
- "endIndex":INT,
- "fileName":"FILENAME.EXTENSION"
- }
-]
+{
+ "fileName": STRING,
+ "language": STRING,
+ "sentences": [
+ {
+ "sentence": STRING,
+ "sentenceStartIndex": INT,
+ "sentenceEndIndex": INT,
+ "entityMentions": [
+ {
+ "name": STRING,
+ "type": STRING,
+ "label": STRING,
+ "startIndex": INT,
+ "endIndex": INT,
+ "iri": STRING
+ }
+ ]
+ }
+ ]
+}
```
### Example Output
-Here is an example of an output from the endpoint. For simplification, only a single file has been processed by the Entity Recognizer and Linker, and just a few of the found entity mentions is shown below:
+Here is an example of an output from the endpoint `/entitymentions?article=test.txt`. For simplification, only a single file has been processed by the Entity Recognizer and Linker:
```JSON
-[
- {
- "name": "Martin Kjærs",
- "startIndex": 28,
- "endIndex": 40,
- "fileName": "Artikel.txt"
- },
- {
- "name": "Region Nordjylland",
- "startIndex": 100,
- "endIndex": 118,
- "fileName": "Artikel.txt"
- },
- {
- "name": "Aalborg",
- "startIndex": 285,
- "endIndex": 292,
- "fileName": "Artikel.txt"
- }
-]
+{
+ "fileName": "test.txt",
+ "language": "en",
+ "sentences": [
+ {
+ "sentence": "Hi my name is marc",
+ "sentenceStartIndex": 0,
+ "sentenceEndIndex": 47,
+ "entityMentions": [
+ {
+ "name": "marc",
+ "type": "Entity",
+ "label": "PERSON",
+ "startIndex": 14,
+ "endIndex": 18,
+ "iri": "knox-kb01.srv.aau.dk/marc"
+ }
+ ]
+ }
+ ]
+}
```
-## articlename/entities GET
+## /entitymentions/all GET
-The `articlename/entities` endpoint is a **GET** endpoint. The articlename in the url has to replaced with a name of an article including .txt. When doing a GET request to the endpoint, a JSON Array is returned containing the currently known entitymentions found in the given article name including their indexes. The format of the JSON array is formatted as follows:
+The `/entitymentions/all` endpoint is a **GET** endpoint. When doing a **GET** request to the endpoint, a JSON array is returned containing all articles with their currently known entity mentions. The JSON array is formatted as follows:
```JSON
[
{
- "name": "ENTITY MENTION",
- "startIndex": INT,
- "endIndex":INT,
- "fileName":"FILENAME.EXTENSION"
- },
- {
- "name": "ENTITY MENTION",
- "startIndex": INT,
- "endIndex":INT,
- "fileName":"FILENAME.EXTENSION"
+ "fileName": STRING,
+ "language": STRING,
+ "sentences": [
+ {
+ "sentence": STRING,
+ "sentenceStartIndex": INT,
+ "sentenceEndIndex": INT,
+ "entityMentions": [
+ {
+ "name": STRING,
+ "type": STRING,
+ "label": STRING,
+ "startIndex": INT,
+ "endIndex": INT,
+ "iri": STRING
+ }
+ ]
+ }
+ ]
}
]
```
### Example Output
-Here is an example of an output from the endpoint when getting for Artikel.txt/entities. For simplification, only a single file has been processed by the Entity Recognizer and Linker, and just a few of the found entity mentions is shown below:
+Here is an example of an output from the endpoint when getting all articles. For simplification, only two files have been processed by the Entity Recognizer and Linker:
```JSON
[
{
- "name": "Martin Kjærs",
- "startIndex": 28,
- "endIndex": 40,
- "fileName": "Artikel.txt"
+ "fileName": "test.txt",
+ "language": "en",
+ "sentences": [
+ {
+ "sentence": "Hi my name is marc",
+ "sentenceStartIndex": 0,
+ "sentenceEndIndex": 47,
+ "entityMentions": [
+ {
+ "name": "marc",
+ "type": "Entity",
+ "label": "PERSON",
+ "startIndex": 14,
+ "endIndex": 18,
+ "iri": "knox-kb01.srv.aau.dk/marc"
+ }
+ ]
+ }
+ ]
},
{
- "name": "Region Nordjylland",
- "startIndex": 100,
- "endIndex": 118,
- "fileName": "Artikel.txt"
- },
- {
- "name": "Aalborg",
- "startIndex": 285,
- "endIndex": 292,
- "fileName": "Artikel.txt"
+ "fileName": "test2.txt",
+ "language": "en",
+ "sentences": [
+ {
+ "sentence": "Hi my name is joe",
+ "sentenceStartIndex": 0,
+ "sentenceEndIndex": 47,
+ "entityMentions": [
+ {
+ "name": "joe",
+ "type": "Entity",
+ "label": "PERSON",
+ "startIndex": 14,
+ "endIndex": 17,
+ "iri": "knox-kb01.srv.aau.dk/joe"
+ }
+ ]
+ }
+ ]
}
]
```
-## detectlanguage POST
+## /detectlanguage POST
+This endpoint expects the given request body to contain some input text and returns its language. It uses the [langdetect](https://pypi.org/project/langdetect/) library.
-This endpoint will check the language in the given text.
-Send the text in the request body and it will return the language.
-The given text has to be longer than 4 characters.
-The function will return the lanugage in 2 charaters.
+> **_NOTE:_** The function returns the language as an [ISO 639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) code.
### Example
-Request body: The man was walking down the street
-Response: en
+Request body: "The man was walking down the street"\
+Response: en
+
+
+### Constraints
+- The given text has to be longer than 4 characters.
+
+### Supported languages
+`af, ar, bg, bn, ca, cs, cy, da, de, el, en, es, et, fa, fi, fr, gu, he,
+hi, hr, hu, id, it, ja, kn, ko, lt, lv, mk, ml, mr, ne, nl, no, pa, pl,
+pt, ro, ru, sk, sl, so, sq, sv, sw, ta, te, th, tl, tr, uk, ur, vi, zh-cn, zh-tw`
+> **_NOTE:_** See [List of ISO 639-1 codes](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) for more information.
diff --git a/docs/directorywatcher.md b/docs/directorywatcher.md
new file mode 100644
index 0000000..ef1cc1f
--- /dev/null
+++ b/docs/directorywatcher.md
@@ -0,0 +1,60 @@
+# [Directory Watcher](https://github.com/Knox-AAU/PreProcessingLayer_EntityRecognitionAndLinking/blob/main/lib/DirectoryWatcher.py)
+The pipeline starts when a new file is placed in a watched folder by pipeline part A. The Directory Watcher's responsibility is to call a callback function when a new file is created in the watched folder.
+
+## Features
+- [watchdog](https://pypi.org/project/watchdog/) for file events
+- Async callback support
+- [Threading](https://docs.python.org/3/library/threading.html)
+
+## Overview
+
+The `DirectoryWatcher` provides a simple way to monitor a specified directory for file creation events and execute asynchronous callbacks in response. It uses the [watchdog](https://pypi.org/project/watchdog/) library for filesystem monitoring and integrates with [asyncio](https://docs.python.org/3/library/asyncio.html) for handling asynchronous tasks. Furthermore, the `DirectoryWatcher` uses [threading](https://docs.python.org/3/library/threading.html).
+
+> **_NOTE:_** [Threading](https://docs.python.org/3/library/threading.html) is used so that watching the directory does not block the main thread.
+
+
+## Example usage
+```python
+# Importing (FastAPI is only needed for the event hooks further down)
+from fastapi import FastAPI
+from lib.DirectoryWatcher import DirectoryWatcher
+
+app = FastAPI()
+dirPath = "some/path/to/a/directory"
+
+# Setup: the async callback receives the path of the newly created file
+async def newFileCreated(file_path: str):
+    print("New file created: " + file_path)
+
+
+dirWatcher = DirectoryWatcher(directory=dirPath, async_callback=newFileCreated)
+
+# A FastAPI event function running on startup
+@app.on_event("startup")
+async def startEvent():
+    dirWatcher.start_watching()
+
+# A FastAPI event function running on shutdown
+@app.on_event("shutdown")
+def shutdown_event():
+    dirWatcher.stop_watching()
+```
+
+> **_NOTE:_** The FastAPI event functions are not required to use the `DirectoryWatcher`; they only show one way to start and stop it.
+
+
+## Methods
+```python
+def __init__(self, directory, async_callback):
+```
+### Parameters:
+- **directory** (str): A path to the directory you want to watch, e.g. `some/path/to/a/directory`
+- **async_callback** (function): An async callback function to be called when a new file is created in the **directory**. This function should accept a single parameter, which is the path of the created file.
+
+```python
+def start_watching(self) -> threading.Thread:
+```
+
+```python
+def stop_watching(self):
+```