Add documentation for auto-annnotation in SDK/CLI (#6611)
Documentation for #6483.
SpecLad authored Aug 8, 2023
1 parent a8e921b commit c68cb07
Showing 4 changed files with 269 additions and 2 deletions.
1 change: 1 addition & 0 deletions cvat-sdk/README.md
@@ -8,6 +8,7 @@ The SDK API includes several layers:
- Server API wrappers (`ApiClient`). Located at `cvat_sdk.api_client`.
- High-level tools (`Core`). Located at `cvat_sdk.core`.
- PyTorch adapter. Located at `cvat_sdk.pytorch`.
- Auto-annotation support. Located at `cvat_sdk.auto_annotation`.

Package documentation is available [here](https://opencv.github.io/cvat/docs/api_sdk/sdk).

44 changes: 42 additions & 2 deletions site/content/en/docs/api_sdk/cli/_index.md
@@ -39,12 +39,12 @@ You can get help with `cvat-cli --help`.
```
usage: cvat-cli [-h] [--version] [--insecure] [--auth USER:[PASS]] [--server-host SERVER_HOST]
[--server-port SERVER_PORT] [--organization SLUG] [--debug]
{create,delete,ls,frames,dump,upload,export,import} ...
{create,delete,ls,frames,dump,upload,export,import,auto-annotate} ...
Perform common operations related to CVAT tasks.
positional arguments:
{create,delete,ls,frames,dump,upload,export,import}
{create,delete,ls,frames,dump,upload,export,import,auto-annotate}
options:
-h, --help show this help message and exit
```

@@ -230,3 +230,43 @@ by using the [label constructor](/docs/manual/basics/creating_an_annotation_task
```bash
cvat-cli import task_backup.zip
```

### Auto-annotate

This command provides a command-line interface
to the [auto-annotation API](/docs/api_sdk/sdk/auto-annotation).
To use it, create a Python module that implements the AA function protocol.

In other words, this module must define the required attributes on the module level.
For example:

```python
import cvat_sdk.auto_annotation as cvataa

spec = cvataa.DetectionFunctionSpec(...)

def detect(context, image):
    ...
```

- Annotate the task with id 137 with the predefined YOLOv8 function:
```bash
cvat-cli auto-annotate 137 --function-module cvat_sdk.auto_annotation.functions.yolov8n
```

- Annotate the task with id 138 with an AA function defined in `my_func.py`:
```bash
cvat-cli auto-annotate 138 --function-file path/to/my_func.py
```

Note that this command does not modify the Python module search path.
If your function module needs to import other local modules,
you must add your module directory to the search path
if it isn't there already.

- Annotate the task with id 139 with a function defined in the `my_func` module
located in the `my-project` directory,
letting it import other modules from that directory:
```bash
PYTHONPATH=path/to/my-project cvat-cli auto-annotate 139 --function-module my_func
```
6 changes: 6 additions & 0 deletions site/content/en/docs/api_sdk/sdk/_index.md
@@ -15,6 +15,7 @@ SDK API includes several layers:
- Low-level API with REST API wrappers. Located at `cvat_sdk.api_client`. [Read more](/docs/api_sdk/sdk/lowlevel-api)
- High-level API. Located at `cvat_sdk.core`. [Read more](/docs/api_sdk/sdk/highlevel-api)
- PyTorch adapter. Located at `cvat_sdk.pytorch`. [Read more](/docs/api_sdk/sdk/pytorch-adapter)
- Auto-annotation API. Located at `cvat_sdk.auto_annotation`. [Read more](/docs/api_sdk/sdk/auto-annotation)

In general, the low-level API provides single-request operations, while the high-level one
implements composite, multi-request operations, and provides local proxies for server objects.
@@ -25,6 +26,11 @@ The PyTorch adapter is a specialized layer
that represents datasets stored in CVAT as PyTorch `Dataset` objects.
This enables direct use of such datasets in PyTorch-based machine learning pipelines.

The auto-annotation API is a specialized layer
that lets you automatically annotate CVAT datasets
by running a custom function on the local machine.
See also the `auto-annotate` command in the CLI.

## Installation

To install an [official release of CVAT SDK](https://pypi.org/project/cvat-sdk/) use this command:
220 changes: 220 additions & 0 deletions site/content/en/docs/api_sdk/sdk/auto-annotation.md
@@ -0,0 +1,220 @@
---
title: 'Auto-annotation API'
linkTitle: 'Auto-annotation API'
weight: 6
---

## Overview

This layer provides functionality that allows you to automatically annotate a CVAT dataset
by running a custom function on your local machine.
A function, in this context, is a Python object that implements a particular protocol
defined by this layer.
To avoid confusion with Python functions,
auto-annotation functions will be referred to as "AA functions" in the following text.
A typical AA function will be based on a machine learning model
and consist of the following basic elements:

- Code to load the ML model.

- A specification describing the annotations that the AA function can produce.

- Code to convert data from CVAT to a format the ML model can understand.

- Code to run the ML model.

- Code to convert resulting annotations to a format CVAT can understand.

The layer can be divided into several parts:

- The interface, containing the protocol that an AA function must implement.

- The driver, containing functionality to annotate a CVAT dataset using an AA function.

- The predefined AA function based on Ultralytics YOLOv8n.

The `auto-annotate` CLI command provides a way to use an AA function from the command line
rather than from a Python program.
See [the CLI documentation](/docs/api_sdk/cli/) for details.

## Example

```python
from typing import List
import PIL.Image

import torchvision.models

from cvat_sdk import make_client
import cvat_sdk.models as models
import cvat_sdk.auto_annotation as cvataa

class TorchvisionDetectionFunction:
    def __init__(self, model_name: str, weights_name: str, **kwargs) -> None:
        # load the ML model
        weights_enum = torchvision.models.get_model_weights(model_name)
        self._weights = weights_enum[weights_name]
        self._transforms = self._weights.transforms()
        self._model = torchvision.models.get_model(model_name, weights=self._weights, **kwargs)
        self._model.eval()

    @property
    def spec(self) -> cvataa.DetectionFunctionSpec:
        # describe the annotations
        return cvataa.DetectionFunctionSpec(
            labels=[
                cvataa.label_spec(cat, i)
                for i, cat in enumerate(self._weights.meta['categories'])
            ]
        )

    def detect(self, context, image: PIL.Image.Image) -> List[models.LabeledShapeRequest]:
        # convert the input into a form the model can understand
        transformed_image = [self._transforms(image)]

        # run the ML model
        results = self._model(transformed_image)

        # convert the results into a form CVAT can understand
        return [
            cvataa.rectangle(label.item(), [x.item() for x in box])
            for result in results
            for box, label in zip(result['boxes'], result['labels'])
        ]

# log into the CVAT server
with make_client(host="localhost", credentials=("user", "password")) as client:
    # annotate task 41617 using Faster R-CNN
    cvataa.annotate_task(client, 41617,
        TorchvisionDetectionFunction("fasterrcnn_resnet50_fpn_v2", "DEFAULT", box_score_thresh=0.5),
    )
```

## Auto-annotation interface

Currently, the only type of AA function supported by this layer is the detection function.
Therefore, all of the following information will pertain to detection functions.

A detection function accepts an image and returns a list of shapes found in that image.
When it is applied to a dataset, the AA function is run for every image,
and the resulting lists of shapes are combined and uploaded to CVAT.
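
The per-image loop described above can be pictured with a small, self-contained sketch. It does not use the SDK: `run_on_all_frames` and `fake_detect` are hypothetical names, the context object is a plain namespace standing in for `DetectionFunctionContext`, and the upload step is left out. It only illustrates "call `detect` once per image, then merge the lists".

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable, List, Tuple


def run_on_all_frames(
    detect: Callable[[Any, Any], List[Any]],
    frames: Iterable[Tuple[str, Any]],
) -> List[Any]:
    """Call `detect` once per (frame_name, image) pair and merge the results."""
    all_shapes: List[Any] = []
    for frame_name, image in frames:
        # Stand-in for the real context object, which exposes `frame_name`.
        context = SimpleNamespace(frame_name=frame_name)
        all_shapes.extend(detect(context, image))
    return all_shapes


def fake_detect(context: Any, image: Any) -> List[str]:
    # Hypothetical detector: reports one placeholder "shape" per image.
    return [f"shape from {context.frame_name}"]


combined = run_on_all_frames(fake_detect, [("a.jpg", None), ("b.jpg", None)])
print(combined)
```

In the real driver the merged list is what gets uploaded to CVAT; how frames are fetched and how shapes are tied to frames is handled by the SDK and is not modeled here.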

A detection function must have two attributes, `spec` and `detect`.

`spec` must contain the AA function's specification,
which is an instance of `DetectionFunctionSpec`.

`DetectionFunctionSpec` must be initialized with a sequence of `PatchedLabelRequest` objects
that represent the labels that the AA function knows about.
See the docstring of `DetectionFunctionSpec` for more information on the constraints
that these objects must follow.

`detect` must be a function/method accepting two parameters:

- `context` (`DetectionFunctionContext`).
Contains information about the current image.
Currently `DetectionFunctionContext` only contains a single field, `frame_name`,
which contains the file name of the frame on the CVAT server.

- `image` (`PIL.Image.Image`).
Contains image data.

`detect` must return a list of `LabeledShapeRequest` objects,
representing shapes found in the image.
See the docstring of `DetectionFunctionSpec` for more information on the constraints
that these objects must follow.
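
Because the requirement is structural (any object with a `spec` attribute and a `detect` callable qualifies), both class instances and modules can serve as AA functions. The sketch below illustrates that idea with the standard library only. `DetectionFunctionLike` is a hypothetical name used for illustration, the string specs are placeholders for a real `DetectionFunctionSpec`, and a `SimpleNamespace` stands in for a module.

```python
from types import SimpleNamespace
from typing import Any, List, Protocol, runtime_checkable


@runtime_checkable
class DetectionFunctionLike(Protocol):
    """Hypothetical structural type: anything with `spec` and `detect`."""

    spec: Any

    def detect(self, context: Any, image: Any) -> List[Any]: ...


class ClassBasedFunction:
    spec = "placeholder spec"  # a DetectionFunctionSpec in real code

    def detect(self, context: Any, image: Any) -> List[Any]:
        return []


# A namespace stands in for a module with module-level `spec` and `detect`.
module_like = SimpleNamespace(
    spec="placeholder spec",
    detect=lambda context, image: [],
)

class_ok = isinstance(ClassBasedFunction(), DetectionFunctionLike)
module_ok = isinstance(module_like, DetectionFunctionLike)
# An object that lacks `detect` does not satisfy the protocol.
incomplete_ok = isinstance(SimpleNamespace(spec="only a spec"), DetectionFunctionLike)
```

Note that this check only looks at attribute presence; it says nothing about whether `spec` and the returned shapes satisfy the constraints mentioned above.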

The same AA function may be used with any dataset that contains labels with the same names
as the labels in the AA function's specification.
The way it works is that the driver matches labels between the spec and the dataset,
and replaces the label IDs in the shape objects with those defined in the dataset.

For example, suppose the AA function's spec defines the following labels:

| Name | ID |
|-------|----|
| `bat` | 0 |
| `rat` | 1 |

And the dataset defines the following labels:

| Name | ID |
|-------|-----|
| `bat` | 100 |
| `cat` | 101 |
| `rat` | 102 |

Then suppose `detect` returns a shape with `label_id` equal to 1.
The driver will see that it refers to the `rat` label, and replace it with 102,
since that's the ID this label has in the dataset.

The same logic is used for sub-label IDs.
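
The matching step in this example can be sketched in a few lines of plain Python. This is an illustration of the idea, not the driver's actual code: `build_label_id_mapping` is a hypothetical helper, and labels and shapes are modeled as plain dictionaries rather than SDK model objects.

```python
from typing import Dict


def build_label_id_mapping(
    spec_labels: Dict[str, int],
    dataset_labels: Dict[str, int],
) -> Dict[int, int]:
    """Map each spec label ID to the ID of the same-named dataset label."""
    return {
        spec_id: dataset_labels[name]
        for name, spec_id in spec_labels.items()
        if name in dataset_labels
    }


# The labels from the tables above, as name -> ID dictionaries.
spec_labels = {"bat": 0, "rat": 1}
dataset_labels = {"bat": 100, "cat": 101, "rat": 102}

mapping = build_label_id_mapping(spec_labels, dataset_labels)

# A shape returned by `detect`, using the spec's ID for `rat`.
shape = {"label_id": 1}
shape["label_id"] = mapping[shape["label_id"]]
print(shape)
```

The dataset's `cat` label has no counterpart in the spec, so it simply never appears in the mapping.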

### Helper factory functions

The CVAT API model types used in the AA function protocol are somewhat unwieldy to work with,
so it's recommended to use the helper factory functions provided by this layer.
These helpers instantiate an object of their corresponding model type,
passing their arguments to the model constructor
and sometimes setting some attributes to fixed values.

The following helpers are available for building specifications:

| Name | Model type | Fixed attributes |
|-----------------------|-----------------------|-------------------|
| `label_spec` | `PatchedLabelRequest` | - |
| `skeleton_label_spec` | `PatchedLabelRequest` | `type="skeleton"` |
| `keypoint_spec` | `SublabelRequest` | - |

The following helpers are available for use in `detect`:

| Name | Model type | Fixed attributes |
|-------------|--------------------------|-------------------------------|
| `shape` | `LabeledShapeRequest` | `frame=0` |
| `rectangle` | `LabeledShapeRequest` | `frame=0`, `type="rectangle"` |
| `skeleton` | `LabeledShapeRequest` | `frame=0`, `type="skeleton"` |
| `keypoint` | `SubLabeledShapeRequest` | `frame=0`, `type="points"` |
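
To make the "fixed attributes" columns concrete, here is a minimal sketch of the factory pattern these helpers follow. It deliberately avoids the SDK: `ShapeStandIn` is a hypothetical stand-in for `LabeledShapeRequest`, and the two functions only mimic the behavior listed in the table (fix some attributes, forward the rest to the constructor). The real helpers may differ in detail.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class ShapeStandIn:
    """Hypothetical stand-in for `LabeledShapeRequest`; not the real model."""

    label_id: int
    frame: int
    type: str
    points: List[float] = field(default_factory=list)


def shape(label_id: int, **kwargs: Any) -> ShapeStandIn:
    # Fixes `frame=0` and forwards everything else to the constructor.
    return ShapeStandIn(label_id=label_id, frame=0, **kwargs)


def rectangle(label_id: int, points: List[float], **kwargs: Any) -> ShapeStandIn:
    # Additionally fixes `type="rectangle"`.
    return shape(label_id, type="rectangle", points=points, **kwargs)


box = rectangle(1, [10.0, 20.0, 110.0, 220.0])
print(box)
```

With helpers of this kind, `detect` only has to supply the label ID and coordinates, as the `cvataa.rectangle(...)` call in the example at the top of this page does.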

## Auto-annotation driver

The `annotate_task` function uses an AA function to annotate a CVAT task.
It must be called as follows:

```python
annotate_task(<client>, <task ID>, <AA function>, <optional arguments...>)
```

The supplied client will be used to make all API calls.

By default, new annotations will be appended to the old ones.
Use `clear_existing=True` to remove old annotations instead.

If a detection function declares a label that has no matching label in the task,
then by default, `BadFunctionError` is raised, and auto-annotation is aborted.
If you use `allow_unmatched_labels=True`, then such labels will be ignored,
and any shapes referring to them will be dropped.
The same logic applies to sub-labels.

`annotate_task` will raise a `BadFunctionError` exception
if it detects that the function violated the AA function protocol.
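
A call that combines these options might look like the sketch below. It assumes a reachable CVAT server and an installed SDK; `reannotate_task` is a hypothetical wrapper, and the keyword spelling `allow_unmatched_labels` and the availability of `BadFunctionError` under `cvat_sdk.auto_annotation` reflect the SDK as far as can be told from its public interface, so check them against your installed version.

```python
from typing import Any, Tuple


def reannotate_task(
    host: str,
    credentials: Tuple[str, str],
    task_id: int,
    function: Any,
) -> bool:
    """Replace a task's annotations; return False if the AA function is rejected."""
    # Imported inside the function so this sketch can be loaded
    # on a machine where the SDK is not installed.
    from cvat_sdk import make_client
    import cvat_sdk.auto_annotation as cvataa

    with make_client(host=host, credentials=credentials) as client:
        try:
            cvataa.annotate_task(
                client,
                task_id,
                function,
                clear_existing=True,  # remove old annotations first
                allow_unmatched_labels=True,  # skip labels the task lacks
            )
        except cvataa.BadFunctionError as exc:
            print(f"AA function violated the protocol: {exc}")
            return False
    return True
```

Catching `BadFunctionError` separately from other exceptions keeps protocol violations in the AA function distinct from network or server errors, which are left to propagate.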

## Predefined AA function

This layer includes a predefined AA function based on the Ultralytics YOLOv8n model.
You can use this AA function as-is, or use it as a base on which to build your own.

To use this function, you have to install CVAT SDK with the `ultralytics` extra:

```console
$ pip install "cvat-sdk[ultralytics]"
```

The AA function is implemented as a module
in order to be compatible with the `cvat-cli auto-annotate` command.
Simply import `cvat_sdk.auto_annotation.functions.yolov8n`
and use the module itself as a function:

```python
from cvat_sdk.auto_annotation import annotate_task
import cvat_sdk.auto_annotation.functions.yolov8n as yolov8n
annotate_task(<client>, <task ID>, yolov8n)
```
