Fix mask annotations in CVAT format (cvat-ai#5905)

### Motivation and context

Fixed cvat-ai#5828

- Fixed mask annotations in CVAT format
- Updated documentation

### How has this been tested?

### Checklist

- [x] I submit my changes into the `develop` branch
- [ ] I have added a description of my changes into the [CHANGELOG](https://github.com/opencv/cvat/blob/develop/CHANGELOG.md) file
- [ ] I have updated the documentation accordingly
- [ ] I have added tests to cover my changes
- [ ] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))
- [ ] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/opencv/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/opencv/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/opencv/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/opencv/cvat/tree/develop/cvat-ui#versioning))

### License

- [ ] I submit _my code changes_ under the same [MIT License](https://github.com/opencv/cvat/blob/develop/LICENSE) that covers the project. Feel free to contact the maintainers if that's a concern.

---------

Co-authored-by: Boris Sekachev <boris.sekachev@yandex.ru>
retailnext · Jul 1, 2023 · bcf5437
1 parent bba3d8e
commit bcf5437

Showing 3 changed files with 16 additions and 8 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -29,6 +29,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Invalid mask when running automatic annotation on a task (<https://github.com/opencv/cvat/pull/5883>)
 - Cloud storage content listing when the manifest name contains special characters
   (<https://github.com/opencv/cvat/pull/5873>)
+- Width and height in CVAT dataset format mask annotations (<https://github.com/opencv/cvat/pull/5905>)
 
 ### Security
 - TDB

diff --git a/cvat/apps/dataset_manager/formats/cvat.py b/cvat/apps/dataset_manager/formats/cvat.py
@@ -795,8 +795,8 @@ def dump_labeled_shapes(shapes, is_skeleton=False):
                         ("rle", f"{list(int (v) for v in shape.points[:-4])}"[1:-1]),
                         ("left", f"{int(shape.points[-4])}"),
                         ("top", f"{int(shape.points[-3])}"),
-                        ("width", f"{int(shape.points[-2] - shape.points[-4])}"),
-                        ("height", f"{int(shape.points[-1] - shape.points[-3])}"),
+                        ("width", f"{int(shape.points[-2] - shape.points[-4]) + 1}"),
+                        ("height", f"{int(shape.points[-1] - shape.points[-3]) + 1}"),
                     ]))
                 elif shape.type != 'skeleton':
                     dump_data.update(OrderedDict([
@@ -933,8 +933,8 @@ def dump_shape(shape, element_shapes=None, label=None):
                 ("rle", f"{list(int (v) for v in shape.points[:-4])}"[1:-1]),
                 ("left", f"{int(shape.points[-4])}"),
                 ("top", f"{int(shape.points[-3])}"),
-                ("width", f"{int(shape.points[-2] - shape.points[-4])}"),
-                ("height", f"{int(shape.points[-1] - shape.points[-3])}"),
+                ("width", f"{int(shape.points[-2] - shape.points[-4]) + 1}"),
+                ("height", f"{int(shape.points[-1] - shape.points[-3]) + 1}"),
             ]))
         elif shape.type == "cuboid":
             dump_data.update(OrderedDict([
@@ -1293,8 +1293,8 @@ def load_anno(file_object, annotations):
                     shape['points'] = el.attrib['rle'].split(',')
                     shape['points'].append(el.attrib['left'])
                     shape['points'].append(el.attrib['top'])
-                    shape['points'].append("{}".format(int(el.attrib['left']) + int(el.attrib['width'])))
-                    shape['points'].append("{}".format(int(el.attrib['top']) + int(el.attrib['height'])))
+                    shape['points'].append("{}".format(int(el.attrib['left']) + int(el.attrib['width']) - 1))
+                    shape['points'].append("{}".format(int(el.attrib['top']) + int(el.attrib['height']) - 1))
                 elif el.tag == 'cuboid':
                     shape['points'].append(el.attrib['xtl1'])
                     shape['points'].append(el.attrib['ytl1'])
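
The three hunks above appear to treat the last four values of `shape.points` as inclusive `left, top, right, bottom` pixel coordinates, so export adds 1 to get `width`/`height` and import subtracts 1 to recover the corners. The following is a minimal sketch of that round trip, not the project's code: `dump_mask_box` and `load_mask_box` are hypothetical helper names, and the inclusive-corner reading is an assumption inferred from the diff. The sample values are taken from the `<mask>` example this commit adds to `xml_format.md`.

```python
def dump_mask_box(points):
    """Export side (hypothetical helper): inclusive corners -> left/top/width/height."""
    left, top, right, bottom = (int(v) for v in points[-4:])
    return {
        "left": left,
        "top": top,
        "width": right - left + 1,    # inclusive corners, hence the +1
        "height": bottom - top + 1,
    }


def load_mask_box(attrib):
    """Import side (hypothetical helper): left/top/width/height -> inclusive corners."""
    left, top = int(attrib["left"]), int(attrib["top"])
    right = left + int(attrib["width"]) - 1    # inverse of the +1 on export
    bottom = top + int(attrib["height"]) - 1
    return [left, top, right, bottom]


# Attribute values from the documentation example added in this commit.
box = {"left": 707, "top": 888, "width": 13, "height": 15}
rle = [3, 5, 7, 7, 5, 9, 3, 11, 2, 11, 2, 12, 1, 12, 1, 26,
       1, 12, 1, 12, 2, 11, 3, 9, 5, 7, 7, 5, 3]

corners = load_mask_box(box)
round_trip = dump_mask_box(corners)

# Import followed by export returns the original attributes.
assert round_trip == box
# The run lengths add up to width * height for this example.
assert sum(rle) == box["width"] * box["height"]
```

With the pre-fix arithmetic (no `+ 1` on export, no `- 1` on import) the round trip is also self-consistent, but the stored `width` and `height` come out one pixel smaller than the box the runs cover, which matches the CHANGELOG entry about width and height in mask annotations.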

diff --git a/site/content/en/docs/manual/advanced/xml_format.md b/site/content/en/docs/manual/advanced/xml_format.md
@@ -38,7 +38,7 @@ In annotation mode each image tag has `width` and `height` attributes for the sa
       <labels>
         <label>
           <name>String: name of the label (e.g. car, person)</name>
-          <type>String: any, bbox, cuboid, cuboid_3d, ellipse, polygon, polyline, points, skeleton, tag</type>
+          <type>String: any, bbox, cuboid, cuboid_3d, ellipse, mask, polygon, polyline, points, skeleton, tag</type>
           <attributes>
             <attribute>
               <name>String: attribute name</name>
@@ -84,7 +84,8 @@ On each image it is possible to have many different objects. Each object can hav
 If an annotation task is created with `z_order` flag then each object will have `z_order` attribute which is used
 to draw objects properly when they are intersected (if `z_order` is bigger the object is closer to camera).
 In previous versions of the format only `box` shape was available.
-In later releases `polygon`, `polyline`, `points`, `skeletons` and `tags` were added. Please see below for more details:
+In later releases `mask`, `polygon`, `polyline`, `points`, `skeletons` and `tags` were added.
+Please see below for more details:
 
 ```xml
 <?xml version="1.0" encoding="utf-8"?>
@@ -128,6 +129,8 @@ In later releases `polygon`, `polyline`, `points`, `skeletons` and `tags` were a
       <attribute name="String: an attribute name">String: the attribute value</attribute>
       ...
     </skeleton>
+    <mask label="String: the associated label" source="manual or auto" occluded="Number: 0 - False, 1 - True" rle="RLE mask" left="Number: left coordinate of the image where the mask begins" top="Number: top coordinate of the image where the mask begins" width="Number: width of the mask" height="Number: height of the mask" z_order="Number: z-order of the object">
+    </mask>
     ...
   </image>
   ...
@@ -237,6 +240,8 @@ Example:
       <points label="3" occluded="0" source="manual" outside="0" points="125.87,62.85">
       </points>
     </skeleton>
+    <mask label="car" source="manual" occluded="0" rle="3, 5, 7, 7, 5, 9, 3, 11, 2, 11, 2, 12, 1, 12, 1, 26, 1, 12, 1, 12, 2, 11, 3, 9, 5, 7, 7, 5, 3" left="707" top="888" width="13" height="15" z_order="0">
+    </mask>
   </image>
 </annotations>
 ```
@@ -267,6 +272,8 @@ cloned for each location (a known redundancy).
     <points frame="Number: frame" points="x0,y0;x1,y1;..." outside="Number: 0 - False, 1 - True" occluded="Number: 0 - False, 1 - True" keyframe="Number: 0 - False, 1 - True">
       <attribute name="String: an attribute name">String: the attribute value</attribute>
     </points>
+    <mask frame="Number: frame" outside="Number: 0 - False, 1 - True" occluded="Number: 0 - False, 1 - True" rle="RLE mask" left="Number: left coordinate of the image where the mask begins" top="Number: top coordinate of the image where the mask begins" width="Number: width of the mask" height="Number: height of the mask" z_order="Number: z-order of the object">
+    </mask>
     ...
   </track>
   <track id="Number: id of the track (doesn't have any special meeting)" label="String: the associated label" source="manual or auto">