Missing objects when dumping in COCO format #1941
Comments
I just realized some of the objects were polylines instead of polygons. This is probably the cause of this issue. It would be nice if a warning were issued in such cases. Is there a way to convert polylines to polygons within CVAT?
I encountered this issue as well, though without the polylines problem. I was supposed to raise this issue after discussing it on Gitter, but never got around to it. We discovered it when annotators labeled a whole set of objects of one class as another class, so we wrote a quick script to parse the dumped COCO JSON, replace the label, and re-upload it. After re-uploading, we noticed that some objects were missing and some were on entirely different frames. We then repeated the process without modifying anything to make sure we hadn't done anything wrong. Dumping COCO, re-uploading, and dumping again produced two different outputs, so something in Datumaro is not handling COCO well at the moment.
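The relabeling script mentioned in this comment was not posted. A minimal sketch of that kind of script, assuming the standard COCO layout (`categories` entries with `id` and `name`, `annotations` entries with `category_id`), might look like the following. The function names and the file paths passed to them are hypothetical.

```python
import json


def relabel_category(coco, old_name, new_name):
    """Move every annotation labeled `old_name` to `new_name`, in place.

    Returns the number of annotations that were changed.
    """
    name_to_id = {cat["name"]: cat["id"] for cat in coco["categories"]}
    old_id, new_id = name_to_id[old_name], name_to_id[new_name]
    changed = 0
    for ann in coco["annotations"]:
        if ann["category_id"] == old_id:
            ann["category_id"] = new_id
            changed += 1
    return changed


def relabel_file(src_path, dst_path, old_name, new_name):
    """Load a COCO JSON file, relabel it, and write the result to a new file."""
    with open(src_path, encoding="utf-8") as f:
        coco = json.load(f)
    changed = relabel_category(coco, old_name, new_name)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(coco, f)
    return changed
```

Only `category_id` is touched here; image ids, segmentations, and boxes are written back unchanged, so a script of this shape should not by itself move objects between frames.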
@thomasstats, can you help reproduce this problem? If you could share annotations in
@thomasstats, is it possible that the missing annotations were grouped boxes or polygons? If so, could you describe the expected COCO output for such annotations?
The initial dumped COCO annotation was this: After uploading and dumping as COCO again I get this: There are other differences obviously too. For instance, the initial dump was 6790 KB; after uploading and dumping again the file is now 4371 KB. I've attached the sample below.
@thomasstats, from what I can see, there are the following differences:
However, the total number of annotations and the per-image counts are the same, and I can't find differences in the annotations themselves. Is there anything I'm missing?
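One way to check whether two COCO dumps describe the same annotations is to compare them while ignoring annotation ids and small coordinate differences. The sketch below is not the script used in this thread; it assumes the standard COCO fields (`images[].file_name`, `annotations[].bbox`, `annotations[].segmentation`) and matches images by file name rather than by id, so an object that moved to another frame shows up as a difference. The function names are hypothetical.

```python
from collections import Counter


def annotation_signature(coco, ndigits=1):
    """Count annotations by (file name, label, rounded bbox, rounded polygons)."""
    images = {img["id"]: img["file_name"] for img in coco["images"]}
    labels = {cat["id"]: cat["name"] for cat in coco["categories"]}
    signature = Counter()
    for ann in coco["annotations"]:
        seg = ann.get("segmentation", [])
        if isinstance(seg, list):
            seg_key = tuple(
                tuple(round(float(v), ndigits) for v in poly) for poly in seg
            )
        else:
            seg_key = "rle"  # RLE masks are not compared in this sketch
        bbox_key = tuple(round(float(v), ndigits) for v in ann.get("bbox", []))
        key = (images[ann["image_id"]], labels[ann["category_id"]], bbox_key, seg_key)
        signature[key] += 1
    return signature


def diff_coco(first, second, ndigits=1):
    """Return (only in first, only in second) as Counters of signatures."""
    sig_a = annotation_signature(first, ndigits)
    sig_b = annotation_signature(second, ndigits)
    return sig_a - sig_b, sig_b - sig_a
```

Rounding is a crude tolerance: two coordinates that differ slightly but fall on opposite sides of a rounding boundary will still be reported as different, so a few reported mismatches should be inspected by hand before drawing conclusions.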
@zhiltsov-max I'm not sure; that's just the actual output. But when displaying (as I said before), a significant portion of the annotations don't display, and quite a few display on the wrong frame. In the Gitter discussion, another person mentioned a similar problem and got around it by using the bbox instead of the polygon, but in our case we need those polygons. On top of that, is there any reason why there would be a difference between the files at all? Shouldn't dumping, uploading, and then dumping again produce the same result, since everything is processed through Datumaro? It seems as if some conversion happens during the upload that doesn't happen during the download, which doesn't sound like it should be the case.
@thomasstats, checking the annotation files with the following python script gives no difference in annotations and their image appearance, except annotation ids and coordinate precision (which implies some differences in the
I can imagine there could be other problems on the path from uploading to displaying in the UI, but the annotations themselves seem to be equal. Could you dump and share However, it is really interesting where the differences came from, and that is a topic for investigation. As for dumping, uploading, and annotation equality: there are many formats, and each has its own specifics. Each format then has to be mapped to the tool's model, and sometimes that cannot be done unambiguously. For example, there are 2 fields in COCO format:
Attached. If this isn't good enough, if there's a way to private message (because I can't have my actual images public), I can send actual image example screenshots too.
I'm not sure whether it's possible, but couldn't there be options for prioritizing segments or boxes during dump and upload?
@thomasstats, comparing this file with We've thought about providing a context menu for exporting and importing with options; this is reflected in #1804.
Perhaps the differences are a red herring and it's a client-side issue then? I've been trying to reproduce it with a toy sample, but I haven't been able to so far. However, the issue does exist on several of our datasets, not just one.
@bsekachev, could you please take a look?
To sum up, from the conversation, as I understood:
Is everything right? The first question is which CVAT version you are using. Could you provide a commit hash?
@bsekachev Correct. Here are my details: Git hash commit (git log -1): Release 1.0.0 (#1335). I was also able to replicate it locally (with our dataset) using Windows 10 and commit 7679434.
Sorry for the delay, I was on vacation last week.
@bsekachev Thanks for the update, but as stated previously, I was able to replicate this issue with the same dataset on the 1.1 beta commit 7679434.
Oh, my mistake. It would also be great if you could provide the server IDs of the objects where you see the issue. If you do not know how to get them from the client, let me know.
Closing, as there has been no response for a long time.
Please reopen, this issue is still valid. Polyline annotations are not reported in COCO format, whereas they are present in CVAT format XML files.
Could you please take a look?
The COCO format does not know about polylines; its official description is here: https://cocodataset.org/#format-data. Do you want us to implement our own extension?
I'll explain what I am doing right now and why. I define a thin polygon around the polyline with a thickness of just a few pixels. I do that also because I cannot find any deep model that predicts polylines, but there are models that predict masks. A bit off-topic: there seems to be an analogous problem with simple points, i.e. they are not translated into COCO's keypoints.
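The thin-polygon workaround described in this comment could be sketched as follows. This is not the commenter's code, only an illustration under simple assumptions: the polyline is a list of `(x, y)` pixel coordinates, each vertex is pushed to both sides along an approximate normal, and the result is returned as a flat `[x1, y1, x2, y2, ...]` list as used in COCO `segmentation`. The function name and the default width are hypothetical.

```python
import math


def polyline_to_polygon(points, width=4.0):
    """Wrap a polyline in a closed strip roughly `width` pixels thick.

    Returns a flat coordinate list [x1, y1, x2, y2, ...].
    """
    if len(points) < 2:
        raise ValueError("a polyline needs at least two points")
    half = width / 2.0
    last = len(points) - 1
    left, right = [], []
    for i, (x, y) in enumerate(points):
        # Approximate the local direction from the neighbouring vertices.
        x0, y0 = points[max(i - 1, 0)]
        x1, y1 = points[min(i + 1, last)]
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            raise ValueError("degenerate polyline: neighbouring points coincide")
        # Unit normal, perpendicular to the local direction.
        nx, ny = -dy / length, dx / length
        left.append((x + nx * half, y + ny * half))
        right.append((x - nx * half, y - ny * half))
    # Walk down one side and back along the other to close the ring.
    ring = left + right[::-1]
    return [coord for point in ring for coord in point]
```

The strip is only approximately `width` thick at bends, and sharp turns can produce self-intersecting polygons; a proper buffering operation from a geometry library would be more robust for such cases.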
@giuseta, it looks like you're not bound to COCO, because no models working with COCO support polylines, and they are not supposed to. Maybe you could try to use
It is partially supported; read here: #2910 (comment)
I will close the issue as outdated. |
My actions before raising this issue
Expected Behaviour
Annotations file should contain all objects that were labeled in the task.
Current Behaviour
Dumping annotations in CVAT XML format works fine but when using COCO format, some objects are missing in several frames.
For example with frame 211:
There are no errors in the Docker logs:
Possible Solution
Temporary workaround: convert from CVAT XML to COCO JSON
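The conversion used for this workaround was not posted. A rough sketch of one way to do it is shown below, under stated assumptions: the dump is in the image-style CVAT XML layout, where each `<image>` element carries `id`, `name`, `width`, and `height` attributes and contains `<polygon label="..." points="x1,y1;x2,y2;...">` children. Track-based dumps for video tasks use a different layout and are not handled. The function name is hypothetical, and category and annotation ids are assigned in order of appearance.

```python
import xml.etree.ElementTree as ET


def cvat_xml_to_coco(xml_text):
    """Convert polygons from an image-style CVAT XML dump to a COCO-like dict."""
    root = ET.fromstring(xml_text)
    category_ids = {}
    images, annotations = [], []
    for image in root.iter("image"):
        image_id = int(image.get("id"))
        images.append({
            "id": image_id,
            "file_name": image.get("name"),
            "width": int(image.get("width")),
            "height": int(image.get("height")),
        })
        for poly in image.iter("polygon"):
            label = poly.get("label")
            cat_id = category_ids.setdefault(label, len(category_ids) + 1)
            pts = [tuple(float(v) for v in p.split(","))
                   for p in poly.get("points").split(";")]
            xs, ys = zip(*pts)
            # Shoelace formula for the polygon area.
            area = abs(sum(xa * yb - xb * ya
                           for (xa, ya), (xb, yb) in zip(pts, pts[1:] + pts[:1]))) / 2
            annotations.append({
                "id": len(annotations) + 1,
                "image_id": image_id,
                "category_id": cat_id,
                "segmentation": [[c for p in pts for c in p]],
                "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
                "area": area,
                "iscrowd": 0,
            })
    categories = [{"id": i, "name": name} for name, i in category_ids.items()]
    return {"images": images, "annotations": annotations, "categories": categories}
```

Polylines, points, and boxes are skipped here, so the object count of the result should be checked against the number of polygons in the XML file.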
Steps to Reproduce (for bugs)
I haven't yet found how to reproduce the bug. I tried creating a task with just the problematic frames, but when dumping annotations in COCO format, all objects were present.
Context
Task of 2400 frames with 2719 objects labeled as polygons distributed in two categories.
Your Environment
Git hash commit (git log -1): commit 07de714 (HEAD -> master, tag: v1.0.0, origin/master, origin/HEAD)
Docker version (docker version, e.g. Docker 17.0.05): Docker version 19.03.8, build afacb8b7f0
Logs from `cvat` container
Next steps
You may join our Gitter channel for community support.