
Increase frame buffer / improve the way buffering is handled #3798

Closed
2 tasks done
mrKallah opened this issue Oct 14, 2021 · 10 comments
Labels
enhancement New feature or request performance

Comments

@mrKallah

My actions before raising this issue

I am trying to increase the number of buffered frames on a local server used for multi-user annotation. I could not find this in the docs, and I asked on Gitter with no luck. The reason I need to increase the buffer is that there are usually ~1k-10k frames between the end of one annotation and the start of the next; the data I am working with is mostly without the subject I am trying to detect. Improved buffer handling would significantly increase my productivity.

Expected Behaviour

The server should have a setting to change the number of frames buffered, so that if you are running a dedicated annotation server you can use it how you wish.
When the client advances to the next frame, the server should start sending the next buffered frame to the client.
Personally, I'd like the server to work at 100% CPU while loading frames, to minimize the time spent waiting for the server.

Current Behaviour

For my use case, the server is idle until the client runs out of frames; it then loads frames at ~25% CPU (100% of one core) and goes idle again until the client's buffer is empty. I also cannot find a setting to change the number of frames buffered.

Possible Solution

Short Term - Add a setting for the number of buffered frames.
Long Term - Rewrite the way image buffering is handled and allow multithreaded frame buffering to exploit parallelism
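The long-term idea could look something like this: instead of decoding frames serially on demand, decode ahead of the playhead in parallel. A minimal sketch, assuming a hypothetical `decode_frame` stand-in (this is not CVAT's actual pipeline):

```python
from concurrent.futures import ThreadPoolExecutor

def decode_frame(index):
    # Stand-in for the real per-frame decode work (ffmpeg, PIL, ...).
    return f"frame-{index}"

def prefetch(start, count, workers=4):
    # Decode `count` frames ahead of the playhead in parallel,
    # preserving frame order in the returned list.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(decode_frame, range(start, start + count)))
```

With real decode work the pool would keep multiple cores busy instead of leaving the server idle between client requests.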

Steps to Reproduce (for bugs)

  1. Load a large dataset with somewhat high resolution
  2. Hold down f or v to fast forward
  3. After a little while it should start loading for a few seconds

Context

This issue drastically decreases our team's productivity when annotating our dataset. We are trying to annotate rare occurrences in videos of typically 1.5 hours, which sometimes contain fewer than 5 separate annotation tracks. The video needs to be relatively high resolution due to the nature of what we are trying to detect, so we are stuck spending a lot of time waiting for the machine to load the next frames.

Your Environment

  • Git hash commit (git log -1): e8b3284
  • Docker version docker version (e.g. Docker 17.0.05): 20.10.9
  • Are you using Docker Swarm or Kubernetes? unsure, I've installed it as per https://openvinotoolkit.github.io/cvat/docs/administration/basics/installation/ for ubuntu
  • Operating System and version (e.g. Linux, Windows, MacOS): Linux Ubuntu
  • Code example or link to GitHub repo or gist to reproduce problem:
    # docker-compose.override.yml
# Copyright (C) 2018-2021 Intel Corporation
#
# SPDX-License-Identifier: MIT

#version: '3.3'

version: '3.3'


services:
  cvat:
    environment:
      CVAT_SHARE_URL: "Mounted from /mnt/share host directory"
      CVAT_HOST: "localhost 192.168.10.135"
    volumes:
      - cvat_share:/home/django/share:ro
    labels:
      - traefik.http.routers.cvat.rule=(Host(`localhost`) || Host(`192.168.10.135`)) &&
          PathPrefix(`/api/`, `/git/`, `/opencv/`, `/analytics/`, `/static/`, `/admin`, `/documentation/`, `/django-rq`)

  cvat_ui:
    labels:
      - traefik.http.routers.cvat-ui.rule=Host(`localhost`) || Host(`192.168.10.135`)

  traefik:
    ports:
      - 80:8080
      - 90:9090

volumes:
  cvat_share:
    driver_opts:
      type: none
      device: /mnt/share
      o: bind

# docker-compose.yml

# Copyright (C) 2018-2021 Intel Corporation
#
# SPDX-License-Identifier: MIT

version: '3.3'

services:
  cvat_db:
    container_name: cvat_db
    image: postgres:10-alpine
    restart: always
    environment:
      POSTGRES_USER: root
      POSTGRES_DB: cvat
      POSTGRES_HOST_AUTH_METHOD: trust
    volumes:
      - cvat_db:/var/lib/postgresql/data
    networks:
      - cvat

  cvat_redis:
    container_name: cvat_redis
    image: redis:4.0-alpine
    restart: always
    networks:
      - cvat

  cvat:
    container_name: cvat
    image: openvino/cvat_server
    restart: always
    depends_on:
      - cvat_redis
      - cvat_db
    environment:
      DJANGO_MODWSGI_EXTRA_ARGS: ''
      ALLOWED_HOSTS: '*'
      CVAT_REDIS_HOST: 'cvat_redis'
      CVAT_POSTGRES_HOST: 'cvat_db'
      ADAPTIVE_AUTO_ANNOTATION: 'false'
    labels:
      - traefik.enable=true
      - traefik.http.services.cvat.loadbalancer.server.port=8080
      - traefik.http.routers.cvat.rule=Host(`${CVAT_HOST:-localhost}`) &&
          PathPrefix(`/api/`, `/git/`, `/opencv/`, `/analytics/`, `/static/`, `/admin`, `/documentation/`, `/django-rq`)
      - traefik.http.routers.cvat.entrypoints=web
    volumes:
      - cvat_data:/home/django/data
      - cvat_keys:/home/django/keys
      - cvat_logs:/home/django/logs
    networks:
      - cvat

  cvat_ui:
    container_name: cvat_ui
    image: openvino/cvat_ui
    restart: always
    depends_on:
      - cvat
    labels:
      - traefik.enable=true
      - traefik.http.services.cvat-ui.loadbalancer.server.port=80
      - traefik.http.routers.cvat-ui.rule=Host(`${CVAT_HOST:-localhost}`)
      - traefik.http.routers.cvat-ui.entrypoints=web
    networks:
      - cvat

  traefik:
    image: traefik:v2.4
    container_name: traefik
    restart: always
    command:
      - "--providers.docker.exposedByDefault=false"
      - "--providers.docker.network=cvat"
      - "--entryPoints.web.address=:8080"
    # Uncomment to get Traefik dashboard
    #   - "--entryPoints.dashboard.address=:8090"
    #   - "--api.dashboard=true"
    # labels:
    #   - traefik.enable=true
    #   - traefik.http.routers.dashboard.entrypoints=dashboard
    #   - traefik.http.routers.dashboard.service=api@internal
    #   - traefik.http.routers.dashboard.rule=Host(`${CVAT_HOST:-localhost}`)
    ports:
      - 8080:8080
      - 8090:8090
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - cvat

volumes:
  cvat_db:
  cvat_data:
  cvat_keys:
  cvat_logs:

networks:
  cvat:


  • Other diagnostic information / logs:
    Logs from `cvat` container


@azhavoro
Contributor

Hi,

  1. Try creating a task without the "use cache" option, which is currently enabled by default. This can increase task creation time, but the server will simply send the prepared file instead of spending time preparing it on a cache miss.
  2. The raw frame buffer size is currently 2 GB (https://github.com/openvinotoolkit/cvat/blob/develop/cvat-core/src/frames.js#L656) and cannot be significantly increased due to the browser sandbox memory limit. You can try increasing this value and rebuilding the images with docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build, but I'm not sure about the stability of the tool in that case.
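For intuition, a size-capped frame buffer like the 2 GB one described above has to evict old frames once the byte budget is exceeded. A toy sketch of that behavior (illustrative only, not the actual frames.js logic):

```python
from collections import OrderedDict

class FrameBuffer:
    """Size-capped frame cache: inserting past the byte budget
    evicts the least recently inserted frames first."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.frames = OrderedDict()  # frame index -> raw bytes

    def put(self, index, data):
        if index in self.frames:
            self.used -= len(self.frames.pop(index))
        self.frames[index] = data
        self.used += len(data)
        # Evict oldest frames until we are back under budget.
        while self.used > self.max_bytes:
            _, evicted = self.frames.popitem(last=False)
            self.used -= len(evicted)
```

So raising the cap lets more frames stay resident, but only up to whatever memory the browser sandbox will actually grant the page.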

@rsnk96
Contributor

rsnk96 commented Oct 19, 2021

Hi @azhavoro, can you help clarify the following? Apologies if these have been clarified elsewhere.

Try creating a task without the "use cache" option, which is currently enabled by default. This can increase task creation time, but the server will simply send the prepared file instead of spending time preparing it on a cache miss.

Does the "use cache" option affect the loading process for tasks that were created with just images, no video?


The raw frame buffer size is currently 2 GB (https://github.com/openvinotoolkit/cvat/blob/develop/cvat-core/src/frames.js#L656) and cannot be significantly increased due to the browser sandbox memory limit. You can try increasing this value and rebuilding the images with docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build, but I'm not sure about the stability of the tool in that case.

Just for our understanding, is this 2 GB limit at the client end or at the server end?
Asking because if it is at the server end, the 2 GB would be shared between all the annotators, in which case increasing the limit would also help my use case.

@mrKallah
Author

Hi, thank you for getting back to me on this. I have done some testing now, and I am still experiencing some issues, although your advice helped ease my problem.

Hi,

  1. Try creating a task without the "use cache" option, which is currently enabled by default. This can increase task creation time, but the server will simply send the prepared file instead of spending time preparing it on a cache miss.

This helps a lot, thanks. It still pauses every 72 frames, but it loads much faster between pauses.

  2. The raw frame buffer size is currently 2 GB (https://github.com/openvinotoolkit/cvat/blob/develop/cvat-core/src/frames.js#L656) and cannot be significantly increased due to the browser sandbox memory limit. You can try increasing this value and rebuilding the images with docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build, but I'm not sure about the stability of the tool in that case.

I have tried changing these variables, but it still loads every 72 frames. I don't think the frame buffer size is the issue here, though, as each image we are using is at most 1080p, and even uncompressed RGB images at that size will not reach 2 GB of memory in 72 frames. Since this is JS, I assume it is on the client end; is there anywhere on the server I can look to increase performance and get a more stable throughput?

I have looked in cvat/settings/base.py and changed

```py
'maxBytes': 1024*1024*50,  # 50 MB
DATA_UPLOAD_MAX_MEMORY_SIZE = 100 * 1024 * 1024  # 100 MB
LOCAL_LOAD_MAX_FILES_COUNT = 500
LOCAL_LOAD_MAX_FILES_SIZE = 512 * 1024 * 1024  # 512 MB
```

to

```py
'maxBytes': 1024*1024*1024*50,  # 50 GB
DATA_UPLOAD_MAX_MEMORY_SIZE = 100 * 1024 * 1024 * 1024  # 100 GB
LOCAL_LOAD_MAX_FILES_COUNT = 5000
LOCAL_LOAD_MAX_FILES_SIZE = 512 * 1024 * 1024 * 1024  # 512 GB
```

That made no difference either; it still loads every 72 frames.

I am, however, starting to think it might be a hard limit on the number of frames rather than on file size. What makes me say this is that regardless of whether the images are uploaded with 70% compression or not, and whether they are 1080p, 720p, or even smaller, it still loads every 72 frames.

Thank you for taking time to look at this.

@azhavoro
Contributor

@mrKallah Hi, you can set the chunk size in the task constructor: https://openvinotoolkit.github.io/cvat/docs/manual/basics/creating_an_annotation_task/#chunk-size. The default value for 1080p and lower resolutions is 36 frames, and it seems that in your case the browser only has time to download and decode 2 chunks during playback and must then wait for new frames to be ready.
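The arithmetic matches the 72-frame pauses reported above: with roughly two chunks buffered, the pause interval is simply the chunk size times the number of buffered chunks.

```python
def frames_before_pause(chunk_size, chunks_buffered=2):
    # The player stalls once every buffered chunk has been consumed.
    return chunk_size * chunks_buffered

assert frames_before_pause(36) == 72    # default chunk size -> pause every 72 frames
assert frames_before_pause(128) == 256  # chunk size 128 -> pause every 256 frames
```

The `chunks_buffered=2` figure is taken from the maintainer's explanation; it may vary with network and decode speed.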

@mrKallah
Copy link
Author

@azhavoro

@mrKallah Hi, you can set the chunk size in the task constructor: https://openvinotoolkit.github.io/cvat/docs/manual/basics/creating_an_annotation_task/#chunk-size. The default value for 1080p and lower resolutions is 36 frames, and it seems that in your case the browser only has time to download and decode 2 chunks during playback and must then wait for new frames to be ready.

Thank you, increasing this value has helped significantly. I set it to 128 and now loading only happens every 256 frames!

@mrKallah
Copy link
Author

mrKallah commented Oct 22, 2021

@azhavoro, just one more question. I have been testing some more, and overall it's far better with the new settings, but I am still experiencing some stops and buffering, and some frame drops when fast-forwarding right after a buffer. Is there any way to give more system resources to CVAT? I max out at 50% CPU usage when fast-forwarding, and about 50% of RAM too. The server is hosted on the same machine as the client, and even if it were over the network I'm on a 1 Gbps up/down connection, with everything, including file storage and the server software, installed on M.2 drives. I can't see what the bottleneck is, so I thought maybe there is something I can do to give CVAT more resources?

Thanks again for your help so far!

@mrKallah
Author

Screenshot from 2021-10-25 09-16-12

Here you can see that the docker stats command shows the container drawing 100%, but the overall usage for python3 is only around 6%, which is one core. Is there any way to get multi-threading to work?
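For CPU-bound work in CPython, threads alone would not use more than one core because of the GIL; spreading the work across processes would. A minimal sketch, assuming a hypothetical `prepare_chunk` stand-in (this is not CVAT's actual code):

```python
import os
from concurrent.futures import ProcessPoolExecutor

def prepare_chunk(chunk_index):
    # Stand-in for CPU-heavy per-chunk work (decode + re-encode).
    return chunk_index * 2

def prepare_all(n_chunks):
    # Fan chunk preparation out across every available core.
    with ProcessPoolExecutor(max_workers=os.cpu_count()) as pool:
        return list(pool.map(prepare_chunk, range(n_chunks)))

if __name__ == "__main__":
    prepare_all(4)
```

The `if __name__ == "__main__"` guard is required on platforms that spawn worker processes by re-importing the module.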

@nmanovic
Contributor

We probably need an easy way to configure these parameters, and to look at the streaming pipeline one more time. There is probably room for performance optimization and for improving the UX.

@nmanovic nmanovic added enhancement New feature or request performance labels Nov 19, 2021
@bsekachev
Member

  • The client now prefetches frames while the user navigates forward (see Added simple prefetch analyzer #6695)
  • The client now has an advanced caching mechanism, but it is an enterprise feature, available on app.cvat.ai
  • The server also has its own cache based on KeyDB, whose size can be configured via a .yaml file (see cvat/docker-compose.yml).

So, I believe we can close the issue.
