[CVAT] Adapt exchange/recording oracles for honeypots #2720

Merged on Nov 14, 2024 (83 commits)
Commits
eee13bc
Update exchange oracle
Marishka17 Oct 29, 2024
4fa8d23
Add wait timeout when importing GT annotations
Marishka17 Oct 29, 2024
3ce5d69
Extract a base class for task creation
zhiltsov-max Oct 30, 2024
6795109
Move gt setup into the base class
zhiltsov-max Oct 30, 2024
621af5b
Add draft implementation for points task creation
zhiltsov-max Oct 30, 2024
e8727db
[draft] Update recording oracle
Marishka17 Oct 30, 2024
cb28f92
Upgrade cvat-sdk dep
zhiltsov-max Oct 31, 2024
f6daa82
[Exchnage oracle] apply some comments && small fixes
Marishka17 Oct 31, 2024
8f00430
Refactor some code, fix errors
zhiltsov-max Oct 31, 2024
619ae54
Add quality settings setup
zhiltsov-max Oct 31, 2024
c1e5e93
Fix update quality settings
zhiltsov-max Oct 31, 2024
05b5442
Use inbound bbox circle radius for point validation
zhiltsov-max Oct 31, 2024
d2f3b0b
Merge remote-tracking branch 'upstream/mk/update_cvat_oracles' into z…
zhiltsov-max Oct 31, 2024
0428f03
Fix linter errors
zhiltsov-max Oct 31, 2024
8ef59b4
Merge pull request #2730 from humanprotocol/zm/change_point_validation
zhiltsov-max Nov 1, 2024
e145b7d
Fix quality settings update call
zhiltsov-max Nov 1, 2024
aa3d642
Move common function to the base class
zhiltsov-max Nov 1, 2024
d42f0b6
Expect point group annotations in points annotation task
zhiltsov-max Nov 1, 2024
a4b6505
Fix response check
zhiltsov-max Nov 1, 2024
f0258c4
Improve formatting in the log message
zhiltsov-max Nov 1, 2024
a3fbb84
Use single shape mode for image_points task
zhiltsov-max Nov 1, 2024
19202c7
Refactor recording oracle updates
Marishka17 Nov 1, 2024
4ae5928
[Recording oracle] update deps
Marishka17 Nov 1, 2024
df99be5
Update assignment urls for skeleton tasks
zhiltsov-max Nov 1, 2024
2b9b5fa
Merge pull request #2743 from humanprotocol/zm/change_point_task_stat…
zhiltsov-max Nov 1, 2024
3b70e40
Rename quality parameter
zhiltsov-max Nov 1, 2024
2f62b70
Fix linter error
zhiltsov-max Nov 1, 2024
ebdb341
[Ex oracle] Move gt dataset preparation into separate method for skel…
Marishka17 Nov 3, 2024
79a7c0a
[Ex oracle] update deps
Marishka17 Nov 3, 2024
93284c9
Resolve conflicts
Marishka17 Nov 3, 2024
6e286f5
[Ex oracle] Improve handling oracle mode(dev/prod/test)
Marishka17 Nov 4, 2024
30f84dc
[Ex oracle] Fix test
Marishka17 Nov 4, 2024
24bbc5a
[Recording oracle] Apply comments && small fixes && remove unused code
Marishka17 Nov 4, 2024
95a80c1
[Exchnage oracle] Pass job start/stop frame from Ex oracle to Rec oracle
Marishka17 Nov 6, 2024
916cb0d
Update recording oracle
Marishka17 Nov 6, 2024
c00825c
t
Marishka17 Nov 6, 2024
5bc041f
[Exchange oracle] Fix tests
Marishka17 Nov 7, 2024
6bca715
[Exchange oracle] Add migration
Marishka17 Nov 7, 2024
75db488
Fix test
Marishka17 Nov 7, 2024
41bbcf5
[Exchange oracle] mark job start/stop frame as not nullable
Marishka17 Nov 7, 2024
d70a63b
Fix tests
Marishka17 Nov 7, 2024
a96d1e7
[Recording oracle] Clean up the code
Marishka17 Nov 7, 2024
8853208
Update packages/examples/cvat/exchange-oracle/src/handlers/job_creati…
Marishka17 Nov 7, 2024
1963855
Merge develop
Marishka17 Nov 7, 2024
a933ea6
[Recording oracle] Apply comments
Marishka17 Nov 7, 2024
7189d81
[Exchnage oracle] use_bbox_size_for_points -> point_size_base
Marishka17 Nov 7, 2024
34bbdf3
[Rec oracle] Use MT19937 generator
Marishka17 Nov 8, 2024
3f536a5
[Exchange oracle] Move BoxesFromPointsTaskBuilder::_prepare_gt_roi_da…
Marishka17 Nov 8, 2024
1b6e13a
[Exchange oracle] Fix checking which files should be uploaded to the …
Marishka17 Nov 8, 2024
a5399bd
[Exchange oracle] Update down_revision
Marishka17 Nov 8, 2024
f4c5900
fix typo
Marishka17 Nov 8, 2024
2e71542
[Exchange oracle] Include val_size into chunk_size
Marishka17 Nov 8, 2024
69fc2e3
Fix some errors
zhiltsov-max Nov 8, 2024
2d1af93
Enable empty frame matching
zhiltsov-max Nov 8, 2024
78daf11
Merge remote-tracking branch 'upstream/mk/update_cvat_oracles' into m…
zhiltsov-max Nov 8, 2024
b0603dc
Use the added parameter
zhiltsov-max Nov 8, 2024
8d56fc8
[Recording orcale] Fix get_task_quality_report
Marishka17 Nov 8, 2024
e4ce52d
[Ex oracle] Fix missing segment_size
Marishka17 Nov 11, 2024
fa5cf7a
[Rec oracle] Small fixes
Marishka17 Nov 11, 2024
0fec721
Update packages/examples/cvat/exchange-oracle/src/cvat/api_calls.py
Marishka17 Nov 11, 2024
907235c
[Exchange oracle] Bump cvat-sdk version
Marishka17 Nov 11, 2024
3847298
Fix roi GT dataset in boxes_from_points tasks
zhiltsov-max Nov 11, 2024
eb83293
Add more clever default for sort_images
zhiltsov-max Nov 11, 2024
6a94f45
Fix linter error, remove gt image data callback
zhiltsov-max Nov 11, 2024
a543c24
Fix type annotation
zhiltsov-max Nov 11, 2024
1f43589
Fix quality settings for skeletons_from_boxes
zhiltsov-max Nov 11, 2024
06166f9
Fix GT datasets for points in skeletons_from_boxes
zhiltsov-max Nov 11, 2024
e67625a
Simplify roi info id
zhiltsov-max Nov 11, 2024
481f251
Remove unused field from skeleton roi info
zhiltsov-max Nov 12, 2024
5a08e4d
Allow optional joints in skeleton task manifest
zhiltsov-max Nov 12, 2024
ba9228f
Add points task meta
zhiltsov-max Nov 12, 2024
16c67b0
Basic fix for merged dataset annotations
zhiltsov-max Nov 12, 2024
6e07890
Basic fix for premature escrow validation requests
zhiltsov-max Nov 12, 2024
203ddee
Fix incorrect GT preparation in boxes_from_points
zhiltsov-max Nov 12, 2024
dd43bfa
Refactor some code
zhiltsov-max Nov 12, 2024
a097c3e
Fix linter problem
zhiltsov-max Nov 12, 2024
e2cd19e
Use the original GT for final annotation merging
zhiltsov-max Nov 13, 2024
5f4a616
Resolve conflicts
Marishka17 Nov 13, 2024
2d70afb
Fix dataset merging for the points task
zhiltsov-max Nov 13, 2024
4d7683d
Update comment
zhiltsov-max Nov 13, 2024
7122809
[Exchange Oracle] Move cvat timeout settings to cvat config, update .…
zhiltsov-max Nov 13, 2024
3746126
[Recording Oracle] Update .env template, add some variables
zhiltsov-max Nov 13, 2024
5865287
Update packages/examples/cvat/exchange-oracle/src/core/config.py
Marishka17 Nov 13, 2024
@@ -0,0 +1,31 @@
+"""job_start_stop_frame
+
+Revision ID: 0a91b6a5f7b6
+Revises: 4fc740e8c6ff
+Create Date: 2024-11-07 08:15:00.780982
+
+"""
+
+import sqlalchemy as sa
+
+from alembic import op
+
+# revision identifiers, used by Alembic.
+revision = "0a91b6a5f7b6"
+down_revision = "4fc740e8c6ff"
+branch_labels = None
+depends_on = None
+
+
+def upgrade() -> None:
+    # ### commands auto generated by Alembic - please adjust! ###
+    op.add_column("jobs", sa.Column("start_frame", sa.Integer(), nullable=False))
+    op.add_column("jobs", sa.Column("stop_frame", sa.Integer(), nullable=False))
+    # ### end Alembic commands ###
+
+
+def downgrade() -> None:
+    # ### commands auto generated by Alembic - please adjust! ###
+    op.drop_column("jobs", "stop_frame")
+    op.drop_column("jobs", "start_frame")
+    # ### end Alembic commands ###
187 changes: 111 additions & 76 deletions packages/examples/cvat/exchange-oracle/poetry.lock

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion packages/examples/cvat/exchange-oracle/pyproject.toml
@@ -16,7 +16,7 @@ sqlalchemy-utils = "^0.41.1"
 alembic = "^1.11.1"
 httpx = "^0.24.1"
 pytest = "^7.2.2"
-cvat-sdk = "2.6.0"
+cvat-sdk = "^2.22.0"
 sqlalchemy = "^2.0.16"
 apscheduler = "^3.10.1"
 xmltodict = "^0.13.0"
2 changes: 1 addition & 1 deletion packages/examples/cvat/exchange-oracle/run.py
@@ -4,7 +4,7 @@
 from src.core.config import Config

 if __name__ == "__main__":
-    is_dev = Config.environment == "development"
+    is_dev = Config.is_development_mode()
     Config.validate()
     register_in_kvstore()

10 changes: 10 additions & 0 deletions packages/examples/cvat/exchange-oracle/src/.env.template
@@ -69,6 +69,14 @@ CVAT_ADMIN_USER_ID=
 CVAT_INCOMING_WEBHOOKS_URL=
 CVAT_WEBHOOK_SECRET=
 CVAT_ORG_SLUG=
+CVAT_TASK_SEGMENT_SIZE=
+CVAT_MAX_JOBS_PER_TASK=
+CVAT_TASK_CREATION_CHECK_INTERVAL=
+CVAT_MAX_VALIDATION_CHECKS=
+CVAT_IOU_THRESHOLD=
+CVAT_OKS_SIGMA=
+CVAT_EXPORT_TIMEOUT=
+CVAT_IMPORT_TIMEOUT=

 # Storage Config (S3/GCS)

@@ -84,6 +92,7 @@ STORAGE_USE_SSL=
 # Features

 ENABLE_CUSTOM_CLOUD_HOST=
+REQUEST_LOGGING_ENABLED=

 # Core

@@ -105,6 +114,7 @@ DEFAULT_API_PAGE_SIZE=
 LOCALHOST_RECORDING_ORACLE_ADDRESS=
 LOCALHOST_RECORDING_ORACLE_URL=
 LOCALHOST_JOB_LAUNCHER_URL=
+LOCALHOST_REPUTATION_ORACLE_ADDRESS=
 LOCALHOST_REPUTATION_ORACLE_URL=

 # Encryption
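The new timeout variables bound how long the oracle waits on CVAT. As a rough illustration (not code from this PR), the sketch below reads an integer setting from the environment and uses it to cap a polling loop. `read_int_env` and `wait_until` are hypothetical helpers; the 3600-second fallback mirrors the `CVAT_IMPORT_TIMEOUT` default in `config.py`. The sketch treats an empty value as unset, on the assumption that a blank `KEY=` line in a `.env` file is loaded as an empty string, which `int()` would reject.

```python
import os
import time
from typing import Callable


def read_int_env(name: str, default: int) -> int:
    """Read an integer setting; an unset or empty variable falls back to the default."""
    raw = os.environ.get(name, "")
    return int(raw) if raw.strip() else default


def wait_until(
    is_done: Callable[[], bool], *, timeout: float, interval: float = 1.0
) -> bool:
    """Poll is_done() until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if is_done():
            return True
        if time.monotonic() >= deadline:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, max(0.0, deadline - time.monotonic())))
```

A caller could then write `wait_until(import_finished, timeout=read_int_env("CVAT_IMPORT_TIMEOUT", 60 * 60))`, where `import_finished` is whatever status check applies.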
3 changes: 1 addition & 2 deletions packages/examples/cvat/exchange-oracle/src/__init__.py
@@ -32,6 +32,5 @@ async def startup_event():
     logger.info("Exchange Oracle is up and running!")


-is_test = Config.environment == "test"
-if not is_test:
+if not Config.is_test_mode():
     setup_cron_jobs(app)
2 changes: 1 addition & 1 deletion packages/examples/cvat/exchange-oracle/src/chain/escrow.py
@@ -67,6 +67,6 @@ def get_available_webhook_types(
             Config.localhost.recording_oracle_address or escrow.recording_oracle
         ).lower(): OracleWebhookTypes.recording_oracle,
         (
-            Config.localhost.reputation_oracle_url or escrow.reputation_oracle
+            Config.localhost.reputation_oracle_address or escrow.reputation_oracle
         ).lower(): OracleWebhookTypes.reputation_oracle,
     }
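To see why this one-word change matters, here is a minimal sketch of the lookup under simplified assumptions. `sender_types` and its parameters are hypothetical, and the enum is a stand-in for the project's `OracleWebhookTypes`. The mapping is keyed by lower-cased oracle address, so a localhost override has to be an address; a URL used as the key would never match an incoming sender address.

```python
from enum import Enum
from typing import Optional


class OracleWebhookTypes(str, Enum):
    # Stand-in for the project's enum; only the members used below.
    recording_oracle = "recording_oracle"
    reputation_oracle = "reputation_oracle"


def sender_types(
    escrow_recording: str,
    escrow_reputation: str,
    local_recording: Optional[str] = None,
    local_reputation: Optional[str] = None,
) -> dict[str, OracleWebhookTypes]:
    """Map lower-cased oracle addresses to the webhook type each one may send.

    A localhost override, when set, replaces the address stored in the escrow.
    """
    return {
        (local_recording or escrow_recording).lower(): OracleWebhookTypes.recording_oracle,
        (local_reputation or escrow_reputation).lower(): OracleWebhookTypes.reputation_oracle,
    }
```

With `local_reputation="http://localhost:5001"` (the old behaviour, in effect), the reputation oracle's real address is absent from the mapping and its webhooks cannot be recognised.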
@@ -8,9 +8,12 @@

 class JobMeta(BaseModel):
     job_id: int
+    task_id: int
     annotation_filename: Path
     annotator_wallet_address: str
     assignment_id: str
+    start_frame: int
+    stop_frame: int


 class AnnotationMeta(BaseModel):
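One plausible use of the new frame fields on the receiving side is mapping a task frame back to the job that contains it. The sketch below is illustrative only: `JobFrames` and `find_job` are hypothetical names, a plain dataclass stands in for the pydantic model, and `stop_frame` is assumed to be inclusive.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class JobFrames:
    # Hypothetical stand-in for the frame-related fields of JobMeta.
    job_id: int
    start_frame: int
    stop_frame: int  # assumed inclusive

    def __contains__(self, frame_id: int) -> bool:
        return self.start_frame <= frame_id <= self.stop_frame


def find_job(jobs: Iterable[JobFrames], frame_id: int) -> Optional[JobFrames]:
    """Return the first job whose frame range covers frame_id, if any."""
    return next((job for job in jobs if frame_id in job), None)
```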
58 changes: 52 additions & 6 deletions packages/examples/cvat/exchange-oracle/src/core/config.py
@@ -4,7 +4,8 @@
 import inspect
 import os
 from collections.abc import Iterable
-from typing import ClassVar
+from enum import Enum
+from typing import ClassVar, Optional

 from attrs.converters import to_bool
 from dotenv import load_dotenv
@@ -104,6 +105,8 @@ class LocalhostConfig(_NetworkConfig):

     recording_oracle_address = os.environ.get("LOCALHOST_RECORDING_ORACLE_ADDRESS")
     recording_oracle_url = os.environ.get("LOCALHOST_RECORDING_ORACLE_URL")
+
+    reputation_oracle_address = os.environ.get("LOCALHOST_REPUTATION_ORACLE_ADDRESS")
     reputation_oracle_url = os.environ.get("LOCALHOST_REPUTATION_ORACLE_URL")


@@ -167,14 +170,30 @@ class CronConfig:


 class CvatConfig:
+    # TODO: remove cvat_ prefix
     cvat_url = os.environ.get("CVAT_URL", "http://localhost:8080")
     cvat_admin = os.environ.get("CVAT_ADMIN", "admin")
     cvat_admin_pass = os.environ.get("CVAT_ADMIN_PASS", "admin")
     cvat_org_slug = os.environ.get("CVAT_ORG_SLUG", "")

     cvat_job_overlap = int(os.environ.get("CVAT_JOB_OVERLAP", 0))
-    cvat_job_segment_size = int(os.environ.get("CVAT_JOB_SEGMENT_SIZE", 150))
+    cvat_task_segment_size = int(os.environ.get("CVAT_TASK_SEGMENT_SIZE", 150))
     cvat_default_image_quality = int(os.environ.get("CVAT_DEFAULT_IMAGE_QUALITY", 70))
+    cvat_max_jobs_per_task = int(os.environ.get("CVAT_MAX_JOBS_PER_TASK", 1000))
+    cvat_task_creation_check_interval = int(os.environ.get("CVAT_TASK_CREATION_CHECK_INTERVAL", 5))
+
+    cvat_export_timeout = int(os.environ.get("CVAT_EXPORT_TIMEOUT", 5 * 60))
+    "Timeout, in seconds, for annotations or dataset export waiting"
+
+    cvat_import_timeout = int(os.environ.get("CVAT_IMPORT_TIMEOUT", 60 * 60))
+    "Timeout, in seconds, for waiting on GT annotations import"
+
+    # quality control settings
+    cvat_max_validation_checks = int(os.environ.get("CVAT_MAX_VALIDATION_CHECKS", 3))
+    "Maximum number of attempts to run a validation check on a job after completing annotation"
+
+    cvat_iou_threshold = float(os.environ.get("CVAT_IOU_THRESHOLD", 0.8))
+    cvat_oks_sigma = float(os.environ.get("CVAT_OKS_SIGMA", 0.1))

     cvat_incoming_webhooks_url = os.environ.get("CVAT_INCOMING_WEBHOOKS_URL")
     cvat_webhook_secret = os.environ.get("CVAT_WEBHOOK_SECRET", "thisisasamplesecret")
@@ -220,9 +239,6 @@ class FeaturesConfig:
     enable_custom_cloud_host = to_bool(os.environ.get("ENABLE_CUSTOM_CLOUD_HOST", "no"))
     "Allows using a custom host in manifest bucket urls"

-    default_export_timeout = int(os.environ.get("DEFAULT_EXPORT_TIMEOUT", 60))
-    "Timeout, in seconds, for annotations or dataset export waiting"
-
     request_logging_enabled = to_bool(os.getenv("REQUEST_LOGGING_ENABLED", "0"))
     "Allow to log request details for each request"

@@ -282,10 +298,25 @@ def validate(cls) -> None:
             raise Exception(" ".join([ex_prefix, str(ex)]))


+class Environment(str, Enum):
+    PRODUCTION = "production"
+    DEVELOPMENT = "development"
+    TEST = "test"
+
+    @classmethod
+    def _missing_(cls, value: str) -> Optional["Environment"]:
+        value = value.lower()
+        for member in cls:
+            if member.value == value:
+                return member
+
+        return None
+
+
 class Config:
     debug = to_bool(os.environ.get("DEBUG", "false"))
     port = int(os.environ.get("PORT", 8000))
-    environment = os.environ.get("ENVIRONMENT", "development")
+    environment = Environment(os.environ.get("ENVIRONMENT", Environment.DEVELOPMENT.value))
     workers_amount = int(os.environ.get("WORKERS_AMOUNT", 1))
     webhook_max_retries = int(os.environ.get("WEBHOOK_MAX_RETRIES", 5))
     webhook_delay_if_failed = int(os.environ.get("WEBHOOK_DELAY_IF_FAILED", 60))
@@ -307,6 +338,21 @@ class Config:
     core_config = CoreConfig
     encryption_config = EncryptionConfig

+    @classmethod
+    def is_development_mode(cls) -> bool:
+        """Returns whether the oracle is running in development mode or not"""
+        return cls.environment == Environment.DEVELOPMENT
+
+    @classmethod
+    def is_test_mode(cls) -> bool:
+        """Returns whether the oracle is running in testing mode or not"""
+        return cls.environment == Environment.TEST
+
+    @classmethod
+    def is_production_mode(cls) -> bool:
+        """Returns whether the oracle is running in production mode or not"""
+        return cls.environment == Environment.PRODUCTION
+
     @classmethod
     def validate(cls) -> None:
         for attr_or_method in cls.__dict__:
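The `Environment` enum relies on `Enum._missing_`, which Python calls only after the exact-value lookup fails. A standalone sketch of that behaviour follows; the class mirrors the one in the diff, with an added `isinstance` guard that is this sketch's own choice.

```python
from enum import Enum
from typing import Optional


class Environment(str, Enum):
    PRODUCTION = "production"
    DEVELOPMENT = "development"
    TEST = "test"

    @classmethod
    def _missing_(cls, value: object) -> Optional["Environment"]:
        # Reached only when no member has exactly this value;
        # retry the lookup case-insensitively.
        if isinstance(value, str):
            lowered = value.lower()
            for member in cls:
                if member.value == lowered:
                    return member
        return None


# Mixed-case input resolves to the canonical member.
assert Environment("PRODUCTION") is Environment.PRODUCTION
# Thanks to the str mixin, members still compare equal to plain strings.
assert Environment("Test") == "test"

try:
    Environment("staging")
except ValueError as error:
    print(f"rejected: {error}")
```

One consequence worth noting: since `Config.environment` is built at class-definition time, an unrecognised `ENVIRONMENT` value now raises `ValueError` when the config module is imported, instead of being silently accepted as an arbitrary string.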
17 changes: 9 additions & 8 deletions packages/examples/cvat/exchange-oracle/src/core/manifest.py
@@ -105,14 +105,15 @@ def validate_type(cls, values: dict[str, Any]) -> dict[str, Any]:
             existing_names.add(node_name.lower())

         nodes_count = len(values["nodes"])
-        joints = values["joints"]
-        for joint_idx, joint in enumerate(joints):
-            for v in joint:
-                if not (0 <= v < nodes_count):
-                    raise ValueError(
-                        f"Skeleton '{skeleton_name}' joint #{joint_idx}: invalid value. "
-                        f"Expected a number in the range [0; {nodes_count - 1}]"
-                    )
+        joints = values.get("joints")
+        if joints is not None:
+            for joint_idx, joint in enumerate(joints):
+                for v in joint:
+                    if not (0 <= v < nodes_count):
+                        raise ValueError(
+                            f"Skeleton '{skeleton_name}' joint #{joint_idx}: invalid value. "
+                            f"Expected a number in the range [0; {nodes_count - 1}]"
+                        )

         return values

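The effect of this change can be tried in isolation. The sketch below pulls the joint check into a hypothetical free function, `validate_joints`, operating on a plain dict; in the PR the logic lives inside a pydantic validator, so this is only an approximation of the behaviour.

```python
from typing import Any


def validate_joints(skeleton: dict[str, Any]) -> None:
    """Check that every joint references an existing node; `joints` may be absent."""
    nodes_count = len(skeleton["nodes"])
    joints = skeleton.get("joints")
    if joints is None:
        return  # a manifest without joints is now accepted

    for joint_idx, joint in enumerate(joints):
        for v in joint:
            if not (0 <= v < nodes_count):
                raise ValueError(
                    f"joint #{joint_idx}: invalid value. "
                    f"Expected a number in the range [0; {nodes_count - 1}]"
                )
```

A skeleton with only `nodes` passes, `{"nodes": ["head", "tail"], "joints": [[0, 1]]}` passes, and `[[0, 2]]` is rejected because index 2 is out of range for two nodes.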
31 changes: 31 additions & 0 deletions packages/examples/cvat/exchange-oracle/src/core/tasks/points.py
@@ -0,0 +1,31 @@
+import os
+from pathlib import Path
+from tempfile import TemporaryDirectory
+
+import datumaro as dm
+
+# These details are relevant for image_points tasks
+
+
+class TaskMetaLayout:
+    GT_FILENAME = "gt.json"
+
+
+class TaskMetaSerializer:
+    GT_DATASET_FORMAT = "datumaro"
+
+    def serialize_gt_annotations(self, gt_dataset: dm.Dataset) -> bytes:
+        with TemporaryDirectory() as temp_dir:
+            gt_dataset_dir = os.path.join(temp_dir, "gt_dataset")
+            gt_dataset.export(gt_dataset_dir, self.GT_DATASET_FORMAT)
+            return (Path(gt_dataset_dir) / "annotations" / "default.json").read_bytes()
+
+    def parse_gt_annotations(self, gt_dataset_data: bytes) -> dm.Dataset:
+        with TemporaryDirectory() as temp_dir:
+            annotations_filename = os.path.join(temp_dir, "default.json")
+            with open(annotations_filename, "wb") as f:
+                f.write(gt_dataset_data)
+
+            dataset = dm.Dataset.import_from(temp_dir, format=self.GT_DATASET_FORMAT)
+            dataset.init_cache()
+            return dataset
@@ -4,7 +4,7 @@

 import datumaro as dm

-# These details are relevant for image_points and image_boxes tasks
+# These details are relevant for image_boxes and image_polygons tasks


 class TaskMetaLayout:
@@ -186,6 +186,8 @@ def track_task_creation(logger: logging.Logger, session: Session) -> None:
                         upload.task_id,
                         upload.task.cvat_project_id,
                         status=JobStatuses(cvat_job.state),
+                        start_frame=cvat_job.start_frame,
+                        stop_frame=cvat_job.stop_frame,
                     )

                 completed.append(upload)