Fix explore command without project #1271

Merged 2 commits on Feb 20, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -73,6 +73,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   (<https://github.com/openvinotoolkit/datumaro/pull/1253>)
 - Remove deprecated MediaManager
   (<https://github.com/openvinotoolkit/datumaro/pull/1262>)
+- Fix explore command without project
+  (<https://github.com/openvinotoolkit/datumaro/pull/1271>)

 ## Jan. 2024 Release 1.5.2
 ### Enhancements
@@ -14,7 +14,7 @@ Usage:
 ```console
 datum explore [target] [--query-img-path <path/to/image>]
               [--query-item-id </id/of/image/datasetitem> --query-item-subset <subset/of/image>]
-              [--query-str <text_query>] [-topk TOPK] [-p PROJECT_DIR] [-s SAVE] [--stage STAGE]
+              [--query-str <text_query>] [-topk TOPK] [-p PROJECT_DIR] [-s SAVE] [-o DST_DIR] [--stage STAGE]
 ```

 Parameters:
@@ -27,6 +27,7 @@ Parameters:
 - `-topk` (int) - Number how much you want to find similar data.
 - `-p, --project` (string) - Directory of the project to operate on (default: current directory).
 - `-s, --save` (bool) - Save explorer result files on explore_result folder.
+- `-o, --output-dir` (string) - Output directory. By default, a new directory is created in the current directory.
 - `--stage` (bool) - Include this action as a project build step.
   If true, this operation will be saved in the project
   build tree, allowing to reproduce the resulting dataset later.
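
Review note: the options listed in this hunk can be illustrated with a small standalone `argparse` sketch. It is not Datumaro's actual parser. The function name `build_sketch_parser`, the `store_true` action for `-s`, and the `"."` default for `-p` are assumptions made for illustration; only the option strings and the `dst_dir` destination with a `None` default come from this PR.

```python
import argparse


def build_sketch_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in that mirrors only the option names shown in this PR."""
    parser = argparse.ArgumentParser(prog="datum explore")
    parser.add_argument("target", nargs="?", default=None)
    parser.add_argument("-topk", type=int, dest="topk")
    # Assumption: "." stands for "current directory"; the real default may differ.
    parser.add_argument("-p", "--project", dest="project_dir", default=".")
    # Assumption: -s is treated as a plain flag, as in the integration test.
    parser.add_argument("-s", "--save", action="store_true")
    # From the PR: -o/--output-dir is stored as dst_dir and defaults to None.
    parser.add_argument("-o", "--output-dir", dest="dst_dir", default=None)
    return parser


args = build_sketch_parser().parse_args(["dataset/", "-topk", "2", "-s", "-o", "out/"])
print(args.target, args.topk, args.save, args.dst_dir)
```

Leaving `dst_dir` as `None` when `-o` is omitted lets the command decide on a fallback location later instead of baking a path into the parser.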
@@ -65,6 +65,12 @@ The Python example for the usage of explorer is described in :doc:`here <../../j

         ``QUERY_STR`` could be text description or list of them

+        .. code-block:: bash
+
+            datum explore <target> --query-str QUERY_STR -topk TOPK_NUM -s -o DST_DIR
+
+        To save the result, specify the output directory as ``DST_DIR``
+
     .. tab-item:: ProjectCLI

         With the project-based CLI, we first require to ``create`` a project by
33 changes: 19 additions & 14 deletions src/datumaro/cli/commands/explore.py
@@ -5,7 +5,6 @@
 import argparse
 import logging as log
 import os
-import os.path as osp
 import shutil
 import uuid

@@ -88,6 +87,13 @@
         default=False,
         help="Save explorer result files on explore_result folder",
     )
+    parser.add_argument(
+        "-o",
+        "--output-dir",
+        dest="dst_dir",
+        default=None,
+        help="Directory to save explore results " "(default: generate automatically)",
+    )
     parser.add_argument(
         "--stage",
         type=str_to_bool,

Codecov (codecov/patch): added line #L90 was not covered by tests.
@@ -112,6 +118,7 @@
             "target",
             "topk",
             "project_dir",
+            "dst_dir",
         ]
     }

@@ -130,22 +137,20 @@
     else:
         targets = list(project.working_tree.sources)

-    source_datasets = []
-    for target in targets:
-        target_dataset, _ = parse_full_revpath(target, project)
-        source_datasets.append(target_dataset)
+    source_datasets = [parse_full_revpath(target, project)[0] for target in targets]

     explorer_args = {"save_hashkey": True}
-    build_tree = project.working_tree.clone()
-    for target in targets:
-        build_tree.build_targets.add_explore_stage(target, params=explorer_args)
+    if project:
+        build_tree = project.working_tree.clone()
+        for target in targets:
+            build_tree.build_targets.add_explore_stage(target, params=explorer_args)

     explorer = Explorer(*source_datasets)
     for dataset in source_datasets:
         dst_dir = dataset.data_path
         dataset.save(dst_dir, save_media=True, save_hashkey_meta=True)

-    if args.stage:
+    if args.stage and project:
         project.working_tree.config.update(build_tree.config)
         project.working_tree.save()

Codecov (codecov/patch): added lines #L144 and #L146 were not covered by tests.

@@ -179,14 +184,14 @@
         log.info(f"id: {result.id} | subset: {result.subset} | path : {path}")

     if args.save:
-        saved_result_path = osp.join(args.project_dir, "explore_result")
-        if osp.exists(saved_result_path):
+        saved_result_path = args.dst_dir or os.path.join(args.project_dir, "explore_result")
+        if os.path.exists(saved_result_path):
             shutil.rmtree(saved_result_path)
         os.makedirs(saved_result_path)
         for result in results:
-            saved_subset_path = osp.join(saved_result_path, result.subset)
-            if not osp.exists(saved_subset_path):
+            saved_subset_path = os.path.join(saved_result_path, result.subset)
+            if not os.path.exists(saved_subset_path):
                 os.makedirs(saved_subset_path)
-            shutil.copyfile(path, osp.join(saved_subset_path, result.id + ".jpg"))
+            shutil.copyfile(path, os.path.join(saved_subset_path, result.id + ".jpg"))

     return 0

Codecov (codecov/patch): added lines #L187, #L192, and #L195 were not covered by tests.
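
Review note: the save step in this hunk writes to the `-o` directory when one is given and otherwise falls back to `<project_dir>/explore_result`. This can be sketched in isolation as below. The helper names `resolve_result_dir` and `save_results`, and the `(subset, item_id, path)` tuples standing in for explorer results, are hypothetical and not part of Datumaro. The sketch assumes each result carries its own source path.

```python
import os
import shutil
import tempfile


def resolve_result_dir(dst_dir, project_dir):
    """Return dst_dir when it is set, otherwise <project_dir>/explore_result."""
    return dst_dir or os.path.join(project_dir, "explore_result")


def save_results(results, dst_dir, project_dir):
    """Copy each (subset, item_id, path) triple to <result_dir>/<subset>/<item_id>.jpg."""
    result_dir = resolve_result_dir(dst_dir, project_dir)
    # Start from an empty directory, as the diff does with rmtree followed by makedirs.
    if os.path.exists(result_dir):
        shutil.rmtree(result_dir)
    os.makedirs(result_dir)
    for subset, item_id, path in results:
        subset_dir = os.path.join(result_dir, subset)
        os.makedirs(subset_dir, exist_ok=True)
        shutil.copyfile(path, os.path.join(subset_dir, item_id + ".jpg"))
    return result_dir


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "1.jpg")
    with open(src, "wb") as f:
        f.write(b"placeholder bytes, not a real image")
    out = save_results([("train", "1", src)], None, tmp)
    print(os.path.relpath(out, tmp))  # explore_result
```

Because the target directory is removed before writing, pointing `-o` at a directory that already holds other files would delete them, so a dedicated output directory is the safer choice.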
28 changes: 28 additions & 0 deletions tests/integration/cli/test_explore.py
@@ -169,6 +169,34 @@ def test_can_explore_dataset_w_query_str(self):

         self.assertIn(osp.join(saved_result_path, "train", "1.jpg"), results)

+    @mark_requirement(Requirements.DATUM_GENERAL_REQ)
+    @scoped
+    def test_can_explore_dataset_w_target(self):
+        test_dir = scope_add(TestDir())
+        proj_dir = osp.join(test_dir, "proj")
+        dataset_url = osp.join(test_dir, "dataset")
+        train_image_path = osp.join(dataset_url, "images", "train", "1.jpg")
+        saved_result_path = osp.join(proj_dir, "explore_result")
+
+        self.test_dataset.export(dataset_url, "datumaro", save_media=True)
+
+        run(
+            self,
+            "explore",
+            dataset_url,
+            "--query-img-path",
+            train_image_path,
+            "-topk",
+            "2",
+            "-s",
+            "-o",
+            saved_result_path,
+        )
+
+        results = glob(osp.join(saved_result_path, "**", "*"), recursive=True)
+
+        self.assertIn(osp.join(saved_result_path, "train", "1.jpg"), results)
+
     @mark_requirement(Requirements.DATUM_GENERAL_REQ)
     @scoped
     def test_can_explore_dataset_wo_target(self):