[BUG] Unable to run opensearch-benchmark in test mode #245

kotwanikunal · 2023-03-27T23:31:22Z

Describe the bug

opensearch-benchmark fails without running the benchmark against a local OpenSearch server.
Logs are below.

To Reproduce

Install the latest version from PyPI (pip3 install opensearch-benchmark)
Execute the test command

opensearch-benchmark execute_test --target-host=localhost:9200 --workload=nyc_taxis --pipeline=benchmark-only --test-mode --kill-running-processes

Expected behavior

The benchmark should run in test mode.

Logs

2023-03-27 23:25:05,311 ActorAddr-(T|:49871)/PID:73989 osbenchmark.actor ERROR Error in test execution orchestrator
Traceback (most recent call last):

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/actor.py", line 92, in guard
    return f(self, msg, sender)

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/test_execution_orchestrator.py", line 108, in receiveMsg_Setup
    self.coordinator.setup(sources=msg.sources)

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/test_execution_orchestrator.py", line 195, in setup
    self.current_workload = workload.load_workload(self.cfg)

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/workload/loader.py", line 192, in load_workload
    repo = workload_repo(cfg)

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/workload/loader.py", line 290, in workload_repo
    return GitWorkloadRepository(cfg, fetch, update)

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/workload/loader.py", line 331, in __init__
    self.repo.update(distribution_version)

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/utils/repo.py", line 68, in update
    branch = versions.best_match(git.branches(self.repo_dir, remote=self.remote), distribution_version)

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/utils/git.py", line 44, in probe
    return f(src, *args, **kwargs)

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/utils/git.py", line 123, in branches
    return _cleanup_remote_branch_names(process.run_subprocess_with_output(

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/utils/git.py", line 137, in _cleanup_remote_branch_names
    return [(b[b.index("/") + 1:]).strip() for b in branch_names if not b.endswith("/HEAD")]

  File "/Users/kkotwani/.pyenv/versions/3.8.16/lib/python3.8/site-packages/osbenchmark/utils/git.py", line 137, in <listcomp>
    return [(b[b.index("/") + 1:]).strip() for b in branch_names if not b.endswith("/HEAD")]

ValueError: substring not found

More Context (please complete the following information):

Workload(Share link for custom workloads): nyc_taxis
Service(E.g OpenSearch): OpenSearch
Version (E.g. 1.0): 3.0.0/Latest main

Additional context

Running on an arm64 M1 Macbook

IanHoang · 2023-03-28T19:06:55Z

Thanks for bringing this to our attention Kunal. I've also been experiencing this issue recently in the integration tests and am currently in the process of identifying a fix. Looks like the issue arises for users who are starting from scratch and is not detected for users who have workloads already preloaded and unzipped.

IanHoang · 2023-03-28T19:50:41Z

Curious to see if this issue only occurs when we provision an OpenSearch cluster via OSB. I've been using an external host and found that it works with that. This is good to note, since it means we should test changes against both external and OSB-provisioned hosts.

kartg · 2023-03-28T20:23:01Z

Curious to see if this issue only occurs when we provision an opensearch cluster via OSB

I'm currently hitting this issue when running OSB against a remote endpoint:

opensearch-benchmark execute_test --workload geonames --workload-params "bulk_indexing_clients:1" --pipeline benchmark-only --target-hosts [endpoint]:[port]

Note that I get the same stack trace and error when running opensearch-benchmark list workloads.

IanHoang · 2023-03-28T20:35:14Z

Thanks for the heads up @kartg. I've been able to get it to work with an external host. Will dive further into this.

$ ~ % opensearch-benchmark execute_test --target-host=<endpoint> --client-options="basic_auth_user:'<username>',basic_auth_password:'<password>'" --workload=nyc_taxis --pipeline=benchmark-only --test-mode --kill-running-processes


   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/

[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
[INFO] Executing test with workload [nyc_taxis], test_procedure [append-no-conflicts] and provision_config_instance ['external'] with version [1.1.0].

[WARNING] merges_total_time is 94 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] indexing_total_time is 253 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 289 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 86 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running delete-index                                                           [100% done]
Running create-index                                                           [100% done]
Running check-cluster-health                                                   [100% done]
Running index                                                                  [100% done]
Running refresh-after-index                                                    [100% done]
Running force-merge                                                            [100% done]
Running refresh-after-force-merge                                              [100% done]
Running wait-until-merges-finish                                               [100% done]
Running default                                                                [100% done]
Running range                                                                  [100% done]
Running distance_amount_agg                                                    [100% done]
Running autohisto_agg                                                          [100% done]
Running date_histogram_agg                                                     [100% done]

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

|                                                         Metric |                     Task |       Value |   Unit |
|---------------------------------------------------------------:|-------------------------:|------------:|-------:|
|                     Cumulative indexing time of primary shards |                          |  0.00363333 |    min |
|             Min cumulative indexing time across primary shards |                          |           0 |    min |
|          Median cumulative indexing time across primary shards |                          | 0.000366667 |    min |
|             Max cumulative indexing time across primary shards |                          |       0.001 |    min |
|            Cumulative indexing throttle time of primary shards |                          |           0 |    min |
|    Min cumulative indexing throttle time across primary shards |                          |           0 |    min |
| Median cumulative indexing throttle time across primary shards |                          |           0 |    min |
|    Max cumulative indexing throttle time across primary shards |                          |           0 |    min |
|                        Cumulative merge time of primary shards |                          |  0.00156667 |    min |
|                       Cumulative merge count of primary shards |                          |           1 |        |
|                Min cumulative merge time across primary shards |                          |           0 |    min |
|             Median cumulative merge time across primary shards |                          |           0 |    min |
|                Max cumulative merge time across primary shards |                          |  0.00156667 |    min |
|               Cumulative merge throttle time of primary shards |                          |           0 |    min |
|       Min cumulative merge throttle time across primary shards |                          |           0 |    min |
|    Median cumulative merge throttle time across primary shards |                          |           0 |    min |
|       Max cumulative merge throttle time across primary shards |                          |           0 |    min |
|                      Cumulative refresh time of primary shards |                          |       0.005 |    min |
|                     Cumulative refresh count of primary shards |                          |         228 |        |
|              Min cumulative refresh time across primary shards |                          |           0 |    min |
|           Median cumulative refresh time across primary shards |                          | 0.000216667 |    min |
|              Max cumulative refresh time across primary shards |                          |  0.00348333 |    min |
|                        Cumulative flush time of primary shards |                          |  0.00143333 |    min |
|                       Cumulative flush count of primary shards |                          |           8 |        |
|                Min cumulative flush time across primary shards |                          |           0 |    min |
|             Median cumulative flush time across primary shards |                          |    0.000225 |    min |
|                Max cumulative flush time across primary shards |                          | 0.000266667 |    min |
|                                        Total Young Gen GC time |                          |           0 |      s |
|                                       Total Young Gen GC count |                          |           0 |        |
|                                          Total Old Gen GC time |                          |           0 |      s |
|                                         Total Old Gen GC count |                          |           0 |        |
|                                                     Store size |                          | 0.000688318 |     GB |
|                                                  Translog size |                          |  5.6345e-07 |     GB |
|                                         Heap used for segments |                          |   0.0620499 |     MB |
|                                       Heap used for doc values |                          |    0.021553 |     MB |
|                                            Heap used for terms |                          |   0.0317993 |     MB |
|                                            Heap used for norms |                          |  0.00402832 |     MB |
|                                           Heap used for points |                          |           0 |     MB |
|                                    Heap used for stored fields |                          |  0.00466919 |     MB |
|                                                  Segment count |                          |          10 |        |
|                                                 Min Throughput |                    index |     2793.59 | docs/s |
|                                                Mean Throughput |                    index |     2793.59 | docs/s |
|                                              Median Throughput |                    index |     2793.59 | docs/s |
|                                                 Max Throughput |                    index |     2793.59 | docs/s |
|                                        50th percentile latency |                    index |     290.706 |     ms |
|                                       100th percentile latency |                    index |     310.121 |     ms |
|                                   50th percentile service time |                    index |     290.706 |     ms |
|                                  100th percentile service time |                    index |     310.121 |     ms |
|                                                     error rate |                    index |           0 |      % |
|                                                 Min Throughput | wait-until-merges-finish |        4.33 |  ops/s |
|                                                Mean Throughput | wait-until-merges-finish |        4.33 |  ops/s |
|                                              Median Throughput | wait-until-merges-finish |        4.33 |  ops/s |
|                                                 Max Throughput | wait-until-merges-finish |        4.33 |  ops/s |
|                                       100th percentile latency | wait-until-merges-finish |     198.273 |     ms |
|                                  100th percentile service time | wait-until-merges-finish |     198.273 |     ms |
|                                                     error rate | wait-until-merges-finish |           0 |      % |
|                                                 Min Throughput |                  default |        4.53 |  ops/s |
|                                                Mean Throughput |                  default |        4.53 |  ops/s |
|                                              Median Throughput |                  default |        4.53 |  ops/s |
|                                                 Max Throughput |                  default |        4.53 |  ops/s |
|                                       100th percentile latency |                  default |     398.297 |     ms |
|                                  100th percentile service time |                  default |     177.316 |     ms |
|                                                     error rate |                  default |           0 |      % |
|                                                 Min Throughput |                    range |        4.96 |  ops/s |
|                                                Mean Throughput |                    range |        4.96 |  ops/s |
|                                              Median Throughput |                    range |        4.96 |  ops/s |
|                                                 Max Throughput |                    range |        4.96 |  ops/s |
|                                       100th percentile latency |                    range |      380.03 |     ms |
|                                  100th percentile service time |                    range |     178.118 |     ms |
|                                                     error rate |                    range |           0 |      % |
|                                                 Min Throughput |      distance_amount_agg |        4.57 |  ops/s |
|                                                Mean Throughput |      distance_amount_agg |        4.57 |  ops/s |
|                                              Median Throughput |      distance_amount_agg |        4.57 |  ops/s |
|                                                 Max Throughput |      distance_amount_agg |        4.57 |  ops/s |
|                                       100th percentile latency |      distance_amount_agg |     390.726 |     ms |
|                                  100th percentile service time |      distance_amount_agg |     171.877 |     ms |
|                                                     error rate |      distance_amount_agg |           0 |      % |
|                                                 Min Throughput |            autohisto_agg |        4.96 |  ops/s |
|                                                Mean Throughput |            autohisto_agg |        4.96 |  ops/s |
|                                              Median Throughput |            autohisto_agg |        4.96 |  ops/s |
|                                                 Max Throughput |            autohisto_agg |        4.96 |  ops/s |
|                                       100th percentile latency |            autohisto_agg |     378.538 |     ms |
|                                  100th percentile service time |            autohisto_agg |     176.765 |     ms |
|                                                     error rate |            autohisto_agg |           0 |      % |
|                                                 Min Throughput |       date_histogram_agg |        4.57 |  ops/s |
|                                                Mean Throughput |       date_histogram_agg |        4.57 |  ops/s |
|                                              Median Throughput |       date_histogram_agg |        4.57 |  ops/s |
|                                                 Max Throughput |       date_histogram_agg |        4.57 |  ops/s |
|                                       100th percentile latency |       date_histogram_agg |     398.352 |     ms |
|                                  100th percentile service time |       date_histogram_agg |     179.242 |     ms |
|                                                     error rate |       date_histogram_agg |           0 |      % |


--------------------------------
[INFO] SUCCESS (took 16 seconds)

IanHoang · 2023-03-28T21:40:02Z

@kartg @kotwanikunal Could we have more context on your setups:

@kotwanikunal Since you are not specifying --distribution-version in your OSB command, have you already set up an OpenSearch cluster locally?
@kartg Which Operating System are you running on? Also, what OpenSearch version are you running with?
For both of @kotwanikunal @kartg: Could you visit ~/.benchmark/benchmarks/workloads/default/ and run git status and provide the output? Curious to see what branch it's defaulted to.

kotwanikunal · 2023-03-28T21:49:20Z

@kotwanikunal Since you are not specifying --distribution-version in your OSB command, have you already set up an OpenSearch cluster locally?

Yes, I have set up a cluster locally.

For both of @kotwanikunal @kartg: Could you visit ~/.benchmark/benchmarks/workloads/default/ and run git status and provide the output? Curious to see what branch it's defaulted to.

 ~ % cd ~/.benchmark/benchmarks/workloads/default/
default % git status
On branch main
Your branch is behind 'origin/main' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)

Tried pulling the latest and running again, still the same issue.

IanHoang · 2023-03-28T21:58:22Z

@kotwanikunal Could you try the following:

Try testing with an earlier version for OpenSearch (such as 2.6.0) to see if the error still exists.
Pull/Run the docker image for OpenSearch 2.6.0 and retry.
Let us know if you still encounter the issue for both of these attempts

kartg · 2023-03-28T21:59:38Z

@kartg Which Operating System are you running on? Also, what OpenSearch version are you running with?

I'm running OSB on macOS 12.6.3 on an Intel-powered Macbook. My target cluster is on OpenSearch 1.3

For both of @kotwanikunal @kartg: Could you visit ~/.benchmark/benchmarks/workloads/default/ and run git status and provide the output? Curious to see what branch it's defaulted to.

$ cd ~/.benchmark/benchmarks/workloads/default/
$ git status
On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean

kartg · 2023-03-28T22:31:56Z

@IanHoang the error seems to originate from executing this git command:

opensearch-benchmark/osbenchmark/utils/git.py

Line 124 in a1f4550

           "git -C {src} for-each-ref refs/remotes/ --format='%(refname:short)'".format(src=clean_src)))

and then trying to parse its output:

opensearch-benchmark/osbenchmark/utils/git.py

Line 137 in a1f4550

           return [(b[b.index("/") + 1:]).strip() for b in branch_names if not b.endswith("/HEAD")]

Here's the output of that git command against the directory where the workloads repo is cloned:

$ git -C ~/.benchmark/benchmarks/workloads/default for-each-ref refs/remotes/ --format='%(refname:short)'
origin/1
origin/2
origin/3
origin/6
origin/7
origin
origin/main

I'm guessing that the "origin" entry in this list is causing the parsing to fail (since it has no /)

EDIT: Can confirm that I can repro this error signature with a simple Python script:

$ cat test.py 
import os
import sys

dir = sys.argv[1]
stream = os.popen("git -C {src} for-each-ref refs/remotes/ --format='%(refname:short)'".format(src=dir))
branch_names = stream.readlines()
for b in branch_names:
  print((b[b.index("/") + 1:]).strip())

$ python test.py ~/.benchmark/benchmarks/workloads/default/
1
2
3
6
7
Traceback (most recent call last):
  File "/Users/gkart/test.py", line 8, in <module>
    print((b[b.index("/") + 1:]).strip())
             ^^^^^^^^^^^^
ValueError: substring not found

Seems related to the new main branch (cc @tlfeng) that was added to the workloads repo recently - https://github.com/opensearch-project/opensearch-benchmark-workloads/branches

kartg · 2023-03-28T23:15:56Z

It looks like fixing this will need either:

An update to the logic in _cleanup_remote_branch_names, or
A change in the branch names of the workloads repo to only use numerical branch names
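
The first option could look roughly like the sketch below. It is a hypothetical variant for illustration only, not the fix that was adopted: the function name `cleanup_remote_branch_names_tolerant` is made up here, and the sample lists are hand-written from the output shown earlier in this thread. It keeps `%(refname:short)` and skips any entry that has no `/` (such as the bare `origin`) as well as `/HEAD`.

```python
def cleanup_remote_branch_names_tolerant(branch_names):
    """Strip the remote prefix from short ref names, skipping symbolic entries.

    Hypothetical helper for illustration; not part of osbenchmark.
    """
    cleaned = []
    for raw in branch_names:
        name = raw.strip()
        # partition() never raises: sep is "" when the name contains no "/".
        _remote, sep, branch = name.partition("/")
        # Skip a bare remote name (e.g. "origin") and the HEAD pointer.
        if not sep or branch == "HEAD":
            continue
        cleaned.append(branch)
    return cleaned


# Shape of the output reported with git 2.40.0 (bare "origin" entry).
new_style = ["origin/1\n", "origin/2\n", "origin\n", "origin/main\n"]
# Shape of the output reported with older git versions.
old_style = ["origin/1\n", "origin/2\n", "origin/HEAD\n", "origin/main\n"]

print(cleanup_remote_branch_names_tolerant(new_style))  # ['1', '2', 'main']
print(cleanup_remote_branch_names_tolerant(old_style))  # ['1', '2', 'main']
```

Because only the first `/` is used as the separator, a branch name that itself contains a `/` would be kept intact.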

In the meantime, here's a rather ugly workaround:

First, run:

git -C ~/.benchmark/benchmarks/workloads/default update-ref -d refs/remotes/origin/main

Then, run the following command to open up the git config in your editor:

git -C ~/.benchmark/benchmarks/workloads/default config -e

and then change the following line:

fetch = +refs/heads/*:refs/remotes/origin/*

to something like:

fetch = +refs/heads/*:refs/real-remotes/origin/*

(or any string other than refs/remotes/). This should now allow OSB to execute normally.

kotwanikunal · 2023-03-28T23:35:50Z

Thanks @kartg! @IanHoang this worked for me.

gkamat · 2023-03-28T23:43:32Z

While trying to reproduce this scenario by cloning the workloads repository and checking the refs:

$ git clone https://github.com/opensearch-project/opensearch-benchmark-workloads
$ git -C opensearch-benchmark-workloads for-each-ref refs/remotes/ --format='%(refname:short)'
origin/1
origin/2
origin/3
origin/6
origin/7
origin/HEAD
origin/main

There are no entries without a /. Where is the offending ref coming from?

kartg · 2023-03-28T23:52:09Z

@gkamat interesting, maybe it's a change in the git command? The version on my machine simply drops the /HEAD suffix

$ git clone https://github.com/opensearch-project/opensearch-benchmark-workloads

$ git for-each-ref refs/remotes/ --format='%(refname:short)'
origin/1
origin/2
origin/3
origin/6
origin/7
origin
origin/main

$ git --version
git version 2.40.0

kotwanikunal · 2023-03-29T00:00:51Z

I have the same output as @kartg

workspace % git -C opensearch-benchmark-workloads for-each-ref refs/remotes/ --format='%(refname:short)'
origin/1
origin/2
origin/3
origin/6
origin/7
origin
origin/main
workspace % git --version
git version 2.40.0

gkamat · 2023-03-29T00:26:41Z

I'm using the version that gets installed by yum on AL2:

$ git --version
git version 2.39.2

Which platform are you using?

kotwanikunal · 2023-03-29T00:31:01Z

Which platform are you using?

M1 Macbook Pro on Ventura 13.2.1

gkamat · 2023-03-29T00:57:22Z

That is a rather new version of git. It will need to be built from source -- even the Linux tarballs at https://git-scm.com/download/linux end at 2.39.2.

kartg · 2023-03-29T01:06:05Z

Nah, FTP just sorts 2.40 after 2.4.* 😄 https://mirrors.edge.kernel.org/pub/software/scm/git/git-2.40.0.tar.sign

gkamat · 2023-03-29T01:06:28Z

Just rebuilt git from source. Yes, the behaviour of the new version is different, as suspected. A code change will be needed to fix this.

gkamat · 2023-03-29T02:34:26Z

@IanHoang, changing the git command to use the full refname and indexing from the right should likely fix this issue:

        return _cleanup_remote_branch_names(process.run_subprocess_with_output(
                "git -C {src} for-each-ref refs/remotes/ --format='%(refname)'".format(src=clean_src)))

def _cleanup_remote_branch_names(branch_names):
    return [(b[b.rindex("/") + 1:]).strip() for b in branch_names if not b.endswith("/HEAD")]

IanHoang · 2023-03-29T16:13:19Z

@kartg Thanks for diving into this! Just caught up on the thread.

TL;DR:

Reproduced the issue in a clean Ubuntu environment and can confirm that the issue lies in Git versions 2.40.0+. I propose we use @gkamat's recommended fix after testing it out.

I'll open a PR to address this fix.
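
For anyone wanting to check whether their environment is likely affected, something along these lines could be applied to the output of `git --version`. The helper names and the `(2, 40, 0)` threshold are assumptions based only on the versions reported in this thread; the exact release in which the behaviour changed is not established here.

```python
import re

# Threshold assumed from this thread: 2.39.2 showed the old output, 2.40.0 the new one.
FIRST_AFFECTED = (2, 40, 0)


def parse_git_version(version_output):
    """Extract (major, minor, patch) from text such as 'git version 2.40.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognised git version string: %r" % version_output)
    return tuple(int(part) for part in match.groups())


def is_affected(version_output):
    """Return True if the reported version is at or above the assumed threshold."""
    return parse_git_version(version_output) >= FIRST_AFFECTED


# Version strings copied from the outputs posted in this thread.
for sample in ("git version 2.33.0", "git version 2.39.2", "git version 2.40.0"):
    print(sample, "-> affected:", is_affected(sample))
```

Comparing tuples of integers avoids the string-sorting pitfall where `2.40` appears to sort near `2.4`.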

I'm in the same boat as Govind and am using an earlier version of Git, which doesn't drop the /HEAD suffix from origin/HEAD:

hoangia@3c22fbd0d988 default % git for-each-ref refs/remotes/ --format='%(refname:short)'
origin/1
origin/2
origin/3
origin/6
origin/7
origin/HEAD
origin/main
hoangia@3c22fbd0d988 default % git --version
git version 2.33.0

Reproduced issue in Ubuntu:

Confirmed that the issue depends on the Git version.

Started with git version 2.34.1

ubuntu@ip-172-31-80-80:~$ opensearch-benchmark list workloads

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/

Available workloads:

Name           Description                                                                                                        Documents    Compressed Size    Uncompressed Size    Default TestProcedure         All TestProcedures
-------------  -----------------------------------------------------------------------------------------------------------------  -----------  -----------------  -------------------  ----------------------------  ------------------------------------------------------------------------------------------------------------------------------------------------------------------
pmc            Full text benchmark with academic papers from PMC                                                                  574,199      5.5 GB             21.7 GB              append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-sorted-no-conflicts,append-fast-with-conflicts
nested         StackOverflow Q&A stored as nested docs                                                                            11,203,029   663.3 MB           3.4 GB               nested-search-test-procedure  nested-search-test-procedure,index-only
geoshape       Shapes from PlanetOSM                                                                                              60,523,283   13.4 GB            45.4 GB              append-no-conflicts           append-no-conflicts
percolator     Percolator benchmark based on AOL queries                                                                          2,000,000    121.1 kB           104.9 MB             append-no-conflicts           append-no-conflicts
so             Indexing benchmark using up to questions and answers from StackOverflow                                            36,062,278   8.9 GB             33.1 GB              append-no-conflicts           append-no-conflicts
noaa           Global daily weather measurements from NOAA                                                                        33,659,481   949.4 MB           9.0 GB               append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,top_metrics,aggs
http_logs      HTTP server log data                                                                                               247,249,096  1.2 GB             31.1 GB              append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-sorted-no-conflicts,append-index-only-with-ingest-pipeline,update,append-no-conflicts-index-reindex-only
geopointshape  Point coordinates from PlanetOSM indexed as geoshapes                                                              60,844,404   470.8 MB           2.6 GB               append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-fast-with-conflicts
geopoint       Point coordinates from PlanetOSM                                                                                   60,844,404   482.1 MB           2.3 GB               append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-fast-with-conflicts
geonames       POIs from Geonames                                                                                                 11,396,503   252.9 MB           3.3 GB               append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-sorted-no-conflicts,append-fast-with-conflicts,significant-text
nyc_taxis      Taxi rides in New York in 2015                                                                                     165,346,692  4.5 GB             74.3 GB              append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-sorted-no-conflicts-index-only,update,searchable-snapshot
eventdata      This benchmark indexes HTTP access logs generated based sample logs from the elastic.co website using a generator  20,000,000   756.0 MB           15.3 GB              append-no-conflicts           append-no-conflicts,transform

-------------------------------
[INFO] SUCCESS (took 1 seconds)
-------------------------------
ubuntu@ip-172-31-80-80:~$ git version
git version 2.34.1

Updated git to 2.40.0 and observed error when listing workloads

ubuntu@ip-172-31-80-80:~/.benchmark/benchmarks/workloads/default$ git --version
git version 2.40.0
ubuntu@ip-172-31-80-80:~/.benchmark/benchmarks/workloads/default$ opensearch-benchmark list workloads

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/

[ERROR] Cannot list. substring not found.

The logs show the same error:

  File "/home/ubuntu/opensearch-benchmark/osbenchmark/utils/git.py", line 123, in branches
    return _cleanup_remote_branch_names(process.run_subprocess_with_output(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/opensearch-benchmark/osbenchmark/utils/git.py", line 137, in _cleanup_remote_branch_names
    return [(b[b.index("/") + 1:]).strip() for b in branch_names if not b.endswith("/HEAD")]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/opensearch-benchmark/osbenchmark/utils/git.py", line 137, in <listcomp>
    return [(b[b.index("/") + 1:]).strip() for b in branch_names if not b.endswith("/HEAD")]
               ^^^^^^^^^^^^
ValueError: substring not found
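
The failure in the traceback can be reproduced in isolation. The sketch below copies the list comprehension from git.py line 137 into a hypothetical helper named `cleanup` and feeds it made-up branch names. The entry without a slash (`origin`) is an assumption about what a newer Git might print for the remote HEAD; the actual git output is not shown in this thread.

```python
def cleanup(branch_names):
    # Same expression as in the traceback (osbenchmark/utils/git.py, line 137).
    return [(b[b.index("/") + 1:]).strip() for b in branch_names if not b.endswith("/HEAD")]


# Hypothetical inputs, not captured from a real repository.
with_slashes = ["origin/HEAD", "origin/main", "origin/1"]
without_slash = ["origin", "origin/main", "origin/1"]

# "origin/HEAD" is filtered out and the "origin/" prefix is stripped.
print(cleanup(with_slashes))

try:
    cleanup(without_slash)
    error = None
except ValueError as exc:
    # str.index raises ValueError when "/" is absent from the name.
    error = str(exc)

print(error)
```

Any name that lacks a slash and does not end in `/HEAD` passes the filter and then fails inside `b.index("/")`, which matches the `<listcomp>` frame in the logs.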

Applying the recommended fix by @gkamat

Added the fix recommended by @gkamat to opensearch-benchmark/osbenchmark/utils/git.py and ran the list subcommand in environments with Git versions 2.40.0 and 2.33.0.

Remove :short from refname:short in this line:
https://github.com/IanHoang/opensearch-benchmark/blob/a1f45502b5b9e69bad1c8d10e7a6c30bd0ed8469/osbenchmark/utils/git.py#L123-L124
Update index to rindex in this line:
https://github.com/IanHoang/opensearch-benchmark/blob/a1f45502b5b9e69bad1c8d10e7a6c30bd0ed8469/osbenchmark/utils/git.py#L136-L137
Reran python3 -m pip install -e . to reinstall OSB in development mode.
Reran opensearch-benchmark list workloads and received the successful output:

ubuntu@ip-172-31-80-80:~/opensearch-benchmark$ opensearch-benchmark list workloads

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/

Available workloads:

Name           Description                                                                                                        Documents    Compressed Size    Uncompressed Size    Default TestProcedure         All TestProcedures
-------------  -----------------------------------------------------------------------------------------------------------------  -----------  -----------------  -------------------  ----------------------------  ------------------------------------------------------------------------------------------------------------------------------------------------------------------
pmc            Full text benchmark with academic papers from PMC                                                                  574,199      5.5 GB             21.7 GB              append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-sorted-no-conflicts,append-fast-with-conflicts
nested         StackOverflow Q&A stored as nested docs                                                                            11,203,029   663.3 MB           3.4 GB               nested-search-test-procedure  nested-search-test-procedure,index-only
geoshape       Shapes from PlanetOSM                                                                                              60,523,283   13.4 GB            45.4 GB              append-no-conflicts           append-no-conflicts
percolator     Percolator benchmark based on AOL queries                                                                          2,000,000    121.1 kB           104.9 MB             append-no-conflicts           append-no-conflicts
so             Indexing benchmark using up to questions and answers from StackOverflow                                            36,062,278   8.9 GB             33.1 GB              append-no-conflicts           append-no-conflicts
noaa           Global daily weather measurements from NOAA                                                                        33,659,481   949.4 MB           9.0 GB               append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,top_metrics,aggs
http_logs      HTTP server log data                                                                                               247,249,096  1.2 GB             31.1 GB              append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-sorted-no-conflicts,append-index-only-with-ingest-pipeline,update,append-no-conflicts-index-reindex-only
geopointshape  Point coordinates from PlanetOSM indexed as geoshapes                                                              60,844,404   470.8 MB           2.6 GB               append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-fast-with-conflicts
geopoint       Point coordinates from PlanetOSM                                                                                   60,844,404   482.1 MB           2.3 GB               append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-fast-with-conflicts
geonames       POIs from Geonames                                                                                                 11,396,503   252.9 MB           3.3 GB               append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-sorted-no-conflicts,append-fast-with-conflicts,significant-text
nyc_taxis      Taxi rides in New York in 2015                                                                                     165,346,692  4.5 GB             74.3 GB              append-no-conflicts           append-no-conflicts,append-no-conflicts-index-only,append-sorted-no-conflicts-index-only,update,searchable-snapshot
eventdata      This benchmark indexes HTTP access logs generated based sample logs from the elastic.co website using a generator  20,000,000   756.0 MB           15.3 GB              append-no-conflicts           append-no-conflicts,transform

-------------------------------
[INFO] SUCCESS (took 0 seconds)
-------------------------------
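
The two edits listed above can be sketched without a Git checkout. This is an illustration under assumptions, not the merged patch: `cleanup_fixed` is a hypothetical name, and the sample refs are made up to resemble full ref names, which is what dropping `:short` from `refname:short` should yield.

```python
def cleanup_fixed(branch_names):
    # rindex finds the last "/", so only the final path segment is kept.
    return [(b[b.rindex("/") + 1:]).strip() for b in branch_names if not b.endswith("/HEAD")]


# Hypothetical full ref names, not captured from a real repository.
full_refs = [
    "refs/remotes/origin/HEAD",
    "refs/remotes/origin/main",
    "refs/remotes/origin/1",
]

# The HEAD entry is filtered out by the endswith("/HEAD") check.
branches = cleanup_fixed(full_refs)
print(branches)
```

One consequence of taking the last segment: a branch whose own name contains a slash (say, a hypothetical `feature/x`) would be reduced to `x`. Whether that matters depends on how the workload repository names its branches.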

IanHoang · 2023-03-30T14:53:42Z

Closing issue as this has been resolved in PR #246

kotwanikunal added the bug and untriaged labels Mar 27, 2023

IanHoang removed the untriaged label Mar 28, 2023

IanHoang pushed a commit to IanHoang/opensearch-benchmark that referenced this issue Mar 29, 2023

f1a8806: Add fix for git.py to index from the right and reformat branches (opensearch-project#245) Signed-off-by: Ian Hoang <hoangia@amazon.com>

IanHoang mentioned this issue Mar 29, 2023

Add fix for git.py to index from the right and reformat branches (#245) #246 (Merged)

IanHoang added a commit that referenced this issue Mar 29, 2023

d11247a: Add fix for git.py to index from the right and reformat branches (#245) (#246) Signed-off-by: Ian Hoang <hoangia@amazon.com> Co-authored-by: Ian Hoang <hoangia@amazon.com>

IanHoang closed this as completed Mar 30, 2023
