Added a new metric: Client Processing Time #450

Merged on Feb 5, 2024 (7 commits)

saimedhi (Contributor)

Description

Introduces a new metric: Client Processing Time.

Client Processing Time: The difference between the total request time and the service time (total request time minus service time).

Total Request Time: The duration between the runner sending a request to the opensearch-py client and the runner receiving the response.

Service Time: The interval from the server receiving the request to the server sending the response. Note: The documentation was inconsistent about Service Time, and I have clarified my observations in the associated PR.
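
The definitions above reduce to a single subtraction. The following is a minimal sketch of that arithmetic, assuming the four timestamps are taken from the same clock. The `RequestTiming` class, its field names, and the sample values are hypothetical and are not taken from the PR.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical container for the timestamps (in ms) around one request."""

    runner_request_start: float  # runner hands the request to the client
    service_start: float         # start of the service-time interval
    service_end: float           # end of the service-time interval
    runner_response_end: float   # runner receives the response from the client

    @property
    def total_request_time(self) -> float:
        return self.runner_response_end - self.runner_request_start

    @property
    def service_time(self) -> float:
        return self.service_end - self.service_start

    @property
    def client_processing_time(self) -> float:
        # Delta between total request time and service time.
        return self.total_request_time - self.service_time


# Made-up timestamps, chosen only to illustrate the subtraction.
timing = RequestTiming(
    runner_request_start=0.0,
    service_start=0.25,
    service_end=2.25,
    runner_response_end=2.5,
)
print(timing.total_request_time, timing.service_time, timing.client_processing_time)
```

Because the metric is derived, any error in the measured service time carries over directly into the client processing time.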

Issues Resolved

#432

Signed-off-by: saimedhi <saimedhi@amazon.com>
saimedhi (Contributor Author)

I've noticed inaccuracies in how 'Service Time' is calculated in OpenSearch Benchmark. I'll raise an issue to address this. The accuracy of the Client Processing Time metric depends on the accuracy of the 'Service Time' used to compute it.

def time_func(func):
    async def advised(*args, **kwargs):
        request_context_holder.on_client_request_start()
        rsl = await func(*args, **kwargs)
Collaborator

What does rsl represent? Is there a more descriptive name we could use?

Contributor Author

@IanHoang, changed it to response

Contributor Author

@IanHoang, kindly inform me of any additional corrections needed. I'm ready to address them promptly. Let's move forward with merging. Thank you.
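
The snippet under review is cut off after the awaited call. For readers following the thread, here is a minimal, self-contained sketch of how such a timing wrapper might be completed with the `response` rename applied. The `RequestContextHolder` class is a hypothetical stand-in, `on_client_request_end` is an assumed counterpart to the `on_client_request_start` call shown in the excerpt, and `fake_search` is a made-up coroutine used only to exercise the wrapper.

```python
import asyncio
import functools
import time


class RequestContextHolder:
    """Hypothetical stand-in that records when the client call starts and ends."""

    def __init__(self):
        self.client_request_start = None
        self.client_request_end = None

    def on_client_request_start(self):
        self.client_request_start = time.perf_counter()

    def on_client_request_end(self):
        self.client_request_end = time.perf_counter()


request_context_holder = RequestContextHolder()


def time_func(func):
    @functools.wraps(func)
    async def advised(*args, **kwargs):
        request_context_holder.on_client_request_start()
        try:
            response = await func(*args, **kwargs)
        finally:
            # Record the end time even if the wrapped call raises.
            request_context_holder.on_client_request_end()
        return response

    return advised


@time_func
async def fake_search(query):
    """Made-up coroutine that simulates a short client call."""
    await asyncio.sleep(0.01)
    return {"query": query, "hits": []}


result = asyncio.run(fake_search("match_all"))
elapsed_s = (
    request_context_holder.client_request_end
    - request_context_holder.client_request_start
)
print(result, elapsed_s)
```

The `try`/`finally` is a design choice in this sketch, not something the excerpt shows: it keeps the end timestamp consistent when the wrapped call fails.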

saimedhi and others added 4 commits January 31, 2024 09:23
Signed-off-by: saimedhi <saimedhi@amazon.com>
saimedhi (Contributor Author) commented Feb 1, 2024

@IanHoang, @gkamat Please take a look

IanHoang (Collaborator) commented Feb 2, 2024

@saimedhi Overall, looks good. Did you run any tests, including any in --test-mode? If so, could you supply an example metric document from the benchmark-metrics-* index of an external datastore?

saimedhi (Contributor Author) commented Feb 5, 2024

https://search-benchmarks-test-6ykfxpbqxyaslahdtm2ofk7ige.us-west-2.es.amazonaws.com/benchmark-metrics-2024-02/_search

{
    "query": {
        "match_all": {}
    }
}

{
    "took": 6,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 324,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "benchmark-metrics-2024-02",
                "_id": "-NLreo0BAWfgjZcLzLhf",
                "_score": 1.0,
                "_source": {
                    "@timestamp": 1707164289535,
                    "relative-time-ms": 9.440874999999238,
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "environment": "local",
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "name": "processing_time",
                    "value": 8.88333299999644,
                    "unit": "ms",
                    "sample-type": "warmup",
                    "meta": {
                        "plugins": [
                            "opensearch-alerting",
                            "opensearch-anomaly-detection",
                            "opensearch-asynchronous-search",
                            "opensearch-cross-cluster-replication",
                            "opensearch-custom-codecs",
                            "opensearch-geospatial",
                            "opensearch-index-management",
                            "opensearch-job-scheduler",
                            "opensearch-knn",
                            "opensearch-ml",
                            "opensearch-neural-search",
                            "opensearch-notifications",
                            "opensearch-notifications-core",
                            "opensearch-observability",
                            "opensearch-performance-analyzer",
                            "opensearch-reports-scheduler",
                            "opensearch-security",
                            "opensearch-security-analytics",
                            "opensearch-sql"
                        ],
                        "attribute_shard_indexing_pressure_enabled": "true",
                        "source_revision": "6b1986e964d440be9137eba1413015c31c5a7752",
                        "distribution_version": "2.11.1",
                        "distribution_flavor": "oss",
                        "success": true
                    },
                    "task": "desc_sort_geonameid",
                    "operation": "desc_sort_geonameid",
                    "operation-type": "search"
                }
            },
            {
                "_index": "benchmark-metrics-2024-02",
                "_id": "-dLreo0BAWfgjZcLzLhf",
                "_score": 1.0,
                "_source": {
                    "@timestamp": 1707164289544,
                    "relative-time-ms": 18.039542000003905,
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "environment": "local",
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "name": "latency",
                    "value": 12.431333000002098,
                    "unit": "ms",
                    "sample-type": "normal",
                    "meta": {
                        "plugins": [
                            "opensearch-alerting",
                            "opensearch-anomaly-detection",
                            "opensearch-asynchronous-search",
                            "opensearch-cross-cluster-replication",
                            "opensearch-custom-codecs",
                            "opensearch-geospatial",
                            "opensearch-index-management",
                            "opensearch-job-scheduler",
                            "opensearch-knn",
                            "opensearch-ml",
                            "opensearch-neural-search",
                            "opensearch-notifications",
                            "opensearch-notifications-core",
                            "opensearch-observability",
                            "opensearch-performance-analyzer",
                            "opensearch-reports-scheduler",
                            "opensearch-security",
                            "opensearch-security-analytics",
                            "opensearch-sql"
                        ],
                        "attribute_shard_indexing_pressure_enabled": "true",
                        "source_revision": "6b1986e964d440be9137eba1413015c31c5a7752",
                        "distribution_version": "2.11.1",
                        "distribution_flavor": "oss",
                        "success": true
                    },
                    "task": "desc_sort_geonameid",
                    "operation": "desc_sort_geonameid",
                    "operation-type": "search"
                }
            },
            {
                "_index": "benchmark-metrics-2024-02",
                "_id": "-9Lreo0BAWfgjZcLzLhf",
                "_score": 1.0,
                "_source": {
                    "@timestamp": 1707164289544,
                    "relative-time-ms": 18.039542000003905,
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "environment": "local",
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "name": "client_processing_time",
                    "value": 0.3031250000020691,
                    "unit": "ms",
                    "sample-type": "normal",
                    "meta": {
                        "plugins": [
                            "opensearch-alerting",
                            "opensearch-anomaly-detection",
                            "opensearch-asynchronous-search",
                            "opensearch-cross-cluster-replication",
                            "opensearch-custom-codecs",
                            "opensearch-geospatial",
                            "opensearch-index-management",
                            "opensearch-job-scheduler",
                            "opensearch-knn",
                            "opensearch-ml",
                            "opensearch-neural-search",
                            "opensearch-notifications",
                            "opensearch-notifications-core",
                            "opensearch-observability",
                            "opensearch-performance-analyzer",
                            "opensearch-reports-scheduler",
                            "opensearch-security",
                            "opensearch-security-analytics",
                            "opensearch-sql"
                        ],
                        "attribute_shard_indexing_pressure_enabled": "true",
                        "source_revision": "6b1986e964d440be9137eba1413015c31c5a7752",
                        "distribution_version": "2.11.1",
                        "distribution_flavor": "oss",
                        "success": true
                    },
                    "task": "desc_sort_geonameid",
                    "operation": "desc_sort_geonameid",
                    "operation-type": "search"
                }
            },
            {
                "_index": "benchmark-metrics-2024-02",
                "_id": "_NLreo0BAWfgjZcLzLhf",
                "_score": 1.0,
                "_source": {
                    "@timestamp": 1707164289544,
                    "relative-time-ms": 18.039542000003905,
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "environment": "local",
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "name": "processing_time",
                    "value": 3.5443749999970464,
                    "unit": "ms",
                    "sample-type": "normal",
                    "meta": {
                        "plugins": [
                            "opensearch-alerting",
                            "opensearch-anomaly-detection",
                            "opensearch-asynchronous-search",
                            "opensearch-cross-cluster-replication",
                            "opensearch-custom-codecs",
                            "opensearch-geospatial",
                            "opensearch-index-management",
                            "opensearch-job-scheduler",
                            "opensearch-knn",
                            "opensearch-ml",
                            "opensearch-neural-search",
                            "opensearch-notifications",
                            "opensearch-notifications-core",
                            "opensearch-observability",
                            "opensearch-performance-analyzer",
                            "opensearch-reports-scheduler",
                            "opensearch-security",
                            "opensearch-security-analytics",
                            "opensearch-sql"
                        ],
                        "attribute_shard_indexing_pressure_enabled": "true",
                        "source_revision": "6b1986e964d440be9137eba1413015c31c5a7752",
                        "distribution_version": "2.11.1",
                        "distribution_flavor": "oss",
                        "success": true
                    },
                    "task": "desc_sort_geonameid",
                    "operation": "desc_sort_geonameid",
                    "operation-type": "search"
                }
            },
            {
                "_index": "benchmark-metrics-2024-02",
                "_id": "-v_reo0BBuuZzv8czmG6",
                "_score": 1.0,
                "_source": {
                    "@timestamp": 1707164290143,
                    "relative-time-ms": 13.536166999998045,
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "environment": "local",
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "name": "latency",
                    "value": 11.079457999997544,
                    "unit": "ms",
                    "sample-type": "normal",
                    "meta": {
                        "plugins": [
                            "opensearch-alerting",
                            "opensearch-anomaly-detection",
                            "opensearch-asynchronous-search",
                            "opensearch-cross-cluster-replication",
                            "opensearch-custom-codecs",
                            "opensearch-geospatial",
                            "opensearch-index-management",
                            "opensearch-job-scheduler",
                            "opensearch-knn",
                            "opensearch-ml",
                            "opensearch-neural-search",
                            "opensearch-notifications",
                            "opensearch-notifications-core",
                            "opensearch-observability",
                            "opensearch-performance-analyzer",
                            "opensearch-reports-scheduler",
                            "opensearch-security",
                            "opensearch-security-analytics",
                            "opensearch-sql"
                        ],
                        "attribute_shard_indexing_pressure_enabled": "true",
                        "source_revision": "6b1986e964d440be9137eba1413015c31c5a7752",
                        "distribution_version": "2.11.1",
                        "distribution_flavor": "oss",
                        "success": true
                    },
                    "task": "desc_sort_with_after_geonameid",
                    "operation": "desc_sort_with_after_geonameid",
                    "operation-type": "search"
                }
            },
            {
                "_index": "benchmark-metrics-2024-02",
                "_id": "_f_reo0BBuuZzv8czmG6",
                "_score": 1.0,
                "_source": {
                    "@timestamp": 1707164290143,
                    "relative-time-ms": 13.536166999998045,
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "environment": "local",
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "name": "processing_time",
                    "value": 4.338791999998648,
                    "unit": "ms",
                    "sample-type": "normal",
                    "meta": {
                        "plugins": [
                            "opensearch-alerting",
                            "opensearch-anomaly-detection",
                            "opensearch-asynchronous-search",
                            "opensearch-cross-cluster-replication",
                            "opensearch-custom-codecs",
                            "opensearch-geospatial",
                            "opensearch-index-management",
                            "opensearch-job-scheduler",
                            "opensearch-knn",
                            "opensearch-ml",
                            "opensearch-neural-search",
                            "opensearch-notifications",
                            "opensearch-notifications-core",
                            "opensearch-observability",
                            "opensearch-performance-analyzer",
                            "opensearch-reports-scheduler",
                            "opensearch-security",
                            "opensearch-security-analytics",
                            "opensearch-sql"
                        ],
                        "attribute_shard_indexing_pressure_enabled": "true",
                        "source_revision": "6b1986e964d440be9137eba1413015c31c5a7752",
                        "distribution_version": "2.11.1",
                        "distribution_flavor": "oss",
                        "success": true
                    },
                    "task": "desc_sort_with_after_geonameid",
                    "operation": "desc_sort_with_after_geonameid",
                    "operation-type": "search"
                }
            },
            {
                "_index": "benchmark-metrics-2024-02",
                "_id": "___reo0BBuuZzv8c0WES",
                "_score": 1.0,
                "_source": {
                    "@timestamp": 1707164290724,
                    "relative-time-ms": 7.0460829999987595,
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "environment": "local",
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "name": "latency",
                    "value": 7.034042000000795,
                    "unit": "ms",
                    "sample-type": "warmup",
                    "meta": {
                        "plugins": [
                            "opensearch-alerting",
                            "opensearch-anomaly-detection",
                            "opensearch-asynchronous-search",
                            "opensearch-cross-cluster-replication",
                            "opensearch-custom-codecs",
                            "opensearch-geospatial",
                            "opensearch-index-management",
                            "opensearch-job-scheduler",
                            "opensearch-knn",
                            "opensearch-ml",
                            "opensearch-neural-search",
                            "opensearch-notifications",
                            "opensearch-notifications-core",
                            "opensearch-observability",
                            "opensearch-performance-analyzer",
                            "opensearch-reports-scheduler",
                            "opensearch-security",
                            "opensearch-security-analytics",
                            "opensearch-sql"
                        ],
                        "attribute_shard_indexing_pressure_enabled": "true",
                        "source_revision": "6b1986e964d440be9137eba1413015c31c5a7752",
                        "distribution_version": "2.11.1",
                        "distribution_flavor": "oss",
                        "success": true
                    },
                    "task": "asc_sort_geonameid",
                    "operation": "asc_sort_geonameid",
                    "operation-type": "search"
                }
            },
            {
                "_index": "benchmark-metrics-2024-02",
                "_id": "Af_reo0BBuuZzv8c0WIS",
                "_score": 1.0,
                "_source": {
                    "@timestamp": 1707164290724,
                    "relative-time-ms": 7.0460829999987595,
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "environment": "local",
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "name": "client_processing_time",
                    "value": 0.6644999999991796,
                    "unit": "ms",
                    "sample-type": "warmup",
                    "meta": {
                        "plugins": [
                            "opensearch-alerting",
                            "opensearch-anomaly-detection",
                            "opensearch-asynchronous-search",
                            "opensearch-cross-cluster-replication",
                            "opensearch-custom-codecs",
                            "opensearch-geospatial",
                            "opensearch-index-management",
                            "opensearch-job-scheduler",
                            "opensearch-knn",
                            "opensearch-ml",
                            "opensearch-neural-search",
                            "opensearch-notifications",
                            "opensearch-notifications-core",
                            "opensearch-observability",
                            "opensearch-performance-analyzer",
                            "opensearch-reports-scheduler",
                            "opensearch-security",
                            "opensearch-security-analytics",
                            "opensearch-sql"
                        ],
                        "attribute_shard_indexing_pressure_enabled": "true",
                        "source_revision": "6b1986e964d440be9137eba1413015c31c5a7752",
                        "distribution_version": "2.11.1",
                        "distribution_flavor": "oss",
                        "success": true
                    },
                    "task": "asc_sort_geonameid",
                    "operation": "asc_sort_geonameid",
                    "operation-type": "search"
                }
            },
            {
                "_index": "benchmark-metrics-2024-02",
                "_id": "BP_reo0BBuuZzv8c0WIS",
                "_score": 1.0,
                "_source": {
                    "@timestamp": 1707164290732,
                    "relative-time-ms": 14.581999999997208,
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "environment": "local",
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "name": "service_time",
                    "value": 1.9608329999982743,
                    "unit": "ms",
                    "sample-type": "normal",
                    "meta": {
                        "plugins": [
                            "opensearch-alerting",
                            "opensearch-anomaly-detection",
                            "opensearch-asynchronous-search",
                            "opensearch-cross-cluster-replication",
                            "opensearch-custom-codecs",
                            "opensearch-geospatial",
                            "opensearch-index-management",
                            "opensearch-job-scheduler",
                            "opensearch-knn",
                            "opensearch-ml",
                            "opensearch-neural-search",
                            "opensearch-notifications",
                            "opensearch-notifications-core",
                            "opensearch-observability",
                            "opensearch-performance-analyzer",
                            "opensearch-reports-scheduler",
                            "opensearch-security",
                            "opensearch-security-analytics",
                            "opensearch-sql"
                        ],
                        "attribute_shard_indexing_pressure_enabled": "true",
                        "source_revision": "6b1986e964d440be9137eba1413015c31c5a7752",
                        "distribution_version": "2.11.1",
                        "distribution_flavor": "oss",
                        "success": true
                    },
                    "task": "asc_sort_geonameid",
                    "operation": "asc_sort_geonameid",
                    "operation-type": "search"
                }
            },
            {
                "_index": "benchmark-metrics-2024-02",
                "_id": "_tLreo0BAWfgjZcL07ht",
                "_score": 1.0,
                "_source": {
                    "@timestamp": 1707164291333,
                    "relative-time-ms": 6.70079200000373,
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "environment": "local",
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "name": "latency",
                    "value": 6.702540999995676,
                    "unit": "ms",
                    "sample-type": "warmup",
                    "meta": {
                        "plugins": [
                            "opensearch-alerting",
                            "opensearch-anomaly-detection",
                            "opensearch-asynchronous-search",
                            "opensearch-cross-cluster-replication",
                            "opensearch-custom-codecs",
                            "opensearch-geospatial",
                            "opensearch-index-management",
                            "opensearch-job-scheduler",
                            "opensearch-knn",
                            "opensearch-ml",
                            "opensearch-neural-search",
                            "opensearch-notifications",
                            "opensearch-notifications-core",
                            "opensearch-observability",
                            "opensearch-performance-analyzer",
                            "opensearch-reports-scheduler",
                            "opensearch-security",
                            "opensearch-security-analytics",
                            "opensearch-sql"
                        ],
                        "attribute_shard_indexing_pressure_enabled": "true",
                        "source_revision": "6b1986e964d440be9137eba1413015c31c5a7752",
                        "distribution_version": "2.11.1",
                        "distribution_flavor": "oss",
                        "success": true
                    },
                    "task": "asc_sort_with_after_geonameid",
                    "operation": "asc_sort_with_after_geonameid",
                    "operation-type": "search"
                }
            }
        ]
    }
}
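
The `match_all` query above returns every metric type mixed together. To look only at the new metric, one option is to filter on the `name` field. The sketch below is illustrative: it assumes `name` and `sample-type` are indexed as exact-match fields suitable for `term` filters, the helper names are hypothetical, and it makes no network call. The sample hits are trimmed copies of two documents from the response above.

```python
import json


def build_metric_query(metric_name, sample_type="normal", size=10):
    """Build a search body that keeps only one metric name and sample type."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"name": metric_name}},
                    {"term": {"sample-type": sample_type}},
                ]
            }
        },
    }


def values_by_task(search_response, metric_name):
    """Group metric values by task from an already-fetched _search response."""
    grouped = {}
    for hit in search_response["hits"]["hits"]:
        source = hit["_source"]
        if source["name"] == metric_name:
            grouped.setdefault(source["task"], []).append(source["value"])
    return grouped


# Two hits trimmed from the response above; unused fields are omitted.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"name": "client_processing_time",
                         "value": 0.3031250000020691, "unit": "ms",
                         "sample-type": "normal", "task": "desc_sort_geonameid"}},
            {"_source": {"name": "latency",
                         "value": 12.431333000002098, "unit": "ms",
                         "sample-type": "normal", "task": "desc_sort_geonameid"}},
        ]
    }
}

query_body = build_metric_query("client_processing_time")
print(json.dumps(query_body, indent=2))
print(values_by_task(sample_response, "client_processing_time"))
```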

saimedhi (Contributor Author) commented Feb 5, 2024

benchmark-results-2024-02

https://search-benchmarks-test-6ykfxpbqxyaslahdtm2ofk7ige.us-west-2.es.amazonaws.com/benchmark-results-2024-02/_search

{
    "query": {
        "match_all": {}
    }
}

{
    "took": 6,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 220,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "benchmark-results-2024-02",
                "_id": "D9Lseo0BAWfgjZcLSbkR",
                "_score": 1.0,
                "_source": {
                    "benchmark-version": "1.3.0 (git revision: 9c716186)",
                    "benchmark-revision": "9c716186",
                    "environment": "local",
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "distribution-version": "2.11.1",
                    "distribution-flavor": "oss",
                    "user-tags": {},
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "active": true,
                    "distribution-major-version": 2,
                    "workload-revision": "6879b3d",
                    "task": "country_agg_cached",
                    "operation": "country_agg_cached",
                    "name": "client_processing_time",
                    "value": {
                        "100_0": 0.2687920033931732,
                        "mean": 0.2687920033931732,
                        "unit": "ms"
                    }
                }
            },
            {
                "_index": "benchmark-results-2024-02",
                "_id": "EtLseo0BAWfgjZcLSbkR",
                "_score": 1.0,
                "_source": {
                    "benchmark-version": "1.3.0 (git revision: 9c716186)",
                    "benchmark-revision": "9c716186",
                    "environment": "local",
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "distribution-version": "2.11.1",
                    "distribution-flavor": "oss",
                    "user-tags": {},
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "active": true,
                    "distribution-major-version": 2,
                    "workload-revision": "6879b3d",
                    "task": "painless_static",
                    "operation": "painless_static",
                    "name": "client_processing_time",
                    "value": {
                        "100_0": 0.27687498927116394,
                        "mean": 0.27687498927116394,
                        "unit": "ms"
                    }
                }
            },
            {
                "_index": "benchmark-results-2024-02",
                "_id": "HtLseo0BAWfgjZcLSbkR",
                "_score": 1.0,
                "_source": {
                    "benchmark-version": "1.3.0 (git revision: 9c716186)",
                    "benchmark-revision": "9c716186",
                    "environment": "local",
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "distribution-version": "2.11.1",
                    "distribution-flavor": "oss",
                    "user-tags": {},
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "active": true,
                    "distribution-major-version": 2,
                    "workload-revision": "6879b3d",
                    "task": "desc_sort_geonameid",
                    "operation": "desc_sort_geonameid",
                    "name": "client_processing_time",
                    "value": {
                        "100_0": 0.3031249940395355,
                        "mean": 0.3031249940395355,
                        "unit": "ms"
                    }
                }
            },
            {
                "_index": "benchmark-results-2024-02",
                "_id": "H9Lseo0BAWfgjZcLSbkR",
                "_score": 1.0,
                "_source": {
                    "benchmark-version": "1.3.0 (git revision: 9c716186)",
                    "benchmark-revision": "9c716186",
                    "environment": "local",
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "distribution-version": "2.11.1",
                    "distribution-flavor": "oss",
                    "user-tags": {},
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "active": true,
                    "distribution-major-version": 2,
                    "workload-revision": "6879b3d",
                    "task": "desc_sort_with_after_geonameid",
                    "operation": "desc_sort_with_after_geonameid",
                    "name": "client_processing_time",
                    "value": {
                        "100_0": 0.24741700291633606,
                        "mean": 0.24741700291633606,
                        "unit": "ms"
                    }
                }
            },
            {
                "_index": "benchmark-results-2024-02",
                "_id": "JNLseo0BAWfgjZcLSbkR",
                "_score": 1.0,
                "_source": {
                    "benchmark-version": "1.3.0 (git revision: 9c716186)",
                    "benchmark-revision": "9c716186",
                    "environment": "local",
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "distribution-version": "2.11.1",
                    "distribution-flavor": "oss",
                    "user-tags": {},
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "active": true,
                    "distribution-major-version": 2,
                    "workload-revision": "6879b3d",
                    "task": "index-stats",
                    "operation": "index-stats",
                    "name": "duration",
                    "value": {
                        "single": 17.28866699999898
                    }
                }
            },
            {
                "_index": "benchmark-results-2024-02",
                "_id": "JtLseo0BAWfgjZcLSbkR",
                "_score": 1.0,
                "_source": {
                    "benchmark-version": "1.3.0 (git revision: 9c716186)",
                    "benchmark-revision": "9c716186",
                    "environment": "local",
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "distribution-version": "2.11.1",
                    "distribution-flavor": "oss",
                    "user-tags": {},
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "active": true,
                    "distribution-major-version": 2,
                    "workload-revision": "6879b3d",
                    "task": "default",
                    "operation": "default",
                    "name": "duration",
                    "value": {
                        "single": 9.393625000001293
                    }
                }
            },
            {
                "_index": "benchmark-results-2024-02",
                "_id": "K9Lseo0BAWfgjZcLSbkR",
                "_score": 1.0,
                "_source": {
                    "benchmark-version": "1.3.0 (git revision: 9c716186)",
                    "benchmark-revision": "9c716186",
                    "environment": "local",
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "distribution-version": "2.11.1",
                    "distribution-flavor": "oss",
                    "user-tags": {},
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "active": true,
                    "distribution-major-version": 2,
                    "workload-revision": "6879b3d",
                    "task": "scroll",
                    "operation": "scroll",
                    "name": "duration",
                    "value": {
                        "single": 21.090707999999125
                    }
                }
            },
            {
                "_index": "benchmark-results-2024-02",
                "_id": "N9Lseo0BAWfgjZcLSbkR",
                "_score": 1.0,
                "_source": {
                    "benchmark-version": "1.3.0 (git revision: 9c716186)",
                    "benchmark-revision": "9c716186",
                    "environment": "local",
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "distribution-version": "2.11.1",
                    "distribution-flavor": "oss",
                    "user-tags": {},
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "active": true,
                    "distribution-major-version": 2,
                    "workload-revision": "6879b3d",
                    "task": "asc_sort_population",
                    "operation": "asc_sort_population",
                    "name": "duration",
                    "value": {
                        "single": 11.587667000000579
                    }
                }
            },
            {
                "_index": "benchmark-results-2024-02",
                "_id": "ONLseo0BAWfgjZcLSbkR",
                "_score": 1.0,
                "_source": {
                    "benchmark-version": "1.3.0 (git revision: 9c716186)",
                    "benchmark-revision": "9c716186",
                    "environment": "local",
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "distribution-version": "2.11.1",
                    "distribution-flavor": "oss",
                    "user-tags": {},
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "active": true,
                    "distribution-major-version": 2,
                    "workload-revision": "6879b3d",
                    "task": "asc_sort_with_after_population",
                    "operation": "asc_sort_with_after_population",
                    "name": "duration",
                    "value": {
                        "single": 9.533209000000653
                    }
                }
            },
            {
                "_index": "benchmark-results-2024-02",
                "_id": "RNLseo0BAWfgjZcLSbkR",
                "_score": 1.0,
                "_source": {
                    "benchmark-version": "1.3.0 (git revision: 9c716186)",
                    "benchmark-revision": "9c716186",
                    "environment": "local",
                    "test-execution-id": "6eec75c5-02ec-4bee-bad9-a0ab198d5c73",
                    "test-execution-timestamp": "20240205T201738Z",
                    "distribution-version": "2.11.1",
                    "distribution-flavor": "oss",
                    "user-tags": {},
                    "workload": "geonames",
                    "test_procedure": "append-no-conflicts",
                    "provision-config-instance": "defaults",
                    "active": true,
                    "distribution-major-version": 2,
                    "workload-revision": "6879b3d",
                    "task": "country_agg_uncached",
                    "operation": "country_agg_uncached",
                    "name": "error_rate",
                    "value": {
                        "single": 0.0
                    }
                }
            }
        ]
    }
}

@@ -1212,7 +1238,9 @@ def status(v):
# we're good with any count of relocating shards.
expected_relocating_shards = sys.maxsize

request_context_holder.on_client_request_start()
Member

Contributor Author

I am not completely sure, but I tested running benchmarks against the 'nyc_taxis' workload and everything seems to be working fine.

@gkamat
Collaborator

gkamat commented Feb 10, 2024

@saimedhi, @IanHoang there is an issue with this change. There is no error checking during the calls to the client, for instance, in the case below:

+            request_context_holder.on_client_request_start()
             await opensearch.indices.forcemerge(**merge_params)
+            request_context_holder.on_client_request_end()

So, if there is an exception thrown in that call, the client_request_end parameter will not be updated and there is an error caught upstream:

[ERROR] Cannot execute-test. Error in load generator [4]
        Cannot run task [index-append]: 'client_request_end'

This condition was suspected in a related comment. It might be best to revert this change until the fix goes in.
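
A minimal sketch of the failure mode described above, assuming the timings live in a plain dict and that the upstream message is the text of a `KeyError`. `FakeRequestContext` and `failing_forcemerge` are hypothetical stand-ins, not the benchmark's own classes or the opensearch-py client:

```python
import asyncio
import time


class FakeRequestContext:
    """Hypothetical stand-in for request_context_holder."""

    def __init__(self):
        self.timings = {}

    def on_client_request_start(self):
        self.timings["client_request_start"] = time.perf_counter()

    def on_client_request_end(self):
        self.timings["client_request_end"] = time.perf_counter()


async def failing_forcemerge():
    # Simulates the client call raising before it returns.
    raise TimeoutError("simulated client failure")


async def run_unprotected(ctx):
    ctx.on_client_request_start()
    await failing_forcemerge()      # raises, so the next line never runs
    ctx.on_client_request_end()


def demo():
    ctx = FakeRequestContext()
    try:
        asyncio.run(run_unprotected(ctx))
    except TimeoutError:
        pass  # assumption: the client error itself is handled somewhere upstream
    try:
        return ctx.timings["client_request_end"] - ctx.timings["client_request_start"]
    except KeyError as err:
        return f"missing timing key: {err}"


print(demo())
```

The lookup of the end timestamp fails because it was never written, which matches the shape of the reported error.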

@saimedhi
Contributor Author

@govind, instead of reverting, we can use a wrapper for all the runners. The wrapper is already in the code; we just need to add @time_func to the runners and remove the time measurement inside them.

def time_func(func):
    async def advised(*args, **kwargs):
        request_context_holder.on_client_request_start()
        response = await func(*args, **kwargs)
        request_context_holder.on_client_request_end()
        return response
    return advised

@saimedhi
Contributor Author

@IanHoang, @gkamat Please try to make this change. If not, I will work on it on Tuesday; I am on call this weekend.

@gkamat
Collaborator

gkamat commented Feb 10, 2024

@saimedhi essentially that wrapper will need to go around the _call() method for all operations, which has the downside of reducing the accuracy of the time measurement. If you are implying the opensearch.indices.forcemerge function should be wrapped, that wrapping would need to be done either in the opensearchpy package or by redefining the function name cell after it is imported in the runner.

Furthermore, even if this were done, it does not take into account the exception handling. The call may fail within the client, leading to the client_request_end() call getting skipped.

All-in-all, the best and cleanest option may be to update each of the calls with the appropriate exception handling, either with a try-except or a with (context manager).
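
One way the context-manager option could look, as a sketch only. `timed_client_request`, `FakeRequestContext`, and `forcemerge_stub` are hypothetical names introduced for illustration; the real call would be the opensearch-py client method:

```python
import asyncio
import time
from contextlib import contextmanager


class FakeRequestContext:
    """Hypothetical stand-in for request_context_holder."""

    def __init__(self):
        self.timings = {}

    def on_client_request_start(self):
        self.timings["client_request_start"] = time.perf_counter()

    def on_client_request_end(self):
        self.timings["client_request_end"] = time.perf_counter()


@contextmanager
def timed_client_request(ctx):
    # Record the start, then always record the end, even if the body raises.
    ctx.on_client_request_start()
    try:
        yield
    finally:
        ctx.on_client_request_end()


async def forcemerge_stub(fail):
    # Stand-in for the real client call.
    if fail:
        raise ConnectionError("simulated failure")
    return {"acknowledged": True}


async def run_forcemerge(ctx, fail):
    with timed_client_request(ctx):
        return await forcemerge_stub(fail)
```

The exception still propagates to the caller; the only change is that the end timestamp is set before it does, so the timed region stays as narrow as the inline version.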

Since you are busy right now, the most expeditious option for now is probably reverting. The change can be checked-in later with additional testing. Thanks.

cc: @IanHoang

@saimedhi
Contributor Author

@gkamat, I will try to raise a PR again soon. Just one confirmation: when there is an error during the calls to the client, both client request start and client request end should still be recorded, right?

@gkamat
Collaborator

gkamat commented Feb 11, 2024

Yes, client_request_start will already have been computed before the call. All possible exceptions should be caught and then client_request_end should be set subsequently. Else, there will be an error upstream due to the unset variable.
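
As a sketch of that behaviour applied to the existing wrapper, assuming a `try`/`finally` is acceptable in place of catching each exception type explicitly. `FakeRequestContext` and `flaky_call` are hypothetical stand-ins:

```python
import asyncio
import functools
import time


class FakeRequestContext:
    """Hypothetical stand-in for request_context_holder."""

    def __init__(self):
        self.timings = {}

    def on_client_request_start(self):
        self.timings["client_request_start"] = time.perf_counter()

    def on_client_request_end(self):
        self.timings["client_request_end"] = time.perf_counter()


request_context_holder = FakeRequestContext()


def time_func(func):
    @functools.wraps(func)
    async def advised(*args, **kwargs):
        request_context_holder.on_client_request_start()
        try:
            return await func(*args, **kwargs)
        finally:
            # Runs on success and on any exception, so the end is never unset.
            request_context_holder.on_client_request_end()
    return advised


@time_func
async def flaky_call(fail=False):
    # Stand-in for a client call that may raise.
    if fail:
        raise RuntimeError("simulated client error")
    return "ok"
```

This does not address the accuracy concern about wrapping a whole runner; it only shows the end timestamp being set on every path.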

@saimedhi
Contributor Author

@gkamat please take a look at PR462. I hope it addresses the above issue. Thank you.

@VijayanB
Member

VijayanB commented Feb 24, 2024

@saimedhi If I understood correctly, your PR will record the client request end if any runner raises an exception. However, for force merge, there won't be any exception if polling is used (see here). Did you test whether your fix works for the case where the force merge times out on the first call and complete becomes true on a subsequent poll?

@saimedhi
Contributor Author

request_context_holder.on_client_request_end()

Hi @VijayanB, thank you for checking out the PR! I'm not quite sure how to replicate that scenario, but your point makes sense. Maybe we should calculate request_context_holder.on_client_request_end() before setting complete = True. What do you think?

            try:
                request_context_holder.on_client_request_start()
                await opensearch.indices.forcemerge(**merge_params)
                request_context_holder.on_client_request_end()
                complete = True
            except opensearchpy.ConnectionTimeout:
                pass
            while not complete:
                await asyncio.sleep(params.get("poll-period"))
                tasks = await opensearch.tasks.list(params={"actions": "indices:admin/forcemerge"})
                if len(tasks["nodes"]) == 0:
                    request_context_holder.on_client_request_end()  # added here
                    # empty nodes response indicates no tasks
                    complete = True
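
The timeout-then-poll path can be exercised without a large index by faking the client. The sketch below is a variant of the snippet above, not the merged fix: it records the end in a single `finally` so both paths are covered, and whether the polling time should count toward the measurement is left open. `FakeTasksClient`, `forcemerge_that_times_out`, and the built-in `TimeoutError` (standing in for `opensearchpy.ConnectionTimeout`) are assumptions made for illustration:

```python
import asyncio
import time


class FakeRequestContext:
    """Hypothetical stand-in for request_context_holder."""

    def __init__(self):
        self.timings = {}

    def on_client_request_start(self):
        self.timings["client_request_start"] = time.perf_counter()

    def on_client_request_end(self):
        self.timings["client_request_end"] = time.perf_counter()


class FakeTasksClient:
    """Reports a running force merge for the first `busy_polls` polls."""

    def __init__(self, busy_polls):
        self.busy_polls = busy_polls

    async def list(self):
        if self.busy_polls > 0:
            self.busy_polls -= 1
            return {"nodes": {"node-1": {}}}
        return {"nodes": {}}  # empty nodes response indicates no tasks


async def forcemerge_that_times_out():
    raise TimeoutError("simulated ConnectionTimeout")


async def force_merge_with_polling(ctx, tasks, poll_period=0.01):
    complete = False
    polls = 0
    ctx.on_client_request_start()
    try:
        try:
            await forcemerge_that_times_out()
            complete = True
        except TimeoutError:
            pass  # fall through to polling, as in the runner
        while not complete:
            await asyncio.sleep(poll_period)
            polls += 1
            response = await tasks.list()
            if len(response["nodes"]) == 0:
                complete = True
    finally:
        # Set exactly once, whichever path finished the merge.
        ctx.on_client_request_end()
    return polls
```

For example, `asyncio.run(force_merge_with_polling(FakeRequestContext(), FakeTasksClient(busy_polls=2)))` goes through the polling loop and still leaves the end timestamp set.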


@VijayanB
Member

Looks good to me. Thanks. Vector search with high dimensions and a large number of vectors takes longer to merge segments. I was able to reproduce the scenario with 768D, 10 M vectors.
