Teraslice Elasticsearch Reader ES6 Worker Error #3962

godber · 2025-02-10T22:56:59Z

We have been doing a large re-index and have occasionally (roughly 1 in 200k slices) seen slice failures with the following error attached to the slice. This job uses ES asset version 4.0.5 with the following API config (redacted) and reads from an Elasticsearch 6.5.4 cluster. It writes to Kafka, but that doesn't seem relevant.

    "apis": [
        {
            "_name": "elasticsearch_reader_api:id",
            "connection": "es_data",
            "index": "docs-2024.11.17.3",
            "type": "doc",
            "key_type": "base64url",
            "starting_key_depth": 6,
            "id_field_name": "_key",
            "size": 300000
        }
    ],

I'm not sure ... is this a worker processing a slice experiencing a Transport fault talking to ES6?

TSError: aborted
    at Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:103:21)
    at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
    at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
    at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
    at pRetry (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:87682:17)
    at async ElasticsearchReaderAPI.fetch (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:89520:20)
    at async ElasticsearchIDFetcher.handle (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:80183:33)
    at async file:///app/source/packages/job-components/dist/src/execution-context/worker.js:75:33
    at async file:///app/source/packages/utils/dist/src/promises.js:215:19
    at async file:///app/source/packages/utils/dist/src/promises.js:215:19
    at async WorkerExecutionContext._runSliceOnce (file:///app/source/packages/job-components/dist/src/execution-context/worker.js:295:29)
    at async Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:88:16)
    at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
    at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
    at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
    at _errorHandlerFn (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:107674:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
Caused by: TSError: aborted
    at pRetry (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:87682:17)
    at async ElasticsearchReaderAPI.fetch (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:89520:20)
    at async ElasticsearchIDFetcher.handle (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:80183:33)
    at async file:///app/source/packages/job-components/dist/src/execution-context/worker.js:75:33
    at async file:///app/source/packages/utils/dist/src/promises.js:215:19
    at async file:///app/source/packages/utils/dist/src/promises.js:215:19
    at async WorkerExecutionContext._runSliceOnce (file:///app/source/packages/job-components/dist/src/execution-context/worker.js:295:29)
    at async Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:88:16)
    at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
    at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
    at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
    at _errorHandlerFn (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:107674:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
Caused by: TSError: aborted
    at _errorHandlerFn (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:107674:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
Caused by: ConnectionError: aborted
    at IncomingMessage.<anonymous> (/app/source/node_modules/elasticsearch6/lib/Transport.js:260:23)
    at IncomingMessage.emit (node:events:524:28)
    at emitErrorNT (node:internal/streams/destroy:170:8)
    at emitErrorCloseNT (node:internal/streams/destroy:129:3)
    at process.processTicksAndRejections (node:internal/process/task_queues:90:21)

The Teraslice cluster version is as follows:

{
    "arch": "x64",
    "clustering_type": "kubernetesV2",
    "name": "teraslice-cluster1",
    "node_version": "v22.13.0",
    "platform": "linux",
    "teraslice_version": "v2.12.3"
}

Edit: Changed ES cluster version from 6.5.2 to 6.5.4.


godber · 2025-02-10T23:00:52Z

I guess it's possible this should be filed on the Elasticsearch asset instead.

godber · 2025-02-10T23:24:55Z

There don't appear to be any errors in the ES data node logs that correlate with these slice errors and the clusters were all in an OK state, green, no GCs bigger/longer than usual.

godber · 2025-02-11T00:13:58Z

I've tracked down a worker that experienced one of these slice errors and it didn't have anything else to add really. Just the procedural stuff (redacted a bit and some of these lines are clipped):

[2025-02-10T22:52:53.715Z]  INFO: teraslice/7 on ts-wkr-es-store-kafka-4411806d-19a5-a9f6896654xqg4: slice d6218c85-d7b0-461b-a9a4-897930bcd4a4 completed (assignment=worker, module=worker, worker_id=hvf_iu9m, ex_id=9268a52>
[2025-02-10T22:52:53.829Z] ERROR: teraslice/7 on ts-wkr-es-store-kafka-4411806d-19a5-a9f6896654xqg4: (assignment=worker, module=worker_context, worker_id=hvf_iu9m, ex_id=9268a523-8401-4bc2-b09c-b1742b321d24, job_id=54a1806>
    A slice error occurred {
      slice: {
        slice_id: '83d83549-a4d3-4a32-bc10-17231c152f49',
        slicer_id: 2,
        slicer_order: 68,
        request: { keys: [Array], count: 4850 },
        _created: '2025-02-10T22:51:22.799Z'
      }
    }
    --
    TSError: aborted
        at pRetry (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:87682:17)
        at async ElasticsearchReaderAPI.fetch (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:89520:20)
        at async ElasticsearchIDFetcher.handle (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:80183:33)
        at async file:///app/source/packages/job-components/dist/src/execution-context/worker.js:75:33
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async WorkerExecutionContext._runSliceOnce (file:///app/source/packages/job-components/dist/src/execution-context/worker.js:295:29)
        at async Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:88:16)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
        at _errorHandlerFn (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:107674:11)
        at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
[2025-02-10T22:52:53.840Z] ERROR: teraslice/7 on ts-wkr-es-store-kafka-4411806d-19a5-a9f6896654xqg4: slice state for 9268a523-8401-4bc2-b09c-b1742b321d24 has been marked as error (assignment=worker, module=slice, worker_id>
    TSError: aborted
        at Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:103:21)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
        at pRetry (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:87682:17)
        at async ElasticsearchReaderAPI.fetch (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:89520:20)
        at async ElasticsearchIDFetcher.handle (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:80183:33)
        at async file:///app/source/packages/job-components/dist/src/execution-context/worker.js:75:33
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async WorkerExecutionContext._runSliceOnce (file:///app/source/packages/job-components/dist/src/execution-context/worker.js:295:29)
        at async Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:88:16)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
        at _errorHandlerFn (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:107674:11)
        at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
[2025-02-10T22:52:53.841Z] ERROR: teraslice/7 on ts-wkr-es-store-kafka-4411806d-19a5-a9f6896654xqg4: slice 83d83549-a4d3-4a32-bc10-17231c152f49 run error (assignment=worker, module=worker, worker_id=hvf_iu9m, ex_id=9268a52>
    TSError: Slice failed processing, caused by TSError: aborted
        at SliceExecution._markFailed (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:117:15)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:48:17)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
[2025-02-10T22:52:54.313Z]  INFO: teraslice/7 on ts-wkr-es-store-kafka-4411806d-19a5-a9f6896654xqg4: analytics for slice: slice_id: "62e5d797-3ec9-4bbe-9338-dfdcd781e67b", slicer_id: 2, slicer_order: 69, _created: "2025-02>
[2025-02-10T22:52:54.313Z]  INFO: teraslice/7 on ts-wkr-es-store-kafka-4411806d-19a5-a9f6896654xqg4: slice 62e5d797-3ec9-4bbe-9338-dfdcd781e67b completed (assignment=worker, module=worker, worker_id=hvf_iu9m, ex_id=9268a52>
[2025-02-10T22:52:55.500Z]  INFO: teraslice/7 on ts-wkr-es-store-kafka-4411806d-19a5-a9f6896654xqg4: analytics for slice: slice_id: "8b6369c5-0274-4947-b92c-5dfe9d555068", slicer_id: 2, slicer_order: 75, _created: "2025-02>

godber · 2025-02-11T00:24:59Z

It's worth pointing out that the jobs that had slice errors were reading from es 6.5.4, but we have one other job reading from es 6.8.6 that has NOT had a slice failure.

sotojn · 2025-02-11T22:02:15Z

I started tracing which lines of code were being run, following the stack trace, and will list them below to give a better idea of what's going on:

Caused by: ConnectionError: aborted
    at IncomingMessage.<anonymous> (/app/source/node_modules/elasticsearch6/lib/Transport.js:260:23)

Error above happens here:
https://github.com/elastic/elasticsearch-js/blob/098aef0a5826ee1124ac9618293612b5b2b84da4/lib/Transport.js#L251-L255

Caused by: TSError: aborted
    at _errorHandlerFn (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:107674:11)

The TSError above occurs here on line 839:

teraslice/packages/elasticsearch-api/index.js

Lines 829 to 848 in d442bfa

    
    function _errorHandler(fn, data, reject, fnName = '->unknown()') {
        const retry = _retryFn(fn, data, reject);
        return function _errorHandlerFn(err) {
            const retryable = isErrorRetryable(err);
            if (retryable) {
                retry();
            } else {
                reject(
                    new TSError(err, {
                        context: {
                            fnName,
                            connection,
                        },
                    })
                );
            }
        };
    }
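To illustrate the non-retryable branch above, here is a minimal sketch of how transport-level failures such as `aborted` and `socket hang up` could be classified as transient and retried. This is not the actual change in terascope/elasticsearch-assets#1365. `isTransientConnectionError` and `retryTransient` are hypothetical names, and the message list is an assumption based only on the two errors seen in this thread.

```javascript
// Hypothetical list: the two transport error messages that appear in this thread.
const TRANSIENT_MESSAGES = ['aborted', 'socket hang up'];

// Returns true when the error message looks like a dropped connection.
function isTransientConnectionError(err) {
    if (err == null) return false;
    const message = String(err.message ?? err).toLowerCase();
    return TRANSIENT_MESSAGES.some((text) => message.includes(text));
}

// Calls fn, retrying up to `retries` extra times, but only for transient
// connection errors. Any other error is rethrown immediately.
async function retryTransient(fn, retries = 3, delayMs = 100) {
    for (let attempt = 0; ; attempt++) {
        try {
            return await fn();
        } catch (err) {
            if (attempt >= retries || !isTransientConnectionError(err)) throw err;
            // Simple linear backoff before the next attempt.
            await new Promise((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
        }
    }
}
```

Matching on message text is crude. A real check would more likely key off error classes or codes, which is why this is only a sketch.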

    at async ElasticsearchIDFetcher.handle (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:80183:33)

Occurs in job-components here:

teraslice/packages/job-components/src/operations/fetcher.ts

Lines 16 to 19 in 7656856

    
        async handle(sliceRequest?: unknown): Promise<DataEntity[]> {
            return DataEntity.makeArray(await this.fetch(sliceRequest));
        }
    }

    at async ElasticsearchReaderAPI.fetch (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:89520:20)

Occurs in elasticsearch-asset-apis here:
https://github.com/terascope/elasticsearch-assets/blob/4bbd428e07382199a2cddaea2e3788ea0e8716ed/packages/elasticsearch-asset-apis/src/elasticsearch-reader-api/ElasticsearchReaderAPI.ts#L158-L170
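The chain traced above bottoms out in the `ConnectionError` raised by the transport. As a rough sketch of finding that innermost error programmatically, the helper below follows the standard `cause` property. Whether `TSError` exposes its wrapped error that way is an assumption here, and `rootCause` is a hypothetical helper, not part of Teraslice.

```javascript
// Follow the standard `cause` property until the innermost error is reached.
function rootCause(err) {
    let current = err;
    const seen = new Set(); // guards against a cyclic cause chain
    while (current instanceof Error && current.cause instanceof Error && !seen.has(current)) {
        seen.add(current);
        current = current.cause;
    }
    return current;
}

// Stand-ins for the three layers in the trace, built from plain Error objects.
const transport = new Error('aborted');
transport.name = 'ConnectionError';
const wrapped = new Error('aborted', { cause: transport });
const slice = new Error('Slice failed processing', { cause: wrapped });

console.log(rootCause(slice).name); // ConnectionError
```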

sotojn · 2025-02-13T00:09:44Z

I've validated that the fix in terascope/elasticsearch-assets#1365 resolves the issue above. I used Chaos Mesh to fail incoming HTTP requests to Elasticsearch 6.5.4 using elasticsearch-assets:v4.0.5 and was able to produce a similar error.

Elasticsearch used:

{
  "name" : "gliqkEy",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "m-M337BXRoSiWdNKqjTZIQ",
  "version" : {
    "number" : "6.5.4",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "d2ef93d",
    "build_date" : "2018-12-17T21:17:40.758843Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Job file used:

{
    "name": "es-to-noop",
    "lifecycle": "once",
    "workers": 1,
    "log_level": "info",
    "assets": [
        "elasticsearch:4.0.5"
    ],
    "operations": [
        {
            "_op": "elasticsearch_reader",
            "connection": "es6",
            "index": "random-data-1",
            "size": 2500,
            "date_field_name": "created"
        },
        {
            "_op": "noop"
        }
    ]
}

Worker logs using elasticsearch-assets:v4.0.5:

[2025-02-13T00:02:00.085Z]  INFO: teraslice/10 on ts-wkr-es-to-noop-713f41de-6cc7-5d88db94df-fpgpc: analytics for slice: slice_id: "e6c537d1-44f4-4734-9c9c-917f2f96b589", slicer_id: 0, slicer_order: 170, _created: "2025-02-13T00:01:27.563Z", time: [157, 0], memory: [29164088, 1568], size: [20000, 20000] (assignment=worker, module=slice, worker_id=NcrF4JHE, ex_id=18a5d575-3bf8-4378-9661-9fed1daacae1, job_id=713f41de-6cc7-43f9-a982-de3f69cc2899, slice_id=e6c537d1-44f4-4734-9c9c-917f2f96b589)
[2025-02-13T00:02:00.085Z]  INFO: teraslice/10 on ts-wkr-es-to-noop-713f41de-6cc7-5d88db94df-fpgpc: slice e6c537d1-44f4-4734-9c9c-917f2f96b589 completed (assignment=worker, module=worker, worker_id=NcrF4JHE, ex_id=18a5d575-3bf8-4378-9661-9fed1daacae1, job_id=713f41de-6cc7-43f9-a982-de3f69cc2899)
[2025-02-13T00:02:00.190Z] ERROR: teraslice/10 on ts-wkr-es-to-noop-713f41de-6cc7-5d88db94df-fpgpc: (assignment=worker, module=worker_context, worker_id=NcrF4JHE, ex_id=18a5d575-3bf8-4378-9661-9fed1daacae1, job_id=713f41de-6cc7-43f9-a982-de3f69cc2899, err.code=INTERNAL_SERVER_ERROR)
    A slice error occurred {
      slice: {
        slice_id: 'e693a559-e098-4ca3-a7bd-e1546714e371',
        slicer_id: 0,
        slicer_order: 171,
        request: {
          start: '2025-02-12T21:14:02.000Z',
          end: '2025-02-12T21:14:03.000Z',
          limit: '2025-02-12T21:15:57.000Z',
          holes: [],
          count: 20000
        },
        _created: '2025-02-13T00:01:27.563Z'
      }
    }
    --
    TSError: aborted
        at pRetry (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:87682:17)
        at async ElasticsearchReaderAPI.fetch (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:89520:20)
        at async ElasticsearchDateFetcher.handle (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:80183:33)
        at async file:///app/source/packages/job-components/dist/src/execution-context/worker.js:75:33
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async WorkerExecutionContext._runSliceOnce (file:///app/source/packages/job-components/dist/src/execution-context/worker.js:295:29)
        at async Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:88:16)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
        at _errorHandlerFn (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:107674:11)
        at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
[2025-02-13T00:02:00.195Z] ERROR: teraslice/10 on ts-wkr-es-to-noop-713f41de-6cc7-5d88db94df-fpgpc: slice state for 18a5d575-3bf8-4378-9661-9fed1daacae1 has been marked as error (assignment=worker, module=slice, worker_id=NcrF4JHE, ex_id=18a5d575-3bf8-4378-9661-9fed1daacae1, job_id=713f41de-6cc7-43f9-a982-de3f69cc2899, slice_id=e693a559-e098-4ca3-a7bd-e1546714e371, err.code=INTERNAL_SERVER_ERROR)
    TSError: aborted
        at Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:103:21)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
        at pRetry (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:87682:17)
        at async ElasticsearchReaderAPI.fetch (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:89520:20)
        at async ElasticsearchDateFetcher.handle (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:80183:33)
        at async file:///app/source/packages/job-components/dist/src/execution-context/worker.js:75:33
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async WorkerExecutionContext._runSliceOnce (file:///app/source/packages/job-components/dist/src/execution-context/worker.js:295:29)
        at async Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:88:16)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
        at _errorHandlerFn (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:107674:11)
        at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
[2025-02-13T00:02:00.195Z] ERROR: teraslice/10 on ts-wkr-es-to-noop-713f41de-6cc7-5d88db94df-fpgpc: slice e693a559-e098-4ca3-a7bd-e1546714e371 run error (assignment=worker, module=worker, worker_id=NcrF4JHE, ex_id=18a5d575-3bf8-4378-9661-9fed1daacae1, job_id=713f41de-6cc7-43f9-a982-de3f69cc2899, err.code=INTERNAL_SERVER_ERROR)
    TSError: Slice failed processing, caused by TSError: aborted
        at SliceExecution._markFailed (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:117:15)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:48:17)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
[2025-02-13T00:02:00.493Z] ERROR: teraslice/10 on ts-wkr-es-to-noop-713f41de-6cc7-5d88db94df-fpgpc: (assignment=worker, module=worker_context, worker_id=NcrF4JHE, ex_id=18a5d575-3bf8-4378-9661-9fed1daacae1, job_id=713f41de-6cc7-43f9-a982-de3f69cc2899, err.code=INTERNAL_SERVER_ERROR)
    A slice error occurred {
      slice: {
        slice_id: 'cdc2e891-e326-42ac-b1f2-56109354cd8a',
        slicer_id: 0,
        slicer_order: 172,
        request: {
          start: '2025-02-12T21:14:03.000Z',
          end: '2025-02-12T21:14:04.000Z',
          limit: '2025-02-12T21:15:57.000Z',
          holes: [],
          count: 16137
        },
        _created: '2025-02-13T00:01:27.569Z'
      }
    }
    --
    TSError: socket hang up
        at pRetry (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:87682:17)
        at async ElasticsearchReaderAPI.fetch (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:89520:20)
        at async ElasticsearchDateFetcher.handle (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:80183:33)
        at async file:///app/source/packages/job-components/dist/src/execution-context/worker.js:75:33
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async WorkerExecutionContext._runSliceOnce (file:///app/source/packages/job-components/dist/src/execution-context/worker.js:295:29)
        at async Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:88:16)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
        at _errorHandlerFn (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:107674:11)
        at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
[2025-02-13T00:02:00.496Z] ERROR: teraslice/10 on ts-wkr-es-to-noop-713f41de-6cc7-5d88db94df-fpgpc: slice state for 18a5d575-3bf8-4378-9661-9fed1daacae1 has been marked as error (assignment=worker, module=slice, worker_id=NcrF4JHE, ex_id=18a5d575-3bf8-4378-9661-9fed1daacae1, job_id=713f41de-6cc7-43f9-a982-de3f69cc2899, slice_id=cdc2e891-e326-42ac-b1f2-56109354cd8a, err.code=INTERNAL_SERVER_ERROR)
    TSError: socket hang up
        at Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:103:21)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
        at pRetry (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:87682:17)
        at async ElasticsearchReaderAPI.fetch (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:89520:20)
        at async ElasticsearchDateFetcher.handle (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:80183:33)
        at async file:///app/source/packages/job-components/dist/src/execution-context/worker.js:75:33
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async file:///app/source/packages/utils/dist/src/promises.js:215:19
        at async WorkerExecutionContext._runSliceOnce (file:///app/source/packages/job-components/dist/src/execution-context/worker.js:295:29)
        at async Module.pRetry (file:///app/source/packages/utils/dist/src/promises.js:88:16)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:38:22)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)
        at _errorHandlerFn (file:///app/assets/bd33953c2886d977354da2d9a90a4fa11015fed5/index.js:107674:11)
        at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
[2025-02-13T00:02:00.496Z] ERROR: teraslice/10 on ts-wkr-es-to-noop-713f41de-6cc7-5d88db94df-fpgpc: slice cdc2e891-e326-42ac-b1f2-56109354cd8a run error (assignment=worker, module=worker, worker_id=NcrF4JHE, ex_id=18a5d575-3bf8-4378-9661-9fed1daacae1, job_id=713f41de-6cc7-43f9-a982-de3f69cc2899, err.code=INTERNAL_SERVER_ERROR)
    TSError: Slice failed processing, caused by TSError: socket hang up
        at SliceExecution._markFailed (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:117:15)
        at async SliceExecution.run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/slice.js:48:17)
        at async Worker.runOnce (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:169:13)
        at async _run (file:///app/source/packages/teraslice/dist/src/lib/workers/worker/index.js:124:17)

Could not reproduce it on elasticsearch-assets:v4.2.1.

sotojn · 2025-02-13T00:13:53Z

For what it's worth, here is the scheduled Chaos Mesh "experiment" that I ran against it, in case we ever need to reproduce this:

kind: Schedule
apiVersion: chaos-mesh.org/v1alpha1
metadata:
  namespace: services-dev1
  name: abort-3-2
  annotations:
    experiment.chaos-mesh.org/pause: 'false'
spec:
  schedule: '*/1 * * * *'      # this is just cron syntax for "every minute do this"
  startingDeadlineSeconds: null
  concurrencyPolicy: Forbid
  historyLimit: 1
  type: HTTPChaos
  httpChaos:
    selector:
      namespaces:
        - services-dev1
      labelSelectors:
        app: elasticsearch
    mode: all
    target: Response
    abort: true
    port: 9200
    path: '*'
    duration: 500ms

godber · 2025-02-13T17:51:05Z

Resolved in https://github.com/terascope/elasticsearch-assets/releases/tag/v4.2.1

godber added bug pkg/teraslice labels Feb 10, 2025

godber assigned jsnoble Feb 10, 2025

sotojn self-assigned this Feb 11, 2025

godber changed the title ~~Teraslice Elasticsearch Reader ES6 Slicer(??) Error~~ Teraslice Elasticsearch Reader ES6 Worker Error Feb 11, 2025

godber mentioned this issue Feb 12, 2025

Es fetch fix terascope/elasticsearch-assets#1365

Merged

godber closed this as completed Feb 13, 2025
