
Parallelize sensor search #962

Draft: wants to merge 12 commits into base: main
Conversation

@Flix6x
Contributor

Flix6x commented Jan 15, 2024

Tech spike to increase chart loading speed through parallelization. I'm using n=4 parallel processes here, but that could become a config setting.

@Ahmad-Wahid could you do a speed benchmark on this branch (compared to main) for me? Some manual tests of mine (locally) suggested an asset chart with 4 sensors and 1 week of data loads about twice as fast. I'm interested to see whether you can reproduce that result on a cloud server, and how the speed-up factor would change with a larger time frame and with more sensors.
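
As a rough sketch of the approach being explored (not the PR's actual code; `search_one_sensor`, `search_sensors` and the `FLEXMEASURES_SEARCH_WORKERS` setting are made-up names for illustration), the per-sensor searches could be fanned out over a small process pool whose size comes from a config setting:

```python
# Hypothetical sketch, not the PR's implementation. Assumes each worker
# opens its own database session, since sessions cannot be shared across processes.
from concurrent.futures import ProcessPoolExecutor
from functools import partial

# Imagined config setting; the spike hard-codes n=4 for now.
FLEXMEASURES_SEARCH_WORKERS = 4


def search_one_sensor(sensor_id: int, start, end):
    """Search data for a single sensor within [start, end).

    Runs in a worker process, so it must open (and close) its own session.
    """
    ...  # open a fresh session here and run the usual single-sensor search


def search_sensors(sensor_ids: list[int], start, end) -> list:
    """Run the per-sensor searches in parallel and collect the results."""
    if len(sensor_ids) <= 1:
        # No point paying process (and session) overhead for a single sensor.
        return [search_one_sensor(s, start, end) for s in sensor_ids]
    search = partial(search_one_sensor, start=start, end=end)
    with ProcessPoolExecutor(max_workers=FLEXMEASURES_SEARCH_WORKERS) as pool:
        return list(pool.map(search, sensor_ids))
```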

Signed-off-by: F.N. Claessen <felix@seita.nl>
Signed-off-by: F.N. Claessen <felix@seita.nl>
…an asset

Signed-off-by: F.N. Claessen <felix@seita.nl>
@Ahmad-Wahid
Contributor

Sure, will do.

@Ahmad-Wahid
Contributor

The Docker image cannot be built, as it is failing in the tests. Locally, I could build it, but making it live is causing issues.

@Ahmad-Wahid
Contributor

[FLEXMEASURES][2024-01-16 14:59:19,356] ERROR: ConnectTimeout:"HTTPConnectionPool(host='*******', port=80): Max retries exceeded with url: //api/v3_0/assets?account_id=4 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fc41d9d2200>, 'Connection to **** timed out. (connect timeout=None)'))" [occurred at /usr/local/lib/python3.10/dist-packages/requests/adapters.py(send):507,URL was: /assets/]
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/urllib3/connection.py", line 203, in _new_conn
sock = connection.create_connection(
File "/usr/local/lib/python3.10/dist-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/usr/local/lib/python3.10/dist-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 790, in urlopen
response = self._make_request(
File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 496, in _make_request
conn.request(
File "/usr/local/lib/python3.10/dist-packages/urllib3/connection.py", line 395, in request
self.endheaders()
File "/usr/lib/python3.10/http/client.py", line 1278, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1038, in _send_output
self.send(msg)
File "/usr/lib/python3.10/http/client.py", line 976, in send
self.connect()
File "/usr/local/lib/python3.10/dist-packages/urllib3/connection.py", line 243, in connect
self.sock = self._new_conn()
File "/usr/local/lib/python3.10/dist-packages/urllib3/connection.py", line 212, in _new_conn
raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPConnection object at 0x7fc41d9d2200>, 'Connection to **** timed out. (connect timeout=None)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 844, in urlopen
retries = retries.increment(
File "/usr/local/lib/python3.10/dist-packages/urllib3/util/retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='****', port=80): Max retries exceeded with url: //api/v3_0/assets?account_id=4 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fc41d9d2200>, 'Connection to **** timed out. (connect timeout=None)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 1823, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 1799, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "/usr/local/lib/python3.10/dist-packages/flask_classful.py", line 303, in proxy
response = view(**request.view_args)
File "/usr/local/lib/python3.10/dist-packages/flask_classful.py", line 271, in inner
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/flask_login/utils.py", line 290, in decorated_view
return current_app.ensure_sync(func)(*args, **kwargs)
File "/app/flexmeasures/ui/crud/assets.py", line 213, in index
assets += get_assets_by_account(account.id)
File "/app/flexmeasures/ui/crud/assets.py", line 182, in get_assets_by_account
get_assets_response = InternalApi().get(
File "/app/flexmeasures/ui/crud/api_wrapper.py", line 53, in get
response = requests.get(
File "/usr/local/lib/python3.10/dist-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/requests/adapters.py", line 507, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='****', port=80): Max retries exceeded with url: //api/v3_0/assets?account_id=4 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fc41d9d2200>, 'Connection to **** timed out. (connect timeout=None)'))

@victorgarcia98
Contributor

This looks very promising. We should continue this work. I volunteer to test this locally in the coming two weeks.

@nhoening
Contributor

nhoening commented Sep 2, 2024

@Flix6x Wouldn't this PR also improve our simulation work? (I.e., are we often querying multiple sensors at once there?)

Of course, we browse data there as well, so the tooling would benefit. And I'm guessing the analysis part we're doing there (using reporters to compute KPIs across sensors) would also get faster with this.

What I'm trying to say is that this is a neglected high-value PR, which we should put into the planning for the coming weeks.

@nhoening
Contributor

nhoening commented Sep 2, 2024

> The Docker image cannot be built, as it is failing in the tests. Locally, I could build it, but making it live is causing issues.

Maybe this is the blocking issue?

Flix6x and others added 3 commits September 16, 2024 10:35
Signed-off-by: Felix Claessen <30658763+Flix6x@users.noreply.github.com>
Signed-off-by: F.N. Claessen <felix@seita.nl>
…to 4

Signed-off-by: F.N. Claessen <felix@seita.nl>
@Flix6x
Contributor Author

Flix6x commented Sep 18, 2024

The test steps and the build step failed after 6 hours. I tried rerunning the failing steps in the pipeline on the same day, and am today retrying just one of the four test steps (for different Python versions) to see whether the issue persists.

@nhoening
Contributor

nhoening commented Jan 2, 2025

Is this issue still alive?

Flix6x added 5 commits January 4, 2025 12:20
…llel-sensor-search

# Conflicts:
#	flexmeasures/data/models/generic_assets.py
#	flexmeasures/utils/config_defaults.py
Signed-off-by: F.N. Claessen <felix@seita.nl>
Signed-off-by: F.N. Claessen <felix@seita.nl>
…ch sensor` happens in tests when the fixtures use one session that hasn't been committed yet, and the parallel search method opens a new session in each parallel process

Signed-off-by: F.N. Claessen <felix@seita.nl>
Signed-off-by: F.N. Claessen <felix@seita.nl>
@Flix6x
Contributor Author

Flix6x commented Jan 4, 2025

Update

  • The build seems to work fine now.
  • Tests were failing, though.
  • I started to fix them, but some remain broken.
  • One of the downsides of having subprocesses start their own sessions (which is the recommended practice) is that data set up by fixtures is not present in those new sessions unless the "test" session is committed before the parallel search method is used. I started dealing with that by having fixtures commit their session, which is a direction I'm unsure about (see the sketch below this list).
  • Related thought: when only 1 sensor needs to be searched, starting a new session just for that search is probably bad practice.
  • Most importantly: I tested a bit in the UI again with 4 sensors, and found that the speed benefit no longer seemed to be present. We need more thorough testing, though.
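
As a rough illustration of the fixture workaround mentioned above (not the actual test code; the `db` fixture and `make_test_sensor` helper are stand-ins), committing in the fixture makes the data visible to the fresh sessions opened by the parallel search workers:

```python
import pytest


@pytest.fixture
def committed_sensors(db):  # "db" stands in for the test database fixture
    """Set up sensors and commit them, so that the fresh sessions opened by
    the parallel search workers can actually see them."""
    sensors = [make_test_sensor(name) for name in ("power", "price")]  # hypothetical helper
    db.session.add_all(sensors)
    # A flush alone would keep the rows inside the uncommitted test transaction,
    # where other sessions cannot see them; a commit makes them visible everywhere,
    # but also means the test data has to be cleaned up explicitly afterwards.
    db.session.commit()
    return sensors
```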
