
Parallelize sensor search #962

Draft: wants to merge 12 commits into base: main
Conversation

@Flix6x
Contributor

Flix6x commented Jan 15, 2024

Tech spike to increase chart loading speed through parallelization. I'm using n=4 parallel processes here, but that could become a config setting.

@Ahmad-Wahid could you do a speed benchmark on this branch (compared to main) for me? Some manual tests of mine (locally) suggested an asset chart with 4 sensors and 1 week of data loads about twice as fast. I'm interested to see whether you can reproduce that result on a cloud server, and how the speed-up factor would change with a larger time frame and with more sensors.
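
As a rough sketch of the approach being explored (not the PR's actual code; `search_one_sensor`, `search_sensors` and the `FLEXMEASURES_SEARCH_WORKERS` setting are made-up names for illustration), the per-sensor searches could be fanned out over a small process pool whose size comes from a config setting:

```python
# Hypothetical sketch, not the PR's implementation. Assumes each worker
# opens its own database session, since sessions cannot be shared across processes.
from concurrent.futures import ProcessPoolExecutor
from functools import partial

# Imagined config setting; the spike hard-codes n=4 for now.
FLEXMEASURES_SEARCH_WORKERS = 4


def search_one_sensor(sensor_id: int, start, end):
    """Search data for a single sensor within [start, end).

    Runs in a worker process, so it must open (and close) its own session.
    """
    ...  # open a fresh session here and run the usual single-sensor search


def search_sensors(sensor_ids: list[int], start, end) -> list:
    """Run the per-sensor searches in parallel and collect the results."""
    if len(sensor_ids) <= 1:
        # No point paying process (and session) overhead for a single sensor.
        return [search_one_sensor(s, start, end) for s in sensor_ids]
    search = partial(search_one_sensor, start=start, end=end)
    with ProcessPoolExecutor(max_workers=FLEXMEASURES_SEARCH_WORKERS) as pool:
        return list(pool.map(search, sensor_ids))
```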

Signed-off-by: F.N. Claessen <felix@seita.nl>
Signed-off-by: F.N. Claessen <felix@seita.nl>
…an asset

Signed-off-by: F.N. Claessen <felix@seita.nl>
@Ahmad-Wahid
Contributor

Sure, will do.

@Ahmad-Wahid
Contributor

The Docker image cannot be built, as it is failing in the tests. Locally, I could build it, but making it live is causing issues.

@Ahmad-Wahid
Contributor

[FLEXMEASURES][2024-01-16 14:59:19,356] ERROR: ConnectTimeout:"HTTPConnectionPool(host='*******', port=80): Max retries exceeded with url: //api/v3_0/assets?account_id=4 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fc41d9d2200>, 'Connection to **** timed out. (connect timeout=None)'))" [occurred at /usr/local/lib/python3.10/dist-packages/requests/adapters.py(send):507,URL was: /assets/]
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/urllib3/connection.py", line 203, in _new_conn
sock = connection.create_connection(
File "/usr/local/lib/python3.10/dist-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/usr/local/lib/python3.10/dist-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 790, in urlopen
response = self._make_request(
File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 496, in _make_request
conn.request(
File "/usr/local/lib/python3.10/dist-packages/urllib3/connection.py", line 395, in request
self.endheaders()
File "/usr/lib/python3.10/http/client.py", line 1278, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1038, in _send_output
self.send(msg)
File "/usr/lib/python3.10/http/client.py", line 976, in send
self.connect()
File "/usr/local/lib/python3.10/dist-packages/urllib3/connection.py", line 243, in connect
self.sock = self._new_conn()
File "/usr/local/lib/python3.10/dist-packages/urllib3/connection.py", line 212, in _new_conn
raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPConnection object at 0x7fc41d9d2200>, 'Connection to **** timed out. (connect timeout=None)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 844, in urlopen
retries = retries.increment(
File "/usr/local/lib/python3.10/dist-packages/urllib3/util/retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='****', port=80): Max retries exceeded with url: //api/v3_0/assets?account_id=4 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fc41d9d2200>, 'Connection to **** timed out. (connect timeout=None)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 1823, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 1799, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "/usr/local/lib/python3.10/dist-packages/flask_classful.py", line 303, in proxy
response = view(**request.view_args)
File "/usr/local/lib/python3.10/dist-packages/flask_classful.py", line 271, in inner
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/flask_login/utils.py", line 290, in decorated_view
return current_app.ensure_sync(func)(*args, **kwargs)
File "/app/flexmeasures/ui/crud/assets.py", line 213, in index
assets += get_assets_by_account(account.id)
File "/app/flexmeasures/ui/crud/assets.py", line 182, in get_assets_by_account
get_assets_response = InternalApi().get(
File "/app/flexmeasures/ui/crud/api_wrapper.py", line 53, in get
response = requests.get(
File "/usr/local/lib/python3.10/dist-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/requests/adapters.py", line 507, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='****', port=80): Max retries exceeded with url: //api/v3_0/assets?account_id=4 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7fc41d9d2200>, 'Connection to **** timed out. (connect timeout=None)'))

@victorgarcia98
Contributor

This looks very promising. We should continue this work. I volunteer to test this locally in the coming two weeks.

@nhoening
Contributor

nhoening commented Sep 2, 2024

@Flix6x Wouldn't this PR also improve our simulation work? (I.e., are we often querying multiple sensors at once there?)

Of course, we browse data there as well, so the tooling would benefit. And I'm guessing the analysis part we're doing there (using reporters to compute KPIs across sensors) would also get faster with this.

What I'm trying to say is that this is a neglected high-value PR, which we should put into the planning for the coming weeks.

@nhoening
Contributor

nhoening commented Sep 2, 2024

> The Docker image cannot be built, as it is failing in the tests. Locally, I could build it, but making it live is causing issues.

Maybe this is the blocking issue?

Flix6x and others added 3 commits September 16, 2024 10:35
Signed-off-by: Felix Claessen <30658763+Flix6x@users.noreply.github.com>
Signed-off-by: F.N. Claessen <felix@seita.nl>
…to 4

Signed-off-by: F.N. Claessen <felix@seita.nl>
@Flix6x
Contributor Author

Flix6x commented Sep 18, 2024

The test steps and the build step failed after 6 hours. I tried rerunning the failing steps in the pipeline on the same day, and am today retrying just one of the four test steps (for different Python versions) to see whether the issue persists.

@nhoening
Contributor

nhoening commented Jan 2, 2025

Is this issue still alive?

Flix6x added 5 commits January 4, 2025 12:20
…llel-sensor-search

# Conflicts:
#	flexmeasures/data/models/generic_assets.py
#	flexmeasures/utils/config_defaults.py
Signed-off-by: F.N. Claessen <felix@seita.nl>
Signed-off-by: F.N. Claessen <felix@seita.nl>
…ch sensor` happens in tests when the fixtures use one session that hasn't been committed yet, and the parallel search method opens a new session in each parallel process

Signed-off-by: F.N. Claessen <felix@seita.nl>
Signed-off-by: F.N. Claessen <felix@seita.nl>
@Flix6x
Contributor Author

Flix6x commented Jan 4, 2025

Update

  • The build seems to work fine now.
  • Tests were failing, though.
  • I started to fix them, but some remain broken.
  • One of the downsides of having subprocesses start their own sessions (which is the recommended practice) is that data set up by fixtures is not present in those new sessions unless the "test" session is committed before the parallel search method is used. I started dealing with that by having fixtures commit their session, which is a direction I'm unsure about (see the sketch below this list).
  • Related thought: when only 1 sensor needs to be searched, starting a new session just for that search is probably bad practice.
  • Most importantly: I tested a bit in the UI again with 4 sensors, and found that the speed benefit no longer seemed to be present. We need more thorough testing, though.
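
As a rough illustration of the fixture workaround mentioned above (not the actual test code; the `db` fixture and `make_test_sensor` helper are stand-ins), committing in the fixture makes the data visible to the fresh sessions opened by the parallel search workers:

```python
import pytest


@pytest.fixture
def committed_sensors(db):  # "db" stands in for the test database fixture
    """Set up sensors and commit them, so that the fresh sessions opened by
    the parallel search workers can actually see them."""
    sensors = [make_test_sensor(name) for name in ("power", "price")]  # hypothetical helper
    db.session.add_all(sensors)
    # A flush alone would keep the rows inside the uncommitted test transaction,
    # where other sessions cannot see them; a commit makes them visible everywhere,
    # but also means the test data has to be cleaned up explicitly afterwards.
    db.session.commit()
    return sensors
```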
