Seeing FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas on session load #674

Closed
borolepratik opened this issue Jan 17, 2025 · 8 comments
Labels
bug Something isn't working

Comments

@borolepratik

Describe the issue:

Seeing FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas on session load. This happens for at least these sessions:

year gp session
2022 6 3
2022 5 1
2022 10 1

Reproduce the code example:

import fastf1

# Parameters
year = 2022
grand_prix = 10
session = 1

# Load session data
session = fastf1.get_session(year, grand_prix, session)
session.load()

Error message:

req         WARNING 	DEFAULT CACHE ENABLED! (24.0 KB) /home/datalore/.cache/fastf1
core           INFO 	Loading data for British Grand Prix - Practice 1 [v3.4.4]
req            INFO 	No cached data found for session_info. Loading data...
_api           INFO 	Fetching session info data...
req            INFO 	Data has been written to cache!
req            INFO 	No cached data found for driver_info. Loading data...
_api           INFO 	Fetching driver list...
req            INFO 	Data has been written to cache!
req            INFO 	No cached data found for session_status_data. Loading data...
_api           INFO 	Fetching session status data...
req            INFO 	Data has been written to cache!
req            INFO 	No cached data found for track_status_data. Loading data...
_api           INFO 	Fetching track status data...
req            INFO 	Data has been written to cache!
req            INFO 	No cached data found for _extended_timing_data. Loading data...
_api           INFO 	Fetching timing data...
_api           INFO 	Parsing timing data...
req            INFO 	Data has been written to cache!
req            INFO 	No cached data found for timing_app_data. Loading data...
_api           INFO 	Fetching timing app data...
req            INFO 	Data has been written to cache!
core           INFO 	Processing timing data...
/opt/python/envs/customPyEnv311/lib/python3.11/site-packages/fastf1/core.py:1579: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '<TimedeltaArray>
['0 days 00:22:12.333000']
Length: 1, dtype: timedelta64[ns]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.
  result.loc[mask, 'LapStartTime'] = result.loc[mask, 'PitOutTime']
/opt/python/envs/customPyEnv311/lib/python3.11/site-packages/fastf1/core.py:1579: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '<TimedeltaArray>
['0 days 00:23:38.025000']
Length: 1, dtype: timedelta64[ns]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.
  result.loc[mask, 'LapStartTime'] = result.loc[mask, 'PitOutTime']
/opt/python/envs/customPyEnv311/lib/python3.11/site-packages/fastf1/core.py:1579: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '<TimedeltaArray>
['0 days 00:21:12.607000']
Length: 1, dtype: timedelta64[ns]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.
  result.loc[mask, 'LapStartTime'] = result.loc[mask, 'PitOutTime']
/opt/python/envs/customPyEnv311/lib/python3.11/site-packages/fastf1/core.py:1579: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas. Value '<TimedeltaArray>
['0 days 00:30:23.588000']
Length: 1, dtype: timedelta64[ns]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.
  result.loc[mask, 'LapStartTime'] = result.loc[mask, 'PitOutTime']
req            INFO 	No cached data found for car_data. Loading data...
_api           INFO 	Fetching car data...
_api           INFO 	Parsing car data...
req            INFO 	Data has been written to cache!
req            INFO 	No cached data found for position_data. Loading data...
_api           INFO 	Fetching position data...
_api           INFO 	Parsing position data...
req            INFO 	Data has been written to cache!
req            INFO 	No cached data found for weather_data. Loading data...
_api           INFO 	Fetching weather data...
req            INFO 	Data has been written to cache!
req            INFO 	No cached data found for race_control_messages. Loading data...
_api           INFO 	Fetching race control messages...
req            INFO 	Data has been written to cache!
core           INFO 	Finished loading data for 20 drivers: ['1', '10', '11', '14', '16', '18', '20', '22', '23', '24', '3', '31', '4', '44', '47', '5', '55', '6', '63', '77']
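
To pin down which call emits a warning like the ones above, one option is to escalate FutureWarning to an exception so that a full traceback is produced. The sketch below uses only the standard library; `noisy_load` is a hypothetical stand-in for `session.load()` and is not part of FastF1.

```python
import warnings


def noisy_load():
    # Hypothetical stand-in for session.load(); it only emits the same
    # warning category that pandas uses in the log above.
    warnings.warn(
        "Setting an item of incompatible dtype is deprecated", FutureWarning
    )
    return "loaded"


def load_strict(load):
    """Run `load` with FutureWarning escalated to an exception."""
    with warnings.catch_warnings():
        # The filter change is local to this block and undone afterwards.
        warnings.simplefilter("error", FutureWarning)
        return load()


try:
    load_strict(noisy_load)
    outcome = "no warning"
except FutureWarning as exc:
    outcome = f"raised: {exc}"

print(outcome)
```

With a real session, passing `session.load` instead of `noisy_load` should stop at the first offending assignment, assuming the warning is raised as a FutureWarning as shown in the log.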
@theOehrly
Owner

Can you give me the output of pip list so that I can see what versions of the dependencies you have installed? Pandas hasn't had a new release in a few months, and this hasn't come up in testing. So it's most likely an issue caused by (partially) outdated dependencies.

@borolepratik
Author

pip list

aiohappyeyeballs   2.4.4
aiohttp            3.11.11
aiosignal          1.3.2
alembic            1.14.0
annotated-types    0.7.0
anyio              4.8.0
argcomplete        3.5.3
attrs              24.3.0
bidict             0.23.1
build              1.2.2.post1
cattrs             24.1.2
certifi            2024.12.14
cfgv               3.4.0
charset-normalizer 3.4.1
click              8.1.8
colorama           0.4.6
commitizen         4.1.0
contourpy          1.3.1
cycler             0.12.1
decli              0.6.2
deprecation        2.1.0
distlib            0.3.9
docutils           0.21.2
fastapi            0.115.6
fastf1             3.4.4
filelock           3.16.1
fonttools          4.55.3
frozenlist         1.5.0
gotrue             2.11.1
gunicorn           23.0.0
h11                0.14.0
h2                 4.1.0
hpack              4.0.0
httpcore           1.0.7
httpx              0.27.2
hyperframe         6.0.1
identify           2.6.5
idna               3.10
jaraco.classes     3.4.0
jaraco.context     6.0.1
jaraco.functools   4.1.0
Jinja2             3.1.5
keyring            25.6.0
kiwisolver         1.4.8
lazy_loader        0.4
Mako               1.3.8
markdown-it-py     3.0.0
MarkupSafe         3.0.2
matplotlib         3.10.0
mdurl              0.1.2
more-itertools     10.5.0
multidict          6.1.0
mypy               1.14.1
mypy-extensions    1.0.0
nh3                0.2.20
nodeenv            1.9.1
numpy              2.2.1
packaging          24.2
pandas             2.2.3
pastel             0.2.1
pillow             11.1.0
pip                24.3.1
pipdeptree         2.16.2
pkginfo            1.12.0
platformdirs       4.3.6
poethepoet         0.32.0
postgrest          0.19.1
pre_commit         4.0.1
prompt_toolkit     3.0.48
propcache          0.2.1
psutil             6.1.1
psycopg            3.2.4
psycopg-binary     3.2.4
pydantic           2.10.4
pydantic_core      2.27.2
Pygments           2.19.0
pyparsing          3.2.1
pyproject_hooks    1.2.0
python-dateutil    2.9.0.post0
python-engineio    4.11.2
python-multipart   0.0.20
python-socketio    5.12.1
pytz               2024.2
pywin32-ctypes     0.2.3
PyYAML             6.0.2
questionary        2.1.0
RapidFuzz          3.11.0
readme_renderer    44.0
realtime           2.1.0
redis              5.2.1
reflex             0.6.8
reflex-chakra      0.6.2
reflex-hosting-cli 0.1.32
requests           2.32.3
requests-cache     1.2.1
requests-toolbelt  1.0.0
rfc3986            2.0.0
rich               13.9.4
ruff               0.8.6
scipy              1.15.0
setuptools         75.7.0
shellingham        1.5.4
simple-websocket   1.1.0
six                1.17.0
sniffio            1.3.1
SQLAlchemy         2.0.36
sqlmodel           0.0.22
starlette          0.41.3
starlette-admin    0.14.1
storage3           0.11.0
StrEnum            0.4.15
supabase           2.11.0
supafunc           0.9.0
tabulate           0.9.0
termcolor          2.5.0
timple             0.1.8
tomlkit            0.13.2
twine              6.0.1
typer              0.15.1
types-requests     2.32.0.20241016
typing_extensions  4.12.2
tzdata             2024.2
url-normalize      1.4.3
urllib3            2.3.0
uvicorn            0.34.0
virtualenv         20.28.1
wcwidth            0.2.13
websockets         13.1
wheel              0.45.1
wrapt              1.17.0
wsproto            1.2.0
yarl               1.18.3

@Casper-Guo
Contributor

I reinstalled the master branch in editable mode in a virtual environment and also see the same behavior.

Package            Version      Editable project location
------------------ ------------ -------------------------
attrs              24.3.0
cattrs             24.1.2
certifi            2024.12.14
charset-normalizer 3.4.1
contourpy          1.3.1
cycler             0.12.1
fastf1             0.1.0.dev794 /home/robery/OSS/Fast-F1
fonttools          4.55.3
idna               3.10
kiwisolver         1.4.8
matplotlib         3.10.0
numpy              2.2.1
packaging          24.2
pandas             2.2.3
pillow             11.1.0
pip                24.0
platformdirs       4.3.6
pyparsing          3.2.1
python-dateutil    2.9.0.post0
pytz               2024.2
RapidFuzz          3.11.0
requests           2.32.3
requests-cache     1.2.1
scipy              1.15.1
six                1.17.0
timple             0.1.8
tzdata             2024.2
url-normalize      1.4.3
urllib3            2.3.0
websockets         13.1

FWIW, I can also reproduce this with FastF1 v3.4.4 and Pandas v2.2.2.

I am trying to downgrade Pandas and isolate when this warning was introduced. It seems like this deprecation is relevant, but it doesn't say anything about the datetime types specifically.

@theOehrly
Owner

theOehrly commented Jan 17, 2025

Did these warnings only start just now? The last Pandas release was in September. The last few FastF1 releases most likely didn't change anything relevant. So there is some potential that this is in fact caused by a different dependency being updated.

Edit: also interesting is the fact that these warnings don't seem to appear in the CI runs.

Edit 2: @borolepratik are you installing from master too? If yes, I might have messed up in 976efb3

@Casper-Guo
Contributor

My guess is this is not a bug in general but rather something to do with the specific sessions in question. I have a job that loads session data after every GP and I just searched through the logs. There is no warning of this kind either.

I see the same behavior in 3.4.4, so it is not due solely to the commit you are referencing.

@Casper-Guo
Contributor

I think I have this figured out. The following type coercion does not have the intended effect when the passed argument is another pd.Series:

result.loc[:, 'LapStartTime'] = pd.Series(
    laps_start_time, dtype='timedelta64[ns]'
)

The following minimal example replicates the condition:

import pandas as pd

df = pd.DataFrame()
df.loc[:, "time1"] = pd.Series(
    pd.Series([pd.NaT], dtype="datetime64[ns]"), dtype="timedelta64[ns]"
)
print(df.dtypes)

outputs time1 datetime64[ns]

@Casper-Guo
Contributor

laps_start_time = list(result['Time'])[:-1]
if self.name in self._RACE_LIKE_SESSIONS:
    # assumption that the first lap started when the session was
    # started can only be made for the race
    laps_start_time.insert(0, self.session_start_time)
else:
    laps_start_time.insert(0, pd.NaT)
laps_start_time = pd.Series(laps_start_time)

We lose any type guarantee on laps_start_time when we convert it to a list. Subsequently, if we are not in a race-like session, we insert pd.NaT at the start of said list. If that pd.NaT is the only element in the list, Pandas infers the datetime dtype when we convert it back to a series. See the following minimal example:

import pandas as pd

df = pd.DataFrame(
    data={"time1": [pd.Timedelta(seconds=100)]}, dtype="timedelta64[ns]"
)
print(df.dtypes)
# Dropping the last element leaves only the inserted NaT in the list.
last_time = pd.Series([pd.NaT] + list(df["time1"])[:-1])
print(last_time.dtype)

outputs

time1    timedelta64[ns]
dtype: object
datetime64[ns]
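
One possible way to sidestep the inference, sketched here as an assumption and not necessarily what the eventual fix does, is to pass the dtype explicitly when the list is turned back into a series. The variable names below are illustrative only.

```python
import pandas as pd

# Single-lap, non-race case: the list holds nothing but the inserted NaT.
only_nat = [pd.NaT]
# Multi-lap case: a real Timedelta follows the inserted NaT.
nat_and_lap = [pd.NaT, pd.Timedelta(seconds=100)]

inferred = pd.Series(only_nat)                            # dtype left to inference
explicit = pd.Series(only_nat, dtype="timedelta64[ns]")   # dtype stated up front
mixed = pd.Series(nat_and_lap)                            # inference has a Timedelta to go on

# dtype.kind is "M" for datetime64 and "m" for timedelta64.
print(inferred.dtype, inferred.dtype.kind)
print(explicit.dtype, explicit.dtype.kind)
print(mixed.dtype, mixed.dtype.kind)
```

The lone-NaT list is the only case where inference has no Timedelta to go on, which matches the datetime64[ns] result reported above; stating the dtype removes the dependence on the list contents.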

This can explain why we have not seen this warning in CI or my race workflow before. It only manifests when a driver does exactly one lap in a practice session. For example, for 2022 GP 10 session 1 the warning is raised for driver numbers 14, 23, 31, 6. And if we print(session.laps.groupby("DriverNumber").size().sort_values()), we get:

DriverNumber
14    1
23    1
31    1
6     1
63    2
3     3
20    3
1     3
47    3
11    3
10    3
24    4
4     4
22    5
5     5
18    5
16    7
55    8
44    9
77    9

as expected
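
To list only the affected drivers instead of reading them off the sorted counts, the same groupby can be filtered for a size of exactly one. This is a small sketch on made-up data; the `laps` frame is a hypothetical stand-in for `session.laps` and only carries the `DriverNumber` column.

```python
import pandas as pd

# Hypothetical stand-in for session.laps: one row per lap.
laps = pd.DataFrame(
    {"DriverNumber": ["14", "63", "63", "6", "3", "3", "3"]}
)

# Number of laps per driver, then keep drivers with exactly one lap.
lap_counts = laps.groupby("DriverNumber").size()
single_lap_drivers = sorted(lap_counts[lap_counts == 1].index)

print(single_lap_drivers)  # ['14', '6']
```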

Casper-Guo added a commit to Casper-Guo/Fast-F1 that referenced this issue Jan 17, 2025
@theOehrly theOehrly added the bug Something isn't working label Jan 18, 2025
theOehrly added a commit that referenced this issue Jan 18, 2025
See #674 for discussion

---------

Co-authored-by: theOehrly <23384863+theOehrly@users.noreply.github.com>
@theOehrly
Owner

@Casper-Guo thank you for figuring this out and @borolepratik thanks for reporting! I just merged #676 to fix this.
