[BUG] - fix guest accelerators on GCP on develop #2425

Closed
Adam-D-Lewis opened this issue Apr 25, 2024 · 1 comment · Fixed by #2426
Labels
block-release ⛔️ Must be completed for release provider: GCP type: bug 🐛 Something isn't working

@Adam-D-Lewis
Member

Adam-D-Lewis commented Apr 25, 2024

Describe the bug

I deployed Nebari on commit 2f85ece2b00686de99d94695b55e1c7bb9dde642 (the latest commit on the develop branch at the time of writing, shortly after the 2024.3.3 release).

I got an error thrown by the following code:

[
    GCPNodeGroupInputVars(
        name=name,
        labels=node_group.labels,
        instance_type=node_group.instance,
        min_size=node_group.min_nodes,
        max_size=node_group.max_nodes,
        preemptible=node_group.preemptible,
        guest_accelerators=node_group.guest_accelerators,
    )
    for name, node_group in self.config.google_cloud_platform.node_groups.items()
]

The error is something like:

    GCPNodeGroupInputVars(
  File "/home/balast/miniconda3/envs/possee-neb/lib/python3.11/site-packages/pydantic/main.py", line 164, in __init__
    __pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
pydantic_core._pydantic_core.ValidationError: 1 validation error for GCPNodeGroupInputVars
guest_accelerators.0
  Input should be a valid dictionary or instance of GCPGuestAccelerators [type=model_type, input_value=GCPGuestAccelerator(name=...idia-tesla-t4', count=1), input_type=GCPGuestAccelerator]
    For further information visit https://errors.pydantic.dev/2.4/v/model_type

I noticed src/_nebari/stages/infrastructure/__init__.py has both a GCPGuestAccelerator class and a GCPGuestAccelerators (plural) class that are nearly identical, and we were passing the wrong one in the code above. It seems this wasn't a problem in pydantic version 1, but it is a problem in pydantic version 2.
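
A minimal sketch of two possible ways around the mismatch, assuming pydantic v2. The class names come from this issue, but the field definitions mirror the minimal reproduction rather than the exact Nebari source, and `SingleClassInputVars` is a hypothetical name used only for illustration. This is not necessarily how the linked fix resolves it.

```python
from typing import Annotated, List

from pydantic import BaseModel, Field


class GCPGuestAccelerator(BaseModel):
    # Singular class: the type of the objects actually passed in.
    name: str
    count: Annotated[int, Field(ge=1)] = 1


class GCPGuestAccelerators(BaseModel):
    # Plural, near-duplicate class: the type the input vars model expects.
    name: str
    count: int


class GCPNodeGroupInputVars(BaseModel):
    guest_accelerators: List[GCPGuestAccelerators]


accelerator = GCPGuestAccelerator(name="nvidia-tesla-t4", count=1)

# Option 1: convert explicitly. pydantic v2 accepts a plain dict for a
# model-typed field and builds a GCPGuestAccelerators instance from it.
converted = GCPNodeGroupInputVars(guest_accelerators=[accelerator.model_dump()])


# Option 2 (hypothetical refactor): annotate the field with the class that is
# actually passed in, so the duplicate class is no longer needed.
class SingleClassInputVars(BaseModel):
    guest_accelerators: List[GCPGuestAccelerator]


direct = SingleClassInputVars(guest_accelerators=[accelerator])
```

Option 1 keeps both classes and only changes the call site; option 2 removes the duplication that made the mistake possible.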

To verify this is a quirk of pydantic v1 vs. v2 behavior, I ran the following code with both pydantic 1.10.12 and 2.4.2. It runs without complaint on v1 and throws the error above on v2.

from typing import Annotated, List
from pydantic import BaseModel, Field


class Base(BaseModel):
    pass

class GCPGuestAccelerators(Base):
    name: str
    count: int


class GCPNodeGroupInputVars(Base):
    guest_accelerators: List[GCPGuestAccelerators]


class GCPGuestAccelerator(Base):
    name: str
    count: Annotated[int, Field(ge=1)] = 1

GCPNodeGroupInputVars(
    guest_accelerators=[
        GCPGuestAccelerator(name="nvidia-tesla-t4", count=1)  # wrong instance type passed in
    ]
)
print('done!')
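
The v2 failure from the reproduction above can also be inspected programmatically instead of read from the traceback. The following is a sketch assuming pydantic v2; `first_error` is a hypothetical helper, not part of Nebari or pydantic, and the singular class is simplified to a plain `int` field.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class GCPGuestAccelerators(BaseModel):
    name: str
    count: int


class GCPNodeGroupInputVars(BaseModel):
    guest_accelerators: List[GCPGuestAccelerators]


class GCPGuestAccelerator(BaseModel):
    name: str
    count: int = 1


def first_error(value):
    """Return (type, loc) of the first validation error, or None if validation passes."""
    try:
        GCPNodeGroupInputVars(guest_accelerators=[value])
    except ValidationError as exc:
        err = exc.errors()[0]
        return err["type"], err["loc"]
    return None


wrong = GCPGuestAccelerator(name="nvidia-tesla-t4", count=1)

# Instance of the wrong model class: rejected by pydantic v2.
print(first_error(wrong))

# Same data as a plain dict: accepted.
print(first_error(wrong.model_dump()))
```

On v2 the first call should report the `model_type` error at `guest_accelerators.0`, matching the traceback in this issue, while the second call passes validation.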

Expected behavior

No error

OS and architecture in which you are running Nebari

Linux x86-64

How to Reproduce the problem?

See above

Command output

No response

Versions and dependencies used.

No response

Compute environment

None

Integrations

No response

Anything else?

No response

@Adam-D-Lewis Adam-D-Lewis added type: bug 🐛 Something isn't working needs: triage 🚦 Someone needs to have a look at this issue and triage labels Apr 25, 2024
@Adam-D-Lewis Adam-D-Lewis added this to the 2024.5.1 milestone Apr 25, 2024
@viniciusdc
Contributor

nice catch!

@dcmcand dcmcand added provider: GCP block-release ⛔️ Must be completed for release and removed needs: triage 🚦 Someone needs to have a look at this issue and triage labels Apr 29, 2024
@github-project-automation github-project-automation bot moved this from New 🚦 to Done 💪🏾 in 🪴 Nebari Project Management Apr 29, 2024