Basic local LLM support #216

Merged
merged 8 commits into main from transformer-handler on Jan 22, 2025

Conversation

sidnarayanan
Collaborator

  • LMConfig captures LLM configuration options
  • TransformerHandler adds support for buffered LLM execution & multi-device/node distribution
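As a rough illustration (sketch only: the import path, constructor signature, and field names below are guesses, not necessarily the API this PR introduces), the two pieces are intended to fit together along these lines:

```python
# Hypothetical sketch: LMConfig and TransformerHandler are the classes added by this
# PR, but the import path, fields, and call shown here are illustrative assumptions.
from ldp.nn import LMConfig, TransformerHandler  # import path is an assumption

lm_config = LMConfig(
    model="gpt2",      # placeholder model id
    dtype="bfloat16",  # one of the dtype enum values discussed later in this review
)
handler = TransformerHandler(lm_config)  # buffered execution / device placement setup
```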

@sidnarayanan sidnarayanan requested review from kwanUm and a team January 22, 2025 01:31
@sidnarayanan sidnarayanan requested review from jamesbraza and a team January 22, 2025 17:54
except ImportError:
LocalCUDACluster = None

config.set({
Collaborator

Okay, I follow, and yeah we can punt. I guess I was thinking this may represent a bug for ML Ops teams using LDP, because they'll likely want to configure this themselves.
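For context, a minimal sketch (not this PR's code) of how an ops team could set dask scheduler options themselves via dask.config instead of relying on whatever the handler sets by default. The specific keys are illustrative:

```python
import dask

# Sketch only: operators can override settings before workers start, either
# programmatically as below or via DASK_* environment variables / dask.yaml.
dask.config.set({
    "distributed.worker.memory.target": 0.85,    # spill threshold (fraction of worker memory)
    "distributed.comm.timeouts.connect": "60s",  # tolerate slower multi-node startup
})
```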

Comment on lines +103 to +104
backward_prefetch: BackwardPrefetch | None = Field(
BackwardPrefetch.BACKWARD_PRE,
Collaborator

Do we need to allow | None here if we have a default? And if we want | None, can you document what that means in the description?

Collaborator Author

Yeah, None is a valid option to pass to FSDP (i.e., no prefetch). The link I put in the description describes this.
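A possible middle ground (sketch only, not the PR's code; the surrounding model class is hypothetical) is to keep | None but state its meaning in the field description:

```python
from pydantic import BaseModel, Field
from torch.distributed.fsdp import BackwardPrefetch


class FSDPSettings(BaseModel):  # hypothetical container, for illustration only
    backward_prefetch: BackwardPrefetch | None = Field(
        default=BackwardPrefetch.BACKWARD_PRE,
        description=(
            "Forwarded to FSDP. None disables backward prefetching entirely; "
            "see torch.distributed.fsdp.BackwardPrefetch for the available modes."
        ),
    )
```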

)
self._initialize_workers(config, parallel_mode_config)

def _initialize_workers(self, config, parallel_mode_config):
Collaborator

Suggested change
def _initialize_workers(self, config, parallel_mode_config):
def _initialize_workers(self, dask_client_config, parallel_mode_config):

Having this take a config and a parallel_mode_config makes one wonder: what is this first config? Let's use a more descriptive name for the first config.

Collaborator Author

It's not a dask_client_config though. I have added typing to avoid ambiguity.
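For illustration, the kind of annotation that removes the ambiguity without renaming the parameter (the enclosing class is a stub and the type names are placeholders, not necessarily the ones used in the PR):

```python
class TransformerHandler:  # illustrative stub; types below are placeholder names
    def _initialize_workers(
        self,
        config: "HandlerConfig",
        parallel_mode_config: "ParallelModeConfig",
    ) -> None:
        """Explicit annotations make clear which config the first argument is."""
        ...
```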

bf16 = "bfloat16"
fp16 = "float16"
fp32 = "float32"
auto = "auto"
Collaborator

Suggested change
auto = "auto"
unspecified = "unspecified"

It seems like auto is not really the right word here; it's more like unspecified.

And if I am wrong, can you add an inline code comment stating what auto means here?

Collaborator Author

It's already documented here: https://huggingface.co/docs/transformers/v4.48.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained.torch_dtype. Per above, I don't want to re-document parameters we are forwarding.
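For readers who don't follow the link: "auto" is a value transformers itself documents for torch_dtype, meaning "derive the dtype from the checkpoint" rather than "unspecified". A minimal sketch (not this PR's code) of how such a value is forwarded:

```python
from transformers import AutoModelForCausalLM

dtype = "auto"  # or "bfloat16" / "float16" / "float32"
model = AutoModelForCausalLM.from_pretrained(
    "gpt2",             # placeholder model id
    torch_dtype=dtype,  # forwarded unchanged; "auto" picks the dtype from the checkpoint
)
```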

@sidnarayanan sidnarayanan merged commit af94ca6 into main Jan 22, 2025
6 checks passed
@sidnarayanan sidnarayanan deleted the transformer-handler branch January 22, 2025 19:01