model.py support in trtllm flow #1041
Conversation
```python
# That base class would look like:
# class TrussExtension(ABC):
#     @abstractmethod
#     def model_override(self):
```
For the ABC, I feel like model override could be optional. I.e. you could have an extension that passes some args to a model without supporting overriding.
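A sketch of what that optional-override shape could look like (class and method names here are hypothetical, following the comment above, not the actual implementation):

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class TrussExtension(ABC):
    """Hypothetical base class for truss extensions."""

    @abstractmethod
    def model_args(self) -> dict:
        """Arguments injected into the user model's __init__ under the extension's name."""

    def model_override(self) -> Optional[Any]:
        """Optionally return a replacement model object for when model.py is omitted.

        Returning None (the default) means this extension does not support overriding,
        so subclasses that only pass args don't have to implement it.
        """
        return None


class ArgsOnlyExtension(TrussExtension):
    """Example: an extension that passes args without supporting overriding."""

    def model_args(self) -> dict:
        return {"engine": "engine_object"}
```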
```python
    This is used if model.py is omitted, which is allowed when using trt_llm.
    """
    return self._engine
```
Do we have a base ABC for the truss Model class? Thinking about ways we could show that the Engine class is a Model on its own.
We don't have a base ABC for model class right now. It would be really useful to have one, once we create the smaller base truss library for use in runtime.
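One way to express "the Engine class is a Model on its own" without a shared base class would be a structural protocol; a minimal sketch (names are assumptions, not existing truss types):

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ModelLike(Protocol):
    """Hypothetical structural interface for truss models."""

    def predict(self, model_input: Any) -> Any: ...


class Engine:
    """Stand-in engine; structurally a model by itself, no inheritance needed."""

    def predict(self, model_input: Any) -> Any:
        # A real engine would run inference; this just echoes for illustration.
        return {"output": model_input}
```

With `@runtime_checkable`, `isinstance(Engine(), ModelLike)` holds purely because `Engine` has a `predict` method, which is one way to show an engine satisfies the model contract.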
```python
model_init_params["secrets"] = SecretsResolver.get_secrets(self._config)
if _signature_accepts_keyword_arg(model_class_signature, "lazy_data_resolver"):
    model_init_params["lazy_data_resolver"] = LazyDataResolver(data_dir).fetch()
secrets_resolver = SecretsResolver.get_secrets(self._config)
```
Suggested change:

```diff
- secrets_resolver = SecretsResolver.get_secrets(self._config)
+ secrets = SecretsResolver.get_secrets(self._config)
```
```python
if _signature_accepts_keyword_arg(model_class_signature, "lazy_data_resolver"):
    model_init_params["lazy_data_resolver"] = LazyDataResolver(data_dir).fetch()
secrets_resolver = SecretsResolver.get_secrets(self._config)
lazy_data_resolver = LazyDataResolver(data_dir).fetch()
```
fetch() is what actually performs the resolution (and returns None), so the difference from the existing code is that this will always try to resolve bptrs, regardless of whether the model class signature accepts the resolver. That should be fine, since LazyDataResolver can handle there being no bptr manifest.
Looking at the code, I also noticed we have always been assigning model_init_params["lazy_data_resolver"] = None (when the model class signature accepts the resolver). I assume we then just use it for a presence check like `"lazy_data_resolver" in {"lazy_data_resolver": None}`.
```python
        if _signature_accepts_keyword_arg(signature, ext_name):
            model_init_params[ext_name] = ext.model_args()
    self._model = model_class(**model_init_params)
elif "trt_llm" in extensions:
```
Could add a constant for the "trt_llm" extension name.
🚀 What
Make model.py effective for the trt_llm flow.
💻 How
A concept of a truss extension is introduced, and trt_llm is modeled as an extension. Extensions are bundled with the truss under server/extensions/. Things are pretty hard-coded right now. At runtime, an extension is modeled as a directory.
ModelWrapper loads all extensions first (right now there is only one) and collects arguments to pass to the model class's init method. The init method is passed an argument named after the extension; e.g. for the trt_llm extension, a parameter named trt_llm is passed as the dictionary { "engine": engine_object }. The model can thus use the engine to make predictions.
If the model class is missing from user-provided code, the idea is to load an extension-provided model object as a replacement. Right now this is hard-coded to check only the trt_llm extension (we can change it to some other strategy if/when there is more than one extension).
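The wiring described above could be sketched roughly like this (function and class names are simplified assumptions, not the actual ModelWrapper code):

```python
import inspect


def build_model(model_class, extensions):
    """Collect each extension's args and pass them to the model's __init__
    under the extension's name, if the signature asks for that name."""
    signature = inspect.signature(model_class.__init__)
    model_init_params = {}
    for ext_name, ext in extensions.items():
        if ext_name in signature.parameters:
            model_init_params[ext_name] = ext.model_args()
    return model_class(**model_init_params)


class FakeTrtLlmExtension:
    """Stand-in extension supplying the engine argument."""

    def model_args(self):
        return {"engine": "engine_object"}


class Model:
    def __init__(self, trt_llm):
        # The model receives the extension's args under the extension's name.
        self._engine = trt_llm["engine"]

    def predict(self, model_input):
        return self._engine, model_input
```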
A check is also added: if a trt_llm section is provided in the config, then either the model class's init method must ask for a trt_llm arg in its signature, or there must be no model class at all. There are already a few existing Trusses that use trt_llm config and may contain a default model.py. After this change those Trusses will error out, prompting users to either add the parameter or remove model.py. We expect most of those users to remove model.py to keep the previous behavior (where model.py was ignored). This avoids a situation where a previously defunct model.py suddenly starts being used after this change and most likely fails.
🔬 Testing
A few unit and integration tests have been added here. I've also done some local testing using a GPU. I plan to test on the cluster next.
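The config/model.py consistency check described in the How section could be sketched roughly like this (names and error message are assumptions for illustration, not the actual implementation):

```python
import inspect


def validate_trt_llm_usage(config, model_class):
    """If the config has a trt_llm section, require either no model class
    or a model __init__ that accepts a trt_llm argument."""
    if "trt_llm" not in config:
        return
    if model_class is None:
        # model.py omitted: the extension-provided model replacement is used.
        return
    params = inspect.signature(model_class.__init__).parameters
    if "trt_llm" not in params:
        raise ValueError(
            "Config has a trt_llm section but the model class does not "
            "accept a trt_llm argument; add the parameter or remove model.py."
        )


class CompatibleModel:
    def __init__(self, trt_llm):
        self._trt_llm = trt_llm


class LegacyModel:
    def __init__(self):
        pass
```

Under this sketch, `LegacyModel` with a trt_llm config section fails fast at load time, which matches the migration behavior described above.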