According to the vLLM docs, you can specify tools and custom tool parsers (example below).
Why is this useful:

- Tool calling is useful in general because it augments the model with additional data.
- We can train models to perform function calling dynamically as the model is generating.
- This requires a custom tool parser; for example, you could teach the model to call an API to retrieve additional data (see the `ExampleToolParser` below).
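To make the "call an API to retrieve additional data" idea concrete, a tool handed to vLLM's chat endpoint is just an OpenAI-style function spec. The tool name and fields below are hypothetical, for illustration only:

```python
# Hypothetical tool schema (OpenAI-style function spec) of the kind
# vLLM's chat API accepts; "fetch_external_data" is an illustrative name.
get_data_tool = {
    "type": "function",
    "function": {
        "name": "fetch_external_data",
        "description": "Retrieve additional data from an external API.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "What to look up.",
                },
            },
            "required": ["query"],
        },
    },
}
```

During RL rollouts, a list of such specs would be rendered into the prompt so the model can learn when to emit a matching call.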
Problems with implementing this in veRL:

- veRL uses `inference_engine.generate()`, which does not support tool calling (it is only supported in `chat()`).
- This potentially needs support in vLLM to make it happen.
- The main challenge is that `generate` currently processes raw text prompts without interpreting structured data (e.g., function calls), whereas `chat` converts structured messages into prompts and integrates tools.
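To illustrate the gap: here is a toy sketch (not vLLM's actual implementation, which applies a model-specific chat template) of the kind of conversion `chat` performs, turning structured messages plus tool specs into the flat prompt string that a `generate`-style API consumes:

```python
import json

def render_chat_prompt(messages, tools):
    # Toy stand-in for what chat() does internally: serialize tool specs
    # and messages into one flat prompt string. Real chat templates are
    # model-specific; the <|role|> markers here are illustrative.
    parts = []
    if tools:
        parts.append("Available tools:\n" + json.dumps(tools))
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}")
    parts.append("<|assistant|>\n")
    return "\n".join(parts)

prompt = render_chat_prompt(
    [{"role": "user", "content": "What is the weather in Paris?"}],
    tools=[{"type": "function", "function": {"name": "get_weather"}}],
)
```

Teaching `generate` about tools would mean doing this rendering (and parsing tool calls back out of the output) somewhere in the veRL rollout path.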
```python
from collections.abc import Sequence
from typing import Union

from vllm.entrypoints.openai.protocol import (ChatCompletionRequest,
                                              DeltaMessage,
                                              ExtractedToolCallInformation)
from vllm.entrypoints.openai.tool_parsers import ToolParser, ToolParserManager
from vllm.transformers_utils.tokenizer import AnyTokenizer


@ToolParserManager.register_module(["example"])
class ExampleToolParser(ToolParser):

    def __init__(self, tokenizer: AnyTokenizer):
        super().__init__(tokenizer)

    # adjust request, e.g. set skip_special_tokens
    # to False for tool call output
    def adjust_request(
            self, request: ChatCompletionRequest) -> ChatCompletionRequest:
        return request

    # implement the tool call parse for streaming calls
    def extract_tool_calls_streaming(
        self,
        previous_text: str,
        current_text: str,
        delta_text: str,
        previous_token_ids: Sequence[int],
        current_token_ids: Sequence[int],
        delta_token_ids: Sequence[int],
        request: ChatCompletionRequest,
    ) -> Union[DeltaMessage, None]:
        # stub: a real parser would return a DeltaMessage when a tool
        # call is detected in the delta
        return None

    # implement the tool call parse for non-streaming calls
    def extract_tool_calls(
        self,
        model_output: str,
        request: ChatCompletionRequest,
    ) -> ExtractedToolCallInformation:
        return ExtractedToolCallInformation(tools_called=False,
                                            tool_calls=[],
                                            content=model_output)
```
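The stub above never detects a tool call. As a minimal runnable sketch of what the non-streaming parse step might look like, here is a standalone extractor assuming a hypothetical `<tool_call>{...json...}</tool_call>` markup (several chat models use a format like this; this is not vLLM's built-in logic):

```python
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

def extract_tool_calls(model_output: str) -> dict:
    # Pull JSON bodies out of <tool_call>...</tool_call> spans;
    # everything outside the spans is treated as plain content.
    calls = []
    for match in TOOL_CALL_RE.finditer(model_output):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            continue  # ignore malformed tool-call spans
    content = TOOL_CALL_RE.sub("", model_output).strip()
    return {
        "tools_called": bool(calls),
        "tool_calls": calls,
        "content": content,
    }

result = extract_tool_calls(
    'Let me check. <tool_call>{"name": "fetch_external_data", '
    '"arguments": {"query": "GDP of France"}}</tool_call>'
)
```

A real `ToolParser` implementation would wrap the same logic in `ExtractedToolCallInformation`, and the streaming variant would run it incrementally over `delta_text`.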