feat: Handle HF Remote API Call Format #751

Merged 1 commit on Dec 4, 2024

ishaansehgal99
Collaborator

Reason for Change:
The remote Hugging Face Inference API expects chat requests in a specific format:

curl 'https://api-inference.huggingface.co/models/HuggingFaceH4/zephyr-7b-beta/v1/chat/completions' \
    -H 'Authorization: Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' \
    -H 'Content-Type: application/json' \
    --data '{
        "model": "HuggingFaceH4/zephyr-7b-beta",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ],
        "max_tokens": 500,
        "stream": false
    }'

Our request-building code needs to follow this format when calling the HF remote endpoint.

Reference: https://huggingface.co/HuggingFaceH4/zephyr-7b-beta?inference_api=true
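
For illustration, the request shape above could be built in Python roughly as follows. This is a minimal sketch and not the code in this PR: `build_hf_chat_request` and `HF_BASE_URL` are hypothetical names, and only the URL pattern, headers, and payload fields are taken from the curl example. It constructs the request without sending it.

```python
import json
from urllib.request import Request

# Base URL taken from the curl example above.
HF_BASE_URL = "https://api-inference.huggingface.co/models"


def build_hf_chat_request(model: str, user_prompt: str, token: str,
                          max_tokens: int = 500, stream: bool = False) -> Request:
    """Hypothetical helper: build (but do not send) an HF remote chat request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "max_tokens": max_tokens,
        "stream": stream,
    }
    return Request(
        # The model ID appears both in the URL path and in the JSON body.
        url=f"{HF_BASE_URL}/{model}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` with a valid token would send the request; that step is left out here so the sketch runs without network access.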

With this change, we continue to support OpenAI (OAI), HF Remote, and custom URLs as LLM backends.

Fei-Guo merged commit e0f28f0 into main on Dec 4, 2024
3 of 6 checks passed
Fei-Guo deleted the Ishaan/huggingface-remote-api branch on December 4, 2024 07:34