Commit
feat: Handle HF Remote API Call Format (#751)
**Reason for Change**: Remote HuggingFace API uses a particular format:
```
curl 'https://api-inference.huggingface.co/models/HuggingFaceH4/zephyr-7b-beta/v1/chat/completions' \
-H 'Authorization: Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' \
-H 'Content-Type: application/json' \
--data '{
    "model": "HuggingFaceH4/zephyr-7b-beta",
    "messages": [
        { "role": "user", "content": "What is the capital of France?" }
    ],
    "max_tokens": 500,
    "stream": false
}'
```
This needs to be adhered to in the code.

Reference: https://huggingface.co/HuggingFaceH4/zephyr-7b-beta?inference_api=true

Adding this code ensures we continue to support OAI, HF Remote and Custom URL as our LLM backend.
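For illustration, the request shape from the curl example above could be assembled in Python roughly as follows. This is a minimal sketch using only the standard library, not the code from this commit: the helper name `build_hf_chat_request`, the `HF_API_BASE` constant, and the default values are assumptions made for the example.

```python
import json
import urllib.request

# Base URL taken from the curl example; the constant name is hypothetical.
HF_API_BASE = "https://api-inference.huggingface.co/models"


def build_hf_chat_request(model, prompt, token, max_tokens=500, stream=False):
    """Build (but do not send) a chat-completions request in the HF remote format.

    Hypothetical helper for illustration only.
    """
    # The model ID appears both in the URL path and in the JSON body.
    url = f"{HF_API_BASE}/{model}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "stream": stream,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to send it. Keeping the construction separate from the network call makes the payload easy to inspect and test without a real token.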