Add support for OSS models via HuggingFace endpoints #22

jgpruitt · 2024-06-10T14:00:29Z

No description provided.

Tostino · 2024-09-11T02:42:14Z

Yup, this is simply missing passing in the base url into the openai client. If it did that this extension would be compatible with open source endpoints like vLLM.

Tostino · 2024-10-06T01:25:41Z

@jgpruitt I whipped out an implementation tonight that gets the OpenAI endpoints all working with a base_url variable and setting (similar to api_key).

Tested and working with vLLM's endpoint.

One thing I think I need to do is to support sending in extra parameters to the client through a json payload (or something). This would allow users to pass in any custom settings the open source endpoints may support (e.g. custom sampler settings like Beam Search, etc). After I get that going, and some tests added, i'll open a PR.

Tostino · 2024-10-12T18:59:32Z

@jgpruitt I'd like to pick your brain about the parameters and return types for pgai if that's alright. I am trying to get the postgres wrapper to fully comply with the OpenAI python client as it is specified (other than things like streaming which can't be supported), and see some differences I wanted to get some clarification on.

e.g. why are the embedding functions returning text/table vs the consistent json based on the input parameters? Why did you not follow the json return type for the embed endpoint, but you did for the chat completions endpoint?

I'd really like to get some consistency here, and would rather upstream the changes rather than creating a fork/new project to get it...but I also understand not breaking compatibility if that is a requirement at this point.
Another note, is that for the embed endpoint, there is an encoding_format parameter which can be either base64 or float (default), and that breaks when returning a vector type from the function.
IMO, the best way to deal with this is to have two versions of the functions. One that returns raw json(b) from the call, another that calls the json(b) function and parses out the data to return something a little bit more friendly to use from sql.

Tostino · 2024-10-12T19:28:05Z

@jgpruitt Oh...just dug into some of the other branches before submitting a PR. Looks like you already handled this, as well as a few of the other things I was working on. If I wanted to migrate the non-conflicting changes to a new branch that will be used for the 0.4.0 release, which should I use?

nicoscordialo · 2024-12-17T15:09:29Z

hey @Tostino, we'd also very much appreciated using HuggingFace models here. do you have any updates on where this is at? cheers!

BW-Projects · 2024-12-17T15:59:19Z

@Tostino Same here as well - would be nice to have OpenAI compatible APIs work such as a https://github.com/michaelfeil/infinity

jgpruitt added the enhancement New feature or request label Jun 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for OSS models via HuggingFace endpoints #22

Add support for OSS models via HuggingFace endpoints #22

jgpruitt commented Jun 10, 2024

Tostino commented Sep 11, 2024

Tostino commented Oct 6, 2024 •

edited

Loading

Tostino commented Oct 12, 2024

Tostino commented Oct 12, 2024

nicoscordialo commented Dec 17, 2024

BW-Projects commented Dec 17, 2024

Add support for OSS models via HuggingFace endpoints #22

Add support for OSS models via HuggingFace endpoints #22

Comments

jgpruitt commented Jun 10, 2024

Tostino commented Sep 11, 2024

Tostino commented Oct 6, 2024 • edited Loading

Tostino commented Oct 12, 2024

Tostino commented Oct 12, 2024

nicoscordialo commented Dec 17, 2024

BW-Projects commented Dec 17, 2024

Tostino commented Oct 6, 2024 •

edited

Loading