LLM integration #2073
@josevalim I tested all the suggested solutions: Codeium, Copilot, OpenAI, Claude.
@vitalis for completeness, which particular categories above did you test? How were your tests done and how did you assess quality? And, of course, even if we decide to train our own model, there is no guarantee it will perform better. For code completion there are smaller models available, but how would we proceed to train Llama 2 or similar, and which data and resources would we need for that? Thanks!
Code completion: none of the solutions is helpful with Elixir. I tested Copilot before you spoke about this idea on the Twitch stream and tested all the other solutions over the past few days...
From my understanding it will be really difficult to fine-tune Llama 2 or a similar model: Llama 2 is huge, and it would require a lot of GPU resources and time, so it would be very expensive...
At the moment I don't have a solution, but I have an idea I want to research next week and I'll update you.
If you did chat-based with GPT you wouldn't be using the subscription, you'd have users plug in a pay-as-you-go API key, right? Lots and lots of people are already using GPT for way more than Livebook, so I don't think a GPT token requirement would be the limiter in traction. For the record, the chat-based workflow has been a tremendous accelerant to my Elixir app development, even without a built-in integration with Livebook.
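For instance, a bare-bones version of that workflow from inside a notebook could look roughly like this, with the user supplying their own key (the endpoint and model name follow OpenAI's public chat completions API; treat the details as illustrative rather than a proposed integration):

```elixir
# Sketch only: call OpenAI's chat completions endpoint with a user-supplied,
# pay-as-you-go API key read from the environment.
Mix.install([{:req, "~> 0.4"}])

api_key = System.fetch_env!("OPENAI_API_KEY")

response =
  Req.post!("https://api.openai.com/v1/chat/completions",
    auth: {:bearer, api_key},
    json: %{
      model: "gpt-4",
      messages: [
        %{role: "system", content: "You are an assistant helping with an Elixir notebook."},
        %{role: "user", content: "Explain what a charlist is in Elixir."}
      ]
    }
  )

# Req decodes the JSON body, so the reply can be pulled straight out of the map.
response.body["choices"] |> hd() |> get_in(["message", "content"]) |> IO.puts()
```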
@michaelterryio
@josevalim Didn't test Replit... too complex and no free plan... I'll continue the research next week.
StarCoder, WizardCoder, and the Replit one would have to be fine-tuned first. I am not expecting them to give good results out of the box.
@josevalim
I think 1 and 2 are related. You could create embeddings of all the cells in a notebook and store them in some vector table (SQLite?).
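Very roughly, something like the sketch below: `embed/1` is a stand-in for whatever embedding model ends up being used, and the "vector table" is just an in-memory list here instead of SQLite, purely to illustrate the retrieval step:

```elixir
# Hypothetical helper: index notebook cells by embedding and retrieve the most
# similar ones for a query. The embed function is a placeholder (e.g. a Bumblebee
# text-embedding serving); persistence (SQLite or otherwise) is left out.
Mix.install([{:nx, "~> 0.6"}])

defmodule CellIndex do
  # cells is a list of {cell_id, source} tuples; embed is a fun returning an Nx vector.
  def build(cells, embed) do
    Enum.map(cells, fn {id, source} -> {id, source, embed.(source)} end)
  end

  def search(index, query, embed, top_k \\ 3) do
    q = embed.(query)

    index
    |> Enum.map(fn {id, source, emb} -> {cosine(q, emb), id, source} end)
    |> Enum.sort_by(&elem(&1, 0), :desc)
    |> Enum.take(top_k)
  end

  defp cosine(a, b) do
    dot = a |> Nx.dot(b) |> Nx.to_number()
    dot / (Nx.to_number(Nx.LinAlg.norm(a)) * Nx.to_number(Nx.LinAlg.norm(b)))
  end
end
```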
Hey folks, just joining the discussion from #2222. A few thoughts from my side.
As it happens, I'm in SF this week chatting to folks at both OpenAI and Anthropic about a number of things, and I'm already planning to ask them how we could fine-tune an Elixir-only model. Let me know if there's anything specific you'd like me to ask them. Apologies for the long ramblings, I'm awake in the middle of the night and severely jetlagged 🤣
Hi @jonastemplestein, that was a great report. :)
That has also been my experience, even before we reach the LLM layer. Indexing all of the Elixir API docs and then performing an embedding search did not provide good results. Asking what a charlist is did not bring up the relevant section of a moduledoc; more often it would bring a specific function, such as
This is an amazingly cool idea, being able to "@-mention" the documentation. And given how Elixir stores docs, it should be trivial to include the documentation of something. The only problem I can think of here is guides: we can programmatically fetch a package's API docs but not its guides. Although I would assume for something as large as Ecto or Phoenix, a lot of that knowledge already exists in the neural network, because you can't pass all of its docs or guides in the context anyway.

Perhaps initially we should focus on supporting some sort of @-mention and passing the notebook itself as context. That said, I will strike the RAG idea out of the initial issue description; I believe we should focus on two separate problems: chat-based interfaces and code completion.

For the chat-based interface (and everything that falls out of it, such as explaining code, explaining exceptions, refactoring code, automatically fixing errors, etc.), we will likely need to use an external LLM, due to the required size. But I still hope we can run our own model for code completion, fine-tuned to Elixir.
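For reference, a rough sketch of how the docs could be pulled into the prompt, relying only on `Code.fetch_docs/1` and the docs chunk Elixir ships with compiled modules (the module and function in the usage line are just examples):

```elixir
# Sketch of the "@-mention" lookup: fetch the moduledoc or a specific function's
# doc from the compiled docs chunk, ready to be spliced into an LLM prompt.
defmodule DocMention do
  def module_doc(module) do
    case Code.fetch_docs(module) do
      {:docs_v1, _anno, _lang, _format, %{"en" => moduledoc}, _meta, _docs} -> moduledoc
      _ -> nil
    end
  end

  def function_doc(module, fun, arity) do
    with {:docs_v1, _, _, _, _, _, docs} <- Code.fetch_docs(module),
         {_, _, _, %{"en" => doc}, _} <-
           Enum.find(docs, fn
             {{:function, ^fun, ^arity}, _, _, _, _} -> true
             _ -> false
           end) do
      doc
    else
      _ -> nil
    end
  end
end

# e.g. "@Enum.map/2" in a prompt could expand to:
DocMention.function_doc(Enum, :map, 2) |> IO.puts()
```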
If we can fine-tune any of them on Elixir, it would be amazing, and it would definitely help us move towards a decision. Let me know if there is anything I can do to help. We can also help with collecting the data. Both companies probably already have crawlers which they could point at hex.pm, but we can also try to build a large archive with all API references from Hex.pm. The only challenge is that this would need to be refreshed/updated somewhat periodically.
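For example, enumerating packages through the public Hex.pm API could be a starting point for such an archive (this only lists package names; actually fetching, unpacking, and periodically refreshing each package's docs is the real work):

```elixir
# Minimal sketch: page through the Hex.pm packages endpoint.
Mix.install([{:req, "~> 0.4"}])

packages = Req.get!("https://hex.pm/api/packages", params: [page: 1]).body

Enum.each(packages, fn pkg -> IO.puts(pkg["name"]) end)
```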
Can I bother you for the exact details of how you implemented the RAG here?
Edit: I just want to point out that I actually don't disagree, I'm just wondering what went wrong.
Hi folks, we have a rough plan here.

Code completion (#1)

We are still exploring options for code completion. I would really love it if this were something we can run in Elixir itself, so people could run it on their machines or their own servers trivially (assuming at most 3B params). Our preferences (in order) are:
Chat, RAG and functions (#2, #3, #4)

With OpenAI's latest announcements, they have an offering (Assistants) that addresses problems 2, 3, and 4 for us, which makes it very compelling. Therefore, it is likely we will move forward by integrating with anything that exposes the OpenAI assistant API. This means:
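For the curious, driving the beta Assistants API from Elixir with Req could look roughly like the sketch below (the endpoints, beta header, and model name follow OpenAI's current documentation and may well change; none of this is committed Livebook code):

```elixir
# Sketch only: create an assistant, a thread, post a message, and start a run.
Mix.install([{:req, "~> 0.4"}])

api_key = System.fetch_env!("OPENAI_API_KEY")

base =
  Req.new(
    base_url: "https://api.openai.com/v1",
    auth: {:bearer, api_key},
    headers: [{"openai-beta", "assistants=v1"}]
  )

assistant =
  Req.post!(base,
    url: "/assistants",
    json: %{
      model: "gpt-4-1106-preview",
      name: "Livebook helper",
      instructions: "Answer questions about the attached notebook."
    }
  ).body

thread = Req.post!(base, url: "/threads", json: %{}).body

Req.post!(base,
  url: "/threads/#{thread["id"]}/messages",
  json: %{role: "user", content: "Explain the exception raised in the last cell."}
)

run =
  Req.post!(base,
    url: "/threads/#{thread["id"]}/runs",
    json: %{assistant_id: assistant["id"]}
  ).body

# The run is asynchronous; its status would be polled via GET /threads/:id/runs/:run_id.
IO.inspect(run["status"])
```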
We want to integrate LLMs as part of Livebook itself. There are at least four distinct levels at which this can happen:
1. Code completion (may or may not need a LLM) (options: Codeium, Copilot, fine-tuned Repl.it model)
2. Chat-based (which also includes selecting code and asking to document it, as well as explaining exceptions) (options: Codeium, Copilot, OpenAI, Claude)
3. Semantic search over all installed packages in the runtime (may or may not need a LLM) (options: Codeium, Copilot, OpenAI, Claude)
4. Function completion (we can allow kino/smart-cells to register functions which we hook into prompts, similar to HF Agents) (options: OpenAI)

This is a meta-issue and we are currently doing proofs of concept in different areas. There is no clear decision yet. We will most likely allow users to "bring their own LLM" for most of these categories, especially 2-3-4 (either commercial or self-hosted).
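As a concrete illustration of category 1, here is a hedged sketch of what running a small completion model locally with Bumblebee and Nx.Serving could look like. The model repo below is a placeholder that Bumblebee supports out of the box; an Elixir-tuned code model (of a supported architecture) would take its place:

```elixir
# Sketch only: serve text completions from a local model. "gpt2" is a stand-in;
# the real candidate would be a small (<= 3B params) code model fine-tuned on Elixir.
Mix.install([
  {:bumblebee, "~> 0.4"},
  {:exla, "~> 0.6"}
])

repo = {:hf, "gpt2"}

{:ok, model_info} = Bumblebee.load_model(repo)
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
{:ok, generation_config} = Bumblebee.load_generation_config(repo)

serving =
  Bumblebee.Text.generation(model_info, tokenizer, generation_config,
    compile: [batch_size: 1, sequence_length: 256],
    defn_options: [compiler: EXLA]
  )

# Code completion would feed the source preceding the cursor as the prompt.
Nx.Serving.run(serving, "defmodule MyApp.Math do\n  def add(a, b), do: ")
```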