
Update dependencies #14 (Closed)
wants to merge 7 commits

Conversation

MostlyKIGuess (Member)

  • security fixes

chimosky and others added 2 commits November 7, 2024 14:08
@chimosky (Member)

Did you test your changes to be sure nothing breaks with these versions?

@MostlyKIGuess (Member, Author)

> Did you test your changes to be sure nothing breaks with these versions?

While checking, I realized most of them are not needed; I only kept the necessary ones and will push a commit for that. I saw the mail and updated the dependencies; I want to fix this before this Tuesday.

- add command-line interface
- include unit test for functionality
@MostlyKIGuess (Member, Author)

Welp, the master branch wasn't working, so I made a lot of changes, almost a revamp. But shouldn't we just merge the docs of this along with https://sugar-docs-ai.streamlit.app/? I'm willing to update this as well. I just saw the mail regarding the dependency issues and thought I'd fix it.

@MostlyKIGuess (Member, Author)

The issue with current master is that ollama and httpx can't be installed simultaneously; they break each other's dependencies.

@chimosky (Member)

> The issue with current master is that ollama and httpx can't be installed simultaneously; they break each other's dependencies.

Are you referring to aiohttp and ollama or langchain-ollama?

Also @kshitijdshah99 why do we have two ollama dependencies?

@MostlyKIGuess (Member, Author)

> > The issue with current master is that ollama and httpx can't be installed simultaneously; they break each other's dependencies.
>
> Are you referring to aiohttp and ollama or langchain-ollama?
>
> Also @kshitijdshah99 why do we have two ollama dependencies?

ollama and httpx, check ollama/ollama-python#356
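The shape of the conflict, with hypothetical version pins purely for illustration (see the linked issue for the real constraint):

    # requirements.txt-style illustration; these pins are hypothetical
    ollama          # older releases pinned httpx to a narrow range
    httpx>=0.28     # a newer httpx wanted by another dependency
    # pip's resolver cannot satisfy both constraints at once, so the
    # install fails until ollama relaxes its httpx pin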

@kshitijdshah99 (Contributor)

> > The issue with current master is that ollama and httpx can't be installed simultaneously; they break each other's dependencies.
>
> Are you referring to aiohttp and ollama or langchain-ollama?
>
> Also @kshitijdshah99 why do we have two ollama dependencies?

Both these dependencies serve different purposes. langchain-ollama is a package that provides interaction between LangChain and Ollama to build LangChain workflows. On the other hand, ollama is for pulling and managing the different models, like llama3.1, Mistral, or Codellama, that we download locally on our system.
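For reference, a minimal sketch of the two usages (the model name is just an example; assumes both packages are installed and an Ollama server is running locally):

    # ollama: manage and query locally downloaded models directly;
    # it talks to the locally running Ollama server
    import ollama

    ollama.pull("llama3.1")  # fetch the model weights locally
    reply = ollama.chat(model="llama3.1",
                        messages=[{"role": "user", "content": "Hello"}])

    # langchain-ollama: wrap the same local model as a LangChain LLM
    from langchain_ollama import ChatOllama

    llm = ChatOllama(model="llama3.1")
    result = llm.invoke("Hello")  # usable inside LangChain chains/workflows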

@MostlyKIGuess (Member, Author)

> > > The issue with current master is that ollama and httpx can't be installed simultaneously; they break each other's dependencies.
> >
> > Are you referring to aiohttp and ollama or langchain-ollama?
> >
> > Also @kshitijdshah99 why do we have two ollama dependencies?
>
> Both these dependencies serve different purposes. langchain-ollama is a package that provides interaction between LangChain and Ollama to build LangChain workflows. On the other hand, ollama is for pulling and managing the different models, like llama3.1, Mistral, or Codellama, that we download locally on our system.

We should just pull from HF instead; that way we can remove ollama. What say?
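Something along these lines (a sketch only; the model name is just an example):

    # pull model weights straight from the Hugging Face Hub
    from huggingface_hub import snapshot_download
    from transformers import pipeline

    snapshot_download("bigscience/bloom-1b1")  # cached under ~/.cache/huggingface
    # pipeline() would also download on first use; shown separately for clarity
    generator = pipeline("text-generation", model="bigscience/bloom-1b1")
    print(generator("Hello", max_new_tokens=20)[0]["generated_text"])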

@kshitijdshah99 (Contributor)

I'd suggest, if @chimosky agrees, that we try solving this issue first instead of jumping straight to Hugging Face, because then we would have to adjust many code files.

@chimosky (Member)

> I'd suggest, if @chimosky agrees, that we try solving this issue first instead of jumping straight to Hugging Face, because then we would have to adjust many code files.

If we can do without a dependency, then we should.

requirements.txt (Outdated)

    langchain
    transformers==4.45.2
    langchain-ollama


Can you remove langchain-ollama, since we're not using ollama in this PR?

@harshkasat

Are there any criteria for open-source models? I'm curious, especially since new models like DeepSeek and Qwen perform well in coding and mathematics with their instruction models.

rag_agent.py (Outdated)

            print(f"An error occurred: {e}")

    if __name__ == "__main__":
        main()
Member

You should fix the missing newline at the end of the file.

rag_agent.py (Outdated)
Comment on lines 122 to 148

    def main():
        parser = argparse.ArgumentParser(description="Pippy's AI-Coding Assistant")
        parser.add_argument('--model', type=str, choices=[
            'bigscience/bloom-1b1',
            'facebook/opt-350m',
            'EleutherAI/gpt-neo-1.3B'
        ], default='bigscience/bloom-1b1', help='Model name to use for text generation')
        parser.add_argument('--docs', nargs='+', default=[
            './docs/Pygame Documentation.pdf',
            './docs/Python GTK+3 Documentation.pdf',
            './docs/Sugar Toolkit Documentation.pdf'
        ], help='List of document paths to load into the vector store')
        args = parser.parse_args()

        try:
            agent = RAG_Agent(model=args.model)
            agent.retriever = agent.setup_vectorstore(args.docs)

            while True:
                question = input("Enter your question: ").strip()
                if not question:
                    print("Please enter a valid question.")
                    continue
                response = agent.run(question)
                print("Response:", response)
        except Exception as e:
            print(f"An error occurred: {e}")
Member

Considering this is an API, we definitely wouldn't use command line args for it.
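For illustration, a minimal sketch of serving the same flow over HTTP instead of argparse, assuming FastAPI (mentioned later in this thread) and that RAG_Agent is importable from rag_agent.py; the module path, model name, and docs list are placeholders, not this PR's code:

    from fastapi import FastAPI
    from pydantic import BaseModel

    from rag_agent import RAG_Agent  # assumed import path

    app = FastAPI()
    agent = RAG_Agent(model="bigscience/bloom-1b1")
    agent.retriever = agent.setup_vectorstore(["./docs/Pygame Documentation.pdf"])

    class Question(BaseModel):
        text: str

    @app.post("/ask")
    def ask(question: Question):
        # same work as the CLI loop, but one question per request
        return {"response": agent.run(question.text)}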

@chimosky (Member)

chimosky commented Feb 4, 2025

> Are there any criteria for open-source models? I'm curious, especially since new models like DeepSeek and Qwen perform well in coding and mathematics with their instruction models.

The only thing we cared about was how the model responded based on the prompt, which was adjusted of course.

@kshitijdshah99 (Contributor)

@MostlyKIGuess I saw you are using very small models from HF, which is a little less preferable. Of course, 8B models take a lot of time for inference, so it becomes necessary to use smaller ones. Have you found some middle way to solve this problem? I tried running a Llama-8B model from HF, but it consumes a lot of RAM and disk space while running.

@kshitijdshah99 (Contributor)

That's the reason I preferred Ollama over HF; suggest something if we can prevent this. AFAIK the model was working great when integrated with the Pippy activity locally.

@kshitijdshah99 (Contributor)

@chimosky it seems these guys have fixed this interdependency issue. See the latest messages. Probably we can work with the latest httpx version if we want to.

@MostlyKIGuess (Member, Author)

> @MostlyKIGuess I saw you are using very small models from HF, which is a little less preferable. Of course, 8B models take a lot of time for inference, so it becomes necessary to use smaller ones. Have you found some middle way to solve this problem? I tried running a Llama-8B model from HF, but it consumes a lot of RAM and disk space while running.

I did that just to test the pipeline; by no means do I promote using it. And your point in favor of Ollama isn't quite fair: Ollama uses quantization to run the models, but the models themselves remain the same, and we can do the same with Hugging Face. I'm not particularly in favor of Hugging Face either; it's just convenient because we get access to more models, while with Ollama there's a dependency hole. I will work on the reviews today and try to remove more bloat.
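For example, a sketch of quantized loading on the HF side, assuming the transformers + bitsandbytes route (the model name and 4-bit setting are just examples):

    # load the same HF model 4-bit quantized, similar in spirit to
    # ollama's quantized builds (needs bitsandbytes and accelerate)
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    name = "bigscience/bloom-1b1"
    model = AutoModelForCausalLM.from_pretrained(
        name,
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(name)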

@MostlyKIGuess (Member, Author)

I also noticed:

    from langchain.embeddings import HuggingFaceEmbeddings
    Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.

@MostlyKIGuess (Member, Author)

I will be fixing that shortly today
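A sketch of what that fix could look like, assuming the transformers pipeline API and LangChain's HuggingFaceEmbeddings (the model names are just examples):

    import torch
    from transformers import pipeline
    from langchain.embeddings import HuggingFaceEmbeddings

    device = 0 if torch.cuda.is_available() else -1  # -1 means CPU for pipelines
    generator = pipeline("text-generation", model="bigscience/bloom-1b1",
                         device=device)

    # the embedding wrapper takes the device via model_kwargs instead
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2",
        model_kwargs={"device": "cuda" if torch.cuda.is_available() else "cpu"},
    )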

@kshitijdshah99 (Contributor)

> > @MostlyKIGuess I saw you are using very small models from HF, which is a little less preferable. Of course, 8B models take a lot of time for inference, so it becomes necessary to use smaller ones. Have you found some middle way to solve this problem? I tried running a Llama-8B model from HF, but it consumes a lot of RAM and disk space while running.
>
> I did that just to test the pipeline; by no means do I promote using it. And your point in favor of Ollama isn't quite fair: Ollama uses quantization to run the models, but the models themselves remain the same, and we can do the same with Hugging Face. I'm not particularly in favor of Hugging Face either; it's just convenient because we get access to more models, while with Ollama there's a dependency hole. I will work on the reviews today and try to remove more bloat.

Bro, I know, and I'm convinced of your purpose in opening this PR. The prime reason for supporting Ollama is that I have integrated it with FastAPI to the UI, and it's also faster for inference; that's why I'm not preferring HF. The primary dependency conflict is not there anymore, hence I don't feel the need to migrate the code. That's my point. Ollama, like HF, has many open-source models (including DeepSeek), so that's not going to be an issue in the future.

@kshitijdshah99 (Contributor)

kshitijdshah99 commented Feb 6, 2025

> I also noticed:
>
>     from langchain.embeddings import HuggingFaceEmbeddings
>     Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.

I opted for CPU because I didn't have a GPU in my system; yes, we can improve it.

@harshkasat

I'd suggest using Colab if you want to leverage a GPU, or set up SSH or a tunnel to use a GPU in VS Code.

@harshkasat

Are there any TODO fixes? I also want to work on this.

@MostlyKIGuess (Member, Author)

> I'd suggest using Colab if you want to leverage a GPU, or set up SSH or a tunnel to use a GPU in VS Code.

I actually have personal resources to test it out.

@MostlyKIGuess (Member, Author)

> Are there any TODO fixes? I also want to work on this.

Maybe review the changes? I'll push in like 5-10 minutes; that would be a great help.

@MostlyKIGuess (Member, Author)

Getting a 9.96 GB image now with full GPU support!

@harshkasat

with model?

@MostlyKIGuess (Member, Author)

> with model?

No no, this is just the generated container image; the model would add space depending on which one we use. I have commented out two and kept the lowest-parameter one because I wanted to just test the pipeline.
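For illustration only, a hypothetical Dockerfile of that shape (not the Dockerfile in this PR): a CUDA base image accounts for most of the size, dependencies are baked in, and model weights are pulled at runtime rather than into the image:

    # hypothetical sketch, not this PR's Dockerfile
    FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04
    RUN apt-get update && apt-get install -y python3 python3-pip
    WORKDIR /app
    COPY requirements.txt .
    RUN pip3 install -r requirements.txt
    COPY . .
    # model weights download on first run, so they add to runtime
    # disk usage but not to the image size
    CMD ["python3", "rag_agent.py"]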


@harshkasat

I also found some low-level design issues:

  1. Single Responsibility Principle: since RAG_Agent does a lot of tasks (embedding, storing, generating), I think we can create a base class and follow abstraction.
  2. Dependency Injection violation: agent.retriever = agent.setup_vectorstore(args.docs) modifies the retriever attribute after the object is created (a sketch follows below).

There might be other issues, but these are the ones I found.
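On point 2, a minimal sketch of constructor injection; the signature and the free setup_vectorstore function are hypothetical, not this PR's code:

    # hypothetical sketch of constructor injection
    class RAG_Agent:
        def __init__(self, model, retriever):
            # the retriever arrives fully built; nothing mutates the
            # object from outside after construction
            self.model = model
            self.retriever = retriever

    agent = RAG_Agent(model=args.model,
                      retriever=setup_vectorstore(args.docs))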

@MostlyKIGuess (Member, Author)

> I also found some low-level design issues:
>
> 1. Single Responsibility Principle: since RAG_Agent does a lot of tasks (embedding, storing, generating), I think we can create a base class and follow abstraction.
>
> 2. Dependency Injection violation: `agent.retriever = agent.setup_vectorstore(args.docs)` modifies the retriever attribute after the object is created.
>
> There might be other issues, but these are the ones I found.

For this, I haven't actually worked much on this project; I did make another similar project, called sugar docs ai, in my repositories. At least as of now, the immediate action was to lower the size and fix the dependency loophole; perhaps we can create a ticket and start working on these too?

@chimosky (Member)

chimosky commented Feb 6, 2025

I haven't reviewed your changes yet. Please write better commit messages; the ones I'm seeing above aren't helpful.

This PR was originally for updating dependencies, and it's become more than that. How about you open a new PR with the changes that have nothing to do with updating dependencies, and write better commit messages.

@MostlyKIGuess (Member, Author)

> I haven't reviewed your changes yet. Please write better commit messages; the ones I'm seeing above aren't helpful.
>
> This PR was originally for updating dependencies, and it's become more than that. How about you open a new PR with the changes that have nothing to do with updating dependencies, and write better commit messages.

Alright, will close this then!
