
Modifying the AutoModelForCausalLM #48

Open
RDCordova opened this issue Apr 16, 2024 · 4 comments
Labels
good first issue Good for newcomers

Comments

@RDCordova

I just want to start by saying I love the work that has been done on this project. Here is the issue I'm having:

When the model is loaded from Hugging Face, it would be great to be able to select the parameters of AutoModelForCausalLM:
self.model = AutoModelForCausalLM.from_pretrained(self.llm)

It works great with small models like GPT-2, but when we advance to larger models (e.g. mistralai/Mistral-7B-Instruct-v0.1) the GPU quickly runs out of memory. I can generally get around this by using BitsAndBytesConfig to minimize the memory required for the LLM, but that requires passing additional arguments to AutoModelForCausalLM, e.g.:

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
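A minimal sketch of the change being requested, i.e. forwarding arbitrary keyword arguments to the model loader. This is hypothetical: `load_causal_lm` and its `loader` parameter are names I made up for illustration, not part of GReaT or transformers; `loader` stands in for `AutoModelForCausalLM.from_pretrained` so the sketch runs without downloading weights.

```python
def load_causal_lm(llm, loader, **model_kwargs):
    """Forward arbitrary keyword arguments to the model loader.

    In the real library, `loader` would be
    AutoModelForCausalLM.from_pretrained; it is injected here so the
    pattern can be demonstrated without loading actual weights.
    """
    return loader(llm, **model_kwargs)


# A fake loader that simply records what it was called with.
def fake_from_pretrained(name, **kwargs):
    return {"name": name, "kwargs": kwargs}


model = load_causal_lm(
    "mistralai/Mistral-7B-Instruct-v0.1",
    fake_from_pretrained,
    device_map="auto",
    trust_remote_code=True,
)
```

With this shape, a caller could pass `quantization_config=bnb_config` (or nothing at all) and the default behavior would be unchanged.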

@RDCordova RDCordova changed the title Modifying the Modifying the AutoModelForCausalLM Apr 16, 2024
@unnir unnir added the good first issue Good for newcomers label Apr 16, 2024
@unnir
Collaborator

unnir commented Apr 16, 2024

Thank you, we will add it to the next update.

Would it be possible to share your bnb_config for testing?

@RDCordova
Author

Thank you for the quick response. I would be happy to assist with the testing.

@RDCordova
Author

import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

@hiberfil

@unnir I am also trying to solve this issue so I can run Mistral, but even with @RDCordova's example I can't get it to run properly. Do you have a timeline for when the next version of GReaT might come out? Happy to help with testing.

Also, @RDCordova, did you modify great.py to add the bnb config, or do you have a training script that passes bnb as arguments? Do you have a modified script snippet that you can share with us?

Again thank you so much for the awesome work on both ends.
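Since the snippet quoted earlier in this thread shows the wrapper assigning `self.model` in its constructor, one workaround that avoids editing great.py is to replace `model` on the object after construction. A hedged sketch of that pattern, untested against the real library: `TinyWrapper` and `replace_model` are stand-ins I invented (assuming only the `.llm` and `.model` attributes seen above), and the lambda stands in for `AutoModelForCausalLM.from_pretrained`.

```python
class TinyWrapper:
    """Stand-in for the GReaT object, so the pattern can be shown
    without installing be_great or downloading weights."""

    def __init__(self, llm):
        self.llm = llm
        self.model = f"default-load:{llm}"  # real code calls from_pretrained here


def replace_model(wrapper, loader, **model_kwargs):
    """Swap the wrapper's model for one loaded with extra arguments.

    `loader` stands in for AutoModelForCausalLM.from_pretrained.
    """
    wrapper.model = loader(wrapper.llm, **model_kwargs)
    return wrapper


great = TinyWrapper("mistralai/Mistral-7B-Instruct-v0.1")
great = replace_model(
    great,
    lambda name, **kw: (name, kw),  # fake loader for illustration
    device_map="auto",
)
```

In real use the fake loader would be replaced by `AutoModelForCausalLM.from_pretrained` with `quantization_config=bnb_config`; with quantization, the initial full-precision load may still be the memory bottleneck, which is why constructor support (as requested above) is the cleaner fix.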
