
An ImportError when I run the program "FinGPT_Training_LoRA_with_ChatGLM2_6B_for_Beginners.ipynb" #165

Open
YRookieBoy opened this issue Mar 14, 2024 · 6 comments

YRookieBoy commented Mar 14, 2024

Hi,
When I try to run "FinGPT_Training_LoRA_with_ChatGLM2_6B_for_Beginners.ipynb" in Google Colab, I came across a problem.
The code is:

model_name = "THUDM/chatglm2-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device='cuda'
)

and the error is:

ImportError: Using `load_in_8bit=True` requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://test.pypi.org/simple/ bitsandbytes` or `pip install bitsandbytes`
model = prepare_model_for_int8_training(model, use_gradient_checkpointing=True)

Lastly, I ran the code in Google Colab Pro and I am sure both packages are installed.
Please help me solve the problem, thank you so much!

@YRookieBoy YRookieBoy changed the title Hi, A ImportError when I run the program "FinGPT_Training_LoRA_with_ChatGLM2_6B_for_Beginners.ipynb" Mar 14, 2024
@YRookieBoy YRookieBoy reopened this Mar 14, 2024
@llk010502
Collaborator

Hi, based on my experience, you can try reinstalling these two packages when this error shows up, then restart your kernel and rerun your code. Hope this works.
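For reference, a minimal reinstall sequence in Colab might look like this (versions are not pinned here; adjust as needed):

# Force-reinstall the two packages the error complains about
!pip install --upgrade --force-reinstall accelerate bitsandbytes

# Then restart the runtime (Runtime -> Restart runtime in Colab)
# and rerun the notebook cells from the top.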

@YRookieBoy
Author

Thank you very much! I have already run the code successfully.

@Siddharth-Latthe-07

The error indicates that the necessary packages for 8-bit training, specifically accelerate and bitsandbytes, are either not installed correctly or not recognized by the environment. Here's how you can troubleshoot and resolve the issue:

  1. Ensure the packages are installed correctly.
  2. Restart the runtime: sometimes, after installing new packages, you need to restart the runtime for the changes to take effect.
  3. Check that you have the correct versions and import the packages before the model definition.

Sample snippet:
# Install the necessary packages
!pip install accelerate
!pip install -i https://test.pypi.org/simple/ bitsandbytes

# Restart runtime after installing the packages (manual step in the Colab interface)

# Import the required libraries
from transformers import AutoTokenizer, AutoModel
from accelerate import Accelerator

# Ensure the runtime is using GPU
import torch
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the model with the necessary configuration
model_name = "THUDM/chatglm2-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device=device
)

# Prepare model for 8-bit training
from transformers import prepare_model_for_int8_training
model = prepare_model_for_int8_training(model, use_gradient_checkpointing=True)

Also check the GPU settings; a quick sanity check is shown below.
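For example, something along these lines (assuming a Colab GPU runtime, where nvidia-smi is available):

# Quick GPU sanity check
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device name:", torch.cuda.get_device_name(0))

# In Colab you can also inspect the GPU with:
# !nvidia-smi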

Hope this helps, let me know of any further updates.
Thanks

@tducharme-brex

Hello @Siddharth-Latthe-07, I tried the above code and got the same error:

`low_cpu_mem_usage` was None, now set to True since model is quantized.
Loading checkpoint shards: 100% 7/7 [00:08<00:00, 1.16s/it]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-21-3503f94a9a20> in <cell line: 0>()
      3 model_name = "THUDM/chatglm2-6b"
      4 tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
----> 5 model = AutoModel.from_pretrained(
      6         model_name,
      7         quantization_config=q_config,

3 frames
/usr/local/lib/python3.11/dist-packages/transformers/modeling_utils.py in to(self, *args, **kwargs)
   2772         # Checks if the model has been loaded in 8-bit
   2773         if getattr(self, "quantization_method", None) == QuantizationMethod.BITS_AND_BYTES:
-> 2774             raise ValueError(
   2775                 "`.to` is not supported for `4-bit` or `8-bit` bitsandbytes models. Please use the model as it is, since the"
   2776                 " model has already been set to the correct devices and casted to the correct `dtype`."

ValueError: `.to` is not supported for `4-bit` or `8-bit` bitsandbytes models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct `dtype`.

Any idea how to fix this? I restarted my runtime and it's still not working.

Thanks!

@Siddharth-Latthe-07

@tducharme-brex OK,
I guess the issue is that .to() is not supported for 4-bit or 8-bit bitsandbytes models. This happens when you explicitly try to move the model to a device (device='cuda') after loading it with quantization.

Try removing the explicit device=device argument in AutoModel.from_pretrained(), because BitsAndBytesConfig automatically handles device placement.

Try running this script and let me know:

# Install necessary packages
!pip install --upgrade bitsandbytes accelerate transformers

# Restart runtime after installation

# Import required libraries
from transformers import AutoTokenizer, AutoModel, BitsAndBytesConfig
import torch

# Ensure the runtime is using GPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")

# Define quantization config for int8
q_config = BitsAndBytesConfig(load_in_8bit=True)

# Load tokenizer
model_name = "THUDM/chatglm2-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Load model (DO NOT specify `device='cuda'` explicitly)
model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,  # This will automatically handle device placement
    trust_remote_code=True,
    device_map="auto"  # Use "auto" to let Hugging Face handle device allocation
)

# DO NOT manually move model to CUDA
# model.to(device)  <-- REMOVE THIS

# If you need to prepare for int8 training, use this
from transformers import prepare_model_for_int8_training
model = prepare_model_for_int8_training(model, use_gradient_checkpointing=True)

print("Model loaded successfully!")

@tducharme-brex

@Siddharth-Latthe-07 I had to change the import, but I got this block of code to work

`from transformers import prepare_model_for_int8_training` was raising an import error, and changing it to `from peft import prepare_model_for_int8_training` still threw an error.
Finally, this got it to work:

from peft import prepare_model_for_kbit_training
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=True)

Ultimately, I could only get it to work when I changed the model version from THUDM/chatglm2-6b to the newer THUDM/chatglm3-6b. Then everything ran smoothly.
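Putting the pieces that worked together, roughly (just a sketch of what ran for me; it assumes the pip installs from the earlier comments):

from transformers import AutoTokenizer, AutoModel, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# 8-bit quantization config, as in the earlier snippet
q_config = BitsAndBytesConfig(load_in_8bit=True)

# Newer checkpoint that ended up working for me
model_name = "THUDM/chatglm3-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto",  # let Hugging Face handle device placement
)

# Use the peft helper instead of the removed transformers import
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=True)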

Thanks!
