Thanks for your hard work. I tried to conduct batch inference but encountered some errors. My code looks like:

```python
prompts = tokenizer(test_dataset, return_tensors='pt', padding=True, truncation=True)
gen_tokens = model.generate(
    **prompts,
    do_sample=False,
    max_new_tokens=30,
)
gen_text = tokenizer.batch_decode(gen_tokens, skip_special_tokens=True)
```
The error message says something about "reporting a bug to PyTorch". I think the problem roots in `hidden_states.to(torch.float32)`. I see that your evaluation code only has `inference_on_one`. Can you provide more guidance on batch inference?
Thank you for your time and consideration.
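For reference, a minimal sketch of batched greedy generation with a decoder-only model. The model name below is a stand-in tiny checkpoint, not the repo's model; the two changes from the snippet above that commonly fix batched `generate` crashes are switching the tokenizer to left padding (decoder-only models otherwise start generating from pad tokens) and moving the tokenized batch onto the model's device:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint for illustration; substitute your own model.
model_name = "sshleifer/tiny-gpt2"

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.padding_side = "left"  # critical for batched decoder-only generation
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # GPT-style tokenizers have no pad token

model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

batch = ["first prompt", "second prompt"]
prompts = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
prompts = {k: v.to(model.device) for k, v in prompts.items()}

with torch.no_grad():
    gen_tokens = model.generate(
        **prompts,
        do_sample=False,
        max_new_tokens=30,
        pad_token_id=tokenizer.pad_token_id,  # silences the missing-pad warning
    )
gen_text = tokenizer.batch_decode(gen_tokens, skip_special_tokens=True)
```

Whether this also cures the `hidden_states.to(torch.float32)` crash depends on the quantization setup, but left padding plus correct device placement is the usual first step.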
Sorry for missing some critical information. I am using QLoRA. Here is my configuration:

```python
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
```
I have 1460 test samples. Without batch inference, evaluation can take up to 46 minutes on a single RTX 3090.
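With that config, a sketch of how the 4-bit base model is typically loaded for inference. The checkpoint name and adapter path below are placeholders, not the repo's actual values:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Placeholder names: substitute your base checkpoint and saved LoRA adapter.
model = AutoModelForCausalLM.from_pretrained(
    "your-base-model",
    quantization_config=bnb_config,
    device_map="auto",  # places the quantized weights on the GPU
)
# Attach the QLoRA adapter trained on top of the quantized base:
# from peft import PeftModel
# model = PeftModel.from_pretrained(model, "path/to/lora-adapter")
model.eval()
```

Note that with `bnb_4bit_compute_dtype=torch.bfloat16`, intermediate activations stay in bf16; an explicit `hidden_states.to(torch.float32)` cast in the forward path can interact badly with batched inputs on quantized layers, which may be the source of the crash.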