assert t2i_input_embedding.shape[1] == self.img_token_num #56
The t2i_input_embedding is the output hidden states of the LLM that are used for the later diffusion process. I just ran all the code on a new machine, and the demo works well. I'm not sure why its shape would be 1 in your case, since this is handled inside the prepare_inputs_for_generation function of the MiniGPT5 class (see MiniGPT-5/minigpt4/models/mini_gpt5.py, line 327 at commit 2121c74):
If len(special_token_index) is not zero, the first output image token has been generated. In other words, during LLM generation, new_token_ids == self.output_img_id should be True at some point, and all_img_tokens should be appended after this first image token, which makes t2i_input_embedding the same length as img_token_num.
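A minimal sketch of that appending logic, with hypothetical token ids (the actual implementation is in prepare_inputs_for_generation in mini_gpt5.py; this only illustrates the shape bookkeeping described above):

```python
import torch

img_token_num = 8                     # IMG_TOKEN_NUM from the constants file
output_img_id = 32000                 # hypothetical id of the first image token
all_img_tokens = torch.arange(output_img_id, output_img_id + img_token_num)

# Suppose the LLM has just generated the first output image token.
new_token_ids = torch.tensor([[1, 5, 9, output_img_id]])

# Locate the first output image token in the generated ids.
special_token_index = (new_token_ids == output_img_id).nonzero()

if len(special_token_index) != 0:
    # Append the remaining image tokens directly after the first one, so the
    # hidden states gathered at these positions (t2i_input_embedding) end up
    # with length img_token_num.
    col = special_token_index[0, 1]
    new_token_ids = torch.cat(
        [new_token_ids[:, : col + 1], all_img_tokens[1:].unsqueeze(0)],
        dim=1,
    )
    # The tail of the sequence now holds all img_token_num image tokens.
    assert new_token_ids.shape[1] - col == img_token_num
```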
Thank you for your reply. I would like to ask what IMG_TOKEN_NUM in Constants.py represents. I noticed it is set to 8.
I would also like to ask: since I want to use MiniGPT-5 to batch-generate data, which parameters in the model do I need to reset when starting to process a new sample? I look forward to your reply.
I have encountered the same problem. Have you resolved it?
I hope you're doing well! I’ve been working with your code, and I’ve encountered an issue when executing the following assertion:
assert t2i_input_embedding.shape[1] == self.img_token_num
The error occurs because t2i_input_embedding.shape[1] is 1, but self.img_token_num is set to 8, causing the assertion to fail. I’m not sure why the second dimension of t2i_input_embedding would be 1 when I expect it to match self.img_token_num.
Could you help clarify the intended shape of t2i_input_embedding here? Is there any specific preprocessing step or reshaping operation I might have missed that would result in this discrepancy?
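A quick check just before the assertion (a hedged debugging sketch; the variable names are assumed from mini_gpt5.py, and these lines would be pasted inside the model method right before the failing assert) should show whether the output image token was ever generated, which would explain the second dimension being 1:

```python
# Hedged debugging sketch: paste immediately before the assertion in the
# MiniGPT5 code. Variable names (t2i_input_embedding, new_token_ids,
# self.output_img_id, self.img_token_num) are assumed from mini_gpt5.py.
print("t2i_input_embedding shape:", t2i_input_embedding.shape)
print("expected img_token_num:", self.img_token_num)
print("output_img_id generated:",
      bool((new_token_ids == self.output_img_id).any()))
```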