
[BUG] Still not working, there is an error (TypeError: unsupported operand type(s) for //: 'NoneType' and 'int') when running python playground.py #35

Open
lckj2009 opened this issue Nov 30, 2023 · 33 comments


@lckj2009

lckj2009 commented Nov 30, 2023

Hello, there is an error (TypeError: unsupported operand type(s) for //: 'NoneType' and 'int') when running python playground.py.

Operating system: Ubuntu 20.04

Python 3.9.18

Other parameters: same as MiniGPT-5/requirements.txt

All three ckpt files are located in MiniGPT-5/config, and the configuration files have all been changed accordingly. The weights used are Vicuna-7b-v1.1. However, the following error still occurs.

Running "python playground.py --stage1_weight /root/MiniGPT-5/config/stage1_cc3m.ckpt --test_weight /root/MiniGPT-5/config/stage2_vist.ckpt" produces the following error:

Seed set to 42
Loading VIT
Traceback (most recent call last):
  File "/root/MiniGPT-5/examples/playground.py", line 40, in <module>
    minigpt5 = MiniGPT5_Model.load_from_checkpoint(stage1_ckpt, strict=False, map_location="cpu", encoder_model_config=model_args, **vars(training_args))
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/module.py", line 1552, in load_from_checkpoint
    loaded = _load_from_checkpoint(
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/saving.py", line 89, in _load_from_checkpoint
    model = _load_state(cls, checkpoint, strict=strict, **kwargs)
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/saving.py", line 156, in _load_state
    obj = cls(**_cls_kwargs)
  File "/root/MiniGPT-5/model.py", line 68, in __init__
    self.model = MiniGPT5.from_config(minigpt4_config.model_cfg)
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt4.py", line 247, in from_config
    model = cls(
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt5.py", line 46, in __init__
    super().__init__(*args, **kwargs)
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt4.py", line 53, in __init__
    self.visual_encoder, self.ln_vision = self.init_vision_encoder(
  File "/root/MiniGPT-5/minigpt4/models/blip2.py", line 65, in init_vision_encoder
    visual_encoder = create_eva_vit_g(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 416, in create_eva_vit_g
    model = VisionTransformer(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 259, in __init__
    self.patch_embed = PatchEmbed(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 190, in __init__
    num_patches = (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'
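For clarity, the failing operation reduces to the following minimal repro (illustrative only; the 14 stands in for the ViT patch size and is just an assumption here, any int reproduces it): when the image size reaching PatchEmbed is None, both tuple elements are None and the floor division raises exactly this TypeError.

# Illustrative reproduction of the failure mode; running it raises the same error on purpose.
img_size, patch_size = (None, None), (14, 14)
num_patches = (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])  # TypeError: unsupported operand type(s) for //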

@lckj2009
Author

This is my pip list:

accelerate 0.24.1
aiofiles 23.2.1
aiohttp 3.8.4
aiosignal 1.3.1
altair 5.2.0
annotated-types 0.6.0
antlr4-python3-runtime 4.9.3
anyio 3.7.1
appdirs 1.4.4
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
asttokens 2.4.1
async-lru 2.0.4
async-timeout 4.0.2
attrs 22.2.0
Babel 2.13.1
backoff 2.2.1
beautifulsoup4 4.12.2
bleach 6.1.0
blessed 1.20.0
blis 0.7.11
boto3 1.33.2
botocore 1.33.2
braceexpand 0.1.7
catalogue 2.0.10
cchardet 2.1.7
certifi 2023.11.17
cffi 1.16.0
chardet 5.1.0
charset-normalizer 3.3.2
click 8.1.7
cmake 3.27.7
comm 0.2.0
confection 0.1.4
contourpy 1.0.7
croniter 1.4.1
cycler 0.11.0
cymem 2.0.8
dateutils 0.6.12
debugpy 1.8.0
decorator 5.1.1
decord 0.6.0
deepdiff 6.7.1
defusedxml 0.7.1
diffusers 0.23.1
docker-pycreds 0.4.0
exceptiongroup 1.2.0
executing 2.0.1
fastapi 0.104.1
fastjsonschema 2.19.0
ffmpy 0.3.1
filelock 3.9.0
fonttools 4.38.0
fqdn 1.5.1
frozenlist 1.3.3
fsspec 2023.10.0
ftfy 6.1.3
gitdb 4.0.11
GitPython 3.1.40
gradio 3.50.0
gradio_client 0.6.1
h11 0.14.0
httpcore 1.0.2
httpx 0.25.2
huggingface-hub 0.19.4
idna 3.6
importlib-metadata 6.8.0
importlib-resources 5.12.0
inquirer 3.1.4
iopath 0.1.10
ipykernel 6.27.1
ipython 8.18.1
isoduration 20.11.0
itsdangerous 2.1.2
jedi 0.19.1
Jinja2 3.1.2
jmespath 1.0.1
joblib 1.3.2
json5 0.9.14
jsonpointer 2.4
jsonschema 4.20.0
jsonschema-specifications 2023.11.1
jupyter_client 8.6.0
jupyter_core 5.5.0
jupyter-events 0.9.0
jupyter-lsp 2.2.1
jupyter_server 2.11.1
jupyter_server_terminals 0.4.4
jupyterlab 4.0.9
jupyterlab_pygments 0.3.0
jupyterlab_server 2.25.2
kiwisolver 1.4.4
langcodes 3.3.0
lightning 2.0.9.post0
lightning-cloud 0.5.55
lightning-utilities 0.10.0
linkify-it-py 2.0.2
lit 17.0.6
llvmlite 0.41.1
markdown-it-py 2.2.0
MarkupSafe 2.1.3
matplotlib 3.7.0
matplotlib-inline 0.1.6
mdit-py-plugins 0.3.3
mdurl 0.1.2
mistune 3.0.2
mpmath 1.3.0
multidict 6.0.4
murmurhash 1.0.10
nbclient 0.9.0
nbconvert 7.11.0
nbformat 5.9.2
nest-asyncio 1.5.8
networkx 3.2.1
nltk 3.8.1
notebook 7.0.6
notebook_shim 0.2.3
numba 0.58.1
numpy 1.26.2
nvidia-cublas-cu11 11.10.3.66
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu11 8.5.0.96
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu11 10.9.0.58
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu11 10.2.10.91
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu11 11.7.4.91
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu11 2.14.3
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.3.101
nvidia-nvtx-cu11 11.7.91
nvidia-nvtx-cu12 12.1.105
omegaconf 2.3.0
open-clip-torch 2.23.0
opencv-python 4.8.1.78
ordered-set 4.1.0
orjson 3.9.10
overrides 7.4.0
packaging 23.0
pandas 2.1.3
pandocfilters 1.5.0
parso 0.8.3
pathy 0.10.3
peft 0.6.2
pexpect 4.9.0
Pillow 10.1.0
pip 23.3.1
platformdirs 4.0.0
portalocker 2.8.2
preshed 3.0.9
prometheus-client 0.19.0
prompt-toolkit 3.0.41
protobuf 4.25.1
psutil 5.9.4
ptyprocess 0.7.0
pure-eval 0.2.2
pycocoevalcap 1.2
pycocotools 2.0.6
pycparser 2.21
pydantic 1.10.13
pydantic_core 2.4.0
pydub 0.25.1
Pygments 2.17.2
PyJWT 2.8.0
pynndescent 0.5.11
pyparsing 3.0.9
python-dateutil 2.8.2
python-editor 1.0.4
python-json-logger 2.0.7
python-multipart 0.0.6
pytorch-fid 0.3.0
pytorch-lightning 2.1.2
pytz 2023.3.post1
PyYAML 6.0
pyzmq 25.1.1
readchar 4.0.5
referencing 0.31.0
regex 2022.10.31
requests 2.31.0
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rich 13.7.0
rouge 1.0.1
rpds-py 0.13.1
s3transfer 0.8.1
safetensors 0.4.1
scikit-learn 1.3.2
scipy 1.11.4
semantic-version 2.10.0
Send2Trash 1.8.2
sentence-transformers 2.2.2
sentencepiece 0.1.99
sentry-sdk 1.37.1
setproctitle 1.3.3
setuptools 68.0.0
six 1.16.0
smart-open 6.4.0
smmap 5.0.1
sniffio 1.3.0
soupsieve 2.5
spacy 3.5.1
spacy-legacy 3.0.12
spacy-loggers 1.0.5
srsly 2.4.8
stack-data 0.6.3
starlette 0.27.0
starsessions 1.3.0
sympy 1.12
tenacity 8.2.2
terminado 0.18.0
thinc 8.1.12
threadpoolctl 3.2.0
timm 0.6.13
tinycss2 1.2.1
tokenizers 0.13.3
tomli 2.0.1
toolz 0.12.0
torch 2.0.1
torch-fidelity 0.3.0
torchmetrics 1.2.0
torchvision 0.15.2
tornado 6.3.3
tqdm 4.64.1
traitlets 5.14.0
transformers 4.31.0
triton 2.0.0
typer 0.7.0
types-python-dateutil 2.8.19.14
typing_extensions 4.8.0
tzdata 2023.3
uc-micro-py 1.0.2
umap-learn 0.5.5
uri-template 1.3.0
urllib3 1.26.18
uvicorn 0.24.0.post1
wandb 0.16.0
wasabi 1.1.2
wcwidth 0.2.12
webcolors 1.13
webdataset 0.2.48
webencodings 0.5.1
websocket-client 1.6.4
websockets 11.0.3
wheel 0.41.2
xformers 0.0.22
yarl 1.8.2
zipp 3.14.0

@lckj2009
Author

lckj2009 commented Nov 30, 2023

These are my weight files (see screenshot below):

[screenshot 1]

@lckj2009
Author

lckj2009 commented Nov 30, 2023

This is my /root/MiniGPT-5/config/minigpt4.yaml:

model:
  arch: minigpt5
  model_type: pretrain_vicuna
  freeze_vit: True
  freeze_qformer: True
  max_txt_len: 160
  end_sym: "###"
  prompt_path: ""
  prompt_template: '###Human: {} ###Assistant: '
  ckpt: '/root/MiniGPT-5/config/prerained_minigpt4_7b.pth'
  using_lora: True

datasets:
  cc_sbu_align:
    vis_processor:
      train:
        name: "blip2_image_eval"
        image_size: 224
    text_processor:
      train:
        name: "blip_caption"

run:
  task: image_text_pretrain

  # optimizer
  lr_sched: "linear_warmup_cosine_lr"
  init_lr: 3e-5
  min_lr: 1e-5
  warmup_lr: 1e-6

  weight_decay: 0.05
  max_epoch: 5
  iters_per_epoch: 200
  batch_size_train: 12
  batch_size_eval: 12
  num_workers: 4
  warmup_steps: 200

  seed: 42
  output_dir: "output/minigpt4_stage2_finetune"

  amp: True
  resume_ckpt_path: null

  evaluate: False
  train_splits: ["train"]

  device: "cuda"
  world_size: 1
  dist_url: "env://"
  distributed: True

@lckj2009
Author

lckj2009 commented Nov 30, 2023

This is my /root/MiniGPT-5/minigpt4/configs/models/minigpt4.yaml:

model:
  arch: mini_gpt4

  # vit encoder
  image_size: 224
  drop_path_rate: 0
  use_grad_checkpoint: False
  vit_precision: "fp16"
  freeze_vit: True
  freeze_qformer: True

  # Q-Former
  num_query_token: 32

  # Vicuna
  llama_model: "/root/vicuna-7b-v1.1"

  # generation configs
  prompt: ""

preprocess:
  vis_processor:
    train:
      name: "blip2_image_train"
      image_size: 224
    eval:
      name: "blip2_image_eval"
      image_size: 224
  text_processor:
    train:
      name: "blip_caption"
    eval:
      name: "blip_caption"

@lckj2009
Author

This is my /root/vicuna-7b-v1.1 directory (see screenshot below):
[screenshot 2]

@lckj2009
Author

lckj2009 commented Nov 30, 2023

Including "Vicuna-7b-v1.1" is all good. The path and configuration file are fine, but why is there still such an error. My pip "torch=2.0.1, lighting=2.0.9.post0"

@KzZheng
Collaborator

KzZheng commented Nov 30, 2023

I'm trying to help. According to your error record:

File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 190, in init
num_patches = (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'

where Img_size is None when the model is loaded.

But according to your config file, you have set img_size: 224 in /root/MiniGPT-5/minigpt4/configs/models/minigpt4.yaml. Then, I'm confused. Can you check the minigpt4_config.model_cfg.image_size after line 67?
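For reference, a minimal way to do that check without triggering an exception (a sketch, assuming minigpt4_config is the object built by Config(MiniGPT4Args) at line 67 and that model_cfg is an OmegaConf DictConfig):

from omegaconf import OmegaConf

# OmegaConf.select returns None (instead of raising ConfigAttributeError) when the
# key is absent, so it shows directly whether image_size made it into model_cfg.
print(OmegaConf.select(minigpt4_config.model_cfg, "image_size"))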

@lckj2009
Author

May I ask which file has line 67? /root/MiniGPT-5/model.py, line 67?
[screenshot 3]

@KzZheng
Collaborator

KzZheng commented Nov 30, 2023

Yes

@lckj2009
Author

Yes

Please wait a moment, I need to set up a debugging environment

@lckj2009
Author

lckj2009 commented Nov 30, 2023

File "/root/MiniGPT-5/model.py", line 68, in init
print('minigpt4_config.model_cfg.image_size' + str(minigpt4_config.model_cfg.image_size))
File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 355, in getattr
self._format_and_raise(
File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/base.py", line 231, in _format_and_raise
format_and_raise(
File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
_raise(ex, cause)
File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/_utils.py", line 797, in _raise
raise ex.with_traceback(sys.exc_info()[2]) # set env var OC_CAUSE=1 for full trace
File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 351, in getattr
return self._get_impl(
File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 442, in _get_impl
node = self._get_child(
File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/basecontainer.py", line 73, in _get_child
child = self._get_node(
File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 480, in _get_node
raise ConfigKeyError(f"Missing key {key!s}")
omegaconf.errors.ConfigAttributeError: Missing key image_size
full_key: model.image_size
object_type=dict

It seems that print('minigpt4_config.model_cfg.image_size' + str(minigpt4_config.model_cfg.image_size)) cannot be executed; it fails with the error above.

[screenshot 4]

@KzZheng
Collaborator

KzZheng commented Nov 30, 2023

Then, can you check self.args.cfg_path at line 27 of minigpt4/common/config.py and model_config_path at line 71 of minigpt4/common/config.py?
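A hedged sketch of those two probes (the line numbers and variable names follow this thread's MiniGPT-5 checkout and may differ in other versions):

# In minigpt4/common/config.py, near line 27, once self.args has been parsed:
print("cfg_path:", self.args.cfg_path)

# In minigpt4/common/config.py, near line 71, where the model config path is built:
print("model_config_path:", model_config_path)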

@lckj2009
Author

File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 190, in init
num_patches = (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'

Because these errors were executed first, an error was reported before reaching "minipt4/common/config. py".

Here are the modifications I made, but the program has not yet been executed here。
5

@lckj2009
Author

lckj2009 commented Nov 30, 2023

I am currently setting up the environment and will debug with PyCharm once it is ready. Then I will post screenshots to show you the situation.

@KzZheng
Collaborator

KzZheng commented Nov 30, 2023

This is weird. According to your error:

Traceback (most recent call last):
  File "/root/MiniGPT-5/examples/playground.py", line 40, in <module>
    minigpt5 = MiniGPT5_Model.load_from_checkpoint(stage1_ckpt, strict=False, map_location="cpu", encoder_model_config=model_args, **vars(training_args))
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/module.py", line 1552, in load_from_checkpoint
    loaded = _load_from_checkpoint(
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/saving.py", line 89, in _load_from_checkpoint
    model = _load_state(cls, checkpoint, strict=strict, **kwargs)
  File "/root/anaconda3/envs/minigpt5/lib/python3.9/site-packages/lightning/pytorch/core/saving.py", line 156, in _load_state
    obj = cls(**_cls_kwargs)
  File "/root/MiniGPT-5/model.py", line 68, in __init__
    self.model = MiniGPT5.from_config(minigpt4_config.model_cfg)
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt4.py", line 247, in from_config
    model = cls(
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt5.py", line 46, in __init__
    super().__init__(*args, **kwargs)
  File "/root/MiniGPT-5/minigpt4/models/mini_gpt4.py", line 53, in __init__
    self.visual_encoder, self.ln_vision = self.init_vision_encoder(
  File "/root/MiniGPT-5/minigpt4/models/blip2.py", line 65, in init_vision_encoder
    visual_encoder = create_eva_vit_g(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 416, in create_eva_vit_g
    model = VisionTransformer(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 259, in __init__
    self.patch_embed = PatchEmbed(
  File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 190, in __init__
    num_patches = (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'

Your error starts from File "/root/MiniGPT-5/model.py", line 68, but we are testing line 67, so the error happens after the config is built. Also, you already tried to print minigpt4_config.model_cfg.image_size before, and you did not receive any error at line 67.

@lckj2009
Author

lckj2009 commented Nov 30, 2023

This is weird. According to your error:

Traceback (most recent call last):

Your error starts from File "/root/MiniGPT-5/model.py", line 68, but we are testing line 67, so the error happens after the config is built. Also, you already tried to print minigpt4_config.model_cfg.image_size before, and you did not receive any error at line 67.

Yes, the next call is "self.model = MiniGPT5.from_config(minigpt4_config.model_cfg)". Before this point, no errors were reported.

About the error line "(img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])":
I found that the values of 'img_size[1]' and 'img_size[0]' are both None.

@lckj2009
Author

lckj2009 commented Nov 30, 2023

I found that the values of 'img_size[1]' and 'img_size[0]' are None, and I think this is the reason for the error.

@lckj2009
Author

Because img_size[1] is None and img_size[0] is None, the error "TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'" occurs.

@KzZheng
Collaborator

KzZheng commented Nov 30, 2023

I'm trying to help. According to your error record:

File "/root/MiniGPT-5/minigpt4/models/eva_vit.py", line 190, in init
num_patches = (img_size[1] // patch_size[1]) * (img_size[0] // patch_size[0])
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'

where Img_size is None when the model is loaded.

But according to your config file, you have set img_size: 224 in /root/MiniGPT-5/minigpt4/configs/models/minigpt4.yaml. Then, I'm confused. Can you check the minigpt4_config.model_cfg.image_size after line 67?

As I said here, img_size should not be None, and all the prints I asked for above are meant to find out why it is None here.

@lckj2009
Author

lckj2009 commented Nov 30, 2023

Now I can debug.

image_size is in minigpt4_config.datasets_cfg.cc_sbu_align.vis_processor.train, not in minigpt4_config.model_cfg.

There image_size == 224, which is correct; no error.

[screenshot 6]

[screenshot 7]
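A quick way to see the whole picture (a sketch, assuming model_cfg and datasets_cfg are OmegaConf DictConfigs, as the attribute access above suggests) is to dump both sub-configs and search each for image_size:

from omegaconf import OmegaConf

# image_size should normally appear under the model section as well as under the
# dataset's vis_processor.train section; here it only shows up in the latter.
print(OmegaConf.to_yaml(minigpt4_config.model_cfg))
print(OmegaConf.to_yaml(minigpt4_config.datasets_cfg))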

@KzZheng
Collaborator

KzZheng commented Nov 30, 2023

But it should also be inside both model_cfg and datasets_cfg.

[screenshot]

@lckj2009
Author

But it should also be inside both model_cfg and datasets_cfg.

[screenshot]

Is there a problem with my 'minigpt4.yaml' file, or did I read the wrong file? Could you please take a look at the path and content of the 'minigpt4.yaml' file in my reply above?

@KzZheng
Collaborator

KzZheng commented Nov 30, 2023

This is my /root/MiniGPT-5/minigpt4/configs/models/minigpt4.yaml:

model:
  arch: mini_gpt4

  # vit encoder
  image_size: 224
  drop_path_rate: 0
  use_grad_checkpoint: False
  vit_precision: "fp16"
  freeze_vit: True
  freeze_qformer: True

  # Q-Former
  num_query_token: 32

  # Vicuna
  llama_model: "/root/vicuna-7b-v1.1"

  # generation configs
  prompt: ""

preprocess:
  vis_processor:
    train:
      name: "blip2_image_train"
      image_size: 224
    eval:
      name: "blip2_image_eval"
      image_size: 224
  text_processor:
    train:
      name: "blip_caption"
    eval:
      name: "blip_caption"

I don't see anything wrong in your config file. To check whether you are reading the correct file, you should first check model_config_path at line 71 of minigpt4/common/config.py.

@lckj2009
Author

The error occurs before execution reaches 'self.tokenizer = self.model.llama_tokenizer'.
The error is in 'eva_vit.py'; please analyze it. Please take a look at the breakpoint debugging results:

[screenshot 8]
[screenshot 9]

@lckj2009
Author

I set a breakpoint here, but execution never reached it.

[screenshot 10]

@KzZheng
Collaborator

KzZheng commented Nov 30, 2023

I think that may be the reason. You have multiple minigpt4 folders under your Python path, so Python loads minigpt4 from the wrong path/folder. You can step into line 67 to see where Config(MiniGPT4Args) leads.
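A quick standard-library check for this (nothing project-specific is assumed beyond the package name):

import sys
import minigpt4

# Shows which copy of the package Python actually imported, and every directory it
# searches; a stray minigpt4/ earlier on sys.path would explain the wrong config file.
print(minigpt4.__file__)
print("\n".join(sys.path))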

@lckj2009
Author

lckj2009 commented Nov 30, 2023

I restarted. At /root/minigpt555/minigpt4/common/config.py, line 27, minigpt4.yaml is OK.
[screenshot 11]

At /root/minigpt555/minigpt4/common/config.py, line 71, minigpt4.yaml is OK.
[screenshot 12]

At /root/minigpt555/model.py, line 67, minigpt4.yaml is OK.
[screenshot 13]

They all point to the path /root/MiniGPT-5/config/minigpt4.yaml.

/root/minigpt555 and /root/MiniGPT-5 contain the same files.

@KzZheng
Collaborator

KzZheng commented Nov 30, 2023

model_config_path at line 71 of /root/minigpt555/minigpt4/common/config.py should be /root/MiniGPT-5/minigpt4/configs/models/minigpt4.yaml, not /root/MiniGPT-5/config/minigpt4.yaml.

Please check the function default_config_path at line 79 of minigpt4/models/base_model.py to see why you obtain the wrong path.

@lckj2009
Author

lckj2009 commented Dec 1, 2023

Please check the function default_config_path at line 79 of minigpt4/models/base_model.py to see why you obtain the wrong path.

Thank you, the problem has been resolved.

But a new error has occurred:

Traceback (most recent call last):
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
OSError: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connectionpool.py", line 404, in _make_request
self._validate_conn(conn)
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
conn.connect()
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connection.py", line 363, in connect
self.sock = conn = self._new_conn()
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f2252a4d8b0>: Failed to establish a new connection: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/connectionpool.py", line 799, in urlopen
retries = retries.increment(
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /stabilityai/stable-diffusion-2-1-base/resolve/main/text_encoder/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f2252a4d8b0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1247, in hf_hub_download
metadata = get_hf_file_metadata(
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1624, in get_hf_file_metadata
r = _request_wrapper(
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 402, in _request_wrapper
response = _request_wrapper(
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 425, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 63, in send
return super().send(request, *args, **kwargs)
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/requests/adapters.py", line 519, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /stabilityai/stable-diffusion-2-1-base/resolve/main/text_encoder/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f2252a4d8b0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 2236cb4e-1384-4f70-882e-68340d88ead0)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/transformers/utils/hub.py", line 417, in cached_file
resolved_file = hf_hub_download(
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1377, in hf_hub_download
raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/transformers/configuration_utils.py", line 672, in _get_config_dict
resolved_config_file = cached_file(
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/transformers/utils/hub.py", line 452, in cached_file
raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like stabilityai/stable-diffusion-2-1-base is not the path to a directory containing a file named text_encoder/config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
python-BaseException

I wonder whether I need to download 'https://huggingface.co/julien-c/EsperBERTo-small/resolve/main/pytorch_model.bin'. The network environment here is poor, so I want to download 'pytorch_model.bin' in advance and put it in a local folder, but I don't know which folder to put it in. Please let me know, thank you.

If it's not this file, please tell me the other file names so that I can download them and place them locally.

@lckj2009
Author

lckj2009 commented Dec 1, 2023

I have now cloned 'stabilityai/stable-diffusion-2-1-base'. Should I also put it in the /root/MiniGPT-5 directory? Where exactly should it be placed?

@KzZheng
Collaborator

KzZheng commented Dec 1, 2023

I have now cloned 'stabilityai/stable-diffusion-2-1-base'. Should I also put it in the /root/MiniGPT-5 directory? Where exactly should it be placed?

You can place it anywhere you want. Just change sd_model_name at line 73 of model.py to your local path.
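For reference, a hedged sketch of what pointing sd_model_name at a local clone amounts to. The component list and the /root/stable-diffusion-2-1-base path below are assumptions for illustration (the earlier traceback only shows text_encoder/config.json being fetched, and the exact from_pretrained calls in model.py may differ), but from_pretrained accepts a local directory in place of the Hub id in exactly this way:

from diffusers import AutoencoderKL, UNet2DConditionModel
from transformers import CLIPTextModel, CLIPTokenizer

# Local clone of stabilityai/stable-diffusion-2-1-base (fetched with git lfs),
# used in place of the Hub model id so nothing is downloaded at runtime.
sd_model_name = "/root/stable-diffusion-2-1-base"

tokenizer = CLIPTokenizer.from_pretrained(sd_model_name, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(sd_model_name, subfolder="text_encoder")
vae = AutoencoderKL.from_pretrained(sd_model_name, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(sd_model_name, subfolder="unet")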

@lckj2009
Author

lckj2009 commented Dec 4, 2023

I have now cloned 'stabilityai/stable-diffusion-2-1-base'. Should I also put it in the /root/MiniGPT-5 directory? Where exactly should it be placed?

You can place it anywhere you want. Just change sd_model_name at line 73 of model.py to your local path.

Now all the models are in place, but the run fails with an error: CUDA out of memory.

File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/minigpt555/minigpt4/models/modeling_llama.py", line 140, in forward
return self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x))
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/anaconda3/envs/minigpt555/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 86.00 MiB (GPU 0; 21.99 GiB total capacity; 21.38 GiB already allocated; 17.00 MiB free; 21.66 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
python-BaseException

I remember that the 'python3 playground.py --stage1_weight WEIGHT_FOLDER/stage1_cc3m.ckpt' command did not use much memory. My server has 2 graphics cards, each with 24 GB of memory.


@lckj2009
Author

lckj2009 commented Dec 6, 2023

Regarding the error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 21.99 GiB total capacity; 21.42 GiB already allocated; 107.00 MiB free; 21.57 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I tried reducing the memory fragment size to 32 MB, but it failed and reported the same error. I think I need you to suggest another way to solve it:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:32
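One caveat on that setting (a general PyTorch note, not specific to MiniGPT-5): PYTORCH_CUDA_ALLOC_CONF is only read when the CUDA caching allocator initializes, so it must be in the process environment before the first tensor reaches the GPU, for example exported in the same shell that then launches playground.py, or set at the very top of the script:

import os

# Must be set before the first CUDA allocation; changing it after tensors are
# already on the GPU has no effect. The 32 MB value mirrors the export above.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:32")

import torch  # torch import and any .to("cuda") calls come after the env var is set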
