
After fine-tuning the reranker, inference fails when loading the original minicpm_layerwise model; can a locally downloaded model be used instead? #1329

Open
zjx623 opened this issue Jan 13, 2025 · 6 comments

Comments

@zjx623

zjx623 commented Jan 13, 2025

# Load the fine-tuned model
model_path = '/home/fintuned_model/remini_hardmined_fintuned10_rawmodel/checkpoint-1083'
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True, torch_dtype=torch.bfloat16)
# Move the model to the GPU if one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
model.eval()
file_path = '/home/test_model/result/全部底库_500测试集召回结果.xlsx'
output_file = '/home/test_model/result/全部底库_500测试集召回结果_remini_rawmodel_e1.xlsx'
rerank_topk_res(tokenizer, model, file_path, output_file)

I fine-tuned a LoRA model with reranker_minicpm, but at inference time, while preloading the original model weights, I get a connection failure to Hugging Face. Is there a way to change the pretrained-model argument to a local path?
The error is as follows:

Could not locate the configuration_minicpm_reranker.py inside BAAI/bge-reranker-v2-minicpm-layerwise.
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 404, in _make_request
self._validate_conn(conn)
File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
conn.connect()
File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 419, in connect
self.sock = ssl_wrap_socket(
File "/opt/conda/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "/opt/conda/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "/opt/conda/lib/python3.10/ssl.py", line 513, in wrap_socket
return self.sslsocket_class._create(
File "/opt/conda/lib/python3.10/ssl.py", line 1104, in _create
self.do_handshake()
File "/opt/conda/lib/python3.10/ssl.py", line 1375, in do_handshake
self._sslobj.do_handshake()
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/requests/adapters.py", line 589, in send
resp = conn.urlopen(
File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 799, in urlopen
retries = retries.increment(
File "/opt/conda/lib/python3.10/site-packages/urllib3/util/retry.py", line 550, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/opt/conda/lib/python3.10/site-packages/urllib3/packages/six.py", line 769, in reraise
raise value.with_traceback(tb)
File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 404, in _make_request
self._validate_conn(conn)
File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
conn.connect()
File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 419, in connect
self.sock = ssl_wrap_socket(
File "/opt/conda/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "/opt/conda/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "/opt/conda/lib/python3.10/ssl.py", line 513, in wrap_socket
return self.sslsocket_class._create(
File "/opt/conda/lib/python3.10/ssl.py", line 1104, in _create
self.do_handshake()
File "/opt/conda/lib/python3.10/ssl.py", line 1375, in do_handshake
self._sslobj.do_handshake()
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1374, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1294, in get_hf_file_metadata
r = _request_wrapper(
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 278, in _request_wrapper
response = _request_wrapper(
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 301, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 93, in send
return super().send(request, *args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/requests/adapters.py", line 604, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: (ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')), '(Request ID: 5b14b1d4-ff41-4c98-a037-c5f5028f7cde)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/transformers/utils/hub.py", line 402, in cached_file
resolved_file = hf_hub_download(
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 860, in hf_hub_download
return _hf_hub_download_to_cache_dir(
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 967, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1485, in _raise_on_head_call_error
raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/jovyan/Umetrip/test_model/decoder_only/test_minicpm.py", line 87, in &lt;module&gt;
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True, torch_dtype=torch.bfloat16)
File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 524, in from_pretrained
config, kwargs = AutoConfig.from_pretrained(
File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 985, in from_pretrained
config_class = get_class_from_dynamic_module(
File "/opt/conda/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 502, in get_class_from_dynamic_module
final_module = get_cached_module_file(
File "/opt/conda/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 306, in get_cached_module_file
resolved_module_file = cached_file(
File "/opt/conda/lib/python3.10/site-packages/transformers/utils/hub.py", line 445, in cached_file
raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co/' to load this file, couldn't find it in the cached files and it looks like BAAI/bge-reranker-v2-minicpm-layerwise is not the path to a directory containing a file named configuration_minicpm_reranker.py.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
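For reference, transformers and huggingface_hub can be told to resolve everything from disk. A minimal sketch, assuming the base model is already downloaded locally; `HF_HUB_OFFLINE`, `TRANSFORMERS_OFFLINE`, and `local_files_only` are standard transformers/huggingface_hub options:

```python
import os

# Tell huggingface_hub/transformers never to contact huggingface.co;
# both variables are honored (the second one by older transformers versions).
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

model_path = "/home/fintuned_model/remini_hardmined_fintuned10_rawmodel/checkpoint-1083"

# local_files_only=True makes the intent explicit per call as well
# (commented out here because it needs the actual checkpoint on disk):
# tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True,
#                                           local_files_only=True)
# model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True,
#                                              local_files_only=True,
#                                              torch_dtype=torch.bfloat16)
```

Note that offline mode only changes where files are looked for: the dynamic `configuration_minicpm_reranker.py` named in the error still has to exist in the local checkpoint directory or the HF cache.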

@ZHAOFEGNSHUN

rerank_topk_res

Have you tried the parent directory as model_path? model_path = '/home/fintuned_model/remini_hardmined_fintuned10_rawmodel'

@zjx623
Author

zjx623 commented Jan 14, 2025

rerank_topk_res

Have you tried the parent directory as model_path? model_path = '/home/fintuned_model/remini_hardmined_fintuned10_rawmodel'

That doesn't work; that path is just a directory, and the trained LoRA weights actually live in the checkpoint folder. When transformers automatically resolves the base model, my environment can't reach Hugging Face (possibly due to network restrictions), even though the model is stored locally. Has anyone hit the same situation and found a way to skip the Hub lookup and point to a locally downloaded base model?

@zjx623
Author

zjx623 commented Jan 14, 2025

(screenshot)
In my training output folder I edited checkpoint-1083/adapter_config.json; whether I set the field there to a local path or to the Hub id "BAAI/bge-....", it still fails with the same cannot-connect-to-Hugging-Face error.
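For context, the field being edited in adapter_config.json is `base_model_name_or_path`, which is what peft/transformers follow to locate the base model. A small sketch of rewriting it programmatically; the throwaway demo file and the local directory below are hypothetical:

```python
import json
import os
import tempfile


def point_adapter_at_local_base(adapter_config_path, local_base_dir):
    """Rewrite base_model_name_or_path so the base model is loaded from disk."""
    with open(adapter_config_path, encoding="utf-8") as f:
        cfg = json.load(f)
    cfg["base_model_name_or_path"] = local_base_dir
    with open(adapter_config_path, "w", encoding="utf-8") as f:
        json.dump(cfg, f, indent=2)


# Demo on a throwaway copy; in this issue the real file is
# checkpoint-1083/adapter_config.json.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "adapter_config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"base_model_name_or_path": "BAAI/bge-reranker-v2-minicpm-layerwise"}, f)
    point_adapter_at_local_base(path, "/models/bge-reranker-v2-minicpm-layerwise")
    with open(path, encoding="utf-8") as f:
        print(json.load(f)["base_model_name_or_path"])
```

As the comment above suggests, this alone may not be enough: the base model's own config.json can carry `auto_map` entries that send dynamic-module loading back to the Hub repo, which would explain why a local path here still triggered a connection attempt.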

@ZHAOFEGNSHUN

ZHAOFEGNSHUN commented Jan 14, 2025

(screenshot) In my training output folder I edited checkpoint-1083/adapter_config.json; whether I set the field there to a local path or to the Hub id "BAAI/bge-....", it still fails with the same cannot-connect-to-Hugging-Face error.

Have you tried the FlagEmbedding-library approach? Loading this way with FlagEmbedding works fine on my side.
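For anyone landing here: FlagEmbedding's layerwise reranker accepts a local directory as the model path, which sidesteps the Hub lookup entirely. A hedged sketch; the wrapper function is mine, `LayerWiseFlagLLMReranker` and `compute_score` are FlagEmbedding APIs, and `cutoff_layers=(28,)` is just an example value:

```python
def rerank_with_flagembedding(model_dir, pairs, cutoff_layers=(28,)):
    """Score (query, passage) pairs with FlagEmbedding's layerwise reranker.

    model_dir can be a local directory such as
    /models/bge-reranker-v2-minicpm-layerwise, so no Hub access is needed.
    """
    # Imported lazily so the sketch can be read without FlagEmbedding installed.
    from FlagEmbedding import LayerWiseFlagLLMReranker

    reranker = LayerWiseFlagLLMReranker(model_dir, use_fp16=True)
    return reranker.compute_score(pairs, cutoff_layers=list(cutoff_layers))


# Example call (needs the model on disk):
# scores = rerank_with_flagembedding(
#     "/models/bge-reranker-v2-minicpm-layerwise",
#     [["what is a panda?", "The giant panda is a bear native to China."]],
# )
```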

@zjx623
Author

zjx623 commented Jan 14, 2025

(screenshot) In my training output folder I edited checkpoint-1083/adapter_config.json; whether I set the field there to a local path or to the Hub id "BAAI/bge-....", it still fails with the same cannot-connect-to-Hugging-Face error.

Have you tried the FlagEmbedding-library approach? Loading this way with FlagEmbedding works fine on my side.

Great, I'll give it a try. Thanks!

@zjx623
Author

zjx623 commented Jan 20, 2025

It turned out to be an environment configuration problem, now solved. Note: with Conda, make sure to use the conda-forge channel.
