Possible to load your own trained models with internet disabled? #52
Comments
Hello! I'm struggling to reproduce the issue - I seem to be able to load a local model without internet. I can't reproduce it with your notebook either, as the model is private. Could you provide a larger stack trace perhaps? It seems like it wants to load the config for
|
Thank you for the quick response! I've made the dataset public, so you can use it now when trying to reproduce the error. Also, here is the link to the training notebook: https://www.kaggle.com/code/jdonnelly0804/pii-train

In the training notebook it's OK for the internet to be on, and everything goes smoothly. I simply save a checkpoint from the trained model, download the files, upload them as a dataset, and use that dataset in the inference notebook that you were trying to recreate (that was the hidden dataset).

Finally, here is the full traceback:

model_checkpoint = "/kaggle/input/pii-train-1-cp3000/Kaggle Checkpoints/checkpoint 3000"

gaierror Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/urllib3/util/connection.py:72, in create_connection(address, timeout, source_address, socket_options)
File /opt/conda/lib/python3.10/socket.py:955, in getaddrinfo(host, port, family, type, proto, flags)
gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

NewConnectionError Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:386, in HTTPConnectionPool._make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:1042, in HTTPSConnectionPool._validate_conn(self, conn)
File /opt/conda/lib/python3.10/site-packages/urllib3/connection.py:363, in HTTPSConnection.connect(self)
File /opt/conda/lib/python3.10/site-packages/urllib3/connection.py:186, in HTTPConnection._new_conn(self)
NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x783edf828070>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

MaxRetryError Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:787, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
File /opt/conda/lib/python3.10/site-packages/urllib3/util/retry.py:592, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-uncased/resolve/main/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x783edf828070>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

During handling of the above exception, another exception occurred:

ConnectionError Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py:118, in validate_hf_hub_args.<locals>._inner_fn(*args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:1631, in get_hf_file_metadata(url, token, proxies, timeout, library_name, library_version, user_agent)
File /opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:385, in _request_wrapper(method, url, follow_relative_redirects, **params)
File /opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:408, in _request_wrapper(method, url, follow_relative_redirects, **params)
File /opt/conda/lib/python3.10/site-packages/requests/sessions.py:589, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
File /opt/conda/lib/python3.10/site-packages/requests/sessions.py:703, in Session.send(self, request, **kwargs)
File /opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_http.py:67, in UniqueRequestIdAdapter.send(self, request, *args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/requests/adapters.py:519, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /bert-base-uncased/resolve/main/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x783edf828070>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))"), '(Request ID: 36d0930c-5de6-410e-962b-fee0e9868975)')

The above exception was the direct cause of the following exception:

LocalEntryNotFoundError Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py:118, in validate_hf_hub_args.<locals>._inner_fn(*args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:1371, in hf_hub_download(repo_id, filename, subfolder, repo_type, revision, library_name, library_version, cache_dir, local_dir, local_dir_use_symlinks, user_agent, force_download, force_filename, proxies, etag_timeout, resume_download, token, local_files_only, legacy_cache_layout, endpoint)
LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

The above exception was the direct cause of the following exception:

OSError Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/span_marker/modeling.py:269, in SpanMarkerModel.from_pretrained(cls, pretrained_model_name_or_path, labels, config, model_card_data, *model_args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py:3462, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/span_marker/modeling.py:80, in SpanMarkerModel.__init__(self, config, encoder, model_card_data, **kwargs)
File /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:1082, in AutoConfig.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
File /opt/conda/lib/python3.10/site-packages/transformers/configuration_utils.py:644, in PretrainedConfig.get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
File /opt/conda/lib/python3.10/site-packages/transformers/configuration_utils.py:699, in PretrainedConfig._get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
File /opt/conda/lib/python3.10/site-packages/transformers/utils/hub.py:429, in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
OSError: We couldn't connect to 'https://huggingface.co/' to load this file, couldn't find it in the cached files and it looks like bert-base-uncased is not the path to a directory containing a file named config.json.
|
That's very useful information right there! I see the issue now. This should fix it for you:

import json
from typing import Optional

from transformers import PreTrainedModel, BertModel, BertConfig
from span_marker import SpanMarkerModel
from span_marker.configuration import SpanMarkerConfig
from span_marker.model_card import SpanMarkerModelCardData


class BertSpanMarkerModel(SpanMarkerModel):
    def __init__(
        self,
        config: SpanMarkerConfig,
        encoder: Optional[PreTrainedModel] = None,
        model_card_data: Optional[SpanMarkerModelCardData] = None,
        **kwargs,
    ) -> None:
        # Build the BERT encoder directly from the config stored in the checkpoint,
        # so that no encoder config has to be fetched by name.
        if encoder is None:
            encoder = BertModel(BertConfig(**config.encoder))
        super().__init__(config, encoder, model_card_data, **kwargs)


# Remove the encoder's "_name_or_path" entry from the checkpoint's config.json.
with open(r"/kaggle/input/pii-train-1-cp3000/Kaggle Checkpoints/checkpoint 3000/config.json") as f:
    data = json.load(f)
try:
    del data["encoder"]["_name_or_path"]
    with open(r"/kaggle/input/pii-train-1-cp3000/Kaggle Checkpoints/checkpoint 3000/config.json", "w") as f:
        json.dump(data, f)
except Exception:
    # Any failure here (missing key, file that cannot be written) is silently ignored.
    pass

model = BertSpanMarkerModel.from_pretrained(
    "/kaggle/input/pii-train-1-cp3000/Kaggle Checkpoints/checkpoint 3000",
    local_files_only=True,
    labels=[
        '1-EMAIL', '1-ID_NUM', '1-NAME_STUDENT', '1-PHONE_NUM', '1-STREET_ADDRESS',
        '1-URL_PERSONAL', '1-USERNAME', '2-ID_NUM', '2-NAME_STUDENT', '2-PHONE_NUM',
        '2-STREET_ADDRESS', '2-URL_PERSONAL', 'O'
    ],
)

If you're interested in the mistakes, here they are: The model makes two errors when trying to load without internet:
|
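The snippet above edits config.json in place under /kaggle/input and ignores any exception, so a failed write would go unnoticed. As a minimal sketch, assuming the input directory is not writable and that a location such as /kaggle/working is (both are assumptions, not something confirmed in this thread), the same edit can be applied to a copy of the checkpoint. The helper name `copy_and_patch_checkpoint` is hypothetical.

```python
import json
import shutil
from pathlib import Path


def copy_and_patch_checkpoint(src_dir, dst_dir):
    """Copy a checkpoint directory and drop encoder["_name_or_path"] in the copy.

    Hypothetical helper: it mirrors the config edit from the comment above,
    but never touches the original directory.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    # copyfile copies contents only (no permission bits), so files in the
    # copy stay writable even if the source files are read-only.
    shutil.copytree(src, dst, copy_function=shutil.copyfile, dirs_exist_ok=True)

    config_path = dst / "config.json"
    config = json.loads(config_path.read_text())
    # pop() returns None instead of raising when the key is absent.
    removed = config.get("encoder", {}).pop("_name_or_path", None)
    config_path.write_text(json.dumps(config, indent=2))
    return dst, removed
```

The returned directory could then be passed to `BertSpanMarkerModel.from_pretrained(str(dst), local_files_only=True, labels=...)` in place of the input path. Whether this alone avoids the tokenizer-side lookup shown in the next traceback is not confirmed here.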
Hi Tom, unfortunately after copying and pasting the code, the issue still persists: https://www.kaggle.com/jdonnelly0804/pii-infer

Were you able to make it work on your end?

import json
class BertSpanMarkerModel(SpanMarkerModel):
with open(r"/kaggle/input/pii-train-1-cp3000/Kaggle Checkpoints/checkpoint 3000/config.json") as f:
model = BertSpanMarkerModel.from_pretrained("/kaggle/input/pii-train-1-cp3000/Kaggle Checkpoints/checkpoint 3000", local_files_only=True, labels = [

LocalEntryNotFoundError Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py:118, in validate_hf_hub_args.<locals>._inner_fn(*args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:1362, in hf_hub_download(repo_id, filename, subfolder, repo_type, revision, library_name, library_version, cache_dir, local_dir, local_dir_use_symlinks, user_agent, force_download, force_filename, proxies, etag_timeout, resume_download, token, local_files_only, legacy_cache_layout, endpoint)
LocalEntryNotFoundError: Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.

The above exception was the direct cause of the following exception:

OSError Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/span_marker/modeling.py:302, in SpanMarkerModel.from_pretrained(cls, pretrained_model_name_or_path, labels, config, model_card_data, *model_args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/span_marker/tokenizer.py:277, in SpanMarkerTokenizer.from_pretrained(cls, pretrained_model_name_or_path, config, *inputs, **kwargs)
File /opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:752, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
File /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:1082, in AutoConfig.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
File /opt/conda/lib/python3.10/site-packages/transformers/configuration_utils.py:644, in PretrainedConfig.get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
File /opt/conda/lib/python3.10/site-packages/transformers/configuration_utils.py:699, in PretrainedConfig._get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
File /opt/conda/lib/python3.10/site-packages/transformers/utils/hub.py:429, in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
OSError: We couldn't connect to 'https://huggingface.co/' to load this file, couldn't find it in the cached files and it looks like bert-base-uncased is not the path to a directory containing a file named config.json.
|
I was wondering whether there is a way to load a model that I trained myself in a Kaggle notebook. There's currently an NER competition going on, and I wanted to try using the SpanMarker library to compete. Training went fine, but to submit, the Kaggle notebook must have internet disabled. When I try to load my checkpoint, I get this error:
from span_marker import SpanMarkerModel

model_checkpoint = "/kaggle/input/pii-train-1-cp3000/Kaggle Checkpoints/checkpoint 3000"
model = SpanMarkerModel.from_pretrained(
    model_checkpoint,
    local_files_only=True,
    labels=[
        '1-EMAIL', '1-ID_NUM', '1-NAME_STUDENT', '1-PHONE_NUM', '1-STREET_ADDRESS',
        '1-URL_PERSONAL', '1-USERNAME', '2-ID_NUM', '2-NAME_STUDENT', '2-PHONE_NUM',
        '2-STREET_ADDRESS', '2-URL_PERSONAL', 'O'
    ],
)
OSError: We couldn't connect to 'https://huggingface.co/' to load this file, couldn't find it in the cached files and it looks like bert-base-uncased is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
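For reference, the offline mode that the error message points to is controlled by environment variables. The sketch below sets `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE`; the wrapper function `enable_hf_offline_mode` is a hypothetical name. These flags only stop the libraries from attempting network requests. They do not make a file that is missing locally available, so on their own they are unlikely to resolve the `bert-base-uncased` lookup above.

```python
import os


def enable_hf_offline_mode(env=os.environ):
    """Set the Hugging Face offline flags in the given environment mapping.

    Call this before importing transformers or span_marker, since the
    libraries may read these variables at import time.
    """
    env["HF_HUB_OFFLINE"] = "1"
    env["TRANSFORMERS_OFFLINE"] = "1"
    return env


enable_hf_offline_mode()
```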
Kaggle notebook here: https://www.kaggle.com/jdonnelly0804/pii-infer
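The error names `bert-base-uncased` rather than the checkpoint path, which suggests the checkpoint's config.json refers to the encoder by its Hub name. As a rough diagnostic sketch (the function names are hypothetical, and the config layout is an assumption based on the `encoder` / `_name_or_path` keys mentioned elsewhere in this thread), the following lists every `_name_or_path` entry in a config and flags those that do not point to a local directory containing a config.json:

```python
import json
from pathlib import Path


def find_name_or_path_refs(config, prefix=""):
    """Recursively collect (dotted_key, value) for every "_name_or_path" entry."""
    refs = []
    for key, value in config.items():
        dotted = f"{prefix}{key}"
        if key == "_name_or_path" and isinstance(value, str):
            refs.append((dotted, value))
        elif isinstance(value, dict):
            refs.extend(find_name_or_path_refs(value, prefix=f"{dotted}."))
    return refs


def report_non_local_refs(checkpoint_dir):
    """Return the refs in <checkpoint_dir>/config.json that are not local model dirs."""
    config = json.loads((Path(checkpoint_dir) / "config.json").read_text())
    return [
        (key, value)
        for key, value in find_name_or_path_refs(config)
        if not (Path(value) / "config.json").is_file()
    ]
```

Calling `report_non_local_refs(model_checkpoint)` on the checkpoint directory would show which entries a loader might try to resolve against the Hub when internet is disabled.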