GGUF support for BERT architecture #34238

Open · Dimmension opened this issue Oct 18, 2024 · 1 comment
Labels: Feature request (Request for a new feature)

Dimmension commented Oct 18, 2024

Feature request

I want to add the ability to use GGUF BERT models in transformers.
Currently the library does not support this architecture; attempting to load one fails with TypeError: Architecture 'bert' is not supported.
I have done most of the mapping, but I am having difficulty with a few fields.
Can anybody help me out or comment on this feature?
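For reference, a minimal reproduction of the failure, assuming a quantized BERT GGUF checkpoint (the repo id and file name below are placeholders):

from transformers import AutoModel

# Hypothetical repo id and GGUF file name, for illustration only.
model = AutoModel.from_pretrained(
    "some-org/bert-base-uncased-gguf",
    gguf_file="bert-base-uncased.Q8_0.gguf",
)
# Raises: TypeError: Architecture 'bert' is not supported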

Motivation

I ran into a problem: I can't use GGUF models in Rasa (Rasa uses the standard from_pretrained), so I decided to add BERT support.

Your contribution

Here is my extended ggml.py file:

GGUF_TENSOR_MAPPING = {
    "bert": {
        "context_length": "max_position_embeddings",
        "block_count": "num_hidden_layers",
        "feed_forward_length": "intermediate_size",
        "embedding_length": "hidden_size",
        "attention.head_cgguf>=0.10.0ount": "num_attention_heads",
        "attention.layer_norm_rms_epsilon": "rms_norm_eps",
        # "attention.causal": "",
        # "pooling_type": "",
        "vocab_size": "vocab_size",
    }
}
 
GGUF_CONFIG_MAPPING = {
    "bert": {
        "context_length": "max_position_embeddings",
        "block_count": "num_hidden_layers",
        "feed_forward_length": "intermediate_size",
        "embedding_length": "hidden_size",
        "attention.head_cgguf>=0.10.0ount": "num_attention_heads",
        "attention.layer_norm_rms_epsilon": "rms_norm_eps",
        # "attention.causal": "",
        # "pooling_type": "",
        "vocab_size": "vocab_size",
    }
}
 
GGUF_TOKENIZER_MAPPING = {
    "tokenizer": {
        # "ggml.token_type_count": "",
        # "ggml.pre": "",
        "ggml.model": "tokenizer_type",
        "ggml.tokens": "all_special_tokens",
        "ggml.token_type": "all_special_ids",
        "ggml.unknown_token_id": "unk_token_id",
        "ggml.seperator_token_id": "sep_token_id",
        "ggml.padding_token_id": "pad_token_id",
        "ggml.cls_token_id": "cls_token_id",
        "ggml.mask_token_id": "mask_token_id",
    },
    "tokenizer_config": {       
        "ggml.unknown_token_id": "unk_token_id",
        "ggml.seperator_token_id": "sep_token_id",
        "ggml.padding_token_id": "pad_token_id",
        "ggml.cls_token_id": "cls_token_id",
        "ggml.mask_token_id": "mask_token_id",
    },
}
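To show how the config mapping above might be consumed, here is a rough sketch; the shape of the metadata dict is an assumption (GGUF keys with the "bert." architecture prefix already stripped), and the actual extraction from the file is left out:

def gguf_to_bert_config_kwargs(metadata: dict) -> dict:
    # Translate GGUF metadata keys into transformers BertConfig kwargs
    # using the GGUF_CONFIG_MAPPING table defined above.
    mapping = GGUF_CONFIG_MAPPING["bert"]
    return {mapping[key]: value for key, value in metadata.items() if key in mapping}

# Example:
# {"context_length": 512, "block_count": 12, "embedding_length": 768}
# -> {"max_position_embeddings": 512, "num_hidden_layers": 12, "hidden_size": 768}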
VladOS95-cyber (Contributor) commented Oct 18, 2024

Hi @Dimmension, there is a dedicated open issue, #33260. You could take a look at how other architectures and tests were added and follow the same logic. Useful links you may need: https://github.com/ggerganov/llama.cpp/blob/master/convert_hf_to_gguf.py, where you can find the tensor and config conversion logic, and https://github.com/ggerganov/llama.cpp/blob/master/gguf-py/gguf/tensor_mapping.py for the mapping needed to correctly rename all tensors back.
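To illustrate the renaming direction the links above cover, here is a hedged sketch of mapping GGUF per-layer tensor names back to transformers BERT names; the specific name pairs are illustrative guesses and should be verified against tensor_mapping.py:

import re

# Illustrative per-layer pairs (guesses, not confirmed against llama.cpp).
GGUF_TO_HF_BERT = {
    "attn_q": "attention.self.query",
    "attn_k": "attention.self.key",
    "attn_v": "attention.self.value",
    "attn_output": "attention.output.dense",
    "ffn_up": "intermediate.dense",
    "ffn_down": "output.dense",
}

def rename_tensor(gguf_name: str) -> str:
    # Map a GGUF per-layer tensor name ("blk.<n>.<name>.<kind>") back to
    # the corresponding transformers BERT parameter name.
    match = re.match(r"blk\.(\d+)\.(\w+)\.(weight|bias)$", gguf_name)
    if match is None:
        return gguf_name  # embeddings, norms, etc. handled separately
    layer, name, kind = match.groups()
    hf_name = GGUF_TO_HF_BERT.get(name)
    if hf_name is None:
        return gguf_name
    return f"encoder.layer.{layer}.{hf_name}.{kind}"

# e.g. rename_tensor("blk.0.attn_q.weight")
# -> "encoder.layer.0.attention.self.query.weight"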
