pytorch-bert-ner

BERT-based named entity recognition (NER), implemented in PyTorch, with support for Chinese and English.

Requirements

python3
pip3 install -r requirements.txt

Run Example

--bert_model is the path to the pretrained PyTorch BERT model directory, which must contain: pytorch_model.bin, vocab.txt, bert_config.json
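Before training, it can help to confirm that the directory passed to --bert_model holds the three files listed above. The following is a minimal sketch; the helper name `missing_model_files` is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names taken from the requirement stated above.
REQUIRED_FILES = ("pytorch_model.bin", "vocab.txt", "bert_config.json")


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means the directory looks complete; any returned names are the files still to be downloaded or converted.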

If you have a TensorFlow BERT model (downloaded from https://github.com/google-research/bert), convert it to a PyTorch BERT model with the following command:

python3 convert_tf_checkpoint_to_pytorch.py --tf_checkpoint_path ../bert_model.ckpt --bert_config_file ../bert_config.json --pytorch_dump_path ../pytorch_model.bin

Downloading and using the already converted model is recommended.

English NER

python3 run_ner.py --data_dir=data/ --bert_model=base-cased --task_name=ner --output_dir=output --max_seq_length=64 --do_train --num_train_epochs 5 --do_eval --warmup_proportion=0.4

Chinese NER (the training corpus in the data folder is a small part of the People's Daily news corpus, included for a quick start; downloading the full data is recommended)

python3 run_ner.py --data_dir=data/ --bert_model=chinese-base-uncased --task_name=chinese_ner --output_dir=output --max_seq_length=64 --do_train --num_train_epochs 5 --do_eval --warmup_proportion=0.4

Pretrained pytorch model and data download

Link: https://pan.baidu.com/s/1TNcsx6zGCKjN_KY2It7hyA Extraction code: mlmd

Result

Validation Data

             precision    recall  f1-score   support

       MISC     0.9407    0.9304    0.9355       273
        LOC     0.9650    0.9881    0.9764       419
        PER     0.9844    0.9783    0.9813       322
        ORG     0.9794    0.9852    0.9822       337

avg / total     0.9683    0.9734    0.9708      1351

Test Data

             precision    recall  f1-score   support

        ORG     0.9152    0.9073    0.9113       464
        PER     0.9767    0.9692    0.9730       260
        LOC     0.9397    0.9263    0.9330       353
       MISC     0.8276    0.9014    0.8629       213

avg / total     0.9198    0.9240    0.9217      1290
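The avg / total rows appear to be support-weighted averages of the per-type rows. This is an inference from the numbers, not something the README states. A small sketch that recomputes the validation precision and recall averages from the table values (the names `VALIDATION_ROWS` and `weighted_average` are illustrative only):

```python
# (precision, recall, support) per entity type, copied from the validation table.
VALIDATION_ROWS = {
    "MISC": (0.9407, 0.9304, 273),
    "LOC": (0.9650, 0.9881, 419),
    "PER": (0.9844, 0.9783, 322),
    "ORG": (0.9794, 0.9852, 337),
}


def weighted_average(rows, column):
    """Average one metric column, weighting each row by its support."""
    total_support = sum(row[2] for row in rows.values())
    weighted_sum = sum(row[column] * row[2] for row in rows.values())
    return weighted_sum / total_support


avg_precision = weighted_average(VALIDATION_ROWS, 0)
avg_recall = weighted_average(VALIDATION_ROWS, 1)
print(f"{avg_precision:.4f} {avg_recall:.4f}")
```

Because the per-type values are already rounded to four decimals, a recomputed average can differ from the reported one in the last digit.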

Inference

python3 predict.py

{'2': {'tag': 'B_T', 'confidence': 0.9999847412109375}, '0': {'tag': 'I_T', 'confidence': 0.9989903569221497}, '1': {'tag': 'I_T', 'confidence': 0.9995298385620117}, '4': {'tag': 'I_T', 'confidence': 0.9996459484100342}, '年': {'tag': 'I_T', 'confidence': 0.9996104836463928}, '新': {'tag': 'O', 'confidence': 0.9995424747467041}, '的': {'tag': 'O', 'confidence': 0.9997028708457947}, '开': {'tag': 'O', 'confidence': 0.9999663829803467}, '始': {'tag': 'O', 'confidence': 0.9999591112136841}, '，王': {'tag': 'O', 'confidence': 0.9999748468399048}, '兴': {'tag': 'I_PER', 'confidence': 0.9997753500938416}, '很': {'tag': 'O', 'confidence': 0.9993890523910522}, '高': {'tag': 'O', 'confidence': 0.9992743134498596}, '兴': {'tag': 'O', 'confidence': 0.9999097585678101}}

Or call the model from Python:

from bert import Ner

model = Ner("output/")
output = model.predict("Steve went to Paris")

print(output)
# {
#     "Steve": {
#         "tag": "B-PER",
#         "confidence": 0.999879002571106
#     },
#     "went": {
#         "tag": "O",
#         "confidence": 0.9968552589416504
#     },
#     "to": {
#         "tag": "O",
#         "confidence": 0.9996656179428101
#     },
#     "Paris": {
#         "tag": "B-LOC",
#         "confidence": 0.999504804611206
#     }
# }
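The per-token output above can be grouped into entity spans. The sketch below assumes the dictionary shape shown in the English example and hyphenated tags such as B-PER and I-PER; the Chinese example uses underscores (B_T), so the separator would need adjusting there. The function `extract_entities` is hypothetical and not part of this repository.

```python
def extract_entities(prediction):
    """Group consecutive B-/I- tagged tokens into (text, type) pairs."""
    entities = []
    tokens, entity_type = [], None
    for token, info in prediction.items():
        prefix, _, label = info["tag"].partition("-")
        # An I- tag extends the open entity only if the type matches.
        if prefix == "I" and tokens and label == entity_type:
            tokens.append(token)
            continue
        # Anything else closes the open entity, if there is one.
        if tokens:
            entities.append((" ".join(tokens), entity_type))
        if prefix in ("B", "I"):
            tokens, entity_type = [token], label
        else:
            tokens, entity_type = [], None
    if tokens:
        entities.append((" ".join(tokens), entity_type))
    return entities


sample = {
    "Steve": {"tag": "B-PER", "confidence": 0.999879002571106},
    "went": {"tag": "O", "confidence": 0.9968552589416504},
    "to": {"tag": "O", "confidence": 0.9996656179428101},
    "Paris": {"tag": "B-LOC", "confidence": 0.999504804611206},
}
print(extract_entities(sample))
```

Since the output is keyed by token text, a token that occurs twice in a sentence would collapse into one entry, so this grouping is only reliable for sentences without repeated tokens.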
