The source code and dataset for the paper: Qing Ye, Xuan Lai, Chunlei Cheng. Named entity recognition for traditional Chinese medicine with lexical enhancement and span method.
torch==1.7.1
numpy==1.19.5
transformers==4.26.1
FastNLP==0.5.0
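To verify that the installed environment matches these pins, a quick check along the following lines can be used (a sketch; it only assumes the standard import names `torch`, `numpy`, `transformers`, and `fastNLP`):

```python
# Sketch: print installed versions against the pinned requirements above.
import importlib

PINS = {"torch": "1.7.1", "numpy": "1.19.5", "transformers": "4.26.1", "fastNLP": "0.5.0"}

for name, expected in PINS.items():
    module = importlib.import_module(name)                 # raises ImportError if the package is missing
    installed = getattr(module, "__version__", "unknown")
    status = "OK" if installed == expected else "MISMATCH"
    print(f"{name:12s} installed={installed:<10s} expected={expected:<10s} {status}")
```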
- Character embeddings: gigaword_chn.all.a2b.uni.ite50.vec
- Bi-gram embeddings: gigaword_chn.all.a2b.bi.ite50.vec
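Both files are plain-text pretrained word-vector files (50-dimensional). One way to load them is through FastNLP's `StaticEmbedding`; the snippet below is only a minimal sketch with a toy vocabulary, and the file path is a placeholder for your local copy:

```python
# Sketch: load the character (unigram) embeddings with FastNLP's StaticEmbedding.
from fastNLP import Vocabulary
from fastNLP.embeddings import StaticEmbedding

vocab = Vocabulary()
vocab.add_word_lst(list("人参当归黄芪"))   # toy character vocabulary for illustration

char_embed = StaticEmbedding(
    vocab,
    model_dir_or_name="path/to/gigaword_chn.all.a2b.uni.ite50.vec",  # placeholder path
)
print(char_embed.embed_size)  # 50 for these embeddings
```

The bi-gram file can be loaded the same way, using a vocabulary of adjacent character pairs instead of single characters.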
All training settings and hyperparameters are defined in src/options.py; the key options are listed below (see the sketch after the list).
bert_dir='path to the pretrained Chinese-BERT-wwm model'
task_type='crf / span / mrc' # decoding method: one of the three supported decoders
train_epochs # number of training epochs, default=10
train_batch_size # training batch size, default=64
gpu_ids # GPU ids to use: -1 for CPU, "0,1,..." for multiple GPUs
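As a rough guide, the options above would correspond to argparse definitions along these lines (a sketch only; the actual definitions live in src/options.py, and the defaults for `task_type` and `gpu_ids` shown here are assumptions):

```python
# Sketch of the training options described above (the real definitions are in src/options.py).
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="TCM NER training options (sketch)")
    parser.add_argument("--bert_dir", type=str, required=True,
                        help="path to the pretrained Chinese-BERT-wwm model")
    parser.add_argument("--task_type", type=str, default="span",          # default is an assumption
                        choices=["crf", "span", "mrc"],
                        help="decoding method")
    parser.add_argument("--train_epochs", type=int, default=10,
                        help="number of training epochs")
    parser.add_argument("--train_batch_size", type=int, default=64,
                        help="training batch size")
    parser.add_argument("--gpu_ids", type=str, default="0",              # default is an assumption
                        help='GPU ids to use: -1 for CPU, "0,1,..." for multiple GPUs')
    return parser

if __name__ == "__main__":
    print(build_parser().parse_args())
```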
- The FLAT model source code.
- The FLAT model paper: Li X, Yan H, Qiu X, Huang X. 2020. FLAT: Chinese NER Using Flat-Lattice Transformer. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6836-6842.
- The details about FastNLP.