UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition
Wenxuan Zhou*, Sheng Zhang*, Yu Gu, Muhao Chen, Hoifung Poon (*Equal Contribution)
[Project Page] [Demo] [Paper] [Data] [Model]
- [12/4] We add the evaluation code for NER datasets.
- [9/14] We add our training code for finetuning the LLaMA base model with UniversalNER data.
- [8/11] We release two more UniNER models, UniNER-7B-type-sup and UniNER-7B-all, which were finetuned on ChatGPT-generated data and 40 supervised datasets from various domains and offer better NER performance.
- [8/10] We have released the inference code for running the model checkpoints. The code for pretraining and evaluation will be released soon.
Usage and License Notices: The data and model checkpoints are intended and licensed for research use only. They are also restricted to uses that follow the license agreements of LLaMA, Vicuna, and ChatGPT.
This project relies on vllm. Ensure you have gcc version 5 or later, and a CUDA version between 11.0 and 11.8, as specified in the installation requirements for vllm.
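The version constraints above can be checked before installing. The following is a minimal sketch, not part of this repository: the helper names `parse_version`, `gcc_ok`, and `cuda_ok` are hypothetical, and the functions assume you pass in a bare version string such as `9.4.0` or `11.8` that you have read from your toolchain yourself.

```python
import re


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def gcc_ok(version_text):
    """True if the gcc version is 5 or later."""
    return parse_version(version_text) >= (5,)


def cuda_ok(version_text):
    """True if the CUDA version lies between 11.0 and 11.8 inclusive."""
    parts = parse_version(version_text)
    # Pad a bare major version such as "11" to (11, 0) before comparing.
    major_minor = (parts + (0,))[:2]
    return (11, 0) <= major_minor <= (11, 8)


if __name__ == "__main__":
    print("gcc 9.4.0 supported:", gcc_ok("9.4.0"))
    print("CUDA 11.8 supported:", cuda_ok("11.8"))
    print("CUDA 12.1 supported:", cuda_ok("12.1"))
```

The range is taken from the requirement stated above; if the vllm installation requirements change, adjust the bounds accordingly.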
- Clone this repository and navigate to the folder
git clone https://github.com/universal-ner/universal-ner.git
cd universal-ner
- Install the required packages
pip install -r requirements.txt
We use vllm for inference, which can run on a single V100 GPU with 16 GB of memory.
To launch a Gradio demo locally, run the following command:
python -m src.serve.gradio_server \
--model_path Universal-NER/UniNER-7B-type \
--tensor_parallel_size 1 \
--max_input_length 512
Run the following command to use vllm for inference:
python -m src.serve.cli \
--model_path Universal-NER/UniNER-7B-type \
--tensor_parallel_size 1 \
--max_input_length 512
Run the following command to use Huggingface Transformers for inference:
python -m src.serve.hf \
--model_path Universal-NER/UniNER-7B-type \
--tensor_parallel_size 1 \
--max_input_length 512
Our training code is adapted from FastChat. See here for how to finetune the LLaMA base model with UniversalNER data.
To run the evaluation:
python -m src.eval.evaluate \
--model_path Universal-NER/UniNER-7B-type \
--data_path ./src/eval/test_data/CrossNER_AI.json \
--tensor_parallel_size 1
Due to licensing restrictions associated with many NER datasets used in our evaluation, only the CrossNER and MIT datasets are included in this repository.
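For readers unfamiliar with how NER predictions are commonly scored, the sketch below computes entity-level micro-averaged precision, recall, and F1 by exact match. It is an illustration only and makes several assumptions: it is not the code in `src.eval.evaluate`, the function name `micro_f1` is hypothetical, and representing each entity as a `(mention, type)` pair is a choice made for this example rather than the format of the included test data.

```python
def micro_f1(gold, pred):
    """Entity-level micro precision, recall, and F1 with exact matching.

    gold and pred are equal-length lists with one item per example; each item
    is a collection of (mention, entity_type) pairs (an assumed format).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of examples")

    true_pos = n_gold = n_pred = 0
    for gold_entities, pred_entities in zip(gold, pred):
        gold_set, pred_set = set(gold_entities), set(pred_entities)
        true_pos += len(gold_set & pred_set)  # entities predicted exactly right
        n_gold += len(gold_set)
        n_pred += len(pred_set)

    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data with made-up entity types, for illustration only.
    gold = [
        {("Turing", "researcher"), ("Enigma", "product")},
        {("ACL", "conference")},
    ]
    pred = [
        {("Turing", "researcher")},
        {("ACL", "conference"), ("2023", "conference")},
    ]
    p, r, f1 = micro_f1(gold, pred)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In the toy data, two of the three predicted entities match two of the three gold entities, so precision, recall, and F1 all equal 2/3.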
If you find UniversalNER helpful for your research and applications, please cite using this BibTeX:
@article{zhou2023universalner,
title={UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition},
author={Wenxuan Zhou and Sheng Zhang and Yu Gu and Muhao Chen and Hoifung Poon},
year={2023},
eprint={2308.03279},
archivePrefix={arXiv},
primaryClass={cs.CL}
}