Add BROS #23190
Conversation
Force-pushed from 042ab4a to ad32c01.
@jinhopark8345 Awesome work - looking forward to having this model added! Feel free to ping us when the PR is ready for review or you have any implementation questions in the meantime.
I am confused about what needs to be done. According to the How to add a new model guideline, a big part of it is porting pretrained models (from the original repo) into Hugging Face Transformers and making sure they are correctly ported by checking the outputs of each layer's forward step. However, it seems like the authors of the Bros model already uploaded checkpoints in the Transformers format. Do I need to write a conversion script? Or can I skip this step and move on to adding the model test code? Thanks for the help in advance!
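The layer-by-layer check described in the guideline can be automated with forward hooks. A minimal sketch (the two-model usage at the bottom assumes an `original_model` and a `ported_model` with matching architectures exist; they are placeholders, not BROS code):

```python
# Sketch: record every submodule's output during a forward pass so that two
# supposedly-equivalent models can be compared layer by layer.
import torch


def record_outputs(model, inputs):
    """Run `model` on `inputs` and return {module_name: output_tensor}."""
    outputs, hooks = {}, []
    for name, module in model.named_modules():
        if name:  # skip the root module itself
            hooks.append(
                module.register_forward_hook(
                    lambda mod, inp, out, name=name: outputs.__setitem__(name, out)
                )
            )
    with torch.no_grad():
        model(inputs)
    for hook in hooks:
        hook.remove()
    return outputs


# Hypothetical usage with an original and a ported model:
# orig_outs = record_outputs(original_model, batch)
# ported_outs = record_outputs(ported_model, batch)
# for name in orig_outs:
#     assert torch.allclose(orig_outs[name], ported_outs[name], atol=1e-5), name
```

This catches the first layer at which the two implementations diverge, rather than only comparing final logits.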
@jinhopark8345 Interesting - that will definitely make things easier! In this case, if the files are already on the hub and in the correct format, there's no need for the conversion script. It's possible there might be additional arguments required in the config files or additional files needed in the hub repo, in which case, I'd suggest writing a script to add these. You probably won't be able to write directly to the org's repo, but can open a PR with any necessary changes.
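A script like that could take roughly the shape below (a minimal sketch; the repo id, file name, and content are hypothetical placeholders, and it requires `huggingface_hub` to be installed and authenticated):

```python
# Sketch: propose additional files to a Hub repo you cannot push to directly,
# by opening a pull request on the Hub instead of committing to main.
# Repo id, file name, and content below are hypothetical placeholders.
def open_config_pr(repo_id, path_in_repo, content):
    """Upload `content` (bytes) to `repo_id` as a Hub pull request."""
    from huggingface_hub import HfApi  # lazy import; needs huggingface_hub installed

    api = HfApi()
    return api.upload_file(
        path_or_fileobj=content,
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        create_pr=True,  # open a PR rather than writing to the default branch
        commit_message="Add missing config fields",
    )


# Hypothetical usage (requires a valid token):
# open_config_pr("naver-clova-ocr/bros-base-uncased", "extra_config.json", b"{}")
```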
@amyeroberts I added
README.md (outdated)

```md
1. **[Bros](https://huggingface.co/docs/transformers/model_doc/bros)** (from NAVER CLOVA) released with the paper [BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents](https://arxiv.org/abs/2108.04539) by Teakgyu Hong, Donghyun Kim, Mingi Ji, Wonseok Hwang, Daehyun Nam, Sungrae Park.
1. **[BROS](https://huggingface.co/docs/transformers/main/model_doc/bros)** (from <FILL INSTITUTION>) released with the paper [<FILL PAPER TITLE>](<FILL ARKIV LINK>) by <FILL AUTHORS>.
```
This needs to be fixed for the documentation to build. Once resolved we can merge 🤗
Applied the fix!
@jinhopark8345 Great - thank you! Could you rebase on main to resolve the conflicts? We should be good to go after that :)
@jinhopark8345 Thanks for contributing this model! Make sure to share about its addition to the library on Twitter/LinkedIn/your medium of choice 🤗
@jinhopark8345 Thank you for adding this model into the library. Regarding the integration test, I get different values than the expected slice. Could you double check on your machines, please? And what's your machine (GPU) type? Thank you in advance.

```
(Pdb) outputs.last_hidden_state[0, :3, :3]
tensor([[-0.3165,  0.0830, -0.1203],
        [-0.0089,  0.0031,  0.0736],
        [-0.0461,  0.0146,  0.0880]], device='cuda:0')
(Pdb) expected_slice
tensor([[-0.4027,  0.0756, -0.0647],
        [-0.0192, -0.0065,  0.1042],
        [-0.0671,  0.0214,  0.0960]], device='cuda:0')
```
You can run the test with:

```shell
TF_FORCE_GPU_ALLOW_GROWTH=true RUN_SLOW=1 python3 -m pytest -v tests/models/bros/test_modeling_bros.py::BrosModelIntegrationTest::test_inference_no_head
```
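For reference, the integration test boils down to an element-wise tolerance check on that slice. The tensor values below are the ones quoted above; the tolerance values are illustrative, not the test's actual setting:

```python
# Reproduce the slice comparison outside the test harness.
# Values are the ones quoted in the comment above; atol is illustrative.
import torch

actual = torch.tensor([[-0.3165,  0.0830, -0.1203],
                       [-0.0089,  0.0031,  0.0736],
                       [-0.0461,  0.0146,  0.0880]])
expected_slice = torch.tensor([[-0.4027,  0.0756, -0.0647],
                               [-0.0192, -0.0065,  0.1042],
                               [-0.0671,  0.0214,  0.0960]])

# The largest element-wise gap is ~0.086, far beyond a tight tolerance:
print(torch.allclose(actual, expected_slice, atol=1e-4))  # prints False
print((actual - expected_slice).abs().max())
```

A mismatch of this size points at genuinely different weights or inputs, not at ordinary float32/GPU nondeterminism, which is typically orders of magnitude smaller.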
@ydshieh Thank you for providing the test command! I was able to reproduce the issue: the outputs didn't match the expected slice. After some testing, I found that some weights weren't being initialized properly; by changing the weight initialization in the conversion script, I was able to get consistent outputs. I suspect this issue wasn't detected earlier because when running the full test suite, the torch CUDA seed is manually set to a certain value, perhaps due to other tests or other reasons. The update is here, but I am not sure how I should apply this patch to the Transformers library.
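The failure mode described here, keys absent from a checkpoint being silently left at their fresh random initialization, can be reproduced in isolation. The toy module below is a stand-in for illustration, not BROS code:

```python
# Demonstrate how weights missing from a checkpoint stay randomly initialized,
# so two loads of the "same" checkpoint produce different outputs.
# ToyModel is a stand-in, not BROS code.
import torch
import torch.nn as nn


class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)

    def forward(self, x):
        return self.proj(x)


# A checkpoint that is missing the projection weights entirely:
partial_checkpoint = {}

runs = []
for _ in range(2):
    model = ToyModel()
    missing, unexpected = model.load_state_dict(partial_checkpoint, strict=False)
    print(missing)  # ['proj.weight', 'proj.bias'] stay at random init
    runs.append(model(torch.ones(1, 4)))

# Almost surely False: each load drew fresh random weights for the missing keys.
print(torch.allclose(runs[0], runs[1]))
```

This also shows why a globally-set CUDA seed can mask the bug: with a fixed seed, the "random" initialization is reproducible, so the test passes by accident.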
Hi @jinhopark8345 Thanks a lot for looking into this! You can open a PR to update the checkpoint repo used in the test, or we can do it on our own side. But is it expected that some weights were renamed?
These are the renamed weights:

```python
def rename_key(name):
    if name == "embeddings.bbox_projection.weight":
        name = "bbox_embeddings.bbox_projection.weight"
    if name == "embeddings.bbox_sinusoid_emb.x_pos_emb.inv_freq":
        name = "bbox_embeddings.bbox_sinusoid_emb.x_pos_emb.inv_freq"
    if name == "embeddings.bbox_sinusoid_emb.y_pos_emb.inv_freq":
        name = "bbox_embeddings.bbox_sinusoid_emb.y_pos_emb.inv_freq"
    return name
```

If you confirm updating the checkpoint is okay, I would like to open a PR!
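Applied to a full checkpoint, `rename_key` is simply mapped over every key of the state dict. A self-contained sketch, with placeholder integer values standing in for the real tensors:

```python
# Sketch: apply the rename_key mapping from the comment above to every key
# of a (placeholder) state dict.
def rename_key(name):
    if name == "embeddings.bbox_projection.weight":
        name = "bbox_embeddings.bbox_projection.weight"
    if name == "embeddings.bbox_sinusoid_emb.x_pos_emb.inv_freq":
        name = "bbox_embeddings.bbox_sinusoid_emb.x_pos_emb.inv_freq"
    if name == "embeddings.bbox_sinusoid_emb.y_pos_emb.inv_freq":
        name = "bbox_embeddings.bbox_sinusoid_emb.y_pos_emb.inv_freq"
    return name


# Placeholder state dict; in the real script the values are weight tensors.
state_dict = {
    "embeddings.bbox_projection.weight": 0,
    "embeddings.bbox_sinusoid_emb.x_pos_emb.inv_freq": 1,
    "encoder.layer.0.attention.self.query.weight": 2,
}
renamed = {rename_key(key): value for key, value in state_dict.items()}
print(sorted(renamed))
# ['bbox_embeddings.bbox_projection.weight',
#  'bbox_embeddings.bbox_sinusoid_emb.x_pos_emb.inv_freq',
#  'encoder.layer.0.attention.self.query.weight']
```

Keys not covered by the mapping pass through unchanged, so the comprehension is safe to run over the whole checkpoint.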
Sure, go for it. BTW, I see a lot of renamed keys. From your description, I think it is expected.
Hello @jinhopark8345 Thank you again for fixing the checkpoint. I have yet another question that needs your help. Regarding

transformers/src/transformers/models/bros/modeling_bros.py, lines 927 to 933 in 37c205e

but eventually, … Could you double check? Thank you in advance, again!
Hello @ydshieh Thank you for asking! The code below is the original implementation.

In the original implementation, … Would it be more helpful to users if we remove it so that BrosModel fails earlier? Or do you suggest different solutions?
Hi! In this case, you can add a `try`/`except` at the beginning of the method (we might need a few more fixes if CI fails due to this). Thank you!
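That suggestion might look roughly like the following (a hypothetical sketch: the function name, arguments, and message are illustrative, not the actual BrosModel code):

```python
# Hypothetical sketch of failing fast with a clear message when a required
# input is absent, per the try/except suggestion above. Not actual BrosModel code.
def check_required_inputs(input_ids, bbox):
    """Raise a clear error up front instead of an opaque one deep in the forward pass."""
    try:
        input_ids.shape, bbox.shape  # both must be tensor-like
    except AttributeError:
        raise ValueError(
            "Both `input_ids` and `bbox` must be provided to this model (got None)."
        )
```

The point of the pattern is that the user sees a model-specific error at the top of `forward`, rather than an `AttributeError` from some unrelated internal line.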
Hi @jinhopark8345, congrats on this amazing contribution. Feel free to share about it on Twitter/LinkedIn and we'll amplify.
* add Bros boilerplate
* copy and paste modeling_bros.py from official Bros repo
* update copyright of bros files
* copy tokenization_bros.py from official repo and update import path
* copy tokenization_bros_fast.py from official repo and update import path
* copy configuration_bros.py from official repo and update import path
* remove trailing period in copyright line
* copy and paste bros/__init__.py from official repo
* save formatting
* remove unused unnecessary pe_type argument - using only crel type
* resolve import issue
* remove unused model classes
* remove unnecessary tests
* remove unused classes
* fix original code's bug - layer_module's argument order
* clean up modeling auto
* add bbox to prepare_config_and_inputs
* set temporary value to hidden_size (32 is too low because of the Bros' positional embedding)
* remove decoder test, update create_and_check* input arguments
* add missing variable to model tests
* do make fixup
* update bros.mdx
* add boilerplate for no_head inference test
* update BROS_PRETRAINED_MODEL_ARCHIVE_LIST (add naver-clova-ocr prefix)
* add prepare_bros_batch_inputs function
* update modeling_common to add bbox inputs in Bros Model Test
* remove unnecessary model inference
* add test case
* add model_doc
* add test case for token_classification
* apply fixup
* update modeling code
* update BrosForTokenClassification loss calculation logic
* revert logits preprocessing logic to make sure logits have original shape
* update class name
* add BrosSpadeOutput; update BrosConfig arguments
* add boilerplate for no_head inference test
* add prepare_bros_batch_inputs function
* add test case
* add test case for token_classification
* update modeling code
* update BrosForTokenClassification loss calculation logic
* revert logits preprocessing logic to make sure logits have original shape
* apply masking on the fly
* add BrosSpadeForTokenLinking
* update class name; put docstring at the beginning of the file
* separate the logits calculation logic and loss calculation logic
* update logic for loss calculation so that logits shape doesn't change when returned
* update typo
* update prepare_config_and_inputs
* update dummy node initialization
* update last_hidden_states getting logic to consider when return_dict is False
* update box first token mask param
* bugfix: remove random attention mask generation
* update keys to ignore on load missing
* run make style and quality
* apply make style and quality of other codes
* update box_first_token_mask to bool type
* update index.md
* apply make style and quality
* apply make fix-copies
* pass check_repo
* update bros model doc
* docstring bugfix
* add checkpoint for doc, tokenizer for doc
* Update README.md
* Update docs/source/en/model_doc/bros.md (co-authored by amyeroberts)
* Update bros.md
* Update src/transformers/__init__.py (co-authored by amyeroberts)
* Update docs/source/en/model_doc/bros.md (co-authored by amyeroberts)
* Apply suggestions from code review (co-authored by amyeroberts)
* apply suggestions from code review
* apply suggestions from code review
* revert test_processor_markuplm.py
* Update test_processor_markuplm.py
* apply suggestions from code review
* apply suggestions from code review
* apply suggestions from code review
* update BrosSpadeELForTokenClassification head name to entity linker
* add doc string for config params
* update class, var names to be more explicit and apply suggestions from code review
* remove unnecessary keys to ignore
* update relation extractor to be initialized with config
* add bros processor
* apply make style and quality
* update bros.md
* remove bros tokenizer, add bros processor that wraps bert tokenizer
* revert change
* apply make fix-copies
* update processor code, update itc -> initial token, stc -> subsequent token
* add type hint
* remove unnecessary condition branches in embedding forward
* fix auto tokenizer fail
* update docstring for each class
* update bbox input dimension as standard 2 points and convert them to 4 points in forward pass
* update bros docs
* apply suggestions from code review: update Bros -> BROS in bros.md
* 1. box prefix var -> bbox 2. update variable names to be more explicit
* replace einsum with torch matmul
* apply style and quality
* remove unused argument
* remove unused arguments
* update docstrings
* apply suggestions from code review: add BrosBboxEmbeddings, replace einsum with classical matrix operations
* revert einsum update
* update bros processor
* apply suggestions from code review
* add conversion script for bros
* Apply suggestions from code review
* fix readme
* apply fix-copies

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Hi @jinhopark8345,
You can refer to the example notebook for identifying intra-relationships. If you are looking for information on entity linking versus entity extraction, you can check out the explanation of entity linking vs. entity extraction here.
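For intuition only: SPADE-style entity linking ultimately scores every (token, token) pair from the encoder's hidden states. A toy version of that pairwise scoring, with illustrative dimensions and fresh projections rather than the real BROS head:

```python
# Toy pairwise relation scoring in the spirit of SPADE-style entity linking:
# project hidden states into query/key spaces, then score every token pair.
# Dimensions and projections are illustrative, not the real BROS head.
import torch
import torch.nn as nn

batch, seq_len, hidden = 2, 5, 16
hidden_states = torch.randn(batch, seq_len, hidden)  # stand-in encoder output

q_proj = nn.Linear(hidden, hidden)
k_proj = nn.Linear(hidden, hidden)

q = q_proj(hidden_states)                             # (batch, seq, hidden)
k = k_proj(hidden_states)                             # (batch, seq, hidden)
relation_logits = torch.matmul(q, k.transpose(1, 2))  # (batch, seq, seq)

# relation_logits[b, i, j] scores how strongly token j links to token i;
# an argmax/threshold over j then yields the predicted links per token.
print(relation_logits.shape)  # torch.Size([2, 5, 5])
```

Entity extraction, by contrast, is per-token classification over entity labels, so its head outputs `(batch, seq, num_labels)` instead of a `(batch, seq, seq)` pairwise matrix.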
Thank you @jinhopark8345, this is extremely helpful!
What does this PR do?
Add BROS (BERT Relying On Spatiality) to 🤗 Transformers.
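For readers unfamiliar with the model: alongside `input_ids`, BROS-style layout models take one bounding box per token. A sketch of preparing that input, where the page size, the boxes, and the [0, 1] normalization convention are assumptions for illustration, not taken from this PR:

```python
# Sketch: build the per-token bbox tensor layout models expect by normalizing
# (x0, y0, x1, y1) word boxes to [0, 1]. Page size, boxes, and the exact
# normalization convention are illustrative assumptions.
import torch

page_width, page_height = 1000, 800
word_boxes = [(10, 20, 110, 40), (120, 20, 250, 40)]  # (x0, y0, x1, y1) in pixels

bbox = torch.tensor(
    [
        [x0 / page_width, y0 / page_height, x1 / page_width, y1 / page_height]
        for (x0, y0, x1, y1) in word_boxes
    ]
).unsqueeze(0)  # add a batch dimension -> (1, num_tokens, 4)

print(bbox.shape)  # torch.Size([1, 2, 4])
```

The resulting tensor is passed to the model together with `input_ids` and `attention_mask`, so the text embeddings can be combined with the layout information.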
Before submitting
- [ ] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@NielsRogge