
Proper sentence-transformers ONNX export support #1589

Merged

Conversation

fxmarty
Contributor

@fxmarty fxmarty commented Dec 12, 2023

As reported in #1519, naively mapping sentence-transformers models onto the transformers library supports only a subset of the sentence-transformers library.

This PR adds support for exporting the sentence_embedding output of sentence-transformers models.

Examples:

```
optimum-cli export onnx -m sentence-transformers/clip-ViT-B-32-multilingual-v1 clip_vit_multilingual_onnx
optimum-cli export onnx -m sentence-transformers/all-MiniLM-L6-v2 minilm_onnx
```
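
For illustration, a minimal sketch of querying the exported MiniLM model with onnxruntime. The file name `model.onnx` and the `sentence_embedding` output name follow the export described in this PR; everything else here is an assumption, not part of the diff:

```python
# Sketch: run the exported ONNX model and fetch the pooled sentence embedding.
# Assumes the export above produced minilm_onnx/model.onnx (hypothetical path).
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
session = ort.InferenceSession("minilm_onnx/model.onnx")

inputs = tokenizer(["An example sentence"], return_tensors="np")
(sentence_embedding,) = session.run(
    ["sentence_embedding"],
    {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]},
)
print(sentence_embedding.shape)  # (batch_size, hidden_size), e.g. (1, 384) for MiniLM-L6
```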

@fxmarty fxmarty requested review from mht-sharma and michaelbenayoun and removed request for mht-sharma December 12, 2023 15:11
@JingyaHuang
Contributor

Can those extra outputs be supported directly in transformers? I just find the changes a bit hacky, and they are causing errors in the optimum-neuron subpackage: aws-neuron/aws-neuron-sdk#808

@JingyaHuang
Contributor

Standardizing model attributes made debugging a bit misleading, e.g. it was surprising to get sentence-transformers-transformer as the model_type from a config where the model_type is marked as bert:

[screenshot of the config in question]

@fxmarty
Contributor Author

fxmarty commented Jan 9, 2024

I doubt this is feasible, as sentence-transformers adds quite a few features on top of transformers, for example the sentence embeddings.
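
For context (not part of this PR's diff): the sentence embedding is a pooled, and optionally normalized, version of the token embeddings. A simplified sketch of the common mean-pooling case; actual sentence-transformers models configure pooling per model, so this is illustrative only:

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Average the token embeddings over the sequence, ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)  # avoid division by zero
    return summed / counts  # shape: (batch_size, hidden_size)
```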

> Standardizing model attributes made debugging a bit misleading, e.g. it was surprising to get sentence-transformers-transformer as the model_type from a config where the model_type is marked as bert

I don't find this too hacky: the model_type bert refers to the bottleneck model in sentence-transformers' nn.Sequential. Some sentence-transformers models use a Transformer as the bottleneck, some use a CLIPModel, and the export differs depending on the architecture:

```python
class SentenceTransformersTransformerOnnxConfig(TextEncoderOnnxConfig):
    NORMALIZED_CONFIG_CLASS = NormalizedTextConfig
    # Some bottleneck transformers models require a specific ONNX opset to be successfully
    # exported. We put a rather high opset here for the export to work for all architectures.
    DEFAULT_ONNX_OPSET = 14

    @property
    def inputs(self) -> Dict[str, Dict[int, str]]:
        return {
            "input_ids": {0: "batch_size", 1: "sequence_length"},
            "attention_mask": {0: "batch_size", 1: "sequence_length"},
        }

    @property
    def outputs(self) -> Dict[str, Dict[int, str]]:
        return {
            "token_embeddings": {0: "batch_size", 1: "sequence_length"},
            "sentence_embedding": {0: "batch_size"},
        }

    # we need to set output_attentions=True in the model input to avoid calling
    # torch.nn.functional.scaled_dot_product_attention that is not supported by the ONNX export
    # due to the op torch.nn.functional.multi_head_attention_forward used for WavLM
    def patch_model_for_export(
        self, model: Union["PreTrainedModel", "TFPreTrainedModel"], model_kwargs: Optional[Dict[str, Any]] = None
    ) -> "ModelPatcher":
        return SentenceTransformersTransformerPatcher(self, model, model_kwargs=model_kwargs)
```

and

```python
class SentenceTransformersCLIPOnnxConfig(CLIPOnnxConfig):
    @property
    def outputs(self) -> Dict[str, Dict[int, str]]:
        return {
            "text_embeds": {0: "text_batch_size"},
            "image_embeds": {0: "image_batch_size"},
        }

    def patch_model_for_export(
        self, model: Union["PreTrainedModel", "TFPreTrainedModel"], model_kwargs: Optional[Dict[str, Any]] = None
    ) -> "ModelPatcher":
        return SentenceTransformersCLIPPatcher(self, model, model_kwargs=model_kwargs)
```
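
The two patchers referenced above are not shown in this snippet. Conceptually, they rewrap the model's forward so that the exported ONNX graph exposes the pooled outputs. A hypothetical simplification for the Transformer case, not the actual optimum implementation:

```python
# Hypothetical sketch of what the Transformer patcher does conceptually; the real
# SentenceTransformersTransformerPatcher in optimum differs in detail.
def patch_model_for_sentence_embedding(model):
    # "model" is assumed to be a loaded sentence_transformers.SentenceTransformer.
    def patched_forward(input_ids, attention_mask):
        # SentenceTransformer modules pass a feature dict through the module stack,
        # adding "token_embeddings" and, after pooling, "sentence_embedding".
        features = model({"input_ids": input_ids, "attention_mask": attention_mask})
        return {
            "token_embeddings": features["token_embeddings"],
            "sentence_embedding": features["sentence_embedding"],
        }

    model.forward = patched_forward
    return model
```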

Happy to refactor if needed, though.

```python
@require_torch
@require_vision
@require_sentence_transformers
@pytest.mark.timm_test
```
Contributor
Why is it a timm test? @fxmarty

Contributor Author

It is a typo.
