
Grounding dino requirements #6263

Open
lucienfostier wants to merge 4 commits into 1.5_maintenance from groundingDinoRequirements

Conversation

lucienfostier (Collaborator)

  • GafferML::Inference : Add support for registering custom op library into ONNX.
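
For readers coming to this from the Gaffer side, the sketch below shows what registering a custom op library with ONNX Runtime looks like in isolation, assuming a runtime version that provides Ort::SessionOptions::RegisterCustomOpsLibrary. The function name, parameter handling and library path are illustrative, not the Inference node's actual code; onnxruntime-extensions is the library that supplies the BERT tokenizer mentioned in the commit message below.

#include <onnxruntime_cxx_api.h>

// Illustrative sketch only, not GafferML code.
Ort::Session makeSession( Ort::Env &env, const char *modelPath, const char *customOpLibraryPath )
{
	Ort::SessionOptions options;
	if( customOpLibraryPath && *customOpLibraryPath )
	{
		// Loads the shared library (e.g. libortextensions.so from
		// onnxruntime-extensions) and registers every custom op domain
		// it exports with this session.
		options.RegisterCustomOpsLibrary( customOpLibraryPath );
	}
	return Ort::Session( env, modelPath, options );
}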

Checklist

  • I have read the contribution guidelines.
  • I have updated the documentation, if applicable.
  • I have tested my change(s) in the test suite, and added new test cases where necessary.
  • My code follows the Gaffer project's prevailing coding style and conventions.

lucienfostier (Collaborator, Author)

@johnhaddon Could you have a look at this PR? It would be useful to deploy Grounding DINO through GafferML at IE.

lucienfostier changed the base branch from main to 1.5_maintenance on February 8, 2025 19:58
This can be useful for models like Grounding DINO that require the ONNX Runtime extensions for the BERT tokenizer.
lucienfostier force-pushed the groundingDinoRequirements branch from 8c1d9db to 260f7f3 on February 8, 2025 19:59
johnhaddon (Member) left a comment

Thanks Lucien! Would be great to get your dino stuff deployed at IE, and these seem like worthwhile features in general. I've commented inline as usual...
Cheers...
John

@@ -213,6 +216,16 @@ Tensor::Tensor( const IECore::ConstDataPtr &data, std::vector<int64_t> shape )
std::copy( typedData->readable().begin(), typedData->readable().end(), value.GetTensorMutableData<bool>() );
m_state = new State{ std::move( value ), nullptr };
}
else if constexpr( std::is_same_v<DataType, StringVectorData> )
{
// Special case for the vector of std::string fiasco.
johnhaddon (Member)

The "vector of bool fiasco" is a specific problem with vector<bool> that doesn't apply to strings. It looks like we do need a special case for strings, but not because of any underlying problems with vector<string>.

lucienfostier (Collaborator, Author)

Addressed here: c7db62d

Ort::Value value = Ort::Value::CreateTensor(
allocator, shape.data(), shape.size(), ONNX_TENSOR_ELEMENT_DATA_TYPE_STRING
);
std::copy( typedData->readable().begin(), typedData->readable().end(), value.GetTensorMutableData<std::string>() );
johnhaddon (Member)

I'm really surprised this works. As far as I can see, the underlying ONNX C library represents strings as const char *, and there's no guarantee that std::string has the same layout (although it is possible to implement it so). Even if it did have the same layout, I think we've leaked the strings here, because the destructors for the copies will never get called (there aren't std::string objects in the C library).

I think we really need to build a vector<const char *> here and pass it to value.FillStringTensor() instead. Does that make sense?
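
For illustration, a minimal sketch of that suggestion, assuming typedData is the StringVectorData handled by this branch of the constructor; this is just the shape of the idea, not the change that eventually landed in c7db62d:

Ort::Value value = Ort::Value::CreateTensor(
	allocator, shape.data(), shape.size(), ONNX_TENSOR_ELEMENT_DATA_TYPE_STRING
);
// FillStringTensor() copies each element into the tensor's own string
// storage, so we don't depend on std::string's layout and nothing leaks.
std::vector<const char *> pointers;
pointers.reserve( typedData->readable().size() );
for( const std::string &s : typedData->readable() )
{
	pointers.push_back( s.c_str() );
}
value.FillStringTensor( pointers.data(), pointers.size() );
m_state = new State{ std::move( value ), nullptr };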


lucienfostier (Collaborator, Author)

Addressed here: c7db62d

johnhaddon (Member)

> Yes, that makes sense, but this comment is very confusing.

Oh, that is confusing! Looking at the source it looks like the C library is actually implemented in C++ behind the scenes, so it's plausible that the tensor is backed by std::string, which would also explain how this was working before. I'd prefer not to depend on that though and go through the "proper" API as you've done now - thanks for the update.

@@ -125,6 +125,9 @@ void dispatchTensorData( const Ort::Value &value, F &&functor )
case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64 :
functor( value.GetTensorData<int64_t>() );
break;
case ONNX_TENSOR_ELEMENT_DATA_TYPE_STRING :
functor( value.GetTensorData<std::string>() );
johnhaddon (Member)

I'm surprised that this works - as I understand it, there are const char * elements in the tensor, so casting them to std::string will likely blow up. Would be good to have some unit tests to demonstrate this all working. We also need to update tensorGetItem() in GafferMLModule.cpp to handle strings, which will help with the testing.

lucienfostier (Collaborator, Author)

I would need a bit more information here, because it looks to me like the generic functor won't work here, right?
We need something a bit more specific to reconstruct std::string from char *, right?
Or do we convert into some kind of CharVectorData?

johnhaddon (Member)

Ugh, yeah, this is messy. The ONNX docs do say we can't do it this way though - GetTensorData calls through to the C GetTensorMutableData function, where the docs say this:

  • \param[in] value A tensor type (string tensors are not supported)

We don't really want to convert to another data type either though - for most of the uses of dispatchTensorData() it would be wasteful to make a copy. Even if we are getting lucky here and getting a pointer to real std::strings, Tensor::isEqualTo() and Tensor::memoryUsage() look broken, since they assume they're looking at POD types and won't handle the fact that the string contains a pointer. Since 50% of the clients need a special case anyway, we could argue that dispatchTensorData() should throw for string data, forcing clients to think carefully and implement a special case.

All this just convinces me further of the need for really strong testing of the string tensors.
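
To make the "special case" concrete, here is a sketch of reading a string tensor through the string-specific accessors that the API does support, GetStringTensorElementLength() and GetStringTensorElement(); the helper name and the copy-everything behaviour are illustrative assumptions, not what the PR implements:

// Illustrative helper, not GafferML code : copies every element out of a
// string tensor using the supported string accessors.
std::vector<std::string> stringTensorElements( const Ort::Value &value )
{
	const size_t count = value.GetTensorTypeAndShapeInfo().GetElementCount();
	std::vector<std::string> result;
	result.reserve( count );
	for( size_t i = 0; i < count; ++i )
	{
		const size_t length = value.GetStringTensorElementLength( i );
		std::string element( length, '\0' );
		value.GetStringTensorElement( length, i, element.data() );
		result.push_back( std::move( element ) );
	}
	return result;
}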

johnhaddon (Member)

Thanks for the update Lucien, and for pointing out the mysterious std::string references in the C documentation. If the tensors really are backed by std::string, it is a little tempting to rely on that if it makes things easier in the short term. But I'd only be willing to do that if we have robust test coverage for all string tensor operations - equality, copy, hash, indexing, conversion to data, etc. I think getting that coverage in place is the key - it would let us confirm whether there really are std::strings there, and give us a safety net if we decide to proceed on that assumption.
