Sink connector ID configuration #65
Comments
* WIP - converter for map/struct w/ and w/out schema
* Clarify converter config
* Don't log record value as it could be large
* test cleanup
* Add changelog entry
* Apply suggestions from code review

  These will probably need a follow-up commit to iron out any breakages...

  Co-authored-by: Rich Ellis <ricellis@users.noreply.github.com>
* Post-suggestion fixups
* PR suggestion: more explicit schema/type checks
* PR feedback: cleanup imports
* PR feedback: drop docid checks, to be addressed in #65
* PR feedback: fix tests and remove testNonReplicateSinkRecordSchema

  Test removed because we no longer support the KC_SCHEMA stuff, as discussed in PR.
* PR feedback: clarify defaults
* PR feedback: used linked list
* Update src/main/java/com/ibm/cloudant/kafka/connect/CloudantSinkTask.java

  Co-authored-by: Rich Ellis <ricellis@users.noreply.github.com>
* PR feedback: streaming implementation of put()

  Co-authored-by: Rich Ellis <ricellis@users.noreply.github.com>
Rather than a sink connector configuration option I think providing (and documenting) a SMT that is able to insert the message key into the message value as an |
Looking at the complexity that is coming out in #82 from trying to handle all the variations of inserting an
For the header, we would ensure the StringConverter is used, as we only want strings for IDs. We would need to document the name of the header and the expectation that the value is a string (this would cause errors for things that cannot convert cleanly to strings, in line with general Kafka behaviour on converters). We would overwrite existing |
* Add README section with SMT examples for customizing _id field and document the `HeaderFrom` transform to convert an event key to the header
* Modify the ConnectRecordMapper to check for the presence of a custom header on the event and use the value of that header as the document ID. #65
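The header-based approach described in the comments above could look roughly like the following. This is a minimal sketch, not the connector's actual `ConnectRecordMapper`: plain `Map`s stand in for the Kafka Connect record value and headers, the header name `cloudant_doc_id` is hypothetical (the discussion does not fix one), and it assumes the header value takes precedence over any `_id` already present in the value.

```java
import java.util.HashMap;
import java.util.Map;

final class DocIdFromHeaderSketch {

    // Hypothetical header name; the issue does not specify the real one.
    static final String DOC_ID_HEADER = "cloudant_doc_id";

    /**
     * Builds the document to write from a record value and its headers.
     * If the custom header is present, its string value becomes the _id
     * (assumption: it overwrites an existing _id). Otherwise the value is
     * passed through unchanged, so a missing _id is left for Cloudant to generate.
     */
    static Map<String, Object> toDocument(Map<String, Object> value,
                                          Map<String, String> headers) {
        Map<String, Object> doc = new HashMap<>(value);
        String headerId = headers.get(DOC_ID_HEADER);
        if (headerId != null) {
            doc.put("_id", headerId);
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> value = new HashMap<>();
        value.put("name", "widget");

        Map<String, String> headers = new HashMap<>();
        headers.put(DOC_ID_HEADER, "order-42");

        // The header value is used as the document ID.
        System.out.println(toDocument(value, headers).get("_id")); // prints order-42
    }
}
```

Copying the value before setting `_id` keeps the incoming record untouched, which matters if the same record is inspected again after mapping.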
Currently the sink connector handles `_id` as follows:

| `replication` option | `_id` present in `SinkRecord` | `_id` written |
|---|---|---|
| | | `_id` |
| | | `_id` generated by Cloudant |
| | | `<topic-name>_<partition>_<offset>_<_id>` |
| | | `_id` generated by Cloudant |

I don't think the `_id` should be decided by a seemingly unrelated configuration option.

We should instead document the use of SMTs to facilitate customization of the `_id`; namely `ReplaceField` on the `_id` field, e.g. excluding the `_id` field to force Cloudant to generate one.

We should provide (and document) a new SMT class (`KeyToDocId`?) that can insert the message key into the value `_id` field (noting that it should be a `String` schema or stringifiable).

In the case that the required `_id` field is not present or the record has a `null` key, we should not pass an ID to Cloudant and just let it generate a UUID. In the case that the user wants every `_id` generated, they could use `ReplaceField` with an exclude to remove any existing `_id` and the default `id` mode.

We should also document that it is possible to use further SMTs to customize the ID, e.g. using `ValueToKey` and `ExtractField` to convert some field to the key and then the `KeyToDocId` transform to use the new key as an ID.
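The SMT chains proposed above might be configured roughly as follows. This is only a sketch: the aliases (`dropId`, `createKey`, `extractKey`, `keyToDocId`), the example field name `customerId`, and the class name `com.example.KeyToDocId` are hypothetical, since the `KeyToDocId` transform is merely proposed in this issue. The `exclude` setting of `ReplaceField` assumes a recent Kafka version (older releases used `blacklist`). The configuration is built as a Java `Map` purely for illustration; the same keys would normally appear in the connector's properties or JSON.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SmtConfigSketch {

    /** Drops any existing _id from the value so that Cloudant generates one. */
    static Map<String, String> generatedIds() {
        Map<String, String> cfg = new LinkedHashMap<>();
        cfg.put("transforms", "dropId");
        cfg.put("transforms.dropId.type",
                "org.apache.kafka.connect.transforms.ReplaceField$Value");
        cfg.put("transforms.dropId.exclude", "_id");
        return cfg;
    }

    /** Promotes a value field to the record key, then uses that key as the document ID. */
    static Map<String, String> fieldAsDocId(String field) {
        Map<String, String> cfg = new LinkedHashMap<>();
        // Transforms run in the order listed here.
        cfg.put("transforms", "createKey,extractKey,keyToDocId");
        // Copy the chosen value field into a struct/map key.
        cfg.put("transforms.createKey.type",
                "org.apache.kafka.connect.transforms.ValueToKey");
        cfg.put("transforms.createKey.fields", field);
        // Unwrap the key so it is the bare field value.
        cfg.put("transforms.extractKey.type",
                "org.apache.kafka.connect.transforms.ExtractField$Key");
        cfg.put("transforms.extractKey.field", field);
        // Hypothetical class name for the proposed KeyToDocId transform.
        cfg.put("transforms.keyToDocId.type", "com.example.KeyToDocId");
        return cfg;
    }

    public static void main(String[] args) {
        fieldAsDocId("customerId").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

A `LinkedHashMap` is used only so the printed configuration keeps a readable order; Kafka Connect itself orders transforms by the `transforms` list, not by map order.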