Kafka Connect: Add SMTs for Debezium and AWS DMS #11936

Open
wants to merge 14 commits into
base: main
from

Conversation

ismailsimsek
Contributor

@ismailsimsek ismailsimsek commented Jan 9, 2025

resolves #10844
resolves #11914

Copied over the kafka-connect-transforms code with no code changes, applied code formatting, and updated build.gradle accordingly.

cc: @bryanck could you please take a look at this when you have a chance?

dependencies {
implementation project(path: ':iceberg-bundled-guava', configuration: 'shadow')
implementation libs.bson
// implementation libs.slf4j //TODO DISABLED! do we need this??
Contributor Author

Commented this out; not sure if it's needed.

Contributor

This is a direct dependency, so it should not be commented out.

@ismailsimsek
Contributor Author

@jbonofre @bryanck it's ready for review

@Fokko Fokko requested a review from bryanck January 15, 2025 14:04
bryanck and others added 13 commits January 15, 2025 17:13
(cherry picked from commit 639b0d5b41b827d984aae04efe594315ec2b2b91)
(cherry picked from commit 63cde8e7f6c12392c7741922d5e6ad807051f24a)
(cherry picked from commit b9cd15de938e57bf178e1ccb443e481fde881224)
(cherry picked from commit c17c6f734dac96975959e6165416396e4058332c)
(cherry picked from commit 03dcf40b484f40f62c551b9ddf5cefea93a3440a)
(cherry picked from commit d0adaf9f961ceb89aaa408c03874788b3cf2c422)
(cherry picked from commit 5812322e595cee663d920aedaed21998fffa9bdf)
(cherry picked from commit bf82d607dc2b5e816c8b6f59bcbdc48281154e98)
(cherry picked from commit 89f533b2e689cbd1935c3bd1b82eea5e9dc0cd07)
(cherry picked from commit 92e4d984fe41c20faf68b1c36e6fd20759e0a19f)
* smt-nested-json-as-map

- parse json objects into Maps rather than Structs prior to handing to the iceberg connector, for users with unstructured json data.

(cherry picked from commit 303435aa794d8df1728f83ca5179e896b17ca4ff)
* option-to-inject-kafka-metadata

- SMT to add Kafka metadata (topic, partition, offset, timestamp) to Struct and Map types

(cherry picked from commit 423b4a8b0f2e42f2dd7de315631e944c285dcb09)
* matf-non-flattening-mongodb-debezium-smt

- adds debezium mongo SMT for converting BSON before/after into typed Struct before/after

(cherry picked from commit 21d741e53ce77547edbb5838f1b2b49db619be0c)
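
To illustrate the idea behind the `option-to-inject-kafka-metadata` commit above (adding topic, partition, offset, and timestamp to a record value), here is a minimal sketch using only plain Java maps. It is not the code in this PR: the class name `KafkaMetadataSketch`, the method `withMetadata`, and the `_kafka_*` field names are all hypothetical, and the real SMT works against Kafka Connect records and also handles Struct values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a metadata-injecting transform; not the SMT from this PR. */
final class KafkaMetadataSketch {

  // Assumed field names for illustration only; the real transform may name these differently.
  static final String TOPIC = "_kafka_topic";
  static final String PARTITION = "_kafka_partition";
  static final String OFFSET = "_kafka_offset";
  static final String TIMESTAMP = "_kafka_timestamp";

  private KafkaMetadataSketch() {}

  /**
   * Returns a copy of a schemaless (Map) record value with the Kafka coordinates added.
   * The input map is left untouched so the caller's record is not mutated.
   */
  static Map<String, Object> withMetadata(
      Map<String, Object> value, String topic, int partition, long offset, Long timestamp) {
    Map<String, Object> result = new LinkedHashMap<>(value);
    result.put(TOPIC, topic);
    result.put(PARTITION, partition);
    result.put(OFFSET, offset);
    result.put(TIMESTAMP, timestamp); // may be null if the record carries no timestamp
    return result;
  }
}
```

Calling `KafkaMetadataSketch.withMetadata(value, "orders", 3, 1001L, 1736960000000L)` on a map holding `id=42` would return a new map with the original entry plus the four metadata entries. Copying rather than mutating is a design choice of this sketch, not a statement about how the ported transform behaves.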
@ismailsimsek ismailsimsek force-pushed the kafka-smt-copy branch 2 times, most recently from 2e2c727 to 9e64c5b Compare January 15, 2025 16:45
@@ -100,6 +101,7 @@ avro-avro = { module = "org.apache.avro:avro", version.ref = "avro" }
awssdk-bom = { module = "software.amazon.awssdk:bom", version.ref = "awssdk-bom" }
awssdk-s3accessgrants = { module = "software.amazon.s3.accessgrants:aws-s3-accessgrants-java-plugin", version.ref = "awssdk-s3accessgrants" }
azuresdk-bom = { module = "com.azure:azure-sdk-bom", version.ref = "azuresdk-bom" }
bson = { module = "org.mongodb:bson", version.ref = "bson-ver"}
Contributor

We need to verify this addition is captured in LICENSE/NOTICE files.

Contributor Author

Added a section for the MongoDB BSON libs to LICENSE.

Contributor

I think we can remove it until we add the transform library to the runtime. When we do that, then the new license plugin can add what is needed.

@@ -159,6 +161,7 @@ jaxb-runtime = { module = "org.glassfish.jaxb:jaxb-runtime", version.ref = "jaxb
kafka-clients = { module = "org.apache.kafka:kafka-clients", version.ref = "kafka" }
kafka-connect-api = { module = "org.apache.kafka:connect-api", version.ref = "kafka" }
kafka-connect-json = { module = "org.apache.kafka:connect-json", version.ref = "kafka" }
kafka-connect-transforms = { module = "org.apache.kafka:connect-transforms", version.ref = "kafka" }
Contributor

We need to verify this addition is captured in LICENSE/NOTICE files (if separate from the other kafka dependencies)

Contributor Author

Added a section for the Kafka libs to LICENSE.

Contributor

You can remove this too for now, until we add the transforms to the runtime.

@bryanck
Contributor

bryanck commented Jan 16, 2025

Thanks @ismailsimsek for porting this over!

@ismailsimsek ismailsimsek force-pushed the kafka-smt-copy branch 2 times, most recently from 823cc83 to 34445e6 Compare January 16, 2025 17:56
LICENSE Outdated
Comment on lines 337 to 357

--------------------------------------------------------------------------------

This product includes software developed at The Apache Software Foundation.

Apache Kafka (kafka-clients, kafka-connect-api, kafka-connect-json, kafka-connect-transforms)

Copyright: 1999-2022 The Apache Software Foundation.
Home page: https://kafka.apache.org/
License: https://www.apache.org/licenses/LICENSE-2.0

--------------------------------------------------------------------------------

This product includes software developed by MongoDB, Inc.

* MongoDB BSON (bson)

Copyright: 2008-present MongoDB, Inc.
Home page: https://www.mongodb.com/
License: https://github.com/mongodb/mongo-java-driver/blob/main/LICENSE.txt
Contributor Author

@danielcweeks does this look OK?

Contributor

You shouldn't modify the root LICENSE

Contributor

Given this isn't yet being included in the runtime, we don't necessarily need to add anything to the runtime LICENSE until we add it to that.

Contributor Author

@ismailsimsek ismailsimsek Jan 16, 2025

@bryanck should it be the NOTICE file instead? bson uses the implementation scope; is it then included with the distribution?

reverted

4 participants