[CDAP-21079] Implement cdap-messaging-spi extension for spanner - Part 2 #15771

sidhdirenge · 2024-12-13T14:16:41Z

Implementation for fetch logic.

...saging-ext-spanner/src/main/java/io/cdap/cdap/messaging/spanner/SpannerMessagingService.java

tivv

Please also add a unit test

tivv · 2024-12-18T22:41:06Z

...saging-ext-spanner/src/main/java/io/cdap/cdap/messaging/spanner/SpannerMessagingService.java

+    // publish_ts, sequence_id
+    String sqlStatement = String.format(
+        "SELECT %s, %s, UNIX_MICROS(%s), %s FROM %s where (%s > TIMESTAMP_MICROS(%s)) or"
+            + " (%s = TIMESTAMP_MICROS(%s) and %s > %s) order by" + " %s, %s LIMIT %s",


Did we confirm this would use indexes? I am also a bit worried that we are doing %s = TIMESTAMP_MICROS(%s). We essentially truncate the timestamp and then do an equality comparison on the truncated result. That may lead to subtle errors unless we are strict about sequence ids.
E.g. let's say we have timestamps equivalent to 3.1, 3.2, and 3.3 micros. When we read them, all are rounded to 3 micros. Let's say the last one we read was 3.2. Next time we will read 3.1 again, because 3.1 > ROUND(3.2).
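
To make the concern concrete, here is a minimal in-memory sketch of the same predicate shape, `(ts > cursor) OR (ts = cursor AND seq > cursorSeq)`. It is not the PR's code: the `Row` class, the nanosecond storage unit, and the helper names are assumptions made only for illustration, and no Spanner API is involved.

```java
import java.util.ArrayList;
import java.util.List;

final class TruncatedCursorDemo {
  /** Hypothetical row: publish time in nanoseconds plus a sequence id. */
  static final class Row {
    final long publishNanos;
    final int sequenceId;

    Row(long publishNanos, int sequenceId) {
      this.publishNanos = publishNanos;
      this.sequenceId = sequenceId;
    }
  }

  /** What the reader keeps as its cursor: the timestamp truncated to whole microseconds. */
  static long toMicros(long nanos) {
    return nanos / 1_000;
  }

  /** Same shape as the WHERE clause: (ts > cursor) OR (ts = cursor AND seq > cursorSeq). */
  static List<Row> fetchAfter(List<Row> table, long cursorMicros, int cursorSeq) {
    long cursorNanos = cursorMicros * 1_000; // stands in for TIMESTAMP_MICROS(cursor)
    List<Row> result = new ArrayList<>();
    for (Row row : table) {
      boolean later = row.publishNanos > cursorNanos;
      boolean sameTsLaterSeq = row.publishNanos == cursorNanos && row.sequenceId > cursorSeq;
      if (later || sameTsLaterSeq) {
        result.add(row);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Sub-microsecond timestamps: 3.1, 3.2, and 3.3 micros.
    List<Row> fine = List.of(new Row(3_100, 0), new Row(3_200, 0), new Row(3_300, 0));
    long cursor = toMicros(3_200); // last row read was 3.2, truncated to 3
    System.out.println(fetchAfter(fine, cursor, 0).size()); // prints 3: every row is read again

    // Microsecond-aligned timestamps: the cursor round-trips exactly.
    List<Row> aligned = List.of(new Row(3_000, 0), new Row(3_000, 1), new Row(4_000, 0));
    System.out.println(fetchAfter(aligned, toMicros(3_000), 0).size()); // prints 2
  }
}
```

In this sketch the re-read only happens when stored timestamps carry precision finer than the cursor. When every stored timestamp is a whole number of microseconds, the cursor round-trips exactly and the sequence id breaks ties.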

Regarding indexes: as of now, the message tables are not indexed. I believe the performance analysis during Masoud's POC was done without indexing these tables. The Cloud Spanner UI suggests indexing for these queries to improve performance. I plan to review this once we start performance testing.

Regarding the timestamps: I considered several scenarios. We usually handle such cases during publish. Publish also uses spanner.commit_timestamp(), which has microsecond precision, so the fractional-microsecond scenario should not occur. For bigger pipelines and large message volumes, we can test whether this sequencing causes any issue.

Will collect results for the various scenarios once the basic implementation is done and we start continuous tests.

Will definitely take up this enhancement in a follow-up PR.

Can you please add a TODO with JIRA for all the enhancements in the code?

Can you test on a big table? I believe we still plan to retain 7 days of data (correct me if I am wrong), and that can be quite a lot of records. A poll would also probably return 0 records most of the time.

Is it okay if I make the changes related to table indexing and the related tests as part of a follow-up PR? There are also a few metrics related to initial topic table creation that I need to check.

The current PR implements most of the messaging service changes. I then plan to use these changes in the existing probers to capture the exact stats.

Please let me know your thoughts.

PK should be on ts, sequence_id, payload_sequence_id. When I did the experiments, we were hitting the index as long as ts, sequence_id were being provided in the where clause in the same order as PK had been defined.
AFAIR, Arjan also did the test and saw no issue in it.
+1 to testing it again on a large, populated table to re-measure the latency.
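
As a rough analogy for why the predicate order matters, the sketch below keeps rows sorted in the PK order named above (ts, sequence_id, payload_sequence_id) and serves a fetch as one seek plus a forward scan. It is an in-memory illustration only: the `Key` class, `PK_ORDER`, and `fetchAfter` are hypothetical names, and it makes no claim about how Spanner plans the actual query.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class PrimaryKeyOrderSketch {
  /** Hypothetical key with the three PK columns from the discussion. */
  static final class Key {
    final long publishTsMicros;
    final int sequenceId;
    final int payloadSequenceId;

    Key(long publishTsMicros, int sequenceId, int payloadSequenceId) {
      this.publishTsMicros = publishTsMicros;
      this.sequenceId = sequenceId;
      this.payloadSequenceId = payloadSequenceId;
    }
  }

  /** Sort order matching the PK column order: ts, then sequence_id, then payload_sequence_id. */
  static final Comparator<Key> PK_ORDER =
      Comparator.<Key>comparingLong(k -> k.publishTsMicros)
          .thenComparingInt(k -> k.sequenceId)
          .thenComparingInt(k -> k.payloadSequenceId);

  /** Returns up to limit keys with (ts > cursorTs) or (ts = cursorTs and seq > cursorSeq). */
  static List<Key> fetchAfter(TreeSet<Key> table, long cursorTs, int cursorSeq, int limit) {
    // Seek just past every key that shares (cursorTs, cursorSeq), then scan forward in order.
    Key seek = new Key(cursorTs, cursorSeq, Integer.MAX_VALUE);
    List<Key> result = new ArrayList<>();
    for (Key key : table.tailSet(seek, false)) {
      if (result.size() == limit) {
        break;
      }
      result.add(key);
    }
    return result;
  }
}
```

Because the cursor fixes the leading key columns in key order, the matching rows form one contiguous range after the seek point. A predicate on a non-leading column alone would not have that property.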

SG to test later

...saging-ext-spanner/src/main/java/io/cdap/cdap/messaging/spanner/SpannerMessagingService.java

sidhdirenge · 2024-12-19T17:23:48Z

Please also add a unit test

Added JUnit tests. These currently cover only basic scenarios. Will enhance them and add more in follow-up PRs.

sonarqubecloud · 2024-12-23T07:50:28Z

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

sidhdirenge added the build label (Triggers github actions build) Dec 13, 2024

sidhdirenge requested review from masoud-io and tivv December 13, 2024 14:16

masoud-io reviewed Dec 16, 2024

sidhdirenge self-assigned this Dec 16, 2024

sidhdirenge requested a review from masoud-io December 16, 2024 18:00

tivv reviewed Dec 18, 2024

sidhdirenge requested a review from tivv December 19, 2024 17:26

tivv approved these changes Dec 20, 2024

[CDAP-21079] Implement cdap-messaging-spi extension for spanner - Part 2

9c1ed72

sidhdirenge force-pushed the spanner-messaging-2 branch from 64ac98d to 9c1ed72 Compare December 23, 2024 06:11

sidhdirenge merged commit 1ef91ed into develop Dec 23, 2024
9 of 10 checks passed

sidhdirenge deleted the spanner-messaging-2 branch December 23, 2024 07:18

sidhdirenge commented Dec 13, 2024

tivv left a comment

tivv Dec 18, 2024

sidhdirenge Dec 19, 2024

sidhdirenge Dec 19, 2024

itsankit-google Dec 19, 2024

tivv Dec 19, 2024

sidhdirenge Dec 20, 2024

masoud-io Dec 20, 2024

tivv Dec 20, 2024

sidhdirenge commented Dec 19, 2024

sonarqubecloud bot commented Dec 23, 2024
