Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update nestcollectionsdriver.py #57

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

vsarathy1
Copy link
Contributor

No description provided.

@vsarathy1 vsarathy1 requested a review from d-nagy November 7, 2024 18:35
Copy link
Contributor

@d-nagy d-nagy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vsarathy1, looks pretty good overall. I had one concern about the regex substitution for making the transactions access the correct scope, and then this turned into a more substantial question about whether this is how we should deal with the transactions.

Comment on lines +437 to +438
if self.schema == constants.CH2_DRIVER_SCHEMA["CH2P"]:
stmt = re.sub("default:bench.ch2pp.", "default:bench.ch2p.", statement)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not need the equivalent here for the original ch2 schema?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in the regex pattern we should escape the .s so that we actually match literal . characters instead of anything.

i.e.

stmt = re.sub("default:bench\.ch2pp\.", "default:bench.ch2p.", statement)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative approach to all of this could also be to supply a query_context when submitting N1QL queries (see https://docs.couchbase.com/cloud/n1ql/n1ql-manage/query-settings.html#query_context, and also https://docs.couchbase.com/cloud/n1ql/n1ql-language-reference/prepare.html#query-context for some info about query contexts for prepared statements)

With this approach, all the transaction statement definitions can just use not-fully-qualified table names like orders, and the query_context can be set to constants.CH2_NAMESPACE + ":" + constants.CH2_BUCKET + "." self.schema.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the second link above, prepared statements are also "stored relative to the current query context", which seems to mean that if we prepare queries while providing a specific query context like default:bench.ch2pp then we can prepare another query with the same name with a different query context and they won't interfere.

A prepared statement is created and stored relative to the current query context. You can create multiple prepared statements with the same name, each stored relative to a different query context. This enables you to run multiple instances of the same application against different datasets.

To execute a prepared statement, the query context must be the same as it was when the prepared statement was created; otherwise the prepared statement will not be found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have CH2_TXN_QUERIES for CH2, and CH2PP_TXN_QUERIES for both CH2P and CH2PP, so CH2 and CH2PP are taken care of correctly.
We handle it like this:
if self.schema == constants.CH2_DRIVER_SCHEMA["CH2"]:
txnQueries = CH2_TXN_QUERIES
else:
txnQueries = CH2PP_TXN_QUERIES
for txn, queries in txnQueries.items():
for query, statement in queries.items():
stmt = statement
if self.schema == constants.CH2_DRIVER_SCHEMA["CH2P"]:
stmt = re.sub("default:bench.ch2pp.", "default:bench.ch2p.", statement)

Instead of duplicating and adding another dictionary for CH2P, I replace ch2pp with ch2p.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the change to escape the .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. This was inherited from the nestcollectionsdriver written by the query service team for the transactions part, so that is the history of this.
Your suggestion to set the correct query context is fantastic, but I am not sure how to set this via the rest api. If you can find some way to do it, we can add it later, but for now, let us pass on this if ok with you.

@@ -1094,34 +599,34 @@ def doDelivery(self, params):

result = [ ]
for d_id in range(1, constants.DISTRICTS_PER_WAREHOUSE+1):
rs1, status = runNQuery("begin", self.prepared_dict[ txn + "beginWork"],"",self.delivery_txtimeout, randomhost)
rs1, status = runNQuery("begin", self.prepared_dict[ self.schema + txn + "beginWork"],"",self.delivery_txtimeout, randomhost)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having self.schema + txn everywhere, we could just define txn = self.schema + "DELIVERY" initially.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

Comment on lines +1078 to +1079
stTime = int(time.time() * 1000)
stTime = int(time.time_ns() / 1000000)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need one of these (same goes for eTime)


# In benchmark run mode, if the duration has elapsed, stop executing queries
if duration is not None:
if start > endBenchmarkTime:
logging.debug("%s started at: %s (started after the duration of the benchmark)" % (query_id_str, startTime))
break

logging.info("%s started at: %s" % (query_id_str, startTime))
logging.info("%s started at: %s %d" % (query_id_str, startTime, stTime))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the idea behind logging stTime? I presume the log line will look something like so:

11-12-2024 19:02:25 [runCH2Queries:1580] INFO : AClient 1:Loop 1:Q02: started at: 19:02:25 1731501546163

If we want to log the query start time with more precision, then we could do something like:

startTime = datetime.fromtimestamp(start).strftime("%H:%M:%S.%f")

Note that this does require from datetime import datetime.

This would make use log something like so:

11-12-2024 19:02:25 [runCH2Queries:1580] INFO : AClient 1:Loop 1:Q02: started at: 19:02:25.123456

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this applies to eTime too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, this was some debugging I had put in for very short running queries while testing with very small data, and forgot to remove these.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants