Update nestcollectionsdriver.py #57
base: main
Conversation
Hi @vsarathy1, looks pretty good overall. I had one concern about the regex substitution for making the transactions access the correct scope, and then this turned into a more substantial question about whether this is how we should deal with the transactions.
if self.schema == constants.CH2_DRIVER_SCHEMA["CH2P"]:
    stmt = re.sub("default:bench.ch2pp.", "default:bench.ch2p.", statement)
Do we not need the equivalent here for the original ch2 schema?
Also, in the regex pattern we should escape the .s so that we actually match literal . characters instead of matching any character, i.e.:

    stmt = re.sub(r"default:bench\.ch2pp\.", "default:bench.ch2p.", statement)
An alternative approach to all of this could be to supply a query_context when submitting N1QL queries (see https://docs.couchbase.com/cloud/n1ql/n1ql-manage/query-settings.html#query_context, and also https://docs.couchbase.com/cloud/n1ql/n1ql-language-reference/prepare.html#query-context for some info about query contexts for prepared statements).

With this approach, all the transaction statement definitions can just use not-fully-qualified table names like orders, and the query_context can be set to constants.CH2_NAMESPACE + ":" + constants.CH2_BUCKET + "." + self.schema.
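A minimal sketch of what this suggestion looks like, assuming the constants hold the values used elsewhere in this thread (the literal values and the example statement are illustrative, not taken from the driver):

```python
# Stand-ins for constants.CH2_NAMESPACE, constants.CH2_BUCKET, and self.schema
# (assumed values for illustration only).
CH2_NAMESPACE = "default"
CH2_BUCKET = "bench"
schema = "ch2pp"

# Build the query context once; statements can then use bare table names.
query_context = CH2_NAMESPACE + ":" + CH2_BUCKET + "." + schema

# Hypothetical N1QL request payload carrying the query_context parameter.
payload = {
    "statement": "SELECT * FROM orders",  # no default:bench.ch2pp.orders qualification needed
    "query_context": query_context,
}
print(query_context)  # default:bench.ch2pp
```

Switching schemas would then only require changing the query_context string, not rewriting every statement.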
According to the second link above, prepared statements are also "stored relative to the current query context", which seems to mean that if we prepare queries while providing a specific query context like default:bench.ch2pp, then we can prepare another query with the same name under a different query context and they won't interfere:
A prepared statement is created and stored relative to the current query context. You can create multiple prepared statements with the same name, each stored relative to a different query context. This enables you to run multiple instances of the same application against different datasets.
To execute a prepared statement, the query context must be the same as it was when the prepared statement was created; otherwise the prepared statement will not be found.
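To make that concrete, here is a hypothetical pair of request payloads (statement text and names are made up for illustration): the same prepared-statement name under two different query contexts, with execution supplying the matching context.

```python
# Two PREPARE requests with the same statement name but different query
# contexts; per the quoted docs, these are stored separately.
prepare_ch2pp = {
    "statement": "PREPARE delivery_txn FROM SELECT * FROM orders",
    "query_context": "default:bench.ch2pp",
}
prepare_ch2p = {
    "statement": "PREPARE delivery_txn FROM SELECT * FROM orders",
    "query_context": "default:bench.ch2p",
}

# Executing a prepared statement must supply the same context it was
# prepared under, or it will not be found.
execute_ch2p = {
    "prepared": "delivery_txn",
    "query_context": "default:bench.ch2p",
}
```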
We have CH2_TXN_QUERIES for CH2, and CH2PP_TXN_QUERIES for both CH2P and CH2PP, so CH2 and CH2PP are taken care of correctly.
We handle it like this:
if self.schema == constants.CH2_DRIVER_SCHEMA["CH2"]:
    txnQueries = CH2_TXN_QUERIES
else:
    txnQueries = CH2PP_TXN_QUERIES

for txn, queries in txnQueries.items():
    for query, statement in queries.items():
        stmt = statement
        if self.schema == constants.CH2_DRIVER_SCHEMA["CH2P"]:
            stmt = re.sub("default:bench.ch2pp.", "default:bench.ch2p.", statement)
Instead of duplicating and adding another dictionary for CH2P, I replace ch2pp with ch2p.
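A small self-contained check of that substitution, with the dots escaped as suggested earlier in the thread (the statement text is a made-up example, not one of the actual transaction queries):

```python
import re

# Rewriting a ch2pp-qualified statement to target the ch2p scope.
# Escaping the dots ensures only literal "." characters are matched.
statement = "SELECT * FROM default:bench.ch2pp.orders"
stmt = re.sub(r"default:bench\.ch2pp\.", "default:bench.ch2p.", statement)
print(stmt)  # SELECT * FROM default:bench.ch2p.orders
```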
Made the change to escape the . characters.
Agreed. This was inherited from the nestcollectionsdriver written by the query service team for the transactions part, so that is the history of this.
Your suggestion to set the correct query context is fantastic, but I am not sure how to set this via the rest api. If you can find some way to do it, we can add it later, but for now, let us pass on this if ok with you.
@@ -1094,34 +599,34 @@ def doDelivery(self, params):
     result = [ ]
     for d_id in range(1, constants.DISTRICTS_PER_WAREHOUSE+1):
-        rs1, status = runNQuery("begin", self.prepared_dict[ txn + "beginWork"],"",self.delivery_txtimeout, randomhost)
+        rs1, status = runNQuery("begin", self.prepared_dict[ self.schema + txn + "beginWork"],"",self.delivery_txtimeout, randomhost)
Instead of having self.schema + txn everywhere, we could just define txn = self.schema + "DELIVERY" initially.
Good idea!
-        stTime = int(time.time() * 1000)
+        stTime = int(time.time_ns() / 1000000)
We only need one of these (same goes for eTime).
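A quick sanity check that the two expressions in the diff compute the same millisecond timestamp, so keeping only one of them loses nothing (deriving both from a single time_ns() reading keeps the comparison deterministic):

```python
import time

ns = time.time_ns()
a = int((ns / 1_000_000_000) * 1000)  # the time.time() * 1000 style (float path)
b = ns // 1_000_000                   # the time.time_ns() // 1000000 style
assert abs(a - b) <= 1  # at most 1 ms apart, due to float rounding
```

The time_ns() variant avoids the float rounding entirely, which is presumably why it was introduced.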
         # In benchmark run mode, if the duration has elapsed, stop executing queries
         if duration is not None:
             if start > endBenchmarkTime:
                 logging.debug("%s started at: %s (started after the duration of the benchmark)" % (query_id_str, startTime))
                 break

-        logging.info("%s started at: %s" % (query_id_str, startTime))
+        logging.info("%s started at: %s %d" % (query_id_str, startTime, stTime))
What's the idea behind logging stTime? I presume the log line will look something like so:

11-12-2024 19:02:25 [runCH2Queries:1580] INFO : AClient 1:Loop 1:Q02: started at: 19:02:25 1731501546163

If we want to log the query start time with more precision, then we could do something like:

    startTime = datetime.fromtimestamp(start).strftime("%H:%M:%S.%f")

Note that this does require from datetime import datetime.

This would make us log something like so:

11-12-2024 19:02:25 [runCH2Queries:1580] INFO : AClient 1:Loop 1:Q02: started at: 19:02:25.123456
All this applies to eTime too.
Good catch. This was some debugging I had put in for very short-running queries while testing with very small data, and I forgot to remove it.