Only execute access rights change queries on a single node #257

aris-aiven · 2024-11-07T18:15:23Z

For the read-only step, we're filtering by replicated users, whose grant information is stored in ZooKeeper.

This means that executing the query on multiple nodes is counterproductive. Instead of achieving faster information propagation, we're getting transaction errors from ZooKeeper. Any one of these can in turn fail the step.

Here's one such example:

Query failed: b'REVOKE INSERT, ALTER UPDATE, ALTER DELETE ON `default`.`keeper_map` FROM `alice`' on <host>:
status_code=500,
exception_code=999,
response=b'Code: 999. Coordination::Exception: Transaction failed: Op #0, path: /clickhouse/access/uuid/914a9caa-682f-5dce-4ed7-b0f16f564798. (KEEPER_EXCEPTION) (version 24.3.5.1)\n'
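
To make the difference concrete, here is a minimal sketch contrasting the two execution strategies. It is not the Astacus implementation: `Client`, `FakeClient`, `execute_on_all`, `execute_on_one`, and `demo` are hypothetical names used only for illustration, and the fake client merely records queries instead of talking to ClickHouse.

```python
import asyncio
from typing import Protocol, Sequence


class Client(Protocol):
    """Hypothetical stand-in for the ClickHouse client used by the step."""

    async def execute(self, query: bytes) -> None: ...


class FakeClient:
    """Test double that records the queries it receives."""

    def __init__(self) -> None:
        self.queries: list[bytes] = []

    async def execute(self, query: bytes) -> None:
        self.queries.append(query)


async def execute_on_all(clients: Sequence[Client], statement: bytes) -> None:
    # Fan-out: every node issues the same write for a replicated user,
    # so the writes can race on the same ZooKeeper path.
    await asyncio.gather(*(client.execute(statement) for client in clients))


async def execute_on_one(clients: Sequence[Client], statement: bytes) -> None:
    # Single node: one write is issued, and the replicated access storage
    # is relied upon to make it visible to the other nodes.
    await clients[0].execute(statement)


def demo() -> tuple[list[int], list[int]]:
    statement = b"REVOKE INSERT ON `default`.`keeper_map` FROM `alice`"
    fan_out = [FakeClient() for _ in range(3)]
    single = [FakeClient() for _ in range(3)]
    asyncio.run(execute_on_all(fan_out, statement))
    asyncio.run(execute_on_one(single, statement))
    # Number of queries each node received under each strategy.
    return [len(c.queries) for c in fan_out], [len(c.queries) for c in single]
```

With three nodes, the fan-out variant sends three competing writes for the same grant, while the single-node variant sends exactly one.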

Khatskevich · 2024-11-07T18:33:37Z

astacus/coordinator/plugins/clickhouse/steps.py


    async def grant_write_on_table(self, table: Table, user_name: bytes) -> None:
        escaped_user_name = escape_sql_identifier(user_name)
        grant_statement = (
            f"GRANT INSERT, ALTER UPDATE, ALTER DELETE ON {table.escaped_sql_identifier} TO {escaped_user_name}"
        )
-        await asyncio.gather(*(client.execute(grant_statement.encode()) for client in self.clients))
+        await self.clients[0].execute(grant_statement.encode())


This change looks very controversial to me.

Yes, in principle it is not necessary to change privileges multiple times, but we do not control how fast the change reaches the other servers, do we? Theoretically it might be delayed for an unknown amount of time, which undermines all guarantees.

I really want to continue this discussion. So far I think it is better to just delete this step completely...

I mean the cluster is in a single region, and ZooKeeper is strictly serializable, so the upper bound on the latency for the change to go through in ZK is under 5 ms.

Actually I don't think the deduplication log is relevant for the restore. The log is saved in ZooKeeper, and the restore process doesn't start from a ZK snapshot but from an empty state. So the block-level information is lost.

Will test if in practice the argument I just provided stands, but the main branch is broken at the moment and I'd rather we fix it. If the step is irrelevant, then of course we'll drop it.

ZooKeeper only guarantees serialization of write operations; reads can propagate slowly.
As I understand it, a change first reaches the local ZooKeeper node and only then ClickHouse, which still has to process it.
But it looks like this change does not effectively make things worse, you are right.

The solution is not easy, so let's start with this one.
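
One way to probe the propagation concern discussed above is to poll every node until the change is visible or a timeout expires. The sketch below is only an illustration under stated assumptions: `wait_until_visible`, `FakeNode`, and `demo` are hypothetical names, and each check is an arbitrary async callable rather than a real ClickHouse query.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

Check = Callable[[], Awaitable[bool]]


async def wait_until_visible(
    checks: Sequence[Check], timeout: float = 1.0, interval: float = 0.01
) -> bool:
    """Poll each check until all return True, or give up after `timeout` seconds."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    pending = list(checks)
    while True:
        still_pending = []
        for check in pending:
            if not await check():
                still_pending.append(check)
        pending = still_pending
        if not pending:
            return True
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(interval)


class FakeNode:
    """Hypothetical node that sees the change after a number of polls."""

    def __init__(self, visible_after: int) -> None:
        self.remaining = visible_after

    async def grant_visible(self) -> bool:
        if self.remaining > 0:
            self.remaining -= 1
            return False
        return True


def demo() -> tuple[bool, bool]:
    nodes = [FakeNode(0), FakeNode(2), FakeNode(5)]
    converged = asyncio.run(
        wait_until_visible([n.grant_visible for n in nodes], timeout=1.0, interval=0.001)
    )
    lagging = FakeNode(10**9)  # effectively never catches up
    gave_up = asyncio.run(
        wait_until_visible([lagging.grant_visible], timeout=0.05, interval=0.001)
    )
    return converged, gave_up
```

In a real test, each check would presumably query one node for the user's grants; how that query looks is left open here.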

aris-aiven marked this pull request as ready for review November 7, 2024 18:16

aris-aiven requested a review from a team November 7, 2024 18:17

Khatskevich reviewed Nov 7, 2024

aris-aiven requested review from a team and Khatskevich November 7, 2024 20:08

Khatskevich approved these changes Nov 7, 2024

Khatskevich merged commit 28484bc into main Nov 7, 2024
2 checks passed

Khatskevich deleted the aris-fix-zk-transaction-error branch November 7, 2024 22:19
