Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PS-9297 Lost a record while scanning in reverse order on cursor #5348

Open
wants to merge 1 commit into
base: release-8.0.37-29
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 42 additions & 0 deletions mysql-test/suite/innodb/r/percona_bug_ps9297.result
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
create table c(id int primary key, val varchar(8000));
insert into c values (1,repeat('a',3000));
insert into c values (2,repeat('a',3000));
insert into c values (3,repeat('a',3000));
insert into c values (4,repeat('a',3000));
insert into c values (5,repeat('a',3000));
insert into c values (-1,repeat('a',2000));
insert into c values (-2,repeat('a',2000));
insert into c values (-3,repeat('a',2000));
insert into c values (-4,repeat('a',2000));
insert into c values (-5,repeat('a',2000));
select id,length(val) from c order by id desc;
id length(val)
5 3000
4 3000
3 3000
2 3000
1 3000
-1 2000
-2 2000
-3 2000
-4 2000
-5 2000
set DEBUG="+d,desc_scan_debug";
set DEBUG_SYNC="desc_scan_before_restore_position SIGNAL desc_scan_before_restore_position_done WAIT_FOR continue";
select id,length(val) from c order by id desc;
set DEBUG_SYNC="now WAIT_FOR desc_scan_before_restore_position_done";
update c set val=repeat('b',6000) where id=2;
set DEBUG_SYNC="now SIGNAL continue";
id length(val)
5 3000
4 3000
3 3000
2 3000
1 3000
-1 2000
-2 2000
-3 2000
-4 2000
-5 2000
set DEBUG="-d,desc_scan_debug";
drop table c;
31 changes: 31 additions & 0 deletions mysql-test/suite/innodb/t/percona_bug_ps9297.test
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
--source include/have_debug.inc

connect (con1, localhost, root,,);
--connection default
create table c(id int primary key, val varchar(8000));
insert into c values (1,repeat('a',3000));
insert into c values (2,repeat('a',3000));
insert into c values (3,repeat('a',3000));
insert into c values (4,repeat('a',3000));
insert into c values (5,repeat('a',3000));

insert into c values (-1,repeat('a',2000));
insert into c values (-2,repeat('a',2000));
insert into c values (-3,repeat('a',2000));
insert into c values (-4,repeat('a',2000));
insert into c values (-5,repeat('a',2000));

--connection con1
select id,length(val) from c order by id desc;
set DEBUG="+d,desc_scan_debug";
set DEBUG_SYNC="desc_scan_before_restore_position SIGNAL desc_scan_before_restore_position_done WAIT_FOR continue";
send select id,length(val) from c order by id desc;
--connection default
set DEBUG_SYNC="now WAIT_FOR desc_scan_before_restore_position_done";
update c set val=repeat('b',6000) where id=2;
set DEBUG_SYNC="now SIGNAL continue";
--connection con1
reap;
set DEBUG="-d,desc_scan_debug";
--connection default
drop table c;
14 changes: 14 additions & 0 deletions storage/innobase/btr/btr0cur.cc
Original file line number Diff line number Diff line change
Expand Up @@ -3849,6 +3849,7 @@ dberr_t btr_cur_pessimistic_update(ulint flags, btr_cur_t *cursor,
dberr_t optim_err;
roll_ptr_t roll_ptr;
bool was_first;
bool was_before_last = false;
ulint n_reserved = 0;
ulint max_ins_size = 0;
trx_t *const trx = (thr == nullptr) ? nullptr : thr_get_trx(thr);
Expand Down Expand Up @@ -4155,6 +4156,10 @@ dberr_t btr_cur_pessimistic_update(ulint flags, btr_cur_t *cursor,
record on its page? */
was_first = page_cur_is_before_first(page_cursor);

was_before_last =
!page_rec_is_supremum(btr_cur_get_rec(cursor)) &&
page_rec_is_supremum(page_rec_get_next(btr_cur_get_rec(cursor)));

/* Lock checks and undo logging were already performed by
btr_cur_upd_lock_and_undo(). We do not try
btr_cur_optimistic_insert() because
Expand All @@ -4169,6 +4174,15 @@ dberr_t btr_cur_pessimistic_update(ulint flags, btr_cur_t *cursor,
ut_ad(rec_offs_validate(rec, cursor->index, *offsets));
page_cursor->rec = rec;

if (was_before_last) {
/** We may moved the record from the current page to the next page.
* Therefore, we need to disable optimistic access to the next page to
* ensure that the record will not be mistakenly skipped during reverse
* scanning on cursor.
*/
buf_block_modify_clock_inc(btr_cur_get_block(cursor));
Copy link
Contributor

@satya-bodapati satya-bodapati Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While PS-9092 blocked the merge(that caused records movement), this bug shows that records are moved without a merge.

But when a record is moved pessimistically, how can we assume it is transferred to a neighbour? You should limit this check if the movement is neighbour only.

If the record is moved to a non-neighbour, please check this case.
It should be protected by delete marking of record. ie we should read delete marked version of record and the new version that is moved to some far page is not visible/seen. This is OK.

To summarize:

  1. record movement due to merges (PS-9092)
  2. record movement to neighbour without merge (this PR )
  3. record movement to FAR pages (to be verified that delete marked version is read and new version is not SEEN).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, but in our observations, pessimistic updates are limited to the current page and its sibling pages. Whether it is inserted after splitting or inserted directly into its sibling page, the inserted page needs to be disabled for optimistic access, so we do not need stricter judgment here. In addition, the original record has been physically deleted from the original page, and no changes to the delete-mark will be involved.

Copy link
Contributor

@satya-bodapati satya-bodapati Jul 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's because for the query update c set val=repeat('b',6000) where id=2, the update only changes non-key fields, and the new record should always land on neighbour pages.

But if there are queries that update key, records can move with delete marking.
For example UPDATE c SET id = 20 WHERE id = 5;

Copy link
Author

@zhry110 zhry110 Jul 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether it is PS-9092 or PS-9297, the necessary condition to trigger the problem is that the same physical record is moved between adjacent pages. But UPDATE c SET id = 20 WHERE id = 5; actually del-marks the record with id=5 and inserts the record with id=20 within a transaction. These two records are not physically the same record. Therefore I think this scenario is not affected by this problem, the ReadView of the read transaction will ensure that the correct version of the record is visible.

}

/* Multiple transactions cannot simultaneously operate on the
same temp-table in parallel.
max_trx_id is ignored for temp tables because it not required
Expand Down
6 changes: 6 additions & 0 deletions storage/innobase/btr/btr0pcur.cc
Original file line number Diff line number Diff line change
Expand Up @@ -419,6 +419,12 @@ void btr_pcur_t::move_backward_from_page(mtr_t *mtr) {

mtr_commit(mtr);

DBUG_EXECUTE_IF("desc_scan_debug", {
if (strcmp(index()->table_name, "test/c") == 0) {
DEBUG_SYNC_C("desc_scan_before_restore_position");
}
});

mtr_start(mtr);

restore_position(latch_mode2, mtr, UT_LOCATION_HERE);
Expand Down