本文是对MySQL B+树根页(root page)的详细学习。
参考:https://help.aliyun.com/zh/polardb/polardb-for-mysql/innodb-space-files
源码:https://fossies.org/dox/mysql-9.7.0/btr0btr_8cc_source.html
根页相关的内容可在源码里查看btr_compress和btr_root_raise_and_insert相关内容。
起因是和朋友交流的时候说在MySQL里查数据,走索引从根页开始,根页号是不变的。
这立刻就引起了我的疑惑,如果根页的记录被删除了会怎么样呢?
这里直接说结论,根页号是一直不变的。
- MySQL5.7里数据库里用户自建表的主键索引B+树根页号为3,创建表时一起创建的二级索引根页号为4,后续新增的就会是一个未知数字。
- MySQL8里数据库里用户自建表的主键索引B+树根页号为4,创建表时一起创建的二级索引根页号为5,后续新增的就会是一个未知数字。
可以在数据库里用如下SQL语句查询。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
|
-- mysql8
SELECT INDEX_ID
,NAME
,PAGE_NO
FROM INFORMATION_SCHEMA.INNODB_INDEXES;
mysql> SELECT INDEX_ID
-> ,NAME
-> ,PAGE_NO
-> FROM INFORMATION_SCHEMA.INNODB_INDEXES;
+----------+---------------+---------+
| INDEX_ID | NAME | PAGE_NO |
+----------+---------------+---------+
| 152 | PRIMARY | 4 |
| 153 | PRIMARY | 4 |
| 154 | idx_last_name | 5 |
+----------+---------------+---------+
3 rows in set (0.00 sec)
mysql> select version();
+-------------------------+
| version() |
+-------------------------+
| 8.0.45-0ubuntu0.22.04.1 |
+-------------------------+
1 row in set (0.00 sec)
-- mysql5.7
SELECT INDEX_ID
,NAME
,PAGE_NO
FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES;
mysql> SELECT INDEX_ID,NAME,PAGE_NO FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES;
+----------+-----------------------------+---------+
| INDEX_ID | NAME | PAGE_NO |
+----------+-----------------------------+---------+
| 11 | ID_IND | 270 |
| 12 | FOR_IND | 271 |
| 13 | REF_IND | 272 |
| 14 | ID_IND | 273 |
| 15 | SYS_TABLESPACES_SPACE | 275 |
| 16 | SYS_DATAFILES_SPACE | 276 |
| 17 | BASE_IDX | 278 |
| 18 | SYS_ZIP_DICT_ID | 280 |
| 19 | SYS_ZIP_DICT_NAME | 281 |
| 20 | SYS_ZIP_DICT_COLS_COMPOSITE | 282 |
| 21 | PRIMARY | 3 |
| 22 | PRIMARY | 3 |
| 34 | PRIMARY | 3 |
| 35 | PRIMARY | 3 |
| 36 | PRIMARY | 3 |
| 37 | PRIMARY | 3 |
| 38 | PRIMARY | 3 |
| 39 | PRIMARY | 3 |
| 40 | PRIMARY | 3 |
| 41 | PRIMARY | 3 |
| 42 | PRIMARY | 3 |
| 23 | PRIMARY | 3 |
| 24 | name | 4 |
| 25 | PRIMARY | 3 |
| 26 | name | 4 |
| 28 | PRIMARY | 3 |
| 29 | name | 4 |
| 27 | PRIMARY | 3 |
| 43 | PRIMARY | 3 |
| 31 | PRIMARY | 3 |
| 30 | PRIMARY | 3 |
| 32 | PRIMARY | 3 |
| 33 | PRIMARY | 3 |
| 44 | PRIMARY | 3 |
| 45 | idx_last_name | 4 |
+----------+-----------------------------+---------+
35 rows in set (0.00 sec)
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.19-17 |
+-----------+
1 row in set (0.00 sec)
|
https://planet.mysql.com/entry/?id=35781
An index tree starts at a “root” page, whose location is fixed (and permanently stored in the InnoDB’s data dictionary) as a starting point for accessing the tree. The tree may be as small as the single root page, or as large as many millions of pages in a multi-level tree.
由于B+树是M叉树,且非叶子节点中存有冗余的目录项(键值),又因为单条 delete 只是打标记,后续 purge 掉后只要页内填充率仍大于 MERGE_THRESHOLD(默认50%),btr_compress 直接返回 false,什么也不做。所以一条记录的删除大概率不会影响根页结构。
MERGE_THRESHOLD 本身是每个索引的属性,默认 50%,存在 InnoDB 数据字典中,并在 DDL 时可配(CREATE/ALTER TABLE … INDEX … COMMENT ‘merge_threshold=30’ 等)。官方文档有描述,并且在 5.7 的 INNODB_SYS_INDEXES 视图里也有字段。
当大量删除数据后,数据页数据量占用小于50%会触发页合并,从叶子节点开始合并,优先合并左兄弟,再合并右兄弟,整层只有自己时,会将当前层这唯一一页的数据上移到父页,然后释放当前页。如果父页是根页,则根页被原地覆写,树的高度减一。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
|
// 优先合并左兄弟
is_left = btr_can_merge_with_page(cursor, left_page_no, &merge_block, mtr);
// 左兄弟无法被合并时合并右兄弟
DBUG_EXECUTE_IF("ib_always_merge_right", is_left = false;);
retry:
if (!is_left &&
!btr_can_merge_with_page(cursor, right_page_no, &merge_block, mtr)) {
if (!merge_block) {
merge_page = nullptr;
}
goto err_exit;
if (left_page_no == FIL_NULL && right_page_no == FIL_NULL) {
/* The page is the only one on the level, lift the records
to the father */
merge_block = btr_lift_page_up(index, block, mtr);
goto func_exit;
}
/** Discards a page that is the only page on its level. This will empty
the whole B-tree, leaving just an empty root page. This function
should never be reached, because btr_compress(), which is invoked in
delete operations, calls btr_lift_page_up() to flatten the B-tree. */
|
只有当删除大量数据时,子页合并到父页,一直合并到根页时,根页号也不会改变,只是高度会减少
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
|
// 插入数据引起根页分裂,树层级+1
/* Rebuild the root page to get free space */
btr_page_empty(root_block, root_page_zip, index, level + 1, mtr);
// 删除数据引起数据页合并到根页,清空根页,从根页的子页拷贝记录到根页
/* Make the father empty */
btr_page_empty(father_block, father_page_zip, index, page_level, mtr);
page_level++;
/* Copy the records to the father page one by one. */
if (false
#ifdef UNIV_ZIP_COPY
|| father_page_zip
#endif /* UNIV_ZIP_COPY */
|| !page_copy_rec_list_end(father_block, block,
page_get_infimum_rec(page), index, mtr)) {
const page_zip_des_t *page_zip = buf_block_get_page_zip(block);
ut_a(father_page_zip);
ut_a(page_zip);
/* Copy the page byte for byte. */
page_zip_copy_recs(father_page_zip, father_page, page_zip, page, index,
mtr);
/* Update the lock table and possible hash index. */
if (!dict_table_is_locking_disabled(index->table)) {
lock_move_rec_list_end(father_block, block, page_get_infimum_rec(page));
}
/* Also update the predicate locks */
if (dict_index_is_spatial(index)) {
lock_prdt_rec_move(father_block, block);
}
btr_search_update_hash_on_move(father_block, block, index);
}
|