从MySQL B+树根页号不变说开去

本文是对MySQL B+树根页(root page)的详细学习。

参考:https://help.aliyun.com/zh/polardb/polardb-for-mysql/innodb-space-files

源码:https://fossies.org/dox/mysql-9.7.0/btr0btr_8cc_source.html

根页相关的内容可在源码里查看btr_compressbtr_root_raise_and_insert相关内容。

起因是和朋友交流的时候说在MySQL里查数据,走索引从根页开始,根页号是不变的。 这立刻就引起了我的疑惑,如果根页的记录被删除了会怎么样呢?

这里直接说结论,根页号是一直不变的。

  • MySQL5.7里数据库里用户自建表的主键索引B+树根页号为3,创建表时一起创建的二级索引根页号为4,后续新增的就会是一个未知数字。
  • MySQL8里数据库里用户自建表的主键索引B+树根页号为4,创建表时一起创建的二级索引根页号为5,后续新增的就会是一个未知数字。

可以在数据库里用如下SQL语句查询。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
-- mysql8
SELECT INDEX_ID
       ,NAME
       ,PAGE_NO
  FROM INFORMATION_SCHEMA.INNODB_INDEXES;


mysql> SELECT INDEX_ID
    ->        ,NAME
    ->        ,PAGE_NO
    ->   FROM INFORMATION_SCHEMA.INNODB_INDEXES;
+----------+---------------+---------+
| INDEX_ID | NAME          | PAGE_NO |
+----------+---------------+---------+
|      152 | PRIMARY       |       4 |
|      153 | PRIMARY       |       4 |
|      154 | idx_last_name |       5 |
+----------+---------------+---------+
3 rows in set (0.00 sec)

mysql> select version();
+-------------------------+
| version()               |
+-------------------------+
| 8.0.45-0ubuntu0.22.04.1 |
+-------------------------+
1 row in set (0.00 sec)


-- mysql5.7
SELECT INDEX_ID
       ,NAME
       ,PAGE_NO
  FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES;

mysql> SELECT INDEX_ID,NAME,PAGE_NO  FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES;
+----------+-----------------------------+---------+
| INDEX_ID | NAME                        | PAGE_NO |
+----------+-----------------------------+---------+
|       11 | ID_IND                      |     270 |
|       12 | FOR_IND                     |     271 |
|       13 | REF_IND                     |     272 |
|       14 | ID_IND                      |     273 |
|       15 | SYS_TABLESPACES_SPACE       |     275 |
|       16 | SYS_DATAFILES_SPACE         |     276 |
|       17 | BASE_IDX                    |     278 |
|       18 | SYS_ZIP_DICT_ID             |     280 |
|       19 | SYS_ZIP_DICT_NAME           |     281 |
|       20 | SYS_ZIP_DICT_COLS_COMPOSITE |     282 |
|       21 | PRIMARY                     |       3 |
|       22 | PRIMARY                     |       3 |
|       34 | PRIMARY                     |       3 |
|       35 | PRIMARY                     |       3 |
|       36 | PRIMARY                     |       3 |
|       37 | PRIMARY                     |       3 |
|       38 | PRIMARY                     |       3 |
|       39 | PRIMARY                     |       3 |
|       40 | PRIMARY                     |       3 |
|       41 | PRIMARY                     |       3 |
|       42 | PRIMARY                     |       3 |
|       23 | PRIMARY                     |       3 |
|       24 | name                        |       4 |
|       25 | PRIMARY                     |       3 |
|       26 | name                        |       4 |
|       28 | PRIMARY                     |       3 |
|       29 | name                        |       4 |
|       27 | PRIMARY                     |       3 |
|       43 | PRIMARY                     |       3 |
|       31 | PRIMARY                     |       3 |
|       30 | PRIMARY                     |       3 |
|       32 | PRIMARY                     |       3 |
|       33 | PRIMARY                     |       3 |
|       44 | PRIMARY                     |       3 |
|       45 | idx_last_name               |       4 |
+----------+-----------------------------+---------+
35 rows in set (0.00 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.19-17 |
+-----------+
1 row in set (0.00 sec)

https://planet.mysql.com/entry/?id=35781

An index tree starts at a “root” page, whose location is fixed (and permanently stored in the InnoDB’s data dictionary) as a starting point for accessing the tree. The tree may be as small as the single root page, or as large as many millions of pages in a multi-level tree.

由于B+树是M叉树,且非叶子节点中存有冗余的目录项(键值),又因为单条 delete 只是打标记,后续 purge 掉后只要页内填充率仍大于 MERGE_THRESHOLD(默认50%),btr_compress 直接返回 false,什么也不做。所以一条记录的删除大概率不会影响根页结构。

MERGE_THRESHOLD 本身是每个索引的属性,默认 50%,存在 InnoDB 数据字典中,并在 DDL 时可配(CREATE/ALTER TABLE … INDEX … COMMENT ‘merge_threshold=30’ 等)。官方文档有描述,并且在 5.7 的 INNODB_SYS_INDEXES 视图里也有字段。

当大量删除数据后,数据页数据量占用小于50%会触发页合并,从叶子节点开始合并,优先合并左兄弟,再合并右兄弟,整层只有自己时,会将当前层这唯一一页的数据上移到父页,然后释放当前页。如果父页是根页,则根页被原地覆写,树的高度减一。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// 优先合并左兄弟
is_left = btr_can_merge_with_page(cursor, left_page_no, &merge_block, mtr);

// 左兄弟无法被合并时合并右兄弟
  DBUG_EXECUTE_IF("ib_always_merge_right", is_left = false;);
retry:
  if (!is_left &&
      !btr_can_merge_with_page(cursor, right_page_no, &merge_block, mtr)) {
    if (!merge_block) {
      merge_page = nullptr;
    }
    goto err_exit;


  if (left_page_no == FIL_NULL && right_page_no == FIL_NULL) {
    /* The page is the only one on the level, lift the records
    to the father */
 
    merge_block = btr_lift_page_up(index, block, mtr);
    goto func_exit;
  }

/** Discards a page that is the only page on its level.  This will empty
 the whole B-tree, leaving just an empty root page.  This function
 should never be reached, because btr_compress(), which is invoked in
 delete operations, calls btr_lift_page_up() to flatten the B-tree. */

只有当删除大量数据时,子页合并到父页,一直合并到根页时,根页号也不会改变,只是高度会减少

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
// 插入数据引起根页分裂,树层级+1
 /* Rebuild the root page to get free space */
  btr_page_empty(root_block, root_page_zip, index, level + 1, mtr);


// 删除数据引起数据页合并到根页,清空根页,从根页的子页拷贝记录到根页
  /* Make the father empty */
  btr_page_empty(father_block, father_page_zip, index, page_level, mtr);
  page_level++;
 
  /* Copy the records to the father page one by one. */
  if (false
#ifdef UNIV_ZIP_COPY
      || father_page_zip
#endif /* UNIV_ZIP_COPY */
      || !page_copy_rec_list_end(father_block, block,
                                 page_get_infimum_rec(page), index, mtr)) {
    const page_zip_des_t *page_zip = buf_block_get_page_zip(block);
    ut_a(father_page_zip);
    ut_a(page_zip);
 
    /* Copy the page byte for byte. */
    page_zip_copy_recs(father_page_zip, father_page, page_zip, page, index,
                       mtr);
 
    /* Update the lock table and possible hash index. */
 
    if (!dict_table_is_locking_disabled(index->table)) {
      lock_move_rec_list_end(father_block, block, page_get_infimum_rec(page));
    }
 
    /* Also update the predicate locks */
    if (dict_index_is_spatial(index)) {
      lock_prdt_rec_move(father_block, block);
    }
 
    btr_search_update_hash_on_move(father_block, block, index);
  }  
页面浏览量Loading
网站总访客数:Loading
网站总访问量:Loading
使用 Hugo 构建
主题 StackJimmy 设计