歡迎光臨
每天分享高質量文章

MySQL-8.0 | 資料字典最強解讀

公眾號展示代碼會自動折行,建議橫屏閱讀

1. 引言


 

資料字典(Data Dictionary)中儲存了諸多資料庫的元資料信息如圖1所示,包括基本Database, table, index, column, function, trigger, procedure,privilege等;以及與儲存引擎相關的元資料,如InnoDB的tablespace, table_id, index_id等。MySQL-8.0在資料字典上進行了諸多優化,本文將對其進行逐一介紹。

 

圖1

 

2. MySQL-8.0之前的資料字典


俗話說知己知彼,方能百戰不殆。在介紹MySQL-8.0的資料字典前,我們先一起回顧一下MySQL-8.0之前的資料字典。

 

2.1 Data Dictionary 分佈位置

 

圖2

 

如圖2所示,舊的資料字典信息分佈在server層,mysql庫下的系統表和InnoDB內部系統表三個地方,其中儲存的信息分別如下所示:

 

  • server層檔案
    • .frm files: Table metadata files.
    • .par files: Partition definition files. InnoDB stopped using partition definition files in MySQL 5.7 with the introduction of native partitioning support for InnoDB tables.
    • .TRN files: Trigger namespace files.
    • .TRG files: Trigger parameter files.
    • .isl files: InnoDB Symbolic Link files containing the location of file-per-table tablespace files created outside of the data directory.
    • .db.opt files: Database configuration files. These files, one per database directory, contained database default character set attributes.
  • mysql庫下的系統表

    mysql.user mysql.db mysql.proc mysql.event等
    show tables in mysql;

  • InnoDB內部系統表
    • SYS_DATAFILES
    • SYS_FOREIGN
    • SYS_FOREIGN_COLS
    • SYS_TABLESPACES
    • SYS_VIRTUAL  

 

2.2 存在的問題

 

  • 資料字典分散儲存,維護管理沒有統一接口
  • MyISAM系統表易損壞
  • DDL沒有原子性,server層與innodb層資料字典容易不一致
  • 檔案儲存資料字典擴展性不好
  • 通過information_schema查詢資料字典時生成臨時表不友好

 

3. MySQL-8.0的資料字典


鑒於舊資料字典的種種缺點,MySQL-8.0對資料字典進行了較大的改動:把所有的元資料信息都儲存在InnoDB dictionary table中,並且儲存在單獨的表空間mysql.ibd里,其架構如圖3所示。下麵逐一介紹各項改變的細節。

 

圖3

 

3.1 儲存結構

 

MySQL下的原有系統表由MyISAM轉為了InnoDB表,沒有了proc和event表,直接改存到了dictionary table中。在debug樣式下,可用如下指令查看dictionary tables:

 

SET SESSION debug='+d,skip_dd_table_access_check';select name from mysql.tables where hidden='System' and type='BASE TABLE';+------------------------------+| name                         |+------------------------------+| dd_properties                || innodb_dynamic_metadata      || innodb_ddl_log               || catalogs                     || character_sets               || collations                   || column_statistics            || column_type_elements         || columns                      || events                       || foreign_key_column_usage     || foreign_keys                 || index_column_usage           || index_partitions             || index_stats                  || indexes                      || parameter_type_elements      || parameters                   || resource_groups              || routines                     || schemata                     || st_spatial_reference_systems || table_partition_values       || table_partitions             || table_stats                  || tables                       || tablespace_files             || tablespaces                  || triggers                     || view_routine_usage           || view_table_usage             |+------------------------------+31 rows in set (0.01 sec)

 

3.2 Dictionary Object Cache

 

資料字典表信息可以通過全域性的cache進行快取。


show variables like '%definition%';+---------------------------------+-------+| Variable_name                   | Value |+---------------------------------+-------+| schema_definition_cache         | 256   || stored_program_definition_cache | 256   || table_definition_cache          | 2000  || tablespace_definition_cache     | 256   |+---------------------------------+-------+
  • table_definition_cache:儲存表定義
  • schema_definition_cache:儲存schema定義
  • stored_program_definition_cache:儲存proc和func定義
  • tablespace_definition_cache:儲存tablespace定義

 

另外還有character,collation,event,column_statistics也有cache,不過其大小硬編碼不可配置:


class Shared_dictionary_cache{... Shared_multi_map<Abstract_table> m_abstract_table_map; Shared_multi_map<Charset> m_charset_map; Shared_multi_map<Collation> m_collation_map; Shared_multi_map<Column_statistics> m_column_stat_map; Shared_multi_map<Event> m_event_map; Shared_multi_map<Resource_group> m_resource_group_map; Shared_multi_map<Routine> m_routine_map; Shared_multi_map<Schema> m_schema_map; Shared_multi_map<Spatial_reference_system> m_spatial_reference_system_map; Shared_multi_map<Tablespace> m_tablespace_map;...}

3.3 Information_schema

 

圖4

 

information_schema的變化如圖4所示,主要包括以下幾個方面:

 

1. information_schema部分表名變化

 

Old Name New Name
INNODB_SYS_COLUMNS INNODB_COLUMNS
INNODB_SYS_DATAFILES INNODB_DATAFILES
INNODB_SYS_FIELDS INNODB_FIELDS
INNODB_SYS_FOREIGN INNODB_FOREIGN
INNODB_SYS_FOREIGN_COLS INNODB_FOREIGN_COLS
INNODB_SYS_INDEXES INNODB_INDEXES
INNODB_SYS_TABLES INNODB_TABLES
INNODB_SYS_TABLESPACES INNODB_TABLESPACES
INNODB_SYS_TABLESTATS INNODB_TABLESTATS
INNODB_SYS_VIRTUAL INNODB_VIRTUAL

2. 通過information_schema查詢時不再需要生成臨時表獲取,而是直接從資料字典表獲取

show create table SCHEMATA\G*************************** 1. row ***************************             View: SCHEMATA      Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`mysql.infoschema`@`localhost` SQL SECURITY DEFINER VIEW `SCHEMATA` AS select `cat`.`name` AS `CATALOG_NAME`,`sch`.`name` AS `SCHEMA_NAME`,`cs`.`name` AS `DEFAULT_CHARACTER_SET_NAME`,`col`.`name` AS `DEFAULT_COLLATION_NAME`,NULL AS `SQL_PATH` from (((`mysql`.`schemata` `sch` join `mysql`.`catalogs` `cat` on((`cat`.`id` = `sch`.`catalog_id`))) join `mysql`.`collations` `col` on((`sch`.`default_collation_id` = `col`.`id`))) join `mysql`.`character_sets` `cs` on((`col`.`character_set_id` = `cs`.`id`))) where can_access_database(`sch`.`name`)character_set_client: utf8collation_connection: utf8_general_ci1 row in set (0.00 sec)

3. 不需要像以前一樣掃描檔案夾獲取資料庫串列,不需要打開frm檔案獲取表信息,而是直接從資料字典表獲取

 

4. information_schema查詢以view的形式展現,更利於優化器優化查詢


EXPLAIN SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 't1';+----+-------------+-------+------------+--------+--------------------+------------+---------+-------------------------+------+----------+-------------+| id | select_type | table | partitions | type   | possible_keys      | key        | key_len | ref                     | rows | filtered | Extra       |+----+-------------+-------+------------+--------+--------------------+------------+---------+-------------------------+------+----------+-------------+|  1 | SIMPLE      | cat   | NULL       | index  | PRIMARY            | name       | 194     | NULL                    |    1 |   100.00 | Using index ||  1 | SIMPLE      | sch   | NULL       | eq_ref | PRIMARY,catalog_id | catalog_id | 202     | mysql.cat.id,const      |    1 |   100.00 | Using index ||  1 | SIMPLE      | tbl   | NULL       | eq_ref | schema_id          | schema_id  | 202     | mysql.sch.id,const      |    1 |   100.00 | Using where ||  1 | SIMPLE      | stat  | NULL       | const  | PRIMARY            | PRIMARY    | 388     | const,const             |    1 |   100.00 | Using index ||  1 | SIMPLE      | ts    | NULL       | eq_ref | PRIMARY            | PRIMARY    | 8       | mysql.tbl.tablespace_id |    1 |   100.00 | Using index ||  1 | SIMPLE      | col   | NULL       | eq_ref | PRIMARY            | PRIMARY    | 8       | mysql.tbl.collation_id  |    1 |   100.00 | Using index |+----+-------------+-------+------------+--------+--------------------+------------+---------+-------------------------+------+----------+-------------+6 rows in set, 1 warning (0.00 sec)

3.4 優點

 

  • 去掉server層的元資料檔案,元資料統一儲存到InnoDB資料字典表,易於管理且crash-safe
  • 支持原子DDL
  • information_schema查詢更高效

 

4. Serialized Dictionary Information (SDI)


MySQL8.0不僅將元資料信息儲存在資料字典表中,同時也冗餘儲存了一份在SDI中。對於非InnoDB表,SDI資料在後綴為.sdi的檔案中,而對於innodb,SDI資料則直接儲存與ibd中,如以下例子所示:


create table t1(c1 int) engine=InnoDB;create table t2(c1 int) engine=MyISAM;
ll test/-rw-r----- 1 root root 114688 2月  22 17:47 t1.ibd-rw-r----- 1 root root   1495 2月  22 17:47 t2_337.sdi-rw-r----- 1 root root      0 2月  22 17:47 t2.MYD-rw-r----- 1 root root   1024 2月  22 17:47 t2.MYI
select id from mysql.tables where name='t2';+-----+| id  |+-----+| 337 |+-----+

4.1 非事務表

 

上述例子中MyISAM表t2的SDI為test/t2_337.sdi,其中337為table_id, t2_337.sdi可以直接打開,資料是json格式(cat test/t2_337.sdi):


{    "mysqld_version_id": 80015,    "dd_version": 80014,    "sdi_version": 1,    "dd_object_type": "Table",    "dd_object": {        "name": "t2",        "mysql_version_id": 80015,        "created": 20190222094723,        "last_altered": 20190222094723,        "hidden": 1,        "options": "avg_row_length=0;key_block_size=0;keys_disabled=0;pack_record=0;stats_auto_recalc=0;stats_sample_pages=0;",        "columns": [{            "name": "c1",            "type": 4,            "is_nullable": true,            "is_zerofill": false,            "is_unsigned": false,            "is_auto_increment": false,            "is_virtual": false,            "hidden": 1,            "ordinal_position": 1,            "char_length": 11,            "numeric_precision": 10,            "numeric_scale": 0,            "numeric_scale_null": false,            "datetime_precision": 0,            "datetime_precision_null": 1,            "has_no_default": false,            "default_value_null": true,            "srs_id_null": true,            "srs_id": 0,            "default_value": "",            "default_value_utf8_null": true,            "default_value_utf8": "",            "default_option": "",            "update_option": "",            "comment": "",            "generation_expression": "",            "generation_expression_utf8": "",            "options": "interval_count=0;",            "se_private_data": "",            "column_key": 1,            "column_type_utf8": "int(11)",            "elements": [],            "collation_id": 255,            "is_explicit_collation": false        }],        "schema_ref": "test",        "se_private_id": 18446744073709551615,        "engine": "MyISAM",        "last_checked_for_upgrade_version_id": 0,        "comment": "",        "se_private_data": "",        "row_format": 1,        "partition_type": 0,        "partition_expression": "",        "partition_expression_utf8": "",        "default_partitioning": 0,        "subpartition_type": 0,        "subpartition_expression": "",        "subpartition_expression_utf8": "",        "default_subpartitioning": 0,        "indexes": [],        "foreign_keys": [],        "partitions": [],        "collation_id": 255    }}

4.2 InnoDB事務表

 

上述例子中的InnoDB表t1的SDI則可以通過工具ibd2sdi可以解析出來(ibd2sdi test/t1.ibd):


["ibd2sdi",{    "type": 1,    "id": 336,    "object":        {    "mysqld_version_id": 80015,    "dd_version": 80014,    "sdi_version": 1,    "dd_object_type": "Table",    "dd_object": {        "name": "t1",        "mysql_version_id": 80015,        "created": 20190222094723,        "last_altered": 20190222094723,        "hidden": 1,        "options": "avg_row_length=0;key_block_size=0;keys_disabled=0;pack_record=0;stats_auto_recalc=0;stats_sample_pages=0;",        "columns": [            {                "name": "c1",                "type": 4,                "is_nullable": true,                "is_zerofill": false,                "is_unsigned": false,                "is_auto_increment": false,                "is_virtual": false,                "hidden": 1,                "ordinal_position": 1,                "char_length": 11,                "numeric_precision": 10,                "numeric_scale": 0,                "numeric_scale_null": false,                "datetime_precision": 0,                "datetime_precision_null": 1,                "has_no_default": false,                "default_value_null": true,                "srs_id_null": true,                "srs_id": 0,                "default_value": "",                "default_value_utf8_null": true,                "default_value_utf8": "",                "default_option": "",                "update_option": "",                "comment": "",                "generation_expression": "",                "generation_expression_utf8": "",                "options": "interval_count=0;",                "se_private_data": "table_id=1059;",                "column_key": 1,                "column_type_utf8": "int(11)",                "elements": [],                "collation_id": 255,                "is_explicit_collation": false            },            {                "name": "DB_ROW_ID",                "type": 10,                "is_nullable": false,                "is_zerofill": false,                "is_unsigned": false,                "is_auto_increment": false,                "is_virtual": false,                "hidden": 2,                "ordinal_position": 2,                "char_length": 6,                "numeric_precision": 0,                "numeric_scale": 0,                "numeric_scale_null": true,                "datetime_precision": 0,                "datetime_precision_null": 1,                "has_no_default": false,                "default_value_null": true,                "srs_id_null": true,                "srs_id": 0,                "default_value": "",                "default_value_utf8_null": true,                "default_value_utf8": "",                "default_option": "",                "update_option": "",                "comment": "",                "generation_expression": "",                "generation_expression_utf8": "",                "options": "",                "se_private_data": "table_id=1059;",                "column_key": 1,                "column_type_utf8": "",                "elements": [],                "collation_id": 63,                "is_explicit_collation": false            },            {                "name": "DB_TRX_ID",                "type": 10,                "is_nullable": false,                "is_zerofill": false,                "is_unsigned": false,                "is_auto_increment": false,                "is_virtual": false,                "hidden": 2,                "ordinal_position": 3,                "char_length": 6,                "numeric_precision": 0,                "numeric_scale": 0,                "numeric_scale_null": true,                "datetime_precision": 0,                "datetime_precision_null": 1,                "has_no_default": false,                "default_value_null": true,                "srs_id_null": true,                "srs_id": 0,                "default_value": "",                "default_value_utf8_null": true,                "default_value_utf8": "",                "default_option": "",                "update_option": "",                "comment": "",                "generation_expression": "",                "generation_expression_utf8": "",                "options": "",                "se_private_data": "table_id=1059;",                "column_key": 1,                "column_type_utf8": "",                "elements": [],                "collation_id": 63,                "is_explicit_collation": false            },            {                "name": "DB_ROLL_PTR",                "type": 9,                "is_nullable": false,                "is_zerofill": false,                "is_unsigned": false,                "is_auto_increment": false,                "is_virtual": false,                "hidden": 2,                "ordinal_position": 4,                "char_length": 7,                "numeric_precision": 0,                "numeric_scale": 0,                "numeric_scale_null": true,                "datetime_precision": 0,                "datetime_precision_null": 1,                "has_no_default": false,                "default_value_null": true,                "srs_id_null": true,                "srs_id": 0,                "default_value": "",                "default_value_utf8_null": true,                "default_value_utf8": "",                "default_option": "",                "update_option": "",                "comment": "",                "generation_expression": "",                "generation_expression_utf8": "",                "options": "",                "se_private_data": "table_id=1059;",                "column_key": 1,                "column_type_utf8": "",                "elements": [],                "collation_id": 63,                "is_explicit_collation": false            }        ],        "schema_ref": "test",        "se_private_id": 1059,        "engine": "InnoDB",        "last_checked_for_upgrade_version_id": 0,        "comment": "",        "se_private_data": "",        "row_format": 2,        "partition_type": 0,        "partition_expression": "",        "partition_expression_utf8": "",        "default_partitioning": 0,        "subpartition_type": 0,        "subpartition_expression": "",        "subpartition_expression_utf8": "",        "default_subpartitioning": 0,        "indexes": [            {                "name": "PRIMARY",                "hidden": true,                "is_generated": false,                "ordinal_position": 1,                "comment": "",                "options": "",                "se_private_data": "id=140;root=4;space_id=2;table_id=1059;trx_id=2569;",                "type": 2,                "algorithm": 2,                "is_algorithm_explicit": false,                "is_visible": true,                "engine": "InnoDB",                "elements": [                    {                        "ordinal_position": 1,                        "length": 4294967295,                        "order": 2,                        "column_opx": 1                    },                    {                        "ordinal_position": 2,                        "length": 4294967295,                        "order": 2,                        "column_opx": 2                    },                    {                        "ordinal_position": 3,                        "length": 4294967295,                        "order": 2,                        "column_opx": 3                    },                    {                        "ordinal_position": 4,                        "length": 4294967295,                        "order": 2,                        "column_opx": 0                    }                ],                "tablespace_ref": "test/t1"            }        ],        "foreign_keys": [],        "partitions": [],        "collation_id": 255    }}},{    "type": 2,    "id": 7,    "object":        {    "mysqld_version_id": 80015,    "dd_version": 80014,    "sdi_version": 1,    "dd_object_type": "Tablespace",    "dd_object": {        "name": "test/t1",        "comment": "",        "options": "",        "se_private_data": "flags=16417;id=2;server_version=80015;space_version=1;state=normal;",        "engine": "InnoDB",        "files": [            {                "ordinal_position": 1,                "filename": "./test/t1.ibd",                "se_private_data": "id=2;"            }        ]    }}}]

SDI在ibd中實際是以表(BTree)的形式儲存的。建表時會通過btr_sdi_create_index建立SDI的BTree,同時會向BTree插入table和tablespace的SDI信息,表的結構如下:


create table SDI_$TABLESPACEID(type int,                               id int,                               compressed_len int,                               uncompressed_len int,                               data blob not null,                               primary key(type,id));


dd::sdi::store     -->dd::sdi_tablespace::store_tsp_sdi // store tablespace SDI          -->dict_sdi_set               -->ib_sdi_set                    -->ib_cursor_insert_row    -->dd::sdi_tablespace::store_tbl_sdi // store table SDI          -->ib_sdi_set               -->ib_sdi_set                    -->ib_cursor_insert_row

4.3 其他表空間的SDI

 

ibd2sdi mysql.ibd,可以查看所以mysql下的表,包括new dictionary和mysql下的普通表。需要註意的是ibdata1中不存放SDI信息,使用ibd2sdi解析它會出現以下提示:

  • [INFO] ibd2sdi: SDI is empty.

 

4.4 import

 

import (import table *.sdi)只支持MyISAM表,InnoDB不支持。由於SDI不包含trigger信息,所以import也不會匯入trigger信息,trigger需額外處理。

 

5. Data Dictionary存取實現



class Storage_adapter {
 /**   Drop a dictionary object from persistent storage. */
 template <typename T> static bool drop(THD *thd, const T *object);
/**  Store a dictionary object to persistent storage.*/
template <typename T>static bool store(THD *thd, T *object);}

例如create table 會涉及到mysql.tablespaces,mysql.tablespace_files, mysql.tables, mysql.indexes, mysql.columns,mysql.index_column_usage等。create table的過程如圖5所示:

 

圖5


mysql_create_table    -->mysql_create_table_no_lock          -->create_table_impl               -->rea_create_base_table                    -->dd::cache::Dictionary_client::store<:table> // mysql.tables</:table>                         -->dd::cache::Storage_adapter::store<:table/>                              -->dd::Weak_object_impl::store                                   -->dd::Raw_new_record::insert // store mysql.tables                                         -->handler::ha_write_row                                              -->ha_innobase::write_row                                  -->dd::Table_impl::store_children                                       -->dd::Abstract_table_impl::store_children // store  mysql.columns                                            -->dd::Collection<:column>::store_items</:column>                                       -->dd::Collection<:index>::store_items // store mysql.indexes</:index>                                            -->dd::Weak_object_impl::store                                                  -->dd::Index_impl::store_children                                                       -->dd::Collection<:index_element>::store_items // store mysql.index_column_usage</:index_element>                              -->dd::sdi::store //store table SDI                    -->ha_create_table                         -->handler::ha_create                               -->ha_innobase::create                                    -->innobase_basic_ddl::create_impl                                         -->create_table_info_t::create_table_update_global_dd                                              -->dd_set_autoinck                                              -->dd_create_implicit_tablespace                                                   -->dd_create_tablespace                                                        -->dd::cache::Dictionary_client::store<:tablespace/>                                                             -->dd::cache::Storage_adapter::store<:tablespace/>                                                                   -->dd::Weak_object_impl::store // store mysql.tablespace                                                                       -->dd::Tablespace_impl::store_children                                                                            -->dd::Collection<:tablespace_file>::store_items</:tablespace_file>                                                                                -->dd::Tablespace_file_impl::store // store tablespace_files                                                                   -->dd::sdi::store     // store tablespace SDI                         -->dd::cache::Dictionary_client::update<:table> // 更新innodb引擎相關元資料</:table>                              -->dd::cache::Storage_adapter::store<:table/>                              -->dd::Weak_object_impl::store

下麵以表t1為例,演示create table在DD中的資料分佈:


create table t1(c1 int primary key, c2 int) engine=innodb;
select s.name as schema_name, t.* from mysql.tables t, mysql.schemata s where s.id=t.schema_id and s.name='test' and t.name='t1'\G*************************** 1. row ***************************                        schema_name: test                                 id: 374                          schema_id: 5                               name: t1                               type: BASE TABLE                             engine: InnoDB                   mysql_version_id: 80015                         row_format: Dynamic                       collation_id: 255                            comment:                             hidden: Visible                            options: avg_row_length=0;key_block_size=0;keys_disabled=0;pack_record=0;stats_auto_recalc=0;stats_sample_pages=0;                    se_private_data: NULL                      se_private_id: 1096                      tablespace_id: NULL                     partition_type: NULL               partition_expression: NULL          partition_expression_utf8: NULL               default_partitioning: NULL                  subpartition_type: NULL            subpartition_expression: NULL       subpartition_expression_utf8: NULL            default_subpartitioning: NULL                            created: 2019-03-12 19:30:46                       last_altered: 2019-03-12 19:30:46                    view_definition: NULL               view_definition_utf8: NULL                  view_check_option: NULL                  view_is_updatable: NULL                     view_algorithm: NULL                 view_security_type: NULL                       view_definer: NULL           view_client_collation_id: NULL       view_connection_collation_id: NULL                  view_column_names: NULLlast_checked_for_upgrade_version_id: 0
select * from mysql.indexes where table_id= 374\G*************************** 1. row ***************************                   id: 299             table_id: 374                 name: PRIMARY                 type: PRIMARY            algorithm: BTREEis_algorithm_explicit: 0           is_visible: 1         is_generated: 0               hidden: 0     ordinal_position: 1              comment:              options: flags=0;      se_private_data: id=177;root=4;space_id=39;table_id=1096;trx_id=9996;        tablespace_id: 43               engine: InnoDB
select id, name, type from mysql.columns where table_id= 374;+------+-------------+---------------------+| id   | name        | type                |+------+-------------+---------------------+| 4025 | c1          | MYSQL_TYPE_LONG     || 4026 | c2          | MYSQL_TYPE_LONG     || 4028 | DB_ROLL_PTR | MYSQL_TYPE_LONGLONG || 4027 | DB_TRX_ID   | MYSQL_TYPE_INT24    |+------+-------------+---------------------+
select * from mysql.index_column_usage where index_id=299;+----------+------------------+-----------+--------+-------+--------+| index_id | ordinal_position | column_id | length | order | hidden |+----------+------------------+-----------+--------+-------+--------+|      299 |                1 |      4025 |      4 | ASC   |      0 ||      299 |                2 |      4027 |   NULL | ASC   |      1 ||      299 |                3 |      4028 |   NULL | ASC   |      1 ||      299 |                4 |      4026 |   NULL | ASC   |      1 |+----------+------------------+-----------+--------+-------+--------+
select * from mysql.tablespaces where name='test/t1';+----+---------+---------+----------------------------------------------------------------------+---------+--------+| id | name    | options | se_private_data                                                      | comment | engine |+----+---------+---------+----------------------------------------------------------------------+---------+--------+| 43 | test/t1 | NULL    | flags=16417;id=39;server_version=80015;space_version=1;state=normal; |         | InnoDB |+----+---------+---------+----------------------------------------------------------------------+---------+--------+
select * from mysql.tablespace_files where tablespace_id=43;+---------------+------------------+---------------+-----------------+| tablespace_id | ordinal_position | file_name     | se_private_data |+---------------+------------------+---------------+-----------------+|            43 |                1 | ./test/t1.ibd | id=39;          |+---------------+------------------+---------------+-----------------+
select * from mysql.tablespaces a, mysql.tablespace_files b where a.id=b.tablespace_id and a.name='test/t1';+----+---------+---------+----------------------------------------------------------------------+---------+--------+---------------+------------------+---------------+-----------------+| id | name    | options | se_private_data                                                      | comment | engine | tablespace_id | ordinal_position | file_name     | se_private_data |+----+---------+---------+----------------------------------------------------------------------+---------+--------+---------------+------------------+---------------+-----------------+| 43 | test/t1 | NULL    | flags=16417;id=39;server_version=80015;space_version=1;state=normal; |         | InnoDB |            43 |                1 | ./test/t1.ibd | id=39;          |+----+---------+---------+------------------------------------------------------

drop table是create table的逆過程,不再具體分析。

 

6. Initialize


圖6

 

mysqld –initialize的原始碼流程如圖6所示。具體過程為:


dd::Dictionary_impl::init     -->bootstrap::initialize            -->DDSE_dict_init                 -->innobase_ddse_dict_init                      -->innobase_init_files //創建mysql.ibd            -->initialize_dictionary                 -->create_dd_schema/initialize_dd_properties/create_tables                 -->DDSE_dict_recover // 創建mysql/innodb_system tablespace                 -->flush_meta_data                      -->dd::cache::Storage_adapter::store()   //                           -->dd::Weak_object_impl::store                                -->dd::Table_impl::store_attributes                      -->dd::cache::Storage_adapter::core_store  // 儲存到Object_registry m_core_registry;                      -->dd::sdi::store                 -->populate_tables                      -->get_dml                 -->update_properties                 -->update_versions

7. Atomic DDL


7.1 Atomic DDL

 

定義:DDL所涉及的以下更改操作是原子的,這些更改操作要麼都提交,要麼都回滾。

  • data dictionary
  • storage engine
  • binary log

 

只有InnoDB engine支持Atomic DDL,以下操作不支持:

  • Table-related DDL statements that involve a storage engine other than InnoDB.
  • INSTALL PLUGIN and UNINSTALL PLUGIN statements.
  • INSTALL COMPONENT and UNINSTALL COMPONENT statements.
  • CREATE SERVER, ALTER SERVER, and DROP SERVER statements.

 

7.2 DDL log

 

DDL過程中操作DD事物表是原子的,而DDL過程中也會操作檔案,創建和釋放BTree以及修改DD cache,這些操作不是原子的。為了實現atomic DDL, DDL過程中對檔案操作和Btree操作等記錄日誌,這些日誌會記錄到DD表mysql.innodb_ddl_log中。日誌有以下幾個型別:


enum class Log_Type : uint32_t {/** Smallest log type */SMALLEST_LOG = 1,
/** Drop an index tree */FREE_TREE_LOG = 1,
/** Delete a file */DELETE_SPACE_LOG,
/** Rename a file */RENAME_SPACE_LOG,
/** Drop the entry in innodb_dynamic_metadata */DROP_LOG,
/** Rename table in dict cache. */RENAME_TABLE_LOG,
/** Remove a table from dict cache */REMOVE_CACHE_LOG,
/** Alter Encrypt a tablespace */ALTER_ENCRYPT_TABLESPACE_LOG,
/** Biggest log type */BIGGEST_LOG = ALTER_ENCRYPT_TABLESPACE_LOG};

mysql.innodb_ddl_log 表結構如下:

 


CREATE TABLE `innodb_ddl_log` (`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,`thread_id` bigint(20) unsigned NOT NULL,`type` int(10) unsigned NOT NULL,`space_id` int(10) unsigned DEFAULT NULL,`page_no` int(10) unsigned DEFAULT NULL,`index_id` bigint(20) unsigned DEFAULT NULL,`table_id` bigint(20) unsigned DEFAULT NULL,`old_file_path` varchar(512) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,`new_file_path` varchar(512) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,PRIMARY KEY (`id`),KEY `thread_id` (`thread_id`);

將DDL分為以下幾個階段, Prepare記錄DDL log,Post-DDL會replay log來提交或回滾DDL操作,同時也並清理DDL log。


Prepare: Create the required objects and write the DDL logs to the mysql.innodb_ddl_log table. The DDL logs define how to roll forward and roll back the DDL operation.
Perform: Perform the DDL operation. For example, perform a create routine for a CREATE TABLE operation.
Commit: Update the data dictionary and commit the data dictionary transaction.
Post-DDL: Replay and remove DDL logs from the mysql.innodb_ddl_log table. To ensure that rollback can be performed safely without introducing inconsistencies, file operations such as renaming or removing data files are performed in this final phase. This phase also removes dynamic metadata from the mysql.innodb_dynamic_metadata data dictionary table for DROP TABLE, TRUNCATE TABLE, and other DDL operations that rebuild the table.

 

7.3 Atomic DDL Examples

 

7.3.1 drop table

 

以drop table為例,drop 過程中會刪除ibd檔案,也會從mysql.innodb_dynamic_metadata 中刪除相應記錄。

 

在preppare階段只是記錄日誌,沒有真正刪除。如果drop過程成功, innobase_post_ddl才從mysql.innodb_ddl_log中讀取記錄去replay,replay_delete_space_log/replay_drop_log會真正執行刪除, replay完也會清理ddl log;如果drop過程失敗,rollback時mysql.innodb_ddl_log的記錄也回滾了,innobase_post_ddl時沒有記錄需要replay。

mysql_rm_table     -->mysql_rm_table_no_locks        -->drop_base_table            -->ha_delete_table               -—>handler::ha_delete_table                  -->ha_innobase::delete_table                    -->innobase_basic_ddl::delete_impl                       -->row_drop_table_for_mysql                          -->Log_DDL::write_drop_log               // 記錄刪innodb_dynamic_metadata日誌                          -—>Log_DDL::write_delete_space_log       // 記錄刪ibd日誌            -->dd::drop_table                -->dd::cache::Dictionary_client::drop<:table/>                    -->dd::cache::Storage_adapter::drop<:table/>                         -->dd::sdi::drop        -->innobase_post_ddl            -->Log_DDL::post_ddl               -->Log_DDL::replay_by_thread_id                    -->Log_DDL::replay                       —>Log_DDL::replay_delete_space_log // post-ddl 真正刪除innodb_dynamic_metadata                       —>Log_DDL::replay_drop_log         // post-ddl 真正刪除ibd                    -->delete_by_ids                       -->DDL_Log_Table::remove

7.3.2 create table

 

drop table post_ddl階段執行的redo操作,而create table post ddl執行的是rollback操作。create table prepare階段會真正的創建ibd,BTree,修改DD share cache, 同時記錄相應的log到mysql.innodb_ddl_log中。

create_table        -->Log_DDL::write_delete_space_log        -->Log_DDL::write_remove_cache_log        -->Log_DDL::write_free_tree_log

如果DDL成功commit,在post-DDL階段,DDL log記錄被清理了,不需要replay。如果DDL失敗rollback,在post-DDL階段,DDL log清理操作也回滾了,需要replay, relay會rollback前面的創建ibd,BTree,以及修改DD share cache。

 

如果create table過程中發生crash, 重啟後會讀取ddl log完成ddl的回滾。

write_delete_space_log{     // 插入記錄時在內部開啟trx執行     insert_delete_space_log(nullptr, id, thread_id, space_id, file_path,                         dict_locked);     // 清理記錄用DDL自身的trx     delete_by_id(trx, id, dict_locked);}

 

如果create table過程中發生crash, 重啟後會讀取ddl log完成ddl的回滾。

init_server_components    -->ha_post_recover         -->post_recover_handlerton              -->innobase_post_recover                   -->Log_DDL::recover                        -->Log_DDL::replay_all                             -->Log_DDL::replay                                  -->replay_delete_space_log/replay_remove_cache_log/replay_free_tree_log                        -->delete_by_ids                             -->DDL_Log_Table::remove

7.3.3 truncate table

 

truncate 先rename 為臨時ibd,然後drop臨時ibd,再重建表。rename會記錄ddl log, 參考write_rename_space_log函式,刪除重建也會記錄ddl log, 同前面介紹的create/drop table, 可以做到原子。rollback時通過日誌將臨時ibd重命名為原ibd,參考replay_rename_space_log函式。

Sql_cmd_truncate_table::truncate_table    -->ha_create_table         -->handler::ha_create              -->ha_innobase::create                   -->ha_innobase::truncate_impl                        -->innobase_truncate<:table>::exec</:table>                             -->innobase_truncate

::truncate

 

                                  -->rename_tablespace . // t1.ibd rename to #sql-ib1084-513656514.ibd                                       -->fil_rename_tablespace                                            -->Fil_shard::space_rename                                                 -->Log_DDL::write_rename_space_log                                  -->innobase_basic_ddl::delete_impl                                       -->row_drop_table_for_mysql                                            -->Log_DDL::write_delete_space_log// drop #sql-ib1084-513656514.ibd                                  -->innobase_basic_ddl::create_impl                                       -->create_table_info_t::create_table                                            -->create_table_info_t::create_table_def                                                 -->row_create_table_for_mysql                                                      -->dict_build_table_def                                                           -->dict_build_tablespace_for_table                                                                -->Log_DDL::write_delete_space_log // drop t1.ibd                                                       -->Log_DDL::write_remove_cache_log                                                       -->dict_create_index_tree_in_mem                                                            -->Log_DDL::write_free_tree_log      -->innobase_post_ddl           -->Log_DDL::post_ddl                -->Log_DDL::replay_by_thread_id                      -->Log_DDL::replay                          -->Log_DDL::replay_delete_space_log // drop #sql-ib1084-513656514.ibd                          --> Log_DDL::replay_drop_log                -->delete_by_ids                      -->DDL_Log_Table::remove

7.4 Atomic DDL帶來的變化

 

drop 多表或多用戶時,如果個別失敗,整個DDL都會回滾,且不會記錄binlog;而在MySQL8.0以前, 部分DDL會成功且整個DDL會記錄binlog。

 

mysql> show tables;+----------------+| Tables_in_test |+----------------+| t1             |+----------------+
mysql> drop table t1,tt;ERROR 1051 (42S02): Unknown table 'test.tt'mysql> show tables;+----------------+| Tables_in_test |+----------------+| t1             |+----------------+
show binlog events;+------------------+-----+----------------+-----------+-------------+-----------------------------------------+| Log_name         | Pos | Event_type     | Server_id | End_log_pos | Info                                    |+------------------+-----+----------------+-----------+-------------+-----------------------------------------+| mysql-bin.000001 |   4 | Format_desc    |         1 |         124 | Server ver: 8.0.15-debug, Binlog ver: 4 || mysql-bin.000001 | 124 | Previous_gtids |         1 |         155 |                                         |+------------------+-----+----------------+-----------+-------------+-----------------------------------------+2 rows in set (0.00 sec)

 

8. Persistent Autoinc


MySQL8.0以前自增值沒有持久化,重啟時通過select MAX(id)的方式獲取當前自增值,這種方式自增值會重覆利用。MySQL8.0開始支持自增值持久化,通過增加redo日誌和Data Dictonary 表mysql.innodb_dynamic_metadata來實現持久化。

 

每次insert/update更新自增值時會將自增值寫到redo日誌中,參考dict_table_autoinc_log函式,日誌格式如下:


MLOG_TABLE_DYNAMIC_METAid,version,PM_TABLE_AUTO_INC,autoinc

同時dict_table_t增加了新的變數autoinc_persisted, 在每次checkpoint時會將autoinc_persisted儲存到表mysql.innodb_dynamic_metadata中。

 

dict_table從dictionary cache淘汰時也會將autoinc_persisted持久化到mysql.innodb_dynamic_metadata中。


log_checkpointer     -->log_consider_checkpoint         -->dict_persist_to_dd_table_buffer              -->dict_table_persist_to_dd_table_buffer_low                   -->Persisters::write                        -->AutoIncPersister::write                        -->DDTableBuffer::replace

dict_table從dictionary cache淘汰時也會將autoinc_persisted持久化到mysql.innodb_dynamic_metadata中。

 

crash重啟時,先從mysql.innodb_dynamic_metadata獲取持久化的自增值,再從redo日誌中讀取最新的自增值, 參考MetadataRecover::parseMetadataLog,並通過MetadataRecover::apply更新到table->autoinc。

 


dict_table_remove_from_cache_low    -->dict_table_persist_to_dd_table_buffer_low        -->Persisters::write            -->AutoIncPersister::write            -->DDTableBuffer::replace

crash重啟時,先從mysql.innodb_dynamic_metadata獲取持久化的自增值,再從redo日誌中讀取最新的自增值, 參考MetadataRecover::parseMetadataLog,並通過MetadataRecover::apply更新到table->autoinc。

 

9. Upgrade


MySQL-8.0不支持跨版本升級,只能從5.7升級到8.0,不支持5.5,5.6直接升級到8.0。升級需要註意的問題:

 

  • 原mysql5.7 mysql庫下不能存在dictinary table同名的表
  • 不支持老版本(5.6之前)的資料型別decimal,varchar, data/datetime/timestamp, 通過check table xxx for upgrade可以檢測
  • non-native 分割槽表不支持
  • 不支持5.0之前的trigger,5.0之前的trigger沒有definer
  • foreign key constraint name 不能超過64位元組
  • view的column不能超過255 chars
  • enum 型別不能超過255 chars.
  • frm需與InnoDB系統表一致
  • 一些空間函式如PointFromText需修改為ST_POINTFROMTEXT

 

10. 參考信息


  • https://github.com/mysql/mysql-server
  • https://dev.mysql.com/doc/refman/8.0/en/data-dictionary.html
  • https://dev.mysql.com/doc/refman/8.0/en/system-schema.html
  • https://mysqlserverteam.com/upgrading-to-mysql-8-0-here-is-what-you-need-to-know/
  • http://mysqlserverteam.com/mysql-8-0-improvements-to-information_schema/
  • https://dev.mysql.com/doc/refman/8.0/en/atomic-ddl.html
  • https://mysqlserverteam.com/bootstrapping-the-transactional-data-dictionary/
  • https://www.slideshare.net/StleDeraas/dd-and-atomic-ddl-pl17-dublin

    已同步到看一看
    赞(0)

    分享創造快樂