Technology Sharing | Controlling the Transaction Size of SQL Files Exported by mysqldump

Author: Chen Juncong

 

Background

Someone asked whether the INSERT statements exported by mysqldump could be organized as one INSERT statement per 10 rows.

 

Thought 1: the --extended-insert option

Recalling what we already know:

mysqldump offers only one related pair of options.

--extended-insert (the default)

Uses long INSERT statements, merging multiple rows into one batched INSERT to improve import efficiency.

--skip-extended-insert

Uses short INSERT statements, one row per INSERT.

Neither option meets the asker's need: neither can produce one INSERT statement per 10 rows.
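That said, the per-10-rows grouping can be approximated by post-processing a `--skip-extended-insert` dump. Below is a minimal sketch; `merge_inserts` is a hypothetical helper written for illustration, not anything mysqldump provides, and it assumes each single-row INSERT sits on one line:

```python
import re

def merge_inserts(lines, rows_per_stmt=10):
    """Merge single-row INSERT statements (as produced by
    mysqldump --skip-extended-insert) into multi-row INSERTs,
    with rows_per_stmt rows per statement."""
    pat = re.compile(r"^(INSERT INTO .+? VALUES )(\(.*\));$")
    out, batch, prefix = [], [], None
    for line in lines:
        m = pat.match(line.strip())
        if not m:
            # pass non-INSERT lines through, flushing any open batch first
            if batch:
                out.append(prefix + ",".join(batch) + ";")
                batch = []
            out.append(line)
            continue
        prefix = m.group(1)
        batch.append(m.group(2))
        if len(batch) == rows_per_stmt:
            out.append(prefix + ",".join(batch) + ";")
            batch = []
    if batch:  # flush the final partial batch
        out.append(prefix + ",".join(batch) + ";")
    return out
```

For example, 25 single-row INSERTs become three statements of 10, 10, and 5 rows.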

 

Thought 2: avoiding large transactions

I had never considered this before. Presumably the real motivation behind the question is to avoid large transactions, i.e. to ensure each INSERT is a small transaction.

Next, let's discuss two questions:

1. What is a large transaction?

2. Can the INSERT statements exported by mysqldump form a large transaction?

What is a large transaction?

  • Definition: a transaction that runs for a long time and touches a large amount of data is called a large transaction.

  • Risks of large transactions:

    Locks too much data, causing heavy blocking and lock timeouts; rollback takes a long time.

    Long execution time easily causes master-slave replication delay.

    undo log bloat.

  • Avoiding large transactions: based on my company's actual workload, I stipulate that each operation should touch fewer than 5000 rows and that result sets should be under 2 MB.
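The rule above can be encoded as a trivial check; this is just a sketch of my in-house rule (the thresholds come from the bullet above), not anything MySQL enforces:

```python
# In-house rule of thumb from the text above: a statement is a
# "large transaction" if it touches more than 5000 rows or
# carries more than 2 MB of data.
MAX_ROWS = 5000
MAX_BYTES = 2 * 1024 * 1024

def is_big_transaction(row_count, payload_bytes):
    """Return True if a statement exceeds either threshold."""
    return row_count > MAX_ROWS or payload_bytes > MAX_BYTES
```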

Does the SQL file exported by mysqldump contain large transactions?

The premise: MySQL runs with autocommit enabled by default, so unless a transaction is explicitly opened, each SQL statement is its own transaction. In a mysqldump file, therefore, each SQL statement is one transaction.

According to my "avoid large transactions" rule above, the answer is no.

It turns out that mysqldump automatically splits INSERT statements according to the net-buffer-length option, whose default value is 1 MB. By the criteria defined earlier, that is well under the 2 MB large-transaction threshold.

The maximum value of net-buffer-length is 16777216 (16 MB); if you set it higher, it is automatically adjusted down to 16777216. Setting it to 16 MB can improve export/import performance, but if you want to avoid large transactions, raising this option is not recommended; just use the default.
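The splitting behaviour can be sketched roughly as follows. This is a simplified model of what mysqldump does (accumulate row tuples into one INSERT until adding another would overflow the buffer, then start a new statement), not its actual code; the table name and sizes are illustrative:

```python
def split_by_buffer(rows, net_buffer_length=1024 * 1024):
    """Pack row tuples (strings like "(1,1,'abc')") into INSERT
    statements, starting a new statement whenever appending one
    more row would exceed net_buffer_length bytes."""
    prefix = "INSERT INTO `t` VALUES "
    stmts, cur = [], ""
    for row in rows:
        if not cur:
            cur = prefix + row
        elif len(cur) + 1 + len(row) + 1 > net_buffer_length:
            # +1 for the joining comma, +1 for the trailing semicolon
            stmts.append(cur + ";")
            cur = prefix + row
        else:
            cur += "," + row
    if cur:
        stmts.append(cur + ";")
    return stmts
```

With 50 rows of ~98 bytes each and a 1000-byte buffer, this yields six statements, none longer than the buffer.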

[root@192-168-199-198 ~]# mysqldump --net-buffer-length=104652800 -uroot -proot -P3306 -h192.168.199.198 test t >16M.sql
mysqldump: [Warning] option 'net_buffer_length': unsigned value 104652800 adjusted to 16777216
# values above 16M are automatically adjusted down to 16M

Note that this is a mysqldump option, not a mysqld variable. The official documentation says: "If you increase this variable, ensure that the MySQL server net_buffer_length system variable has a value at least this large."

This would mean that if you raise the value for mysqldump, you must also raise it for mysqld. My tests, however, show this is unnecessary; I suspect the official documentation is inaccurate on this point.

However, the import is affected by the server variable max_allowed_packet, which controls the maximum packet size the server will accept. Its default value is 4194304 (4 MB), so before importing the dump you need to raise max_allowed_packet.

set global max_allowed_packet=16*1024*1024;

If it is not adjusted, the following error occurs:

[root@192-168-199-198 ~]# mysql -uroot -proot -P3306 -h192.168.199.198 test <16M.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 2006 (HY000) at line 46: MySQL server has gone away

 

Relevant tests

Finally, here are my test steps.

mysql> select version();
+------------+
| version() |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.00 sec)

Build 1 million rows of data

create database test;
use test;
CREATE TABLE `t` (
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL,
`c` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into t values (1,1,'abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyztuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz');

insert into t select * from t; # repeat 20 times; each run doubles the row count
# until the last run shows Records: 524288  Duplicates: 0  Warnings: 0
# the table now holds 2^20 = 1,048,576 rows, i.e. over one million

mysql> select count(*) from t;
+----------+
| count(*) |
+----------+
| 1048576 |
+----------+
1 row in set (1.04 sec)

The data file size is 284 MB:

[root@192-168-199-198 test]# pwd
/data/mysql/mysql3306/data/test
[root@192-168-199-198 test]# du -sh t.ibd
284M t.ibd

--net-buffer-length=1M

[root@192-168-199-198 ~]# mysqldump -uroot -proot -S /tmp/mysql3306.sock test t >1M.sql
[root@192-168-199-198 ~]# du -sh 1M.sql
225M 1M.sql
[root@192-168-199-198 ~]# cat 1M.sql |grep -i insert |wc -l
226

With the default --net-buffer-length=1M, the 225 MB SQL file contains 226 INSERTs, so each INSERT averages almost exactly 1 MB.

--net-buffer-length=16M

[root@192-168-199-198 ~]# mysqldump --net-buffer-length=16M -uroot -proot -S /tmp/mysql3306.sock test t >16M.sql
[root@192-168-199-198 ~]# du -sh 16M.sql
225M 16M.sql
[root@192-168-199-198 ~]# cat 16M.sql |grep -i insert |wc -l
15

With --net-buffer-length=16M, the 225 MB SQL file contains 15 INSERTs, so each INSERT averages about 15 MB, bounded by the 16 MB buffer.

This proves that --net-buffer-length really does control how mysqldump splits the INSERT statements in its backup files.
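A quick sanity check of the averages above, using the figures copied from the test output:

```python
dump_size_mb = 225  # size of the dump file in both tests

# default net-buffer-length=1M produced 226 INSERTs -> ~1 MB each
assert round(dump_size_mb / 226, 1) == 1.0

# net-buffer-length=16M produced 15 INSERTs -> ~15 MB each,
# bounded by the 16 MB buffer
assert 15 <= dump_size_mb / 15 <= 16
```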

 

Performance testing

The more INSERT statements, the more client-server round trips, and the lower the import performance. But since the two cases above differ only by a factor of about 16 in statement count, the performance gap would be small (actual tests confirm this). Instead, we compare import times between net-buffer-length=16K and net-buffer-length=16M, a 1024x difference.

[root@192-168-199-198 ~]# time mysql -uroot -proot -S /tmp/mysql3306.sock test <16K.sql
mysql: [Warning] Using a password on the command line interface can be insecure.

real 0m10.911s #11 seconds
user 0m1.273s
sys 0m0.677s
[root@192-168-199-198 ~]# mysql -uroot -proot -S /tmp/mysql3306.sock -e "reset master";
mysql: [Warning] Using a password on the command line interface can be insecure.
[root@192-168-199-198 ~]# time mysql -uroot -proot -S /tmp/mysql3306.sock test <16M.sql
mysql: [Warning] Using a password on the command line interface can be insecure.

real 0m8.083s #8 seconds
user 0m1.669s
sys 0m0.066s

The results are clear: the larger net-buffer-length is, the fewer round trips between the client and the server, and the faster the import.
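A back-of-envelope estimate of the round-trip counts behind that comparison (statement counts approximated by dividing the dump size by the buffer size):

```python
dump_bytes = 225 * 1024 * 1024  # size of the dump file

# the two buffer settings differ by a factor of 1024
assert (16 * 1024 * 1024) // (16 * 1024) == 1024

# approximate statement (round-trip) counts at each setting
stmts_16k = dump_bytes // (16 * 1024)         # ~14400 round trips at 16K
stmts_16m = dump_bytes // (16 * 1024 * 1024)  # ~14 round trips at 16M
assert stmts_16k == 14400
```

Despite the ~1000x difference in round trips, the measured wall-clock gap (11s vs 8s) is modest, since each trip carries a large batch either way.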

 

Conclusion

The backup files exported with mysqldump's default settings import cleanly and do not create large transactions. Performance is also acceptable, so there is no need to adjust this option.

 

Reference link:

https://dev.mysql.com/doc/refman/5.7/en/mysqldump.html#option_mysqldump_net-buffer-length


Posted on Mon, 26 Aug 2019 19:46:35 -0700 by vishal99