Issue 02: Art of MySQL data type - large object field

I've shared one before MySQL big object example , let's talk about the advantages and disadvantages of large objects and the use scenarios in detail today.

We divided MySQL's large object types into TEXT / BLOB and JSON to illustrate.

1, TEXT / BLOB type

The difference between TEXT and BLOB is very simple. TEXT is stored in plaintext, with corresponding character set and verification rules; BLOB is stored in binary, without character set and sorting rules, and all comparisons are performed in binary.

Example 1

Create a table c1 with fields F1 and F2 as tinytext and tinyblob respectively.

mysql> create table c1 (f1 tinytext, f2 tinyblob);
Query OK, 0 rows affected (0.03 sec)

Insert sample data.

insert into c1 values ('a','a'),('b','b'),('B','B'),('d','d'),('F','F'),('you','you'),('I','I'),('Is that so?','Is that so?');

Sort by field f1.

mysql> select * from c1 order by f1;
| f1     | f2     |
| a      | a      |
| b      | b      |
| B      | B      |
| d      | d      |
| F      | F      |
| you     | you     |
| I     | I     |
| Is that so?   | Is that so?   |
8 rows in set (0.01 sec)

Sort by field f2.

mysql> select * from c1 order by f2;
| f1     | f2     |
| B      | B      |
| F      | F      |
| a      | a      |
| b      | b      |
| d      | d      |
| you     | you     |
| I     | I     |
| Is that so?   | Is that so?   |
8 rows in set (0.00 sec)

The sorting results of f1 and f2 fields are inconsistent. f1 is a case insensitive check rule, f2 is a direct binary check.

For the large object types of MySQL, I will explain them in detail from the following categories:

1.1 disk space occupation

1.2 storage format of table

  • 1.2.1 redundant/compact

    For the redundant format, the first 768 bytes of a large object are saved on the InnoDB data page, and the rest on the overflow page. If there are multiple TEXT/BLOB fields, the data page will become bloated and the performance will be greatly affected. Data pages are almost full of useless data, resulting in additional resource consumption. At the same time, if it is the master-slave architecture, it will also synchronize all data to the slave, which is also an additional consumption of the network. So in this scenario, the path of large objects is usually saved to the database, and the real data is placed on the disk.

  • 1.2.2 dynamic/compressed

    For dynamic format, if the size of data stored in large object field is less than 40 bytes, all of them will be placed on the data page. For the remaining scenarios, only a 20 byte pointer will be reserved on the data page to point to the overflow page. In this scenario, if the data saved in each large object field is less than 40 bytes, the effect is the same as that of varchar(40). So we can't generalize whether we use big objects or not.

1.3 table engine related

Large objects are not recommended for NDB engines. There are many details involved, such as the association fields with large objects do not support engine association data push down; tables with large objects need additional locks, and so on.

1.4 index related

It must be a prefix to build an index on a large object field. For example, if field f1 is text, index the first 10 characters idx Β f1 (f1 (10)).

mysql> alter table t2 add key idx_f100(f100);
ERROR 1170 (42000): BLOB/TEXT column 'f100' used in key specification without a key length

1.5 division table related

Partition table partition field does not support large objects.

mysql> create table p1(f1 text) partition by list columns(f1) (partition p0 values in ('1','2'));
ERROR 1502 (HY000): A BLOB field is not allowed in partition function

1.6 parameter correlation

mysql_allowed_packet. This parameter represents the upper limit of single packet transmitted by MySQL server and client. If there is a text/blob field, this parameter is set to 1GB maximum. Of course, both the client and the server must be set up.

1.7 how to insert / acquire

Example 2

Insert the contents of all files in a / var / lib / MySQL files directory.

root@ytt-pc:/var/lib/mysql-files# ll
//Total consumption 32M
1324681  14M -rw-r--r-- 1 root  root   14M 12 Month 3017:23 test3
1311598 6.6M -rw-r--r-- 1 root  root  6.6M 12 Month 3017:23 test
1316840 6.6M -rw-r--r-- 1 root  root  6.6M 12 Month 3017:23 test2
1316693 5.5M -rw-r--r-- 1 root  root  5.4M 12 Month 3017:26 random_test
1315066 4.0K -rw-r--r-- 1 root  root  1.9K 12 Month 3014:56
1311256 4.0K -rw-r----- 1 mysql mysql 1.7K 12 Month 3016:23 y1.txt
1316129 4.0K -rw-r--r-- 1 root  root  1.5K 12 Month 3015:28 sample_y1.txt.nl2
1312042 4.0K -rw-r--r-- 1 root  root  1.5K 12 Month 3015:07 sample_y1.txt.awk
1316691 4.0K -rw-rw-rw- 1 mysql mysql 1.4K 12 Month 3016:25 y1_older.txt
1316168 4.0K -rw-r----- 1 mysql mysql  874 12 Month 3014:41 sample_y1.txt
1315893 4.0K -rw-r----- 1 root  root   874 12 Month 3014:45 sample_y1.txt.bak
1311076 4.0K -rw-r----- 1 mysql mysql  662 12 Month 3013:56 ascii.txt
1312152 4.0K -rw-r--r-- 1 root  root   648 12 Month 3014:37 sample.txt
1325161 4.0K -rw-r--r-- 1 root  root   403 12 Month 3110:59 y1111.txt.10
1312256 4.0K -rw-r----- 1 mysql mysql  164 12 Month 3109:49 y111.txt
1311248 4.0K -rw-r----- 1 mysql mysql  159 12 Month 3109:42 y11.txt
1327711 4.0K -rw-r--r-- 1 root  root   126 12 Month 509:56 performance_schema_enable.sql
1326639 4.0K -rw-r--r-- 1 root  root    22 1 Month 1011:32 System name

Table structure for storing the contents of documents.

mysql-(ytt/3305)->create table file_list_upload_dir(file_name varchar(100) not null primary key,content longtext, updated datetime);
Query OK, 0 rows affected (0.03 sec)

Insert the contents of the file.

root@ytt-pc:/var/lib/mysql-files# for i in $(ls); \
do mysql -uytt -pytt -P3305 -h127.0.0.1  \
-e "use ytt;insert into file_list_upload_dir  \
values ('$i',load_file('$(pwd)/$i'),now())";done;

2, JSON type

Compared with JSON type, we are very familiar with the lightweight text interaction format, which does not depend on any statement.

Development information reference:

JSON type is stored in MySQL in a special binary way, similar to the JSONB type of PostgreSQL. The maximum occupied space is the same as long text or long blob. Text can also store JSON objects, but there is no JSON type format validation and many functions provided internally. For example:

Example 3

variable @a And @ b are standard JSON format and non JSON format respectively. Create the json1 table.

mysql-(ytt/3305)->set @a='{"a":1,"b":2,"c":3,"d":4}';
Query OK, 0 rows affected (0.00 sec)

mysql-(ytt/3305)->set @b="{'a':1,'b':2,'c':3,'d':4}";
Query OK, 0 rows affected (0.00 sec)

mysql-(ytt/3305)->create table json1 (str1 json,str2 longtext);
Query OK, 0 rows affected (0.02 sec)

Insert for str1 @a Succeeded, @ b failed; str2 can insert any character.

mysql-(ytt/3305)->insert into json1 values (@a,@a);
Query OK, 1 row affected (0.00 sec)

mysql-(ytt/3305)->insert into json1 values (@b,@b);
ERROR 3140 (22032): Invalid JSON text: "Missing a name for object member." at position 1 in value for column 'json1.str1'.

mysql-(ytt/3305)->insert into json1 values (@a,@b);
Query OK, 1 row affected (0.01 sec)

Another example is the retrieval part. It's much easier to retrieve JSON than text. Take the example just now. You need to take the value 1 corresponding to A. it may be troublesome for @ b, but it's much easier to change it to @ a. Just use the JSON? Extract function.

mysql-(ytt/3305)->select json_extract(@a,'$.a') as 'a';
| a    |
| 1    |
1 row in set (0.00 sec)

But fortunately, for @ b format, you can split a single field into multiple fields, so that you can use SQL statements to simply retrieve the results. The advantages and disadvantages of detaching or not detaching are not discussed in this paper, which will be introduced in detail in table design concept. OK, so far is the introduction of MySQL large object types. I hope it can help you.

What else do you want to know about MySQL technology? Leave a message and tell Xiaobian!

Tags: Database MySQL JSON less SQL

Posted on Wed, 15 Apr 2020 01:52:58 -0700 by Daen