22.2 Partitioning Types

This section discusses the types of partitioning which are available in MySQL 5.7. These include the types listed here:

A very common use of database partitioning is to segregate data by date. Some database systems support explicit date partitioning, which MySQL does not implement in 5.7. However, it is not difficult in MySQL to create partitioning schemes based on note 'DATE': datetime, note 'TIME': time, or *note 'DATETIME': datetime. columns, or based on expressions making use of such columns.

When partitioning by 'KEY' or 'LINEAR KEY', you can use a note 'DATE': datetime, note 'TIME': time, or *note 'DATETIME': datetime. column as the partitioning column without performing any modification of the column value. For example, this table creation statement is perfectly valid in MySQL:

 CREATE TABLE members (
     firstname VARCHAR(25) NOT NULL,
     lastname VARCHAR(25) NOT NULL,
     username VARCHAR(16) NOT NULL,
     email VARCHAR(35),
     joined DATE NOT NULL
 )
 PARTITION BY KEY(joined)
 PARTITIONS 6;

In MySQL 5.7, it is also possible to use a note 'DATE': datetime. or note 'DATETIME': datetime. column as the partitioning column using 'RANGE COLUMNS' and 'LIST COLUMNS' partitioning.

MySQL's other partitioning types, however, require a partitioning expression that yields an integer value or 'NULL'. If you wish to use date-based partitioning by 'RANGE', 'LIST', 'HASH', or 'LINEAR HASH', you can simply employ a function that operates on a note 'DATE': datetime, note 'TIME': time, or *note 'DATETIME': datetime. column and returns such a value, as shown here:

 CREATE TABLE members (
     firstname VARCHAR(25) NOT NULL,
     lastname VARCHAR(25) NOT NULL,
     username VARCHAR(16) NOT NULL,
     email VARCHAR(35),
     joined DATE NOT NULL
 )
 PARTITION BY RANGE( YEAR(joined) ) (
     PARTITION p0 VALUES LESS THAN (1960),
     PARTITION p1 VALUES LESS THAN (1970),
     PARTITION p2 VALUES LESS THAN (1980),
     PARTITION p3 VALUES LESS THAN (1990),
     PARTITION p4 VALUES LESS THAN MAXVALUE
 );

Additional examples of partitioning using dates may be found in the following sections of this chapter:

For more complex examples of date-based partitioning, see the following sections:

MySQL partitioning is optimized for use with the 'TO_DAYS()', 'YEAR()', and 'TO_SECONDS()' functions. However, you can use other date and time functions that return an integer or 'NULL', such as 'WEEKDAY()', 'DAYOFYEAR()', or 'MONTH()'. See *note date-and-time-functions::, for more information about such functions.

It is important to remember--regardless of the type of partitioning that you use--that partitions are always numbered automatically and in sequence when created, starting with '0'. When a new row is inserted into a partitioned table, it is these partition numbers that are used in identifying the correct partition. For example, if your table uses 4 partitions, these partitions are numbered '0', '1', '2', and '3'. For the 'RANGE' and 'LIST' partitioning types, it is necessary to ensure that there is a partition defined for each partition number. For 'HASH' partitioning, the user-supplied expression must evaluate to an integer value. For 'KEY' partitioning, this issue is taken care of automatically by the hashing function which the MySQL server employs internally.

Names of partitions generally follow the rules governing other MySQL identifiers, such as those for tables and databases. However, you should note that partition names are not case-sensitive. For example, the following *note 'CREATE TABLE': create-table. statement fails as shown:

 mysql> CREATE TABLE t2 (val INT)
     -> PARTITION BY LIST(val)(
     ->     PARTITION mypart VALUES IN (1,3,5),
     ->     PARTITION MyPart VALUES IN (2,4,6)
     -> );
 ERROR 1488 (HY000): Duplicate partition name mypart

Failure occurs because MySQL sees no difference between the partition names 'mypart' and 'MyPart'.

When you specify the number of partitions for the table, this must be expressed as a positive, nonzero integer literal with no leading zeros, and may not be an expression such as '0.8E+01' or '6-2', even if it evaluates to an integer value. Decimal fractions are not permitted.

In the sections that follow, we do not necessarily provide all possible forms for the syntax that can be used for creating each partition type; this information may be found in *note create-table::.

 File: manual.info.tmp, Node: partitioning-range, Next: partitioning-list, Prev: partitioning-types, Up: partitioning-types

22.2.1 RANGE Partitioning

A table that is partitioned by range is partitioned in such a way that each partition contains rows for which the partitioning expression value lies within a given range. Ranges should be contiguous but not overlapping, and are defined using the 'VALUES LESS THAN' operator. For the next few examples, suppose that you are creating a table such as the following to hold personnel records for a chain of 20 video stores, numbered 1 through 20:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT NOT NULL,
     store_id INT NOT NULL
 );

Note:

The 'employees' table used here has no primary or unique keys. While the examples work as shown for purposes of the present discussion, you should keep in mind that tables are extremely likely in practice to have primary keys, unique keys, or both, and that allowable choices for partitioning columns depend on the columns used for these keys, if any are present. For a discussion of these issues, see *note partitioning-limitations-partitioning-keys-unique-keys::.

This table can be partitioned by range in a number of ways, depending on your needs. One way would be to use the 'store_id' column. For instance, you might decide to partition the table 4 ways by adding a 'PARTITION BY RANGE' clause as shown here:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT NOT NULL,
     store_id INT NOT NULL
 )
 PARTITION BY RANGE (store_id) (
     PARTITION p0 VALUES LESS THAN (6),
     PARTITION p1 VALUES LESS THAN (11),
     PARTITION p2 VALUES LESS THAN (16),
     PARTITION p3 VALUES LESS THAN (21)
 );

In this partitioning scheme, all rows corresponding to employees working at stores 1 through 5 are stored in partition 'p0', to those employed at stores 6 through 10 are stored in partition 'p1', and so on. Note that each partition is defined in order, from lowest to highest. This is a requirement of the 'PARTITION BY RANGE' syntax; you can think of it as being analogous to a series of 'if ... elseif ...' statements in C or Java in this regard.

It is easy to determine that a new row containing the data '(72, 'Mitchell', 'Wilson', '1998-06-25', DEFAULT, 7, 13)' is inserted into partition 'p2', but what happens when your chain adds a 21^st store? Under this scheme, there is no rule that covers a row whose 'store_id' is greater than 20, so an error results because the server does not know where to place it. You can keep this from occurring by using a 'catchall' 'VALUES LESS THAN' clause in the *note 'CREATE TABLE': create-table. statement that provides for all values greater than the highest value explicitly named:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT NOT NULL,
     store_id INT NOT NULL
 )
 PARTITION BY RANGE (store_id) (
     PARTITION p0 VALUES LESS THAN (6),
     PARTITION p1 VALUES LESS THAN (11),
     PARTITION p2 VALUES LESS THAN (16),
     _PARTITION p3 VALUES LESS THAN MAXVALUE_
 );

Another way to avoid an error when no matching value is found is to use the 'IGNORE' keyword as part of the note 'INSERT': insert. statement. For an example, see note partitioning-list::.

'MAXVALUE' represents an integer value that is always greater than the largest possible integer value (in mathematical language, it serves as a least upper bound). Now, any rows whose 'store_id' column value is greater than or equal to 16 (the highest value defined) are stored in partition 'p3'. At some point in the future--when the number of stores has increased to 25, 30, or more--you can use an note 'ALTER TABLE': alter-table-partition-operations. statement to add new partitions for stores 21-25, 26-30, and so on (see note partitioning-management::, for details of how to do this).

In much the same fashion, you could partition the table based on employee job codes--that is, based on ranges of 'job_code' column values. For example--assuming that two-digit job codes are used for regular (in-store) workers, three-digit codes are used for office and support personnel, and four-digit codes are used for management positions--you could create the partitioned table using the following statement:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT NOT NULL,
     store_id INT NOT NULL
 )
 PARTITION BY RANGE (job_code) (
     PARTITION p0 VALUES LESS THAN (100),
     PARTITION p1 VALUES LESS THAN (1000),
     PARTITION p2 VALUES LESS THAN (10000)
 );

In this instance, all rows relating to in-store workers would be stored in partition 'p0', those relating to office and support staff in 'p1', and those relating to managers in partition 'p2'.

It is also possible to use an expression in 'VALUES LESS THAN' clauses. However, MySQL must be able to evaluate the expression's return value as part of a 'LESS THAN' ('<') comparison.

Rather than splitting up the table data according to store number, you can use an expression based on one of the two note 'DATE': datetime. columns instead. For example, let us suppose that you wish to partition based on the year that each employee left the company; that is, the value of 'YEAR(separated)'. An example of a note 'CREATE TABLE': create-table. statement that implements such a partitioning scheme is shown here:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT,
     store_id INT
 )
 PARTITION BY RANGE ( YEAR(separated) ) (
     PARTITION p0 VALUES LESS THAN (1991),
     PARTITION p1 VALUES LESS THAN (1996),
     PARTITION p2 VALUES LESS THAN (2001),
     PARTITION p3 VALUES LESS THAN MAXVALUE
 );

In this scheme, for all employees who left before 1991, the rows are stored in partition 'p0'; for those who left in the years 1991 through 1995, in 'p1'; for those who left in the years 1996 through 2000, in 'p2'; and for any workers who left after the year 2000, in 'p3'.

It is also possible to partition a table by 'RANGE', based on the value of a *note 'TIMESTAMP': datetime. column, using the 'UNIX_TIMESTAMP()' function, as shown in this example:

 CREATE TABLE quarterly_report_status (
     report_id INT NOT NULL,
     report_status VARCHAR(20) NOT NULL,
     report_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
 )
 PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated) ) (
     PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-01-01 00:00:00') ),
     PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-04-01 00:00:00') ),
     PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-07-01 00:00:00') ),
     PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-10-01 00:00:00') ),
     PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-01-01 00:00:00') ),
     PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-04-01 00:00:00') ),
     PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-07-01 00:00:00') ),
     PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-10-01 00:00:00') ),
     PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2010-01-01 00:00:00') ),
     PARTITION p9 VALUES LESS THAN (MAXVALUE)
 );

Any other expressions involving *note 'TIMESTAMP': datetime. values are not permitted. (See Bug #42849.)

Range partitioning is particularly useful when one or more of the following conditions is true:

A variant on this type of partitioning is 'RANGE COLUMNS' partitioning. Partitioning by 'RANGE COLUMNS' makes it possible to employ multiple columns for defining partitioning ranges that apply both to placement of rows in partitions and for determining the inclusion or exclusion of specific partitions when performing partition pruning. See *note partitioning-columns-range::, for more information.

Partitioning schemes based on time intervals

If you wish to implement a partitioning scheme based on ranges or intervals of time in MySQL 5.7, you have two options:

  1. Partition the table by 'RANGE', and for the partitioning expression, employ a function operating on a note 'DATE': datetime, note 'TIME': time, or *note 'DATETIME': datetime. column and returning an integer value, as shown here:

      CREATE TABLE members (
          firstname VARCHAR(25) NOT NULL,
          lastname VARCHAR(25) NOT NULL,
          username VARCHAR(16) NOT NULL,
          email VARCHAR(35),
          joined DATE NOT NULL
      )
      PARTITION BY RANGE( YEAR(joined) ) (
          PARTITION p0 VALUES LESS THAN (1960),
          PARTITION p1 VALUES LESS THAN (1970),
          PARTITION p2 VALUES LESS THAN (1980),
          PARTITION p3 VALUES LESS THAN (1990),
          PARTITION p4 VALUES LESS THAN MAXVALUE
      );

    In MySQL 5.7, it is also possible to partition a table by 'RANGE' based on the value of a *note 'TIMESTAMP': datetime. column, using the 'UNIX_TIMESTAMP()' function, as shown in this example:

      CREATE TABLE quarterly_report_status (
          report_id INT NOT NULL,
          report_status VARCHAR(20) NOT NULL,
          report_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
      )
      PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated) ) (
          PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-01-01 00:00:00') ),
          PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-04-01 00:00:00') ),
          PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-07-01 00:00:00') ),
          PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-10-01 00:00:00') ),
          PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-01-01 00:00:00') ),
          PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-04-01 00:00:00') ),
          PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-07-01 00:00:00') ),
          PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-10-01 00:00:00') ),
          PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2010-01-01 00:00:00') ),
          PARTITION p9 VALUES LESS THAN (MAXVALUE)
      );

    In MySQL 5.7, any other expressions involving *note 'TIMESTAMP': datetime. values are not permitted. (See Bug #42849.)

    Note:

    It is also possible in MySQL 5.7 to use 'UNIX_TIMESTAMP(timestamp_column)' as a partitioning expression for tables that are partitioned by 'LIST'. However, it is usually not practical to do so.

  2. Partition the table by 'RANGE COLUMNS', using a note 'DATE': datetime. or note 'DATETIME': datetime. column as the partitioning column. For example, the 'members' table could be defined using the 'joined' column directly, as shown here:

      CREATE TABLE members (
          firstname VARCHAR(25) NOT NULL,
          lastname VARCHAR(25) NOT NULL,
          username VARCHAR(16) NOT NULL,
          email VARCHAR(35),
          joined DATE NOT NULL
      )
      PARTITION BY RANGE COLUMNS(joined) (
          PARTITION p0 VALUES LESS THAN ('1960-01-01'),
          PARTITION p1 VALUES LESS THAN ('1970-01-01'),
          PARTITION p2 VALUES LESS THAN ('1980-01-01'),
          PARTITION p3 VALUES LESS THAN ('1990-01-01'),
          PARTITION p4 VALUES LESS THAN MAXVALUE
      );

Note:

The use of partitioning columns employing date or time types other than note 'DATE': datetime. or note 'DATETIME': datetime. is not supported with 'RANGE COLUMNS'.

 File: manual.info.tmp, Node: partitioning-list, Next: partitioning-columns, Prev: partitioning-range, Up: partitioning-types

22.2.2 LIST Partitioning

List partitioning in MySQL is similar to range partitioning in many ways. As in partitioning by 'RANGE', each partition must be explicitly defined. The chief difference between the two types of partitioning is that, in list partitioning, each partition is defined and selected based on the membership of a column value in one of a set of value lists, rather than in one of a set of contiguous ranges of values. This is done by using 'PARTITION BY LIST(EXPR)' where EXPR is a column value or an expression based on a column value and returning an integer value, and then defining each partition by means of a 'VALUES IN (VALUE_LIST)', where VALUE_LIST is a comma-separated list of integers.

Note:

In MySQL 5.7, it is possible to match against only a list of integers (and possibly 'NULL'--see *note partitioning-handling-nulls::) when partitioning by 'LIST'.

However, other column types may be used in value lists when employing 'LIST COLUMN' partitioning, which is described later in this section.

Unlike the case with partitions defined by range, list partitions do not need to be declared in any particular order. For more detailed syntactical information, see *note create-table::.

For the examples that follow, we assume that the basic definition of the table to be partitioned is provided by the *note 'CREATE TABLE': create-table. statement shown here:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT,
     store_id INT
 );

(This is the same table used as a basis for the examples in *note partitioning-range::.)

Suppose that there are 20 video stores distributed among 4 franchises as shown in the following table.

Region Store ID Numbers

North 3, 5, 6, 9, 17

East 1, 2, 10, 11, 19, 20

West 4, 12, 13, 14, 18

Central 7, 8, 15, 16

To partition this table in such a way that rows for stores belonging to the same region are stored in the same partition, you could use the *note 'CREATE TABLE': create-table. statement shown here:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT,
     store_id INT
 )
 PARTITION BY LIST(store_id) (
     PARTITION pNorth VALUES IN (3,5,6,9,17),
     PARTITION pEast VALUES IN (1,2,10,11,19,20),
     PARTITION pWest VALUES IN (4,12,13,14,18),
     PARTITION pCentral VALUES IN (7,8,15,16)
 );

This makes it easy to add or drop employee records relating to specific regions to or from the table. For instance, suppose that all stores in the West region are sold to another company. In MySQL 5.7, all rows relating to employees working at stores in that region can be deleted with the query 'ALTER TABLE employees TRUNCATE PARTITION pWest', which can be executed much more efficiently than the equivalent *note 'DELETE': delete. statement 'DELETE FROM employees WHERE store_id IN (4,12,13,14,18);'. (Using 'ALTER TABLE employees DROP PARTITION pWest' would also delete all of these rows, but would also remove the partition 'pWest' from the definition of the table; you would need to use an 'ALTER TABLE ... ADD PARTITION' statement to restore the table's original partitioning scheme.)

As with 'RANGE' partitioning, it is possible to combine 'LIST' partitioning with partitioning by hash or key to produce a composite partitioning (subpartitioning). See *note partitioning-subpartitions::.

Unlike the case with 'RANGE' partitioning, there is no 'catch-all' such as 'MAXVALUE'; all expected values for the partitioning expression should be covered in 'PARTITION ... VALUES IN (...)' clauses. An *note 'INSERT': insert. statement containing an unmatched partitioning column value fails with an error, as shown in this example:

 mysql> CREATE TABLE h2 (
     ->   c1 INT,
     ->   c2 INT
     -> )
     -> PARTITION BY LIST(c1) (
     ->   PARTITION p0 VALUES IN (1, 4, 7),
     ->   PARTITION p1 VALUES IN (2, 5, 8)
     -> );
 Query OK, 0 rows affected (0.11 sec)

 mysql> INSERT INTO h2 VALUES (3, 5);
 ERROR 1525 (HY000): Table has no partition for value 3

When inserting multiple rows using a single note 'INSERT': insert. statement the behavior depends on whether the table uses a transactional storage engine. For an note 'InnoDB': innodb-storage-engine. table, the statement is considered a single transaction, so the presence of any unmatched values causes the statement to fail completely, and no rows are inserted. For a table using a nontransactional storage engine such as *note 'MyISAM': myisam-storage-engine, any rows coming before the row containing the unmatched value are inserted, but any coming after it are not.

You can cause this type of error to be ignored by using the 'IGNORE' keyword, although a warning is issued for each row containing unmatched partitioning column values, as shown here.

 mysql> TRUNCATE h2;
 Query OK, 1 row affected (0.00 sec)

 mysql> TABLE h2;
 Empty set (0.00 sec)

 mysql> INSERT IGNORE INTO h2 VALUES (2, 5), (6, 10), (7, 5), (3, 1), (1, 9);
 Query OK, 3 rows affected, 2 warnings (0.01 sec)
 Records: 5  Duplicates: 2  Warnings: 2

 mysql> SHOW WARNINGS;
 +---------+------+------------------------------------+
 | Level   | Code | Message                            |
 +---------+------+------------------------------------+
 | Warning | 1526 | Table has no partition for value 6 |
 | Warning | 1526 | Table has no partition for value 3 |
 +---------+------+------------------------------------+
 2 rows in set (0.00 sec)

You can see in the output of the following 'TABLE' (https://dev.mysql.com/doc/refman/8.0/en/table.html) statement that rows containing unmatched partitioning column values were silently rejected, while rows containing no unmatched values were inserted into the table:

 mysql> TABLE h2;
 +------+------+
 | c1   | c2   |
 +------+------+
 |    7 |    5 |
 |    1 |    9 |
 |    2 |    5 |
 +------+------+
 3 rows in set (0.00 sec)

MySQL also provides support for 'LIST COLUMNS' partitioning, a variant of 'LIST' partitioning that enables you to use columns of types other than integer for partitioning columns, and to use multiple columns as partitioning keys. For more information, see *note partitioning-columns-list::.

 File: manual.info.tmp, Node: partitioning-columns, Next: partitioning-hash, Prev: partitioning-list, Up: partitioning-types

22.2.3 COLUMNS Partitioning

The next two sections discuss 'COLUMNS' partitioning, which are variants on 'RANGE' and 'LIST' partitioning. 'COLUMNS' partitioning enables the use of multiple columns in partitioning keys. All of these columns are taken into account both for the purpose of placing rows in partitions and for the determination of which partitions are to be checked for matching rows in partition pruning.

In addition, both 'RANGE COLUMNS' partitioning and 'LIST COLUMNS' partitioning support the use of non-integer columns for defining value ranges or list members. The permitted data types are shown in the following list:

The discussions of 'RANGE COLUMNS' and 'LIST COLUMNS' partitioning in the next two sections assume that you are already familiar with partitioning based on ranges and lists as supported in MySQL 5.1 and later; for more information about these, see note partitioning-range::, and note partitioning-list::, respectively.

 File: manual.info.tmp, Node: partitioning-columns-range, Next: partitioning-columns-list, Prev: partitioning-columns, Up: partitioning-columns

22.2.3.1 RANGE COLUMNS partitioning ...................................

Range columns partitioning is similar to range partitioning, but enables you to define partitions using ranges based on multiple column values. In addition, you can define the ranges using columns of types other than integer types.

'RANGE COLUMNS' partitioning differs significantly from 'RANGE' partitioning in the following ways:

The basic syntax for creating a table partitioned by 'RANGE COLUMNS' is shown here:

 CREATE TABLE TABLE_NAME
 PARTITION BY RANGE COLUMNS(COLUMN_LIST) (
     PARTITION PARTITION_NAME VALUES LESS THAN (VALUE_LIST)[,
     PARTITION PARTITION_NAME VALUES LESS THAN (VALUE_LIST)][,
     ...]
 )

 COLUMN_LIST:
     COLUMN_NAME[, COLUMN_NAME][, ...]

 VALUE_LIST:
     VALUE[, VALUE][, ...]

Note:

Not all note 'CREATE TABLE': create-table. options that can be used when creating partitioned tables are shown here. For complete information, see note create-table::.

In the syntax just shown, COLUMN_LIST is a list of one or more columns (sometimes called a partitioning column list), and VALUE_LIST is a list of values (that is, it is a partition definition value list). A VALUE_LIST must be supplied for each partition definition, and each VALUE_LIST must have the same number of values as the COLUMN_LIST has columns. Generally speaking, if you use N columns in the 'COLUMNS' clause, then each 'VALUES LESS THAN' clause must also be supplied with a list of N values.

The elements in the partitioning column list and in the value list defining each partition must occur in the same order. In addition, each element in the value list must be of the same data type as the corresponding element in the column list. However, the order of the column names in the partitioning column list and the value lists does not have to be the same as the order of the table column definitions in the main part of the note 'CREATE TABLE': create-table. statement. As with table partitioned by 'RANGE', you can use 'MAXVALUE' to represent a value such that any legal value inserted into a given column is always less than this value. Here is an example of a note 'CREATE TABLE': create-table. statement that helps to illustrate all of these points:

 mysql> CREATE TABLE rcx (
     ->     a INT,
     ->     b INT,
     ->     c CHAR(3),
     ->     d INT
     -> )
     -> PARTITION BY RANGE COLUMNS(a,d,c) (
     ->     PARTITION p0 VALUES LESS THAN (5,10,'ggg'),
     ->     PARTITION p1 VALUES LESS THAN (10,20,'mmm'),
     ->     PARTITION p2 VALUES LESS THAN (15,30,'sss'),
     ->     PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)
     -> );
 Query OK, 0 rows affected (0.15 sec)

Table 'rcx' contains the columns 'a', 'b', 'c', 'd'. The partitioning column list supplied to the 'COLUMNS' clause uses 3 of these columns, in the order 'a', 'd', 'c'. Each value list used to define a partition contains 3 values in the same order; that is, each value list tuple has the form ('INT', 'INT', 'CHAR(3)'), which corresponds to the data types used by columns 'a', 'd', and 'c' (in that order).

Placement of rows into partitions is determined by comparing the tuple from a row to be inserted that matches the column list in the 'COLUMNS' clause with the tuples used in the 'VALUES LESS THAN' clauses to define partitions of the table. Because we are comparing tuples (that is, lists or sets of values) rather than scalar values, the semantics of 'VALUES LESS THAN' as used with 'RANGE COLUMNS' partitions differs somewhat from the case with simple 'RANGE' partitions. In 'RANGE' partitioning, a row generating an expression value that is equal to a limiting value in a 'VALUES LESS THAN' is never placed in the corresponding partition; however, when using 'RANGE COLUMNS' partitioning, it is sometimes possible for a row whose partitioning column list's first element is equal in value to the that of the first element in a 'VALUES LESS THAN' value list to be placed in the corresponding partition.

Consider the 'RANGE' partitioned table created by this statement:

 CREATE TABLE r1 (
     a INT,
     b INT
 )
 PARTITION BY RANGE (a)  (
     PARTITION p0 VALUES LESS THAN (5),
     PARTITION p1 VALUES LESS THAN (MAXVALUE)
 );

If we insert 3 rows into this table such that the column value for 'a' is '5' for each row, all 3 rows are stored in partition 'p1' because the 'a' column value is in each case not less than 5, as we can see by executing the proper query against the Information Schema *note 'PARTITIONS': information-schema-partitions-table. table:

 mysql> INSERT INTO r1 VALUES (5,10), (5,11), (5,12);
 Query OK, 3 rows affected (0.00 sec)
 Records: 3  Duplicates: 0  Warnings: 0

 mysql> SELECT PARTITION_NAME, TABLE_ROWS
     ->     FROM INFORMATION_SCHEMA.PARTITIONS
     ->     WHERE TABLE_NAME = 'r1';
 +----------------+------------+
 | PARTITION_NAME | TABLE_ROWS |
 +----------------+------------+
 | p0             |          0 |
 | p1             |          3 |
 +----------------+------------+
 2 rows in set (0.00 sec)

Now consider a similar table 'rc1' that uses 'RANGE COLUMNS' partitioning with both columns 'a' and 'b' referenced in the 'COLUMNS' clause, created as shown here:

 CREATE TABLE rc1 (
     a INT,
     b INT
 )
 PARTITION BY RANGE COLUMNS(a, b) (
     PARTITION p0 VALUES LESS THAN (5, 12),
     PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE)
 );

If we insert exactly the same rows into 'rc1' as we just inserted into 'r1', the distribution of the rows is quite different:

 mysql> INSERT INTO rc1 VALUES (5,10), (5,11), (5,12);
 Query OK, 3 rows affected (0.00 sec)
 Records: 3  Duplicates: 0  Warnings: 0

 mysql> SELECT PARTITION_NAME, TABLE_ROWS
     ->     FROM INFORMATION_SCHEMA.PARTITIONS
     ->     WHERE TABLE_NAME = 'rc1';
 +----------------+------------+
 | PARTITION_NAME | TABLE_ROWS |
 +----------------+------------+
 | p0             |          2 |
 | p3             |          1 |
 +----------------+------------+
 2 rows in set (0.00 sec)

This is because we are comparing rows rather than scalar values. We can compare the row values inserted with the limiting row value from the 'VALUES THAN LESS THAN' clause used to define partition 'p0' in table 'rc1', like this:

 mysql> SELECT (5,10) < (5,12), (5,11) < (5,12), (5,12) < (5,12);
 +-----------------+-----------------+-----------------+
 | (5,10) < (5,12) | (5,11) < (5,12) | (5,12) < (5,12) |
 +-----------------+-----------------+-----------------+
 |               1 |               1 |               0 |
 +-----------------+-----------------+-----------------+
 1 row in set (0.00 sec)

The 2 tuples '(5,10)' and '(5,11)' evaluate as less than '(5,12)', so they are stored in partition 'p0'. Since 5 is not less than 5 and 12 is not less than 12, '(5,12)' is considered not less than '(5,12)', and is stored in partition 'p1'.

The *note 'SELECT': select. statement in the preceding example could also have been written using explicit row constructors, like this:

 SELECT ROW(5,10) < ROW(5,12), ROW(5,11) < ROW(5,12), ROW(5,12) < ROW(5,12);

For more information about the use of row constructors in MySQL, see *note row-subqueries::.

For a table partitioned by 'RANGE COLUMNS' using only a single partitioning column, the storing of rows in partitions is the same as that of an equivalent table that is partitioned by 'RANGE'. The following 'CREATE TABLE' statement creates a table partitioned by 'RANGE COLUMNS' using 1 partitioning column:

 CREATE TABLE rx (
     a INT,
     b INT
 )
 PARTITION BY RANGE COLUMNS (a)  (
     PARTITION p0 VALUES LESS THAN (5),
     PARTITION p1 VALUES LESS THAN (MAXVALUE)
 );

If we insert the rows '(5,10)', '(5,11)', and '(5,12)' into this table, we can see that their placement is the same as it is for the table 'r' we created and populated earlier:

 mysql> INSERT INTO rx VALUES (5,10), (5,11), (5,12);
 Query OK, 3 rows affected (0.00 sec)
 Records: 3  Duplicates: 0  Warnings: 0

 mysql> SELECT PARTITION_NAME,TABLE_ROWS
     ->     FROM INFORMATION_SCHEMA.PARTITIONS
     ->     WHERE TABLE_NAME = 'rx';
 +----------------+------------+
 | PARTITION_NAME | TABLE_ROWS |
 +----------------+------------+
 | p0             |          0 |
 | p1             |          3 |
 +----------------+------------+
 2 rows in set (0.00 sec)

It is also possible to create tables partitioned by 'RANGE COLUMNS' where limiting values for one or more columns are repeated in successive partition definitions. You can do this as long as the tuples of column values used to define the partitions are strictly increasing. For example, each of the following *note 'CREATE TABLE': create-table. statements is valid:

 CREATE TABLE rc2 (
     a INT,
     b INT
 )
 PARTITION BY RANGE COLUMNS(a,b) (
     PARTITION p0 VALUES LESS THAN (0,10),
     PARTITION p1 VALUES LESS THAN (10,20),
     PARTITION p2 VALUES LESS THAN (10,30),
     PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE)
  );

 CREATE TABLE rc3 (
     a INT,
     b INT
 )
 PARTITION BY RANGE COLUMNS(a,b) (
     PARTITION p0 VALUES LESS THAN (0,10),
     PARTITION p1 VALUES LESS THAN (10,20),
     PARTITION p2 VALUES LESS THAN (10,30),
     PARTITION p3 VALUES LESS THAN (10,35),
     PARTITION p4 VALUES LESS THAN (20,40),
     PARTITION p5 VALUES LESS THAN (MAXVALUE,MAXVALUE)
  );

The following statement also succeeds, even though it might appear at first glance that it would not, since the limiting value of column 'b' is 25 for partition 'p0' and 20 for partition 'p1', and the limiting value of column 'c' is 100 for partition 'p1' and 50 for partition 'p2':

 CREATE TABLE rc4 (
     a INT,
     b INT,
     c INT
 )
 PARTITION BY RANGE COLUMNS(a,b,c) (
     PARTITION p0 VALUES LESS THAN (0,25,50),
     PARTITION p1 VALUES LESS THAN (10,20,100),
     PARTITION p2 VALUES LESS THAN (10,30,50),
     PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)
  );

When designing tables partitioned by 'RANGE COLUMNS', you can always test successive partition definitions by comparing the desired tuples using the *note 'mysql': mysql. client, like this:

 mysql> SELECT (0,25,50) < (10,20,100), (10,20,100) < (10,30,50);
 +-------------------------+--------------------------+
 | (0,25,50) < (10,20,100) | (10,20,100) < (10,30,50) |
 +-------------------------+--------------------------+
 |                       1 |                        1 |
 +-------------------------+--------------------------+
 1 row in set (0.00 sec)

If a *note 'CREATE TABLE': create-table. statement contains partition definitions that are not in strictly increasing order, it fails with an error, as shown in this example:

 mysql> CREATE TABLE rcf (
     ->     a INT,
     ->     b INT,
     ->     c INT
     -> )
     -> PARTITION BY RANGE COLUMNS(a,b,c) (
     ->     PARTITION p0 VALUES LESS THAN (0,25,50),
     ->     PARTITION p1 VALUES LESS THAN (20,20,100),
     ->     PARTITION p2 VALUES LESS THAN (10,30,50),
     ->     PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)
     ->  );
 ERROR 1493 (HY000): VALUES LESS THAN value must be strictly increasing for each partition

When you get such an error, you can deduce which partition definitions are invalid by making 'less than' comparisons between their column lists. In this case, the problem is with the definition of partition 'p2' because the tuple used to define it is not less than the tuple used to define partition 'p3', as shown here:

 mysql> SELECT (0,25,50) < (20,20,100), (20,20,100) < (10,30,50);
 +-------------------------+--------------------------+
 | (0,25,50) < (20,20,100) | (20,20,100) < (10,30,50) |
 +-------------------------+--------------------------+
 |                       1 |                        0 |
 +-------------------------+--------------------------+
 1 row in set (0.00 sec)

It is also possible for 'MAXVALUE' to appear for the same column in more than one 'VALUES LESS THAN' clause when using 'RANGE COLUMNS'. However, the limiting values for individual columns in successive partition definitions should otherwise be increasing, there should be no more than one partition defined where 'MAXVALUE' is used as the upper limit for all column values, and this partition definition should appear last in the list of 'PARTITION ... VALUES LESS THAN' clauses. In addition, you cannot use 'MAXVALUE' as the limiting value for the first column in more than one partition definition.

As stated previously, it is also possible with 'RANGE COLUMNS' partitioning to use non-integer columns as partitioning columns. (See *note partitioning-columns::, for a complete listing of these.) Consider a table named 'employees' (which is not partitioned), created using the following statement:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT NOT NULL,
     store_id INT NOT NULL
 );

Using 'RANGE COLUMNS' partitioning, you can create a version of this table that stores each row in one of four partitions based on the employee's last name, like this:

 CREATE TABLE employees_by_lname (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT NOT NULL,
     store_id INT NOT NULL
 )
 PARTITION BY RANGE COLUMNS (lname)  (
     PARTITION p0 VALUES LESS THAN ('g'),
     PARTITION p1 VALUES LESS THAN ('m'),
     PARTITION p2 VALUES LESS THAN ('t'),
     PARTITION p3 VALUES LESS THAN (MAXVALUE)
 );

Alternatively, you could cause the 'employees' table as created previously to be partitioned using this scheme by executing the following *note 'ALTER TABLE': alter-table-partition-operations. statement:

 ALTER TABLE employees PARTITION BY RANGE COLUMNS (lname)  (
     PARTITION p0 VALUES LESS THAN ('g'),
     PARTITION p1 VALUES LESS THAN ('m'),
     PARTITION p2 VALUES LESS THAN ('t'),
     PARTITION p3 VALUES LESS THAN (MAXVALUE)
 );

Note:

Because different character sets and collations have different sort orders, the character sets and collations in use may effect which partition of a table partitioned by 'RANGE COLUMNS' a given row is stored in when using string columns as partitioning columns. In addition, changing the character set or collation for a given database, table, or column after such a table is created may cause changes in how rows are distributed. For example, when using a case-sensitive collation, ''and'' sorts before ''Andersen'', but when using a collation that is case-insensitive, the reverse is true.

For information about how MySQL handles character sets and collations, see *note charset::.

Similarly, you can cause the 'employees' table to be partitioned in such a way that each row is stored in one of several partitions based on the decade in which the corresponding employee was hired using the *note 'ALTER TABLE': alter-table-partition-operations. statement shown here:

 ALTER TABLE employees PARTITION BY RANGE COLUMNS (hired)  (
     PARTITION p0 VALUES LESS THAN ('1970-01-01'),
     PARTITION p1 VALUES LESS THAN ('1980-01-01'),
     PARTITION p2 VALUES LESS THAN ('1990-01-01'),
     PARTITION p3 VALUES LESS THAN ('2000-01-01'),
     PARTITION p4 VALUES LESS THAN ('2010-01-01'),
     PARTITION p5 VALUES LESS THAN (MAXVALUE)
 );

See *note create-table::, for additional information about 'PARTITION BY RANGE COLUMNS' syntax.

 File: manual.info.tmp, Node: partitioning-columns-list, Prev: partitioning-columns-range, Up: partitioning-columns

22.2.3.2 LIST COLUMNS partitioning ..................................

MySQL 5.7 provides support for 'LIST COLUMNS' partitioning. This is a variant of 'LIST' partitioning that enables the use of multiple columns as partition keys, and for columns of data types other than integer types to be used as partitioning columns; you can use string types, note 'DATE': datetime, and note 'DATETIME': datetime. columns. (For more information about permitted data types for 'COLUMNS' partitioning columns, see *note partitioning-columns::.)

Suppose that you have a business that has customers in 12 cities which, for sales and marketing purposes, you organize into 4 regions of 3 cities each as shown in the following table:

Region Cities

1 Oskarshamn, Ho"gsby, Mo"nsteraas

2 Vimmerby, Hultsfred, Va"stervik

3 Na"ssjo", Eksjo", Vetlanda

4 Uppvidinge, Alvesta, Va"xjo

With 'LIST COLUMNS' partitioning, you can create a table for customer data that assigns a row to any of 4 partitions corresponding to these regions based on the name of the city where a customer resides, as shown here:

 CREATE TABLE customers_1 (
     first_name VARCHAR(25),
     last_name VARCHAR(25),
     street_1 VARCHAR(30),
     street_2 VARCHAR(30),
     city VARCHAR(15),
     renewal DATE
 )
 PARTITION BY LIST COLUMNS(city) (
     PARTITION pRegion_1 VALUES IN('Oskarshamn', 'Ho"gsby', 'Mo"nsteraas'),
     PARTITION pRegion_2 VALUES IN('Vimmerby', 'Hultsfred', 'Va"stervik'),
     PARTITION pRegion_3 VALUES IN('Na"ssjo"', 'Eksjo"', 'Vetlanda'),
     PARTITION pRegion_4 VALUES IN('Uppvidinge', 'Alvesta', 'Va"xjo')
 );

As with partitioning by 'RANGE COLUMNS', you do not need to use expressions in the 'COLUMNS()' clause to convert column values into integers. (In fact, the use of expressions other than column names is not permitted with 'COLUMNS()'.)

It is also possible to use note 'DATE': datetime. and note 'DATETIME': datetime. columns, as shown in the following example that uses the same name and columns as the 'customers_1' table shown previously, but employs 'LIST COLUMNS' partitioning based on the 'renewal' column to store rows in one of 4 partitions depending on the week in February 2010 the customer's account is scheduled to renew:

 CREATE TABLE customers_2 (
     first_name VARCHAR(25),
     last_name VARCHAR(25),
     street_1 VARCHAR(30),
     street_2 VARCHAR(30),
     city VARCHAR(15),
     renewal DATE
 )
 PARTITION BY LIST COLUMNS(renewal) (
     PARTITION pWeek_1 VALUES IN('2010-02-01', '2010-02-02', '2010-02-03',
         '2010-02-04', '2010-02-05', '2010-02-06', '2010-02-07'),
     PARTITION pWeek_2 VALUES IN('2010-02-08', '2010-02-09', '2010-02-10',
         '2010-02-11', '2010-02-12', '2010-02-13', '2010-02-14'),
     PARTITION pWeek_3 VALUES IN('2010-02-15', '2010-02-16', '2010-02-17',
         '2010-02-18', '2010-02-19', '2010-02-20', '2010-02-21'),
     PARTITION pWeek_4 VALUES IN('2010-02-22', '2010-02-23', '2010-02-24',
         '2010-02-25', '2010-02-26', '2010-02-27', '2010-02-28')
 );

This works, but becomes cumbersome to define and maintain if the number of dates involved grows very large; in such cases, it is usually more practical to employ 'RANGE' or 'RANGE COLUMNS' partitioning instead. In this case, since the column we wish to use as the partitioning key is a *note 'DATE': datetime. column, we use 'RANGE COLUMNS' partitioning, as shown here:

 CREATE TABLE customers_3 (
     first_name VARCHAR(25),
     last_name VARCHAR(25),
     street_1 VARCHAR(30),
     street_2 VARCHAR(30),
     city VARCHAR(15),
     renewal DATE
 )
 PARTITION BY RANGE COLUMNS(renewal) (
     PARTITION pWeek_1 VALUES LESS THAN('2010-02-09'),
     PARTITION pWeek_2 VALUES LESS THAN('2010-02-15'),
     PARTITION pWeek_3 VALUES LESS THAN('2010-02-22'),
     PARTITION pWeek_4 VALUES LESS THAN('2010-03-01')
 );

See *note partitioning-columns-range::, for more information.

In addition (as with 'RANGE COLUMNS' partitioning), you can use multiple columns in the 'COLUMNS()' clause.

See *note create-table::, for additional information about 'PARTITION BY LIST COLUMNS()' syntax.

 File: manual.info.tmp, Node: partitioning-hash, Next: partitioning-key, Prev: partitioning-columns, Up: partitioning-types

22.2.4 HASH Partitioning

Partitioning by 'HASH' is used primarily to ensure an even distribution of data among a predetermined number of partitions. With range or list partitioning, you must specify explicitly into which partition a given column value or set of column values is to be stored; with hash partitioning, MySQL takes care of this for you, and you need only specify a column value or expression based on a column value to be hashed and the number of partitions into which the partitioned table is to be divided.

To partition a table using 'HASH' partitioning, it is necessary to append to the *note 'CREATE TABLE': create-table. statement a 'PARTITION BY HASH (EXPR)' clause, where EXPR is an expression that returns an integer. This can simply be the name of a column whose type is one of MySQL's integer types. In addition, you most likely want to follow this with 'PARTITIONS NUM', where NUM is a positive integer representing the number of partitions into which the table is to be divided.

Note:

For simplicity, the tables in the examples that follow do not use any keys. You should be aware that, if a table has any unique keys, every column used in the partitioning expression for this table must be part of every unique key, including the primary key. See *note partitioning-limitations-partitioning-keys-unique-keys::, for more information.

The following statement creates a table that uses hashing on the 'store_id' column and is divided into 4 partitions:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT,
     store_id INT
 )
 PARTITION BY HASH(store_id)
 PARTITIONS 4;

If you do not include a 'PARTITIONS' clause, the number of partitions defaults to '1'.

Using the 'PARTITIONS' keyword without a number following it results in a syntax error.

You can also use an SQL expression that returns an integer for EXPR. For instance, you might want to partition based on the year in which an employee was hired. This can be done as shown here:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT,
     store_id INT
 )
 PARTITION BY HASH( YEAR(hired) )
 PARTITIONS 4;

EXPR must return a nonconstant, nonrandom integer value (in other words, it should be varying but deterministic), and must not contain any prohibited constructs as described in *note partitioning-limitations::. You should also keep in mind that this expression is evaluated each time a row is inserted or updated (or possibly deleted); this means that very complex expressions may give rise to performance issues, particularly when performing operations (such as batch inserts) that affect a great many rows at one time.

The most efficient hashing function is one which operates upon a single table column and whose value increases or decreases consistently with the column value, as this allows for 'pruning' on ranges of partitions. That is, the more closely that the expression varies with the value of the column on which it is based, the more efficiently MySQL can use the expression for hash partitioning.

For example, where 'date_col' is a column of type *note 'DATE': datetime, then the expression 'TO_DAYS(date_col)' is said to vary directly with the value of 'date_col', because for every change in the value of 'date_col', the value of the expression changes in a consistent manner. The variance of the expression 'YEAR(date_col)' with respect to 'date_col' is not quite as direct as that of 'TO_DAYS(date_col)', because not every possible change in 'date_col' produces an equivalent change in 'YEAR(date_col)'. Even so, 'YEAR(date_col)' is a good candidate for a hashing function, because it varies directly with a portion of 'date_col' and there is no possible change in 'date_col' that produces a disproportionate change in 'YEAR(date_col)'.

By way of contrast, suppose that you have a column named 'int_col' whose type is *note 'INT': integer-types. Now consider the expression 'POW(5-int_col,3) + 6'. This would be a poor choice for a hashing function because a change in the value of 'int_col' is not guaranteed to produce a proportional change in the value of the expression. Changing the value of 'int_col' by a given amount can produce widely differing changes in the value of the expression. For example, changing 'int_col' from '5' to '6' produces a change of '-1' in the value of the expression, but changing the value of 'int_col' from '6' to '7' produces a change of '-7' in the expression value.

In other words, the more closely the graph of the column value versus the value of the expression follows a straight line as traced by the equation 'y=Cx' where C is some nonzero constant, the better the expression is suited to hashing. This has to do with the fact that the more nonlinear an expression is, the more uneven the distribution of data among the partitions it tends to produce.

In theory, pruning is also possible for expressions involving more than one column value, but determining which of such expressions are suitable can be quite difficult and time-consuming. For this reason, the use of hashing expressions involving multiple columns is not particularly recommended.

When 'PARTITION BY HASH' is used, MySQL determines which partition of NUM partitions to use based on the modulus of the result of the expression. In other words, for a given expression EXPR, the partition in which the record is stored is partition number N, where 'N = MOD(EXPR, NUM)'. Suppose that table 't1' is defined as follows, so that it has 4 partitions:

 CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
     PARTITION BY HASH( YEAR(col3) )
     PARTITIONS 4;

If you insert a record into 't1' whose 'col3' value is ''2005-09-15'', then the partition in which it is stored is determined as follows:

 MOD(YEAR('2005-09-01'),4)
 =  MOD(2005,4)
 =  1

MySQL 5.7 also supports a variant of 'HASH' partitioning known as linear hashing which employs a more complex algorithm for determining the placement of new rows inserted into the partitioned table. See *note partitioning-linear-hash::, for a description of this algorithm.

The user-supplied expression is evaluated each time a record is inserted or updated. It may also--depending on the circumstances--be evaluated when records are deleted.

 File: manual.info.tmp, Node: partitioning-linear-hash, Prev: partitioning-hash, Up: partitioning-hash

22.2.4.1 LINEAR HASH Partitioning .................................

MySQL also supports linear hashing, which differs from regular hashing in that linear hashing utilizes a linear powers-of-two algorithm whereas regular hashing employs the modulus of the hashing function's value.

Syntactically, the only difference between linear-hash partitioning and regular hashing is the addition of the 'LINEAR' keyword in the 'PARTITION BY' clause, as shown here:

 CREATE TABLE employees (
     id INT NOT NULL,
     fname VARCHAR(30),
     lname VARCHAR(30),
     hired DATE NOT NULL DEFAULT '1970-01-01',
     separated DATE NOT NULL DEFAULT '9999-12-31',
     job_code INT,
     store_id INT
 )
 PARTITION BY LINEAR HASH( YEAR(hired) )
 PARTITIONS 4;

Given an expression EXPR, the partition in which the record is stored when linear hashing is used is partition number N from among NUM partitions, where N is derived according to the following algorithm:

  1. Find the next power of 2 greater than NUM. We call this value V; it can be calculated as:

      V = POWER(2, CEILING(LOG(2, NUM)))

    (Suppose that NUM is 13. Then 'LOG(2,13)' is 3.7004397181411. 'CEILING(3.7004397181411)' is 4, and V = 'POWER(2,4)', which is 16.)

  2. Set N = F(COLUMN_LIST) & (V - 1).

  3. While N >= NUM:

    * Set V = V / 2
    
    * Set N = N & (V - 1)

Suppose that the table 't1', using linear hash partitioning and having 6 partitions, is created using this statement:

 CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
     PARTITION BY LINEAR HASH( YEAR(col3) )
     PARTITIONS 6;

Now assume that you want to insert two records into 't1' having the 'col3' column values ''2003-04-14'' and ''1998-10-19''. The partition number for the first of these is determined as follows:

 V = POWER(2, CEILING( LOG(2,6) )) = 8
 N = YEAR('2003-04-14') & (8 - 1)
    = 2003 & 7
    = 3

 (_3 >= 6 is FALSE: record stored in partition #3_)

The number of the partition where the second record is stored is calculated as shown here:

 V = 8
 N = YEAR('1998-10-19') & (8 - 1)
   = 1998 & 7
   = 6

 (_6 >= 6 is TRUE: additional step required_)

 N = 6 & ((8 / 2) - 1)
   = 6 & 3
   = 2

 (_2 >= 6 is FALSE: record stored in partition #2_)

The advantage in partitioning by linear hash is that the adding, dropping, merging, and splitting of partitions is made much faster, which can be beneficial when dealing with tables containing extremely large amounts (terabytes) of data. The disadvantage is that data is less likely to be evenly distributed between partitions as compared with the distribution obtained using regular hash partitioning.

 File: manual.info.tmp, Node: partitioning-key, Next: partitioning-subpartitions, Prev: partitioning-hash, Up: partitioning-types

22.2.5 KEY Partitioning

Partitioning by key is similar to partitioning by hash, except that where hash partitioning employs a user-defined expression, the hashing function for key partitioning is supplied by the MySQL server. NDB Cluster uses 'MD5()' for this purpose; for tables using other storage engines, the server employs its own internal hashing function which is based on the same algorithm as 'PASSWORD()'.

The syntax rules for 'CREATE TABLE ... PARTITION BY KEY' are similar to those for creating a table that is partitioned by hash. The major differences are listed here:

It is also possible to partition a table by linear key. Here is a simple example:

 CREATE TABLE tk (
     col1 INT NOT NULL,
     col2 CHAR(5),
     col3 DATE
 )
 PARTITION BY LINEAR KEY (col1)
 PARTITIONS 3;

Using 'LINEAR' has the same effect on 'KEY' partitioning as it does on 'HASH' partitioning, with the partition number being derived using a powers-of-two algorithm rather than modulo arithmetic. See *note partitioning-linear-hash::, for a description of this algorithm and its implications.

 File: manual.info.tmp, Node: partitioning-subpartitions, Next: partitioning-handling-nulls, Prev: partitioning-key, Up: partitioning-types

22.2.6 Subpartitioning

Subpartitioning--also known as composite partitioning--is the further division of each partition in a partitioned table. Consider the following *note 'CREATE TABLE': create-table. statement:

 CREATE TABLE ts (id INT, purchased DATE)
     PARTITION BY RANGE( YEAR(purchased) )
     SUBPARTITION BY HASH( TO_DAYS(purchased) )
     SUBPARTITIONS 2 (
         PARTITION p0 VALUES LESS THAN (1990),
         PARTITION p1 VALUES LESS THAN (2000),
         PARTITION p2 VALUES LESS THAN MAXVALUE
     );

Table 'ts' has 3 'RANGE' partitions. Each of these partitions--'p0', 'p1', and 'p2'--is further divided into 2 subpartitions. In effect, the entire table is divided into '3 * 2 = 6' partitions. However, due to the action of the 'PARTITION BY RANGE' clause, the first 2 of these store only those records with a value less than 1990 in the 'purchased' column.

In MySQL 5.7, it is possible to subpartition tables that are partitioned by 'RANGE' or 'LIST'. Subpartitions may use either 'HASH' or 'KEY' partitioning. This is also known as composite partitioning.

Note:

'SUBPARTITION BY HASH' and 'SUBPARTITION BY KEY' generally follow the same syntax rules as 'PARTITION BY HASH' and 'PARTITION BY KEY', respectively. An exception to this is that 'SUBPARTITION BY KEY' (unlike 'PARTITION BY KEY') does not currently support a default column, so the column used for this purpose must be specified, even if the table has an explicit primary key. This is a known issue which we are working to address; see *note partitioning-limitations-subpartitions::, for more information and an example.

It is also possible to define subpartitions explicitly using 'SUBPARTITION' clauses to specify options for individual subpartitions. For example, a more verbose fashion of creating the same table 'ts' as shown in the previous example would be:

 CREATE TABLE ts (id INT, purchased DATE)
     PARTITION BY RANGE( YEAR(purchased) )
     SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
         PARTITION p0 VALUES LESS THAN (1990) (
             SUBPARTITION s0,
             SUBPARTITION s1
         ),
         PARTITION p1 VALUES LESS THAN (2000) (
             SUBPARTITION s2,
             SUBPARTITION s3
         ),
         PARTITION p2 VALUES LESS THAN MAXVALUE (
             SUBPARTITION s4,
             SUBPARTITION s5
         )
     );

Some syntactical items of note are listed here:

Subpartitions can be used with especially large *note 'MyISAM': myisam-storage-engine. tables to distribute data and indexes across many disks. Suppose that you have 6 disks mounted as '/disk0', '/disk1', '/disk2', and so on. Now consider the following example:

 CREATE TABLE ts (id INT, purchased DATE)
     ENGINE = MYISAM
     PARTITION BY RANGE( YEAR(purchased) )
     SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
         PARTITION p0 VALUES LESS THAN (1990) (
             SUBPARTITION s0
                 DATA DIRECTORY = '/disk0/data'
                 INDEX DIRECTORY = '/disk0/idx',
             SUBPARTITION s1
                 DATA DIRECTORY = '/disk1/data'
                 INDEX DIRECTORY = '/disk1/idx'
         ),
         PARTITION p1 VALUES LESS THAN (2000) (
             SUBPARTITION s2
                 DATA DIRECTORY = '/disk2/data'
                 INDEX DIRECTORY = '/disk2/idx',
             SUBPARTITION s3
                 DATA DIRECTORY = '/disk3/data'
                 INDEX DIRECTORY = '/disk3/idx'
         ),
         PARTITION p2 VALUES LESS THAN MAXVALUE (
             SUBPARTITION s4
                 DATA DIRECTORY = '/disk4/data'
                 INDEX DIRECTORY = '/disk4/idx',
             SUBPARTITION s5
                 DATA DIRECTORY = '/disk5/data'
                 INDEX DIRECTORY = '/disk5/idx'
         )
     );

In this case, a separate disk is used for the data and for the indexes of each 'RANGE'. Many other variations are possible; another example might be:

 CREATE TABLE ts (id INT, purchased DATE)
     ENGINE = MYISAM
     PARTITION BY RANGE(YEAR(purchased))
     SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
         PARTITION p0 VALUES LESS THAN (1990) (
             SUBPARTITION s0a
                 DATA DIRECTORY = '/disk0'
                 INDEX DIRECTORY = '/disk1',
             SUBPARTITION s0b
                 DATA DIRECTORY = '/disk2'
                 INDEX DIRECTORY = '/disk3'
         ),
         PARTITION p1 VALUES LESS THAN (2000) (
             SUBPARTITION s1a
                 DATA DIRECTORY = '/disk4/data'
                 INDEX DIRECTORY = '/disk4/idx',
             SUBPARTITION s1b
                 DATA DIRECTORY = '/disk5/data'
                 INDEX DIRECTORY = '/disk5/idx'
         ),
         PARTITION p2 VALUES LESS THAN MAXVALUE (
             SUBPARTITION s2a,
             SUBPARTITION s2b
         )
     );

Here, the storage is as follows:

The 'DATA DIRECTORY' and 'INDEX DIRECTORY' options are not permitted in partition definitions when the 'NO_DIR_IN_CREATE' server SQL mode is in effect. In MySQL 5.7, these options are also not permitted when defining subpartitions (Bug #42954).

 File: manual.info.tmp, Node: partitioning-handling-nulls, Prev: partitioning-subpartitions, Up: partitioning-types

22.2.7 How MySQL Partitioning Handles NULL

Partitioning in MySQL does nothing to disallow 'NULL' as the value of a partitioning expression, whether it is a column value or the value of a user-supplied expression. Even though it is permitted to use 'NULL' as the value of an expression that must otherwise yield an integer, it is important to keep in mind that 'NULL' is not a number. MySQL's partitioning implementation treats 'NULL' as being less than any non-'NULL' value, just as 'ORDER BY' does.

This means that treatment of 'NULL' varies between partitioning of different types, and may produce behavior which you do not expect if you are not prepared for it. This being the case, we discuss in this section how each MySQL partitioning type handles 'NULL' values when determining the partition in which a row should be stored, and provide examples for each.

Handling of NULL with RANGE partitioning

If you insert a row into a table partitioned by 'RANGE' such that the column value used to determine the partition is 'NULL', the row is inserted into the lowest partition. Consider these two tables in a database named 'p', created as follows:

 mysql> CREATE TABLE t1 (
     ->     c1 INT,
     ->     c2 VARCHAR(20)
     -> )
     -> PARTITION BY RANGE(c1) (
     ->     PARTITION p0 VALUES LESS THAN (0),
     ->     PARTITION p1 VALUES LESS THAN (10),
     ->     PARTITION p2 VALUES LESS THAN MAXVALUE
     -> );
 Query OK, 0 rows affected (0.09 sec)

 mysql> CREATE TABLE t2 (
     ->     c1 INT,
     ->     c2 VARCHAR(20)
     -> )
     -> PARTITION BY RANGE(c1) (
     ->     PARTITION p0 VALUES LESS THAN (-5),
     ->     PARTITION p1 VALUES LESS THAN (0),
     ->     PARTITION p2 VALUES LESS THAN (10),
     ->     PARTITION p3 VALUES LESS THAN MAXVALUE
     -> );
 Query OK, 0 rows affected (0.09 sec)

You can see the partitions created by these two note 'CREATE TABLE': create-table. statements using the following query against the note 'PARTITIONS': information-schema-partitions-table. table in the 'INFORMATION_SCHEMA' database:

 mysql> SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH
      >   FROM INFORMATION_SCHEMA.PARTITIONS
      >   WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME LIKE 't_';
 +------------+----------------+------------+----------------+-------------+
 | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
 +------------+----------------+------------+----------------+-------------+
 | t1         | p0             |          0 |              0 |           0 |
 | t1         | p1             |          0 |              0 |           0 |
 | t1         | p2             |          0 |              0 |           0 |
 | t2         | p0             |          0 |              0 |           0 |
 | t2         | p1             |          0 |              0 |           0 |
 | t2         | p2             |          0 |              0 |           0 |
 | t2         | p3             |          0 |              0 |           0 |
 +------------+----------------+------------+----------------+-------------+
 7 rows in set (0.00 sec)

(For more information about this table, see note information-schema-partitions-table::.) Now let us populate each of these tables with a single row containing a 'NULL' in the column used as the partitioning key, and verify that the rows were inserted using a pair of note 'SELECT': select. statements:

 mysql> INSERT INTO t1 VALUES (NULL, 'mothra');
 Query OK, 1 row affected (0.00 sec)

 mysql> INSERT INTO t2 VALUES (NULL, 'mothra');
 Query OK, 1 row affected (0.00 sec)

 mysql> SELECT * FROM t1;
 +------+--------+
 | id   | name   |
 +------+--------+
 | NULL | mothra |
 +------+--------+
 1 row in set (0.00 sec)

 mysql> SELECT * FROM t2;
 +------+--------+
 | id   | name   |
 +------+--------+
 | NULL | mothra |
 +------+--------+
 1 row in set (0.00 sec)

You can see which partitions are used to store the inserted rows by rerunning the previous query against *note 'INFORMATION_SCHEMA.PARTITIONS': information-schema-partitions-table. and inspecting the output:

 mysql> SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH
      >   FROM INFORMATION_SCHEMA.PARTITIONS
      >   WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME LIKE 't_';
 +------------+----------------+------------+----------------+-------------+
 | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
 +------------+----------------+------------+----------------+-------------+
 _| t1         | p0             |          1 |             20 |          20 |_
 | t1         | p1             |          0 |              0 |           0 |
 | t1         | p2             |          0 |              0 |           0 |
 _| t2         | p0             |          1 |             20 |          20 |_
 | t2         | p1             |          0 |              0 |           0 |
 | t2         | p2             |          0 |              0 |           0 |
 | t2         | p3             |          0 |              0 |           0 |
 +------------+----------------+------------+----------------+-------------+
 7 rows in set (0.01 sec)

You can also demonstrate that these rows were stored in the lowest partition of each table by dropping these partitions, and then re-running the *note 'SELECT': select. statements:

 mysql> ALTER TABLE t1 DROP PARTITION p0;
 Query OK, 0 rows affected (0.16 sec)

 mysql> ALTER TABLE t2 DROP PARTITION p0;
 Query OK, 0 rows affected (0.16 sec)

 mysql> SELECT * FROM t1;
 Empty set (0.00 sec)

 mysql> SELECT * FROM t2;
 Empty set (0.00 sec)

(For more information on 'ALTER TABLE ... DROP PARTITION', see *note alter-table::.)

'NULL' is also treated in this way for partitioning expressions that use SQL functions. Suppose that we define a table using a *note 'CREATE TABLE': create-table. statement such as this one:

 CREATE TABLE tndate (
     id INT,
     dt DATE
 )
 PARTITION BY RANGE( YEAR(dt) ) (
     PARTITION p0 VALUES LESS THAN (1990),
     PARTITION p1 VALUES LESS THAN (2000),
     PARTITION p2 VALUES LESS THAN MAXVALUE
 );

As with other MySQL functions, 'YEAR(NULL)' returns 'NULL'. A row with a 'dt' column value of 'NULL' is treated as though the partitioning expression evaluated to a value less than any other value, and so is inserted into partition 'p0'.

Handling of NULL with LIST partitioning

A table that is partitioned by 'LIST' admits 'NULL' values if and only if one of its partitions is defined using that value-list that contains 'NULL'. The converse of this is that a table partitioned by 'LIST' which does not explicitly use 'NULL' in a value list rejects rows resulting in a 'NULL' value for the partitioning expression, as shown in this example:

 mysql> CREATE TABLE ts1 (
     ->     c1 INT,
     ->     c2 VARCHAR(20)
     -> )
     -> PARTITION BY LIST(c1) (
     ->     PARTITION p0 VALUES IN (0, 3, 6),
     ->     PARTITION p1 VALUES IN (1, 4, 7),
     ->     PARTITION p2 VALUES IN (2, 5, 8)
     -> );
 Query OK, 0 rows affected (0.01 sec)

 mysql> INSERT INTO ts1 VALUES (9, 'mothra');
 ERROR 1504 (HY000): Table has no partition for value 9

 mysql> INSERT INTO ts1 VALUES (NULL, 'mothra');
 ERROR 1504 (HY000): Table has no partition for value NULL

Only rows having a 'c1' value between '0' and '8' inclusive can be inserted into 'ts1'. 'NULL' falls outside this range, just like the number '9'. We can create tables 'ts2' and 'ts3' having value lists containing 'NULL', as shown here:

 mysql> CREATE TABLE ts2 (
     ->     c1 INT,
     ->     c2 VARCHAR(20)
     -> )
     -> PARTITION BY LIST(c1) (
     ->     PARTITION p0 VALUES IN (0, 3, 6),
     ->     PARTITION p1 VALUES IN (1, 4, 7),
     ->     PARTITION p2 VALUES IN (2, 5, 8),
     ->     PARTITION p3 VALUES IN (NULL)
     -> );
 Query OK, 0 rows affected (0.01 sec)

 mysql> CREATE TABLE ts3 (
     ->     c1 INT,
     ->     c2 VARCHAR(20)
     -> )
     -> PARTITION BY LIST(c1) (
     ->     PARTITION p0 VALUES IN (0, 3, 6),
     ->     PARTITION p1 VALUES IN (1, 4, 7, NULL),
     ->     PARTITION p2 VALUES IN (2, 5, 8)
     -> );
 Query OK, 0 rows affected (0.01 sec)

When defining value lists for partitioning, you can (and should) treat 'NULL' just as you would any other value. For example, both 'VALUES IN (NULL)' and 'VALUES IN (1, 4, 7, NULL)' are valid, as are 'VALUES IN (1, NULL, 4, 7)', 'VALUES IN (NULL, 1, 4, 7)', and so on. You can insert a row having 'NULL' for column 'c1' into each of the tables 'ts2' and 'ts3':

 mysql> INSERT INTO ts2 VALUES (NULL, 'mothra');
 Query OK, 1 row affected (0.00 sec)

 mysql> INSERT INTO ts3 VALUES (NULL, 'mothra');
 Query OK, 1 row affected (0.00 sec)

By issuing the appropriate query against the Information Schema *note 'PARTITIONS': information-schema-partitions-table. table, you can determine which partitions were used to store the rows just inserted (we assume, as in the previous examples, that the partitioned tables were created in the 'p' database):

 mysql> SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH
      >   FROM INFORMATION_SCHEMA.PARTITIONS
      >   WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME LIKE 'ts_';
 +------------+----------------+------------+----------------+-------------+
 | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
 +------------+----------------+------------+----------------+-------------+
 | ts2        | p0             |          0 |              0 |           0 |
 | ts2        | p1             |          0 |              0 |           0 |
 | ts2        | p2             |          0 |              0 |           0 |
 _| ts2        | p3             |          1 |             20 |          20 |_
 | ts3        | p0             |          0 |              0 |           0 |
 _| ts3        | p1             |          1 |             20 |          20 |_
 | ts3        | p2             |          0 |              0 |           0 |
 +------------+----------------+------------+----------------+-------------+
 7 rows in set (0.01 sec)

As shown earlier in this section, you can also verify which partitions were used for storing the rows by deleting these partitions and then performing a *note 'SELECT': select.

Handling of NULL with HASH and KEY partitioning

'NULL' is handled somewhat differently for tables partitioned by 'HASH' or 'KEY'. In these cases, any partition expression that yields a 'NULL' value is treated as though its return value were zero. We can verify this behavior by examining the effects on the file system of creating a table partitioned by 'HASH' and populating it with a record containing appropriate values. Suppose that you have a table 'th' (also in the 'p' database) created using the following statement:

 mysql> CREATE TABLE th (
     ->     c1 INT,
     ->     c2 VARCHAR(20)
     -> )
     -> PARTITION BY HASH(c1)
     -> PARTITIONS 2;
 Query OK, 0 rows affected (0.00 sec)

The partitions belonging to this table can be viewed using the query shown here:

 mysql> SELECT TABLE_NAME,PARTITION_NAME,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH
      >   FROM INFORMATION_SCHEMA.PARTITIONS
      >   WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME ='th';
 +------------+----------------+------------+----------------+-------------+
 | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
 +------------+----------------+------------+----------------+-------------+
 | th         | p0             |          0 |              0 |           0 |
 | th         | p1             |          0 |              0 |           0 |
 +------------+----------------+------------+----------------+-------------+
 2 rows in set (0.00 sec)

'TABLE_ROWS' for each partition is 0. Now insert two rows into 'th' whose 'c1' column values are 'NULL' and 0, and verify that these rows were inserted, as shown here:

 mysql> INSERT INTO th VALUES (NULL, 'mothra'), (0, 'gigan');
 Query OK, 1 row affected (0.00 sec)

 mysql> SELECT * FROM th;
 +------+---------+
 | c1   | c2      |
 +------+---------+
 | NULL | mothra  |
 +------+---------+
 |    0 | gigan   |
 +------+---------+
 2 rows in set (0.01 sec)

Recall that for any integer N, the value of 'NULL MOD N' is always 'NULL'. For tables that are partitioned by 'HASH' or 'KEY', this result is treated for determining the correct partition as '0'. Checking the Information Schema *note 'PARTITIONS': information-schema-partitions-table. table once again, we can see that both rows were inserted into partition 'p0':

 mysql> SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH
      >   FROM INFORMATION_SCHEMA.PARTITIONS
      >   WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME ='th';
 +------------+----------------+------------+----------------+-------------+
 | TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
 +------------+----------------+------------+----------------+-------------+
 _| th         | p0             |          2 |             20 |          20 |_
 | th         | p1             |          0 |              0 |           0 |
 +------------+----------------+------------+----------------+-------------+
 2 rows in set (0.00 sec)

By repeating the last example using 'PARTITION BY KEY' in place of 'PARTITION BY HASH' in the definition of the table, you can verify that 'NULL' is also treated like 0 for this type of partitioning.

 File: manual.info.tmp, Node: partitioning-management, Next: partitioning-pruning, Prev: partitioning-types, Up: partitioning