13.2 Data Manipulation Statements

 File: manual.info.tmp, Node: call, Next: delete, Prev: sql-data-manipulation-statements, Up: sql-data-manipulation-statements

13.2.1 CALL Statement

 CALL SP_NAME([PARAMETER[,...]])
 CALL SP_NAME[()]

The note 'CALL': call. statement invokes a stored procedure that was defined previously with note 'CREATE PROCEDURE': create-procedure.

Stored procedures that take no arguments can be invoked without parentheses. That is, 'CALL p()' and 'CALL p' are equivalent.

*note 'CALL': call. can pass back values to its caller using parameters that are declared as 'OUT' or 'INOUT' parameters. When the procedure returns, a client program can also obtain the number of rows affected for the final statement executed within the routine: At the SQL level, call the 'ROW_COUNT()' function; from the C API, call the 'mysql_affected_rows()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-affected-rows.html) function.

For information about the effect of unhandled conditions on procedure parameters, see *note conditions-and-parameters::.

To get back a value from a procedure using an 'OUT' or 'INOUT' parameter, pass the parameter by means of a user variable, and then check the value of the variable after the procedure returns. (If you are calling the procedure from within another stored procedure or function, you can also pass a routine parameter or local routine variable as an 'IN' or 'INOUT' parameter.) For an 'INOUT' parameter, initialize its value before passing it to the procedure. The following procedure has an 'OUT' parameter that the procedure sets to the current server version, and an 'INOUT' value that the procedure increments by one from its current value:

 DELIMITER //

 CREATE PROCEDURE p (OUT ver_param VARCHAR(25), INOUT incr_param INT)
 BEGIN
   # Set value of OUT parameter
   SELECT VERSION() INTO ver_param;
   # Increment value of INOUT parameter
   SET incr_param = incr_param + 1;
 END //

 DELIMITER ;

Before calling the procedure, initialize the variable to be passed as the 'INOUT' parameter. After calling the procedure, the values of the two variables have been set or modified:

 mysql> SET @increment = 10;
 mysql> CALL p(@version, @increment);
 mysql> SELECT @version, @increment;
 +----------+------------+
 | @version | @increment |
 +----------+------------+
 | 5.7.44   |         11 |
 +----------+------------+

In prepared note 'CALL': call. statements used with note 'PREPARE': prepare. and *note 'EXECUTE': execute, placeholders can be used for 'IN' parameters, 'OUT', and 'INOUT' parameters. These types of parameters can be used as follows:

 mysql> SET @increment = 10;
 mysql> PREPARE s FROM 'CALL p(?, ?)';
 mysql> EXECUTE s USING @version, @increment;
 mysql> SELECT @version, @increment;
 +----------+------------+
 | @version | @increment |
 +----------+------------+
 | 5.7.44   |         11 |
 +----------+------------+

To write C programs that use the note 'CALL': call. SQL statement to execute stored procedures that produce result sets, the 'CLIENT_MULTI_RESULTS' flag must be enabled. This is because each note 'CALL': call. returns a result to indicate the call status, in addition to any result sets that might be returned by statements executed within the procedure. 'CLIENT_MULTI_RESULTS' must also be enabled if *note 'CALL': call. is used to execute any stored procedure that contains prepared statements. It cannot be determined when such a procedure is loaded whether those statements produce result sets, so it is necessary to assume that they do so.

'CLIENT_MULTI_RESULTS' can be enabled when you call 'mysql_real_connect()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-real-connect.html), either explicitly by passing the 'CLIENT_MULTI_RESULTS' flag itself, or implicitly by passing 'CLIENT_MULTI_STATEMENTS' (which also enables 'CLIENT_MULTI_RESULTS'). 'CLIENT_MULTI_RESULTS' is enabled by default.

To process the result of a *note 'CALL': call. statement executed using 'mysql_query()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-query.html) or 'mysql_real_query()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-real-query.html), use a loop that calls 'mysql_next_result()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-next-result.html) to determine whether there are more results. For an example, see Multiple Statement Execution Support (https://dev.mysql.com/doc/c-api/5.7/en/c-api-multiple-queries.html).

C programs can use the prepared-statement interface to execute note 'CALL': call. statements and access 'OUT' and 'INOUT' parameters. This is done by processing the result of a note 'CALL': call. statement using a loop that calls 'mysql_stmt_next_result()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-stmt-next-result.html) to determine whether there are more results. For an example, see Prepared CALL Statement Support (https://dev.mysql.com/doc/c-api/5.7/en/c-api-prepared-call-statements.html). Languages that provide a MySQL interface can use prepared *note 'CALL': call. statements to directly retrieve 'OUT' and 'INOUT' procedure parameters.

Metadata changes to objects referred to by stored programs are detected and cause automatic reparsing of the affected statements when the program is next executed. For more information, see *note statement-caching::.

 File: manual.info.tmp, Node: delete, Next: do, Prev: call, Up: sql-data-manipulation-statements

13.2.2 DELETE Statement

*note 'DELETE': delete. is a DML statement that removes rows from a table.

Single-Table Syntax

 DELETE [LOW_PRIORITY] [QUICK] [IGNORE] FROM TBL_NAME
     [PARTITION (PARTITION_NAME [, PARTITION_NAME] ...)]
     [WHERE WHERE_CONDITION]
     [ORDER BY ...]
     [LIMIT ROW_COUNT]

The 'DELETE' statement deletes rows from TBL_NAME and returns the number of deleted rows. To check the number of deleted rows, call the 'ROW_COUNT()' function described in *note information-functions::.

Main Clauses

The conditions in the optional 'WHERE' clause identify which rows to delete. With no 'WHERE' clause, all rows are deleted.

WHERE_CONDITION is an expression that evaluates to true for each row to be deleted. It is specified as described in *note select::.

If the 'ORDER BY' clause is specified, the rows are deleted in the order that is specified. The 'LIMIT' clause places a limit on the number of rows that can be deleted. These clauses apply to single-table deletes, but not multi-table deletes.

Multiple-Table Syntax

 DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
     TBL_NAME[.*] [, TBL_NAME[.*]] ...
     FROM TABLE_REFERENCES
     [WHERE WHERE_CONDITION]

 DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
     FROM TBL_NAME[.*] [, TBL_NAME[.*]] ...
     USING TABLE_REFERENCES
     [WHERE WHERE_CONDITION]

Privileges

You need the 'DELETE' privilege on a table to delete rows from it. You need only the 'SELECT' privilege for any columns that are only read, such as those named in the 'WHERE' clause.

Performance

When you do not need to know the number of deleted rows, the note 'TRUNCATE TABLE': truncate-table. statement is a faster way to empty a table than a note 'DELETE': delete. statement with no 'WHERE' clause. Unlike note 'DELETE': delete, note 'TRUNCATE TABLE': truncate-table. cannot be used within a transaction or if you have a lock on the table. See note truncate-table:: and note lock-tables::.

The speed of delete operations may also be affected by factors discussed in *note delete-optimization::.

To ensure that a given note 'DELETE': delete. statement does not take too much time, the MySQL-specific 'LIMIT ROW_COUNT' clause for note 'DELETE': delete. specifies the maximum number of rows to be deleted. If the number of rows to delete is larger than the limit, repeat the 'DELETE' statement until the number of affected rows is less than the 'LIMIT' value.

Subqueries

You cannot delete from a table and select from the same table in a subquery.

Partitioned Table Support

'DELETE' supports explicit partition selection using the 'PARTITION' clause, which takes a list of the comma-separated names of one or more partitions or subpartitions (or both) from which to select rows to be dropped. Partitions not included in the list are ignored. Given a partitioned table 't' with a partition named 'p0', executing the statement 'DELETE FROM t PARTITION (p0)' has the same effect on the table as executing *note 'ALTER TABLE t TRUNCATE PARTITION (p0)': alter-table.; in both cases, all rows in partition 'p0' are dropped.

'PARTITION' can be used along with a 'WHERE' condition, in which case the condition is tested only on rows in the listed partitions. For example, 'DELETE FROM t PARTITION (p0) WHERE c < 5' deletes rows only from partition 'p0' for which the condition 'c < 5' is true; rows in any other partitions are not checked and thus not affected by the 'DELETE'.

The 'PARTITION' clause can also be used in multiple-table 'DELETE' statements. You can use up to one such option per table named in the 'FROM' option.

For more information and examples, see *note partitioning-selection::.

Auto-Increment Columns

If you delete the row containing the maximum value for an 'AUTO_INCREMENT' column, the value is not reused for a 'MyISAM' or 'InnoDB' table. If you delete all rows in the table with 'DELETE FROM TBL_NAME' (without a 'WHERE' clause) in 'autocommit' mode, the sequence starts over for all storage engines except 'InnoDB' and 'MyISAM'. There are some exceptions to this behavior for 'InnoDB' tables, as discussed in *note innodb-auto-increment-handling::.

For 'MyISAM' tables, you can specify an 'AUTO_INCREMENT' secondary column in a multiple-column key. In this case, reuse of values deleted from the top of the sequence occurs even for 'MyISAM' tables. See *note example-auto-increment::.

Modifiers

The *note 'DELETE': delete. statement supports the following modifiers:

Order of Deletion

If the *note 'DELETE': delete. statement includes an 'ORDER BY' clause, rows are deleted in the order specified by the clause. This is useful primarily in conjunction with 'LIMIT'. For example, the following statement finds rows matching the 'WHERE' clause, sorts them by 'timestamp_column', and deletes the first (oldest) one:

 DELETE FROM somelog WHERE user = 'jcole'
 ORDER BY timestamp_column LIMIT 1;

'ORDER BY' also helps to delete rows in an order required to avoid referential integrity violations.

InnoDB Tables

If you are deleting many rows from a large table, you may exceed the lock table size for an 'InnoDB' table. To avoid this problem, or simply to minimize the time that the table remains locked, the following strategy (which does not use *note 'DELETE': delete. at all) might be helpful:

  1. Select the rows not to be deleted into an empty table that has the same structure as the original table:

      INSERT INTO t_copy SELECT * FROM t WHERE ... ;
  2. Use *note 'RENAME TABLE': rename-table. to atomically move the original table out of the way and rename the copy to the original name:

      RENAME TABLE t TO t_old, t_copy TO t;
  3. Drop the original table:

      DROP TABLE t_old;

No other sessions can access the tables involved while note 'RENAME TABLE': rename-table. executes, so the rename operation is not subject to concurrency problems. See note rename-table::.

MyISAM Tables

In 'MyISAM' tables, deleted rows are maintained in a linked list and subsequent note 'INSERT': insert. operations reuse old row positions. To reclaim unused space and reduce file sizes, use the note 'OPTIMIZE TABLE': optimize-table. statement or the note 'myisamchk': myisamchk. utility to reorganize tables. note 'OPTIMIZE TABLE': optimize-table. is easier to use, but note 'myisamchk': myisamchk. is faster. See note optimize-table::, and *note myisamchk::.

The 'QUICK' modifier affects whether index leaves are merged for delete operations. 'DELETE QUICK' is most useful for applications where index values for deleted rows are replaced by similar index values from rows inserted later. In this case, the holes left by deleted values are reused.

'DELETE QUICK' is not useful when deleted values lead to underfilled index blocks spanning a range of index values for which new inserts occur again. In this case, use of 'QUICK' can lead to wasted space in the index that remains unreclaimed. Here is an example of such a scenario:

  1. Create a table that contains an indexed 'AUTO_INCREMENT' column.

  2. Insert many rows into the table. Each insert results in an index value that is added to the high end of the index.

  3. Delete a block of rows at the low end of the column range using 'DELETE QUICK'.

In this scenario, the index blocks associated with the deleted index values become underfilled but are not merged with other index blocks due to the use of 'QUICK'. They remain underfilled when new inserts occur, because new rows do not have index values in the deleted range. Furthermore, they remain underfilled even if you later use note 'DELETE': delete. without 'QUICK', unless some of the deleted index values happen to lie in index blocks within or adjacent to the underfilled blocks. To reclaim unused index space under these circumstances, use note 'OPTIMIZE TABLE': optimize-table.

If you are going to delete many rows from a table, it might be faster to use 'DELETE QUICK' followed by *note 'OPTIMIZE TABLE': optimize-table. This rebuilds the index rather than performing many index block merge operations.

Multi-Table Deletes

You can specify multiple tables in a note 'DELETE': delete. statement to delete rows from one or more tables depending on the condition in the 'WHERE' clause. You cannot use 'ORDER BY' or 'LIMIT' in a multiple-table 'DELETE'. The TABLE_REFERENCES clause lists the tables involved in the join, as described in note join::.

For the first multiple-table syntax, only matching rows from the tables listed before the 'FROM' clause are deleted. For the second multiple-table syntax, only matching rows from the tables listed in the 'FROM' clause (before the 'USING' clause) are deleted. The effect is that you can delete rows from many tables at the same time and have additional tables that are used only for searching:

 DELETE t1, t2 FROM t1 INNER JOIN t2 INNER JOIN t3
 WHERE t1.id=t2.id AND t2.id=t3.id;

Or:

 DELETE FROM t1, t2 USING t1 INNER JOIN t2 INNER JOIN t3
 WHERE t1.id=t2.id AND t2.id=t3.id;

These statements use all three tables when searching for rows to delete, but delete matching rows only from tables 't1' and 't2'.

The preceding examples use 'INNER JOIN', but multiple-table note 'DELETE': delete. statements can use other types of join permitted in note 'SELECT': select. statements, such as 'LEFT JOIN'. For example, to delete rows that exist in 't1' that have no match in 't2', use a 'LEFT JOIN':

 DELETE t1 FROM t1 LEFT JOIN t2 ON t1.id=t2.id WHERE t2.id IS NULL;

The syntax permits '.*' after each TBL_NAME for compatibility with 'Access'.

If you use a multiple-table *note 'DELETE': delete. statement involving 'InnoDB' tables for which there are foreign key constraints, the MySQL optimizer might process tables in an order that differs from that of their parent/child relationship. In this case, the statement fails and rolls back. Instead, you should delete from a single table and rely on the 'ON DELETE' capabilities that 'InnoDB' provides to cause the other tables to be modified accordingly.

Note:

If you declare an alias for a table, you must use the alias when referring to the table:

 DELETE t1 FROM test AS t1, test2 WHERE ...

Table aliases in a multiple-table *note 'DELETE': delete. should be declared only in the TABLE_REFERENCES part of the statement. Elsewhere, alias references are permitted but not alias declarations.

Correct:

 DELETE a1, a2 FROM t1 AS a1 INNER JOIN t2 AS a2
 WHERE a1.id=a2.id;

 DELETE FROM a1, a2 USING t1 AS a1 INNER JOIN t2 AS a2
 WHERE a1.id=a2.id;

Incorrect:

 DELETE t1 AS a1, t2 AS a2 FROM t1 INNER JOIN t2
 WHERE a1.id=a2.id;

 DELETE FROM t1 AS a1, t2 AS a2 USING t1 INNER JOIN t2
 WHERE a1.id=a2.id;

 File: manual.info.tmp, Node: do, Next: handler, Prev: delete, Up: sql-data-manipulation-statements

13.2.3 DO Statement

 DO EXPR [, EXPR] ...

note 'DO': do. executes the expressions but does not return any results. In most respects, note 'DO': do. is shorthand for 'SELECT EXPR, ...', but has the advantage that it is slightly faster when you do not care about the result.

*note 'DO': do. is useful primarily with functions that have side effects, such as 'RELEASE_LOCK()'.

Example: This *note 'SELECT': select. statement pauses, but also produces a result set:

 mysql> SELECT SLEEP(5);
 +----------+
 | SLEEP(5) |
 +----------+
 |        0 |
 +----------+
 1 row in set (5.02 sec)

*note 'DO': do, on the other hand, pauses without producing a result set.:

 mysql> DO SLEEP(5);
 Query OK, 0 rows affected (4.99 sec)

This could be useful, for example in a stored function or trigger, which prohibit statements that produce result sets.

*note 'DO': do. only executes expressions. It cannot be used in all cases where 'SELECT' can be used. For example, 'DO id FROM t1' is invalid because it references a table.

 File: manual.info.tmp, Node: handler, Next: insert, Prev: do, Up: sql-data-manipulation-statements

13.2.4 HANDLER Statement

 HANDLER TBL_NAME OPEN [ [AS] ALIAS]

 HANDLER TBL_NAME READ INDEX_NAME { = | <= | >= | < | > } (VALUE1,VALUE2,...)
     [ WHERE WHERE_CONDITION ] [LIMIT ... ]
 HANDLER TBL_NAME READ INDEX_NAME { FIRST | NEXT | PREV | LAST }
     [ WHERE WHERE_CONDITION ] [LIMIT ... ]
 HANDLER TBL_NAME READ { FIRST | NEXT }
     [ WHERE WHERE_CONDITION ] [LIMIT ... ]

 HANDLER TBL_NAME CLOSE

The 'HANDLER' statement provides direct access to table storage engine interfaces. It is available for 'InnoDB' and 'MyISAM' tables.

The 'HANDLER ... OPEN' statement opens a table, making it accessible using subsequent 'HANDLER ... READ' statements. This table object is not shared by other sessions and is not closed until the session calls 'HANDLER ... CLOSE' or the session terminates.

If you open the table using an alias, further references to the open table with other 'HANDLER' statements must use the alias rather than the table name. If you do not use an alias, but open the table using a table name qualified by the database name, further references must use the unqualified table name. For example, for a table opened using 'mydb.mytable', further references must use 'mytable'.

The first 'HANDLER ... READ' syntax fetches a row where the index specified satisfies the given values and the 'WHERE' condition is met. If you have a multiple-column index, specify the index column values as a comma-separated list. Either specify values for all the columns in the index, or specify values for a leftmost prefix of the index columns. Suppose that an index 'my_idx' includes three columns named 'col_a', 'col_b', and 'col_c', in that order. The 'HANDLER' statement can specify values for all three columns in the index, or for the columns in a leftmost prefix. For example:

 HANDLER ... READ my_idx = (col_a_val,col_b_val,col_c_val) ...
 HANDLER ... READ my_idx = (col_a_val,col_b_val) ...
 HANDLER ... READ my_idx = (col_a_val) ...

To employ the 'HANDLER' interface to refer to a table's 'PRIMARY KEY', use the quoted identifier 'PRIMARY':

 HANDLER TBL_NAME READ `PRIMARY` ...

The second 'HANDLER ... READ' syntax fetches a row from the table in index order that matches the 'WHERE' condition.

The third 'HANDLER ... READ' syntax fetches a row from the table in natural row order that matches the 'WHERE' condition. It is faster than 'HANDLER TBL_NAME READ INDEX_NAME' when a full table scan is desired. Natural row order is the order in which rows are stored in a 'MyISAM' table data file. This statement works for 'InnoDB' tables as well, but there is no such concept because there is no separate data file.

Without a 'LIMIT' clause, all forms of 'HANDLER ... READ' fetch a single row if one is available. To return a specific number of rows, include a 'LIMIT' clause. It has the same syntax as for the note 'SELECT': select. statement. See note select::.

'HANDLER ... CLOSE' closes a table that was opened with 'HANDLER ... OPEN'.

There are several reasons to use the 'HANDLER' interface instead of normal *note 'SELECT': select. statements:

'HANDLER' is a somewhat low-level statement. For example, it does not provide consistency. That is, 'HANDLER ... OPEN' does not take a snapshot of the table, and does not lock the table. This means that after a 'HANDLER ... OPEN' statement is issued, table data can be modified (by the current session or other sessions) and these modifications might be only partially visible to 'HANDLER ... NEXT' or 'HANDLER ... PREV' scans.

An open handler can be closed and marked for reopen, in which case the handler loses its position in the table. This occurs when both of the following circumstances are true:

note 'TRUNCATE TABLE': truncate-table. for a table closes all handlers for the table that were opened with note 'HANDLER OPEN': handler.

If a table is flushed with 'FLUSH TABLES TBL_NAME WITH READ LOCK' was opened with 'HANDLER', the handler is implicitly flushed and loses its position.

 File: manual.info.tmp, Node: insert, Next: load-data, Prev: handler, Up: sql-data-manipulation-statements

13.2.5 INSERT Statement

note 'INSERT': insert. inserts new rows into an existing table. The note 'INSERT ... VALUES': insert. and note 'INSERT ... SET': insert. forms of the statement insert rows based on explicitly specified values. The note 'INSERT ... SELECT': insert-select. form inserts rows selected from another table or tables. *note 'INSERT': insert. with an 'ON DUPLICATE KEY UPDATE' clause enables existing rows to be updated if a row to be inserted would cause a duplicate value in a 'UNIQUE' index or 'PRIMARY KEY'.

For additional information about note 'INSERT ... SELECT': insert-select. and note 'INSERT ... ON DUPLICATE KEY UPDATE': insert-on-duplicate, see note insert-select::, and note insert-on-duplicate::.

In MySQL 5.7, the 'DELAYED' keyword is accepted but ignored by the server. For the reasons for this, see *note insert-delayed::,

Inserting into a table requires the 'INSERT' privilege for the table. If the 'ON DUPLICATE KEY UPDATE' clause is used and a duplicate key causes an *note 'UPDATE': update. to be performed instead, the statement requires the 'UPDATE' privilege for the columns to be updated. For columns that are read but not modified you need only the 'SELECT' privilege (such as for a column referenced only on the right hand side of an COL_NAME=EXPR assignment in an 'ON DUPLICATE KEY UPDATE' clause).

When inserting into a partitioned table, you can control which partitions and subpartitions accept new rows. The 'PARTITION' clause takes a list of the comma-separated names of one or more partitions or subpartitions (or both) of the table. If any of the rows to be inserted by a given note 'INSERT': insert. statement do not match one of the partitions listed, the note 'INSERT': insert. statement fails with the error 'Found a row not matching the given partition set'. For more information and examples, see *note partitioning-selection::.

TBL_NAME is the table into which rows should be inserted. Specify the columns for which the statement provides values as follows:

Column values can be given in several ways:

*note 'INSERT': insert. statements that use 'VALUES' syntax can insert multiple rows. To do this, include multiple lists of comma-separated column values, with lists enclosed within parentheses and separated by commas. Example:

 INSERT INTO TBL_NAME (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);

Each values list must contain exactly as many values as are to be inserted per row. The following statement is invalid because it contains one list of nine values, rather than three lists of three values each:

 INSERT INTO TBL_NAME (a,b,c) VALUES(1,2,3,4,5,6,7,8,9);

'VALUE' is a synonym for 'VALUES' in this context. Neither implies anything about the number of values lists, nor about the number of values per list. Either may be used whether there is a single values list or multiple lists, and regardless of the number of values per list.

The affected-rows value for an note 'INSERT': insert. can be obtained using the 'ROW_COUNT()' SQL function or the 'mysql_affected_rows()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-affected-rows.html) C API function. See note information-functions::, and mysql_affected_rows() (https://dev.mysql.com/doc/c-api/5.7/en/mysql-affected-rows.html).

If you use an note 'INSERT ... VALUES': insert. statement with multiple value lists or note 'INSERT ... SELECT': insert-select, the statement returns an information string in this format:

 Records: N1 Duplicates: N2 Warnings: N3

If you are using the C API, the information string can be obtained by invoking the 'mysql_info()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-info.html) function. See mysql_info() (https://dev.mysql.com/doc/c-api/5.7/en/mysql-info.html).

'Records' indicates the number of rows processed by the statement. (This is not necessarily the number of rows actually inserted because 'Duplicates' can be nonzero.) 'Duplicates' indicates the number of rows that could not be inserted because they would duplicate some existing unique index value. 'Warnings' indicates the number of attempts to insert column values that were problematic in some way. Warnings can occur under any of the following conditions:

The *note 'INSERT': insert. statement supports the following modifiers:

An 'INSERT' statement affecting a partitioned table using a storage engine such as note 'MyISAM': myisam-storage-engine. that employs table-level locks locks only those partitions into which rows are actually inserted. (For storage engines such as note 'InnoDB': innodb-storage-engine. that employ row-level locking, no locking of partitions takes place.) For more information, see *note partitioning-limitations-locking::.

 File: manual.info.tmp, Node: insert-select, Next: insert-on-duplicate, Prev: insert, Up: insert

13.2.5.1 INSERT ... SELECT Statement ....................................

 INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
     [INTO] TBL_NAME
     [PARTITION (PARTITION_NAME [, PARTITION_NAME] ...)]
     [(COL_NAME [, COL_NAME] ...)]
     SELECT ...
     [ON DUPLICATE KEY UPDATE ASSIGNMENT_LIST]

 VALUE:
     {EXPR | DEFAULT}

 ASSIGNMENT:
     COL_NAME = VALUE

 ASSIGNMENT_LIST:
     ASSIGNMENT [, ASSIGNMENT] ...

With note 'INSERT ... SELECT': insert-select, you can quickly insert many rows into a table from the result of a note 'SELECT': select. statement, which can select from one or many tables. For example:

 INSERT INTO tbl_temp2 (fld_id)
   SELECT tbl_temp1.fld_order_id
   FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;

The following conditions hold for *note 'INSERT ... SELECT': insert-select. statements:

You can explicitly select which partitions or subpartitions (or both) of the source or target table (or both) are to be used with a 'PARTITION' clause following the name of the table. When 'PARTITION' is used with the name of the source table in the note 'SELECT': select. portion of the statement, rows are selected only from the partitions or subpartitions named in its partition list. When 'PARTITION' is used with the name of the target table for the note 'INSERT': insert. portion of the statement, it must be possible to insert all rows selected into the partitions or subpartitions named in the partition list following the option. Otherwise, the 'INSERT ... SELECT' statement fails. For more information and examples, see *note partitioning-selection::.

For note 'INSERT ... SELECT': insert-on-duplicate. statements, see note insert-on-duplicate:: for conditions under which the *note 'SELECT': select. columns can be referred to in an 'ON DUPLICATE KEY UPDATE' clause.

The order in which a note 'SELECT': select. statement with no 'ORDER BY' clause returns rows is nondeterministic. This means that, when using replication, there is no guarantee that such a note 'SELECT': select. returns rows in the same order on the source and the replica, which can lead to inconsistencies between them. To prevent this from occurring, always write 'INSERT ... SELECT' statements that are to be replicated using an 'ORDER BY' clause that produces the same row order on the source and the replica. See also *note replication-features-limit::.

Due to this issue, note 'INSERT ... SELECT ON DUPLICATE KEY UPDATE': insert-on-duplicate. and note 'INSERT IGNORE ... SELECT': insert-select. statements are flagged as unsafe for statement-based replication. Such statements produce a warning in the error log when using statement-based mode and are written to the binary log using the row-based format when using 'MIXED' mode. (Bug #11758262, Bug #50439)

See also *note replication-sbr-rbr::.

An 'INSERT ... SELECT' statement affecting partitioned tables using a storage engine such as note 'MyISAM': myisam-storage-engine. that employs table-level locks locks all partitions of the target table; however, only those partitions that are actually read from the source table are locked. (This does not occur with tables using storage engines such as note 'InnoDB': innodb-storage-engine. that employ row-level locking.) For more information, see *note partitioning-limitations-locking::.

 File: manual.info.tmp, Node: insert-on-duplicate, Next: insert-delayed, Prev: insert-select, Up: insert

13.2.5.2 INSERT ... ON DUPLICATE KEY UPDATE Statement .....................................................

If you specify an 'ON DUPLICATE KEY UPDATE' clause and a row to be inserted would cause a duplicate value in a 'UNIQUE' index or 'PRIMARY KEY', an *note 'UPDATE': update. of the old row occurs. For example, if column 'a' is declared as 'UNIQUE' and contains the value '1', the following two statements have similar effect:

 INSERT INTO t1 (a,b,c) VALUES (1,2,3)
   ON DUPLICATE KEY UPDATE c=c+1;

 UPDATE t1 SET c=c+1 WHERE a=1;

The effects are not quite identical: For an 'InnoDB' table where 'a' is an auto-increment column, the 'INSERT' statement increases the auto-increment value but the 'UPDATE' does not.

If column 'b' is also unique, the note 'INSERT': insert. is equivalent to this note 'UPDATE': update. statement instead:

 UPDATE t1 SET c=c+1 WHERE a=1 OR b=2 LIMIT 1;

If 'a=1 OR b=2' matches several rows, only one row is updated. In general, you should try to avoid using an 'ON DUPLICATE KEY UPDATE' clause on tables with multiple unique indexes.

With 'ON DUPLICATE KEY UPDATE', the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the 'CLIENT_FOUND_ROWS' flag to the 'mysql_real_connect()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-real-connect.html) C API function when connecting to *note 'mysqld': mysqld, the affected-rows value is 1 (not 0) if an existing row is set to its current values.

If a table contains an 'AUTO_INCREMENT' column and *note 'INSERT ... ON DUPLICATE KEY UPDATE': insert-on-duplicate. inserts or updates a row, the 'LAST_INSERT_ID()' function returns the 'AUTO_INCREMENT' value.

The 'ON DUPLICATE KEY UPDATE' clause can contain multiple column assignments, separated by commas.

It is possible to use 'IGNORE' with 'ON DUPLICATE KEY UPDATE' in an 'INSERT' statement, but this may not behave as you expect when inserting multiple rows into a table that has multiple unique keys. This becomes apparent when an updated value is itself a duplicate key value. Consider the table 't', created and populated by the statements shown here:

 mysql> CREATE TABLE t (a SERIAL, b BIGINT NOT NULL, UNIQUE KEY (b));;
 Query OK, 0 rows affected (0.03 sec)

 mysql> INSERT INTO t VALUES (1,1), (2,2);
 Query OK, 2 rows affected (0.01 sec)
 Records: 2  Duplicates: 0  Warnings: 0

 mysql> SELECT * FROM t;
 +---+---+
 | a | b |
 +---+---+
 | 1 | 1 |
 | 2 | 2 |
 +---+---+
 2 rows in set (0.00 sec)

Now we attempt to insert two rows, one of which contains a duplicate key value, using 'ON DUPLICATE KEY UPDATE', where the 'UPDATE' clause itself results in a duplicate key value:

 mysql> INSERT INTO t VALUES (2,3), (3,3) ON DUPLICATE KEY UPDATE a=a+1, b=b-1;
 ERROR 1062 (23000): Duplicate entry '1' for key 't.b'
 mysql> SELECT * FROM t;
 +---+---+
 | a | b |
 +---+---+
 | 1 | 1 |
 | 2 | 2 |
 +---+---+
 2 rows in set (0.00 sec)

The first row contains a duplicate value for one of the table's unique keys (column 'a'), but 'b=b+1' in the 'UPDATE' clause results in a unique key violation for column 'b'; the statement is immediately rejected with an error, and no rows are updated. Let us repeat the statement, this time adding the IGNORE keyword, like this:

 mysql> INSERT IGNORE INTO t VALUES (2,3), (3,3)
     -> ON DUPLICATE KEY UPDATE a=a+1, b=b-1;
 Query OK, 1 row affected, 1 warning (0.00 sec)
 Records: 2  Duplicates: 1  Warnings: 1

This time, the previous error is demoted to a warning, as shown here:

 mysql> SHOW WARNINGS;
 +---------+------+-----------------------------------+
 | Level   | Code | Message                           |
 +---------+------+-----------------------------------+
 | Warning | 1062 | Duplicate entry '1' for key 't.b' |
 +---------+------+-----------------------------------+
 1 row in set (0.00 sec)

Because the statement was not rejected, execution continues. This means that the second row is inserted into 't', as we can see here:

 mysql> SELECT * FROM t;
 +---+---+
 | a | b |
 +---+---+
 | 1 | 1 |
 | 2 | 2 |
 | 3 | 3 |
 +---+---+
 3 rows in set (0.00 sec)

In assignment value expressions in the 'ON DUPLICATE KEY UPDATE' clause, you can use the 'VALUES(COL_NAME)' function to refer to column values from the note 'INSERT': insert. portion of the note 'INSERT ... ON DUPLICATE KEY UPDATE': insert-on-duplicate. statement. In other words, 'VALUES(COL_NAME)' in the 'ON DUPLICATE KEY UPDATE' clause refers to the value of COL_NAME that would be inserted, had no duplicate-key conflict occurred. This function is especially useful in multiple-row inserts. The 'VALUES()' function is meaningful only as an introducer for 'INSERT' statement value lists, or in the 'ON DUPLICATE KEY UPDATE' clause of an *note 'INSERT': insert. statement, and returns 'NULL' otherwise. For example:

 INSERT INTO t1 (a,b,c) VALUES (1,2,3),(4,5,6)
   ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);

That statement is identical to the following two statements:

 INSERT INTO t1 (a,b,c) VALUES (1,2,3)
   ON DUPLICATE KEY UPDATE c=3;
 INSERT INTO t1 (a,b,c) VALUES (4,5,6)
   ON DUPLICATE KEY UPDATE c=9;

For *note 'INSERT ... SELECT': insert-on-duplicate. statements, these rules apply regarding acceptable forms of 'SELECT' query expressions that you can refer to in an 'ON DUPLICATE KEY UPDATE' clause:

References to columns from a note 'UNION': union. do not work reliably. To work around this restriction, rewrite the note 'UNION': union. as a derived table so that its rows can be treated as a single-table result set. For example, this statement can produce incorrect results:

 INSERT INTO t1 (a, b)
   SELECT c, d FROM t2
   UNION
   SELECT e, f FROM t3
 ON DUPLICATE KEY UPDATE b = b + c;

Instead, use an equivalent statement that rewrites the *note 'UNION': union. as a derived table:

 INSERT INTO t1 (a, b)
 SELECT * FROM
   (SELECT c, d FROM t2
    UNION
    SELECT e, f FROM t3) AS dt
 ON DUPLICATE KEY UPDATE b = b + c;

The technique of rewriting a query as a derived table also enables references to columns from 'GROUP BY' queries.

Because the results of note 'INSERT ... SELECT': insert-select. statements depend on the ordering of rows from the note 'SELECT': select. and this order cannot always be guaranteed, it is possible when logging note 'INSERT ... SELECT ON DUPLICATE KEY UPDATE': insert-on-duplicate. statements for the source and the replica to diverge. Thus, note 'INSERT ... SELECT ON DUPLICATE KEY UPDATE': insert-on-duplicate. statements are flagged as unsafe for statement-based replication. Such statements produce a warning in the error log when using statement-based mode and are written to the binary log using the row-based format when using 'MIXED' mode. An *note 'INSERT ... ON DUPLICATE KEY UPDATE': insert-on-duplicate. statement against a table having more than one unique or primary key is also marked as unsafe. (Bug #11765650, Bug #58637)

See also *note replication-sbr-rbr::.

An 'INSERT ... ON DUPLICATE KEY UPDATE' on a partitioned table using a storage engine such as note 'MyISAM': myisam-storage-engine. that employs table-level locks locks any partitions of the table in which a partitioning key column is updated. (This does not occur with tables using storage engines such as note 'InnoDB': innodb-storage-engine. that employ row-level locking.) For more information, see *note partitioning-limitations-locking::.

 File: manual.info.tmp, Node: insert-delayed, Prev: insert-on-duplicate, Up: insert

13.2.5.3 INSERT DELAYED Statement .................................

 INSERT DELAYED ...

The 'DELAYED' option for the note 'INSERT': insert. statement is a MySQL extension to standard SQL. In previous versions of MySQL, it can be used for certain kinds of tables (such as 'MyISAM'), such that when a client uses note 'INSERT DELAYED': insert-delayed, it gets an okay from the server at once, and the row is queued to be inserted when the table is not in use by any other thread.

'DELAYED' inserts and replaces were deprecated in MySQL 5.6. In MySQL 5.7, 'DELAYED' is not supported. The server recognizes but ignores the 'DELAYED' keyword, handles the insert as a nondelayed insert, and generates an 'ER_WARN_LEGACY_SYNTAX_CONVERTED' warning: 'INSERT DELAYED is no longer supported. The statement was converted to INSERT'. The 'DELAYED' keyword is scheduled for removal in a future release.

 File: manual.info.tmp, Node: load-data, Next: load-xml, Prev: insert, Up: sql-data-manipulation-statements

13.2.6 LOAD DATA Statement

 LOAD DATA
     [LOW_PRIORITY | CONCURRENT] [LOCAL]
     INFILE 'FILE_NAME'
     [REPLACE | IGNORE]
     INTO TABLE TBL_NAME
     [PARTITION (PARTITION_NAME [, PARTITION_NAME] ...)]
     [CHARACTER SET CHARSET_NAME]
     [{FIELDS | COLUMNS}
         [TERMINATED BY 'STRING']
         [[OPTIONALLY] ENCLOSED BY 'CHAR']
         [ESCAPED BY 'CHAR']
     ]
     [LINES
         [STARTING BY 'STRING']
         [TERMINATED BY 'STRING']
     ]
     [IGNORE NUMBER {LINES | ROWS}]
     [(COL_NAME_OR_USER_VAR
         [, COL_NAME_OR_USER_VAR] ...)]
     [SET COL_NAME={EXPR | DEFAULT}
         [, COL_NAME={EXPR | DEFAULT}] ...]

The *note 'LOAD DATA': load-data. statement reads rows from a text file into a table at a very high speed. The file can be read from the server host or the client host, depending on whether the 'LOCAL' modifier is given. 'LOCAL' also affects data interpretation and error handling.

note 'LOAD DATA': load-data. is the complement of note 'SELECT ... INTO OUTFILE': select-into. (See note select-into::.) To write data from a table to a file, use note 'SELECT ... INTO OUTFILE': select-into. To read the file back into a table, use *note 'LOAD DATA': load-data. The syntax of the 'FIELDS' and 'LINES' clauses is the same for both statements.

The note 'mysqlimport': mysqlimport. utility provides another way to load data files; it operates by sending a note 'LOAD DATA': load-data. statement to the server. See *note mysqlimport::.

For information about the efficiency of note 'INSERT': insert. versus note 'LOAD DATA': load-data. and speeding up note 'LOAD DATA': load-data, see note insert-optimization::.

Non-LOCAL Versus LOCAL Operation

The 'LOCAL' modifier affects these aspects of *note 'LOAD DATA': load-data, compared to non-'LOCAL' operation:

'LOCAL' works only if the server and your client both have been configured to permit it. For example, if note 'mysqld': mysqld. was started with the 'local_infile' system variable disabled, 'LOCAL' produces an error. See note load-data-local-security::.

Input File Character Set

The file name must be given as a literal string. On Windows, specify backslashes in path names as forward slashes or doubled backslashes. The server interprets the file name using the character set indicated by the 'character_set_filesystem' system variable.

By default, the server interprets the file contents using the character set indicated by the 'character_set_database' system variable. If the file contents use a character set different from this default, it is a good idea to specify that character set by using the 'CHARACTER SET' clause. A character set of 'binary' specifies 'no conversion.'

*note 'SET NAMES': set-names. and the setting of 'character_set_client' do not affect interpretation of file contents.

note 'LOAD DATA': load-data. interprets all fields in the file as having the same character set, regardless of the data types of the columns into which field values are loaded. For proper interpretation of the file, you must ensure that it was written with the correct character set. For example, if you write a data file with note 'mysqldump -T': mysqldump. or by issuing a note 'SELECT ... INTO OUTFILE': select-into. statement in note 'mysql': mysql, be sure to use a '--default-character-set' option to write output in the character set to be used when the file is loaded with *note 'LOAD DATA': load-data.

Note:

It is not possible to load data files that use the 'ucs2', 'utf16', 'utf16le', or 'utf32' character set.

Input File Location

These rules determine the *note 'LOAD DATA': load-data. input file location:

The non-'LOCAL' rules mean that the server reads a file named as './myfile.txt' relative to its data directory, whereas it reads a file named as 'myfile.txt' from the database directory of the default database. For example, if the following *note 'LOAD DATA': load-data. statement is executed while 'db1' is the default database, the server reads the file 'data.txt' from the database directory for 'db1', even though the statement explicitly loads the file into a table in the 'db2' database:

 LOAD DATA INFILE 'data.txt' INTO TABLE db2.my_table;

Security Requirements

For a non-'LOCAL' load operation, the server reads a text file located on the server host, so these security requirements must be satisified:

For a 'LOCAL' load operation, the client program reads a text file located on the client host. Because the file contents are sent over the connection by the client to the server, using 'LOCAL' is a bit slower than when the server accesses the file directly. On the other hand, you do not need the 'FILE' privilege, and the file can be located in any directory the client program can access.

Duplicate-Key and Error Handling

The 'REPLACE' and 'IGNORE' modifiers control handling of new (input) rows that duplicate existing table rows on unique key values ('PRIMARY KEY' or 'UNIQUE' index values):

Unless 'REPLACE' is also specified, the 'LOCAL' modifier has the same effect as 'IGNORE'. This occurs because the server has no way to stop transmission of the file in the middle of the operation.

If none of 'REPLACE', 'IGNORE', or 'LOCAL' is specified, an error occurs when a duplicate key value is found, and the rest of the text file is ignored.

In addition to affecting duplicate-key handling as just described, 'IGNORE' and 'LOCAL' also affect error handling:

Index Handling

To ignore foreign key constraints during the load operation, execute a 'SET foreign_key_checks = 0' statement before executing *note 'LOAD DATA': load-data.

If you use note 'LOAD DATA': load-data. on an empty 'MyISAM' table, all nonunique indexes are created in a separate batch (as for note 'REPAIR TABLE': repair-table.). Normally, this makes note 'LOAD DATA': load-data. much faster when you have many indexes. In some extreme cases, you can create the indexes even faster by turning them off with note 'ALTER TABLE ... DISABLE KEYS': alter-table. before loading the file into the table and re-creating the indexes with note 'ALTER TABLE ... ENABLE KEYS': alter-table. after loading the file. See note insert-optimization::.

Field and Line Handling

For both the note 'LOAD DATA': load-data. and note 'SELECT ... INTO OUTFILE': select-into. statements, the syntax of the 'FIELDS' and 'LINES' clauses is the same. Both clauses are optional, but 'FIELDS' must precede 'LINES' if both are specified.

If you specify a 'FIELDS' clause, each of its subclauses ('TERMINATED BY', '[OPTIONALLY] ENCLOSED BY', and 'ESCAPED BY') is also optional, except that you must specify at least one of them. Arguments to these clauses are permitted to contain only ASCII characters.

If you specify no 'FIELDS' or 'LINES' clause, the defaults are the same as if you had written this:

 FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
 LINES TERMINATED BY '\n' STARTING BY ''

Backslash is the MySQL escape character within strings in SQL statements. Thus, to specify a literal backslash, you must specify two backslashes for the value to be interpreted as a single backslash. The escape sequences '''' and '''' specify tab and newline characters, respectively.

In other words, the defaults cause *note 'LOAD DATA': load-data. to act as follows when reading input:

Conversely, the defaults cause *note 'SELECT ... INTO OUTFILE': select-into. to act as follows when writing output:

Note:

For a text file generated on a Windows system, proper file reading might require 'LINES TERMINATED BY ''' because Windows programs typically use two characters as a line terminator. Some programs, such as 'WordPad', might use '' as a line terminator when writing files. To read such files, use 'LINES TERMINATED BY '''.

If all the input lines have a common prefix that you want to ignore, you can use 'LINES STARTING BY 'PREFIX_STRING'' to skip the prefix and anything before it. If a line does not include the prefix, the entire line is skipped. Suppose that you issue the following statement:

 LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test
   FIELDS TERMINATED BY ','  LINES STARTING BY 'xxx';

If the data file looks like this:

 xxx"abc",1
 something xxx"def",2
 "ghi",3

The resulting rows are '("abc",1)' and '("def",2)'. The third row in the file is skipped because it does not contain the prefix.

The 'IGNORE NUMBER LINES' clause can be used to ignore lines at the start of the file. For example, you can use 'IGNORE 1 LINES' to skip an initial header line containing column names:

 LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test IGNORE 1 LINES;

When you use note 'SELECT ... INTO OUTFILE': select-into. in tandem with note 'LOAD DATA': load-data. to write data from a database into a file and then read the file back into the database later, the field- and line-handling options for both statements must match. Otherwise, note 'LOAD DATA': load-data. does not interpret the contents of the file properly. Suppose that you use note 'SELECT ... INTO OUTFILE': select-into. to write a file with fields delimited by commas:

 SELECT * INTO OUTFILE 'data.txt'
   FIELDS TERMINATED BY ','
   FROM table2;

To read the comma-delimited file, the correct statement is:

 LOAD DATA INFILE 'data.txt' INTO TABLE table2
   FIELDS TERMINATED BY ',';

If instead you tried to read the file with the statement shown following, it would not work because it instructs *note 'LOAD DATA': load-data. to look for tabs between fields:

 LOAD DATA INFILE 'data.txt' INTO TABLE table2
   FIELDS TERMINATED BY '\t';

The likely result is that each input line would be interpreted as a single field.

*note 'LOAD DATA': load-data. can be used to read files obtained from external sources. For example, many programs can export data in comma-separated values (CSV) format, such that lines have fields separated by commas and enclosed within double quotation marks, with an initial line of column names. If the lines in such a file are terminated by carriage return/newline pairs, the statement shown here illustrates the field- and line-handling options you would use to load the file:

 LOAD DATA INFILE 'data.txt' INTO TABLE TBL_NAME
   FIELDS TERMINATED BY ',' ENCLOSED BY '"'
   LINES TERMINATED BY '\r\n'
   IGNORE 1 LINES;

If the input values are not necessarily enclosed within quotation marks, use 'OPTIONALLY' before the 'ENCLOSED BY' option.

Any of the field- or line-handling options can specify an empty string (''''). If not empty, the 'FIELDS [OPTIONALLY] ENCLOSED BY' and 'FIELDS ESCAPED BY' values must be a single character. The 'FIELDS TERMINATED BY', 'LINES STARTING BY', and 'LINES TERMINATED BY' values can be more than one character. For example, to write lines that are terminated by carriage return/linefeed pairs, or to read a file containing such lines, specify a 'LINES TERMINATED BY ''' clause.

To read a file containing jokes that are separated by lines consisting of '%%', you can do this

 CREATE TABLE jokes
   (a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
   joke TEXT NOT NULL);
 LOAD DATA INFILE '/tmp/jokes.txt' INTO TABLE jokes
   FIELDS TERMINATED BY ''
   LINES TERMINATED BY '\n%%\n' (joke);

'FIELDS [OPTIONALLY] ENCLOSED BY' controls quoting of fields. For output (*note 'SELECT ... INTO OUTFILE': select-into.), if you omit the word 'OPTIONALLY', all fields are enclosed by the 'ENCLOSED BY' character. An example of such output (using a comma as the field delimiter) is shown here:

 "1","a string","100.20"
 "2","a string containing a , comma","102.20"
 "3","a string containing a \" quote","102.20"
 "4","a string containing a \", quote and comma","102.20"

If you specify 'OPTIONALLY', the 'ENCLOSED BY' character is used only to enclose values from columns that have a string data type (such as note 'CHAR': char, note 'BINARY': binary-varbinary, note 'TEXT': blob, or note 'ENUM': enum.):

 1,"a string",100.20
 2,"a string containing a , comma",102.20
 3,"a string containing a \" quote",102.20
 4,"a string containing a \", quote and comma",102.20

Occurrences of the 'ENCLOSED BY' character within a field value are escaped by prefixing them with the 'ESCAPED BY' character. Also, if you specify an empty 'ESCAPED BY' value, it is possible to inadvertently generate output that cannot be read properly by *note 'LOAD DATA': load-data. For example, the preceding output just shown would appear as follows if the escape character is empty. Observe that the second field in the fourth line contains a comma following the quote, which (erroneously) appears to terminate the field:

 1,"a string",100.20
 2,"a string containing a , comma",102.20
 3,"a string containing a " quote",102.20
 4,"a string containing a ", quote and comma",102.20

For input, the 'ENCLOSED BY' character, if present, is stripped from the ends of field values. (This is true regardless of whether 'OPTIONALLY' is specified; 'OPTIONALLY' has no effect on input interpretation.) Occurrences of the 'ENCLOSED BY' character preceded by the 'ESCAPED BY' character are interpreted as part of the current field value.

If the field begins with the 'ENCLOSED BY' character, instances of that character are recognized as terminating a field value only if followed by the field or line 'TERMINATED BY' sequence. To avoid ambiguity, occurrences of the 'ENCLOSED BY' character within a field value can be doubled and are interpreted as a single instance of the character. For example, if 'ENCLOSED BY '"'' is specified, quotation marks are handled as shown here:

 "The ""BIG"" boss"  -> The "BIG" boss
 The "BIG" boss      -> The "BIG" boss
 The ""BIG"" boss    -> The ""BIG"" boss

'FIELDS ESCAPED BY' controls how to read or write special characters:

In certain cases, field- and line-handling options interact:

Handling of 'NULL' values varies according to the 'FIELDS' and 'LINES' options in use:

An attempt to load 'NULL' into a 'NOT NULL' column produces either a warning or an error according to the rules described in *note load-data-column-assignments::.

Some cases are not supported by *note 'LOAD DATA': load-data.:

Column List Specification

The following example loads all columns of the 'persondata' table:

 LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata;

By default, when no column list is provided at the end of the *note 'LOAD DATA': load-data. statement, input lines are expected to contain a field for each table column. If you want to load only some of a table's columns, specify a column list:

 LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata
 (COL_NAME_OR_USER_VAR [, COL_NAME_OR_USER_VAR] ...);

You must also specify a column list if the order of the fields in the input file differs from the order of the columns in the table. Otherwise, MySQL cannot tell how to match input fields with table columns.

Input Preprocessing

Each instance of COL_NAME_OR_USER_VAR in *note 'LOAD DATA': load-data. syntax is either a column name or a user variable. With user variables, the 'SET' clause enables you to perform preprocessing transformations on their values before assigning the result to columns.

User variables in the 'SET' clause can be used in several ways. The following example uses the first input column directly for the value of 't1.column1', and assigns the second input column to a user variable that is subjected to a division operation before being used for the value of 't1.column2':

 LOAD DATA INFILE 'file.txt'
   INTO TABLE t1
   (column1, @var1)
   SET column2 = @var1/100;

The 'SET' clause can be used to supply values not derived from the input file. The following statement sets 'column3' to the current date and time:

 LOAD DATA INFILE 'file.txt'
   INTO TABLE t1
   (column1, column2)
   SET column3 = CURRENT_TIMESTAMP;

You can also discard an input value by assigning it to a user variable and not assigning the variable to any table column:

 LOAD DATA INFILE 'file.txt'
   INTO TABLE t1
   (column1, @dummy, column2, @dummy, column3);

Use of the column/variable list and 'SET' clause is subject to the following restrictions:

Column Value Assignment

To process an input line, *note 'LOAD DATA': load-data. splits it into fields and uses the values according to the column/variable list and the 'SET' clause, if they are present. Then the resulting row is inserted into the table. If there are 'BEFORE INSERT' or 'AFTER INSERT' triggers for the table, they are activated before or after inserting the row, respectively.

Interpretation of field values and assignment to table columns depends on these factors:

Those factors combine to produce restrictive or nonrestrictive data interpretation by *note 'LOAD DATA': load-data.:

Restrictive data interpretation uses these rules:

By contrast, nonrestrictive data interpretation uses these rules:

note 'TIMESTAMP': datetime. columns are set to the current date and time only if there is a 'NULL' value for the column (that is, '') and the column is not declared to permit 'NULL' values, or if the note 'TIMESTAMP': datetime. column default value is the current timestamp and it is omitted from the field list when a field list is specified.

note 'LOAD DATA': load-data. regards all input as strings, so you cannot use numeric values for note 'ENUM': enum. or note 'SET': set. columns the way you can with note 'INSERT': insert. statements. All note 'ENUM': enum. and note 'SET': set. values must be specified as strings.

note 'BIT': bit-type. values cannot be loaded directly using binary notation (for example, 'b'011010''). To work around this, use the 'SET' clause to strip off the leading 'b'' and trailing ''' and perform a base-2 to base-10 conversion so that MySQL loads the values into the note 'BIT': bit-type. column properly:

 $> cat /tmp/bit_test.txt
 b'10'
 b'1111111'
 $> mysql test
 mysql> LOAD DATA INFILE '/tmp/bit_test.txt'
        INTO TABLE bit_test (@var1)
        SET b = CAST(CONV(MID(@var1, 3, LENGTH(@var1)-3), 2, 10) AS UNSIGNED);
 Query OK, 2 rows affected (0.00 sec)
 Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

 mysql> SELECT BIN(b+0) FROM bit_test;
 +----------+
 | BIN(b+0) |
 +----------+
 | 10       |
 | 1111111  |
 +----------+
 2 rows in set (0.00 sec)

For *note 'BIT': bit-type. values in '0b' binary notation (for example, '0b011010'), use this 'SET' clause instead to strip off the leading '0b':

 SET b = CAST(CONV(MID(@var1, 3, LENGTH(@var1)-2), 2, 10) AS UNSIGNED)

Partitioned Table Support

note 'LOAD DATA': load-data. supports explicit partition selection using the 'PARTITION' clause with a list of one or more comma-separated names of partitions, subpartitions, or both. When this clause is used, if any rows from the file cannot be inserted into any of the partitions or subpartitions named in the list, the statement fails with the error 'Found a row not matching the given partition set'. For more information and examples, see note partitioning-selection::.

For partitioned tables using storage engines that employ table locks, such as note 'MyISAM': myisam-storage-engine, note 'LOAD DATA': load-data. cannot prune any partition locks. This does not apply to tables using storage engines that employ row-level locking, such as note 'InnoDB': innodb-storage-engine. For more information, see note partitioning-limitations-locking::.

Concurrency Considerations

With the 'LOW_PRIORITY' modifier, execution of the *note 'LOAD DATA': load-data. statement is delayed until no other clients are reading from the table. This affects only storage engines that use only table-level locking (such as 'MyISAM', 'MEMORY', and 'MERGE').

With the 'CONCURRENT' modifier and a 'MyISAM' table that satisfies the condition for concurrent inserts (that is, it contains no free blocks in the middle), other threads can retrieve data from the table while note 'LOAD DATA': load-data. is executing. This modifier affects the performance of note 'LOAD DATA': load-data. a bit, even if no other thread is using the table at the same time.

Statement Result Information

When the *note 'LOAD DATA': load-data. statement finishes, it returns an information string in the following format:

 Records: 1  Deleted: 0  Skipped: 0  Warnings: 0

Warnings occur under the same circumstances as when values are inserted using the note 'INSERT': insert. statement (see note insert::), except that *note 'LOAD DATA': load-data. also generates warnings when there are too few or too many fields in the input row.

You can use note 'SHOW WARNINGS': show-warnings. to get a list of the first 'max_error_count' warnings as information about what went wrong. See note show-warnings::.

If you are using the C API, you can get information about the statement by calling the 'mysql_info()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-info.html) function. See mysql_info() (https://dev.mysql.com/doc/c-api/5.7/en/mysql-info.html).

Replication Considerations

For information about note 'LOAD DATA': load-data. in relation to replication, see note replication-features-load-data::.

Miscellaneous Topics

On Unix, if you need *note 'LOAD DATA': load-data. to read from a pipe, you can use the following technique (the example loads a listing of the '/' directory into the table 'db1.t1'):

 mkfifo /mysql/data/db1/ls.dat
 chmod 666 /mysql/data/db1/ls.dat
 find / -ls > /mysql/data/db1/ls.dat &
 mysql -e "LOAD DATA INFILE 'ls.dat' INTO TABLE t1" db1

Here you must run the command that generates the data to be loaded and the note 'mysql': mysql. commands either on separate terminals, or run the data generation process in the background (as shown in the preceding example). If you do not do this, the pipe blocks until data is read by the note 'mysql': mysql. process.

 File: manual.info.tmp, Node: load-xml, Next: replace, Prev: load-data, Up: sql-data-manipulation-statements

13.2.7 LOAD XML Statement

 LOAD XML
     [LOW_PRIORITY | CONCURRENT] [LOCAL]
     INFILE 'FILE_NAME'
     [REPLACE | IGNORE]
     INTO TABLE [DB_NAME.]TBL_NAME
     [CHARACTER SET CHARSET_NAME]
     [ROWS IDENTIFIED BY '<TAGNAME>']
     [IGNORE NUMBER {LINES | ROWS}]
     [(FIELD_NAME_OR_USER_VAR
         [, FIELD_NAME_OR_USER_VAR] ...)]
     [SET COL_NAME={EXPR | DEFAULT}
         [, COL_NAME={EXPR | DEFAULT}] ...]

The *note 'LOAD XML': load-xml. statement reads data from an XML file into a table. The FILE_NAME must be given as a literal string. The TAGNAME in the optional 'ROWS IDENTIFIED BY' clause must also be given as a literal string, and must be surrounded by angle brackets ('<' and '>').

note 'LOAD XML': load-xml. acts as the complement of running the note 'mysql': mysql. client in XML output mode (that is, starting the client with the '--xml' option). To write data from a table to an XML file, you can invoke the *note 'mysql': mysql. client with the '--xml' and '-e' options from the system shell, as shown here:

 $> mysql --xml -e 'SELECT * FROM mydb.mytable' > file.xml

To read the file back into a table, use *note 'LOAD XML': load-xml. By default, the '' element is considered to be the equivalent of a database table row; this can be changed using the 'ROWS IDENTIFIED BY' clause.

This statement supports three different XML formats:

All three formats can be used in the same XML file; the import routine automatically detects the format for each row and interprets it correctly. Tags are matched based on the tag or attribute name and the column name.

In MySQL 5.7, 'LOAD XML' does not support 'CDATA' sections in the source XML. This limitation is removed in MySQL 8.0. (Bug #30753708, Bug #98199)

The following clauses work essentially the same way for note 'LOAD XML': load-xml. as they do for note 'LOAD DATA': load-data.:

See *note load-data::, for more information about these clauses.

'(FIELD_NAME_OR_USER_VAR, ...)' is a list of one or more comma-separated XML fields or user variables. The name of a user variable used for this purpose must match the name of a field from the XML file, prefixed with '@'. You can use field names to select only desired fields. User variables can be employed to store the corresponding field values for subsequent re-use.

The 'IGNORE NUMBER LINES' or 'IGNORE NUMBER ROWS' clause causes the first NUMBER rows in the XML file to be skipped. It is analogous to the *note 'LOAD DATA': load-data. statement's 'IGNORE ... LINES' clause.

Suppose that we have a table named 'person', created as shown here:

 USE test;

 CREATE TABLE person (
     person_id INT NOT NULL PRIMARY KEY,
     fname VARCHAR(40) NULL,
     lname VARCHAR(40) NULL,
     created TIMESTAMP
 );

Suppose further that this table is initially empty.

Now suppose that we have a simple XML file 'person.xml', whose contents are as shown here:

 <list>
   <person person_id="1" fname="Kapek" lname="Sainnouine"/>
   <person person_id="2" fname="Sajon" lname="Rondela"/>
   <person person_id="3"><fname>Likame</fname><lname>O"rrtmons</lname></person>
   <person person_id="4"><fname>Slar</fname><lname>Manlanth</lname></person>
   <person><field name="person_id">5</field><field name="fname">Stoma</field>
     <field name="lname">Milu</field></person>
   <person><field name="person_id">6</field><field name="fname">Nirtam</field>
     <field name="lname">Sklo"d</field></person>
   <person person_id="7"><fname>Sungam</fname><lname>Dulbaad</lname></person>
   <person person_id="8" fname="Sraref" lname="Encmelt"/>
 </list>

Each of the permissible XML formats discussed previously is represented in this example file.

To import the data in 'person.xml' into the 'person' table, you can use this statement:

 mysql> LOAD XML LOCAL INFILE 'person.xml'
     ->   INTO TABLE person
     ->   ROWS IDENTIFIED BY '<person>';

 Query OK, 8 rows affected (0.00 sec)
 Records: 8  Deleted: 0  Skipped: 0  Warnings: 0

Here, we assume that 'person.xml' is located in the MySQL data directory. If the file cannot be found, the following error results:

 ERROR 2 (HY000): File '/person.xml' not found (Errcode: 2)

The 'ROWS IDENTIFIED BY ''' clause means that each '' element in the XML file is considered equivalent to a row in the table into which the data is to be imported. In this case, this is the 'person' table in the 'test' database.

As can be seen by the response from the server, 8 rows were imported into the 'test.person' table. This can be verified by a simple *note 'SELECT': select. statement:

 mysql> SELECT * FROM person;
 +-----------+--------+------------+---------------------+
 | person_id | fname  | lname      | created             |
 +-----------+--------+------------+---------------------+
 |         1 | Kapek  | Sainnouine | 2007-07-13 16:18:47 |
 |         2 | Sajon  | Rondela    | 2007-07-13 16:18:47 |
 |         3 | Likame | O"rrtmons   | 2007-07-13 16:18:47 |
 |         4 | Slar   | Manlanth   | 2007-07-13 16:18:47 |
 |         5 | Stoma  | Nilu       | 2007-07-13 16:18:47 |
 |         6 | Nirtam | Sklo"d      | 2007-07-13 16:18:47 |
 |         7 | Sungam | Dulbaad     | 2007-07-13 16:18:47 |
 |         8 | Sreraf | Encmelt    | 2007-07-13 16:18:47 |
 +-----------+--------+------------+---------------------+
 8 rows in set (0.00 sec)

This shows, as stated earlier in this section, that any or all of the 3 permitted XML formats may appear in a single file and be read using *note 'LOAD XML': load-xml.

The inverse of the import operation just shown--that is, dumping MySQL table data into an XML file--can be accomplished using the *note 'mysql': mysql. client from the system shell, as shown here:

 $> mysql --xml -e "SELECT * FROM test.person" > person-dump.xml
 $> cat person-dump.xml
 <?xml version="1.0"?>

 <resultset statement="SELECT * FROM test.person" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <row>
    <field name="person_id">1</field>
    <field name="fname">Kapek</field>
    <field name="lname">Sainnouine</field>
   </row>

   <row>
    <field name="person_id">2</field>
    <field name="fname">Sajon</field>
    <field name="lname">Rondela</field>
   </row>

   <row>
    <field name="person_id">3</field>
    <field name="fname">Likema</field>
    <field name="lname">O"rrtmons</field>
   </row>

   <row>
    <field name="person_id">4</field>
    <field name="fname">Slar</field>
    <field name="lname">Manlanth</field>
   </row>

   <row>
    <field name="person_id">5</field>
    <field name="fname">Stoma</field>
    <field name="lname">Nilu</field>
   </row>

   <row>
    <field name="person_id">6</field>
    <field name="fname">Nirtam</field>
    <field name="lname">Sklo"d</field>
   </row>

   <row>
    <field name="person_id">7</field>
    <field name="fname">Sungam</field>
    <field name="lname">Dulbaad</field>
   </row>

   <row>
    <field name="person_id">8</field>
    <field name="fname">Sreraf</field>
    <field name="lname">Encmelt</field>
   </row>
 </resultset>

Note:

The '--xml' option causes the note 'mysql': mysql. client to use XML formatting for its output; the '-e' option causes the client to execute the SQL statement immediately following the option. See note mysql::.

You can verify that the dump is valid by creating a copy of the 'person' table and importing the dump file into the new table, like this:

 mysql> USE test;
 mysql> CREATE TABLE person2 LIKE person;
 Query OK, 0 rows affected (0.00 sec)

 mysql> LOAD XML LOCAL INFILE 'person-dump.xml'
     ->   INTO TABLE person2;
 Query OK, 8 rows affected (0.01 sec)
 Records: 8  Deleted: 0  Skipped: 0  Warnings: 0

 mysql> SELECT * FROM person2;
 +-----------+--------+------------+---------------------+
 | person_id | fname  | lname      | created             |
 +-----------+--------+------------+---------------------+
 |         1 | Kapek  | Sainnouine | 2007-07-13 16:18:47 |
 |         2 | Sajon  | Rondela    | 2007-07-13 16:18:47 |
 |         3 | Likema | O"rrtmons   | 2007-07-13 16:18:47 |
 |         4 | Slar   | Manlanth   | 2007-07-13 16:18:47 |
 |         5 | Stoma  | Nilu       | 2007-07-13 16:18:47 |
 |         6 | Nirtam | Sklo"d      | 2007-07-13 16:18:47 |
 |         7 | Sungam | Dulbaad     | 2007-07-13 16:18:47 |
 |         8 | Sreraf | Encmelt    | 2007-07-13 16:18:47 |
 +-----------+--------+------------+---------------------+
 8 rows in set (0.00 sec)

There is no requirement that every field in the XML file be matched with a column in the corresponding table. Fields which have no corresponding columns are skipped. You can see this by first emptying the 'person2' table and dropping the 'created' column, then using the same *note 'LOAD XML': load-xml. statement we just employed previously, like this:

 mysql> TRUNCATE person2;
 Query OK, 8 rows affected (0.26 sec)

 mysql> ALTER TABLE person2 DROP COLUMN created;
 Query OK, 0 rows affected (0.52 sec)
 Records: 0  Duplicates: 0  Warnings: 0

 mysql> SHOW CREATE TABLE person2\G
 *************************** 1. row ***************************
        Table: person2
 Create Table: CREATE TABLE `person2` (
   `person_id` int(11) NOT NULL,
   `fname` varchar(40) DEFAULT NULL,
   `lname` varchar(40) DEFAULT NULL,
   PRIMARY KEY (`person_id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8
 1 row in set (0.00 sec)

 mysql> LOAD XML LOCAL INFILE 'person-dump.xml'
     ->   INTO TABLE person2;
 Query OK, 8 rows affected (0.01 sec)
 Records: 8  Deleted: 0  Skipped: 0  Warnings: 0

 mysql> SELECT * FROM person2;
 +-----------+--------+------------+
 | person_id | fname  | lname      |
 +-----------+--------+------------+
 |         1 | Kapek  | Sainnouine |
 |         2 | Sajon  | Rondela    |
 |         3 | Likema | O"rrtmons   |
 |         4 | Slar   | Manlanth   |
 |         5 | Stoma  | Nilu       |
 |         6 | Nirtam | Sklo"d      |
 |         7 | Sungam | Dulbaad     |
 |         8 | Sreraf | Encmelt    |
 +-----------+--------+------------+
 8 rows in set (0.00 sec)

The order in which the fields are given within each row of the XML file does not affect the operation of *note 'LOAD XML': load-xml.; the field order can vary from row to row, and is not required to be in the same order as the corresponding columns in the table.

As mentioned previously, you can use a '(FIELD_NAME_OR_USER_VAR, ...)' list of one or more XML fields (to select desired fields only) or user variables (to store the corresponding field values for later use). User variables can be especially useful when you want to insert data from an XML file into table columns whose names do not match those of the XML fields. To see how this works, we first create a table named 'individual' whose structure matches that of the 'person' table, but whose columns are named differently:

 mysql> CREATE TABLE individual (
     ->     individual_id INT NOT NULL PRIMARY KEY,
     ->     name1 VARCHAR(40) NULL,
     ->     name2 VARCHAR(40) NULL,
     ->     made TIMESTAMP
     -> );
 Query OK, 0 rows affected (0.42 sec)

In this case, you cannot simply load the XML file directly into the table, because the field and column names do not match:

 mysql> LOAD XML INFILE '../bin/person-dump.xml' INTO TABLE test.individual;
 ERROR 1263 (22004): Column set to default value; NULL supplied to NOT NULL column 'individual_id' at row 1

This happens because the MySQL server looks for field names matching the column names of the target table. You can work around this problem by selecting the field values into user variables, then setting the target table's columns equal to the values of those variables using 'SET'. You can perform both of these operations in a single statement, as shown here:

 mysql> LOAD XML INFILE '../bin/person-dump.xml'
     ->     INTO TABLE test.individual (@person_id, @fname, @lname, @created)
     ->     SET individual_id=@person_id, name1=@fname, name2=@lname, made=@created;
 Query OK, 8 rows affected (0.05 sec)
 Records: 8  Deleted: 0  Skipped: 0  Warnings: 0

 mysql> SELECT * FROM individual;
 +---------------+--------+------------+---------------------+
 | individual_id | name1  | name2      | made                |
 +---------------+--------+------------+---------------------+
 |             1 | Kapek  | Sainnouine | 2007-07-13 16:18:47 |
 |             2 | Sajon  | Rondela    | 2007-07-13 16:18:47 |
 |             3 | Likema | O"rrtmons   | 2007-07-13 16:18:47 |
 |             4 | Slar   | Manlanth   | 2007-07-13 16:18:47 |
 |             5 | Stoma  | Nilu       | 2007-07-13 16:18:47 |
 |             6 | Nirtam | Sklo"d      | 2007-07-13 16:18:47 |
 |             7 | Sungam | Dulbaad     | 2007-07-13 16:18:47 |
 |             8 | Srraf  | Encmelt    | 2007-07-13 16:18:47 |
 +---------------+--------+------------+---------------------+
 8 rows in set (0.00 sec)

The names of the user variables must match those of the corresponding fields from the XML file, with the addition of the required '@' prefix to indicate that they are variables. The user variables need not be listed or assigned in the same order as the corresponding fields.

Using a 'ROWS IDENTIFIED BY ''' clause, it is possible to import data from the same XML file into database tables with different definitions. For this example, suppose that you have a file named 'address.xml' which contains the following XML:

 <?xml version="1.0"?>

 <list>
   <person person_id="1">
     <fname>Robert</fname>
     <lname>Jones</lname>
     <address address_id="1" street="Mill Creek Road" zip="45365" city="Sidney"/>
     <address address_id="2" street="Main Street" zip="28681" city="Taylorsville"/>
   </person>

   <person person_id="2">
     <fname>Mary</fname>
     <lname>Smith</lname>
     <address address_id="3" street="River Road" zip="80239" city="Denver"/>
     <!-- <address address_id="4" street="North Street" zip="37920" city="Knoxville"/> -->
   </person>

 </list>

You can again use the 'test.person' table as defined previously in this section, after clearing all the existing records from the table and then showing its structure as shown here:

 mysql< TRUNCATE person;
 Query OK, 0 rows affected (0.04 sec)

 mysql< SHOW CREATE TABLE person\G
 *************************** 1. row ***************************
        Table: person
 Create Table: CREATE TABLE `person` (
   `person_id` int(11) NOT NULL,
   `fname` varchar(40) DEFAULT NULL,
   `lname` varchar(40) DEFAULT NULL,
   `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
   PRIMARY KEY (`person_id`)
 ) ENGINE=MyISAM DEFAULT CHARSET=latin1
 1 row in set (0.00 sec)

Now create an 'address' table in the 'test' database using the following *note 'CREATE TABLE': create-table. statement:

 CREATE TABLE address (
     address_id INT NOT NULL PRIMARY KEY,
     person_id INT NULL,
     street VARCHAR(40) NULL,
     zip INT NULL,
     city VARCHAR(40) NULL,
     created TIMESTAMP
 );

To import the data from the XML file into the 'person' table, execute the following *note 'LOAD XML': load-xml. statement, which specifies that rows are to be specified by the '' element, as shown here;

 mysql> LOAD XML LOCAL INFILE 'address.xml'
     ->   INTO TABLE person
     ->   ROWS IDENTIFIED BY '<person>';
 Query OK, 2 rows affected (0.00 sec)
 Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

You can verify that the records were imported using a *note 'SELECT': select. statement:

 mysql> SELECT * FROM person;
 +-----------+--------+-------+---------------------+
 | person_id | fname  | lname | created             |
 +-----------+--------+-------+---------------------+
 |         1 | Robert | Jones | 2007-07-24 17:37:06 |
 |         2 | Mary   | Smith | 2007-07-24 17:37:06 |
 +-----------+--------+-------+---------------------+
 2 rows in set (0.00 sec)
Since the '

' elements in the XML file have no corresponding columns in the 'person' table, they are skipped.

To import the data from the '

' elements into the 'address' table, use the *note 'LOAD XML': load-xml. statement shown here:

 mysql> LOAD XML LOCAL INFILE 'address.xml'
     ->   INTO TABLE address
     ->   ROWS IDENTIFIED BY '<address>';
 Query OK, 3 rows affected (0.00 sec)
 Records: 3  Deleted: 0  Skipped: 0  Warnings: 0

You can see that the data was imported using a *note 'SELECT': select. statement such as this one:

 mysql> SELECT * FROM address;
 +------------+-----------+-----------------+-------+--------------+---------------------+
 | address_id | person_id | street          | zip   | city         | created             |
 +------------+-----------+-----------------+-------+--------------+---------------------+
 |          1 |         1 | Mill Creek Road | 45365 | Sidney       | 2007-07-24 17:37:37 |
 |          2 |         1 | Main Street     | 28681 | Taylorsville | 2007-07-24 17:37:37 |
 |          3 |         2 | River Road      | 80239 | Denver       | 2007-07-24 17:37:37 |
 +------------+-----------+-----------------+-------+--------------+---------------------+
 3 rows in set (0.00 sec)
The data from the '
' element that is enclosed in XML comments is not imported. However, since there is a 'person_id' column in the 'address' table, the value of the 'person_id' attribute from the parent '' element for each '

' is imported into the 'address' table.

Security Considerations

As with the note 'LOAD DATA': load-data. statement, the transfer of the XML file from the client host to the server host is initiated by the MySQL server. In theory, a patched server could be built that would tell the client program to transfer a file of the server's choosing rather than the file named by the client in the note 'LOAD XML': load-xml. statement. Such a server could access any file on the client host to which the client user has read access.

In a Web environment, clients usually connect to MySQL from a Web server. A user that can run any command against the MySQL server can use *note 'LOAD XML LOCAL': load-xml. to read any files to which the Web server process has read access. In this environment, the client with respect to the MySQL server is actually the Web server, not the remote program being run by the user who connects to the Web server.

You can disable loading of XML files from clients by starting the server with '--local-infile=0' or '--local-infile=OFF'. This option can also be used when starting the note 'mysql': mysql. client to disable note 'LOAD XML': load-xml. for the duration of the client session.

To prevent a client from loading XML files from the server, do not grant the 'FILE' privilege to the corresponding MySQL user account, or revoke this privilege if the client user account already has it.

Important:

Revoking the 'FILE' privilege (or not granting it in the first place) keeps the user only from executing the note 'LOAD XML': load-xml. statement (as well as the 'LOAD_FILE()' function; it does not prevent the user from executing note 'LOAD XML LOCAL': load-xml. To disallow this statement, you must start the server or the client with '--local-infile=OFF'.

In other words, the 'FILE' privilege affects only whether the client can read files on the server; it has no bearing on whether the client can read files on the local file system.

For partitioned tables using storage engines that employ table locks, such as note 'MyISAM': myisam-storage-engine, any locks caused by note 'LOAD XML': load-xml. perform locks on all partitions of the table. This does not apply to tables using storage engines which employ row-level locking, such as note 'InnoDB': innodb-storage-engine. For more information, see note partitioning-limitations-locking::.

 File: manual.info.tmp, Node: replace, Next: select, Prev: load-xml, Up: sql-data-manipulation-statements

13.2.8 REPLACE Statement

 REPLACE [LOW_PRIORITY | DELAYED]
     [INTO] TBL_NAME
     [PARTITION (PARTITION_NAME [, PARTITION_NAME] ...)]
     [(COL_NAME [, COL_NAME] ...)]
     {VALUES | VALUE} (VALUE_LIST) [, (VALUE_LIST)] ...

 REPLACE [LOW_PRIORITY | DELAYED]
     [INTO] TBL_NAME
     [PARTITION (PARTITION_NAME [, PARTITION_NAME] ...)]
     SET ASSIGNMENT_LIST

 REPLACE [LOW_PRIORITY | DELAYED]
     [INTO] TBL_NAME
     [PARTITION (PARTITION_NAME [, PARTITION_NAME] ...)]
     [(COL_NAME [, COL_NAME] ...)]
     SELECT ...

 VALUE:
     {EXPR | DEFAULT}

 VALUE_LIST:
     VALUE [, VALUE] ...

 ASSIGNMENT:
     COL_NAME = VALUE

 ASSIGNMENT_LIST:
     ASSIGNMENT [, ASSIGNMENT] ...

note 'REPLACE': replace. works exactly like note 'INSERT': insert, except that if an old row in the table has the same value as a new row for a 'PRIMARY KEY' or a 'UNIQUE' index, the old row is deleted before the new row is inserted. See *note insert::.

note 'REPLACE': replace. is a MySQL extension to the SQL standard. It either inserts, or deletes and inserts. For another MySQL extension to standard SQL--that either inserts or updates--see note insert-on-duplicate::.

'DELAYED' inserts and replaces were deprecated in MySQL 5.6. In MySQL 5.7, 'DELAYED' is not supported. The server recognizes but ignores the 'DELAYED' keyword, handles the replace as a nondelayed replace, and generates an 'ER_WARN_LEGACY_SYNTAX_CONVERTED' warning: 'REPLACE DELAYED is no longer supported. The statement was converted to REPLACE'. The 'DELAYED' keyword is scheduled for removal in a future release. release.

Note:

note 'REPLACE': replace. makes sense only if a table has a 'PRIMARY KEY' or 'UNIQUE' index. Otherwise, it becomes equivalent to note 'INSERT': insert, because there is no index to be used to determine whether a new row duplicates another.

Values for all columns are taken from the values specified in the note 'REPLACE': replace. statement. Any missing columns are set to their default values, just as happens for note 'INSERT': insert. You cannot refer to values from the current row and use them in the new row. If you use an assignment such as 'SET COL_NAME = COL_NAME + 1', the reference to the column name on the right hand side is treated as 'DEFAULT(COL_NAME)', so the assignment is equivalent to 'SET COL_NAME = DEFAULT(COL_NAME) + 1'.

To use *note 'REPLACE': replace, you must have both the 'INSERT' and 'DELETE' privileges for the table.

If a generated column is replaced explicitly, the only permitted value is 'DEFAULT'. For information about generated columns, see *note create-table-generated-columns::.

'REPLACE' supports explicit partition selection using the 'PARTITION' clause with a list of comma-separated names of partitions, subpartitions, or both. As with note 'INSERT': insert, if it is not possible to insert the new row into any of these partitions or subpartitions, the 'REPLACE' statement fails with the error 'Found a row not matching the given partition set'. For more information and examples, see note partitioning-selection::.

The note 'REPLACE': replace. statement returns a count to indicate the number of rows affected. This is the sum of the rows deleted and inserted. If the count is 1 for a single-row note 'REPLACE': replace, a row was inserted and no rows were deleted. If the count is greater than 1, one or more old rows were deleted before the new row was inserted. It is possible for a single row to replace more than one old row if the table contains multiple unique indexes and the new row duplicates values for different old rows in different unique indexes.

The affected-rows count makes it easy to determine whether *note 'REPLACE': replace. only added a row or whether it also replaced any rows: Check whether the count is 1 (added) or greater (replaced).

If you are using the C API, the affected-rows count can be obtained using the 'mysql_affected_rows()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-affected-rows.html) function.

You cannot replace into a table and select from the same table in a subquery.

MySQL uses the following algorithm for note 'REPLACE': replace. (and note 'LOAD DATA ... REPLACE': load-data.):

  1. Try to insert the new row into the table

  2. While the insertion fails because a duplicate-key error occurs for a primary key or unique index:

    1. Delete from the table the conflicting row that has the duplicate key value

    2. Try again to insert the new row into the table

It is possible that in the case of a duplicate-key error, a storage engine may perform the 'REPLACE' as an update rather than a delete plus insert, but the semantics are the same. There are no user-visible effects other than a possible difference in how the storage engine increments 'Handler_XXX' status variables.

Because the results of 'REPLACE ... SELECT' statements depend on the ordering of rows from the note 'SELECT': select. and this order cannot always be guaranteed, it is possible when logging these statements for the source and the replica to diverge. For this reason, 'REPLACE ... SELECT' statements are flagged as unsafe for statement-based replication. such statements produce a warning in the error log when using statement-based mode and are written to the binary log using the row-based format when using 'MIXED' mode. See also note replication-sbr-rbr::.

When modifying an existing table that is not partitioned to accommodate partitioning, or, when modifying the partitioning of an already partitioned table, you may consider altering the table's primary key (see note partitioning-limitations-partitioning-keys-unique-keys::). You should be aware that, if you do this, the results of 'REPLACE' statements may be affected, just as they would be if you modified the primary key of a nonpartitioned table. Consider the table created by the following note 'CREATE TABLE': create-table. statement:

 CREATE TABLE test (
   id INT UNSIGNED NOT NULL AUTO_INCREMENT,
   data VARCHAR(64) DEFAULT NULL,
   ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
   PRIMARY KEY (id)
 );

When we create this table and run the statements shown in the mysql client, the result is as follows:

 mysql> REPLACE INTO test VALUES (1, 'Old', '2014-08-20 18:47:00');
 Query OK, 1 row affected (0.04 sec)

 mysql> REPLACE INTO test VALUES (1, 'New', '2014-08-20 18:47:42');
 Query OK, 2 rows affected (0.04 sec)

 mysql> SELECT * FROM test;
 +----+------+---------------------+
 | id | data | ts                  |
 +----+------+---------------------+
 |  1 | New  | 2014-08-20 18:47:42 |
 +----+------+---------------------+
 1 row in set (0.00 sec)

Now we create a second table almost identical to the first, except that the primary key now covers 2 columns, as shown here (emphasized text):

 CREATE TABLE test2 (
   id INT UNSIGNED NOT NULL AUTO_INCREMENT,
   data VARCHAR(64) DEFAULT NULL,
   ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
   _PRIMARY KEY (id, ts)_
 );

When we run on 'test2' the same two 'REPLACE' statements as we did on the original 'test' table, we obtain a different result:

 mysql> REPLACE INTO test2 VALUES (1, 'Old', '2014-08-20 18:47:00');
 Query OK, 1 row affected (0.05 sec)

 mysql> REPLACE INTO test2 VALUES (1, 'New', '2014-08-20 18:47:42');
 Query OK, 1 row affected (0.06 sec)

 mysql> SELECT * FROM test2;
 +----+------+---------------------+
 | id | data | ts                  |
 +----+------+---------------------+
 |  1 | Old  | 2014-08-20 18:47:00 |
 |  1 | New  | 2014-08-20 18:47:42 |
 +----+------+---------------------+
 2 rows in set (0.00 sec)

This is due to the fact that, when run on 'test2', both the 'id' and 'ts' column values must match those of an existing row for the row to be replaced; otherwise, a row is inserted.

A 'REPLACE' statement affecting a partitioned table using a storage engine such as note 'MyISAM': myisam-storage-engine. that employs table-level locks locks only those partitions containing rows that match the 'REPLACE' statement 'WHERE' clause, as long as none of the table partitioning columns are updated; otherwise the entire table is locked. (For storage engines such as note 'InnoDB': innodb-storage-engine. that employ row-level locking, no locking of partitions takes place.) For more information, see *note partitioning-limitations-locking::.

 File: manual.info.tmp, Node: select, Next: subqueries, Prev: replace, Up: sql-data-manipulation-statements

13.2.9 SELECT Statement

note 'SELECT': select. is used to retrieve rows selected from one or more tables, and can include note 'UNION': union. statements and subqueries. See note union::, and note subqueries::.

The most commonly used clauses of *note 'SELECT': select. statements are these:

*note 'SELECT': select. can also be used to retrieve rows computed without reference to any table.

For example:

 mysql> SELECT 1 + 1;
         -> 2

You are permitted to specify 'DUAL' as a dummy table name in situations where no tables are referenced:

 mysql> SELECT 1 + 1 FROM DUAL;
         -> 2

'DUAL' is purely for the convenience of people who require that all *note 'SELECT': select. statements should have 'FROM' and possibly other clauses. MySQL may ignore the clauses. MySQL does not require 'FROM DUAL' if no tables are referenced.

In general, clauses used must be given in exactly the order shown in the syntax description. For example, a 'HAVING' clause must come after any 'GROUP BY' clause and before any 'ORDER BY' clause. The 'INTO' clause, if present, can appear in any position indicated by the syntax description, but within a given statement can appear only once, not in multiple positions. For more information about 'INTO', see *note select-into::.

The list of SELECT_EXPR terms comprises the select list that indicates which columns to retrieve. Terms specify a column or expression or can use '*'-shorthand:

The following list provides additional information about other 'SELECT' clauses:

Following the *note 'SELECT': select. keyword, you can use a number of modifiers that affect the operation of the statement. 'HIGH_PRIORITY', 'STRAIGHT_JOIN', and modifiers beginning with 'SQL_' are MySQL extensions to standard SQL.

A 'SELECT' from a partitioned table using a storage engine such as note 'MyISAM': myisam-storage-engine. that employs table-level locks locks only those partitions containing rows that match the 'SELECT' statement 'WHERE' clause. (This does not occur with storage engines such as note 'InnoDB': innodb-storage-engine. that employ row-level locking.) For more information, see *note partitioning-limitations-locking::.

 File: manual.info.tmp, Node: select-into, Next: join, Prev: select, Up: select

13.2.9.1 SELECT ... INTO Statement ..................................

The note 'SELECT ... INTO': select-into. form of note 'SELECT': select. enables a query result to be stored in variables or written to a file:

A given note 'SELECT': select. statement can contain at most one 'INTO' clause, although as shown by the note 'SELECT': select. syntax description (see *note select::), the 'INTO' can appear in different positions:

An 'INTO' clause should not be used in a nested note 'SELECT': select. because such a note 'SELECT': select. must return its result to the outer context. There are also constraints on the use of 'INTO' within note 'UNION': union. statements; see note union::.

For the 'INTO VAR_LIST' variant:

User variable names are not case-sensitive. See *note user-variables::.

The note 'SELECT ... INTO OUTFILE 'FILE_NAME'': select-into. form of note 'SELECT': select. writes the selected rows to a file. The file is created on the server host, so you must have the 'FILE' privilege to use this syntax. FILE_NAME cannot be an existing file, which among other things prevents files such as '/etc/passwd' and database tables from being modified. The 'character_set_filesystem' system variable controls the interpretation of the file name.

The note 'SELECT ... INTO OUTFILE': select-into. statement is intended to enable dumping a table to a text file on the server host. To create the resulting file on some other host, note 'SELECT ... INTO OUTFILE': select-into. normally is unsuitable because there is no way to write a path to the file relative to the server host file system, unless the location of the file on the remote host can be accessed using a network-mapped path on the server host file system.

Alternatively, if the MySQL client software is installed on the remote host, you can use a client command such as 'mysql -e "SELECT ..." > FILE_NAME' to generate the file on that host.

note 'SELECT ... INTO OUTFILE': select-into. is the complement of note 'LOAD DATA': load-data. Column values are written converted to the character set specified in the 'CHARACTER SET' clause. If no such clause is present, values are dumped using the 'binary' character set. In effect, there is no character set conversion. If a result set contains columns in several character sets, so does the output data file and it may not be possible to reload the file correctly.

The syntax for the EXPORT_OPTIONS part of the statement consists of the same 'FIELDS' and 'LINES' clauses that are used with the note 'LOAD DATA': load-data. statement. For more detailed information about the 'FIELDS' and 'LINES' clauses, including their default values and permissible values, see note load-data::.

'FIELDS ESCAPED BY' controls how to write special characters. If the 'FIELDS ESCAPED BY' character is not empty, it is used when necessary to avoid ambiguity as a prefix that precedes following characters on output:

The 'FIELDS TERMINATED BY', 'ENCLOSED BY', 'ESCAPED BY', or 'LINES TERMINATED BY' characters must be escaped so that you can read the file back in reliably. ASCII 'NUL' is escaped to make it easier to view with some pagers.

The resulting file need not conform to SQL syntax, so nothing else need be escaped.

If the 'FIELDS ESCAPED BY' character is empty, no characters are escaped and 'NULL' is output as 'NULL', not ''. It is probably not a good idea to specify an empty escape character, particularly if field values in your data contain any of the characters in the list just given.

Here is an example that produces a file in the comma-separated values (CSV) format used by many programs:

 SELECT a,b,a+b INTO OUTFILE '/tmp/result.txt'
   FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
   LINES TERMINATED BY '\n'
   FROM test_table;

If you use 'INTO DUMPFILE' instead of 'INTO OUTFILE', MySQL writes only one row into the file, without any column or line termination and without performing any escape processing. This is useful for selecting a *note 'BLOB': blob. value and storing it in a file.

Note:

Any file created by 'INTO OUTFILE' or 'INTO DUMPFILE' is writable by all users on the server host. The reason for this is that the MySQL server cannot create a file that is owned by anyone other than the user under whose account it is running. (You should never run *note 'mysqld': mysqld. as 'root' for this and other reasons.) The file thus must be world-writable so that you can manipulate its contents.

If the 'secure_file_priv' system variable is set to a nonempty directory name, the file to be written must be located in that directory.

In the context of note 'SELECT ... INTO': select-into. statements that occur as part of events executed by the Event Scheduler, diagnostics messages (not only errors, but also warnings) are written to the error log, and, on Windows, to the application event log. For additional information, see note events-status-info::.

 File: manual.info.tmp, Node: join, Next: union, Prev: select-into, Up: select

13.2.9.2 JOIN Clause ....................

MySQL supports the following 'JOIN' syntax for the TABLE_REFERENCES part of note 'SELECT': select. statements and multiple-table note 'DELETE': delete. and *note 'UPDATE': update. statements:

 TABLE_REFERENCES:
     ESCAPED_TABLE_REFERENCE [, ESCAPED_TABLE_REFERENCE] ...

 ESCAPED_TABLE_REFERENCE: {
     TABLE_REFERENCE
   | { OJ TABLE_REFERENCE }
 }

 TABLE_REFERENCE: {
     TABLE_FACTOR
   | JOINED_TABLE
 }

 TABLE_FACTOR: {
     TBL_NAME [PARTITION (PARTITION_NAMES)]
         [[AS] ALIAS] [INDEX_HINT_LIST]
   | TABLE_SUBQUERY [AS] ALIAS
   | ( TABLE_REFERENCES )
 }

 JOINED_TABLE: {
     TABLE_REFERENCE [INNER | CROSS] JOIN TABLE_FACTOR [JOIN_SPECIFICATION]
   | TABLE_REFERENCE STRAIGHT_JOIN TABLE_FACTOR
   | TABLE_REFERENCE STRAIGHT_JOIN TABLE_FACTOR ON SEARCH_CONDITION
   | TABLE_REFERENCE {LEFT|RIGHT} [OUTER] JOIN TABLE_REFERENCE JOIN_SPECIFICATION
   | TABLE_REFERENCE NATURAL [{LEFT|RIGHT} [OUTER]] JOIN TABLE_FACTOR
 }

 JOIN_SPECIFICATION: {
     ON SEARCH_CONDITION
   | USING (JOIN_COLUMN_LIST)
 }

 JOIN_COLUMN_LIST:
     COLUMN_NAME[, COLUMN_NAME] ...

 INDEX_HINT_LIST:
     INDEX_HINT[ INDEX_HINT] ...

 INDEX_HINT: {
     USE {INDEX|KEY}
       [FOR {JOIN|ORDER BY|GROUP BY}] ([INDEX_LIST])
   | {IGNORE|FORCE} {INDEX|KEY}
       [FOR {JOIN|ORDER BY|GROUP BY}] (INDEX_LIST)
 }

 INDEX_LIST:
     INDEX_NAME [, INDEX_NAME] ...

A table reference is also known as a join expression.

A table reference (when it refers to a partitioned table) may contain a 'PARTITION' clause, including a list of comma-separated partitions, subpartitions, or both. This option follows the name of the table and precedes any alias declaration. The effect of this option is that rows are selected only from the listed partitions or subpartitions. Any partitions or subpartitions not named in the list are ignored. For more information and examples, see *note partitioning-selection::.

The syntax of TABLE_FACTOR is extended in MySQL in comparison with standard SQL. The standard accepts only TABLE_REFERENCE, not a list of them inside a pair of parentheses.

This is a conservative extension if each comma in a list of TABLE_REFERENCE items is considered as equivalent to an inner join. For example:

 SELECT * FROM t1 LEFT JOIN (t2, t3, t4)
                  ON (t2.a = t1.a AND t3.b = t1.b AND t4.c = t1.c)

is equivalent to:

 SELECT * FROM t1 LEFT JOIN (t2 CROSS JOIN t3 CROSS JOIN t4)
                  ON (t2.a = t1.a AND t3.b = t1.b AND t4.c = t1.c)

In MySQL, 'JOIN', 'CROSS JOIN', and 'INNER JOIN' are syntactic equivalents (they can replace each other). In standard SQL, they are not equivalent. 'INNER JOIN' is used with an 'ON' clause, 'CROSS JOIN' is used otherwise.

In general, parentheses can be ignored in join expressions containing only inner join operations. MySQL also supports nested joins. See *note nested-join-optimization::.

Index hints can be specified to affect how the MySQL optimizer makes use of indexes. For more information, see note index-hints::. Optimizer hints and the 'optimizer_switch' system variable are other ways to influence optimizer use of indexes. See note optimizer-hints::, and *note switchable-optimizations::.

The following list describes general factors to take into account when writing joins:

Some join examples:

 SELECT * FROM table1, table2;

 SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id;

 SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id;

 SELECT * FROM table1 LEFT JOIN table2 USING (id);

 SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id
   LEFT JOIN table3 ON table2.id = table3.id;

Natural joins and joins with 'USING', including outer join variants, are processed according to the SQL:2003 standard:

 File: manual.info.tmp, Node: union, Prev: join, Up: select

13.2.9.3 UNION Clause .....................

 SELECT ...
 UNION [ALL | DISTINCT] SELECT ...
 [UNION [ALL | DISTINCT] SELECT ...]

note 'UNION': union. combines the result from multiple note 'SELECT': select. statements into a single result set. Example:

 mysql> SELECT 1, 2;
 +---+---+
 | 1 | 2 |
 +---+---+
 | 1 | 2 |
 +---+---+
 mysql> SELECT 'a', 'b';
 +---+---+
 | a | b |
 +---+---+
 | a | b |
 +---+---+
 mysql> SELECT 1, 2 UNION SELECT 'a', 'b';
 +---+---+
 | 1 | 2 |
 +---+---+
 | 1 | 2 |
 | a | b |
 +---+---+

Result Set Column Names and Data Types

The column names for a note 'UNION': union. result set are taken from the column names of the first note 'SELECT': select. statement.

Selected columns listed in corresponding positions of each note 'SELECT': select. statement should have the same data type. For example, the first column selected by the first statement should have the same type as the first column selected by the other statements. If the data types of corresponding note 'SELECT': select. columns do not match, the types and lengths of the columns in the note 'UNION': union. result take into account the values retrieved by all the note 'SELECT': select. statements. For example, consider the following, where the column length is not constrained to the length of the value from the first *note 'SELECT': select.:

 mysql> SELECT REPEAT('a',1) UNION SELECT REPEAT('b',20);
 +----------------------+
 | REPEAT('a',1)        |
 +----------------------+
 | a                    |
 | bbbbbbbbbbbbbbbbbbbb |
 +----------------------+

UNION DISTINCT and UNION ALL

By default, duplicate rows are removed from note 'UNION': union. results. The optional 'DISTINCT' keyword has the same effect but makes it explicit. With the optional 'ALL' keyword, duplicate-row removal does not occur and the result includes all matching rows from all the note 'SELECT': select. statements.

You can mix note 'UNION ALL': union. and note 'UNION DISTINCT': union. in the same query. Mixed note 'UNION': union. types are treated such that a 'DISTINCT' union overrides any 'ALL' union to its left. A 'DISTINCT' union can be produced explicitly by using note 'UNION DISTINCT': union. or implicitly by using *note 'UNION': union. with no following 'DISTINCT' or 'ALL' keyword.

ORDER BY and LIMIT in Unions

To apply an 'ORDER BY' or 'LIMIT' clause to an individual note 'SELECT': select, parenthesize the note 'SELECT': select. and place the clause inside the parentheses:

 (SELECT a FROM t1 WHERE a=10 AND B=1 ORDER BY a LIMIT 10)
 UNION
 (SELECT a FROM t2 WHERE a=11 AND B=2 ORDER BY a LIMIT 10);

Note:

Previous versions of MySQL may permit such statements without parentheses. In MySQL 5.7, the requirement for parentheses is enforced.

Use of 'ORDER BY' for individual note 'SELECT': select. statements implies nothing about the order in which the rows appear in the final result because note 'UNION': union. by default produces an unordered set of rows. Therefore, 'ORDER BY' in this context typically is used in conjunction with 'LIMIT', to determine the subset of the selected rows to retrieve for the note 'SELECT': select, even though it does not necessarily affect the order of those rows in the final note 'UNION': union. result. If 'ORDER BY' appears without 'LIMIT' in a *note 'SELECT': select, it is optimized away because it has no effect.

To use an 'ORDER BY' or 'LIMIT' clause to sort or limit the entire note 'UNION': union. result, parenthesize the individual note 'SELECT': select. statements and place the 'ORDER BY' or 'LIMIT' after the last one:

 (SELECT a FROM t1 WHERE a=10 AND B=1)
 UNION
 (SELECT a FROM t2 WHERE a=11 AND B=2)
 ORDER BY a LIMIT 10;

A statement without parentheses is equivalent to one parenthesized as just shown.

This kind of 'ORDER BY' cannot use column references that include a table name (that is, names in TBL_NAME.COL_NAME format). Instead, provide a column alias in the first *note 'SELECT': select. statement and refer to the alias in the 'ORDER BY'. (Alternatively, refer to the column in the 'ORDER BY' using its column position. However, use of column positions is deprecated.)

Also, if a column to be sorted is aliased, the 'ORDER BY' clause must refer to the alias, not the column name. The first of the following statements is permitted, but the second fails with an 'Unknown column 'a' in 'order clause'' error:

 (SELECT a AS b FROM t) UNION (SELECT ...) ORDER BY b;
 (SELECT a AS b FROM t) UNION (SELECT ...) ORDER BY a;

To cause rows in a note 'UNION': union. result to consist of the sets of rows retrieved by each note 'SELECT': select. one after the other, select an additional column in each note 'SELECT': select. to use as a sort column and add an 'ORDER BY' that sorts on that column following the last note 'SELECT': select.:

 (SELECT 1 AS sort_col, col1a, col1b, ... FROM t1)
 UNION
 (SELECT 2, col2a, col2b, ... FROM t2) ORDER BY sort_col;

To additionally maintain sort order within individual *note 'SELECT': select. results, add a secondary column to the 'ORDER BY' clause:

 (SELECT 1 AS sort_col, col1a, col1b, ... FROM t1)
 UNION
 (SELECT 2, col2a, col2b, ... FROM t2) ORDER BY sort_col, col1a;

Use of an additional column also enables you to determine which *note 'SELECT': select. each row comes from. Extra columns can provide other identifying information as well, such as a string that indicates a table name.

*note 'UNION': union. queries with an aggregate function in an 'ORDER BY' clause are rejected with an 'ER_AGGREGATE_ORDER_FOR_UNION' (https://dev.mysql.com/doc/mysql-errors/5.7/en/server-error-reference.html#error_er_aggregate_order_for_union) error. Example:

 SELECT 1 AS foo UNION SELECT 2 ORDER BY MAX(1);

UNION Restrictions

In a note 'UNION': union, the note 'SELECT': select. statements are normal select statements, but with the following restrictions:

 File: manual.info.tmp, Node: subqueries, Next: update, Prev: select, Up: sql-data-manipulation-statements

13.2.10 Subqueries

A subquery is a *note 'SELECT': select. statement within another statement.

All subquery forms and operations that the SQL standard requires are supported, as well as a few features that are MySQL-specific.

Here is an example of a subquery:

 SELECT * FROM t1 WHERE column1 = (SELECT column1 FROM t2);

In this example, 'SELECT * FROM t1 ...' is the outer query (or outer statement), and '(SELECT column1 FROM t2)' is the subquery. We say that the subquery is nested within the outer query, and in fact it is possible to nest subqueries within other subqueries, to a considerable depth. A subquery must always appear within parentheses.

The main advantages of subqueries are:

Here is an example statement that shows the major points about subquery syntax as specified by the SQL standard and supported in MySQL:

 DELETE FROM t1
 WHERE s11 > ANY
  (SELECT COUNT(*) /* no hint */ FROM t2
   WHERE NOT EXISTS
    (SELECT * FROM t3
     WHERE ROW(5*t2.s1,77)=
      (SELECT 50,11*s1 FROM t4 UNION SELECT 50,77 FROM
       (SELECT * FROM t5) AS t5)));

A subquery can return a scalar (a single value), a single row, a single column, or a table (one or more rows of one or more columns). These are called scalar, column, row, and table subqueries. Subqueries that return a particular kind of result often can be used only in certain contexts, as described in the following sections.

There are few restrictions on the type of statements in which subqueries can be used. A subquery can contain many of the keywords or clauses that an ordinary note 'SELECT': select. can contain: 'DISTINCT', 'GROUP BY', 'ORDER BY', 'LIMIT', joins, index hints, note 'UNION': union. constructs, comments, functions, and so on.

A subquery's outer statement can be any one of: note 'SELECT': select, note 'INSERT': insert, note 'UPDATE': update, note 'DELETE': delete, note 'SET': set-variable, or note 'DO': do.

In MySQL, you cannot modify a table and select from the same table in a subquery. This applies to statements such as note 'DELETE': delete, note 'INSERT': insert, note 'REPLACE': replace, note 'UPDATE': update, and (because subqueries can be used in the 'SET' clause) *note 'LOAD DATA': load-data.

For information about how the optimizer handles subqueries, see note subquery-optimization::. For a discussion of restrictions on subquery use, including performance issues for certain forms of subquery syntax, see note subquery-restrictions::.

 File: manual.info.tmp, Node: scalar-subqueries, Next: comparisons-using-subqueries, Prev: subqueries, Up: subqueries

13.2.10.1 The Subquery as Scalar Operand ........................................

In its simplest form, a subquery is a scalar subquery that returns a single value. A scalar subquery is a simple operand, and you can use it almost anywhere a single column value or literal is legal, and you can expect it to have those characteristics that all operands have: a data type, a length, an indication that it can be 'NULL', and so on. For example:

 CREATE TABLE t1 (s1 INT, s2 CHAR(5) NOT NULL);
 INSERT INTO t1 VALUES(100, 'abcde');
 SELECT (SELECT s2 FROM t1);

The subquery in this note 'SELECT': select. returns a single value (''abcde'') that has a data type of note 'CHAR': char, a length of 5, a character set and collation equal to the defaults in effect at *note 'CREATE TABLE': create-table. time, and an indication that the value in the column can be 'NULL'. Nullability of the value selected by a scalar subquery is not copied because if the subquery result is empty, the result is 'NULL'. For the subquery just shown, if 't1' were empty, the result would be 'NULL' even though 's2' is 'NOT NULL'.

There are a few contexts in which a scalar subquery cannot be used. If a statement permits only a literal value, you cannot use a subquery. For example, 'LIMIT' requires literal integer arguments, and *note 'LOAD DATA': load-data. requires a literal string file name. You cannot use subqueries to supply these values.

When you see examples in the following sections that contain the rather spartan construct '(SELECT column1 FROM t1)', imagine that your own code contains much more diverse and complex constructions.

Suppose that we make two tables:

 CREATE TABLE t1 (s1 INT);
 INSERT INTO t1 VALUES (1);
 CREATE TABLE t2 (s1 INT);
 INSERT INTO t2 VALUES (2);

Then perform a *note 'SELECT': select.:

 SELECT (SELECT s1 FROM t2) FROM t1;

The result is '2' because there is a row in 't2' containing a column 's1' that has a value of '2'.

A scalar subquery can be part of an expression, but remember the parentheses, even if the subquery is an operand that provides an argument for a function. For example:

 SELECT UPPER((SELECT s1 FROM t1)) FROM t2;

 File: manual.info.tmp, Node: comparisons-using-subqueries, Next: any-in-some-subqueries, Prev: scalar-subqueries, Up: subqueries

13.2.10.2 Comparisons Using Subqueries ......................................

The most common use of a subquery is in the form:

 NON_SUBQUERY_OPERAND COMPARISON_OPERATOR (SUBQUERY)

Where COMPARISON_OPERATOR is one of these operators:

 =  >  <  >=  <=  <>  !=  <=>

For example:

 ... WHERE 'a' = (SELECT column1 FROM t1)

MySQL also permits this construct:

 NON_SUBQUERY_OPERAND LIKE (SUBQUERY)

At one time the only legal place for a subquery was on the right side of a comparison, and you might still find some old DBMSs that insist on this.

Here is an example of a common-form subquery comparison that you cannot do with a join. It finds all the rows in table 't1' for which the 'column1' value is equal to a maximum value in table 't2':

 SELECT * FROM t1
   WHERE column1 = (SELECT MAX(column2) FROM t2);

Here is another example, which again is impossible with a join because it involves aggregating for one of the tables. It finds all rows in table 't1' containing a value that occurs twice in a given column:

 SELECT * FROM t1 AS t
   WHERE 2 = (SELECT COUNT(*) FROM t1 WHERE t1.id = t.id);

For a comparison of the subquery to a scalar, the subquery must return a scalar. For a comparison of the subquery to a row constructor, the subquery must be a row subquery that returns a row with the same number of values as the row constructor. See *note row-subqueries::.

 File: manual.info.tmp, Node: any-in-some-subqueries, Next: all-subqueries, Prev: comparisons-using-subqueries, Up: subqueries

13.2.10.3 Subqueries with ANY, IN, or SOME ..........................................

Syntax:

 OPERAND COMPARISON_OPERATOR ANY (SUBQUERY)
 OPERAND IN (SUBQUERY)
 OPERAND COMPARISON_OPERATOR SOME (SUBQUERY)

Where COMPARISON_OPERATOR is one of these operators:

 =  >  <  >=  <=  <>  !=

The 'ANY' keyword, which must follow a comparison operator, means 'return 'TRUE' if the comparison is 'TRUE' for 'ANY' of the values in the column that the subquery returns.' For example:

 SELECT s1 FROM t1 WHERE s1 > ANY (SELECT s1 FROM t2);

Suppose that there is a row in table 't1' containing '(10)'. The expression is 'TRUE' if table 't2' contains '(21,14,7)' because there is a value '7' in 't2' that is less than '10'. The expression is 'FALSE' if table 't2' contains '(20,10)', or if table 't2' is empty. The expression is unknown (that is, 'NULL') if table 't2' contains '(NULL,NULL,NULL)'.

When used with a subquery, the word 'IN' is an alias for '= ANY'. Thus, these two statements are the same:

 SELECT s1 FROM t1 WHERE s1 = ANY (SELECT s1 FROM t2);
 SELECT s1 FROM t1 WHERE s1 IN    (SELECT s1 FROM t2);

'IN' and '= ANY' are not synonyms when used with an expression list. 'IN' can take an expression list, but '= ANY' cannot. See *note comparison-operators::.

'NOT IN' is not an alias for '<> ANY', but for '<> ALL'. See *note all-subqueries::.

The word 'SOME' is an alias for 'ANY'. Thus, these two statements are the same:

 SELECT s1 FROM t1 WHERE s1 <> ANY  (SELECT s1 FROM t2);
 SELECT s1 FROM t1 WHERE s1 <> SOME (SELECT s1 FROM t2);

Use of the word 'SOME' is rare, but this example shows why it might be useful. To most people, the English phrase 'a is not equal to any b' means 'there is no b which is equal to a,' but that is not what is meant by the SQL syntax. The syntax means 'there is some b to which a is not equal.' Using '<> SOME' instead helps ensure that everyone understands the true meaning of the query.

 File: manual.info.tmp, Node: all-subqueries, Next: row-subqueries, Prev: any-in-some-subqueries, Up: subqueries

13.2.10.4 Subqueries with ALL .............................

Syntax:

 OPERAND COMPARISON_OPERATOR ALL (SUBQUERY)

The word 'ALL', which must follow a comparison operator, means 'return 'TRUE' if the comparison is 'TRUE' for 'ALL' of the values in the column that the subquery returns.' For example:

 SELECT s1 FROM t1 WHERE s1 > ALL (SELECT s1 FROM t2);

Suppose that there is a row in table 't1' containing '(10)'. The expression is 'TRUE' if table 't2' contains '(-5,0,+5)' because '10' is greater than all three values in 't2'. The expression is 'FALSE' if table 't2' contains '(12,6,NULL,-100)' because there is a single value '12' in table 't2' that is greater than '10'. The expression is unknown (that is, 'NULL') if table 't2' contains '(0,NULL,1)'.

Finally, the expression is 'TRUE' if table 't2' is empty. So, the following expression is 'TRUE' when table 't2' is empty:

 SELECT * FROM t1 WHERE 1 > ALL (SELECT s1 FROM t2);

But this expression is 'NULL' when table 't2' is empty:

 SELECT * FROM t1 WHERE 1 > (SELECT s1 FROM t2);

In addition, the following expression is 'NULL' when table 't2' is empty:

 SELECT * FROM t1 WHERE 1 > ALL (SELECT MAX(s1) FROM t2);

In general, tables containing 'NULL' values and empty tables are 'edge cases.' When writing subqueries, always consider whether you have taken those two possibilities into account.

'NOT IN' is an alias for '<> ALL'. Thus, these two statements are the same:

 SELECT s1 FROM t1 WHERE s1 <> ALL (SELECT s1 FROM t2);
 SELECT s1 FROM t1 WHERE s1 NOT IN (SELECT s1 FROM t2);

 File: manual.info.tmp, Node: row-subqueries, Next: exists-and-not-exists-subqueries, Prev: all-subqueries, Up: subqueries

13.2.10.5 Row Subqueries ........................

Scalar or column subqueries return a single value or a column of values. A row subquery is a subquery variant that returns a single row and can thus return more than one column value. Legal operators for row subquery comparisons are:

 =  >  <  >=  <=  <>  !=  <=>

Here are two examples:

 SELECT * FROM t1
   WHERE (col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);
 SELECT * FROM t1
   WHERE ROW(col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);

For both queries, if the table 't2' contains a single row with 'id = 10', the subquery returns a single row. If this row has 'col3' and 'col4' values equal to the 'col1' and 'col2' values of any rows in 't1', the 'WHERE' expression is 'TRUE' and each query returns those 't1' rows. If the 't2' row 'col3' and 'col4' values are not equal the 'col1' and 'col2' values of any 't1' row, the expression is 'FALSE' and the query returns an empty result set. The expression is unknown (that is, 'NULL') if the subquery produces no rows. An error occurs if the subquery produces multiple rows because a row subquery can return at most one row.

For information about how each operator works for row comparisons, see *note comparison-operators::.

The expressions '(1,2)' and 'ROW(1,2)' are sometimes called row constructors. The two are equivalent. The row constructor and the row returned by the subquery must contain the same number of values.

A row constructor is used for comparisons with subqueries that return two or more columns. When a subquery returns a single column, this is regarded as a scalar value and not as a row, so a row constructor cannot be used with a subquery that does not return at least two columns. Thus, the following query fails with a syntax error:

 SELECT * FROM t1 WHERE ROW(1) = (SELECT column1 FROM t2)

Row constructors are legal in other contexts. For example, the following two statements are semantically equivalent (and are handled in the same way by the optimizer):

 SELECT * FROM t1 WHERE (column1,column2) = (1,1);
 SELECT * FROM t1 WHERE column1 = 1 AND column2 = 1;

The following query answers the request, 'find all rows in table 't1' that also exist in table 't2'':

 SELECT column1,column2,column3
   FROM t1
   WHERE (column1,column2,column3) IN
          (SELECT column1,column2,column3 FROM t2);

For more information about the optimizer and row constructors, see *note row-constructor-optimization::

 File: manual.info.tmp, Node: exists-and-not-exists-subqueries, Next: correlated-subqueries, Prev: row-subqueries, Up: subqueries

13.2.10.6 Subqueries with EXISTS or NOT EXISTS ..............................................

If a subquery returns any rows at all, 'EXISTS SUBQUERY' is 'TRUE', and 'NOT EXISTS SUBQUERY' is 'FALSE'. For example:

 SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);

Traditionally, an 'EXISTS' subquery starts with 'SELECT ', but it could begin with 'SELECT 5' or 'SELECT column1' or anything at all. MySQL ignores the note 'SELECT': select. list in such a subquery, so it makes no difference.

For the preceding example, if 't2' contains any rows, even rows with nothing but 'NULL' values, the 'EXISTS' condition is 'TRUE'. This is actually an unlikely example because a '[NOT] EXISTS' subquery almost always contains correlations. Here are some more realistic examples:

The last example is a double-nested 'NOT EXISTS' query. That is, it has a 'NOT EXISTS' clause within a 'NOT EXISTS' clause. Formally, it answers the question 'does a city exist with a store that is not in 'Stores''? But it is easier to say that a nested 'NOT EXISTS' answers the question 'is X 'TRUE' for all Y?'

 File: manual.info.tmp, Node: correlated-subqueries, Next: derived-tables, Prev: exists-and-not-exists-subqueries, Up: subqueries

13.2.10.7 Correlated Subqueries ...............................

A correlated subquery is a subquery that contains a reference to a table that also appears in the outer query. For example:

 SELECT * FROM t1
   WHERE column1 = ANY (SELECT column1 FROM t2
                        WHERE t2.column2 = t1.column2);

Notice that the subquery contains a reference to a column of 't1', even though the subquery's 'FROM' clause does not mention a table 't1'. So, MySQL looks outside the subquery, and finds 't1' in the outer query.

Suppose that table 't1' contains a row where 'column1 = 5' and 'column2 = 6'; meanwhile, table 't2' contains a row where 'column1 = 5' and 'column2 = 7'. The simple expression '... WHERE column1 = ANY (SELECT column1 FROM t2)' would be 'TRUE', but in this example, the 'WHERE' clause within the subquery is 'FALSE' (because '(5,6)' is not equal to '(5,7)'), so the expression as a whole is 'FALSE'.

Scoping rule: MySQL evaluates from inside to outside. For example:

 SELECT column1 FROM t1 AS x
   WHERE x.column1 = (SELECT column1 FROM t2 AS x
     WHERE x.column1 = (SELECT column1 FROM t3
       WHERE x.column2 = t3.column1));

In this statement, 'x.column2' must be a column in table 't2' because 'SELECT column1 FROM t2 AS x ...' renames 't2'. It is not a column in table 't1' because 'SELECT column1 FROM t1 ...' is an outer query that is farther out.

For subqueries in 'HAVING' or 'ORDER BY' clauses, MySQL also looks for column names in the outer select list.

For certain cases, a correlated subquery is optimized. For example:

 VAL IN (SELECT KEY_VAL FROM TBL_NAME WHERE CORRELATED_CONDITION)

Otherwise, they are inefficient and likely to be slow. Rewriting the query as a join might improve performance.

Aggregate functions in correlated subqueries may contain outer references, provided the function contains nothing but outer references, and provided the function is not contained in another function or expression.

 File: manual.info.tmp, Node: derived-tables, Next: subquery-errors, Prev: correlated-subqueries, Up: subqueries

13.2.10.8 Derived Tables ........................

A derived table is an expression that generates a table within the scope of a query 'FROM' clause. For example, a subquery in a *note 'SELECT': select. statement 'FROM' clause is a derived table:

 SELECT ... FROM (SUBQUERY) [AS] TBL_NAME ...

The '[AS] TBL_NAME' clause is mandatory because every table in a 'FROM' clause must have a name. Any columns in the derived table must have unique names.

For the sake of illustration, assume that you have this table:

 CREATE TABLE t1 (s1 INT, s2 CHAR(5), s3 FLOAT);

Here is how to use a subquery in the 'FROM' clause, using the example table:

 INSERT INTO t1 VALUES (1,'1',1.0);
 INSERT INTO t1 VALUES (2,'2',2.0);
 SELECT sb1,sb2,sb3
   FROM (SELECT s1 AS sb1, s2 AS sb2, s3*2 AS sb3 FROM t1) AS sb
   WHERE sb1 > 1;

Result:

 +------+------+------+
 | sb1  | sb2  | sb3  |
 +------+------+------+
 |    2 | 2    |    4 |
 +------+------+------+

Here is another example: Suppose that you want to know the average of a set of sums for a grouped table. This does not work:

 SELECT AVG(SUM(column1)) FROM t1 GROUP BY column1;

However, this query provides the desired information:

 SELECT AVG(sum_column1)
   FROM (SELECT SUM(column1) AS sum_column1
         FROM t1 GROUP BY column1) AS t1;

Notice that the column name used within the subquery ('sum_column1') is recognized in the outer query.

A derived table can return a scalar, column, row, or table.

Derived tables are subject to these restrictions:

The optimizer determines information about derived tables in such a way that note 'EXPLAIN': explain. does not need to materialize them. See note derived-table-optimization::.

It is possible under certain circumstances that using *note 'EXPLAIN SELECT': explain. modifies table data. This can occur if the outer query accesses any tables and an inner query invokes a stored function that changes one or more rows of a table. Suppose that there are two tables 't1' and 't2' in database 'd1', and a stored function 'f1' that modifies 't2', created as shown here:

 CREATE DATABASE d1;
 USE d1;
 CREATE TABLE t1 (c1 INT);
 CREATE TABLE t2 (c1 INT);
 CREATE FUNCTION f1(p1 INT) RETURNS INT
   BEGIN
     INSERT INTO t2 VALUES (p1);
     RETURN p1;
   END;

Referencing the function directly in an *note 'EXPLAIN SELECT': explain. has no effect on 't2', as shown here:

 mysql> SELECT * FROM t2;
 Empty set (0.02 sec)

 mysql> EXPLAIN SELECT f1(5)\G
 *************************** 1. row ***************************
            id: 1
   select_type: SIMPLE
         table: NULL
    partitions: NULL
          type: NULL
 possible_keys: NULL
           key: NULL
       key_len: NULL
           ref: NULL
          rows: NULL
      filtered: NULL
         Extra: No tables used
 1 row in set (0.01 sec)

 mysql> SELECT * FROM t2;
 Empty set (0.01 sec)

This is because the note 'SELECT': select. statement did not reference any tables, as can be seen in the 'table' and 'Extra' columns of the output. This is also true of the following nested note 'SELECT': select.:

 mysql> EXPLAIN SELECT NOW() AS a1, (SELECT f1(5)) AS a2\G
 *************************** 1. row ***************************
            id: 1
   select_type: SIMPLE
         table: NULL
          type: NULL
 possible_keys: NULL
           key: NULL
       key_len: NULL
           ref: NULL
          rows: NULL
      filtered: NULL
         Extra: No tables used
 1 row in set, 1 warning (0.00 sec)

 mysql> SHOW WARNINGS;
 +-------+------+------------------------------------------+
 | Level | Code | Message                                  |
 +-------+------+------------------------------------------+
 | Note  | 1249 | Select 2 was reduced during optimization |
 +-------+------+------------------------------------------+
 1 row in set (0.00 sec)

 mysql> SELECT * FROM t2;
 Empty set (0.00 sec)

However, if the outer *note 'SELECT': select. references any tables, the optimizer executes the statement in the subquery as well, with the result that 't2' is modified:

 mysql> EXPLAIN SELECT * FROM t1 AS a1, (SELECT f1(5)) AS a2\G
 *************************** 1. row ***************************
            id: 1
   select_type: PRIMARY
         table: <derived2>
    partitions: NULL
          type: system
 possible_keys: NULL
           key: NULL
       key_len: NULL
           ref: NULL
          rows: 1
      filtered: 100.00
         Extra: NULL
 *************************** 2. row ***************************
            id: 1
   select_type: PRIMARY
         table: a1
    partitions: NULL
          type: ALL
 possible_keys: NULL
           key: NULL
       key_len: NULL
           ref: NULL
          rows: 1
      filtered: 100.00
         Extra: NULL
 *************************** 3. row ***************************
            id: 2
   select_type: DERIVED
         table: NULL
    partitions: NULL
          type: NULL
 possible_keys: NULL
           key: NULL
       key_len: NULL
           ref: NULL
          rows: NULL
      filtered: NULL
         Extra: No tables used
 3 rows in set (0.00 sec)

 mysql> SELECT * FROM t2;
 +------+
 | c1   |
 +------+
 |    5 |
 +------+
 1 row in set (0.00 sec)

 File: manual.info.tmp, Node: subquery-errors, Next: optimizing-subqueries, Prev: derived-tables, Up: subqueries

13.2.10.9 Subquery Errors .........................

There are some errors that apply only to subqueries. This section describes them.

For transactional storage engines, the failure of a subquery causes the entire statement to fail. For nontransactional storage engines, data modifications made before the error was encountered are preserved.

 File: manual.info.tmp, Node: optimizing-subqueries, Next: rewriting-subqueries, Prev: subquery-errors, Up: subqueries

13.2.10.10 Optimizing Subqueries ................................

Development is ongoing, so no optimization tip is reliable for the long term. The following list provides some interesting tricks that you might want to play with. See also *note subquery-optimization::.

These tricks might cause programs to go faster or slower. Using MySQL facilities like the 'BENCHMARK()' function, you can get an idea about what helps in your own situation. See *note information-functions::.

Some optimizations that MySQL itself makes are:

See also MySQL Internals: How MySQL Transforms Subqueries (https://dev.mysql.com/doc/internals/en/transformations.html).

 File: manual.info.tmp, Node: rewriting-subqueries, Next: subquery-restrictions, Prev: optimizing-subqueries, Up: subqueries

13.2.10.11 Rewriting Subqueries as Joins ........................................

Sometimes there are other ways to test membership in a set of values than by using a subquery. Also, on some occasions, it is not only possible to rewrite a query without a subquery, but it can be more efficient to make use of some of these techniques rather than to use subqueries. One of these is the 'IN()' construct:

For example, this query:

 SELECT * FROM t1 WHERE id IN (SELECT id FROM t2);

Can be rewritten as:

 SELECT DISTINCT t1.* FROM t1, t2 WHERE t1.id=t2.id;

The queries:

 SELECT * FROM t1 WHERE id NOT IN (SELECT id FROM t2);
 SELECT * FROM t1 WHERE NOT EXISTS (SELECT id FROM t2 WHERE t1.id=t2.id);

Can be rewritten as:

 SELECT table1.*
   FROM table1 LEFT JOIN table2 ON table1.id=table2.id
   WHERE table2.id IS NULL;

A 'LEFT [OUTER] JOIN' can be faster than an equivalent subquery because the server might be able to optimize it better--a fact that is not specific to MySQL Server alone. Prior to SQL-92, outer joins did not exist, so subqueries were the only way to do certain things. Today, MySQL Server and many other modern database systems offer a wide range of outer join types.

MySQL Server supports multiple-table note 'DELETE': delete. statements that can be used to efficiently delete rows based on information from one table or even from many tables at the same time. Multiple-table note 'UPDATE': update. statements are also supported. See note delete::, and note update::.

 File: manual.info.tmp, Node: subquery-restrictions, Prev: rewriting-subqueries, Up: subqueries

13.2.10.12 Restrictions on Subqueries .....................................

 File: manual.info.tmp, Node: update, Prev: subqueries, Up: sql-data-manipulation-statements

13.2.11 UPDATE Statement

*note 'UPDATE': update. is a DML statement that modifies rows in a table.

Single-table syntax:

 UPDATE [LOW_PRIORITY] [IGNORE] TABLE_REFERENCE
     SET ASSIGNMENT_LIST
     [WHERE WHERE_CONDITION]
     [ORDER BY ...]
     [LIMIT ROW_COUNT]

 VALUE:
     {EXPR | DEFAULT}

 ASSIGNMENT:
     COL_NAME = VALUE

 ASSIGNMENT_LIST:
     ASSIGNMENT [, ASSIGNMENT] ...

Multiple-table syntax:

 UPDATE [LOW_PRIORITY] [IGNORE] TABLE_REFERENCES
     SET ASSIGNMENT_LIST
     [WHERE WHERE_CONDITION]

For the single-table syntax, the *note 'UPDATE': update. statement updates columns of existing rows in the named table with new values. The 'SET' clause indicates which columns to modify and the values they should be given. Each value can be given as an expression, or the keyword 'DEFAULT' to set a column explicitly to its default value. The 'WHERE' clause, if given, specifies the conditions that identify which rows to update. With no 'WHERE' clause, all rows are updated. If the 'ORDER BY' clause is specified, the rows are updated in the order that is specified. The 'LIMIT' clause places a limit on the number of rows that can be updated.

For the multiple-table syntax, *note 'UPDATE': update. updates rows in each table named in TABLE_REFERENCES that satisfy the conditions. Each matching row is updated once, even if it matches the conditions multiple times. For multiple-table syntax, 'ORDER BY' and 'LIMIT' cannot be used.

For partitioned tables, both the single-single and multiple-table forms of this statement support the use of a 'PARTITION' clause as part of a table reference. This option takes a list of one or more partitions or subpartitions (or both). Only the partitions (or subpartitions) listed are checked for matches, and a row that is not in any of these partitions or subpartitions is not updated, whether it satisfies the WHERE_CONDITION or not.

Note:

Unlike the case when using 'PARTITION' with an note 'INSERT': insert. or note 'REPLACE': replace. statement, an otherwise valid 'UPDATE ... PARTITION' statement is considered successful even if no rows in the listed partitions (or subpartitions) match the WHERE_CONDITION.

For more information and examples, see *note partitioning-selection::.

WHERE_CONDITION is an expression that evaluates to true for each row to be updated. For expression syntax, see *note expressions::.

TABLE_REFERENCES and WHERE_CONDITION are specified as described in *note select::.

You need the 'UPDATE' privilege only for columns referenced in an *note 'UPDATE': update. that are actually updated. You need only the 'SELECT' privilege for any columns that are read but not modified.

The *note 'UPDATE': update. statement supports the following modifiers:

note 'UPDATE IGNORE': update. statements, including those having an 'ORDER BY' clause, are flagged as unsafe for statement-based replication. (This is because the order in which the rows are updated determines which rows are ignored.) Such statements produce a warning in the error log when using statement-based mode and are written to the binary log using the row-based format when using 'MIXED' mode. (Bug #11758262, Bug #50439) See note replication-rbr-safe-unsafe::, for more information.

If you access a column from the table to be updated in an expression, *note 'UPDATE': update. uses the current value of the column. For example, the following statement sets 'col1' to one more than its current value:

 UPDATE t1 SET col1 = col1 + 1;

The second assignment in the following statement sets 'col2' to the current (updated) 'col1' value, not the original 'col1' value. The result is that 'col1' and 'col2' have the same value. This behavior differs from standard SQL.

 UPDATE t1 SET col1 = col1 + 1, col2 = col1;

Single-table *note 'UPDATE': update. assignments are generally evaluated from left to right. For multiple-table updates, there is no guarantee that assignments are carried out in any particular order.

If you set a column to the value it currently has, MySQL notices this and does not update it.

If you update a column that has been declared 'NOT NULL' by setting to 'NULL', an error occurs if strict SQL mode is enabled; otherwise, the column is set to the implicit default value for the column data type and the warning count is incremented. The implicit default value is '0' for numeric types, the empty string ('''') for string types, and the 'zero' value for date and time types. See *note data-type-defaults::.

If a generated column is updated explicitly, the only permitted value is 'DEFAULT'. For information about generated columns, see *note create-table-generated-columns::.

note 'UPDATE': update. returns the number of rows that were actually changed. The 'mysql_info()' (https://dev.mysql.com/doc/c-api/5.7/en/mysql-info.html) C API function returns the number of rows that were matched and updated and the number of warnings that occurred during the note 'UPDATE': update.

You can use 'LIMIT ROW_COUNT' to restrict the scope of the *note 'UPDATE': update. A 'LIMIT' clause is a rows-matched restriction. The statement stops as soon as it has found ROW_COUNT rows that satisfy the 'WHERE' clause, whether or not they actually were changed.

If an *note 'UPDATE': update. statement includes an 'ORDER BY' clause, the rows are updated in the order specified by the clause. This can be useful in certain situations that might otherwise result in an error. Suppose that a table 't' contains a column 'id' that has a unique index. The following statement could fail with a duplicate-key error, depending on the order in which rows are updated:

 UPDATE t SET id = id + 1;

For example, if the table contains 1 and 2 in the 'id' column and 1 is updated to 2 before 2 is updated to 3, an error occurs. To avoid this problem, add an 'ORDER BY' clause to cause the rows with larger 'id' values to be updated before those with smaller values:

 UPDATE t SET id = id + 1 ORDER BY id DESC;

You can also perform note 'UPDATE': update. operations covering multiple tables. However, you cannot use 'ORDER BY' or 'LIMIT' with a multiple-table note 'UPDATE': update. The TABLE_REFERENCES clause lists the tables involved in the join. Its syntax is described in *note join::. Here is an example:

 UPDATE items,month SET items.price=month.price
 WHERE items.id=month.id;

The preceding example shows an inner join that uses the comma operator, but multiple-table note 'UPDATE': update. statements can use any type of join permitted in note 'SELECT': select. statements, such as 'LEFT JOIN'.

If you use a multiple-table note 'UPDATE': update. statement involving 'InnoDB' tables for which there are foreign key constraints, the MySQL optimizer might process tables in an order that differs from that of their parent/child relationship. In this case, the statement fails and rolls back. Instead, update a single table and rely on the 'ON UPDATE' capabilities that 'InnoDB' provides to cause the other tables to be modified accordingly. See note create-table-foreign-keys::.

You cannot update a table and select directly from the same table in a subquery. You can work around this by using a multi-table update in which one of the tables is derived from the table that you actually wish to update, and referring to the derived table using an alias. Suppose you wish to update a table named 'items' which is defined using the statement shown here:

 CREATE TABLE items (
     id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
     wholesale DECIMAL(6,2) NOT NULL DEFAULT 0.00,
     retail DECIMAL(6,2) NOT NULL DEFAULT 0.00,
     quantity BIGINT NOT NULL DEFAULT 0
 );

To reduce the retail price of any items for which the markup is 30% or greater and of which you have fewer than one hundred in stock, you might try to use an 'UPDATE' statement such as the one following, which uses a subquery in the 'WHERE' clause. As shown here, this statement does not work:

 mysql> UPDATE items
      > SET retail = retail * 0.9
      > WHERE id IN
      >     (SELECT id FROM items
      >         WHERE retail / wholesale >= 1.3 AND quantity > 100);
 ERROR 1093 (HY000): You can't specify target table 'items' for update in FROM clause

Instead, you can employ a multi-table update in which the subquery is moved into the list of tables to be updated, using an alias to reference it in the outermost 'WHERE' clause, like this:

 UPDATE items,
        (SELECT id FROM items
         WHERE id IN
             (SELECT id FROM items
              WHERE retail / wholesale >= 1.3 AND quantity < 100))
         AS discounted
 SET items.retail = items.retail * 0.9
 WHERE items.id = discounted.id;

Because the optimizer tries by default to merge the derived table 'discounted' into the outermost query block, this works only if you force materialization of the derived table. You can do this by setting the 'derived_merge' flag of the 'optimizer_switch' system variable to 'off' before running the update, or by using the 'NO_MERGE' optimizer hint, as shown here:

 UPDATE /*+ NO_MERGE(discounted) */ items,
        (SELECT id FROM items
         WHERE retail / wholesale >= 1.3 AND quantity < 100)
         AS discounted
     SET items.retail = items.retail * 0.9
     WHERE items.id = discounted.id;

The advantage of using the optimizer hint in such a case is that it applies only within the query block where it is used, so that it is not necessary to change the value of 'optimizer_switch' again after executing the 'UPDATE'.

Another possibility is to rewrite the subquery so that it does not use 'IN' or 'EXISTS', like this:

 UPDATE items,
        (SELECT id, retail / wholesale AS markup, quantity FROM items)
        AS discounted
     SET items.retail = items.retail * 0.9
     WHERE discounted.markup >= 1.3
     AND discounted.quantity < 100
     AND items.id = discounted.id;

In this case, the subquery is materialized by default rather than merged, so it is not necessary to disable merging of the derived table.

 File: manual.info.tmp, Node: sql-transactional-statements, Next: sql-replication-statements, Prev: sql-data-manipulation-statements, Up: sql-statements