8.4 Optimizing Database Structure

In your role as a database designer, look for the most efficient way to organize your schemas, tables, and columns. As when tuning application code, you minimize I/O, keep related items together, and plan ahead so that performance stays high as the data volume increases. Starting with an efficient database design makes it easier for team members to write high-performing application code, and makes the database likely to endure as applications evolve and are rewritten.

 File: manual.info.tmp, Node: data-size, Next: optimize-data-types, Prev: optimizing-database-structure, Up: optimizing-database-structure

8.4.1 Optimizing Data Size

Design your tables to minimize their space on the disk. This can result in huge improvements by reducing the amount of data written to and read from disk. Smaller tables normally require less main memory while their contents are being actively processed during query execution. Any space reduction for table data also results in smaller indexes that can be processed faster.

MySQL supports many different storage engines (table types) and row formats. For each table, you can decide which storage and indexing method to use. Choosing the proper table format for your application can give you a big performance gain. See note innodb-storage-engine::, and note storage-engines::.

You can get better performance for a table and minimize storage space by using the techniques listed here:

Table Columns

Row Format

Indexes

Joins

Normalization

 File: manual.info.tmp, Node: optimize-data-types, Next: optimize-multi-tables, Prev: data-size, Up: optimizing-database-structure

8.4.2 Optimizing MySQL Data Types

 File: manual.info.tmp, Node: optimize-numeric, Next: optimize-character, Prev: optimize-data-types, Up: optimize-data-types

8.4.2.1 Optimizing for Numeric Data ...................................

 File: manual.info.tmp, Node: optimize-character, Next: optimize-blob, Prev: optimize-numeric, Up: optimize-data-types

8.4.2.2 Optimizing for Character and String Types .................................................

For character and string columns, follow these guidelines:

 File: manual.info.tmp, Node: optimize-blob, Next: procedure-analyse, Prev: optimize-character, Up: optimize-data-types

8.4.2.3 Optimizing for BLOB Types .................................

 File: manual.info.tmp, Node: procedure-analyse, Prev: optimize-blob, Up: optimize-data-types

8.4.2.4 Using PROCEDURE ANALYSE ...............................

'ANALYSE([MAX_ELEMENTS[,MAX_MEMORY]])'

Note:

'PROCEDURE ANALYSE()' is deprecated as of MySQL 5.7.18, and is removed in MySQL 8.0.

'ANALYSE()' examines the result from a query and returns an analysis of the results that suggests optimal data types for each column that may help reduce table sizes. To obtain this analysis, append 'PROCEDURE ANALYSE' to the end of a *note 'SELECT': select. statement:

 SELECT ... FROM ... WHERE ... PROCEDURE ANALYSE([MAX_ELEMENTS,[MAX_MEMORY]])

For example:

 SELECT col1, col2 FROM table1 PROCEDURE ANALYSE(10, 2000);

The results show some statistics for the values returned by the query, and propose an optimal data type for the columns. This can be helpful for checking your existing tables, or after importing new data. You may need to try different settings for the arguments so that 'PROCEDURE ANALYSE()' does not suggest the *note 'ENUM': enum. data type when it is not appropriate.

The arguments are optional and are used as follows:

A 'PROCEDURE' clause is not permitted in a *note 'UNION': union. statement.

 File: manual.info.tmp, Node: optimize-multi-tables, Next: internal-temporary-tables, Prev: optimize-data-types, Up: optimizing-database-structure

8.4.3 Optimizing for Many Tables

Some techniques for keeping individual queries fast involve splitting data across many tables. When the number of tables runs into the thousands or even millions, the overhead of dealing with all these tables becomes a new performance consideration.

 File: manual.info.tmp, Node: table-cache, Next: creating-many-tables, Prev: optimize-multi-tables, Up: optimize-multi-tables

8.4.3.1 How MySQL Opens and Closes Tables .........................................

When you execute a *note 'mysqladmin status': mysqladmin. command, you should see something like this:

 Uptime: 426 Running threads: 1 Questions: 11082
 Reloads: 1 Open tables: 12

The 'Open tables' value of 12 can be somewhat puzzling if you have fewer than 12 tables.

MySQL is multithreaded, so there may be many clients issuing queries for a given table simultaneously. To minimize the problem with multiple client sessions having different states on the same table, the table is opened independently by each concurrent session. This uses additional memory but normally increases performance. With 'MyISAM' tables, one extra file descriptor is required for the data file for each client that has the table open. (By contrast, the index file descriptor is shared between all sessions.)

The 'table_open_cache' and 'max_connections' system variables affect the maximum number of files the server keeps open. If you increase one or both of these values, you may run up against a limit imposed by your operating system on the per-process number of open file descriptors. Many operating systems permit you to increase the open-files limit, although the method varies widely from system to system. Consult your operating system documentation to determine whether it is possible to increase the limit and how to do so.

'table_open_cache' is related to 'max_connections'. For example, for 200 concurrent running connections, specify a table cache size of at least '200 * N', where N is the maximum number of tables per join in any of the queries which you execute. You must also reserve some extra file descriptors for temporary tables and files.

Make sure that your operating system can handle the number of open file descriptors implied by the 'table_open_cache' setting. If 'table_open_cache' is set too high, MySQL may run out of file descriptors and exhibit symptoms such as refusing connections or failing to perform queries.

Also take into account that the 'MyISAM' storage engine needs two file descriptors for each unique open table. For a partitioned 'MyISAM' table, two file descriptors are required for each partition of the opened table. (When 'MyISAM' opens a partitioned table, it opens every partition of this table, whether or not a given partition is actually used. See note partitioning-limitations-myisam-file-descriptors::.) To increase the number of file descriptors available to MySQL, set the 'open_files_limit' system variable. See note not-enough-file-handles::.

The cache of open tables is kept at a level of 'table_open_cache' entries. The server autosizes the cache size at startup. To set the size explicitly, set the 'table_open_cache' system variable at startup. MySQL may temporarily open more tables than this to execute queries, as described later in this section.

MySQL closes an unused table and removes it from the table cache under the following circumstances:

When the table cache fills up, the server uses the following procedure to locate a cache entry to use:

A 'MyISAM' table is opened for each concurrent access. This means the table needs to be opened twice if two threads access the same table or if a thread accesses the table twice in the same query (for example, by joining the table to itself). Each concurrent open requires an entry in the table cache. The first open of any 'MyISAM' table takes two file descriptors: one for the data file and one for the index file. Each additional use of the table takes only one file descriptor for the data file. The index file descriptor is shared among all threads.

If you are opening a table with the 'HANDLER TBL_NAME OPEN' statement, a dedicated table object is allocated for the thread. This table object is not shared by other threads and is not closed until the thread calls 'HANDLER TBL_NAME CLOSE' or the thread terminates. When this happens, the table is put back in the table cache (if the cache is not full). See *note handler::.

To determine whether your table cache is too small, check the 'Opened_tables' status variable, which indicates the number of table-opening operations since the server started:

 mysql> SHOW GLOBAL STATUS LIKE 'Opened_tables';
 +---------------+-------+
 | Variable_name | Value |
 +---------------+-------+
 | Opened_tables | 2741  |
 +---------------+-------+

If the value is very large or increases rapidly, even when you have not issued many 'FLUSH TABLES' statements, increase the 'table_open_cache' value at server startup.

 File: manual.info.tmp, Node: creating-many-tables, Prev: table-cache, Up: optimize-multi-tables

8.4.3.2 Disadvantages of Creating Many Tables in the Same Database ..................................................................

If you have many 'MyISAM' tables in the same database directory, open, close, and create operations are slow. If you execute *note 'SELECT': select. statements on many different tables, there is a little overhead when the table cache is full, because for every table that has to be opened, another must be closed. You can reduce this overhead by increasing the number of entries permitted in the table cache.

 File: manual.info.tmp, Node: internal-temporary-tables, Next: database-count-limit, Prev: optimize-multi-tables, Up: optimizing-database-structure

8.4.4 Internal Temporary Table Use in MySQL

In some cases, the server creates internal temporary tables while processing statements. Users have no direct control over when this occurs.

The server creates temporary tables under conditions such as these:

To determine whether a statement requires a temporary table, use note 'EXPLAIN': explain. and check the 'Extra' column to see whether it says 'Using temporary' (see note using-explain::). 'EXPLAIN' does not necessarily say 'Using temporary' for derived or materialized temporary tables.

Some query conditions prevent the use of an in-memory temporary table, in which case the server uses an on-disk table instead:

The server does not use a temporary table for note 'UNION': union. statements that meet certain qualifications. Instead, it retains from temporary table creation only the data structures necessary to perform result column typecasting. The table is not fully instantiated and no rows are written to or read from it; rows are sent directly to the client. The result is reduced memory and disk requirements, and smaller delay before the first row is sent to the client because the server need not wait until the last query block is executed. note 'EXPLAIN': explain. and optimizer trace output reflects this execution strategy: The 'UNION RESULT' query block is not present because that block corresponds to the part that reads from the temporary table.

These conditions qualify a 'UNION' for evaluation without a temporary table:

Internal Temporary Table Storage Engine

An internal temporary table can be held in memory and processed by the 'MEMORY' storage engine, or stored on disk by the 'InnoDB' or 'MyISAM' storage engine.

If an internal temporary table is created as an in-memory table but becomes too large, MySQL automatically converts it to an on-disk table. The maximum size for in-memory temporary tables is defined by the 'tmp_table_size' or 'max_heap_table_size' value, whichever is smaller. This differs from 'MEMORY' tables explicitly created with *note 'CREATE TABLE': create-table. For such tables, only the 'max_heap_table_size' variable determines how large a table can grow, and there is no conversion to on-disk format.

The 'internal_tmp_disk_storage_engine' variable defines the storage engine the server uses to manage on-disk internal temporary tables. Permitted values are 'INNODB' (the default) and 'MYISAM'.

Note:

When using 'internal_tmp_disk_storage_engine=INNODB', queries that generate on-disk internal temporary tables that exceed *note 'InnoDB' row or column limits: innodb-limits. return 'Row size too large' or 'Too many columns' errors. The workaround is to set 'internal_tmp_disk_storage_engine' to 'MYISAM'.

When an internal temporary table is created in memory or on disk, the server increments the 'Created_tmp_tables' value. When an internal temporary table is created on disk, the server increments the 'Created_tmp_disk_tables' value. If too many internal temporary tables are created on disk, consider increasing the 'tmp_table_size' and 'max_heap_table_size' settings.

Internal Temporary Table Storage Format

In-memory temporary tables are managed by the 'MEMORY' storage engine, which uses fixed-length row format. 'VARCHAR' and 'VARBINARY' column values are padded to the maximum column length, in effect storing them as 'CHAR' and 'BINARY' columns.

On-disk temporary tables are managed by the 'InnoDB' or 'MyISAM' storage engine (depending on the 'internal_tmp_disk_storage_engine' setting). Both engines store temporary tables using dynamic-width row format. Columns take only as much storage as needed, which reduces disk I/O, space requirements, and processing time compared to on-disk tables that use fixed-length rows.

For statements that initially create an internal temporary table in memory, then convert it to an on-disk table, better performance might be achieved by skipping the conversion step and creating the table on disk to begin with. The 'big_tables' variable can be used to force disk storage of internal temporary tables.

 File: manual.info.tmp, Node: database-count-limit, Next: table-size-limit, Prev: internal-temporary-tables, Up: optimizing-database-structure

8.4.5 Limits on Number of Databases and Tables

MySQL has no limit on the number of databases. The underlying file system may have a limit on the number of directories.

MySQL has no limit on the number of tables. The underlying file system may have a limit on the number of files that represent tables. Individual storage engines may impose engine-specific constraints. 'InnoDB' permits up to 4 billion tables.

 File: manual.info.tmp, Node: table-size-limit, Next: column-count-limit, Prev: database-count-limit, Up: optimizing-database-structure

8.4.6 Limits on Table Size

The effective maximum table size for MySQL databases is usually determined by operating system constraints on file sizes, not by MySQL internal limits. For up-to-date information operating system file size limits, refer to the documentation specific to your operating system.

Windows users, please note that FAT and VFAT (FAT32) are not considered suitable for production use with MySQL. Use NTFS instead.

If you encounter a full-table error, there are several reasons why it might have occurred:

 File: manual.info.tmp, Node: column-count-limit, Prev: table-size-limit, Up: optimizing-database-structure

8.4.7 Limits on Table Column Count and Row Size

This section describes limits on the number of columns in tables and the size of individual rows.

Column Count Limits

MySQL has hard limit of 4096 columns per table, but the effective maximum may be less for a given table. The exact column limit depends on several factors:

Row Size Limits

The maximum row size for a given table is determined by several factors:

Row Size Limit Examples

 File: manual.info.tmp, Node: optimizing-innodb, Next: optimizing-myisam, Prev: optimizing-database-structure, Up: optimization