Advanced MySQL Administration
 
Administration of InnoDB Tables
 
As you know InnoDB tables offer, in comparison to MyISAM tables, a number of additional functions (in particular, transactions and integrity rules). However, the use of InnoDB tables makes the administration of MySQL a bit more complex. This section provides some tips for managing InnoDB tables and for the optimal configuration of the MySQL server.
 
Tablespace Administration
 
While MyISAM files and indexes are stored in their own files within a directory with the name of the database (e.g., data/dbname/tablename.myd), all InnoDB tables and indexes are stored in a virtual file system, which in the InnoDB documentation is called the tablespace. The tablespace itself can be composed of a number of files.
 
You can use InnoDB tables without any particular configuration settings, in which case at the first start of the MySQL server the file ibdata1, of size 10 megabytes, is created for the tablespace and can be enlarged repeatedly by 8 megabytes as required.
 
Determining the Space Requirements of InnoDB Tables
 
The tablespace is more or less a black box that you cannot see into. For example, there is no command to return any sort of directory of the tablespace. The command SHOW TABLE STATUS does give information about how much space the individual InnoDB tables and their indexes require within the tablespace and how much space is available before the tablespace will have to be enlarged.
 
Configuration of the tablespace Files
 
Where and in what size tablespace files are created is determined by the configuration parameters innodb_data_home and innodb_data_file_path. The former specifies the directory in which all InnoDb files are stored (by default the MySQL data directory), and the latter contains the names and sizes of the tablespace files. A possible setting in the MySQL configuration file is the following:
 
#in /etc/my.cnf bzw. Windows\my.ini
[myslqd]
innodb_data_home = c:\[my_sql_install_dir]\data
innodb_data_file_path = ibdata1:1G;ibdata2:1G:autoextend:max:2G
 
This means that the tablespace consists of the files D:\data\ibdata1 and D:\data\ibdata2.If these files do not exist, they will be created, of size 1 gigabyte each. If the InnoDB tables require more space while the server is running, then ibdata2 will be enlarged automatically in 8-megabyte increments (to a maximum of 2 gigabytes).
 
In the management of the tablespace files you should observe the following:
 
. The MySQL server requires write privileges for the innodb_data_home directory so that it can create and alter the tablespace files.
 
. As administrator you must enlarge the tablespace in good time (that is, before reaching the limits of a tablespace file). If the InnoDB driver determines in a transaction that the tablespace is full and cannot be enlarged any further, the transaction will be aborted with ROLLBACK.
 
Using Hard Drive Partitions Directly
 
InnoDB also offers the option of using an entire hard drive partition directly (that is, without a file system managed by the operating system). For this, instead of specifying the file name, you give the device name of the partition and append the exact size specification (newraw). The partition size must be an integer multiple of 1 megabyte. (The following example uses the Linux device notation.)
 
innodb_data_home_dir=
innodb_data_file_path=/dev/hdb1:61440Mnewraw
 
After the partition has been initialized by MySQL, you must stop the MySQL server and replace newraw by raw.(The former is necessary only if you are adding a new partition.)
 
innodb_data_file_path=/dev/hdb1:61440Mraw
 
The InnoDB documentation unfortunately contains no information about whether better performance can be obtained by the direct use of hard drive partitions (one suspects that it can) and if so, how much. I have been able to obtain little concrete information from the MySQL mailing list.
 
Enlarging the tablespace
 
In principle, one cannot enlarge individual tablespace files. (The one exception is the autoextend attribute, which, however, can be specified only for the last tablespace file.) To enlarge the tablespace, you must therefore add an additional file to innodb_data_file_path. The process looks in detail like this:
 
. Stop the MySQL server.
 
. If the size of the last tablespace file was variable due to autoextend, you must determine its actual size in megabytes (the number of bytes according to DIR or ls divided by 1,048,576). The resulting size must be specified in the innodb_data_file_path setting.
 
If innodb_data_file_path does not yet exist in the configuration file, then previously, ibdata was used as the default file, and you must determine and specify its size.
 
. Add one or more new files to innodb_data_file_path.
 
All files must be located in the one directory (or in directories relative to it). If you wish to divide the tablespace files among several partitions, hard drives, etc., then you must specify an empty character string for innodb_data_home, and in innodb_data_file_path you must use absolute file names.
 
Note that the order of the files specified up to now in innodb_data_file_path cannot be changed (and that of course, none of the previous files may be missing).
 
. Restart the MySQL server. If the server does not detect an erroneous configuration, it will generate the new tablespace files. This process will also log any errors in the file hostname.err.
 
Let us suppose that the previous setting looks like the following and that ibdata2 has current size of 1904 megabytes:
 
innodb_data_file_path = ibdata1:1G;ibdata2:1G:autoextend
 
Younow want to increase the size of the tablespace to 4 gigabytes. The new setting must look like the following:
 
innodb_data_file_path =
ibdata1:1G;ibdata2:1904M;ibdata3:1100MB:autoextend
 
Shrinking the Size of the tablespace
 
It is, unfortunately, impossible to make the tablespace smaller. If you delete large InnoDB tables or change them into another table format, the space within the tablespace is freed up, but the tablespace files cannot be made smaller. The only way to shrink them is by the following process:
 
. Make a backup of all InnoDB tables with mysqldump.
. Delete all InnoDB tables (DROP TABLE ... ).
. Stop the MySQL server (mysqladmin shutdown).
. Delete the current tablespace files (ibdata ... ). If you have enough space, it is, of course, more secure first to move the files into another directory.
. Change innodb_data_file_path in the MySQL configuration file.
. Restart the MySQL server. New tablespace files corresponding to the innodb_data_file_path setting will be created.
. Recreate all your InnoDB tables from your backup files.
 
Word of advice

Unfortunately, there is no way of obtaining a list of all InnoDB tables. If you are managing a large number of databases, you must search through the backup for each database for InnoDB tables. It is simpler to make a backup of all databases. The re-creation of the databases is also easier. But this way of proceeding makes sense only if the MyISAM tables constitute a small portion of the data or when you are going to make a complete backup anyhow. An alternative is to transform the InnoDB tables temporarily into MyISAM tables. This is easy to do, but involves a great deal of internal processing, and therefore, it is particularly slow for large tables. Furthermore, problems can arise if the tables use InnoDB-specific properties (such as foreign key constraints).
 
Copying, Deleting, and Moving InnoDB Tables
 
From your experience with MyISAM tables you are accustomed to the fact that (after a server shutdown) you can simply copy or move all files dbname.tablename. The MySQL server recognizes at restart which tables are to be found where. This is very practical for backups or making a copy of a table or database.
 
If you use InnoDB tables, none of this is possible. If you wish to copy a table, you must either create a new table and copy the data with INSERT ... SELECT, or you must make a backup of the table (mysqldump) and then create the table under another name.
 
Furthermore, caution is necessary with the *.frm files. These files give the structure of a table and are located in the relevant database directory (even with InnoDB tables!). The *.frm files and associated tables stored in tablespace must always be synchronized. You must not simply delete *.frm files. If you wish to delete a table, execute DROP TABLE; then the *.frm file will be deleted.
 
Making a Backup
 
There are several ways of making a backup of InnoDB tables:
 
. The most elegant way is to use InnoDB Hot Backup. This auxiliary program makes the backup while the server is running, without blocking tables.
 
. Of course, you can always use mysqldump. However, the results are consistent only if the tables are not changed during the execution of the command.
 
. If youare prepared to stop the MySQL server, you can simply copy the tablespace files. Note, though, that you must use the exact same innodb_data_file_path setting as well as all *.frm files. On the other hand, you do not require the transaction logging files. If the server was properly shut down, these files no longer contain any relevant data.
 
Moving the tablespace
 
According to the MySQL documentation, the tablespace files are independent of the operating system. Only the CPU's floating-point representation must be correct. If those conditions are satisfied, then the tablespace files can be moved without problem between, say, Windows and Linux. Of course, you must take care here, too, that the innodb_data_file_path setting is correct and that all *.frm files are moved as well. (In practice, it will usually be the case that both MyISAM and InnoDB tables are to be copied. For this, all database directories and all tablespace files are simply copied.
 
The InnoDB documentation contains no information as to whether the tablespace file format is dependent in any way on the version. Heikki Tuuri, the developer of the InnoDB table driver, has promised forward compatibility for all present and future versions. (It has always been a tradition with MySQL that database files be able to be moved without difficulty to a new version.)
 
Within MySQL 4.0.n I experience no problems with changing among numerous versions; however, I have not checked backward compatibility to earlier versions. In general, an occasional backup with mysqldump is a good idea. The backup file then exists in text format, which is immune against possible compatibility problems.
 
Logging Files
 
Transaction Logging
 
InnoDB logs all changes in logging files with the names ib_logfile0, ib_logfile1, etc. The purpose of these logging files is to make large transactions possible as well as to restore InnoDB data after a crash.
 
If MySQL is properly configured and there is sufficient memory, then most of the currently needed data should reside in RAM. To improve speed, changes to data are first made only in RAM, and not in the actual data files (that is, in the case of InnoDB, in the tablespace).
 
Only when a transaction is completed with COMMIT are the changes in data actually stored on the hard drive, and then first in the InnoDB logging files ib_logfile0, ib_logfile1, etc. The changed parts of the tablespace are only gradually transferred to the hard drive, all this for reasons of efficiency. If a crash occurs during these proceedings, then the tablespace can be restored with the help of the logging files.
 
The logging files ib_logfile0, etc., have two purposes. On the one hand, they satisfy the ACID condition of durability, so that transactions that have been carried out are not endangered even if there is a crash immediately after the transaction has been completed. On the other hand, the logging files enable transactions of almost unlimited size, even those for which it is not possible to hold all pending (but not yet confirmed) changes in RAM.
 
The InnoDB logging files are filled in order. When the last file is full, the InnoDB table driver begins writing data to the first logging file. Therefore, the entire size of all logging files limits the quantity of tablespace changes that can be temporarily stored before a COMMIT. The maximum size of all logging files is currently (MySQL 4.0.9) limited to 4 gigabytes.
 
The transaction logging files are necessary only while the MySQL server is running. As soon as the MySQL server has been properly shut down, these files are no longer needed. For example, if you make a backup with ibdata files, you do not need to copy the logging files.
 
Size and Location of the Logging Files
 
The proper dimensioning of the logging files has a great influence on the speed of MySQL/InnoDB. The location, size, and number of logging files are determined by the configuration parameters innodb_log_group_home, innodb_log_files_in_group, and innodb_log_file_size. By default, two logging files of size 5 megabytes each are created in MySQL's data directory.
 
The InnoDB documentation recommends that the total size of the logging files be about the size of the buffer (parameter innodb_buffer_pool_size, which is 8 megabytes by default). If the buffer is larger than the logging files, then it can happen that the InnoDB table driver will have to make a so-called checkpoint, involving temporary storage of uncommitted changes.
 
If you wish to change the location, size, or number of logging files, you must stop the MySQL server. Now delete the existing logging files ib_logfilen (only when you are sure that shutdown took place without error) and change /etc/my.cnf or Windows\my.ini. New logging files will be created at the subsequent restart of the MySQL server.
 
Logging Synchronization
 
The configuration parameters innodb_flush_log_at_trx_commit and innodb_flush_method tell when (how often) and how logging files are synchronized. For innodb_log_at_trx_commit there are three settings, which allow one to make tradeoffs between speed and security:
 
. The default setting is 0. The data are written about once per second into the current logging file, and then the file is synchronized. (Writing means here that the data are passed to an I/O function of the operating system. Synchronization means that changes are actually physically written to the hard drive.)
 
If there is a crash during the time between the COMMIT and the synchronization of the logging file, the transaction is lost and cannot be reconstructed at a later restart. Therefore, innodb_flush_log_at_trx_commit=0 is a strike against the ACID durability condition.
 
. More secure is the setting 1. Now writing and synchronization take place with each COMMIT. The drawback is that if you make mostly small transactions, then the hard drive limits the number of possible transactions per second. (For a hard drive with 7200 revolutions per minute, that is, 120 per second, at most 120 transactions can be executed per second, a theoretical limit that is never reached in practice.)
 
. The setting 2 is a good compromise. Here the writing takes place at each COMMIT, but the synchronization only about once per second. If the MySQL server crashes, then immediately terminated transactions are not lost (since the synchronization can take place after a crash). However, if the operating system crashes (power outage, for example), then transactions are lost as with setting 0.
 
The parameter innodb_flush_method determines whether the operating system function fsync (the default) or O_SYNC (setting O_DSYNC) is used for synchronizing the logging files. With many Unix versions, O_SYNC is faster.
 
Archive Logging
 
The transaction logging files are conceived only for internal data management, not for backups. If you require a record of all changes to data since a particular time (since the last complete backup), then you must use MySQL binary logging, which functions entirely independently of the InnoDB table driver. (See the discussion earlier in this chapter.)
 
InnoDB can, in principle, also carry out such logging. This type of logging is called archive logging in the InnoDB documentation. However, archive logging makes sense only if InnoDB is used independently of MySQL. The files ib_arch_log_n that appear in the data directory are a result of such archive logging. In them are logged, at the start of the MySQL server, the creation of new tablespace or transaction logging files. Then, however, archive logging is automatically shut off.
 
Should you wish for some reason to use archive logging, it can be activated with innodb_log_archive.
 
Tips for Speed Optimization
 
This section offers some tips for speed optimization. The information here is relevant only if you are using primarily InnoDB tables and are working with large data sets.
 
Buffer Settings
 
Perhaps the most important parameter for influencing the speed of the InnoDB table driver is innodb_buffer_pool_size. This parameter specifies how much RAM should be used for temporary storage of InnoDB tables and indexes. The more such data is available in RAM, the less often access must be made to the hard drive in SELECT commands. By default, the InnoDB table driver reserves only 8 megabytes as a buffer. Larger values (the InnoDB documentation recommends up to 80 percent of main memory for a dedicated database server) can dramatically increase the speed of SELECT queries. (The total size of the transaction logging files should be as large as the buffer storage.)
 
Depending on the application, two additional parameters influence what data are stored temporarily in RAM: innodb_log_buffer_size sets the size of the buffer for transaction logging, while innodb_additonal_mem_pool_size determines how much space in RAM is reserved for various other information such as metadata on open tables. This buffer (by default 1 megabyte) should be enlarged if you are dealing with a large number of InnoDB tables.
 
Block Operations
 
If you are carrying out extensive block operations (such as importing a table with a million data records or changing from MyISAM to InnoDB format), you can speed up the process with a few tricks:
 
. Use the setting SET unique_checks=0. Then no check is made whether the data of a UNIQUE column or the primary index column are actually unique. Of course, you should use this setting only if you are absolutely sure that there are, in fact, no duplicates!
 
. Use the setting SET foreign_key_checks=0. With this setting you achieve that the integrity conditions are not checked. Of course, here, too, this setting should be used only if you are convinced of the integrity of your data (e.g., in restoring backup data).
 
. Execute all the INSERT commands for a table as a single transaction. Usually, importation consists of countless INSERT commands, which by default (auto commit)are all carried out in separate transactions. With SET AUTOCOMMIT=0, all INSERT commands are collected into a single transaction. Keep in mind that this transaction must be confirmed with a COMMIT.
 
This technique works only if the transaction logging files are large enough. Note that with large transactions, a ROLLBACK can be unusually long, even taking hours. Note as well that CREATE TABLE has the effect of COMMIT. Therefore, it is impossible to read in several tables in a single transaction.