Optimizing Innodb for Bulk Insert

Bulk Insert? How about some some bulk raspberries?

Update: October 2nd, 2015: This article contains an important omission, as it does not consider LOAD DATA INFILE command. If the data that needs to be loaded can be loaded as is, or can be loaded with minimal tampering, I’d recommend using LOAD DATA INFILE first, and then apply the MySql server variable changes I recommended as needed. Additionally, this post only focuses on Innodb storage engine, if TokuDB is available on your platform, it may be worth looking into it as TokuDB’s primary use case is write-intensive environments.

Background

Need to parse, validate, and split into several Innodb tables a number of CSVs that range in size of 10,000 to over a million lines of data. These CSVs need to be loaded in a single transaction, such that if a problem occurs on the last line, the entire file is rolled back. As can be imagined, the database goes under immense write pressure during a small window of time. This post was warranted after seeing underwhelming performance. Here I document my quest.

The goal is to eek as much performance I can out of the test machine without compromising data integrity.

Test machine:

  • Ubuntu 14.04
  • MySql 5.6
  • 4GB RAM
  • Single virtualized core
  • Mounted on a ext4 filesystem

Tour of MySql Variables

I explored all the MySql server variables, I think I’ve come up with a pretty good list of variables to optimize bulk insert speed. Note that some of these variables, by default, are set to sensible values depending on the MySql version, but I believe being explicit is better than implicit, so I replicate it here.

  • innodb_change_buffering: set to all so that updates, inserts, and deletes are buffered in the Innodb buffer pool before being sent to disk.

  • innodb_change_buffer_max_size: Set to 25, so that up to 25% buffer pool is used to buffer changes. The default is 25 and that is a reasonable, if there is any desire to increase this number, just bump the amount of RAM and increase the buffer pool size.

  • innodb_buffer_pool_instances: This variable increases concurrency in regards to the number of caches there are for Innodb. There is no benefit to increasing this number, at least when regarding these kinds of writes, as the writes all deal with the same data. However, I don’t have strong evidence or reason to change it from the default. Depending on what MySql version, the default could be 1 or 8.

  • innodb_buffer_pool_size: The quintessential variable to change to increase performance. This is essentially the amount of RAM the server uses for caching data and indices. Since the MySql manual states to use up to 80% of the available RAM for this variable, I chose 3072M.

  • innodb_log_file_size: The larger the value, means the larger the log file can go before it is flushed into the database. Larger files also means a slower crash recovery, but in this case, I’m optimistic that the need for recovery will be need non-existent. There is a definitely a good reason to increase this variable, as some versions default to 5M, while newer versions default to 48M, and ‘for a high-performance workload you need hundreds of megabytes, or even gigabytes of log files’ (Schwartz, Zaitsev, & Tkachenko, 2012, p. 358). When changing this variable, please follow the Percona post to the letter, as I didn’t, and I corrupted the database. According to the manual, the max size is 1/2 of the innodb_buffer_pool_size, but an 1/8th in this case is plenty (384 MB).

  • innodb_log_buffer_size: This is the size of the buffer that is used to write to the log file. This is the one variable where I went unconventional. ‘The recommended range is 1 to 8 MB[, as] InnoDB flushes the buffer to the log files on disk when the buffer gets full, when a transaction commits, or once per second - whichever comes first’ (Schwartz, Zaitsev, & Tkachenko, 2012, p. 359) but I went with 128M and I saw a noticeable difference, so I kept it. It may be overkill because it is hard to imagine 128MB needing to be written, but the manual did have the following to say: “A large log buffer enables large transactions to run without a need to write the log to disk before the transactions commit. Thus, if you have big transactions, making the log buffer larger saves disk I/O.” So take it face value, this situation has unusually large transactions, so an unusually large log buffer size is warranted.

  • innodb_flush_log_at_trx_commit: Don’t change this value from the default of 1 because that signifies the log buffer is flushed to disk at each transaction commit. The other options deal with flushing every second, instead of each transaction, which is worse in this situation because the write transaction take longer than a second anyways. Also a value of one is the only one that guarantees ACID compliance.

  • skip-innodb_doublewrite: This is such an important system variable that I dedicate the entire next section to it. Note that this variable heavily depends on the file system that the database is hosted on.

Why the file system matters

I discovered why the file system matters when investigating why I wasn’t getting the performance I expected. I thought I knew where the bottleneck was, but like all the other stories out there, I was wrong. By using iotop, I found that a process called jbd2/sda1-8 was using 90-99% of the IO time, and that MySql throughput was limited to 500KB. Scrounging around the web, I discovered it was the journaling process of ext4, which ensures the file system against corruption, but wait, doesn’t Innodb ensure itself against corruption?

You can disable Innodb double write feature on transactional file systems because this variable dictates whether Innodb stores the data twice before it being stored in the database, though the MySql manual states that “the double write buffer does not require twice as much I/O overhead or twice as many I/O operation”. A transactional file system guarantees that partial writes are impossible, which is what the double write feature protects against. A type of transactional file system is a journaling file system. By default ext4 has journaling baked in, so one can save writes by skipping innodb_doublewrite

For completeness, this was how I mounted the drive:

/dev/sdb /mnt/database ext4 defaults,nodiratime 0 0

Pretty standard mount, the only oddity is nodiratime, which “disables writing file access times to the drive every time you read a file [and] disables the writing of file access times only for directories while other files still get access times written.” This is done to improve performance and the fact that file access times for a database aren’t important (and since a database accesses files a lot, this does indeed help). I did not benchmark the effects of nodiratime.

Some may argue to enable innodb_doublewrite and disable the journaling through data=writeback, but I disagree because disabling the journaling will affect all files on the drive and not just the database. I reasoned in my mind that it would be a lot more apparent why the database variable was set a certain way rather than the file system.

Options Not Considered

INSERT DELAYED. “INSERT DELAYED is slower than a normal INSERT if the table is not otherwise in use.” “As of MySQL 5.6.6, INSERT DELAYED is deprecated, and will be removed in a future release.” Also Innodb can’t use INSERT DELAYED

There is a way to skip checking for duplicates when inserting data:

If you have UNIQUE constraints on secondary keys, you can speed up table imports by temporarily turning off the uniqueness checks during the import session:

SET unique_checks=0;
... SQL import statements ...
SET unique_checks=1;

For big tables, this saves a lot of disk I/O because InnoDB can use its insert buffer to write secondary index records in a batch. Be certain that the data contains no duplicate keys.

Not an option because when splitting the CSV into multiple tables, checking for uniqeness and eliminating duplicates is key.

Another option is to not check foreign key constraints when loading the data:

If you have FOREIGN KEY constraints in your tables, you can speed up table imports by turning off the foreign key checks for the duration of the import session:

SET foreign_key_checks=0;
... SQL import statements ...
SET foreign_key_checks=1;

For big tables, this can save a lot of disk I/O.

A convincing argument can be made for using this technique. It is valid to use because the data files are independent of each other there will be no foreign key constraint violations. I decided against implementing this because these aren’t “big” tables (we’re talking about <10 million rows), and most, if not all, the data should fit in the buffer pool. If the data is in the buffer pool it is trivial to check for foreign key constraints.

Config

[mysqld]
innodb_change_buffering=all
innodb_change_buffer_max_size=25
# innodb_buffer_pool_instances=1 Change at your own discretion
innodb_buffer_pool_size=3072M
innodb_log_file_size=384M
innodb_log_buffer_size=128M
innodb_flush_log_at_trx_commit=1
skip-innodb_doublewrite

The Result

The result of increasing the log buffer and file, and skipping the double write cut the time for loading the files to a tenth of the original time. I’m very satisfied with this result.

Resources:

References

  1. Schwartz, B., Zaitsev, P., & Tkachenko, V. (2012). High Performance MySQL: Optimization, Backups, and Replication (3rd ed.). O’Reilly.

Comments: