37

I have an SQL dump. It's pretty big (411 MB) and it took 10 minutes to import on server A; the same import on my workstation B has an estimated time (pipeviewer) of 8 hours (it imported 31 MB in 40 minutes), so it is a factor of 53 slower.

The specs:

Server A:
   MySQL Version: 5.5.30-1.1 (Debian)
   2 GB RAM
   1 core QEMU Virtual CPU version 1.0 - cpu MHz: 3400.020

Workstation B: 
   MySQL Version: 5.5.41-MariaDB-1ubuntu0.14.04.1
   14 GB RAM
   4 cores Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz - cpu MHz: 1600.000

The MySQL/MariaDB config is the stock config.

I switched to MariaDB on my workstation yesterday, but before MariaDB the stats were even worse.

I already removed all databases on my workstation - no difference.

The big question is: how can the performance be a factor of 53 slower? I cannot work like this :-(

My import command:

pv sql/master.sql | mysql -h'localhost' -u'root' -p'root' 'master'

iostat -xm 5

server A:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
      17,43    0,00   30,28   51,85    0,00    0,44

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00   254,03    0,00 1305,45     0,00     6,09     9,56     0,78    0,60    0,00    0,60   0,57  74,25

workstation B:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
       7,32    0,00    3,22    5,03    0,00   84,42

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     1,40    0,80  172,40     0,00     0,56     6,72     1,17    6,75   12,00    6,72   5,40  93,52

dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc

server A:

1073741824 bytes (1,1 GB) copied, 18,6947 s, 57,4 MB/s

workstation B:

1073741824 bytes (1,1 GB) copied, 8,95646 s, 120 MB/s
Alex

4 Answers

68

This answer sped everything up a lot:

https://stackoverflow.com/a/2167641/292408

I simply added

SET autocommit=0;
SET unique_checks=0;
SET foreign_key_checks=0;

at the beginning, and

COMMIT;
SET unique_checks=1;
SET foreign_key_checks=1;

at the end.

Now it took 3 minutes.

(Courtesy of @andreasemer via twitter)
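
A shell variant (an untested sketch, reusing the pv import command from the question) that injects those statements without editing the dump file:

{ echo "SET autocommit=0; SET unique_checks=0; SET foreign_key_checks=0;";
  pv sql/master.sql;
  echo "COMMIT; SET unique_checks=1; SET foreign_key_checks=1;";
} | mysql -h'localhost' -u'root' -p'root' 'master'

The brace group runs the three commands in sequence and feeds their combined output into a single mysql session, so the settings apply to the whole import.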

Alex
  • In case your data is corrupt, foreign key constraints might not be met after import. – Alex Nov 27 '17 at 12:21
  • Didn't speed up performance for me. Without this it was 23 minutes and after it was 23 minutes (3GB .sql file). – Joshua Pinter Dec 06 '19 at 03:58
  • Well, thanks a lot for this! – ronit Dec 06 '21 at 10:09
  • @Alex this should not be possible if the foreign key checks were enabled on the database you generated the mysql dump from? So long as the checks are only being disabled for import to a new database (not in use) there should be no problem doing this. – Haravikk Apr 18 '22 at 15:40
7

Complementing what I see above: my dump file is already generated automatically by something like:

mysqldump my_db > db-dump-file.sql

I want to automate this import, so I created two files on my computer, default-start-import.sql and default-end-import.sql. The contents of default-start-import.sql are:

SET autocommit=0;

and default-end-import.sql:

COMMIT;
SET autocommit=1;

and the command I run is something like this:

cat default-start-import.sql db-dump-file.sql default-end-import.sql | mysql my_other_db

The same command, but easier to read:

cat default-start-import.sql \
    db-dump-file.sql \
    default-end-import.sql \
| mysql my_other_db

In this case cat is used to concatenate those files before sending them to the pipe. I think it is important that all files end with a newline character (an empty line at the end of the file when viewed in a text editor) so that the cat command doesn't merge lines between files.

The import works fine. I have not tested whether it is actually faster because of the enable-and-disable-autocommit improvement, but if that does make things faster, these extra steps make it easier.
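
If you also want the unique-check and foreign-key settings from the accepted answer, the two helper files could look like this (a sketch, same file names as above):

default-start-import.sql:

SET autocommit=0;
SET unique_checks=0;
SET foreign_key_checks=0;

default-end-import.sql:

COMMIT;
SET unique_checks=1;
SET foreign_key_checks=1;
SET autocommit=1;

The same cat pipeline then applies all of the settings to the whole import.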

santiago arizti
6

To complement Alex's answer above, you could also save and restore the original values rather than just blindly setting them to 1 afterwards. In most cases, setting them back to 1 is what you want, but just in case, here's a way to set them back to the way they were before the import took place:

PRE Import:

SET @OLD_AUTOCOMMIT=@@AUTOCOMMIT, AUTOCOMMIT = 0;
SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS = 0;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS = 0;

POST Import:

SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;
SET AUTOCOMMIT = @OLD_AUTOCOMMIT;
COMMIT;
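
If you want to confirm the session is back to its original state after the import (assuming you run the check in the same session), a quick sanity check is:

SELECT @@autocommit, @@unique_checks, @@foreign_key_checks;

All three are set at session scope here, so the save-and-restore only affects the connection doing the import.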
PadraigD
3

I've tried --compress as well as SET autocommit=0, and they helped a small amount; however...

I found that converting multiple INSERT INTO ... statements into one large statement with multiple VALUES(...), (...) improved speed considerably.
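
For illustration, with a hypothetical two-column table t, the difference is between many statements of the first form and one statement of the second:

INSERT INTO t (a, b) VALUES (1, 'x');
INSERT INTO t (a, b) VALUES (2, 'y');
INSERT INTO t (a, b) VALUES (3, 'z');

versus

INSERT INTO t (a, b) VALUES
  (1, 'x'),
  (2, 'y'),
  (3, 'z');

Note that mysqldump produces the multi-row form by default (--extended-insert), so this mostly matters for dumps generated with --skip-extended-insert or by other tools.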

I'm using mysql over SSL over WAN. The remote MySQL database is hosted on Amazon.

Edit: See comments. Tested over LAN too, same 60% gains.

With 9 columns and 2,100 rows:

  • 2,100 separate INSERT statements: 82s
  • 2 consolidated INSERT statements: < 1s

With 7 columns and 42,000 rows:

  • 42,000 separate INSERT statements: 1,740s
  • 42 consolidated INSERT statements: 105s

So depending on the tool generating the database dump (or, more specifically, the format of the INSERT statements it produces), import speed can vary considerably.

Note: This also decreases the size of the .sql dump file by over 60% in my tests, so it will save on I/O as well.

Warning: There are practical limits to this technique with MySQL (e.g. the maximum statement/packet size), and for those needing portability, note that SQL Server seems to be limited to only 1,000 rows at a time.

Still, doing 1,000 rows at a time for 42,000 rows yields a 1,657% improvement!

tresf
  • I believe your boost to import speed is due to network connection overhead being minimised. You are essentially connecting far fewer times and transferring more data in each connection. This wouldn't really help in the OP's case as his is localhost so connection time is negligible anyway. – PadraigD Jun 29 '21 at 00:44
  • @PadraigD, no, it's still way faster. Try this: time sudo mysql < world.sql; time sudo mysql < world2.sql. Files: https://gist.github.com/tresf/1d3e65a64a11ccb1c5101069b768f4a2. My machine shows ~450ms for the former, ~190ms for the latter, a reproducible 60% improvement, even on localhost, consistent with the results from this answer. CAUTION: It will drop a database called world if it exists. – tresf Jun 30 '21 at 15:23