Making Queries 45-90 Times Faster!!

aka…..”when good queries go bad!”

So, today the developers were debugging why a script was running much longer than expected. They were doing text database inserts, and got to the point where they realized that double the amount of text meant the queries took double the amount of time.

You see, they were doing similar text inserts over and over, instead of using connection pooling and/or batching them. Apparently the other DBA explained that it was a limitation of MySQL, but either the developers didn’t convey what they were doing well, or the DBA didn’t think to mention batching.

I ran a simple test on a test server. I used the commandline to connect to a db server on the same machine (even though in qa and production the db machine is on a different machine) just to make a point:

Queries per connect

Type Connects Queries Length of data transmitted Time
One-off 1000 1 619 bytes 12.232s
Single Connection 1 1000 604 kilobytes 0.268s
Batch 1 1 517 kilobytes 0.135s

So 1000 INSERTs using 1 connection is over 45 times faster than 1000 INSERTs using 1000 connections.
Using 1 batch INSERT statement is over 1.75 times faster than using 1 connection.
Using 1 batch INSERT statement is over 90 times faster than 1000 INSERTs using 1000 connections.

Note that while it’s faster to send a batch, if you don’t support sending 517 kilobytes to your database at once, you’ll want to break it up. That’s a small coding price to pay for 90x the database performance!!!

For reference, the formats used:
One-off:
INSERT INTO foo (col1, col2…) VALUES (val1, val2…);

Single Connection:
INSERT INTO foo (col1, col2…) VALUES (val1, val2…);
INSERT INTO foo (col1, col2…) VALUES (val1a, val2a…);

Batch: INSERT INTO foo (col1, col2…) VALUES (val1, val2…), (val1a, val2a);

aka…..”when good queries go bad!”

So, today the developers were debugging why a script was running much longer than expected. They were doing text database inserts, and got to the point where they realized that double the amount of text meant the queries took double the amount of time.

You see, they were doing similar text inserts over and over, instead of using connection pooling and/or batching them. Apparently the other DBA explained that it was a limitation of MySQL, but either the developers didn’t convey what they were doing well, or the DBA didn’t think to mention batching.

I ran a simple test on a test server. I used the commandline to connect to a db server on the same machine (even though in qa and production the db machine is on a different machine) just to make a point:

Queries per connect

Type Connects Queries Length of data transmitted Time
One-off 1000 1 619 bytes 12.232s
Single Connection 1 1000 604 kilobytes 0.268s
Batch 1 1 517 kilobytes 0.135s

So 1000 INSERTs using 1 connection is over 45 times faster than 1000 INSERTs using 1000 connections.
Using 1 batch INSERT statement is over 1.75 times faster than using 1 connection.
Using 1 batch INSERT statement is over 90 times faster than 1000 INSERTs using 1000 connections.

Note that while it’s faster to send a batch, if you don’t support sending 517 kilobytes to your database at once, you’ll want to break it up. That’s a small coding price to pay for 90x the database performance!!!

For reference, the formats used:
One-off:
INSERT INTO foo (col1, col2…) VALUES (val1, val2…);

Single Connection:
INSERT INTO foo (col1, col2…) VALUES (val1, val2…);
INSERT INTO foo (col1, col2…) VALUES (val1a, val2a…);

Batch: INSERT INTO foo (col1, col2…) VALUES (val1, val2…), (val1a, val2a);

Comments are closed.