Tag Archives: query review

Proactive MySQL: Query Reviews (part 1, overview)

One task that can really help reduce future problems is to do a periodic query review. I’ve been using pt-query-digest to do this since 2010 (back when it was part of Ma’atkit, mk-query-digest!), and while I have presented the idea several times at conferences, I have never blogged about it.

I am going to share a secret with you – I blog not just to share information with YOU, but to share information with ME. Future me. This comes up because I am working on a query review at Salesforce.com for one of our busiest shards. I went to go look up a blog post on using pt-query-digest, because it’s a handy cheat sheet….and….I never did it. So, here goes!

This is the first blog post in what will be a series of blog posts, because it is a long topic and there are a few tools that can be used these days for analysis.

What is a query review?
At its simplest, it’s a review of queries. In this context, a query review is where you proactively review the performance of ALL successful queries sent to a server.

Why should you do a query review?
A query review can find possibly problematic queries BEFORE they are a problem. As an example, you can easily find queries that do not use an index, and make indexes before the tables become so large that they have problems.

Another example is the case when you have a query that does the same thing over and over, many times per second – for example, a query that counts the sessions table every time a user hits the page, to say “x people online now”. That query could be rewritten to not do a count every time – maybe do a count once every minute or 5 minutes and put inside a “user count” table, and then each page hit queries that table. Or use an intermediate cache.

Another reason to do a query review is that sometimes a sample query is difficult to find.

Who should do a query review?
A query review is not for a junior DBA. You need to have some knowledge of how to optimize queries, how indexing works, when an index is valuable, and when an index is necessary. For example, a query not using an index on a text field in the “countries” table is much better than a query not using an index on an integer in the “customer” table, because countries won’t grow to be huge but hopefully your customers table will.

What does a query review look like?
Necessary for a query review is gathering “all” queries. This can be done in many ways:

  • general log – logs all queries when they are sent to the server
    • Pros
    • Built into all versions and forks of MySQL.
    • Gets ALL queries sent to the server, even ones that have an error. You can see if there are lots of queries with syntax errors, which can help find code/ORM bugs.
    • Gets ALL attempted logins, even if they fail, so it’s useful for a security/technical debt.
    • You can turn it on dynamically since….MySQL 5.1 (I think? at any rate, probably your version has it dynamic)
    • MySQL overhead
    • Cons
    • because the logging happens when they are sent, there is no indication if the query completed successfully, or how long the query took.
    • Write intensive to disk
    • Grows at a large rate
  • slow query log with long_query_time turned to 0 – gets all *successful* queries
    • Pros
    • Built into all versions and forks of MySQL.
    • Can be turned on dynamically (since, I think, MySQL 5.1, same as general log).
    • Gets lots of information, including lock timing, query execution timing, rows returned, whether the query was successful or not.
    • Cons
    • Does not get ALL attempted queries – only gets some errors (e.g., not syntax errors)
    • Does not get failed logins
    • Write intensive to disk
    • Grows at a large rate
    • MySQL overhead
  • tcpdump and other traffic sniffers (wireshark, built-in sniffers to programs like MONyog, etc)
    • Pros
    • Built into every platform
    • Gets all MySQL traffic, including attempted logins and queries with syntax errors
    • Gets information like execution time, users and hosts.
    • No additional MySQL overhead
    • Cons
    • Must be root to run tcpdump
    • Write intensive to disk
    • Grows at a large rate
  • PERFORMANCE_SCHEMA, pt-query-digest –processlist, proxies, audit logs

Well, that’s a lot of words for one blog post, so I’ll end part 1: overview here. Future posts in the series will cover how to use the tools to do a query review.