Category Archives: indexes

Lesson 06: Working with Database Structures

Notes/errata/updates for Chapter 6:
See the official book errata at http://tahaghoghi.com/LearningMySQL/errata.php – Chapter 6 includes pages 179 – 222.

Other notes:
At the end of the “Creating Tables” section (p.183-4), it says “We like using the underscore character to separate words, but that’s just a matter of style and taste; you could use underscores or dashes, or omit the word-separating formatting altogether.” While this is true, beware of using a dash, because MySQL will try to interpret “two-words”, thinking – is a minus sign. I recommend avoiding dashes for this reason (even though the book does this on page 215).

At the end of the “Collation and Character Sets” section (p.186), it says “When you’re creating a database, you can set the default character set and sort order for the database and its tables.” Note that the default character set for the server will set the default character set for any new databases created if a default character set is not specified; there is no change in existing databases. In turn, the default character set for the database sets the default character set for any new tables created but does not change any existing tables, and the default character set for a table determines the default character set for each column, which can be overridden by specifying a character set when defining a column.

Under the “Other Features” section it references a section called “Table Types”. This section is in chapter 7, p. 267.

Under the “Other Features” section it shows the SHOW CREATE TABLE example (p. 187). By default, MySQL shows you output in horizontal format – that is, one table row is shown as one row in the output. However, you can have MySQL show you output in vertical format, where one column is shows as one row in the output. Instead of using ; to end a query, use \G

Try it with:
SHOW CREATE TABLE artist;
vs
SHOW CREATE TABLE artist\G

And see the difference.

In the “Column Types” section on page 194, it says that “Only one TIMESTAMP column per table can be automatically set to the current date and time on insert or update.” This is not true as of MySQL version 5.6.5 and higher. As per the documentation at https://dev.mysql.com/doc/refman/5.6/en/timestamp-initialization.html: “For any TIMESTAMP or DATETIME column in a table, you can assign the current timestamp as the default value, the auto-update value, or both.”

In the section called “The AUTO_INCREMENT Feature”, on page 211, it says “If, however, we delete all the data in the table, the counter is reset to 1.” The example shows the use of TRUNCATE TABLE. Note that if you deleted all the data in the table with DELETE, such as “DELETE FROM count WHERE 1=1;”, the counter is NOT reset.

Supplemental material:
Data types:
Podcast on Strings
Podcast on Numeric data types
Podcast on ENUM, SET and different SQL modes
Podcast on Times and time zones

Topics covered:
How to CREATE, DROP and ALTER databases, tables, columns and indexes
Collations and character sets
Data types
AUTO_INCREMENT

Reference/Quick Links for MySQL Marinate

Cost/Benefit Analysis of a MySQL Index

We all know that if we add a MySQL index to speed up a read, we end up making writes slower. How often do we do the analysis to look at how much more work is done?

Recently, a developer came to me and wanted to add an index to a very large table (hundreds of gigabytes) to speed up a query. We did some testing on a moderately used server:

Set long_query_time to 0 and turn slow query logging on
Turn slow query logging off after 30 minutes.

Add the index (was on a single field)

Repeat the slow query logging for 30 minutes at a similar time frame (in our case, we did middle of the day usage on a Tuesday and Wednesday, when the database is heavily used).

Then I looked at the write analysis – there were no DELETEs, no UPDATEs that updated the indexed field, and no UPDATEs that used the indexed field in the filtering. There were only INSERTs, and with the help of pt-query-digest, here’s what I found:

INSERT analysis:
Query hash 0xFD7…..
Count: 2627 before, 2093 after
Exec time:
– avg – 299us before, 369us after (70us slower)
– 95% – 445 us before, 596us after
– median – 273us before, 301us after

I extrapolated the average per query to 2400 queries, and got:
**Total, based on 2400 queries – 71.76ms before, 88.56ms after, 16.8ms longer**

There was only one read query that used the indexed field for ORDER BY (or anywhere at all!), so the read analysis was also simple:

Read analysis:
Query hash 0xF94……
Count:187 before, 131 after
Exec time:
– avg – 9ms before, 8ms after. 1 ms saved
– 95% – 20ms before, 16 ms after
– median – 9ms before, 8 ms after

Again, extrapolating to average for 150 queries:
**Total, based on 150 queries: 150ms saved**

So we can see in this case, the index created a delay of 16.8 ms in a half-hour timeframe, but saved 150 ms in reads.

It is also impressive that the write index added very little time – 70 microseconds – but saved so much time – 1 millisecond – that there were 16 times the number of writes than reads, but we still had huge improvement, especially given the cost.

I cannot make a blanket statement, that this kind of index will always have this kind of profile – very tiny write cost for a very large read savings – but I am glad I did this analysis and would love to do it more in the future, to see what the real costs and savings are.

Why does the MySQL optimizer not do what I think it should?

In May, I presented two talks – one called “Are you getting the best out of your indexes?” and “Optimizing Queries Using EXPLAIN”. I now have slides and video for both of them.

The first talk about indexing should probably be titled “Why is MySQL doing this?!!?!!?” It gives insight into why the MySQL optimizer chooses indexes that you do not expect; especially when it does not use an index you expect it to.

The talk has something for everyone – for beginners it explains B-trees and how they work, and for the more seasoned DBA it explains concepts like average value group size, and how the optimizer uses those concepts applied to metadata to make decisions.

Slides are at http://technocation.org/files/doc/2017_05_MySQLindexes.pdf.
Click the slide image below to go to the video at https://www.youtube.com/watch?v=e39-UfxQCCsSlide from MySQL indexing talk

The EXPLAIN talk goes through everything in EXPLAIN – both the regular and JSON formats – and describes what the fields mean, and how you can use them to figure out how to best optimize your query. There are examples that show where you can find red flags, so that when you EXPLAIN your own queries, you can be better prepared for gotchas. The EXPLAIN talk references the indexing talk in a few places (both talks were given to the same audience, about a week apart), so I highly recommend you watch that one first.

Slides are at http://technocation.org/files/doc/2017_05_EXPLAIN.pdf.
Click the slide image below to go to the video at https://www.youtube.com/watch?v=OlclCoWXplgSlide image from the EXPLAIN talk

Indexing Talk Online

I am doing a quick blog post to announce that I have put an indexing talk online*. Most recently, I delivered this indexing talk at Confoo and Scale 11x.

The talk is on YouTube at Are You Getting the Best Out of Your MySQL Indexes? There are also PDF slides.
From the official conference description, if you want to know more:
MySQL indexes are often used to make performance better. However, they can make performance suffer if you are not using them properly. Oracle ACE Director Sheeri Cabral explains the pitfalls to avoid with indexes and how to utilize compound indexes to maximize index availability with the least amount of write overhead.

*I know I have not been posting blogs for a long time. This was a very busy year, and I took March through July off from conferences in order to buy a house and move.