Lesson 06: Working with Database Structures

Notes/errata/updates for Chapter 6:
See the official book errata at http://tahaghoghi.com/LearningMySQL/errata.php – Chapter 6 includes pages 179 – 222.

Other notes:
At the end of the “Creating Tables” section (p.183-4), it says “We like using the underscore character to separate words, but that’s just a matter of style and taste; you could use underscores or dashes, or omit the word-separating formatting altogether.” While this is true, beware of using a dash, because MySQL will try to interpret “two-words”, thinking – is a minus sign. I recommend avoiding dashes for this reason (even though the book does this on page 215).

At the end of the “Collation and Character Sets” section (p.186), it says “When you’re creating a database, you can set the default character set and sort order for the database and its tables.” Note that the default character set for the server will set the default character set for any new databases created if a default character set is not specified; there is no change in existing databases. In turn, the default character set for the database sets the default character set for any new tables created but does not change any existing tables, and the default character set for a table determines the default character set for each column, which can be overridden by specifying a character set when defining a column.

Under the “Other Features” section it references a section called “Table Types”. This section is in chapter 7, p. 267.

Under the “Other Features” section it shows the SHOW CREATE TABLE example (p. 187). By default, MySQL shows you output in horizontal format – that is, one table row is shown as one row in the output. However, you can have MySQL show you output in vertical format, where one column is shows as one row in the output. Instead of using ; to end a query, use \G

Try it with:

And see the difference.

In the “Column Types” section on page 194, it says that “Only one TIMESTAMP column per table can be automatically set to the current date and time on insert or update.” This is not true as of MySQL version 5.6.5 and higher. As per the documentation at https://dev.mysql.com/doc/refman/5.6/en/timestamp-initialization.html: “For any TIMESTAMP or DATETIME column in a table, you can assign the current timestamp as the default value, the auto-update value, or both.”

In the section called “The AUTO_INCREMENT Feature”, on page 211, it says “If, however, we delete all the data in the table, the counter is reset to 1.” The example shows the use of TRUNCATE TABLE. Note that if you deleted all the data in the table with DELETE, such as “DELETE FROM count WHERE 1=1;”, the counter is NOT reset.

Supplemental material:
Data types:
Podcast on Strings
Podcast on Numeric data types
Podcast on ENUM, SET and different SQL modes
Podcast on Times and time zones

Topics covered:
How to CREATE, DROP and ALTER databases, tables, columns and indexes
Collations and character sets
Data types

Reference/Quick Links for MySQL Marinate

Lesson 05: Basic SQL (in MySQL)

Notes/errata/updates for Chapter 5:
See the official book errata at http://tahaghoghi.com/LearningMySQL/errata.php – Chapter 5 is pages 135-177.

If you did not load the sample databases in Chapter 3 as per page 102, you can download the file to import from http://tahaghoghi.com/LearningMySQL/Downloads/Files/Data/SQL_files_with_foreign_key_references/music.sql

On page 162, it says “In Chapter 6, we’ll discuss how to load data from existing tables or from external data sources.” but this content is in Chapter 8.

On page 169, it has an example of “DELETE FROM played;” which deletes all the rows of a table. This will not work if safe_updates are enabled. I always advise that if you truly want to do that, ALWAYS use a WHERE clause, for example: DELETE FROM played WHERE 1=1;. In this way, your query is self-documenting – it tells the reader that you intended to delete all rows. If you just have “DELETE FROM played” it is possible you meant to put a WHERE clause in but forgot.

Similarly, on page 171, it shows an example where an UPDATE with no WHERE clause is used to update all rows. I prefer the syntax UPDATE tbl SET column=value WHERE 1=1 – this broadcasts your intent.

Topics covered:
How to get table metadata with SHOW
Basic SELECT statements
– Choosing columns
– Filtering with WHERE clauses
Joining tables with INNER JOIN

Reference/Quick Links for MySQL Marinate

How to Analyze Actions for Certain MySQL Tables

subtitle: How to use awk to filter binlogs for specific tables

The task: Find out how many inserts were done to a set of tables over a period of time. Specifically avg inserts/min, avg inserts/hour, avg inserts/day, for the past week. Seems simple, right?

I was surprised that there was no good way to get what was done to a specific table, for data gathering AND forensic purposes!

These tables are be deleted from, so just using rows and timestamps is not valid. We have binary logs, which log when an insert was done. My first thought was to use pt-query-digest, since the tool has a “Tables” section in the output, so it already parses out which tables are in a query.

Sadly, tables is not an attribute I can --filter on, at least not in my version of pt-query-digest. (insert sad trombone here)

My second thought was to see if mysqlbinlog could filter for records only in that table. No such luck.

So, I decided to use awk. With awk I could define a “record” as “separated by the string ‘# at'”, match for table names, and print out the separator and record.

In other words, I could parse binlogs for a specific match, and print the entire record. Because each record is a variable size (maybe the record is 5 lines, maybe it is 7, maybe it is even more!)

To get MySQL binlog events on a set of tables on a unix server:
mysqlbinlog $FILE | awk -vRS='# at' '/TABLE1|TABLE2/ { print RS $0 }' | grep -v ^$ > filtered_binlog.txt

Explanation (making no assumptions about unix knowledge):

  • mysqlbinlog $FILE – parses the mysql-bin.xxxxxx file
  • | awk – sends the parsed file to awk
  • -v – means “set this variable”
  • RS – record separator
  • '# at' – the record separator string, which is the beginning of each record.
  • /TABLE1|TABLE2/ – a regular expression matching either TABLE1 or TABLE2 – you may decide you want your string to be something like /FROM TABLE1|FROM TABLE2|JOIN TABLE1|JOIN TABLE2/ – it depends on your queries.
  • print RS $0 – prints both the record separator (RS) and the record ($0). If you just print the record, the default output record separator (ORS) is the empty string, so you will not get a good file.
  • | grep -v ^$ – take out blank lines, grep -v means “all lines except” and ^$ is start of line (^) followed by end of line ($). If there are blank lines, pt-query-digest will not work on this file.

From here, filtered_binlog.txt can be parsed and analyzed with tools like pt-query-digest or mysqlbinlog.

Edited to add: Commenters point out that this works for STATEMENT based binlogs. If your binlogging format is ROW, you can use mysqlbinlog –verbose (or -v) to get the reconstructed SQL statement, which includes the table name.

Lesson 04: Modeling and Designing Databases

Notes/errata/updates for Chapter 4:
See the official book errata at http://tahaghoghi.com/LearningMySQL/errata.php – Chapter 4 includes pages 109 – 133.

On page 114, it says “For some applications, no combination of attributes can uniquely identify an entity (or it would be too unwieldy to use a large composite key), so we create an artificial attribute that’s defined to be unique and can therefore be used as a key: student numbers, Social Security numbers, driver’s license numbers, and library card numbers are examples of unique attributes created for various applications.” This is known as a “surrogate key”, because the number is a surrogate for the “real” unique way of identification.

On page 131, it says that reverse engineering an ER model from an existing database in MySQL Workbench is in beta testing phase. This function has been stable for a long time, so it’s safe to use.

In the chapter it talks about making sure your data doesn’t repeat itself, and how to design so that there aren’t consistency problems. If you want to learn more about normalization, feel free to listen to the OurSQL Podcast (www.oursql.com), episode 7 about normalization and episode 64, about different normal forms.

As you are reading this chapter, if you are wondering how a model relates to reality, know that for the most part, entities become tables and attributes become fields in a table. Relationships may or may not become tables, though for many-to-many relationships, they usually do.

For the homework, DO NOT USE MySQL Workbench to make Entity/Relationship diagrams. It is not a true E/R diagram and you will not be able to show everything you need to (e.g. the symbols for a weak entity do not exist in MySQL Workbench). Use flowcharting software or just draw it on paper and scan it in or take a picture, and submit the scan/picture as the homework

Topics covered:
Entity-Relationship diagramming

Supplemental material – podcasts on normalization:
Normalization concepts and 1NF: http://technocation.org/content/oursql-episode-7-whats-it-be-normal
2NF, 3NF and when to denormalize: http://technocation.org/content/oursql-episode-64-are-we-normal-yet

Reference/Quick Links for MySQL Marinate

Lesson 03: Using the MySQL Monitor

Notes/errata/updates for Chapter 3:
See the official book errata at http://tahaghoghi.com/LearningMySQL/errata.php – Chapter 3 includes pages 95 – 106.

On p. 96 – 97, there is an example of what you will see with SHOW DATABASES; You will likely also see sys, INFORMATION_SCHEMA and PERFORMANCE_SCHEMA databases.

On p. 97, it says “There are some restrictions on what characters and words you can use in your database, table and other names.” These days, it is possible to escape characters and reserved words, but it is still not recommended.

On p. 101-102, there are examples of how to run the MySQL monitor in “batch mode”. It says you can use either “SOURCE” or the redirection operator on commandline (<). However, it does not explain an important difference - if you redirect from commandline with "mysql", an error will stop the rest of the script from running, unless you run it with mysql -f to force it to keep going. If you run a batch file with SOURCE, it will continue to read the source file even if there is an error. Remember to do a "git pull" to get the most recent homework questions. Topics covered:

  • Using the MySQL monitor, also known as the MySQL command line, including using options
  • Know what it means to end your statement with the following: ; \c \G \q
  • Batch mode
  • Loading the sample database

Reference/Quick Links for MySQL Marinate

Lesson 02: Installing MySQL

Notes/errata/updates for Chapter 2:
See the official book errata at http://tahaghoghi.com/LearningMySQL/errata.php – Chapter 2 includes pages 9-93 (but we skip pages 83-92). It seems like a lot of pages, however you will skip the operating systems that do not apply to you. Do NOT compile or install from tarball or source; just use the packages that are pre-made. You will want the latest version of MySQL, which at the time of this writing is MySQL 5.7.

You should install the latest version of MySQL, which can be downloaded from http://dev.mysql.com/downloads/mysql/ If you want to install something else, you can install MariaDB or Percona’s patched version.

Note that you do NOT need to install Apache, Perl or PHP. You can skip pages 83-92.

On p. 10, it says that “The MySQL Manual says that you can get a performance increase of up to 30 percent if you compile the code with the ideal settings for your environment.” This was true up through MySQL 5.1 (see http://dev.mysql.com/doc/refman/5.1/en/compile-and-link-options.html) However, with MySQL 5.5 and newer, this is no longer true.

On p. 62, it talks about checking to make sure the mysql user and group are on the machine by using NetInfo Manager. The NetInfo Manager was taken out of Mac OS X Leopard (10.7) and above. Just skip the paragraph starting “To check using the NetInfo Manager”, and proceed to the paragraph with “You can instead check these settings from the shell prompt.”

On p. 71, it talks about the MySQL Migration Toolkit and says it’s part of the “MySQL GUI Tools Bundle”. These days, it’s part of MySQL Workbench.

On p. 75 -78, the book talks about setting your path, which probably should be under “configuration” instead of the troubleshooting section…you might think once you get MySQL installed, you can skip that section, but you need to read it anyway.

On p. 93, it lists http://forge.mysql.com as a resource, but that website has been deprecated since the book was published.

Topics covered:
Installing MySQL on Linux, Mac OS X and Windows.

Verifying packages with MD5

Configuring a new server

Reference/Quick Links for MySQL Marinate

MySQL Marinate Quick Links

Here is a quick reference for MySQL Marinate:

How to Submit MySQL Marinate Homework

We will be submitting the homework via GitHub. The reason for this is to have a centralized place for homework, as well as teaching how to use a revision control system and how to interact specifically with GitHub.

Learn about git at http://git-scm.com/book/en/Getting-Started-Git-Basics (You only need to read this one webpage, no need to go to the next page)

To do:
– Create a github account at www.github.com if you do not already have one.
– Install and configure git on your local machine as per https://help.github.com/articles/set-up-git
– Fork the MySQL Marinate repo as per https://help.github.com/articles/fork-a-repo
The URL for the MySQL Marinate repo is https://github.com/Sheeri/mysql-marinate

– To demonstrate that you can submit homework, open and modify the 01Intro/homework.txt file
– Then commit your work locally:
git add 01Intro/homework.txt
git commit

– When you’re done, commit your work up to the github server:
git push origin master

When doing homework, remember to commit locally often. I would recommend committing up to the github server whenever you finish a question or a part of a question. Pretend at any time your local machine may die; how much work are you willing to lose? (5 minutes’ worth? 3 days’ worth?) I do not look at individual commits, only the final pull request when you’re done.

When you’re done with your homework, submit a Pull Request as per https://help.github.com/articles/using-pull-requests

If you have trouble and need help, ask away in the comments!

Reference/Quick Links for MySQL Marinate

Lesson 01: Introduction and Submitting Homework

This is for chapter 1 of Learning MySQL – it’s only pages 3-8, so it is an easy one!

Homework for this week: How to Submit Homework

Notes/errata/updates for Chapter 1:
The book mentions MySQL AB, the company behind MySQL. Since the book was written, MySQL was acquired by Sun, and then Sun was acquired by Oracle. (see http://en.wikipedia.org/wiki/MySQL#History­ for the timeline).

Topics covered:
MySQL History
Why MySQL is popular
Submitting Homework

Reference/Quick Links for MySQL Marinate

MySQL Marinate – So you want to learn MySQL! – START HERE

Want to learn or refresh yourself on MySQL? MySQL Marinate is the FREE virtual self-study group is for you!

MySQL Marinate quick links if you know what it is all about.

This is for beginners – If you have no experience with MySQL, or if you are a developer that wants to learn how to administer MySQL, or an administrator that wants to learn how to query MySQL, this course is what you want. If you are not a beginner, you will likely still learn some nuances, and it will be easy and fast to do. If you have absolutely zero experience with MySQL, this is perfect for you. The first few chapters walk you through getting and installing MySQL, so all you need is a computer and the book.

The format of a virtual self-study group is as follows:
Each participant acquires the same textbook (Learning MySQL, the “butterfly O’Reilly book”, published 2007). You can acquire the textbook however you want (e.g. from the libary or from a friend, hard copy or online). Yes, the book is old, but SQL dates back to at least the 1970’s and the basics haven’t changed! There are notes and errata for each chapter so you will have updated information. The book looks like this:

O'Reilly Butterfly book picture

O’Reilly Butterfly book picture

Each participant commits to reading each chapter (we suggest one chapter per week as a good deadline), complete the exercises and post a link to the completed work.

Each participant obtains assistance by posting questions to the comments on a particular chapter.

Note: There is no classroom instruction.

How do I get started?

– Watch sheeri.com each week for the chapters to be posted.

– Get Learning MySQL
Acquire a book (the only item that may cost money). Simply acquire Learning MySQL – see if your local library has it, if someone is selling their copy, or buy it new.

– Start!
When your book arrives, start your virtual learning by reading one chapter per week. Complete the exercises; if you have any questions, comments or want to learn more in-depth, that’s what the comments for!

Q: Does this cover the Percona patch set or MariaDB forks?

A: This covers the basics of MySQL, which are applicable to Percona’s patched MySQL or MariaDB builds, as well as newer versions of MySQL.

Q: What do I need in order to complete the course?

A: All you need is the book and access to a computer, preferably one that you have control over. Windows, Mac OS X or Unix/Linux will work. A Chromebook or tablet is not recommended for this course.

Q: Where can I put completed assignments?

A: Completed assignments get uploaded to github. See How to Submit Homework

Q: The book was published in 2007. Isn’t that a bit old?

A: Yes! The basics are still accurate, and we will let you know what in the book is outdated. I have contacted O’Reilly, offering to produce a new edition, and they are not interested in updating the book. We will also have optional supplemental material (blog posts, videos, slides) for those who want to learn more right away. We are confident that this self-study course will make you ready to dive into other, more advanced material.

Soak it in!

Reference/Quick Links for MySQL Marinate