Category Archives: Security

Lesson 09: Managing Users and Privileges in MySQL

Notes/errata/updates for Chapter 9:
See the official book errata at http://tahaghoghi.com/LearningMySQL/errata.php – Chapter 9 includes pages 297 – 350.

In the fourth paragraph of this chapter, starting with “Most applications don’t need superuser privileges for day-to-day activities” they give you some reasons why you want to create users without the SUPER privilege. There are better reasons than the book gives, which are at the MySQL Manual page for the SUPER privilege.

In the section “Creating and Using New Users” (p. 300) they say “There’s no limit on password length, but we recommend using eight or fewer characters because this avoids problems with system libraries on some platforms.” You should ignore this, this book was written in 2006 and modern system libraries can handle more than 8 characters in a password. Also ignore it when they say the same thing in the section “Understanding and Changing Passwords” (p. 324).

In the section “Creating a New Remote User” at the very end (p. 214), it talks about using % as a host wildcard character. I want to point out that if there are no ACL’s set for a given host, MySQL will reject ALL connections from that host – even “telnet host 3306” will fail. So if you avoid using %, you are slightly more secure.

In the “Anonymous Users” section (p. 315), one fact that is not mentioned is that for all users, including the anonymous user, any database named “test” or that starts with “test_” can be accessed and manipulated. So an anonymous user can create tables in the “test” database (or even “test_application”) and fill it full of data, causing a denial of service when the disk eventually fills up. This fact is mentioned later in the chapter in the “Default User Configuration” section under “Linux and Mac OS X”, but it should be known earlier.

The “mysqlaccess” utility described in the section of that name (p. 320) is usually not used. These days, folks prefer the pt-show-grants tool. Here is a blog post with some examples of pt-show-grants.

In the section on “Removing Users” (p. 324), it says that if all the privileges are revoked, and a user only has GRANT USAGE, “This means the user can still connect, but has no privileges when she does.” This is untrue, as mentioned before, everyone can access and manipulate databases starting with “test”.

The section “Managing Privileges with SQL” is deprecated (p. 339-346, up to and including “Activating Privileges”). It used to be, back when this was written, that few people used the GRANT statements and more people directly manipulated the tables. These days, it’s the other way around, and due to problems like SQL injection, there are safeguards in place – for example, if you change the host of a user with an ALTER TABLE on the mysql.user table, the user will have all privileges dropped. Just about the only thing direct querying is used for, is to find who has the Super_priv variable set to ‘Y’ in the user table.

Supplemental material: I have a video presentation on security which includes ACLs and there are accompanying PDF slides.

Topics covered:
Creating and dropping local and remote users
Different MySQL privileges
SUPER privilege
GRANT and REVOKE syntax
Hosts and wildcards
Anonymous and default users
Checking privileges
Password management
Basic user security
Resource limit controls

Reference/Quick Links for MySQL Marinate

MySQL Marinate – So you want to learn MySQL! – START HERE

Want to learn or refresh yourself on MySQL? MySQL Marinate is the FREE virtual self-study group is for you!

MySQL Marinate quick links if you know what it is all about.

This is for beginners – If you have no experience with MySQL, or if you are a developer that wants to learn how to administer MySQL, or an administrator that wants to learn how to query MySQL, this course is what you want. If you are not a beginner, you will likely still learn some nuances, and it will be easy and fast to do. If you have absolutely zero experience with MySQL, this is perfect for you. The first few chapters walk you through getting and installing MySQL, so all you need is a computer and the book.

The format of a virtual self-study group is as follows:
Each participant acquires the same textbook (Learning MySQL, the “butterfly O’Reilly book”, published 2007). You can acquire the textbook however you want (e.g. from the libary or from a friend, hard copy or online). Yes, the book is old, but SQL dates back to at least the 1970’s and the basics haven’t changed! There are notes and errata for each chapter so you will have updated information. The book looks like this:

O'Reilly Butterfly book picture

O’Reilly Butterfly book picture

Each participant commits to reading each chapter (we suggest one chapter per week as a good deadline), complete the exercises and post a link to the completed work.

Each participant obtains assistance by posting questions to the comments on a particular chapter.

Note: There is no classroom instruction.

How do I get started?

– Watch sheeri.com each week for the chapters to be posted.

– Get Learning MySQL
Acquire a book (the only item that may cost money). Simply acquire Learning MySQL – see if your local library has it, if someone is selling their copy, or buy it new.

– Start!
When your book arrives, start your virtual learning by reading one chapter per week. Complete the exercises; if you have any questions, comments or want to learn more in-depth, that’s what the comments for!

FAQs:
Q: Does this cover the Percona patch set or MariaDB forks?

A: This covers the basics of MySQL, which are applicable to Percona’s patched MySQL or MariaDB builds, as well as newer versions of MySQL.

Q: What do I need in order to complete the course?

A: All you need is the book and access to a computer, preferably one that you have control over. Windows, Mac OS X or Unix/Linux will work. A Chromebook or tablet is not recommended for this course.

Q: Where can I put completed assignments?

A: Completed assignments get uploaded to github. See How to Submit Homework

Q: The book was published in 2007. Isn’t that a bit old?

A: Yes! The basics are still accurate, and we will let you know what in the book is outdated. I have contacted O’Reilly, offering to produce a new edition, and they are not interested in updating the book. We will also have optional supplemental material (blog posts, videos, slides) for those who want to learn more right away. We are confident that this self-study course will make you ready to dive into other, more advanced material.

Soak it in!

Reference/Quick Links for MySQL Marinate

LDAP with auth_pam and PHP to authenticate against MySQL

In the quest to secure MySQL as well as ease the number of complicated passwords to remember, many organizations are looking into external authentication, especially using LDAP. For free and open source, Percona’s PAM authentication plugin is the standard option.

tl;dr is I go through how to compile php-cli for use with auth_pam plugin.

Background


There are two plugins that can be used. From the documentation, the two plugins are:

  • Full PAM plugin called auth_pam. This plugin uses dialog.so. It fully supports the PAM protocol with arbitrary communication between client and server.
  • Oracle-compatible PAM called auth_pam_compat. This plugin uses mysql_clear_password which is a part of Oracle MySQL client. It also has some limitations, such as, it supports only one password input. You must use -p option in order to pass the password to auth_pam_compat.

Percona’s MySQL client supports both plugins natively. That is, you can use auth_pam or auth_pam_compat and use the “mysql” tool (or “mysqldump”, or mysql_upgrade, etc.) and you are good to go. Given the choice, we would all use auth_pam, under which clients DO NOT use mysql_clear_password.

Not all clients support auth_pam, which is the main problem. Workarounds have called for using auth_pam_compat over SSL, which is a perfectly reasonable way to handle the risk of cleartext passwords – encrypt the connection.

However, what if you want to use auth_pam?

The problem with auth_pam

Back in 2013, Percona posted about how to install and configure auth_pam and auth_pam_compat. I will not rehash that setup, except to say that most organizations no longer use /etc/shadow, so the setup involves getting the correct /etc/pam.d/mysqld in place on the server.

That article has this gem:

As of now, only Percona Server’s mysql client and an older version of HeidiSQL(version 7), a GUI MySQL client for Windows, are able to authenticate over PAM via the auth_pam plugin by default.

So, if you try to connect to MySQL using Perl, PHP, Ruby, Python and the like, you will receive this error: “Client does not support authentication protocol requested by server; consider upgrading MySQL client.”

Fast forward 4 years, to now, and this is still an issue. Happily, the article goes on to explain how to recompile clients to get them to work:

The good news is that if the client uses libmysqlclient library to connect via MySQL, you can recompile the client’s source code to use the libmysqlclient library of Percona Server to make it compatible. This involves installing Percona Server development library, compiler tools, and development libraries followed by compiling and installing the client’s source code.

And, it helpfully goes step by step on how to recompile perl-DBD-mysql to get it working with LDAP authentication (as well as without – it still works for users who do not use LDAP).

But what if you are using PHP to connect to MySQL?

PHP and auth_pam


If you try to connect, you get this error:
SQLSTATE[HY000] [2054] The server requested authentication method unknown to the client

So let us try to mirror the perl recompilation process in PHP.

Step 1

“Install Percona yum repository and Percona Server development library.” This is not a problem, do what you need to do to install Percona-Server-devel for your version.

Step 2

Install a package manager so you can build a package – optional, but useful, if you ever want to have this new client without having to recompile. As in the example, I chose the RPM package manager, so I installed rpm-build.

Step 3

Download and install the source RPM for the client package. This is where I started running into trouble. What I did not realize was that PHP does not divide out its packages like Perl does. Well, it does, but php-mysqlnd is compiled as part of the core, even though it is a separate package.

Downloading the main PHP package


So I downloaded the source RPM for PHP at https://rpms.remirepo.net/SRPMS/, and installed it into the sources directory:
cd SRPMS
wget https://rpms.remirepo.net/SRPMS/php-7.0.22-2.remi.src.rpm
cd ../SOURCES
rpm -Uvh ../SRPMS/php-7.0.22-2.remi.src.rpm

This unpacks a main file, php-7.0.22.tar.xz, plus a bunch of supplemental files (like patches, etc).

What it does NOT contain is a spec file, which is critical for building the packages.

Getting a spec file


I searched around and found one at https://github.com/iuscommunity-pkg/php70u/blob/master/SPECS/php70u.spec – this is for 7.0.21, so beware of using different versions of spec files and source code. Once that was done, I changed the mysql lines to /usr/bin/mysql_config as per Choosing a MySQL library. Note that I went with the “not recommended” library, but in this case, we WANT to compile with libmysqlclient.

Compiling php-cli, not php-mysqlnd


In addition, I discovered that compiling php-mysqlnd with the new libraries did not work. Perhaps it was something I did wrong, as at that point I was still compiling the whole PHP package and every module in it.

However, what I *did* discover is that if I recompiled the php-cli package with libmysqlclient, I was able to get a connection via PHP using LDAP authentication, via a tool written by someone else – with no changes to the tool.

Final spec file


So here is the spec file I eventually came up with. I welcome any optimizations to be made!

Step 4

“Install compilers and dependencies”.
On my host I had to do a bunch of installations to get the requirements installed (your mileage may vary), including the Percona Server package for the /usr/lib64/mysql/plugin/dialog.so file:
yum install Percona-Server-server-55-5.5.55-rel38.8.el6.x86_64 libtool systemtap-sdt-devel unixODBC-devel

Step 5

“Build the RPM file”. Such an easy step, but it took about a week of back and forth with building the RPM file (which configures, tests and packages up everything), so I went between this step and updating the spec file a lot.

cd rpmbuild/SPECS/
rpmbuild -bb rpmbuild/SPECS/php-cli.spec

Then I installed my PHP file and tested it, and it worked!
# rpm -e php-cli –nodeps
# rpm -Uvh /root/rpmbuild/RPMS/x86_64/php70u-cli-7.0.22-2.ius.el6.x86_64.rpm –nodeps
Preparing… ########################################### [100%]
1:php70u-cli ########################################### [100%]

I hope you have similar success, and if you have updates to the spec files and lists of packages to install, please let me know!

Testing Advanced Log Flushing for Percona Audit

We use Percona’s <A HREF=”https://www.percona.com/doc/percona-server/5.5/management/audit_log_plugin.html”>audit log plugin</A> to keep a record of all our logins. Recently we did one of those tasks that everyone knows they should do, but few ever do: change the application user’s password.

When we change the application password, we add a new application user with the proper permissions and a new password, update the information in our repository and wait a while. Using this procedure, any failures mean the old user is used, and more importantly – failure does not impact the end user.

We check the audit logs to see if there were failures – if the user is still being used – when it is no longer in use, we can drop the user.

For reference, here are our settings:

[mysqlaudit]# grep audit /etc/my.cnf
# Percona audit plugin options
audit_log_format = JSON
audit_log_rotate_on_size = 1073741824 . #1G
audit_log_rotations = 10
audit_log_file = /var/log/mysqlaudit/audit.log
audit_log_buffer_size = 4194304
audit_log_strategy = ASYNCHRONOUS
audit_log_policy = LOGINS

This means that we automatically flush logs >1G, keeping 10 audit logs. The other option is to do manual flushing, but we do not want our log files to get very large, and we don’t need to keep audit logs for a very long time.

The ideal behavio – we update our app to the new user, and then flush the logs. That way we could start a new audit log and only have to search the current audit log for the old user. I’m sure people are thinking, “well, it’s set to rotate by size, not manually, so you just cannot do it.” However, binary logs are set the same way and FLUSH LOGS do indeed rotate logs manually, even when auto rotate by size is set.

The tl;dr is that there is currently no way to do this without restarting MySQL. The audit_log_rotate_on_size variable is not dynamic, so we could not set it to manual without restarting MySQL. Here are some other tests we did to see if we could force an audit log flush while using the auto rotate:

We tried moving the audit logs and flushing:

[ ~]# cd /var/log/mysqlaudit/
[mysqlaudit]# ls -l
total 1459572
-rw-rw—- 1 mysql mysql 420839439 Mar 31 11:04 audit.log
-rw-rw—- 1 mysql mysql 1073749720 Mar 25 08:50 audit.log.01
[mysqlaudit]# mv audit.log.01 audit.log.02
[mysqlaudit]# mv audit.log audit.log.01
[mysqlaudit]# ls
audit.log.01 audit.log.02

Don’t worry, this doesn’t affect writing the file – the inode is still in MySQL and it still writes to the file, now called audit.log.01:
[mysqlaudit]# ls -l
total 1459652
-rw-rw—- 1 mysql mysql 420925253 Mar 31 11:07 audit.log.01
-rw-rw—- 1 mysql mysql 1073749720 Mar 25 08:50 audit.log.02
[mysqlaudit]# ls -l
total 1459652
-rw-rw—- 1 mysql mysql 420925253 Mar 31 11:07 audit.log.01
-rw-rw—- 1 mysql mysql 1073749720 Mar 25 08:50 audit.log.02
[mysqlaudit]# mysql -u sheeri.cabral -p -e “FLUSH LOGS;”
Enter password:
[mysqlaudit]# ls -l
total 1459688
-rw-rw—- 1 mysql mysql 420958983 Mar 31 11:07 audit.log.01
-rw-rw—- 1 mysql mysql 1073749720 Mar 25 08:50 audit.log.02

Note that this also proves that “FLUSH LOGS” does not close and open the audit log.

Can we force it? Let’s try by setting the audit_log_policy to NONE and then to LOGINS (what we have it as by default):

[mysqlaudit]# mysql -u sheeri.cabral -p -e “set global audit_log_policy=NONE; set global audit_log_policy=LOGINS”
Enter password:
[mysqlaudit]# ls -l
total 1459768
-rw-rw—- 1 mysql mysql 421043317 Mar 31 11:10 audit.log.01
-rw-rw—- 1 mysql mysql 1073749720 Mar 25 08:50 audit.log.02

Here’s another failed test – let’s see if we can disable then enable the plugin:
[mysqlaudit]# mysql -u sheeri.cabral -p -e “UNINSTALL PLUGIN audit_log”
Enter password:
[mysqlaudit]# ls -rlth
total 1.4G
-rw-rw—- 1 mysql mysql 1.1G Mar 25 08:50 audit.log.01
-rw-rw—- 1 mysql mysql 403M Mar 31 11:44 audit.log
[mysqlaudit]# mysql -u sheeri.cabral -p -e “INSTALL PLUGIN audit_log SONAME ‘audit_log.so’; ”
Enter password:
ERROR 1125 (HY000) at line 1: Function ‘audit_log’ already exists
[mysqlaudit]# mysql -u sheeri.cabral -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1375345
Server version: 5.5.51-38.1-log Percona Server (GPL), Release 38.1, Revision b4a63b4

Copyright (c) 2009-2016 Percona LLC and/or its affiliates
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type ‘help;’ or ‘\h’ for help. Type ‘\c’ to clear the current input statement.

[(none)]> \P grep -i audit
PAGER set to ‘grep -i audit’
[(none)]> SHOW PLUGINS;
| audit_log | DELETED | AUDIT | audit_log.so | GPL |
41 rows in set (0.00 sec)

[(none)]> INSTALL PLUGIN audit_log SONAME ‘audit_log.so’;
ERROR 1125 (HY000): Function ‘audit_log’ already exists

So, I ended up needing to restart MySQL if I wanted to re-enable the plugin.

Sometimes the failures are the most illuminating!

[I realize that the plugin probably could not hack FLUSH LOGS but it would be great to get FLUSH AUDIT or something similar…]

Generating a MySQL Password

One of the services our database engineers provide is adding users to MySQL. We have some nice Chef recipes, so all I have to do is update a few files, including adding in the MySQL password hash.

Now, when I added myself, I just logged into MySQL and generated a password hash. But when my SRE (systems reliability engineer) colleague needed to generate a password, he did not have a MySQL system he could login to.

The good news is it’s easy to generate a MySQL password hash. The MySQL password hash is simply a SHA1 hash of a SHA1 hash, with * at the beginning. Which means you do not need a MySQL database to create a MySQL password hash – all you need is a programming language that has a SHA1 function (well, and a concatenate function).

And I found it, of course, on this post at StackExchange (http://unix.stackexchange.com/a/234592/217471). So you don’t have to click through, here is what it says – and I have tested all these methods and I get the same password hash. I have changed their example of “right” to “PASSWORD HERE” so it’s more readable and obvious where the password goes, in case you copy and paste from here.

Some one-liners:

**MySQL** (may require you add -u(user) -p):

mysql -NBe "select password('PASSWORD HERE')"

**Python**:

python -c 'from hashlib import sha1; print "*" + sha1(sha1("PASSWORD HERE").digest()).hexdigest().upper()'

**Perl**:

perl -MDigest::SHA1=sha1_hex -MDigest::SHA1=sha1 -le ‘print “*”. uc sha1_hex(sha1(“PASSWORD HERE”))’

**PHP**:

php -r 'echo "*" . strtoupper(sha1(sha1("PASSWORD HERE", TRUE))). "\n";'

Hopefully these help you – they enabled my colleagues to easily generate what’s needed without having to find (or create) a MySQL instance that they can already login to.

April 2007 Boston MySQL User Group Video

Using MySQL As Active DBMS for Monitoring Applications — Jacob Nikom.

Jacob presented this as a special preview at the April 2007 Boston MySQL User Group, and then presented it at the 2007 MySQL Users Conference and Expo.

The last in the “better late than never” series….

Download from http://www.technocation.org/videos/2007_04BostonUserGroup.wmv
or view right here:

MySQL Security Presentation at Boston MySQL User Group Meeting

The February Boston MySQL User Group meeting was great! I spoke about MySQL security; you can now download the slides and the video. I continue to be impressed with the sound quality of the video camera I have, I was pretty good about repeating the question folks asked, but you can clearly hear it in the audio (well, I could when I was wearing headphones, but I also have pretty bad hearing).

Special thanks to http://technocation.org for hosting the bandwidth for the videos.

Topics covered in the talk:
ACLs
Test dbs & anonymous accounts
OS files and permissions
Application data flow
SQL Injection
XSS (Cross-site scripting)

PDF of slides (1.4M):
http://www.sheeri.com/presentations/MySQLSecurity2007_02_08.pdf

Slides in Flash (107K):
http://www.sheeri.com/presentations/MySQLSecurity2007_02_08.swf

Video of presentation (large, 289M)
http://technocation.org/videos/original/mysqlsecurity2007_02_08large.wmv

Video of presentation (small, 27M)
http://technocation.org/videos/original/mysqlsecurity2007_02_08small.wmv

User Group Video Up, and Video Camera Review

Download the video at:

http://technocation.org/videos/original/BostonMySQLJanUserGroupBrianAkerLg.wmv
– 520 kbps, 320 x 240, 354M. Small size, low quality, but you can still see the slides and hear everything.

http://technocation.org/videos/original/BostonMySQLJanUserGroupBrianAkerSm.wmv – 45 kbps, 320 x 120, 29M. Small size, low quality, but you can still see the slides and hear everything.

Technocation, Inc. received a donation of a Sony Handycam DCR SR80 (http://tinyurl.com/yvyfam ), extra-long battery, microphone (proprietary Sony that goes with the camera).

In a short sentence: I am impressed. The sound quality (on the large version) is almost exactly what I heard. Granted, I have some hearing loss, but I forgot to bring the microphone, and you can still hear audience questions very well. The video quality is great, too. The hard disk is perfect, because files can be copied over or burned directly to DVD. It records in MPEG-4 format.

The 1 hour 38 minute talk took up less than 6 gigs of space raw (I forget how much exactly, but it cuts the files into 2G chunks, and there were 3). This gives at least 10 hours of recording time before needing to dump to disk. This is a very exciting prospect for the MySQL Conference and Expo at the end of April, I’ll be able to video a LOT.

Special thanks go to the User Group member (who may wish to remain anonymous, but I forgot his name anyway, so if he wishes to comment he can, or just e-mail me so I remember your name!) who talked to me about codecs and which programs to use, because they worked!

I was not quite ready for the start of the User Group, and had to run out to my car to get the tripod, so the first minute or so (until 1:25) is me setting up the tripod — I apologize for the movement.

You can see the “Night Shot” functionality early on, when I focus on Brian and turn it on. It does a great job, but loses a lot of color (1:53 until 1:59).

I was disappointed that when you connect the DC power supply, the video stops (so there’s a few hops int here).

Brian takes some slides, starts talking, and questions ensue. The basic slides were about MySQL’s internal architecture.

Some links:
MySQL and dual-master/circular replication
There’s a great article by Guiseppe Maxia at: http://www.onlamp.com/pub/a/onlamp/2006/04/20/advanced-mysql-replication.html

And a free chapter on Replication from Jeremy Zawodny’s
http://www.oreilly.com/catalog/hpmysql/chapter/ch07.pdf

Around 27:00 there is a reference to Jim Gray’s “Black Book”, which is entitled “Transaction Processing: Concepts and Techniques” and can be found here: http://tinyurl.com/2md3tb

http://forge.mysql.com/wiki/Top10SQLPerformanceTips

The Trend of Managed Schemas: A Database is Not a Messaging System

This thread on the Boston MySQL User Group Board is getting interesting:
http://mysql.meetup.com/137/boards/view/viewthread?thread=2280640

(From the original poster:)

I think that nonequivalence comes from the past when the data sharing was a
rare situation. The data changes were always initiated by application and it
always knew about those changes. Now the situation is different. When the data
are shared between multiple remote applications you have to notify other
interested parties about those changes.

Currently databases are mostly used as “pull” components. If they had standard
“push” functionality they could compete with messaging systems with the advantages
of automatic data persistence and powerful query language.

(my response:)
Well, and that’s the problem — the paradigm *has* changed. MySQL is fast and reliable because it does NOT put things like messaging into their database, which Oracle and SQL Server do. A database is not a messaging system, it’s a database.

What effect would notification that there have been changes have on MVCC? I do wish there was a “pull” way to check if the data has changed.

The paradigm change of the application managing the schema causes this. I do not believe messaging is the correct way to handle this problem.

Consider the parallel to source code version control. Much like MVCC, you check out code, change it, and commit the code. Unlike many source code version control systems, though, MVCC (“data version control”) does not have the equivalent of an “update” command, except for doing another pull from the database. It would be great if there was an easy way to do a “diff” of what’s in the database versus what the application is changing, but that seems like it would be a programmatic thing (function or method), not a database thing.

And consider the database overhead and bandwidth….instead of just running queries, MySQL would have to somehow keep track of which thread has what data, and then notify every single thread that has that data, that it’s changed. The applications will have to be written to keep threads open longer, which will consume lots of resources. That’s lots more overhead for the database, and much more bandwidth, because there may be instances of the application that are using data that they do not care if it changed….so the messaging system would be wasting bandwidth, sending messages to instances that do not care. Although that could be mitigated by the application keeping a thread open when it cares about whether or not the data has changed.

Then again, I’m not fond of managed schema in the application…or at least, when the developers write that code. Seems to me it should be the DBA writing that code. It’s *very* useful for data consistency and integrity, which is a function of the DBA, not a developer.

What effects do you see the managed schema having on databases? Who should be responsible for writing a managed schema? Should a managed schema be used for database consistency within an application? Where is the line drawn between the application putting the required information into the database, and the database’s job of maintaining consistency and integrity?

It’s somewhat ironic, since for a long time MySQL advocated using the application to ensure the consistency and integrity (ie, before MySQL had a storage engine with foreign keys and transactions).

I often say that the biggest reason MySQL is a widely used database is because it is fast. A fast database can be complemented by an application that adds the features the database is missing; but a slow database that is full-featured cannot be made faster by an application. So it worries me when folks request very specialized systems such as a messaging server (or ANY “push” system) into the database, because that could be easily done with a “pull” mechanism, only using the bandwidth needed by the instances of the applications that care. Otherwise, it will end up adding Microsoft-level bloat to a really nice and fast program.

October Boston MySQL User Group Topic: Boolean Values and Bit Operators

Boston October MySQL User Group: see full event listings at:

http://mysql.meetup.com/137/calendar/5118339/

Tuesday, Oct. 10th at MIT, free pizza and soda (thanks to MySQL, AB and the MIT community). Please RSVP!!

To RSVP anonymously, please login to the Meetup site with the e-mail address “admin at sheeri dot com” and the password “guest”.

Plenty of free parking (you can park in MIT lots after 3 pm); 1 block south of the Kendall Square T stop.

————–

Most of the September Boston User Group was spent discussing an interesting problem with a large amount of data (5 million records). Basically, this data had about 40 boolean (or small set) fields that needed to be able to be searched against. Folks suggested:

1) Just leaving the table as is and using 1-character values
Pro: simple
Con: Indexes are bad for columns with low selectivity, searching will take a long time due to full table scans

2) Creating a “joining” table for each boolean value
Pro: Indexing for each boolean value can be used
Con: Complex — lots of tables, lots of joins for search

3) Using BIT(1) values or BIT(2) values and matching up booleans
Pro: Simple
Con: Difficult to write the search query, keeping in mind the search terms given below.

The biggest issue is the accuracy of indexes vs. size/amount of tables and joins. The person with the original problem (Chris) and I are doing a joint presentation, with real data on those three cases to figure out which is the best for his situation.

What have other folks done for boolean values? Please be specific about the amount of data, and the performance. Remember that this situation involves a lot of data and a lot of boolean fields, and searching across any or all boolean/small set fields is a core function. As well, fields may be null, and searching may include:

For boolean:
search for 0
search for 1
search for 0 or 1 (any value set)
search for NULL (any value not set)
search for 0 or NULL
search for 1 or NULL

For small sets:
search for ‘a’ (single value match)
search for ‘a’,’b’, and ‘c’ (multiple values will match)
search for ‘any value not null’ (anything not null)
search for ‘any value including null’ (anything null)

Any ideas? I will do some quick research if there’s another option that the September User Group did not come up with.