Mozilla DB News, Friday May 18th

Last week I did not post one of these, because I was at Professional IT Community Conference (PICC) 2012; in New Brunswick, NJ. This is my third year in a row speaking and attending the conference, and it is always a great time. I did a session on MySQL Security (PDF slides) and another on Getting Started with a Podcast (PDF slides). There will be videos available sometime in the future, and I will link to those when they are up.

Other than that, it has been a fairly hectic two weeks. The final set of machines was moved out of the data center on Tuesday, so there was some work on that done, and there will be residual cleanup to do, but that is mostly behind us now. There was plenty of setting up machines, backing up machines and changing MySQL ACLs going on, with more to come to finally clean up all the cruft that has gone away.

And in bullet-point list format, everything else:

  • We finally got all the proper drivers/firmwares/system updates on the 2 addons databases that were crashing.

  • Unfortunately, now there are 2 other machines crashing (the first 2 were HP machines, these are seamicro machines).
  • Removed many staging databases that were not in use, such as the one for Chocolate Factory.
  • Creating new users on a database that had an application password leak.
  • Lots of documentation updating happening, and some new documents created, such as what VIPs folks need to use for certain databases, instead of connecting directly. This helps make automated failover and MySQL upgrades transparent.
  • Created a database and user ACLs for Tree Status
  • Created a database and ACLs for the Mozilla Ignite Apps Challenge
  • Created a database and ACLs for the Mozilla Popcorn website, which shows folks how to be powerful video editors.
  • Created a database and ACLs for the MozTrap website.
  • Data changes to Graphs
  • That's all, folks!

    MySQL Handshake and Encryption

    Interestingly, I have given the presentation on MySQL and Security at least 4 times in the past 6 weeks* and it was only last night, with the sharp minds at Baron’s Central Virginia MySQL Meetup Group (sadly Baron was not there!), that someone asked about when encryption happens in the MySQL handshake.

    We had been talking about how MySQL authenticates users, and how if there are no ACL’s set for a given host, MySQL will reject connections from that host – even “telnet host 3306″ will be refused – and that’s when a clever audience member asked where in the handshake process encryption started. Is it before the username is sent? Before the password is sent? Does it encrypt all traffic, even the handshake traffic?

    I think that’s an excellent question, and I know there’s a few sharp minds out there who probably know the answer….otherwise I will research the answer this weekend, when I’m back home in Boston.

    * Effective MySQL User Group, as part of a tutorial for Percona Live: MySQL Conference and Expo , at the Professional IT Community Conference last week, and last night at the Central Virginia MySQL Meetup Group

    SQL Injection at Reddit

    Reddit takes SQL injection very seriously.

    How seriously?

    Check their headers:

    scabral-07890:~ scabral$ curl --head www.reddit.com
    HTTP/1.1 200 OK
    Content-Type: text/html; charset=UTF-8
    Set-Cookie: reddit_first=%7B%22organic_pos%22%3A%201%2C%20%22firsttime%22%3A%20%22first%22%7D; Domain=reddit.com; expires=Thu, 31 Dec 2037 23:59:59 GMT; Path=/
    Server: '; DROP TABLE servertypes; --
    Date: Sat, 12 May 2012 13:54:20 GMT
    Connection: keep-alive

    scabral-07890:~ scabral$

    A colleague at PICC showed me this when he learned of my talk on MySQL security!

    Liveblogging: Pulling Strings with Puppet: A Gentle Introduction

    I’m at the Professional IT Community Conference, aka PICC, for the first day.
    I am taking Pulling Strings with Puppet: A Gentle Introduction by Thomas Uphill of the Institute for Advanced Study and Benjamin Rose – Princeton University…here are my notes!
    Slides available at http://goo.gl/Swg4G
    example files
    http://ramblings.narrabills.com/Talks/puppet-picc.tar.bz2
    What is puppet?
    It’s a language – you describe the state, not the steps to get there (declarative!) – paint a picture of your ideal system, and puppet does the rest.
    Written in Ruby
    Luke Kanies in 2005 developed it, company is http://puppetlabs.com – puppet Enterprise has commercial support, Windows support, etc.
    Why do I want puppet?
    It’s human-readable
    Puppet has a rich history of versioning, they went from 0.20 -> 0.25 -> 2.6.0 -> 2.7.0 -> 2.7.14
    Learning Puppet VM – http://docs.puppetlabs.com/learning – can get VMs for VMware, OVF, VirtualBox, KVM
    They’re using the learn vm, puppet master is Jim, piggy and kermit are the client nodes
    Resource Abstraction Layer
    a bunch of types, exec, mount, service (something started from inet.d), file (just a file), package, user, group, host (/etc/hosts).
    Types have providers, the providers take care of figuring out what kind of operating system you have and how to do it.
    The core of the language is classes, variables and types. These apply to nodes – e.g. hosts.
    Types, classes and variables can be put into modules, or plugins.
    Syntax of types
    type { "title":
    attribute => value,
    attribute => value
    }
    type is “exec”, “mount”, etc from above
    basic example:
    file {"testfile":
    mode => 0644,
    owner => root,
    path => "/etc/test",
    ensure => present
    }
    You can run a parser to make sure your syntax works:
    puppet parser validate syntax.pp
    Trifecta of package, file and service types (“core” types).
    file
    attributes: title
    ensure – one of {present, absent, file, directory, link}
    path – full path to the file, default is $title
    source – url to the file, or local path
    content – string contents of the file, or a template
    target – any links
    recurse – if this is a directory, you may want to copy everything in the dir (false by default)
    purge – remove non-puppet file from the directory (true/false, false by default)
    owner
    group
    mode
    content -
    file { "issue":
    path => "/etc/motd",
    content => "Hello World!"
    }
    file { "issue":
    path => "/etc/motd"
    content => file("/etc/hello")
    }
    templates
    file { "issue":
    path => "/etc/motd",
    content => template("/etc/hello.erb")
    }
    cat /etc/hello.erb
    Hello !
    If hostname is “learn”, it will say “Hello learn!”
    puppet apply is the most basic way to apply changes from the manifest
    puppet apply mymanifest.pp
    Puppet can be a fileserver:
    file { "issue":
    path => "/etc/motd"
    source => "puppet:///files/hello"
    }
    package
    attributes:
    title
    ensure – one of {present, latest, [version], absent, purged} latest will go and get the package. You might not want to do that if you’re using RH, b/c if you’re running puppet every half hour, it will run yum every half hour. purged takes any mentions of the package off, “absent” is safer
    name
    source – can provide location of the source
    package { "sar":
    name => 'sysstat',
    ensure => '7.0.2-3.el5'
    }
    package { "sar":
    name => 'sysstat',
    ensure => '7.0.2-3.el5'
    provider => rpm
    source => '/root/sysstat-7.0.2-3.el5.i386.rpm'
    }
    service
    service actually makes sure things are running
    attributes:
    title
    ensure – running, stopped, true, false (true=running, false=stopped)
    enable – true, false (e.g. true is chkconfig foo on)
    hasrestart – true, false (true means does a stop/start atomically)
    hasstatus – true, false (true means it uses the “status” command used for the package, otherwise it does a pgrep of the name of the service (won’t work with, say, bind, whose daemon is named)
    restart – this is the command to restart the service
    status – this is the command to test the status
    ordering – you need the package first
    Apache example:
    service { "httpd":
    enable => true,
    ensure => running,
    hasrestart => true,
    hasstatus => true,
    }
    (comma at end of last stanza is OK)
    file { "host.conf":
    path /etc/httpd/conf.d/$name,
    mode => 0644,
    owner => 'apache',
    group => 'group',
    content => "<VirtualHost *:80>
    ServerName kermit.henson.local
    DocumentRoot /var/www/html/kermit
    </VirtualHost>"
    }
    Defining a new object uses lower case:
    service["sshd"] {ensure => running }
    Uppercase is for referencing existing objects:
    require => Service["sshd"]
    Be careful not to define an object twice (it will throw an error)
    Every object has builtins to allow for logic involving other objects:
    require – you can say the httpd service requires the package
    before – you can say the package comes “before” the service
    notify
    subscribe – the service is subscribed to the config file so if the config file
    tag – attach a human word to your object
    Example:
    package{'httpd':
    ensure => present
    }
    file { "host.conf":
    path /etc/httpd/conf.d/$name,
    mode => 0644,
    owner => 'apache',
    group => 'group',
    content => "<VirtualHost *:80>
    ServerName kermit.henson.local
    DocumentRoot /var/www/html/kermit
    </VirtualHost>"

    require => Package['httpd']
    }

    puppet resource is a great way to translate your existing site into a manifest – a manifest is a collection of files that apply to the client.
    puppet resource --types
    prints out a list of types.
    If you want to find out what a puppet manifest would look like on your current system, use puppet resource. Here’s an example of finding a puppet manifest for your swap mount:
    puppet resource mount swap
    mount { 'swap':
    ensure => 'unmounted',
    device => '/dev/vg00/swapvol',
    dump => '0',
    fstype => 'swap',
    options => 'defaults',
    pass => '0',
    target => '/etc/fstab',
    }

    Facter/Facts
    Facter runs Ruby code to generate a value for facts (61 facts available), here are sample facts:
    $processor0
    $ipaddress
    $architecture
    $fqdn
    $is_virtual
    $memorysize
    $selinux
    $timezone
    You can extend this by adding additional facts. w00t!
    Variables
    can use as parameters
    can use in conditionals
    facts (e.g. $::processor0)
    user defined ($myvar = “My Value”)
    $::variable is a top-scope variable. More on scope later.
    Conditionals
    if/elsif/else
    case
    selector
    booleans:
    ($x==$y)
    ($x!=$y)
    ($x > $y)
    ($x < $y)
    ($x < $y) and !($x < $y)
    can do arithmetic
    $x+$y
    $y >> 1 (why would you do this? but you can!)
    case $myvar {
    "My Value": {$content = "All is good" }
    "": { $content = "bad var" }
    default: { $content = $myvar }
    }
    case $::hostname {
    /^ldap/: {include ldapserver}
    /^www/: {include webserver}
    default: {include base}
    }
    match =~ not match !~
    if $::hostname =~ /^ldap/ { include ldapserver }
    if $::hostname != /test/ { include production }
    capture -
    /^ldap(\d+)/ =~ {include "ldap$1"}
    Ternary operator (C) (actually it’s more than ternary)
    x == y ? "they are equal" : "they are different"
    file {'httpd.conf':
      path => $operatingsystem ? {
        'Ubuntu' => '/etc/apache2/$name',
        'RedHat' => '/etc/httpd/conf/$name',
        default => '/usr/local/etc/httpd/conf/$name'
      }
      content => "DocumentRoot /var/www/html"
      }
    in – Arrays
    if $::hostname in ['www','web'] {
      service {'httpd':
        ensure => true
        }
      }
    if $::kernelversion in ['2.6.35-22','2.6.38.6-26.rc1.fcl5'] {
    service{'sshd':
    ensure=>false
    }
    }
    BNF syntax of puppet grammar
    Templates
    ERB syntax
    <% Ruby code %>
    <%= Ruby expression %>
    <%# comment %>
    -%> – this puts no newline, so for example this will be all one line:
    foo bar <% stuff here -%> baz bap
    <%% escapes, so replace with <%
    %%> escapes, so replace with %>
    <%= @fact %>
    example:
    client.conf.erb
    <%= if @ipaddress_eth0 != "NONE" %>
    ServerName <%= @printserver %>
    <% end %>
    iteration in templates
    |domain| is variable expansion – replace with whatever the value of domain is
    template example:
    /etc/puppet/manifests/resolv.conf.erb
    # resolv.conf build by Puppet
    domain <%= domain %>
    search <% searchdomains.each do |domain| -%><%= domain -%><% end -%><%= @domain %>
    <% nameservers.each do |server| -%>
    nameserver <%= server %>
    <% end -%>
    $nameservers = [ 'ns1.example.com', 'ns2.example.com', 'ns3.example.com' ]
    $searchdomains = [ 'inside.example.com', 'outside.example.com', 'ns3.example.com' ]
    file { "resolvconf":
    path => "/etc/resolv.conf",
    mode => 0644, owner=root, group=root,
    content => template('resolvconf.erb')
    }
    Concatenation
    content => template('header.erb','resolvconf.erb')
    cron
    If you want to manage crontab, use the “cron” type
    attributes:
    name
    command
    hour
    minute
    month
    monthday
    weekday
    user
    If you want to manage cron.d, or cron.hourly, use files.
    Even though it’s called “cron” it depends on the provider, so you can use it on Mac OS X for example.
    cron {'yum_cleaner':
    command => "/usr/bin/yum clean all",
    user => root,
    hour => 2,
    minute => 0
    }
    exec
    attributes:
    title – must be unique across “exec” (not across the whole manifest)
    command
    creates -
    cwd
    environment
    group
    logoutpu
    augeas

    Mozilla DB News, Friday May 4th

    Happy Star Wars Day. May the Fourth be with you!

    We are still busy with data center moves! Only two more to go through, until we are all done. Twice this week, people have come to me, frantically saying “This database is moving NEXT WEEK and I need it migrated! Help!” and in both instances my reply has been, “Those servers are ready to go, they’ve been replicating data for weeks.”

    Which led coworker Shyam Mani to make a Chill Meme (has bad language), which then prompted me to make a Keep Calm Meme.

    • Added some metadata for http://graphs.mozilla.org.
    • Moved our internal inventory database.
    • Started to update all the documentation we have, with all the new machine names and virtual IPs. This includes making sure backups are appropriately documented, and in several cases we realized backups just were not running. So much of my week this week was getting 2 of our backup servers set up with 6 instances total to back up. These backups were taken by taking a hot backup of production and restoring them, so we know that restores work too, for all these servers! This project is not completely finished, but did take up several days of my week this week.
    • Created a database and users for the Mozilla web page maker webtool.
    • Created a new database for the buildbot application.
    • Helped migrate the database for the Mozilla Graphs webtool.
    • Worked on my talks for the Professional IT Community Conference next week.
    • Made travel plans for the OUG Harmony Conference in Finland May 30-31st.
    • Got the ball rolling on OurSQL CDs and some super-secret but totally awesome (and less expensive than a T-shirt) Open DB Camp swag for the free Open DB Camp USA, June 8-10th, co-located with the free SouthEast LinuxFest in Charlotte, North Carolina.
    • I also sent e-mail to previous sponsors asking for sponsorship of Open DB Camp. With a minimum of $250 for your business to be recognized, this is an easy way to stretch your marketing budget dollars!
    • Submitted paperwork to renew financial support for OurSQL through May 31st, 2013. Thanx Oracle TechNet!

    It has been a crazy few weeks, and will only get crazier as I will be traveling more and more….This weekend is the only weekend in May I will be spending at home!

    2012 Percona Live: MySQL Conference and Expo Session Videos

    A few weeks ago was the Percona Live: MySQL Conference & Expo. As in years past, I have recorded videos. Scroll down to see the 15 videos I am able to share. Very special thanks go to my fellow Mozillian espressive of the Mozilla WebDev team who was generous enough to make this look very nice, as CSS is not my specialty.

    Unfortunately, Percona will not let me share the tutorial videos I recorded – which is odd, since O’Reilly had no problems with me recording tutorials in 2008 (memcached tutorial), 2009 (part 1 and part 2 of a metadata tutorial) and 2010 (part 1 and part 2 of a tutorial about config options) – I have no videos from 2011 since I did not attend the conference.

    Without further ado, here are the 15 session videos I was able to record and am able to share with you:

    2012 Percona Live: MySQL Conference and Expo

    Percona Live: NYC Conflicts with Oracle OpenWorld

    I was checking out my schedule as I prepared to submit some talks to Percona Live: NYC, when I noticed that it conflicts with Oracle OpenWorld, and starts the day after MySQL Connect finishes. The dates for Oracle OpenWorld were announced at the end of last year’s OpenWorld, so Percona has made a bold statement by choosing to conflict with Oracle OpenWorld. I do not believe Percona knew about MySQL Connect, but it’s happening all the same.

    This is all mostly to say that I cannot attend Percona Live: NYC – I have obligations as an Oracle ACE Director to attend some meetings before and during Oracle OpenWorld, but even if I did not, I just submitted presentations to MySQL Connect and it’s not feasible for me to do that kind of flight with a time zone change and try to be coherent for a presentation. Therefore, I will not be submitting any presentations to Percona Live. I suppose folks will “notice” that Oracle is missing from this conference as well – just like the Percona Live: MySQL Conference and Expo, Percona is making it difficult for Oracle to actually attend.

    For those who do wish to submit, the call for papers for Percona Live: NYC is open until May 14th (according to Terry Erisman).

    MySQL Talks I Am Giving

    In the next 6 weeks or so, I will be doing a bunch of speaking. You can come see me if you will be in the New York City metro area (New Brunswick, New Jersey); Charlottesville, Virginia; near Helsinki, Finland and Charlotte, North Carolina.

    I will be speaking at the Professional IT Community Conference about MySQL Security. This low-cost conference run by the League of Professional System Administrators, or LOPSA, is not to be missed. The conference runs from Friday, May 11th through Saturday, May 12th. I have spoken at both previous PICC’s and I learned plenty from system administrators while I was not speaking.

    I will also be bringing my MySQL Security talk to the Central Virginia MySQL Meetup on Wednesday, May 16th – special thanks to Baron Schwartz for having me.

    At the end of May, I will be speaking at the OUG Harmony Conference in Finland about MySQL security and “Optimizing MySQL JOINs and Subqueries”. If you will be near Aulanko and are interested in MySQL, make sure to come by!

    At the beginning of June, I will bring the MySQL Security talk and a talk on Intermediate MySQL Administration to Open Database Camp at the Southeast LinuxFest in Charlotte, North Carolina. This is the first year we are co-locating the Open Database Camp, and I think it will be very successful! Southeast LinuxFest is *free*, although OpenSQLCamp is looking for sponsors – co-location is not free, but it is easier since we have a fixed cost and only have to fundraise until we hit that fixed cost.

    Mozilla DB News, Fri Apr 27th

    It has been 4 weeks since I last posted the goings-on for Mozilla DBs. April is always a crazy month because of the annual MySQL conference (Some great pics here). This year it was the Percona Live: MySQL Conference and Expo. And of course as soon as I get caught up from the conference, I have to submit more sessions to MySQL Connect (call for papers closes Sunday May 6th) and Percona Live: NYC (anyone know when the call for papers for this will close?).

    At the conference, I gave a lightning talk and a tutorial. I have posted the slides for those interested. Unfortunately, Percona asked me not to post the recordings of tutorials. I had cleared recording with them, but apparently during the last tutorial session one of the 2 video cameras was turned off and I was informed after the fact that I was not supposed to be recording them. Which is odd, since
    O’Reilly had no problems with me recording tutorials in 2008 (memcached tutorial), 2009 (part 1 and part 2 of a metadata tutorial) and 2010 (part 1>/A> and part 2 of a tutorial about config options). I have no videos from 2011 since I did not attend the conference.

    At any rate, this is just to let you know that due to Percona’s policy, there will not be any tutorial recordings available (should they decide to change their policy, the tapes have not been recorded over yet). The session recordings are forthcoming, I will blog here when they are ready for viewing.

    We have also been busy with data center moves! Due to MySQL’s flexibile architecure, we were able to move db services with very little interruption to the end user. This is a big deal because we have to change monitoring checks and network flows as well as MySQL ACLs (database authorization – Mozilla is very good about only allowing specific hosts or groups of hosts to connect to MySQL!)

    In the last 4 weeks, we have done many moves:

    • We not only moved the database cluster that buildbot was on, but we also built a cluster specifically for buildbot, with no other services on it. Before, it shared a database with other services such as graphs, cruncher and autoland.
    • Upgraded a production slave of Bugzilla to MySQL 5.1 and put it in the mix. It has been running for a few weeks without any problems, so I will be upgrading the other slaves as soon as I have time.
    • Helped debug why https://reps.mozilla.org/people/ was slow. It was not a database issue. (I have to say, after doing 5 years of consulting, where problems are usually the database, it’s nice to work in a shop where the problem usually is NOT the database!)
    • Started to turn an old mail server, which had lots of space, into a backup server in our Phoenix data center.
    • Added some metadata for http://graphs.mozilla.org.
    • Moved our PHPMyAdmin server. It moved, and nobody noticed, so either I did a good job, or nobody’s really using it!
    • Added new database grants so our metrics team could access new customBugzilla fields.
    • Moved our metrics databases.
    • Moved the web development databases.
    • Move our internal inventory database. This is tricky, because we relied heavily on our inventory database during our move, so we had to be extra careful that we did not cause any problems with the move…while we were moving this server! Of course that was not a huge issue, as we had a database in a third data center that took all the traffic and became the active database cluster as we moved the original cluster (from San Jose to Santa Clara).
    • There was plenty of PostgreSQL work to do as well. I learned how to refresh our stage database from production and also got a lot more practice in reprocessing crash reports for our Crash Stats database.
    • Did a test-run of moving the database behind a Mozilla wiki, and mentored the Web Operations team on how to do it. They did it successfully, which meant I did not have to be up in the wee hours of the morning!
    • Got a sanitized copy of the Bugzilla database to some developers, a few times.
    • Created the Mozilla Labs database cluster.
    • Created the Bedrock database cluster – on Percona Server 5.5! I am very excited to be using a 5.5 version on new projects. One of the goals I am making slow progress on is upgrading servers from MySQL 5.0.
    • Created the new developer database cluster.
    • Created the Personas database cluster.
    • Decommissioned old http://support.mozilla.com database servers.
    • Decommissioned old http://addons.mozilla.com database servers.
    • Decommissioned old Bugzilla staging database servers.
    • Created a new Bugzilla staging database cluster.
    • Debugged a Hive problem – For reference, the problem was getting this error:

    ERROR metadata.Hive: javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : Binary logging not possible. Message: Transaction level 'READ-COMMITTED' in InnoDB is not safe for binlog mode 'STATEMENT'

    The problem is that Hive automatically sets the transaction isolation level to be READ_COMMITTED by default. The solution is
    to change Hive’s configuration settings to set the transaction isolation level to REPEATABLE_READ, to match MySQL’s default.

    It has been a crazy few weeks, and will only get crazier as I will be traveling more and more!

    When is wait_timeout not wait_timeout?

    Over the weekend I came across an extremely curious issue with MySQL. It seemed that no matter how many times I tried to set the wait_timeout, it would always show the value of interactive_timeout. I even tried restarting mysql, to no avail.

    Eventually I figured it out. When I was in an *interactive session*, wait_timeout displays as the value of interactive_timeout. Otherwise, it showed the appropriate value. Here’s what I found, when interactive_timeout was set to 600 and wait_timeout was set to 14400 (this is on an analytics server, so setting the value that high actually makes sense):

    [root@mysql1 ~]# mysql -e "show variables like 'interactive_timeout'"
    +---------------------+-------+
    | Variable_name | Value |
    +---------------------+-------+
    | interactive_timeout | 600 |
    +---------------------+-------+

    [root@mysql1 ~]# mysql -e "show variables like 'wait_timeout'"
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | wait_timeout | 14400 |
    +---------------+-------+

    When using non-interactive logins, like mysql -e “COMMAND”, wait_timeout has the appropriate value. However, in an interactive session, wait_timeout had the same value as interactive_timeout:

    [root@mysql1 ~]# mysql
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 23814
    Server version: 5.1.61-rel13.2-log Percona Server (GPL), 13.2, Revision 430

    Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

    mysql> show variables like 'interactive_timeout';
    +---------------------+-------+
    | Variable_name | Value |
    +---------------------+-------+
    | interactive_timeout | 600 |
    +---------------------+-------+
    1 row in set (0.00 sec)

    mysql> show variables like 'wait_timeout';
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | wait_timeout | 600 |
    +---------------+-------+
    1 row in set (0.00 sec)

    I observed this behavior with Percona Server 5.5.21-55, and with Oracle’s MySQL 5.1.61 and 5.0.77, so it is neither a new feature, nor is it limited to Percona only.

    Putting on my “reverse engineering” hat, my guess is that MySQL looks at the value of “wait_timeout” to decide when to timeout, and when you use an interactive session, wait_timeout is set to the value of interactive_timeout. In other words, I guess that interactive_timeout serves only to set wait_timeout for interactive sessions.

    I am not sure if this is a bug or a feature, but I have seen plenty of these kinds of “subtle hacks” in MySQL so it would not surprise me if this is the way it was intended to work. It’s extremely confusing to figure out though, when you try to set the variable and then check it….here is one of my frustrating sessions, where the change didn’t seem to “stick”:

    mysql> set global wait_timeout=14400;
    Query OK, 0 rows affected (0.00 sec)

    mysql> show variables like 'wait_timeout';
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | wait_timeout | 600 |
    +---------------+-------+
    1 row in set (0.00 sec)

    mysql> exit
    Bye
    [root@mysql1 ~]# mysql
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 23810
    Server version: 5.1.61-rel13.2-log Percona Server (GPL), 13.2, Revision 430

    Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

    mysql> show variables like 'wait_timeout';
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | wait_timeout | 600 |
    +---------------+-------+
    1 row in set (0.00 sec)




    Edited to Add - My guesses were correct! Many people have pointed out, this is a documented way that this works.

    Liveblogging: What’s New In MySQL 5.5 and 5.6 Replication

    Liveblog: What’s New In MySQL 5.5 and 5.6 Replication
    presented by Giuseppe Maxia at the Percona Live MySQL Conference and Expo.

    Has been working with MySQL for 11 years.
    Yesterday Oracle released MySQL 5.6.5 with new stuff, so the content was rewritten last night.
    main features we’ll talk about:
    5.5 semi synchronous replication
    5.6 delayed replication
    server uuid
    crash-safe slave
    multi-threaded slave
    global transaction identifiers

    In MySQL 5.5 – what is semisync replication? It increases reliability by making sure that the changes are committed on at least one of the slaves before the write query is returned.
    To use it, install the plugin rpl_semi_sync_master on the master, and rpl_semi_sync_slave on the slave. Then you configure the master to use the plugin, and then you restart both the master and slave to activate the plugin and config.

    Once you restart you see variables like rpl_semi_sync and you can see whether it’s enabled, what the timeout, trace level, etc are.

    Rpl_semi_sync_master_no_trx (how many trx didn’t go to the slaves) and Rpl_semi_sync_master_yes_tx (was it successfully sent) are status variables = counters on the master.

    In MySQL 5.6 – delayed replication is part of the MySQL that ships with MySQL, you no longer need to use pt-slave-delay.
    CHANGE MASTER TO MASTER_DELAY=#seconds;

    show slave status shows you SQL_Delay and SQL_Remaining_Delay

    UUID – each master has a UUID in addition to the server_id. This is important for global transaction IDs, which makes sure that even if you change server_id’s the global transaction ID is still associated with the same server. [Giuseppe did not know why the server_id wasn't abandoned totally in favor of the UUID.]

    More crash-safe replication:
    SHOW VARIABLES LIKE ‘%info%’ ;
    there are master_info_repository, relay_log_info_file, relay_log_info_repository, sync_master_info, sync_relay_log_info
    also there’s a variables called slave_parallel_workers

    These variables help ensure that if the slave crashes, the relay log and master.info files are kept in sync and don’t try to retrieve or apply a statement more than once.

    The crash-safe features are only for innodb, but MySQL ships the following tables as MyISAM:
    slave_master_info
    slave_relay_log_info
    slave_worker_info
    so when you start, ALTER TABLE …ENGINE=INNODB

    When you do SHOW SLAVE STATUS you’ll see
    Master_info_file: mysql.slave_master_info

    SELECT * FROM tables you get a lot of info – slave_master_info is most, but not all, of the same info in SHOW SLAVE STATUS.

    slave_worker_info has 1 line for each worker you have set.

    facts about multi-threaded slave aka parallel replication

    Requires MySQL 5.6 in both master and slave (can’t do 5.5 master, 5.6 slave) to actually do parallel replication.
    Parallel replication with a MySQL 5.5 master will slow down replication – so you can do parallel replication but it’s 3x slower (right now).
    Data gets parallelized by the schema – so if you have 2 queries in the same schema, they can’t be done in parallel, but 2 queries in 2 different schemas, they can be done in parallel. If you only have one schema, don’t bother using parallel replication.

    do it by setting the global dynamic parameter:

    SET GLOBAL slave_parallel_worker=10; (or 3, or whatever, default is 0). How do you choose the # of parallel workers to choose? MySQL is smart enough to figure out how to use the parallel workers around the schemas you have. In tests, parallel replication was 3-4x faster than regular replication.

    Global transaction ID – If a master fails, all the data from the master cannot be accessed. So you have to figure out which slave is the most advanced one (in terms of time), promote that slave to the master, and figure out which transactions the other slaves were missing. But it’s hard to know in the relay log, because it can be a different file/position in each slave. So the global transaction id uses a unique number, so you don’t have to worry about the relay log filename/position.
    to use the global transaction id, all machines in the cluster (master and slaves):
    log-bin
    log-slave-updates
    gtid-mode=ON
    disable-gtd-unsafe-statemnets – tricky
    Why tricky? This feature does not work on non-transactional tables, like MyISAM. If you try to remove the anonymous user from the mysql.user table and you have this set, it’s not safe so it doesn’t work and you get a master error. GRANT, REVOKE, DROP USER will work, but DELETE, INSERT, etc don’t work.

    CREATE TABLE…SELECT does not work with this feature, so don’t turn it on if you use that.

    After changing the variables in the mysql config file, restart the servers to pick up the changes.

    You then see the @@SESSION.GTID_NEXT=’44556A96-8417-11E1-9589-2BD5ACDD51FD:1′
    after the : is the transaction ID number (1 in this case), everything before it is the server’s UUID. Using this information you can more easily find which transactions in the binary log you need.

    There are 2 more lines in SHOW SLAVE STATUS\G:
    Retrieved_Gtid_Set:1-100
    Executed_Gtid_Set:1-100
    (That’s the set of numbers)

    —–

    And that’s it, he then took questions.

    Google-Hacking MySQL and More MySQL Security Slides Are Up

    If you missed my 3-hour tutorial on “Google-Hacking MySQL and More MySQL Security at the Percona Live: MySQL Conference & Expo this morning, the PDF slides are now available at http://bit.ly/mysqlsectut.

    There will also be video, which will appear in the next few weeks.

    *Not* Going to Percona Live: MySQL Conference & Expo?

    If you cannot make next week’s Percona Live: MySQL Conference and Expo, I am proud to announce that Mozilla will be live-streaming on Air Mozilla the events on Wednesday, April 11th. The events are (at the very least) the MySQL Community Awards and the Lightning Talks. The MySQL Community Awards are happening at 6:00 pm Pacific Time and the Lightning Talks are happening at 6:30 pm Pacific Time. (If the times change, I’ll let you know)
    To watch the events on Air Mozilla, go to https://air.mozilla.org/ at the appropriate time. I will try to remember to post a reminder, and if for some reason the tech doesn’t work out, I’ll let you know by

    Going to Percona Live: MySQL Conference and Expo?

    Make sure to leave some room in your suitcase if you’re going to the Percona Live: MySQL Conference and Expo next week. The OurSQL podcast folks – that’s me, current co-host Gerry Narvaja and former co-host Sarah Novotny – will be giving away FREE CD’s and signing them during the book signing on Wednesday night. I will also have a limited number of copies of The MySQL Administrator’s Bible for sale for USD$40 – or you can order it ahead of time from Amazon.com and save yourself about $10.

    Mozilla will be having a booth in the Dot Org pavilion this year. Stop by and say hi, let us know what you love (and hate) about Firefox, Thunderbird, Bugzilla and any other Mozilla product you use. And do not forget to look at the Mozilla has several job openings – though sadly none for full-time DBAs at this time. But if you’re a developer or systems/network administrator or work in any other sector of IT, there’s probably a job opening you’d fit into.

    I packed this morning for the trip I’m making – first to New York City to speak at the Effective MySQL User Group tomorrow night starting at 7 pm (come at 6:45 for the pizza!). Then I’m off to New Jersey to spend Passover with my family, which got a little larger just 2 weeks ago (congrats to my brother and sister-in-law!). After that I fly to California…so I’m packed for three events right now.

    Let the fun begin!

    Mozilla DB News, Fri Mar 30th

    Well, this week has been pretty hectic. We are really getting to the “move ALL the machines!” stage of the data center move, and the train we have going on April 9th includes many, many virtual machines. I am also preparing for the Percona Live: MySQL Conference and Expo in just over a week. Probably the most exciting news is that the Mozilla booth in the Dot Org Pavilion will have a Boot-to-Gecko device available. If you want to see Mozilla’s foray into the mobile device market, come see us at the conference!
    This weekend I’ll be burning the midnight oil working on presentations, for the Percona Live: MySQL Conference and Expo (tutorial and lightning talk), SkySQL & MariaDB Solutions Day and for a presentation at Effective MySQL in New York City next week. But, that’s in the future, let’s delve into the past week, shall we?
    - Ordered 10 copies of the MySQL Administrator’s Bible for the book signing on Wednesday, April 11th. I will be selling the book for US $40, since it’s easier to not have to make change, and there is an ATM on site. If you have your copy, I’m happy to sign it.
    - Shipped a box of CD’s with sample OurSQL episodes. This is also for the book signing (I guess it’s a “media signing”?) on Wed night. But unlike the books, the CD’s are free, so make sure to stop by and get one! 25 episodes with links to the show notes website pages.
    - Gathering folks to staff Mozilla’s Dot Org Pavilion booth, and swag to give away.
    In non-logistical, actual database news:
    - Released a new major version of Socorro, our crash-stats application, including database changes
    - Moved a virtual machine for a PHPmyadmin server.
    - Restored the backup to a development machine that had a bizarre turn of events reset the data back to December 2011.
    - Created databases and users for the development and stage versions of wiki.mozilla.org and treestatus, a simple wsgi app to maintain the open/closed state of development trees.
    - Dealt with our Addons primary db master crashing (again). This was after it had crashed 2 weeks ago and we were advised to update the firmware, which we had done. Needless to say, our secondary db is now our primary db, and we now have n+1 slaves (we normally operate with n+2 slaves). The crashed primary master is up and running, but we are not putting it back into production use until we get a definitive answer as to what’s wrong with it.
    - Upgraded the staging master database for Bugzilla to Percona’s patched version of MySQL 5.1. This weekend I will be upgrading a production slave….I’m so excited! Note that the database is also being converted to innodb_file_per_table while we are at it.
    - Updated permissions for our metrics team to be able to access a new custom Bugzilla field
    - Debugged an application having “Aborted connections” on the MySQL side and CLOSE_WAIT sockets on the client side. (The application wasn’t closing the database handler properly)
    - Worked on a migration plan for the Buildbot databases to be on their own server, instead of a server shared with other small Mozilla applications.
    - Did a database export of non-vouched Mozillians accounts, so that we could try to vouch them. Are you a Mozillian yet? Add yourself as a Mozillian!

    Mozilla DB News, Thu Mar 22nd

    I had planned to take tomorrow off as a fun “play hooky” day, but I still have responsibilities, and with the crazy way this work has been going, I feel like I’ve done 5 days of work in 4 days’ time.

    So far this week, I feel like this:

    I posted that to our internal IRC channel, and it was submitted by someone else to Mozilla Memes, with the title “Life is always better with awesome DBA’s on staff. I love how Mozilla openly appreciates its employees, in both serious and fun ways.

    Now, why was this week so crazy? Well, unlike previous weeks, very little work was done on the data center move (even though we have many, many machines yet to move).

    • An HP machine that was a db server for Addons and the new Mozilla Marketplace had a hardware problem and crashed. At one point in time, addons had n+1 slaves – if one slave went down, we were OK. However, we realized that we’re right about capacity now – if one slave goes down, everything else strains to meet the workload. So we added 2 new read slaves to the pool.
    • Started graphing the addons database servers with the MySQL Cacti Templates.
    • Gave a user read-only access to the Bouncer database for the purpose of debugging a problem.
    • Gave the Buildbot application its own database for production use, instead of being on a shared interface.
    • Upgraded the aforementioned Buildbot database, including getting puppet to recognize and play nicely with a new version of MySQL.
    • Rebuilt and refreshed a developer database cluster with backups (I love testing restores!).
    • Created databases/authentication for developer sites (3 different ones)
    • Exported databases from Firefox Flicks and support.mozilla.org

    It feels like a short list, but all those backup refreshes take a while, and setting up puppet so it can support upgraded MySQL packages took a while. Especially figuring out dependencies for perl-DBD-MySQL!

    To Percona Live Speakers: Permission for Conference Video Recording

    Many folks know that I have recorded sessions at conferences I attend, most notably the MySQL conferences. At last year’s OSCon, I had some trouble with the convention center staff getting upset that I was going into rooms during off-times (like before the morning sessions and at the end of the day) because I did not have a staff badge. They did not complain to me, they complained to O’Reilly, who knew I was recording and was turning the other cheek until then. The O’Reilly associate I spoke with was friendly, but let me know of the complaint and said, “You will get permission from the speakers before publishing, right?”

    And of course I said yes. However, that was a pain – getting permission after the fact, since not everyone put their contact information in their presentation.

    And I have been hemming and hawing about whether or not I want to record at Percona Live: MySQL Conference and Expo. I have had enough people ask if I was going to do it, and specifically ask me to do it, that I am putting a call out here for permission. There are some rules:

    0) If Percona has a problem with it, I won’t record.
    1) I will not record in a room Percona is recording in. There only needs to be one recording, and Percona’s are professional quality (and they pay for it to be such!). Mine are free, but amateur quality.
    2) I will only record sessions I have permission to record and redistribute. Permission must be given to me, preferably by a comment here, but I’ll accept an e-mail to me, if you have my e-mail. (I prefer it as a comment here because then it’s a matter of public record)

    Mozilla DB news, Friday March 16th

    • While adding a custom field to Bugzilla to track the newest SeaMonkey version, the script ran into a lock wait timeout and aborted. Some of the data needed to be manually inserted to finish adding the custom field.
    • We then needed to add database grants so our metrics team could access the new fields.
    • Added access so the Autoland staging server
    • We added the DBAs to what gets paged for our new backup server.
    • This seemed to be the week that a few of machines started having disk issues, though all of them were one-offs (as opposed to having to set expire_logs_days). I did run into a fascinating issue where binary logs for a machine were 7G even though the maximum size was supposed to be 1G.
    • This was also the week that some cron jobs did not get run, because we “sprung ahead”. Monday was a fun day, but luckily everything was easy to fix. Lesson learned: do NOT run anything via cron from 0200 to 0259 because if your server is set to a time zone that observes Daylight Saving Time, it will run twice in October/November and zero times in March.
    • The mozillians.org team wanted some data about group names so they could optimize their searching, so we gave them a data export.
    • We removed some company-sensitive comments from a bugzilla bug.
    • Due to machines being moved around from the old data center to the new one, we had a new location for the developers to pick up their nightly exports of the support.mozilla.com database.
    • Did you know I co-host a weekly podcast about MySQL? It’s called OurSQL Cast. You can find it on Feedburner and iTunes. Episode 83 is up, called “The NewSQL World”, and we interview Ori Herrnstadt, the CTO of Akiban.
    • We got several new database nodes kickstarted in our new data center.
    • We are preparing to upgrade MySQL on Bugzilla’s staging server, which will happen on Sunday.
    • /

    PGDay NYC, Austin and DC!

    The talks for PGDay NYC 2012 have been announced. The full lineup of talks for the one-day conference on Monday, April 2, 2012, is available at http://pgday.nycpug.org/schedule/.
    There is a $50 discount on tickets using the code PGMEETUP (http://pgdaynyc2012.eventbrite.com/?discount=PGMEETUP) from now through March 16th.
    PGDay NYC is a one day PostgreSQL conference in NYC featuring both well-known PostgreSQL community members as speakers and local users sharing their practical experience of using PostgreSQL as a key part of their infrastructure. PGDay NYC 2012 is part of the “PG Corridor Days” series of one-day conferences to help promote PostgreSQL usage in their locales.
    The other conferences are PGDay Austin on Wednesday, March 28 http://www.postgresql.org/about/event/1379/ and PGDay DC on Friday, March 30 (http://pgday.bwpug.org/). These are all non-profit events organized by local users.

    MySQL Community Dinner at Pedro’s

    Once again, this year there will be a community dinner at Pedro’s. Pythian is organizing this pay-your-own-way, informally fun way to meet and re-connect with colleagues and friends, old and new. It is open to everyone, and is right after the opening reception Percona is throwing (that’s from 4:30 to 6:30).

    So on Tuesday, April 10th, meet at 6:30 pm in the lobby of the Hyatt to walk over (about 1 mile), or meet at 7 pm at Pedro’s – 3935 Freedom Circle, Santa Clara, CA. If you want to come, please RSVP by leaving a comment on the Pythian post, so they have an accurate headcount.

    Seeking Articles About HTML5 and Python

    Because I wrote a book and am somewhat visible in the community, I often get requests to write articles, and many of those requests are not completely appropriate for me. For example, today I got an e-mail from Lukas Rakowski of Software Press, who makes PHP Solutions Magazine. Basically, they’re looking for articles about HTML5 and Python – neither of which are subjects I could author an article on (other than “I know HTML5 is all sorts of awesome” and “I’ve been meaning to script in Python more”).
    However, I know that there are many Mozillians that can write and can write about HTML5 and/or Python. To that end, here is the message I received:
    We are creating a new version of popular magazine PHP Solutions, which is available online.
    The magazine is about programming languages and software creation technologies.
    Articles published in the our magazine provide ready-made solutions,
    which the reader can use in their work.
    We are looking for specialist in web development and programing, who will be
    interested in writing some articles in this subject.
    May you be interested in sharing your knowledge with us?
    We currently seek for HTML5 and/or Python specialists.
    Please send me more information about yourself
    and your experience in this subject.
    If you are interested, please contact Lukas at lukasz dot rakowski at software.com.pl – I have no further information about this; I don’t know what benefits/compensation are involved, how frequently the new magazine will come out, etc. I am just passing along the message.

    Mozilla DB news, Week of Fri Mar 9th

    This week I am in Mountain View to help with some physical data center moves. It is nice to get to see where the magic happens, and our new data center is pretty awesome. Our data center technicians, Derek Moore and Erica Muxlow, are really doing a stellar job, as are their interns.
    At the office:
    - Visited the San Francisco office. What a great view!
    - A dependent subquery that had been running fine for weeks started acting up, causing 1000% more load than normal on a machine that hosted many sites, including the Firefox Flicks $10,000 Contest website and Tinderbox Push Load. We changed it to a join and added an appropriate index, and the query now runs much faster and is not causing any load.
    - Created a user and database for the development site of Mozilla Labs
    - Pointed our webdev team to some newer exports for some data they were looking for
    - Enabled MySQL access for another buildbot host
    - Cleaned up some older binary logs that were hanging around after a binlog name change in January, also set expire-logs-days for the same server
    - Converted some data from latin1 to utf8 on a staging machine and debug’d a cron job that was duplicating data
    - Helped assemble 32 new SSD drives for new database blades (and thanks to the rest of the systems team for installing the drives and kickstarting the machines!)
    At the data center:
    - Racked 16 blades, including scanning in and updating the inventory database with serial numbers and asset tags
    - For 6 of those 16, had to physically dig out the toe tags that had the serial numbers on it (due to a manufacturing problem I’m sure)
    - Attached rails to several new servers

    Percona Live Early Bird Pricing Ends Next Week!

    Don’t miss your chance to get the early bird pricing for Percona Live:MySQL Conference & Expo! Early bird pricing ends Monday, March 12th, so you have a week left to get maximum savings. If you want even more savings, register with code PL-pod for 10% off. For example, the early bird pricing for the 2 days of the conference sessions is $595, or if you want all 3 days, the pricing is $795. With 10% off your price comes to $535 for 2 days, or $625 for all three days.

    I will be doing a tutorial on MySQL Security, including White-hat Google Hacking with MySQL, on Tuesday at 9 am. There is a wonderful lineup of speakers and topics, so be sure to register while you can get the lowest price possible.

    Also, Mozilla will be having a booth in the DotOrg Pavilion, so if you have questions about any of our products, like Firefox or Thunderbird, or newer products like Firefox on Mobile or Boot2Gecko, the new mobile hardware platform, stop on by during the conference!

    DB Friday, March 2nd edition

    Next week I will be in Mountain View to help with the data center move we are doing. My responsibilities will mostly be remote-style work anyway, but it will be nice to see some of my team members in person. So as I close out this week and reflect about what got done in the Mozilla DB world, I also wanted to make clear why I write these. These posts help answer the questions “What does a DBA do?” and “What does Mozilla use databases for?”

    This week in Mozilla databases:

    • A database with some tables with a latin1 charset and some tables with a utf8 charset was converted so that all the tables used utf8. This involved exporting using mysqldump --default-character-set=utf8, dropping the table and re-creating it with a utf8 charset, and importing the data.
    • The default configuration for MySQL databases at Mozilla has been changed to set the default character set to utf8, and existing configurations have been changed to set the default charset to utf8. We use puppet to set up and maintain configuration files.
    • A plan to upgrade our Bugzilla databases has been made, approved, and is in progress. And before anyone asks – yes, it is fully my intention to work on query optimization so that future versions of Bugzilla will have awesome queries; however, there is plenty of internal work to do at Mozilla first!
    • Files were generated with the general log and prepared with pt-log-player for benchmarking a system with SSD’s.
    • In preparation for our data center move, we promoted a machine in another data center to be the master for our download.mozilla.org service.
    • We worked with our Infrastructure Security team gathering data on a security issue.
    • Created a new slave for a development web environment
    • Currently our physical backups are cold backups where we shut down our backup instance of MySQL and copy files. We have started to work on implementing Xtrabackup for these cases. This is a more long-term project, as backups work fine for now, but every step on the way to using Xtrabackup is a good one. We also use mysqldump for logical backups.
    • Created new databases for the ci.mozilla.org service
    • Added the custom field “cf_blocking_fennec10″ to the Bugzilla database
    • Updated data and schema for the development and staging databases for CaseConductor, a litmus replacement for the QA team
      Last week’s theme was:
    • There were some tweaks to be made with scripts that use the backups for ETL and updating data for development environments so they would function properly.

    What does pt-show-grants look like?

    The OurSQL Podcast did an episode on some of the lesser-known but very useful tools in the Percona Toolkit. pt-show-grants is one of those tools that I use pretty frequently. While the manual page has an explanation of all the features and a few examples, you don’t really see the output, and often you decide whether or not to use a tool based on what it gives you as output.

    So here is a small example of an actual command I did today using pt-show-grants. I wanted to find the grants for a particular user. To do that without pt-show-grants, I’d have to login to MySQL, run

    mysql> SELECT host FROM mysql.user WHERE user='aus4_dev';

    And then use that host information in a SHOW GRANTS statement:

    mysql> SHOW GRANTS FOR aus4_dev@HOST;

    But I would have to do this for each HOST – if there were 2 hosts, I’d have to run the SHOW GRANTS command twice.

    Happily, pt-show-grants has an option called –only, which will show you all user@host combinations for the username you specify. I have login information stored in a .my.cnf on this particular dev machine, and except for the password and host, this is an exact copy/paste of what I typed and the output:

    [scabral@dev1.db ~]$ bin/pt-show-grants --only aus4_dev
    -- Grants dumped by pt-show-grants
    -- Dumped from server Localhost via UNIX socket, MySQL 5.1.52-log at 2012-03-01 08:52:01
    -- Grants for 'aus4_dev'@'10.0.0.1'
    GRANT USAGE ON *.* TO 'aus4_dev'@'10.0.0.1' IDENTIFIED BY PASSWORD '*1234567890ABCDEF1234567890ABCDEF12345678';
    GRANT ALL PRIVILEGES ON `aus4_dev`.* TO 'aus4_dev'@'10.0.0.2';
    -- Grants for 'aus4_dev'@'10.0.0.2'
    GRANT USAGE ON *.* TO 'aus4_dev'@'10.0.0.2' IDENTIFIED BY PASSWORD '*1234567890ABCDEF1234567890ABCDEF12345678';
    GRANT ALL PRIVILEGES ON `aus4_dev`.* TO 'aus4_dev'@'10.0.0.2';

    By default, if I did not put in the –only, it would show me all the users that I was allowed to see. There is also an –ignore option, so if you want to show all users except a particular username, you can do that as well.

    Being able to find all user@host users and their grants given a particular username is very handy and eliminates the need to go into the database to find the hostnames.

    DB Friday, February 24th edition

    The overriding theme this week – we are still doing a lot of shifting around of databases pending our data center move.

    This week, the database team at Mozilla has:
    - Discovered some minor but important issues following a crash. We have increased alerting to ensure that if cleanup after a crash is necessary, the proper folks will be notified.
    - Created a new database for Tinderbox Push Log for bug caching. In the process, updated some sorely out-of-date database documentation.
    - Added new machines to graphs.mozilla.org

    - Cleaned out puppet entries from some database machines that no longer exist.

    - Updated database entries in puppet to include or exclude the “read_only” setting as appropriate, and are monitoring that setting with the PalominoDB Nagios Plugin for MySQL.

    - Updated a process for loading a Postgres database; instead of four hours, the process only takes a minute.

    - Racked the Scalarc appliances for our proof-of-concept (POC) evaluation.

    - Started a proof-of-concept (POC) evaluation with ScaleBase.

    - In preparation for our data center move:
    – We now have redundant backups, both logical and physical, on our most-used development machine as well as the databases backing addons.mozilla.org and support.mozilla.org. We are also monitoring the backups to ensure that no backup is too old. (We are working on testing restores of the major databases as well, but there is only so much we can do every week, and having the backups took priority).

    – We have moved our staging site for our Crash Stats site.
    – We have turned off the old site for support.mozilla.org in the data center we are moving away from.
    – We have ensured that the databases for addons.mozilla.org are no longer being used and are ready to be turned off.

    Caching for Monitoring: Timing is Everything

    I found Baron’s reasoning on why the Percona Nagios plugins do not use caching interesting. On the surface, the logic is sound – you do not want to cache when you want real-time monitoring.
    I have not yet had time to look at the Percona plugins for Nagios, though I do want to, because back at PalominoDB I helped write a Nagios plugin for MySQL that allows you to do arbitrary calculations. By “arbitrary calculations” I mean you can have a calculation like “Threads_connected/max_connections*100″ and set a threshold of “>80″. You can mix and match MySQL status variables and system variables, and use any perl functions as well, including basic arithmetic.
    We put a caching function in there, complete with examples. Why would we do that if caching is bad, as Baron says? Well, long-term caching is bad, but it is certainly acceptable to have caching with a threshold lower than the check interval. With the PalominoDB Nagios plugin for MySQL, you can have 10 different calculations, and set the cache threshold on a per-check basis – the examples use 60 seconds.
    If I have 10 calculations, I can set it so the checks do not re-connect to MySQL if there is a file that’s less than 60 seconds old. With checks that run every 5 minutes by default, it makes complete sense to cache the first run of SHOW STATUS/SHOW VARIABLES/SHOW PROCESSLIST, and the other 9 checks use the cache file – but only if it fresh within the past minute.
    It is quite likely that the Percona Nagios plugins for MySQL do not lend themselves to this type of caching. Because the PalominoDB Nagios plugin is so powerful, it makes sense to have this type of micro-caching. Otherwise, each additional monitoring check adds more strain on the database.

    Edited to add: I just took a look at the Percona monitoring plugins for Nagios and they check very, very different things. The level of customization is not as flexible as it is with the PalominoDB Nagios plugin, because it is checking very different things. The Percona monitoring plugins are a set of 12 different checks, as opposed to the 1 check that the PalominoDB plugin has. Still, I could see a value in caching the output of SHOW ENGINE INNODB STATUS (or whatever it uses) for the pmp-check-mysql-innodb plugin, so you can run it three times - once with the idle_blocker_duration option, once with the waiter_count option, and once with the max_duration option, and it only runs the command needed once.

    Basically it comes down to this: Baron is correct when he says "Running SHOW STATUS infrequently doesn’t add load to the server." and I am correct when I say "Running SHOW STATUS frequently adds a lot of load to the server." The Percona plugin and the PalominoDB plugin are completely different, and there seems to be very little overlap.

    This week in Mozilla Databases: Friday February 17, 2012

    I have been at Mozilla nearly three months, and I used to blog a lot more than I currently do. A lot of the content I used to blog about I end up blogging and talking about in OurSQL: The MySQL Database Community Podcast. And I have also been getting used to the Mozilla firehose, as well as my own firehose of database projects that need to be done.
    There are two very large projects that are time-sensitive that I am working on: migrating databases from an older data center to a newer one, and the impending public launch of the Mozilla Apps Store.
    That being said, this week in Mozilla databases we have:
    - migrated/improved/built our dev/stage databases for Socorro, our crash stats database.
    - put monitoring on a newer backup server, after a random check showed replication had been stopped on one backup instance for several days due to the master’s binary logs changing names. Of course we also fixed that broken replication.
    - made more progress getting the newer backup server to act like the older backup server – we do physical and logical backups, and currently the logical backups are working properly. The physical backups are a legacy cold backup, and I will not be migrating that, instead opting to use xtrabackup.
    - turned off our Scalarc software, as we now have an appliance for our proof-of-concept test.
    - retired 2 machines that were not in use, and exist in our older data center. I am always paranoid when I shut a machine down, triple-checking that I am on the right server when I type “shutdown now”.
    - did a test migration of the production database for Mozilla QA, from the old data center to the new one.
    - added new custom fields to Bugzilla for the release of Thunderbird 10.
    - Created new databases and access for:
    Case Conductor, a replacement for Litmus for the QA team
    De Todos Para Todos, Mozilla’s outreach project to Latin America.
    There is much much more to come in the weeks ahead!

    This week in Mozilla Databases

    I have been at Mozilla nearly three months, and I used to blog a lot more than I currently do. A lot of the content I used to blog about I end up blogging and talking about in OurSQL: The MySQL Database Community Podcast. And I have also been getting used to the Mozilla firehose, as well as my own firehose of database projects that need to be done.

    There are two very large projects that are time-sensitive that I am working on: migrating databases from an older data center to a newer one, and the impending public launch of the Mozilla Apps Store.

    That being said, this week in Mozilla databases we have:

    - migrated/improved/built our dev/stage databases for Socorro, our crash stats database.

    - put monitoring on a newer backup server, after a random check showed replication had been stopped on one backup instance for several days due to the master's binary logs changing names. Of course we also fixed that broken replication.

    - made more progress getting the newer backup server to act like the older backup server - we do physical and logical backups, and currently the logical backups are working properly. The physical backups are a legacy cold backup, and I will not be migrating that, instead opting to use xtrabackup.

    - turned off our Scalarc software, as we now have an appliance for our proof-of-concept test.

    - retired 2 machines that were not in use, and exist in our older data center. I am always paranoid when I shut a machine down, triple-checking that I am on the right server when I type "shutdown now".

    - did a test migration of the production database for Mozilla QA, from the old data center to the new one.

    - added new custom fields to Bugzilla for the release of Thunderbird 10.

    - Created new databases and access for:
    - Case Conductor, a replacement for Litmus for the QA team
    - De Todos Para Todos, Mozilla's outreach project to Latin America.

    There is much much more to come in the weeks ahead!

    User Group Videos....

    I have been attempting to be better about letting folks in the general MySQL world know when the videos from the Boston MySQL User Group are up. We have a great group, with anywhere from 20-50 attendees each month, and the speakers are always great, even when the group asks some tough questions.* Thanks to the generous donation of time that Richard Laskey puts in editing the videos, there are 2 videos I can share with you:

    Ivan Zoratti of SkySQL presented MySQL HA Reloaded at the December MySQL Boston Users Group. Slides are online at http://www.slideshare.net/izoratti/ha-reloaded-11261643

    Tim Callaghan of Tokutek presented Fractal Tree Indexes -- Theoretical Overview and Customer Use Cases at the January MySQL Boston Users Group.

    The next meeting is Monday, February 13th from 7-9 pm and will feature "Scaling MySQL with Transparent Database Sharding", by Victoria Dudin of ScaleBase. RSVP for the free meeting here or wait until the video comes out, if you are not local.

    * The group is really great about RSVP'ing too -- we usually get the approximate number of folks RSVP'd actually showing up. I recently went to a local user group that had over 100 people RSVP' and only about 1/3 of them showed up. I guess if you know the pattern you know how to 'read' the numbers, but as a speaker I would have been disappointed to expect an audience of 100 and have an audience of 30.

    Then again, maybe the speaker was informed of this...I have no clue. I guess what I am trying to say is that I really appreciate the Boston group!

    Syndicate content