2012 Mozilla DB Year in Graphs

I’m not a wizard with infographics, but there are almost 400 databases at Mozilla, in 11 different categories. Here is how each category fares in number of databases:

Mozilla DBs in 2012

Here is how each category measures up with regard to database size. Clearly, our crash-stats database (which is on Postgres, not MySQL) is the largest:

2012 size of all Mozilla databases

Since crash-stats dwarfs everything else, here is another pie chart with the relative sizes of just the MySQL databases:
2012 size of MySQL databases at Mozilla

I’m sure I’ve miscategorized some things (for instance, are metrics on AMO classified under AMO/Marketplace or “internal tools”?) but here are the categories I used:

air.m.o – air.mozilla.org
AMO/Marketplace – addons/marketplace
blog/web page – it’s a db behind a blog or mostly static webpage
bugzilla – Bugzilla
Crash-stats – Socorro, crash-stats.mozilla.com – Where apps like Firefox send crash details.
Internal tool – If the db behind this is down, moco/mofo people may not be able to do their work. This covers applications from graphs.mozilla.org to inventory.mozilla.org to the PTO app.
release tool – If this db is down, releases cannot happen (but this db is not a tree-closing db).
SUMO – support.mozilla.org
Tree-closing – if this db is down, the tree closes (and releases can’t happen)
World-facing – if this db is down, non-moco/mofo people will notice. These are specifically tools that folks interact with, including the Mozilla Developer Network and sites like gameon.mozilla.org
World-interfacing – This db is critical to tools we use to interface with the world, though not necessarily world visible. Examples include basket.mozilla.org and Mozillians.

The count of databases includes all production/dev/stage servers. The size, however, counts only one machine per environment. For example, Bugzilla has 6 servers in use, 4 in production and 2 in stage; its size is the size of the master in production and the master in stage, combined. This way we have not grossly inflated the size of the database, even though technically speaking we do have to manage the data on each of the servers.
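
To make the counting rule concrete, here is a minimal Python sketch of how I read it: every server adds one to the category's count, but only the master in each environment adds to the size. The `Server` record, the `summarize` function, and the sizes (100 GB and 40 GB) are all made up for illustration; they are not real Bugzilla numbers, and this is not the script used to build the charts.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Server:
    category: str      # e.g. "bugzilla"
    environment: str   # "production", "stage", or "dev"
    is_master: bool    # True for the one master in each environment
    size_gb: float     # size of the data on this server


def summarize(servers):
    """Count every server, but sum sizes for masters only."""
    counts = defaultdict(int)
    sizes = defaultdict(float)
    for s in servers:
        counts[s.category] += 1              # every server is counted
        if s.is_master:
            sizes[s.category] += s.size_gb   # replicas would inflate the size
    return dict(counts), dict(sizes)


# Hypothetical fleet shaped like the Bugzilla example:
# 4 production servers and 2 stage servers, one master in each.
# The sizes are placeholders, not measurements.
bugzilla = (
    [Server("bugzilla", "production", i == 0, 100.0) for i in range(4)]
    + [Server("bugzilla", "stage", i == 0, 40.0) for i in range(2)]
)

counts, sizes = summarize(bugzilla)
print(counts)  # {'bugzilla': 6}
print(sizes)   # {'bugzilla': 140.0}
```

Summing all six servers instead would have reported 480 GB with these placeholder sizes, which is the kind of inflation the master-only rule avoids.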

For next year, I hope to be able to gather this kind of information automatically, and have easily accessible comprehensive numbers for bandwidth, number of queries per day on each server, and more.
