Writing Storage Engines

Brian Aker — Director of Architecture

MySQL Server
SQL-Based, aiming to be SQL-03 compliant. Stable, scalable, easy to use, modular, high-performance RDBMS.

Client Library Support
Libmysql c-library (think OCI)
JDBC — type IV
ODBC
Perl DBD::DBI
PHP (built in)
ADO.Net, OleDB, Ruby, Erlang, Eiffel, Smalltalk and more provided by third parties.

So great because you can connect in many ways to different storage engines
Architecture

sql/ is for kernel
mysys/ is the portable runtime library. MySQL is ported to 52 platforms
(was 53 but Brian deleted OS/2 last week 🙂 ). The portable runtime library wraps commands like pwrite, unlinking and renaming files so that operating system does not matter.
mysql-test/ is for your test cases — run “mysql testrun –record”. Can take SQL from web applications, put them in test files, and run against upgraded mysql to see if the new versions break the code.

Definitions:
Storage Engine — code that stores data. Common API, so storage engine engineers don’t need to know SQL.
Handler an instance of a class. It controls the storage engine. Handler instantiates an object fetching data in the db. ie, handler is an instance of a table, etc.
Handlerton the structure. Storage engine needs to know things like a db was created, or a transaction was committed. So this is how we can talk to the storage engine. Not complete in 5.0, but complete in 5.1 — no need to hack with handler.cc code to put in your own hooks.

What you will need:
All code is written in simplified C++
An example storage engine (there’s one provided with MySQL). You can create something you need just by changing the skeleton.
Your own ideas

Is this done?
sure — Friendster, Google, Yahoo, Solid, Oracle.

Server’s kernel.

All database changes go to parser, then rewriter, then optimizer, then handler, then storage engine. (DML goes to query cache first, DDL skips that part.)

So what is a storage engine? “Data formats on disk” or “on the web” etc.

MySQL server instance talks to handlerton or handler itself.

[there was create handlerton code & explanation here]

Jeremy Zawodny’s “Writing a Simple Storage Engine”

Storage Engine Methods
storage/example is 5.1 directory to find it
in 5.0, in sql directory.

Table control
::create()
::open()
::close()
::delete()
(need to open, create, close and drop tables)

great examples!

Scan Path:
Locks -> Info (tell us about metadata of table, if no records, no need to go on. If few, don’t bother with indexes. If lots, use that info later in cost-based optimizer) -> Read Init -> Read Rows -> Cleanup

Trace of calls:
ha_example::store_lock
ha_example::external_lock (used to call flock() )
ha_example::info (all information from SHOW STATUS comes from this)
ha_example::rnd_init (can tell it if it’s going to fetch random or sequential blocks)
ha_example::extra Cash record in HA_rrnd()
ha_example::rnd_next
ha_example::rnd_next
ha_example::rnd_next
ha_example::extra End cacheing of records (def)
ha_example::external_lock
ha_example::extra Reset database to after open

only myisam uses all these extra Cash records, because Monty wrote it. InnoDB uses about 6 of them.

Check out push-down system for transactional engines.

Delete a row needs improvement to interface.

More — transaction methods (simple one in FEDERATED), bulk load methods, defrag methods, and more — read handler.h and documentation.

Lot that can be done in autoconf.

sql/Makefile.am — add your include and source file
sql/handler.h — register your handler
sql/mysql_priv.h — set up your variable for SHOW VARIABLES
sql/handler.cc — add yourself to the handler create list
sql/mysqld.cc — set up your variable for SHOW

grep for example in files, because that word is only used for the storage engine example.

More info:
sql/ha_example.h or .cc
Look at docs on mysql.com
forums.mysql.com
lists.mysql.com (internals)
MySQL Network (business opportunities available)

Brian Aker — Director of Architecture

MySQL Server
SQL-Based, aiming to be SQL-03 compliant. Stable, scalable, easy to use, modular, high-performance RDBMS.

Client Library Support
Libmysql c-library (think OCI)
JDBC — type IV
ODBC
Perl DBD::DBI
PHP (built in)
ADO.Net, OleDB, Ruby, Erlang, Eiffel, Smalltalk and more provided by third parties.

So great because you can connect in many ways to different storage engines
Architecture

sql/ is for kernel
mysys/ is the portable runtime library. MySQL is ported to 52 platforms
(was 53 but Brian deleted OS/2 last week 🙂 ). The portable runtime library wraps commands like pwrite, unlinking and renaming files so that operating system does not matter.
mysql-test/ is for your test cases — run “mysql testrun –record”. Can take SQL from web applications, put them in test files, and run against upgraded mysql to see if the new versions break the code.

Definitions:
Storage Engine — code that stores data. Common API, so storage engine engineers don’t need to know SQL.
Handler an instance of a class. It controls the storage engine. Handler instantiates an object fetching data in the db. ie, handler is an instance of a table, etc.
Handlerton the structure. Storage engine needs to know things like a db was created, or a transaction was committed. So this is how we can talk to the storage engine. Not complete in 5.0, but complete in 5.1 — no need to hack with handler.cc code to put in your own hooks.

What you will need:
All code is written in simplified C++
An example storage engine (there’s one provided with MySQL). You can create something you need just by changing the skeleton.
Your own ideas

Is this done?
sure — Friendster, Google, Yahoo, Solid, Oracle.

Server’s kernel.

All database changes go to parser, then rewriter, then optimizer, then handler, then storage engine. (DML goes to query cache first, DDL skips that part.)

So what is a storage engine? “Data formats on disk” or “on the web” etc.

MySQL server instance talks to handlerton or handler itself.

[there was create handlerton code & explanation here]

Jeremy Zawodny’s “Writing a Simple Storage Engine”

Storage Engine Methods
storage/example is 5.1 directory to find it
in 5.0, in sql directory.

Table control
::create()
::open()
::close()
::delete()
(need to open, create, close and drop tables)

great examples!

Scan Path:
Locks -> Info (tell us about metadata of table, if no records, no need to go on. If few, don’t bother with indexes. If lots, use that info later in cost-based optimizer) -> Read Init -> Read Rows -> Cleanup

Trace of calls:
ha_example::store_lock
ha_example::external_lock (used to call flock() )
ha_example::info (all information from SHOW STATUS comes from this)
ha_example::rnd_init (can tell it if it’s going to fetch random or sequential blocks)
ha_example::extra Cash record in HA_rrnd()
ha_example::rnd_next
ha_example::rnd_next
ha_example::rnd_next
ha_example::extra End cacheing of records (def)
ha_example::external_lock
ha_example::extra Reset database to after open

only myisam uses all these extra Cash records, because Monty wrote it. InnoDB uses about 6 of them.

Check out push-down system for transactional engines.

Delete a row needs improvement to interface.

More — transaction methods (simple one in FEDERATED), bulk load methods, defrag methods, and more — read handler.h and documentation.

Lot that can be done in autoconf.

sql/Makefile.am — add your include and source file
sql/handler.h — register your handler
sql/mysql_priv.h — set up your variable for SHOW VARIABLES
sql/handler.cc — add yourself to the handler create list
sql/mysqld.cc — set up your variable for SHOW

grep for example in files, because that word is only used for the storage engine example.

More info:
sql/ha_example.h or .cc
Look at docs on mysql.com
forums.mysql.com
lists.mysql.com (internals)
MySQL Network (business opportunities available)