So I’ve been looking into the Archive Storage Engine. What I would really like to do with it is get data in realtime, because (of course) the higher-ups want reports on realtime data — that is, they are not satisfied with a report that is run regularly, they want all the data up until “now”.
It is inadvisable to replicate from one storage engine type to another. I have not yet played with it, but since an Archive table does not allow updates and deletes, replicating from a MyISAM or InnoDB table to an Archive one is a bad idea.
Most folks probably run a batch job; but I wonder if it can be done in real-time. Or rather, ‘what is the best way to run it real-time?’ One way, off the top of my head, is to do this are to replicate to a blackhole table with a trigger, to insert into an archive table whenever an INSERT statement is called. The blackhole table should not give an error upon UPDATE or DELETE statements.
This also allows for easy aggregation, because the trigger can say “update the count and the country of a new profile” instead of having an entire replicated set of data, with reports running “SELECT count(*)”. Instead of copying all the data and running the same reports on a different server/table, we can now collect the data we actually want, which is “1 new paid membership at time t located in Sao Paulo, Brazil.” For reporting, we do not care what the name of the member is.
I have searched around but have not yet found how users are getting data into their archived databases. I need a sandbox server at work so I can play with the options.