Database snapshot conflict reduction

so if you want to take a database snapshot

(a database snapshot is a full copy of the state of a database at a point in time, unaltered)

and it takes like 12 years to record it

people are gonna be writing into your database in those 12 years

uh oh!

here's the good news

you can write a sequential execution callback handler

(edit 12/15/20)

callback handler: a service (handler) that receives full blocks of code (callbacks) to eventually execute. in this case, the blocks of code are instructions on how to modify the database, which are fulfilled as soon as possible.

sequential execution: the callbacks happen in the order that they came into the handler. first in, first out.
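
here's a minimal sketch of those two definitions put together, in python. the names (SequentialHandler, submit, run_pending) are made up for illustration, and the "database" is just a dict. it only shows the first in, first out part, not any blocking yet.

```python
from collections import deque


class SequentialHandler:
    """Hypothetical FIFO callback handler: callbacks run in arrival order."""

    def __init__(self):
        self._queue = deque()

    def submit(self, callback):
        # Store the callback; nothing is executed yet.
        self._queue.append(callback)

    def run_pending(self):
        # Execute callbacks first in, first out.
        while self._queue:
            callback = self._queue.popleft()
            callback()


db = {}  # stand-in for a real database
handler = SequentialHandler()
handler.submit(lambda: db.update(a=1))  # write
handler.submit(lambda: db.update(a=2))  # update
handler.submit(lambda: db.pop("a"))     # delete
handler.submit(lambda: db.update(b=3))  # write
handler.run_pending()
```

because the callbacks run in the order they arrived, the delete sees the entry that the earlier write created, and the dict ends up holding only b.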

purpose of the sequential execution callback handler? to block anything that modifies the database so that the database snapshot is taken "unmodified", that is, accurately. (implementations vary; my ideas are below.) to do that, it is layered on top of all modification requests.

...

format:

request -> action

request -> blocker -> action

the arrow is a label for a function call - the action happens once the request is made because the "action function" is called by whatever receives the request.

the blocker is our callback handler by definition - it inserts a delay that keeps the request from being fulfilled right away. it can also do anything else it wants, like making the action possible, prioritizing a favorable state for the action, or throwing the action away and dying.

you might count "do nothing" and "pass" as kinds of callback handler too, and that would be more or less right.

request -> "do nothing" -> action

same as

request -> nothing happens

..

request -> "pass" -> action

since "pass" doesn't do anything, this has the same effect as

request -> action
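
a small sketch of those two degenerate blockers, assuming a blocker is just a function that receives the action as a callback. all of the names here (do_nothing, pass_through, handle_request, write_x) are hypothetical.

```python
def do_nothing(action):
    # Throws the action away: the request never reaches the database.
    return None


def pass_through(action):
    # Calls the action immediately: same effect as having no blocker.
    return action()


def handle_request(blocker, action):
    # request -> blocker -> action
    return blocker(action)


db = {}  # stand-in for a real database


def write_x():
    db["x"] = 1
    return "written"


dropped = handle_request(do_nothing, write_x)
db_after_drop = dict(db)  # still empty, the write was discarded
passed = handle_request(pass_through, write_x)
```

a real blocker sits somewhere between these two: it holds on to the action for a while and then calls it.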

note that it's rare for actions to actually be passed as callbacks. in this usage, you can replace "callback" with "function-identifying tag", or replace the "callback handler blocker" with one "function-specific blocker" for each function, and it would work more or less the same way.

my heart, however, is with callbacks. they fit the spirit of the problem being solved.


normal process:

data write request -> write an entry in the database

data update request -> update an entry in the database

data delete request -> delete an entry in the database

example (with parallel processes):

w -> w -> take snapshot -> u -> w -> (snapshot process completes) -> more requests

since the snapshot process completes after an entry-altering operation, the data in the snapshot can differ from what the snapshot was supposed to contain
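
to make the problem concrete, here's a single-threaded simulation of it. the assumption is that the snapshot copies one entry per step, so other requests can sneak in between steps. snapshot_steps is a made-up helper, and a python generator stands in for the parallel snapshot process.

```python
def snapshot_steps(db, snapshot):
    # Copy one key per step so other work can interleave between steps.
    # The list of keys is fixed when the snapshot starts.
    for key in list(db):
        snapshot[key] = db[key]
        yield key


db = {"a": 1, "b": 2, "c": 3}
expected = dict(db)  # what the snapshot is supposed to contain
snapshot = {}
steps = snapshot_steps(db, snapshot)

next(steps)      # take snapshot: "a" is copied
db["c"] = 99     # u: an update lands before "c" has been copied
db["d"] = 4      # w: a new entry arrives mid-snapshot
for _ in steps:  # snapshot process completes
    pass

torn = snapshot != expected  # True: the snapshot recorded c as 99, not 3
```

the snapshot ends up with the updated value of c, which did not exist at the moment the snapshot started.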

functioning locking process:

data write request -> lock if a snapshot is happening -> write an entry in the database

data update request -> lock if a snapshot is happening -> update an entry in the database

data delete request -> lock if a snapshot is happening -> delete an entry in the database

example (with functioning locking):

w -> w -> take snapshot -> freeze all incoming requests (u, w) -> (snapshot process completes) -> (resume incoming requests) u -> w -> more requests

the issue with this is that each request can be delayed for as long as the snapshot takes, which grows with n (the size of the database being stored in the snapshot). there is also an inconvenient backlog once the snapshot process completes, which puts an unnecessary burst of load on the database system
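
a rough sketch of the locking version, again simulated in one thread. LockingDatabase and its methods are hypothetical names, and the "lock" is modeled as a flag plus a backlog queue rather than a real mutex.

```python
from collections import deque


class LockingDatabase:
    """Hypothetical store that freezes all writes while a snapshot runs."""

    def __init__(self, data):
        self.data = dict(data)
        self.snapshot = None
        self._backlog = deque()
        self._snapshotting = False

    def request(self, action):
        if self._snapshotting:
            # Frozen: the request waits until the whole snapshot is done.
            self._backlog.append(action)
        else:
            action(self.data)

    def begin_snapshot(self):
        self._snapshotting = True

    def finish_snapshot(self):
        # Safe to copy everything: nothing has been modified since begin.
        self.snapshot = dict(self.data)
        self._snapshotting = False
        backlog_size = len(self._backlog)
        while self._backlog:
            # Replay the frozen requests in arrival order.
            self._backlog.popleft()(self.data)
        return backlog_size


db = LockingDatabase({"a": 1})
db.request(lambda d: d.update(b=2))   # w: runs immediately
db.begin_snapshot()
db.request(lambda d: d.update(a=10))  # u: frozen
db.request(lambda d: d.update(c=3))   # w: frozen
backlog = db.finish_snapshot()        # 2 requests replayed at once
```

the snapshot is accurate, but every frozen request waited for the entire snapshot, and they all hit the database in one burst at the end.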

conflict reduction process:

data write request -> message the snapshot recording service and lock -> write an entry in the database

data update request -> message the snapshot recording service and lock -> update an entry in the database

data delete request -> message the snapshot recording service and lock -> delete an entry in the database

example (with conflict reduction):

w -> w -> take snapshot -> u -> (prioritize snapshot storage of target of u) -> target of u is unlocked in a limited number of steps and u completes -> w -> (prioritize snapshot storage of target of w) -> target of w is unlocked in a limited number of steps and w completes -> (snapshot process completes) -> more requests

this correctly stores the values as of the moment the snapshot was taken, and it ensures that requests don't get blocked for unreasonable amounts of time
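
here's one way the conflict reduction idea could look, as a single-threaded sketch under my own assumptions: the snapshot service keeps a set of keys it still has to copy, and a modification request asks it to copy the target key first. SnapshotService, prioritize, step and modify are all hypothetical names, not an existing API.

```python
class SnapshotService:
    """Hypothetical snapshot recorder that copies requested keys first."""

    def __init__(self, data):
        self.data = data
        self.snapshot = {}
        self._pending = set()  # keys that still have to be copied
        self.active = False

    def begin(self):
        self.snapshot = {}
        self._pending = set(self.data)
        self.active = True

    def prioritize(self, key):
        # Copy the target key right now, before the caller modifies it.
        if self.active and key in self._pending:
            self._pending.remove(key)
            self.snapshot[key] = self.data[key]

    def step(self):
        # Background work: copy one more key; finish when none remain.
        if self._pending:
            self.prioritize(next(iter(self._pending)))
        if not self._pending:
            self.active = False
        return self.active


def modify(service, key, value=None, delete=False):
    # request -> message the snapshot recording service -> action
    service.prioritize(key)
    if delete:
        service.data.pop(key, None)
    else:
        service.data[key] = value


data = {"a": 1, "b": 2, "c": 3}
service = SnapshotService(data)
service.begin()                    # take snapshot
modify(service, "c", 99)           # u: "c" is copied first, then updated
modify(service, "d", 4)            # w: new key, not part of this snapshot
modify(service, "a", delete=True)  # delete: "a" is copied first, then removed
while service.step():              # snapshot process completes
    pass
```

each request waits for at most one key to be copied instead of the whole database, and the snapshot still holds the values from the moment it started.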

you might call this "snapshot recording as a service"
